Commit a3c7356

Correction after dev review
Part of #1753
1 parent 08509cc commit a3c7356

2 files changed: +74 -56 lines changed

doc/book/app_server/luajit_memprof.rst

Lines changed: 72 additions & 54 deletions
@@ -2,21 +2,15 @@
 LuaJIT memory profiler
 ======================

-Stating from version :doc:`2.7.1 </release/2.7.1>`, Tarantool
-has the built-in module called ``memprof`` that implements a LuaJIT memory
-profiler and a profile parser. The profiler provides
+Starting from version :doc:`2.7.1 </release/2.7.1>`, Tarantool
+has the built-in module called ``misc.memprof`` that implements the LuaJIT memory
+profiler and the profile parser (further, *profiler*). The profiler provides
 a memory allocation report that helps analyse Lua code and find out the places
-that put the most pressure on the Lua garbage collector.
+that put the most pressure on the Lua garbage collector (GC).

-//?where to put this paragraph//
-Usually developers are not interested in information about allocations
-inside built-ins. So if a Lua built-in function is called from a Lua function,
-all allocations are attributed to this Lua function.
-Otherwise, this event is attributed to a C function.
-
-//?where to put this paragraph//
-Tail call optimization does not create a new call frame, so all allocations
-inside the function called via CALLT/CALLMT bytecodes are attributed to its caller.
+.. contents::
+    :local:
+    :depth: 2

 .. _profiler_usage:

@@ -37,11 +31,11 @@ Collecting binary profile
 ~~~~~~~~~~~~~~~~~~~~~~~~~

 To collect a binary profile for a particular part of the Lua code,
-you need to place this part between two ``memprof`` functions,
+you need to place this part between two ``misc.memprof`` functions,
 namely, ``misc.memprof.start()`` and ``misc.memprof.stop()``, and then execute
 the code under Tarantool.

-Below is a piece of simple Lua code named ``test.lua`` to illustrate this.
+Below is a chunk of simple Lua code named ``test.lua`` to illustrate this.

 .. _profiler_usage_example01:

@@ -57,15 +51,28 @@ Below is a piece of simple Lua code named ``test.lua`` to illustrate this.
     end

     local t = {}
-    for _ = 1, 1e5 do
+    for i = 1, 1e5 do
         -- table.insert is the built-in function and all corresponding
-        -- allocations are reported in the scope of main chunk.
+        -- allocations are reported in the scope of the main chunk.
         table.insert(t,
-            append('q', _)
+            append('q', i)
         )
     end
     local stp, err = misc.memprof.stop()

+.. note::
+
+    Usually, the information about allocations inside Lua built-ins is not really
+    useful for the developer. That's why if a Lua built-in function is called from
+    a Lua function, the profiler attributes all allocations to the Lua function.
+    Otherwise, this event is attributed to a C function.
+
+    Tail call optimization doesn't create a new call frame, so all allocations
+    inside the function called via the ``CALLT/CALLMT`` `bytecodes <http://wiki.luajit.org/Bytecode-2.0>`_
+    are attributed to its caller.
+
+    The example above illustrates these cases.
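
The tail-call remark in the note can be checked without the profiler. Below is a minimal sketch in plain Lua; the function names are illustrative and do not come from the documentation. A function that recurses through a tail call runs a million levels deep without growing the stack, while the non-tail variant overflows. The missing frame is the reason the profiler has nothing but the caller to attribute the callee's allocations to.

```lua
-- Hypothetical helper: counts down through a proper tail call.
local function countdown(n)
    if n == 0 then
        return "done"
    end
    -- `return f(...)` is a tail call: the current frame is reused.
    return countdown(n - 1)
end

-- Not a tail call: the addition runs after the callee returns,
-- so every level of recursion keeps its own frame.
local function depth(n)
    if n == 0 then
        return 0
    end
    return 1 + depth(n - 1)
end

local result = countdown(1e6)
-- pcall() catches the stack overflow raised by the non-tail recursion.
local ok = pcall(depth, 1e6)
print(result, ok)
```
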
+
 Starting profiler in Lua code:

 .. code-block:: lua
@@ -161,9 +168,8 @@ Tarantool generates a profiling report and closes the session.

 .. note::

-    A report can look differently for the same piece of Lua code depending
-    on the OS used. On MacOS, the report data
-    //?can be influenced by the LuaJIT GC64 running//.
+    On macOS, a report is different because Tarantool and LuaJIT are built
+    with the GC64 mode enabled for this OS.

 Let's examine the report structure. A report has three sections:

@@ -181,39 +187,38 @@ An event record has the following format:
     @<filename>:<function_line>, line <line_number>: <number_of_events> <allocated> <freed>

 * <filename>—a name of the file containing Lua code.
-* <function_line>—a number of the line where the function generating the event
-  is declared. Sometimes <function_line> is ``0``. It means that
-  the function generating the event is the //?main/entire code of //?file/script itself.
-  This is exactly the case in the :ref:`example above <profiler_usage_example01>`.
-  Comments in the code explain why it happens for each of the functions.
-* <line_number>—a number of the line where the event is detected.
+* <function_line>—the line number where the function generating the event
+  is declared. In some of the cases, allocations are attributed not to
+  the declared function but to the main chunk. In this case, the <function_line>
+  is set to ``0``. See the :ref:`code chunk above<profiler_usage_example01>`
+  with the explanation in the comments for some examples.
+* <line_number>—the line number where the event is detected.
 * <number_of_events>—a number of events for this code line.
-* <allocated>—bytes allocated in memory during the //?event/events.
-* <freed>—bytes freed in memory during //?event/events.
+* <allocated>—amount of memory allocated during all the events, bytes.
+* <freed>—amount of memory freed during all the events, bytes.
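
The event record format described in the list above can be split into its fields with a single Lua pattern. The sketch below is an illustration only: ``parse_event`` is a hypothetical helper, not part of ``misc.memprof``, and it assumes the fields are separated by whitespace as in the record shown in this document. Lines that do not match the format are rejected.

```lua
-- Hypothetical helper, not part of misc.memprof: splits one event record
-- of the documented form into named fields.
local function parse_event(line)
    local file, func_line, line_no, events, allocated, freed = line:match(
        "^%s*@(.-):(%d+), line (%d+): (%d+)%s+(%d+)%s+(%d+)%s*$")
    if not file then
        return nil, "not an event record"
    end
    return {
        filename = file,
        function_line = tonumber(func_line),   -- 0 means the main chunk
        line_number = tonumber(line_no),
        number_of_events = tonumber(events),
        allocated = tonumber(allocated),       -- bytes
        freed = tonumber(freed),               -- bytes
    }
end

-- Sample record taken from the report quoted later in this document.
local rec = parse_event("@format_concat.lua:3, line 4: 1 0 32768")
print(rec.filename, rec.function_line, rec.line_number,
      rec.number_of_events, rec.allocated, rec.freed)
```
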

-``Overrides`` shows what allocation has been overridden.
+The ``Overrides`` label shows what allocation has been overridden.

 .. _profiler_usage_internal_jitoff:

-``INTERNAL`` indicates that this event is caused by internal LuaJIT structures.
-
-//!the note below really needs to be reviewed thoroughly//
+The ``INTERNAL`` label indicates that this event is caused by internal LuaJIT
+structures.

 .. note::

     Important note regarding the ``INTERNAL`` label and the recommendation
     of switching the JIT compilation off (``jit.off()``): this version of the
-    profiler doesn't support verbose reporting for allocations //?on/for
+    profiler doesn't support verbose reporting for allocations on
     `traces <https://en.wikipedia.org/wiki/Tracing_just-in-time_compilation#Technical_details>`_.
-    If some memory allocations are made //?on/for a trace,
+    If memory allocations are made on a trace,
     the profiler can't associate the allocations with the part of Lua code
     that generated the trace. In this case, the profiler labels such allocations
     as ``INTERNAL``.

     So, if the JIT compilation is on,
     new traces will be generated and there will be a mixture of events labeled
     ``INTERNAL`` in the profiling report: some of them are really caused by
-    internal LuaJIT structures, but some of them are caused by allocations //?on/for
+    internal LuaJIT structures, but some of them are caused by allocations on
     traces.

     If you want to have more definite report without new trace allocations,
@@ -249,11 +254,13 @@ a Q&A format.
 inside C code?

 **Answer (A)**: The profiler reports only allocation events caused by the Lua
-allocation functions. All Lua-related allocations, like table or string creation
+allocator. All Lua-related allocations, like table or string creation
 are reported. But the profiler doesn't report allocations made by ``malloc()``
 or other non-Lua allocators. You can use ``valgrind`` to debug them.

-**Q**: Why is there so many ``INTERNAL`` allocations in my profiling report?
+|
+
+**Q**: Why are there so many ``INTERNAL`` allocations in my profiling report?
 What does it mean?

 **A**: ``INTERNAL`` means that these allocations/reallocations/deallocations are
@@ -262,20 +269,26 @@ Currently, the memory profiler doesn't report verbosely allocations of objects
 that are made during trace execution. Try to :ref:`add jit.off() <profiler_usage_internal_jitoff>`
 before profiler start.

+|
+
 **Q**: Why are there some reallocations/deallocations without the ``Overrides``
 section?

 **A**: These objects can be created before the profiler starts. Adding
 ``collectgarbage()`` before the profiler's start makes it possible to collect all
 previously allocated objects that are dead when the profiler starts.

+|
+
 **Q**: Why are some objects not collected during profiling? Is it
 a memory leak?

 **A**: LuaJIT uses an incremental Garbage Collector (GC). A GC cycle may not be
 finished at the moment of the profiler's stop. Add ``collectgarbage()`` before
 stopping the profiler to collect all the dead objects for sure.
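
The two answers above both come down to calling ``collectgarbage()`` at the right moments. A minimal sketch of that ordering follows. ``profile_with_gc`` is a hypothetical wrapper, not a ``misc.memprof`` function; it takes the start and stop functions as arguments, so the stand-ins below can show the call order without Tarantool.

```lua
-- Hypothetical wrapper (not a misc.memprof API): runs `workload` between
-- the given start and stop functions with a full GC cycle on both sides.
local function profile_with_gc(start_fn, stop_fn, workload)
    -- Collect objects that are already dead, so that their deallocation
    -- is not reported without a matching allocation.
    collectgarbage()
    local started, err = start_fn()
    if not started then
        return nil, err
    end
    workload()
    -- Finish the GC cycle before the profiler stops.
    collectgarbage()
    return stop_fn()
end

-- Stand-ins that only record the call order.
local calls = {}
local ok = profile_with_gc(
    function() calls[#calls + 1] = "start"; return true end,
    function() calls[#calls + 1] = "stop"; return true end,
    function() calls[#calls + 1] = "work" end
)
print(ok, table.concat(calls, ","))
```

Under Tarantool, the stand-ins would presumably be replaced by small wrappers around ``misc.memprof.start()`` and ``misc.memprof.stop()``. This assumes that ``start()`` returns a true value on success and ``nil`` plus an error message on failure, which should be checked against the module documentation.
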

+|
+
 **Q**: Can I profile not just a current chunk but the entire running application?
 Can I start the profiler when the application is already running?

@@ -330,7 +343,7 @@ investigated with the help of the memory profiler reports.
 .. code-block:: lua
     :linenos:

-    jit.off() -- More verbose reports.
+    jit.off() -- Prevent allocations on new traces.

     local function concat(a)
         local nstr = a.."a"
@@ -384,28 +397,29 @@ you will get the following profiling report:

 The reasonable questions regarding the report can be:

-* Why are there no allocations related to the ``concat()`` function?
-* Why the amount of allocations is not a round number?
-* Why are there approximately 20K allocations instead of 10K?
+* Why are there no allocations related to the ``concat()`` function?
+* Why the amount of allocations is not a round number?
+* Why are there approximately 20K allocations instead of 10K?

 First of all, LuaJIT doesn't create a new string if the string with the same
-payload exists. It is called the string interning. So, when the string is
-created via
-the ``format()`` function, there is no need to create the same string via
-the ``concat()`` function, and LuaJIT just use the previous one.
+payload exists. This is called the string interning. So, when the string is
+created via the ``format()`` function, there is no need to create the same
+string via the ``concat()`` function, and LuaJIT just uses the previous one.

-This is the reason of //?unpretty amount of allocations: Tarantool creates some
+That is also the reason why the amount of allocations is not a round number
+as can be expected from the cycle operator ``for i = 1, 10000...``:
+Tarantool creates some
 strings for internal needs and built-in modules, so some strings already exist.

 But why are there so many allocations? It's almost twice as big as the expected
 amount. This is because the ``string.format()`` built-in function creates
 another string necessary for the ``%s`` identifier, so there are two allocations
 for each iteration: for ``tostring(i)`` and for ``string.format("%sa", string_i_value)``.
-You can see the difference in the behaviour by adding the
+You can see the difference in behaviour by adding the
 ``local _ = tostring(i)`` line between lines 21 and 22.

-Let's comment the 22nd line, namely, ``local f = format(i)``
-(by adding ``--`` at the line start) to take a look at the ``concat()`` function.
+To profile only the ``concat()`` function, comment the line 22, namely,
+``local f = format(i)`` and run the profiler.

 The profiler's output is the following:

@@ -424,10 +438,10 @@ The profiler's output is the following:
     @format_concat.lua:3, line 4: 1 0 32768


-**Q**: But what will change if JIT compilation is enabled?
+**Q**: But what will change if the JIT compilation is enabled?

-**A**: Let's comment the first line of the code, namely, ``jit.off()`` to see what
-will happen. Now, there are only 56 allocations in the report, and all other
+**A**: Let's comment the first line of the code, namely, ``jit.off()`` and run
+the profiler. Now, there are only 56 allocations in the report, and all other
 allocations are JIT-related (see also the related
 `dev issue <https://github.com/tarantool/tarantool/issues/5679>`_):

@@ -450,7 +464,11 @@ This happens because a trace is compiled after 56 iterations, and the
 JIT-compiler removed the unused ``c`` variable from the trace, and, therefore,
 the dead code of the ``concat()`` function is eliminated.

-Let's now profile only the ``format()`` function with JIT enabled.
+Next, let's profile only the ``format()`` function with JIT enabled.
+For that, keep the lines 1 and 23 commented (``jit.off()`` and
+``local c = concat(i)`` respectively), uncomment the line 22
+(``local f = format(i)``), and run the profiler.
+
 The profiler's output is the following:

 .. code-block:: console

doc/release/2.7.1.rst

Lines changed: 2 additions & 2 deletions
@@ -61,8 +61,8 @@ Core
 LuaJIT
 ~~~~~~

-- Introduced the :doc:`LuaJIT memory profiler </book/app_server/luajit_memprof>`
-  (gh-5442) and the profile parser (gh-5490).
+- Introduced the LuaJIT memory profiler (gh-5442) and the profile parser
+  (gh-5490). Read more: :doc:`/book/app_server/luajit_memprof`.

 Lua
 ~~~
