Skip to content

Commit dbd6b87

Browse files
committed
Implement changes after dev review
Part of #1753
1 parent a3c7356 commit dbd6b87

File tree

2 files changed

+88
-84
lines changed

2 files changed

+88
-84
lines changed

doc/book/app_server/luajit_memprof.rst

Lines changed: 85 additions & 82 deletions
Original file line numberDiff line numberDiff line change
@@ -3,8 +3,8 @@ LuaJIT memory profiler
33
======================
44

55
Starting from version :doc:`2.7.1 </release/2.7.1>`, Tarantool
6-
has the built-in module called ``misc.memprof`` that implements the LuaJIT memory
7-
profiler and the profile parser (further, *profiler*). The profiler provides
6+
has the builtin module called ``misc.memprof`` that implements the LuaJIT memory
7+
profiler (further, *profiler*). The profiler provides
88
a memory allocation report that helps analyse Lua code and find out the places
99
that put the most pressure on the Lua garbage collector (GC).
1010

@@ -42,6 +42,7 @@ Below is a chunk of simple Lua code named ``test.lua`` to illustrate this.
4242
.. code-block:: lua
4343
:linenos:
4444
45+
-- Prevent allocations on traces.
4546
jit.off()
4647
local str, err = misc.memprof.start("memprof_new.bin")
4748
-- Lua doesn't create a new frame to call string.rep, and all allocations
@@ -52,49 +53,28 @@ Below is a chunk of simple Lua code named ``test.lua`` to illustrate this.
5253
5354
local t = {}
5455
for i = 1, 1e5 do
55-
-- table.insert is the built-in function and all corresponding
56+
-- table.insert is the builtin function and all corresponding
5657
-- allocations are reported in the scope of the main chunk.
5758
table.insert(t,
5859
append('q', i)
5960
)
6061
end
6162
local stp, err = misc.memprof.stop()
6263
63-
.. note::
64-
65-
Usually, the information about allocations inside Lua built-ins are not really
66-
useful for the developer. That's why if a Lua built-in function is called from
67-
a Lua function, the profiler attributes all allocations to the Lua function.
68-
Otherwise, this event is attributed to a C function.
69-
70-
Tail call optimization doesn't create a new call frame, so all allocations
71-
inside the function called via the ``CALLT/CALLMT`` `bytecodes <http://wiki.luajit.org/Bytecode-2.0>`_
72-
are attributed to its caller.
73-
74-
Example above illustrates these cases.
75-
7664
Starting profiler in Lua code:
7765

7866
.. code-block:: lua
7967
8068
local str, err = misc.memprof.start(FILENAME)
8169
82-
where ``FILENAME`` is a name of the binary file where profile events are written.
83-
The writer for this function performs ``fwrite()`` for each call retrying
84-
in case of ``EINTR``. When the profiling is stopped, ``fclose()`` is called.
70+
where ``FILENAME`` is a name of the binary file where profiling events are written.
8571

8672
If it is not possible to open a file for writing or the profiler fails to start,
8773
``misc.memprof.start()`` returns ``nil`` on failure. Also, in this case
8874
the function returns an error message as the second result and
8975
a system-dependent error code as the third result.
9076
Otherwise, it returns ``true``.
9177

92-
.. note::
93-
94-
It is recommended to switch the JIT compilation off by calling ``jit.off()``
95-
before the profiler start. Refer to the following
96-
:ref:`explanation <profiler_usage_internal_jitoff>` for details.
97-
9878
Stopping profiler in Lua code:
9979

10080
.. code-block:: lua
@@ -110,8 +90,9 @@ Otherwise, it returns ``true``.
11090

11191
.. _profiler_usage_generate:
11292

113-
To generate the ``memprof_new.bin`` file with the memory profile in binary format,
114-
execute the code under Tarantool:
93+
To generate the file with memory profile in binary format
94+
(in the :ref:`example above <profiler_usage_example01>`,
95+
it's ``memprof_new.bin``), execute the code under Tarantool:
11596

11697
.. code-block:: tarantoolconsole
11798
@@ -121,6 +102,26 @@ Tarantool collects the allocation events in ``memprof_new.bin``, puts
121102
the file in its :ref:`working directory <cfg_basic-work_dir>`, and closes
122103
the session.
123104

105+
The :ref:`code example <profiler_usage_example01>` above also illustrates the memory
106+
allocation logic in some of the cases that are important to understand for further
107+
:ref:`reading <profiler_usage_parse>` and :ref:`analysing <profiler_analysis>`
108+
a profiling report:
109+
110+
* Line 2: It is recommended to switch the JIT compilation off by calling ``jit.off()``
111+
before the profiler start. Refer to the following
112+
:ref:`explanation <profiler_usage_internal_jitoff>` for more details.
113+
114+
* Lines 6-8: Tail call optimization doesn't create a new call frame, so all
115+
allocations inside the function called via the ``CALLT/CALLMT`` `bytecodes <http://wiki.luajit.org/Bytecode-2.0#calls-and-vararg-handling>`_
116+
are attributed to its caller. See also comments to these lines.
117+
118+
* Lines 14-16: Usually, the information about allocations inside Lua built‑ins
119+
are not really
120+
useful for developers. That's why if a Lua built‑in function is called from
121+
a Lua function, the profiler attributes all allocations to the Lua function.
122+
Otherwise, this event is attributed to a C function.
123+
See also comments to these lines.
124+
124125
.. _profiler_usage_parse:
125126

126127
Parsing binary profile and generating profiling report
@@ -140,36 +141,35 @@ via Tarantool by using the following command
140141
where ``memprof_new.bin`` is the binary profile
141142
:ref:`generated earlier <profiler_usage_generate>`.
142143

143-
Tarantool generates a profiling report and closes the session.
144+
Tarantool generates a profiling report that is displayed in console and closes
145+
the session:
144146

145147
.. code-block:: console
146148
147149
ALLOCATIONS
148-
@test.lua:0, line 14: 1002 531818 0
149-
@test.lua:0, line 13: 1 24 0
150-
@test.lua:0, line 9: 1 32 0
151-
@test.lua:0, line 7: 1 20 0
150+
@test.lua:0, line 15: 1002 531818 0
151+
@test.lua:0, line 14: 1 24 0
152+
@test.lua:0, line 10: 1 32 0
153+
@test.lua:0, line 8: 1 20 0
152154
153155
REALLOCATIONS
154-
@test.lua:0, line 13: 9 16424 8248
155-
Overrides:
156-
@test.lua:0, line 13
157-
158-
@test.lua:0, line 14: 5 1984 992
156+
@test.lua:0, line 14: 9 16424 8248
159157
Overrides:
160158
@test.lua:0, line 14
161-
159+
@test.lua:0, line 15: 5 1984 992
160+
Overrides:
161+
@test.lua:0, line 15
162162
163163
DEALLOCATIONS
164164
INTERNAL: 20 0 1481
165-
@test.lua:0, line 14: 3 0 7168
165+
@test.lua:0, line 15: 3 0 7168
166166
Overrides:
167-
@test.lua:0, line 14
167+
@test.lua:0, line 15
168168
169169
.. note::
170170

171-
On MacOS , a report is different because Tarantool and LuaJIT are built
172-
with the GC64 mode enabled for this OS.
171+
On MacOS, a report will be different for the same chunk of code because
172+
Tarantool and LuaJIT are built with the GC64 mode enabled for MacOS.
173173

174174
Let's examine the report structure. A report has three sections:
175175

@@ -217,11 +217,11 @@ structures.
217217

218218
So, if the JIT compilation is on,
219219
new traces will be generated and there will be a mixture of events labeled
220-
``INTERNAL`` in the profiling report : some of them are really caused by
220+
``INTERNAL`` in the profiling report: some of them are really caused by
221221
internal LuaJIT structures, but some of them are caused by allocations on
222222
traces.
223223

224-
If you want to have more definite report without new trace allocations,
224+
If you want to have more definite report without JIT compiler allocations,
225225
:ref:`call jit.off() <profiler_usage_example01>` before starting the profiling.
226226
And if you want to completely exclude the trace allocations from the report,
227227
remove also the old traces by additionally calling ``jit.flush()`` after
@@ -244,8 +244,8 @@ most probably can arise while using the profiler.
244244

245245
.. _profiler_faq:
246246

247-
Frequently Asked Questions
248-
--------------------------
247+
FAQ
248+
---
249249

250250
In this section, some of the profiler-related points are discussed in
251251
a Q&A format.
@@ -264,7 +264,7 @@ or other non-Lua allocators. You can use ``valgrind`` to debug them.
264264
What does it mean?
265265

266266
**A**: ``INTERNAL`` means that these allocations/reallocations/deallocations are
267-
related to the internal LuaJIT structures or are made on JIT traces.
267+
related to the internal LuaJIT structures or are made on traces.
268268
Currently, the memory profiler doesn't report verbosely allocations of objects
269269
that are made during trace execution. Try to :ref:`add jit.off() <profiler_usage_internal_jitoff>`
270270
before profiler start.
@@ -334,16 +334,19 @@ from a console.
334334

335335
.. _profiler_analysis:
336336

337-
Report analysis example
338-
-----------------------
337+
Profiling report analysis example
338+
---------------------------------
339339

340340
In the example below, the following Lua code named ``format_concat.lua`` is
341341
investigated with the help of the memory profiler reports.
342342

343+
.. _profiler_usage_example03:
344+
343345
.. code-block:: lua
344346
:linenos:
345347
346-
jit.off() -- Prevent allocations on new traces.
348+
-- Prevent allocations on new traces.
349+
jit.off()
347350
348351
local function concat(a)
349352
local nstr = a.."a"
@@ -355,7 +358,7 @@ investigated with the help of the memory profiler reports.
355358
return nstr
356359
end
357360
358-
collectgarbage() -- Clean up.
361+
collectgarbage()
359362
360363
local binfile = "/tmp/memprof_"..(arg[0]):match("([^/]*).lua")..".bin"
361364
@@ -367,7 +370,7 @@ investigated with the help of the memory profiler reports.
367370
local f = format(i)
368371
local c = concat(i)
369372
end
370-
collectgarbage() -- Clean up.
373+
collectgarbage()
371374
372375
local st, err = misc.memprof.stop()
373376
assert(st, err)
@@ -381,19 +384,18 @@ you will get the following profiling report:
381384
.. code-block:: console
382385
383386
ALLOCATIONS
384-
@format_concat.lua:8, line 9: 19998 624322 0
387+
@format_concat.lua:9, line 10: 19998 624322 0
385388
INTERNAL: 1 65536 0
386389
387390
REALLOCATIONS
388391
389392
DEALLOCATIONS
390393
INTERNAL: 19998 0 558816
391394
Overrides:
392-
@format_concat.lua:8, line 9
393-
394-
@format_concat.lua:8, line 9: 2 0 98304
395+
@format_concat.lua:9, line 10
396+
@format_concat.lua:9, line 10: 2 0 98304
395397
Overrides:
396-
@format_concat.lua:8, line 9
398+
@format_concat.lua:9, line 10
397399
398400
The reasonable questions regarding the report can be:
399401

@@ -402,53 +404,54 @@ The reasonable questions regarding the report can be:
402404
* Why are there approximately 20K allocations instead of 10K?
403405

404406
First of all, LuaJIT doesn't create a new string if the string with the same
405-
payload exists. This is called the string interning. So, when the string is
407+
payload exists (see details on `lua-users.org/wiki <http://lua-users.org/wiki/ImmutableObjects>`_).
408+
This is called the `string interning <https://en.wikipedia.org/wiki/String_interning>`_.
409+
So, when the string is
406410
created via the ``format()`` function, there is no need to create the same
407-
string via the ``concat()`` function, and LuaJIT just use the previous one.
411+
string via the ``concat()`` function, and LuaJIT just uses the previous one.
408412

409-
That is also the reason why the amount of allocations is not the round numbber
413+
That is also the reason why the amount of allocations is not the round number
410414
as can be expected from the cycle operator ``for i = 1, 10000...``:
411415
Tarantool creates some
412-
strings for internal needs and built-in modules, so some strings already exist.
416+
strings for internal needs and builtin modules, so some strings already exist.
413417

414418
But why are there so many allocations? It's almost twice as big as the expected
415-
amount. This is because the ``string.format()`` built-in function creates
419+
amount. This is because the ``string.format()`` builtin function creates
416420
another string necessary for the ``%s`` identifier, so there are two allocations
417421
for each iteration: for ``tostring(i)`` and for ``string.format("%sa", string_i_value)``.
418422
You can see the difference in behaviour by adding the
419-
``local _ = tostring(i)`` line between lines 21 and 22.
423+
``local _ = tostring(i)`` line between lines 22 and 23.
420424

421-
To profile only the ``concat()`` function, comment the line 22, namely,
425+
To profile only the ``concat()`` function, comment line 23, namely,
422426
``local f = format(i)`` and run the profiler.
423427

424428
The profiler's output is the following:
425429

426430
.. code-block:: console
427431
428432
ALLOCATIONS
429-
@format_concat.lua:3, line 4: 10000 284411 0
433+
@format_concat.lua:4, line 5: 10000 284411 0
430434
431435
REALLOCATIONS
432436
433437
DEALLOCATIONS
434438
INTERNAL: 10000 0 218905
435439
Overrides:
436-
@format_concat.lua:3, line 4
437-
438-
@format_concat.lua:3, line 4: 1 0 32768
439-
440+
@format_concat.lua:4, line 5
441+
@format_concat.lua:4, line 5: 1 0 32768
440442
441443
**Q**: But what will change if the JIT compilation is enabled?
442444

443-
**A**: Let's comment the first line of the code, namely, ``jit.off()`` and run
445+
**A**: In the :ref:`code <profiler_usage_example03>`, comment line 2, namely,
446+
``jit.off()`` and run
444447
the profiler . Now, there are only 56 allocations in the report, and all other
445448
allocations are JIT-related (see also the related
446449
`dev issue <https://github.com/tarantool/tarantool/issues/5679>`_):
447450

448451
.. code-block:: console
449452
450453
ALLOCATIONS
451-
@format_concat.lua:3, line 4: 56 1112 0
454+
@format_concat.lua:4, line 5: 56 1112 0
452455
@format_concat.lua:0, line 0: 4 640 0
453456
INTERNAL: 2 382 0
454457
@@ -457,24 +460,24 @@ allocations are JIT-related (see also the related
457460
DEALLOCATIONS
458461
INTERNAL: 58 0 1164
459462
Overrides:
460-
@format_concat.lua:3, line 4
463+
@format_concat.lua:4, line 5
461464
INTERNAL
462465
463466
This happens because a trace is compiled after 56 iterations, and the
464467
JIT-compiler removed the unused ``c`` variable from the trace, and, therefore,
465468
the dead code of the ``concat()`` function is eliminated.
466469

467470
Next, let's profile only the ``format()`` function with JIT enabled.
468-
For that, keep the lines 1 and 23 commented (``jit.off()`` and
469-
``local c = concat(i)`` respectively), uncomment the line 22
471+
For that, keep lines 2 and 24 commented (``jit.off()`` and
472+
``local c = concat(i)`` respectively), uncomment line 23
470473
(``local f = format(i)``), and run the profiler.
471474

472475
The profiler's output is the following:
473476

474477
.. code-block:: console
475478
476479
ALLOCATIONS
477-
@format_concat.lua:8, line 9: 19998 624322 0
480+
@format_concat.lua:9, line 10: 19998 624322 0
478481
INTERNAL: 4 66824 0
479482
@format_concat.lua:0, line 0: 4 640 0
480483
@@ -484,19 +487,19 @@ The profiler's output is the following:
484487
INTERNAL: 19999 0 559072
485488
Overrides:
486489
@format_concat.lua:0, line 0
487-
@format_concat.lua:8, line 9
488-
489-
@format_concat.lua:8, line 9: 2 0 98304
490+
@format_concat.lua:9, line 10
491+
@format_concat.lua:9, line 10: 2 0 98304
490492
Overrides:
491-
@format_concat.lua:8, line 9
493+
@format_concat.lua:9, line 10
492494
493495
**Q**: Why is there so many allocations in comparison to the ``concat()`` function?
494496

495497
**A**: The answer is simple: the ``string.format()`` function with the ``%s``
496498
identifier is not yet compiled via LuaJIT. So, a trace can't be recorded and
497499
the compiler doesn't perform the corresponding optimizations.
498500

499-
If we change the ``format()`` function in the following way
501+
If we change the ``format()`` function in the :ref:`code chunk <profiler_usage_example03>`
502+
in the following way
500503

501504
.. code-block:: lua
502505
@@ -510,7 +513,7 @@ the profiling report becomes much prettier:
510513
.. code-block:: console
511514
512515
ALLOCATIONS
513-
@format_concat.lua:8, line 9: 110 2131 0
516+
@format_concat.lua:9, line 10: 110 2131 0
514517
@format_concat.lua:0, line 0: 4 640 0
515518
INTERNAL: 3 1148 0
516519
@@ -520,5 +523,5 @@ the profiling report becomes much prettier:
520523
INTERNAL: 113 0 2469
521524
Overrides:
522525
@format_concat.lua:0, line 0
523-
@format_concat.lua:8, line 9
526+
@format_concat.lua:9, line 10
524527
INTERNAL

0 commit comments

Comments
 (0)