
Commit a66bb19

stats: integrate with metrics rock
If `metrics` [1] is found, you can use metrics collectors to store
statistics. `metrics >= 0.10.0` is required to use the metrics driver.
(`metrics >= 0.9.0` is required to use summary quantiles with age
buckets. `metrics >= 0.5.0, < 0.9.0` is unsupported due to a quantile
overflow bug [2]. `metrics == 0.9.0` has a bug that does not permit
creating a summary collector without quantiles [3]. In fact, a user may
use `metrics >= 0.5.0`, `metrics != 0.9.0` to use metrics without
quantiles, and `metrics >= 0.9.0` to use metrics with quantiles. But
this is confusing, so let's use a single restriction for both cases.)

The metrics are part of the global registry and can be exported together
(e.g. to Prometheus) with default tools without any additional
configuration. Disabling stats destroys the collectors.

Metrics collectors are used by default if supported. To explicitly set
the driver, call `crud.cfg{ stats = true, stats_driver = driver }`
(`'local'` or `'metrics'`). To enable quantiles, call
```
crud.cfg{
    stats = true,
    stats_driver = 'metrics',
    stats_quantiles = true,
}
```
With quantiles, `latency` statistics are changed to the 0.99 quantile of
request execution time (with aging).

Quantile computations increase performance overhead by up to 10% when
used in statistics.

Add a CI matrix to run tests with `metrics` installed. To get full
coverage on coveralls, #248 must be resolved.

1. https://github.com/tarantool/metrics
2. tarantool/metrics#235
3. tarantool/metrics#262

Closes #224
1 parent 82f6cd0 commit a66bb19
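
The single version restriction from the commit message (`metrics >= 0.10.0`) could be checked roughly as follows. This is a minimal sketch in plain Lua, not the code from this commit: `parse_version`, `version_ge` and `is_metrics_supported` are hypothetical helpers, and the sketch assumes the installed version is available as a `major.minor.patch` string.

```lua
-- Hypothetical helper: split "major.minor.patch" into three numbers.
local function parse_version(str)
    local major, minor, patch = str:match('^(%d+)%.(%d+)%.(%d+)')
    if major == nil then
        return nil
    end
    return { tonumber(major), tonumber(minor), tonumber(patch) }
end

-- Returns true if version a is greater than or equal to version b.
-- Components are compared as numbers, so 0.10.0 is newer than 0.9.0.
local function version_ge(a, b)
    for i = 1, 3 do
        if a[i] ~= b[i] then
            return a[i] > b[i]
        end
    end
    return true
end

-- Hypothetical check for the single restriction `metrics >= 0.10.0`.
local function is_metrics_supported(version_str)
    local version = parse_version(version_str)
    if version == nil then
        return false
    end
    return version_ge(version, { 0, 10, 0 })
end

-- Versions taken from the CI matrix of this commit.
for _, v in ipairs({ '0.1.8', '0.10.0', '0.12.0' }) do
    print(v, is_metrics_supported(v))
end
```

Comparing the components numerically rather than as strings matters here, because a plain string comparison would rank `0.9.0` above `0.10.0`.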

11 files changed, +1197 -137 lines changed


.github/workflows/test_on_push.yaml

Lines changed: 18 additions & 1 deletion
@@ -13,13 +13,21 @@ jobs:
       matrix:
         # We need 1.10.6 here to check that module works with
         # old Tarantool versions that don't have "tuple-keydef"/"tuple-merger" support.
-        tarantool-version: ["1.10.6", "1.10", "2.2", "2.3", "2.4", "2.5", "2.6", "2.7"]
+        tarantool-version: ["1.10.6", "1.10", "2.2", "2.3", "2.4", "2.5", "2.6", "2.7", "2.8"]
+        metrics-version: [""]
         remove-merger: [false]
         include:
+          - tarantool-version: "1.10"
+            metrics-version: "0.12.0"
           - tarantool-version: "2.7"
             remove-merger: true
+          - tarantool-version: "2.8"
+            metrics-version: "0.1.8"
+          - tarantool-version: "2.8"
+            metrics-version: "0.10.0"
           - tarantool-version: "2.8"
             coveralls: true
+            metrics-version: "0.12.0"
       fail-fast: false
     runs-on: [ubuntu-latest]
     steps:
@@ -47,6 +55,10 @@ jobs:
           tarantool --version
           ./deps.sh

+      - name: Install metrics
+        if: matrix.metrics-version != ''
+        run: tarantoolctl rocks install metrics ${{ matrix.metrics-version }}
+
       - name: Remove external merger if needed
         if: ${{ matrix.remove-merger }}
         run: rm .rocks/lib/tarantool/tuple/merger.so
@@ -71,6 +83,7 @@ jobs:
     strategy:
       matrix:
         bundle_version: [ "1.10.11-0-gf0b0e7ecf-r422", "2.7.3-0-gdddf926c3-r422" ]
+        metrics-version: ["", "0.12.0"]
       fail-fast: false
     runs-on: [ ubuntu-latest ]
     steps:
@@ -86,6 +99,10 @@ jobs:
           tarantool --version
           ./deps.sh

+      - name: Install metrics
+        if: matrix.metrics-version != ''
+        run: tarantoolctl rocks install metrics ${{ matrix.metrics-version }}
+
       # This server starts and listen on 8084 port that is used for tests
       - name: Stop Mono server
         run: sudo kill -9 $(sudo lsof -t -i tcp:8084) || true

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

 ### Added
 * Statistics for CRUD operations on router (#224).
+* Integrate CRUD statistics with [`metrics`](https://github.com/tarantool/metrics) (#224).

 ### Changed

README.md

Lines changed: 57 additions & 3 deletions
@@ -606,11 +606,28 @@ crud.cfg{ stats = false }
 crud.reset_stats()
 ```

+If [`metrics`](https://github.com/tarantool/metrics) `0.10.0` or greater
+found, metrics collectors will be used by default to store statistics
+instead of local collectors. Quantiles in metrics summary collections
+are disabled by default. You can manually choose driver and enable quantiles.
+```lua
+-- Use metrics collectors. (Default if metrics found).
+crud.cfg{ stats = true, stats_driver = 'metrics' }
+
+-- Use metrics collectors with 0.99 quantiles.
+crud.cfg{ stats = true, stats_driver = 'metrics', stats_quantiles = true }
+
+-- Use simple local collectors.
+crud.cfg{ stats = true, stats_driver = 'local' }
+```
+
 You can use `crud.cfg` to check current stats state.
 ```lua
 crud.cfg
 ---
-- stats: true
+- stats_quantiles: true
+  stats: true
+  stats_driver: metrics
 ...
 ```

@@ -660,9 +677,39 @@ Possible statistics operation labels are
 Each operation section contains of different collectors
 for success calls and error (both error throw and `nil, err`)
 returns. `count` is total requests count since instance start
-or stats restart. `latency` is average time of requests execution,
+or stats restart. `latency` is 0.99 quantile of request execution
+time if `metrics` driver used and quantiles enabled,
+otherwise `latency` is total average.
 `time` is the total time of requests execution.

+In [`metrics`](https://www.tarantool.io/en/doc/latest/book/monitoring/)
+registry statistics are stored as `tnt_crud_stats` metrics
+with `operation`, `status` and `name` labels.
+```
+metrics:collect()
+---
+- - label_pairs:
+      status: ok
+      operation: insert
+      name: customers
+    value: 221411
+    metric_name: tnt_crud_stats_count
+  - label_pairs:
+      status: ok
+      operation: insert
+      name: customers
+    value: 10.49834896344692
+    metric_name: tnt_crud_stats_sum
+  - label_pairs:
+      status: ok
+      operation: insert
+      name: customers
+      quantile: 0.99
+    value: 0.00023606420935973
+    metric_name: tnt_crud_stats
+...
+```
+
 `select` section additionally contains `details` collectors.
 ```lua
 crud.stats('my_space').select.details
@@ -679,6 +726,10 @@ looked up on storages while collecting responses for calls (including
 scrolls for multibatch requests). Details data is updated as part of
 the request process, so you may get new details before `select`/`pairs`
 call is finished and observed with count, latency and time collectors.
+In [`metrics`](https://www.tarantool.io/en/doc/latest/book/monitoring/)
+registry they are stored as `tnt_crud_map_reduces`,
+`tnt_crud_tuples_fetched` and `tnt_crud_tuples_lookup` metrics
+with `{ operation = 'select', name = space_name }` labels.

 Since `pairs` request behavior differs from any other crud request, its
 statistics collection also has specific behavior. Statistics (`select`
@@ -690,7 +741,10 @@ collector.

 Statistics are preserved between package reloads. Statistics are preserved
 between [Tarantool Cartridge role reloads](https://www.tarantool.io/en/doc/latest/book/cartridge/cartridge_api/modules/cartridge.roles/#reload)
-if you use CRUD Cartridge roles.
+if you use CRUD Cartridge roles. Beware that metrics 0.12.0 and below do not
+support preserving stats between role reload
+(see [tarantool/metrics#334](https://github.com/tarantool/metrics/issues/334)),
+thus this feature will be unsupported for `metrics` driver.

 ## Cartridge roles

crud/cfg.lua

Lines changed: 59 additions & 11 deletions
@@ -16,11 +16,49 @@ local function set_defaults_if_empty(cfg)
         cfg.stats = false
     end

+    if cfg.stats_driver == nil then
+        cfg.stats_driver = stats.get_default_driver()
+    end
+
+    if cfg.stats_quantiles == nil then
+        cfg.stats_quantiles = false
+    end
+
     return cfg
 end

 local cfg = set_defaults_if_empty(stash.get(stash.name.cfg))

+local function configure_stats(cfg, opts)
+    if (opts.stats == nil)
+    and (opts.stats_driver == nil)
+    and (opts.stats_quantiles == nil) then
+        return
+    end
+
+    if opts.stats == nil then
+        opts.stats = cfg.stats
+    end
+
+    if opts.stats_driver == nil then
+        opts.stats_driver = cfg.stats_driver
+    end
+
+    if opts.stats_quantiles == nil then
+        opts.stats_quantiles = cfg.stats_quantiles
+    end
+
+    if opts.stats == true then
+        stats.enable{ driver = opts.stats_driver, quantiles = opts.stats_quantiles }
+    else
+        stats.disable()
+    end
+
+    rawset(cfg, 'stats', opts.stats)
+    rawset(cfg, 'stats_driver', opts.stats_driver)
+    rawset(cfg, 'stats_quantiles', opts.stats_quantiles)
+end
+
 --- Configure CRUD module.
 --
 -- @local
@@ -32,23 +70,33 @@ local cfg = set_defaults_if_empty(stash.get(stash.name.cfg))
 -- @bool[opt] opts.stats
 -- Enable or disable statistics collect.
 -- Statistics are observed only on router instances.
+
+-- @string[opt] opts.stats_driver
+-- `'local'` or `'metrics'`.
+-- If `'local'`, stores statistics in local registry (some Lua tables)
+-- and computes latency as overall average. `'metrics'` requires
+-- `metrics >= 0.10.0` installed and stores statistics in
+-- global metrics registry (integrated with exporters).
+-- `'metrics'` driver supports computing latency as 0.99 quantile with aging.
+-- If `'metrics'` driver is available, it is used by default,
+-- otherwise `'local'` is used.
+--
+-- @bool[opt] opts.stats_quantiles
+-- Enable or disable statistics quantiles (only for metrics driver).
+-- Quantiles computations increases performance overhead up to 10%.
 --
 -- @return Copy of configuration table.
 --
 local function __call(self, opts)
-    checks('table', { stats = '?boolean' })
+    checks('table', {
+        stats = '?boolean',
+        stats_driver = '?string',
+        stats_quantiles = '?boolean'
+    })

-    opts = opts or {}
+    opts = table.deepcopy(opts) or {}

-    if opts.stats ~= nil then
-        if opts.stats == true then
-            stats.enable()
-        else
-            stats.disable()
-        end
-
-        rawset(cfg, 'stats', opts.stats)
-    end
+    configure_stats(cfg, opts)

     return self
 end
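
The option merging done by `configure_stats` can be tried outside Tarantool. The sketch below is a plain Lua approximation under stated assumptions: `stats` is a stub that only records the last call (it is not the real `crud.stats` module), and ordinary table assignment replaces `rawset`, since the stub `cfg` has no metatable.

```lua
-- Stub standing in for `crud.stats`; it only records the last call.
local stats = { last_call = nil }

function stats.enable(opts)
    stats.last_call = { name = 'enable', driver = opts.driver, quantiles = opts.quantiles }
end

function stats.disable()
    stats.last_call = { name = 'disable' }
end

-- Current configuration, as if stats had been enabled earlier.
local cfg = { stats = true, stats_driver = 'metrics', stats_quantiles = false }

local stats_keys = { 'stats', 'stats_driver', 'stats_quantiles' }

-- Approximates the merging rules of `configure_stats` from crud/cfg.lua.
local function configure_stats(cfg, opts)
    -- Nothing stats-related was passed: leave everything untouched.
    if opts.stats == nil
    and opts.stats_driver == nil
    and opts.stats_quantiles == nil then
        return
    end

    -- Options that were not passed inherit the current values.
    for _, key in ipairs(stats_keys) do
        if opts[key] == nil then
            opts[key] = cfg[key]
        end
    end

    if opts.stats == true then
        stats.enable{ driver = opts.stats_driver, quantiles = opts.stats_quantiles }
    else
        stats.disable()
    end

    for _, key in ipairs(stats_keys) do
        cfg[key] = opts[key]
    end
end

-- Only `stats_quantiles` is passed; `stats` and `stats_driver` are inherited.
configure_stats(cfg, { stats_quantiles = true })
print(stats.last_call.name, stats.last_call.driver, stats.last_call.quantiles)
```

Because omitted options fall back to the stored configuration, changing a single option re-enables statistics with the previously chosen driver instead of resetting it to the default.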

crud/common/stash.lua

Lines changed: 5 additions & 1 deletion
@@ -16,10 +16,14 @@ local stash = {}
 -- @tfield string stats_local_registry
 -- Stash for local metrics registry.
 --
+-- @tfield string stats_metrics_registry
+-- Stash for metrics rocks statistics registry.
+--
 stash.name = {
     cfg = '__crud_cfg',
     stats_internal = '__crud_stats_internal',
-    stats_local_registry = '__crud_stats_local_registry'
+    stats_local_registry = '__crud_stats_local_registry',
+    stats_metrics_registry = '__crud_stats_metrics_registry'
 }

 --- Setup Tarantool Cartridge reload.

crud/stats/local_registry.lua

Lines changed: 15 additions & 2 deletions
@@ -2,13 +2,16 @@
 -- @module crud.stats.local_registry
 --

+local errors = require('errors')
+
 local dev_checks = require('crud.common.dev_checks')
 local stash = require('crud.common.stash')
 local op_module = require('crud.stats.operation')
 local registry_common = require('crud.stats.registry_common')

 local registry = {}
 local internal = stash.get(stash.name.stats_local_registry)
+local StatsLocalError = errors.new_class('StatsLocalError', {capture_stack = false})

 --- Initialize local metrics registry.
 --
@@ -17,9 +20,19 @@ local internal = stash.get(stash.name.stats_local_registry)
 --
 -- @function init
 --
--- @treturn boolean Returns true.
+-- @tab opts
 --
-function registry.init()
+-- @bool opts.quantiles
+-- Quantiles is not supported for local, only `false` is valid.
+--
+-- @treturn boolean Returns `true`.
+--
+function registry.init(opts)
+    dev_checks({ quantiles = 'boolean' })
+
+    StatsLocalError:assert(opts.quantiles == false,
+        "Quantiles are not supported for 'local' statistics registry")
+
     internal.registry = {}
     internal.registry.spaces = {}
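
The guard added to `registry.init` rejects quantiles for the `'local'` driver. A rough stand-alone illustration in plain Lua follows; `init_local` is a hypothetical stand-in, and plain `assert`/`error` replace `dev_checks` and the `errors` class used in the actual diff.

```lua
-- Hypothetical stand-in for `registry.init` of the 'local' driver.
local function init_local(opts)
    -- Rough equivalent of dev_checks({ quantiles = 'boolean' }).
    assert(type(opts) == 'table' and type(opts.quantiles) == 'boolean',
        'opts.quantiles must be a boolean')

    -- The local registry keeps only count and total time,
    -- so any request for quantiles is rejected.
    if opts.quantiles ~= false then
        -- Level 0 keeps the message free of file and line information.
        error("Quantiles are not supported for 'local' statistics registry", 0)
    end

    return true
end

print(init_local({ quantiles = false }))

-- pcall turns the raised error into a status and a message.
local ok, err = pcall(init_local, { quantiles = true })
print(ok, err)
```

With this guard, a configuration such as `stats_driver = 'local'` combined with `stats_quantiles = true` fails at initialization instead of silently ignoring the quantiles option.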
