Commit b0c682b

Doc: use custom sharding key to calculate bucket id
Describe functionality and current limitations (#212, #213 and #219) with custom sharding key in CHANGELOG and README. Closes #166
1 parent 110b9f8 commit b0c682b

2 files changed: 62 additions, 8 deletions

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -24,6 +24,10 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
2424
* `crud.len()` function to calculate the number of tuples
2525
in the space for memtx engine and calculate the maximum
2626
approximate number of tuples in the space for vinyl engine.
27+
* CRUD operations calculate bucket id automatically using the sharding
28+
key specified with DDL schema or in the `_ddl_sharding_key` space.
29+
NOTE: CRUD methods delete(), get() and update() require the sharding key
30+
to be a part of the primary key.
2731

2832
## [0.8.0] - 02-07-21
2933

README.md

Lines changed: 58 additions & 8 deletions
@@ -53,11 +53,58 @@ crud.unflatten_rows(res.rows, res.metadata)
5353
**Notes:**
5454

5555
* A space should have a format.
56-
* By default, `bucket_id` is computed as `vshard.router.bucket_id_strcrc32(key)`,
57-
where `key` is the primary key value.
58-
Custom bucket ID can be specified as `opts.bucket_id` for each operation.
59-
For operations that accepts tuple/object bucket ID can be specified as
60-
tuple/object field as well as `opts.bucket_id` value.
56+
57+
**Sharding key and bucket id calculation**
58+
59+
*Sharding key* is a set of tuple field values used to calculate the *bucket ID*.
60+
*Sharding key definition* is a set of tuple field names that describes which
61+
tuple fields are part of the sharding key. *Bucket ID* determines which
62+
replicaset stores certain data. The function used to calculate the bucket ID is
63+
called the *sharding function*.
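
The relationship between a sharding key definition (field names) and a sharding key (field values) can be sketched in plain Lua. The helper `extract_sharding_key`, the space format and the sample tuple below are hypothetical illustrations and are not part of the CRUD API.

```lua
-- Hypothetical helper (not a CRUD function): builds a sharding key, that is
-- a list of field values, from a tuple, using the space format and a
-- sharding key definition, that is a list of field names.
local function extract_sharding_key(tuple, space_format, sharding_key_def)
    -- Map each field name to its position in the tuple.
    local fieldno_by_name = {}
    for fieldno, field in ipairs(space_format) do
        fieldno_by_name[field.name] = fieldno
    end

    local sharding_key = {}
    for _, field_name in ipairs(sharding_key_def) do
        local fieldno = fieldno_by_name[field_name]
        if fieldno == nil then
            return nil, ('no field %q in space format'):format(field_name)
        end
        table.insert(sharding_key, tuple[fieldno])
    end
    return sharding_key
end

-- Assumed example space format and tuple.
local format = {
    {name = 'id', type = 'unsigned'},
    {name = 'bucket_id', type = 'unsigned'},
    {name = 'name', type = 'string'},
    {name = 'age', type = 'number'},
}
local tuple = {1, 477, 'Elizabeth', 24}

-- Sharding key definition {'name', 'age'} selects the values of those fields.
local key = extract_sharding_key(tuple, format, {'name', 'age'})

-- On a configured vshard router the default sharding function would then be
-- applied as: vshard.router.bucket_id_strcrc32(key)
```

Here `key` holds the values of the `name` and `age` fields in the order given by the definition; the final sharding function call is left as a comment because it needs a running vshard router.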
64+
65+
By default, CRUD calculates the bucket ID from the primary key using the function
66+
`vshard.router.bucket_id_strcrc32(key)`. This happens automatically and requires
67+
no action from the user. A user can also calculate the bucket ID externally and
68+
pass it as an option to CRUD methods that accept a tuple or an object (see the
69+
`bucket_id` option below).
70+
71+
In versions after 0.8.0, users who don't want to use the primary key as a
72+
sharding key may set a custom sharding key definition as a part of the [DDL
73+
schema](https://github.com/tarantool/ddl#input-data-format) or insert it manually
74+
into the `_ddl_sharding_key` space (in both cases, consult the DDL module
75+
documentation). As soon as a sharding key for a certain space is available in
76+
the `_ddl_sharding_key` space, CRUD uses it for bucket ID calculation
77+
automatically. Note that the CRUD methods `delete()`, `get()` and `update()`
78+
require the sharding key to be a part of the primary key.
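
A custom sharding key definition in a DDL schema might look like the sketch below. The space name `customers`, its fields and its indexes are assumptions made for illustration, and the table layout follows the DDL input data format as described in the DDL module documentation, which remains the authoritative source. The primary index includes `name` so that the sharding key is a part of the primary key, as `delete()`, `get()` and `update()` require.

```lua
-- Illustrative schema only: the space name, fields and indexes are assumed.
local schema = {
    spaces = {
        customers = {
            engine = 'memtx',
            is_local = false,
            temporary = false,
            format = {
                {name = 'id', type = 'unsigned', is_nullable = false},
                {name = 'bucket_id', type = 'unsigned', is_nullable = false},
                {name = 'name', type = 'string', is_nullable = false},
                {name = 'age', type = 'number', is_nullable = false},
            },
            indexes = {
                {
                    -- Primary key contains the sharding key field `name`.
                    name = 'id', type = 'TREE', unique = true,
                    parts = {
                        {path = 'id', type = 'unsigned', is_nullable = false},
                        {path = 'name', type = 'string', is_nullable = false},
                    },
                },
                {
                    name = 'bucket_id', type = 'TREE', unique = false,
                    parts = {
                        {path = 'bucket_id', type = 'unsigned', is_nullable = false},
                    },
                },
            },
            -- Sharding key definition: a list of field names.
            sharding_key = {'name'},
        },
    },
}

-- On a running instance the schema would be applied with the DDL module,
-- for example: require('ddl').set_schema(schema)
```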
79+
80+
The table below shows which operations support a custom sharding key:
81+
82+
| CRUD method | Sharding key support |
83+
| -------------------------------- | -------------------------- |
84+
| `get()` | Yes |
85+
| `insert()` / `insert_object()` | Yes |
86+
| `delete()` | Yes |
87+
| `replace()` / `replace_object()` | Yes |
88+
| `upsert()` / `upsert_object()` | Yes |
89+
| `select()` / `pairs()` | Yes |
90+
| `update()` | Yes |
93+
| `min()` / `max()` | No (not required) |
94+
| `cut_rows()` / `cut_objects()` | No (not required) |
95+
| `truncate()` | No (not required) |
96+
| `len()` | No (not required) |
97+
98+
Current limitations of using a custom sharding key:
99+
100+
- Sharding keys are not updated automatically when the schema is
101+
updated on storages, see [#212](https://github.com/tarantool/crud/issues/212).
102+
However, they can be updated manually with
103+
`sharding_key.update_sharding_keys_cache()`.
104+
- CRUD select may lead to map-reduce in some cases, see
105+
[#213](https://github.com/tarantool/crud/issues/213).
106+
- No JSON path support for the sharding key, see
107+
[#219](https://github.com/tarantool/crud/issues/219).
61108

62109
### Insert
63110

@@ -115,7 +162,8 @@ local object, err = crud.get(space_name, key, opts)
115162
where:
116163

117164
* `space_name` (`string`) - name of the space
118-
* `key` (`any`) - primary key value
165+
* `key` (`any`) - primary key value in versions <= 0.8.0, or sharding key when a
166+
DDL sharding key is used in versions > 0.8.0. See 'Sharding key and bucket id calculation' above.
119167
* `opts`:
120168
* `fields` (`?table`) - field names for getting only a subset of fields
121169
* `bucket_id` (`?number|cdata`) - bucket ID
@@ -152,7 +200,8 @@ local object, err = crud.update(space_name, key, operations, opts)
152200
where:
153201

154202
* `space_name` (`string`) - name of the space
155-
* `key` (`any`) - primary key value
203+
* `key` (`any`) - primary key value in versions <= 0.8.0, or sharding key when a
204+
DDL sharding key is used in versions > 0.8.0. See 'Sharding key and bucket id calculation' above.
156205
* `operations` (`table`) - update [operations](https://www.tarantool.io/en/doc/latest/reference/reference_lua/box_space/#box-space-update)
157206
* `opts`:
158207
* `timeout` (`?number`) - `vshard.call` timeout (in seconds)
@@ -185,7 +234,8 @@ local object, err = crud.delete(space_name, key, opts)
185234
where:
186235

187236
* `space_name` (`string`) - name of the space
188-
* `key` (`any`) - primary key value
237+
* `key` (`any`) - primary key value in versions <= 0.8.0, or sharding key when a
238+
DDL sharding key is used in versions > 0.8.0. See 'Sharding key and bucket id calculation' above.
189239
* `opts`:
190240
* `timeout` (`?number`) - `vshard.call` timeout (in seconds)
191241
* `bucket_id` (`?number|cdata`) - bucket ID
