Enable chunking for large models in AOF, RDB, replication and MODELGET #387

lantiga · 2020-05-25T22:01:00Z

Addresses #386

Tests are missing, we need a strategy for downloading a large (512 MB) model from CI.

Note: this PR changes the format of RDB. Versioning is handled, ie older RDB are still loaded correctly.

codecov · 2020-05-25T22:27:36Z

Codecov Report

Merging #387 into master will increase coverage by 0.36%.
The diff coverage is 67.56%.

@@            Coverage Diff             @@
##           master     #387      +/-   ##
==========================================
+ Coverage   72.60%   72.96%   +0.36%     
==========================================
  Files          21       21              
  Lines        4424     4472      +48     
==========================================
+ Hits         3212     3263      +51     
+ Misses       1212     1209       -3

Impacted Files	Coverage Δ
src/script.c	`63.78% <50.00%> (ø)`
src/config.c	`28.57% <59.09%> (+9.82%)`	⬆️
src/model.c	`68.68% <59.37%> (-0.53%)`	⬇️
src/redisai.c	`78.79% <94.11%> (+1.70%)`	⬆️
src/tensor.c	`83.50% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a5d1524...9db5d3b. Read the comment docs.

filipecosta90 · 2020-05-26T10:26:00Z

src/redisai.c

    RedisModule_ReplyWithCString(ctx, "blob");
-    RedisModule_ReplyWithStringBuffer(ctx, buffer, len);
+    const size_t n_chunks = len / RAI_CHUNK_LEN + 1;
+    RedisModule_ReplyWithArray(ctx, (long)n_chunks);


@lantiga this will break the clients correct? IMO we also need the change documented as soon as we merge this.

Yes it will. Part of this review was also to coordinate on this aspect.

as we've discussed today, let's reply with cstring if the total number of chunks is 1 ( for retro compatibility ) and reply with an array if more than 1 chunk.
Enabling a chunk size on load time or via ai.config should be also possible.

Both should be done in 18b61e5

We now need a chunking test and we are done

…ly if n_chunks==1

DvirDukhan · 2020-06-15T17:31:29Z

src/config.c

+        sizeof(*buffer));
+    sprintf(buffer, "%s: %lld", REDISAI_INFOMSG_MODEL_CHUNK_SIZE,
+            getModelChunkSize());
+    RedisModule_Log(ctx, "notice", buffer);


void REDISMODULE_API_FUNC(RedisModule_Log)(RedisModuleCtx *ctx, const char *level, const char *fmt, ...)
I think you can do
RedisModule_Log(ctx, "notice", "%s: %lld", REDISAI_INFOMSG_MODEL_CHUNK_SIZE, getModelChunkSize());

DvirDukhan · 2020-06-15T17:33:06Z

src/config.c

      RedisModule_Free(buffer);
    }
+  } else if (strcasecmp((key), "MODEL_CHUNK_SIZE") == 0) {
+    RedisAI_Config_ModelChunkSize(rsval);


Check error code?

DvirDukhan · 2020-06-15T17:44:20Z

src/redisai.c

+    long long chunk_size = getModelChunkSize();
+    const size_t n_chunks = len / chunk_size + 1;
+    if (n_chunks > 1) {
+      RedisModule_ReplyWithArray(ctx, (long)n_chunks);
+      for (size_t i=0; i<n_chunks; i++) {
+        size_t chunk_len = i < n_chunks - 1 ? chunk_size : len % chunk_size;
+        RedisModule_ReplyWithStringBuffer(ctx, buffer + i * chunk_size, chunk_len);
+      }
+    }
+    else {
+      RedisModule_ReplyWithStringBuffer(ctx, buffer, len);
+    }


Reorder code or migrate to a function?

DvirDukhan · 2020-06-15T17:54:14Z

src/version.h

+/* This is the major / minor encver, to be used as
+   argument to RM_CreateDataType, which expects
+   encver to be < 1024 */
+static const long long RAI_ENC_VER_MM = RAI_ENC_VER == 999999 ? 1023 : RAI_ENC_VER / 100;


As far as I understand, current (master) RAI_ENC_VER is 0?
How do you plan to maintain updates to RAI_ENC_VER in the objects RDB decoders?

Current master is RAI_ENC_VER 999999, while for 1.0 RAI_ENC_VER is 10000 (major 1, minor 00, patch 00).

ok
So there is a need to update a minor version once this update is merged, right?
For now, the decoding of objects from RDB is checking if RAI_ENC_VER > 100 for the new partitioned decode, otherwise it will decode as a single buffer. This is ok for now since there are only two "official" versions. Once there will be additional modifications in encoding/decoding objects to RDB I suggest moving to a dedicated decoding function per encoding version.

Yes, if we release this will need to be a minor (1.1.0)

#387) * Enable chunking for large models in AOF, RDB, replication and MODELGET * Remove spurious printfs * Make chunk size configurable via MODEL_CHUNK_SIZE. Return blob directly if n_chunks==1 * Make chunk size configurable, add tests * Update docs * Address review comments

lantiga requested a review from filipecosta90 May 26, 2020 08:17

filipecosta90 reviewed May 26, 2020

View reviewed changes

lantiga added 3 commits June 10, 2020 15:25

Enable chunking for large models in AOF, RDB, replication and MODELGET

52a505f

Remove spurious printfs

21ed2fa

Make chunk size configurable via MODEL_CHUNK_SIZE. Return blob direct…

18b61e5

…ly if n_chunks==1

lantiga force-pushed the chunking branch from 25626db to 18b61e5 Compare June 10, 2020 16:37

lantiga added 2 commits June 15, 2020 09:04

Make chunk size configurable, add tests

8eb70b5

Update docs

ccb2d2f

lantiga requested a review from filipecosta90 June 15, 2020 08:16

DvirDukhan reviewed Jun 15, 2020

View reviewed changes

Address review comments

9db5d3b

DvirDukhan approved these changes Jun 15, 2020

View reviewed changes

lantiga merged commit f7fbb29 into master Jun 15, 2020

lantiga deleted the chunking branch June 15, 2020 20:58

filipecosta90 mentioned this pull request Aug 24, 2020

v1.0 cherry picks #450

Merged

Enable chunking for large models in AOF, RDB, replication and MODELGET #387

Enable chunking for large models in AOF, RDB, replication and MODELGET #387

Uh oh!

Conversation

lantiga commented May 25, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented May 25, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

lantiga commented May 25, 2020 •

edited

Loading

codecov bot commented May 25, 2020 •

edited

Loading