[v4] added wasm cache #1471
Conversation
Indeed, caching the mjs file would be necessary to ensure full offline support, and IMO is essential before adding a caching feature like the one this PR proposes. Any ideas for how we could do this? Perhaps bundling…
xenova
left a comment
Very nice + clean PR! Now we just need to figure out how to completely remove the ort-wasm-simd-threaded.jsep.mjs dependency.
So after another deep dive into onnxruntime, I figured out that it's actually no problem at all to load the wasm factory (.mjs) as a blob, which allows us to load it from cache. On top of that, I also did some refactoring of hub.js. My goal is to keep large files that only have a handful of exported methods as clean as possible by extracting some helper functions and constants into separate files. I also wanted to improve the caching (which is now used not only in hub.js but also in backends/onnx.js), so I created a helper function as well as an interface, "CacheInterface", that any given cache has to implement.
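For illustration, here is a minimal sketch of what a cache satisfying such a CacheInterface could look like. The interface itself isn't shown in this thread, so this assumes it mirrors the subset of the Web Cache API used by hub.js (match/put); the class below is hypothetical, not code from the PR.

```js
/**
 * Assumed shape of the "CacheInterface" mentioned above (not the PR's actual definition):
 * @typedef {Object} CacheInterface
 * @property {(key: string) => Promise<Response|undefined>} match
 * @property {(key: string, response: Response) => Promise<void>} put
 */

// Example: an in-memory cache implementing that assumed interface.
class InMemoryCache {
    constructor() {
        this.store = new Map();
    }
    async match(key) {
        const cached = this.store.get(key);
        // Return a clone so the stored Response body can be read more than once.
        return cached ? cached.clone() : undefined;
    }
    async put(key, response) {
        // Clone before storing so the caller can still consume the original body.
        this.store.set(key, response.clone());
    }
}
```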
Very worthwhile refactor -- thanks!
I wonder if you think the following feature could be useful: design some form of CacheRegistry class which a user could import from the library... like
import { cache } from '@huggingface/transformers';
// check if model is cached
// cache.match('org/model') or something -- API should be well-designed, returning a list/map of files that are cached for this model maybe?

// remove all files cached for this model
// cache.delete('org/model')

I think we can draw inspiration from the hf cache CLI tool:
hf cache --help
Usage: hf cache [OPTIONS] COMMAND [ARGS]...

  Manage local cache directory.

Options:
  --help  Show this message and exit.

Commands:
  ls      List cached repositories or revisions.
  prune   Remove detached revisions from the cache.
  rm      Remove cached repositories or revisions.
  verify  Verify checksums for a single repo revision from cache or a...

This may not need to be added in this PR, but maybe something to discuss here.
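To make the idea concrete, here is one possible shape for such a CacheRegistry, loosely modelled on the hf cache subcommands above. Everything in this sketch is hypothetical (the class and method names, and the assumption that files live in the browser Cache API under a single cache key); it is meant as a discussion starting point, not existing library code.

```js
// Hypothetical API sketch -- nothing here exists in the library yet.
class CacheRegistry {
    constructor(cacheKey = 'transformers-cache') {
        this.cacheKey = cacheKey;
    }

    /** List the cached file URLs that belong to a given model, e.g. 'org/model'. */
    async list(modelId) {
        const cache = await caches.open(this.cacheKey);
        const requests = await cache.keys();
        return requests.map((req) => req.url).filter((url) => url.includes(modelId));
    }

    /** Remove every cached file that belongs to a given model; returns how many were deleted. */
    async delete(modelId) {
        const cache = await caches.open(this.cacheKey);
        const requests = await cache.keys();
        let removed = 0;
        for (const req of requests) {
            if (req.url.includes(modelId) && await cache.delete(req)) {
                removed++;
            }
        }
        return removed;
    }
}
```

Usage would then look something like `const registry = new CacheRegistry(); await registry.delete('org/model');`. As noted in the reply below, the open question is that the full set of files a model will load is not known upfront, so matching the model id against cached URLs is only an approximation.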
I like the CacheRegistry! The only problem is that we normally don't know upfront all the files a model will load/expect (although I think it would be great to add that as well).
src/backends/utils/cacheWasm.js
Outdated
if (cache) {
    try {
        return await cache.match(url);
    } catch (e) {
        console.warn(`Error reading ${fileName} from cache:`, e);
If await cache.match(url) returns undefined (i.e., the file is not in the cache), then we return undefined from this function... and the fetch below is never called (meaning, the file is never cached).
Suggested change:

-    if (cache) {
-        try {
-            return await cache.match(url);
-        } catch (e) {
-            console.warn(`Error reading ${fileName} from cache:`, e);
+    if (cache) {
+        try {
+            const result = await cache.match(url);
+            if (result) {
+                return result;
+            }
+        } catch (e) {
+            console.warn(`Error reading ${fileName} from cache:`, e);
seems to fix it
made this change.
…ransformers.js into v4-cache-wasm-file
Don't throw an error if we can't open the cache or load the file from the cache but we are still able to make the request.
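For reference, the overall pattern these changes converge on can be sketched as follows: return early only on an actual cache hit, never throw when the cache itself fails, and otherwise fall back to the network (caching the response on the way out). The helper name and signature here are illustrative, not the PR's exact code.

```js
// Sketch only -- assumes the browser Cache API; names are illustrative.
async function fetchWithCache(url, fileName, cacheKey = 'transformers-cache') {
    let cache;
    try {
        cache = await caches.open(cacheKey);
    } catch (e) {
        // Cache unavailable (e.g. insecure context): warn and continue without it.
        console.warn(`Unable to open cache "${cacheKey}":`, e);
    }

    if (cache) {
        try {
            const result = await cache.match(url);
            if (result) {
                // Cache hit: skip the network entirely.
                return result;
            }
        } catch (e) {
            console.warn(`Error reading ${fileName} from cache:`, e);
        }
    }

    // Cache miss (or cache unusable): fall back to the network.
    const response = await fetch(url);
    if (cache && response.ok) {
        try {
            // Clone before caching so the original body can still be consumed.
            await cache.put(url, response.clone());
        } catch (e) {
            console.warn(`Error writing ${fileName} to cache:`, e);
        }
    }
    return response;
}
```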
xenova
left a comment
Thanks! 🚀 Had time today to finish the review, and it works well! I had to make one small adjustment (only return when await cache.match(url) matches... not when undefined), and it's good to go :)
I also like the refactor out of the monolithic hub.js file.
I tested it on the recent chatterbox webgpu demo, and it now runs fully offline thanks to this PR! 🔥
This adds caching of the wasm binary.
I also added an env.cacheKey so developers can modify the cache key. By default it will be transformers-cache, but they are free to set something related to their app.
Note: this will only cache the wasm file, so there will still be a request to https://cdn.jsdelivr.net/ for the mjs file. In my opinion, that should be cached via a service worker if the application requires full offline support.
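A minimal usage sketch, assuming env.cacheKey is exposed on the exported env object as described above (the cacheKey property is the addition from this PR; the rest is standard transformers.js usage):

```js
import { env, pipeline } from '@huggingface/transformers';

// Use an app-specific cache key instead of the default 'transformers-cache'.
env.cacheKey = 'my-app-cache';

// The wasm binary fetched by the ONNX backend is now cached under that key,
// alongside the model files downloaded via hub.js.
const classifier = await pipeline('sentiment-analysis');
```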