Skip to content

Conversation

@ariutta
Copy link
Contributor

@ariutta ariutta commented Jan 27, 2016

The URL parser had a full and a simple option, but the simple option was never used. Removing the unused simple option will speed up the code and make the total file size smaller.

I also pulled jsonld.url.parse and _removeDotSegments into their own modules to make it easier to read the main jsonld.js file. These modules are currently referenced in package.json under my github account, but I'd be happy to transfer these repos to digitalbazaar or the jsonld.js organization.

@dlongley
Copy link
Member

In the near future, we're likely going to break jsonld.js into a number of different files and then add a build system to cobble them back together. Our plan is to do this all at once, but it's unclear what will get its own module and what will just be a file here in this repo. There are bits that definitely should get their own module (eg: we're working on this https://github.com/digitalbazaar/rdf-canonize/tree/dev), but it's doubtful we'll break out things like the _removeDotSegments into separate modules because of the maintenance overhead that generates.

@ariutta
Copy link
Contributor Author

ariutta commented Mar 1, 2016

@dlongley: sure, I understand the maintenance overhead. The Babel project uses a monorepo that contains multiple NPM packages. Here's a description of how they do it and the tool they created to manage this: https://github.com/kittens/lerna

I undid the splitting out of _removeDotSegments. Are you onboard with the other changes?

@ariutta
Copy link
Contributor Author

ariutta commented Mar 1, 2016

I really need to minimize the file size for browser usage. That's why I removed the unused simple option from the URL parser.

@dlongley
Copy link
Member

dlongley commented Mar 1, 2016

The nextTick changes won't work because it should be using setImmediate when it's available to allow I/O to be processed during normalization of large/complex datasets. However, we're separating out the normalization/canonization stuff entirely right now so we can probably just leave that alone and wait for those changes to be done.

Regarding the url parser change, including another library using a synchronous require will break existing AMD setups.

We really need to get a build system in place to cobble together the bits people need so people can pick and choose what they want if they'd like to create custom builds to cut down on file size for the browser. Again, people may not need normalization/canonization at all (or maybe they don't need framing) -- and leaving those out will result in far better savings in terms of size than a few lines in the URL parser.

@ariutta
Copy link
Contributor Author

ariutta commented Mar 2, 2016

Replaced nextTick polyfill with setImmediate polyfill.

@ariutta ariutta changed the title Break out url.parse and _removeDotSegments into own modules. Modularize code Mar 3, 2016
@ariutta
Copy link
Contributor Author

ariutta commented Mar 4, 2016

Test Statuses

node tests: all pass on my system, but I'm not sure how to set the JSDIR env var to ./dist/node for your Travis CI system.

browser tests: appear to be failing only due to a bug with the test code, not because of an actual bug in the library. It appears to work fine in the browser. update When I changed tests/test.js to use phantom.injectJs instead of require to add ./js/jsonld.js, the tests all pass. Additionally, you can see it running every example from the main README if you visit this test page and take a look at your browser console to view the results from test.js:

http://plnkr.co/edit/jkiQVUQBbFdZi2UoxGJQ?p=preview

node-document-loader tests: fail, because this version is using superagent and superagent-cache instead of the existing node document loader. This change allows for re-enabling caching in both node and the browser.

@ariutta
Copy link
Contributor Author

ariutta commented Mar 4, 2016

The more recent pushes implement a proposed version of a modularized system with control for customized builds. This system supports tree shaking, which "only includes the bits of code your bundle actually needs to run," as implemented by Rollup. If a user only needs to expand data, tree shaking allows that user to pull in just the code needed to run jsonld.expand, skipping all the extra unused code for compacting, framing, normalization, etc.

The build system by default creates ./dist/node/jsonld.js and ./dist/browser/jsonld.js via a postinstall lifecycle script. These files can also be generated from the command line with npm run build. The rollup.config.*.js files are where the build preferences are set. It would be straightforward to add a build option to allow a user to specify only the parts of the jsonld API that they need in a custom build.

For the browser, the build system generates bundles that have the umd format:

umd – Universal Module Definition, works as amd, cjs and iife all in one
from the Rollup docs

@davidlehn
Copy link
Member

We did want to be more modular, but ~128 files is a level far beyond modular! Individual files for single constants is an unmaintainable coding style, don't you think? There are tradeoffs between importing just what you need and making it crazy difficult to edit the code. I think the approach here is on the difficult to use side. How many bytes is this saving in a common use case where someone is using many of the top level API functions?

I assume the reason for this file split up is to leverage tree shaking of ES2015 tools. That is a good goal, but rollup and similar tools do do some dead code elimination too. Have you tried using larger files with many exports and seeing how well they get optimized when run through, say, rollup with uglify-js?

The normalization code is already in the process of breaking out into a different module entirely (for various reasons):
https://github.com/digitalbazaar/rdf-canonize/tree/dev

Another issue is that I don't think we want to use ES2015 modules in jsonld.js at this time. I think our perspective at Digital Bazaar is to keep the code able to run with default LTS nodejs options for ease of development and wider compatibility. For a number of our projects, including jsonld.js, we are exploring various tools (rollup, webpack, browserify, etc) to build browser compatible blobs out of a CommonJS based module style. As nodejs and v8 gain more mature ES2015 module support and usage guidelines we'll consider switching over.

@ariutta
Copy link
Contributor Author

ariutta commented Mar 4, 2016

Regarding the modules for individual variables, definitely agree that's
overkill. This is a first draft for starting discussion, and as a first
cut, I just did an export for every top-level variable and function
declaration. Now we can decide where to more appropriately draw the line.

Regarding ES2015, this version is actually ES5 for Node and the browser.
Look at the "main" property in package.json. The ES2015 code is for the
"esnext:main" property in package.json.
On Mar 3, 2016 6:33 PM, "David I. Lehn" [email protected] wrote:

We did want to be more modular, but ~128 files is a level far beyond
modular! Individual files for single constants is an unmaintainable coding
style, don't you think? There are tradeoffs between importing just what you
need and making it crazy difficult to edit the code. I think the approach
here is on the difficult to use side. How many bytes is this saving in a
common use case where someone is using many of the top level API functions?

I assume the reason for this file split up is to leverage tree shaking of
ES2015 tools. That is a good goal, but rollup and similar tools do do some
dead code elimination too. Have you tried using larger files with many
exports and seeing how well they get optimized when run through, say,
rollup with uglify-js?

The normalization code is already in the process of breaking out into a
different module entirely (for various reasons):
https://github.com/digitalbazaar/rdf-canonize/tree/dev

Another issue is that I don't think we want to use ES2015 modules in
jsonld.js at this time. I think our perspective at Digital Bazaar is to
keep the code able to run with default LTS nodejs options for ease of
development and wider compatibility. For a number of our projects,
including jsonld.js, we are exploring various tools (rollup, webpack,
browserify, etc) to build browser compatible blobs out of a CommonJS based
module style. As nodejs and v8 gain more mature ES2015 module support and
usage guidelines we'll consider switching over.


Reply to this email directly or view it on GitHub
#126 (comment)
.

@ariutta
Copy link
Contributor Author

ariutta commented Mar 5, 2016

Moved all top-level constants into a single file.

@ariutta
Copy link
Contributor Author

ariutta commented Mar 6, 2016

Size comparisons for bundles to be used in the browser, all minified:

  • current jsonld.js: 120,758 B
  • this version, full, as in plunkr demo: 90,309 B
  • this version with just jsonld.expand: 69,366 B (not yet even realizing the full potential of tree-shaking, because the Processor is currently one single large module.)

@leipert
Copy link

leipert commented Sep 2, 2016

@ariutta @dlongley
What is the state of this. How can I help?

@dlongley
Copy link
Member

dlongley commented Sep 2, 2016

@leipert -- thanks for offering to help. I think this work has stalled, so feel free to do what you can. My comment here lays out what we're looking for:

#126 (comment)

@dlongley
Copy link
Member

dlongley commented Sep 2, 2016

@leipert,

I'll add a caveat that we don't necessarily need to complete rdf-canonize before the reorganization happens here... and we may find that we can finish both off based on the efforts here. rdf-canonize is essentially just waiting on a good module infrastructure/framework to copy from this library. It has the same problem, except it has been split up already -- it just needs the capability to be sewn back together.

@ariutta
Copy link
Contributor Author

ariutta commented Sep 2, 2016

My work was really to spark discussion and get something going, but it is
not production ready. I'm not going to be able to get back to this anytime
soon, so feel free to use anything useful and ignore everything else from
my refactoring.

On Sep 2, 2016 6:49 AM, "Dave Longley" [email protected] wrote:

@leipert https://github.com/leipert,

I'll add a caveat that we don't necessarily need to complete rdf-canonize
before the reorganization happens here... and we may find that we can
finish both off based on the efforts here. rdf-canonize is essentially just
waiting on a good module infrastructure/framework to copy from this
library. It has the same problem, except it has been split up already -- it
just needs the capability to be sewn back together.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#126 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABPJgwc84CZnsqSNb-E1L_c16ZuxN6b9ks5qmClNgaJpZM4HNFFi
.

@BigBlueHat
Copy link
Contributor

First, thanks to @ariutta for taking the first crack at this!

Second, I've been reading up on System.js which @tilgovi's suggested would be a good fit for Apache Annotator.

I'm not keen on chasing the "new shiny thing" but it seems these folks have been at this for awhile, it has a pretty proven track record and can do some fabulous async loading and/or bundling and the like.

Would there be any interest in a PR attempt at system.js usage for this repo?

Cheers!
🎩

@dlongley
Copy link
Member

@BigBlueHat, @davidlehn is working on this problem right now ... so don't do anything just yet without him chiming in. :)

@davidlehn davidlehn mentioned this pull request Jul 26, 2017
22 tasks
@dlongley
Copy link
Member

Closing this is out-of-date -- we long ago did a rewrite to better modularize the code. Any further refactoring should be done via a new PR.

@dlongley dlongley closed this May 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants