Skip to content

Conversation

ericvergnaud
Copy link
Contributor

@ericvergnaud ericvergnaud commented Dec 13, 2023

Hi, following-up on issue #6145.

This is a first iteration, that already covers all features of index.d.ts in binaryen.ts repo, plus a few additions: TypeBuilder and minimal arrays (WIP).

The purpose of this PR is to ensure that the proposed approach is fine, notably because it is a bit different from the one originally envisaged i.e. produce the binaryen.js-post.js file from a ts file. I tried that approach and it failed, so submitting a new approach.

Why did the original approach fail ?

  • the generated js file contains a single function (Binaryen) that needs to be invoked before accessing wasm entry points. This is fine but it makes it impossible to specify consts and enums in the ts code itself. Rather, they need to be populated once the code is loaded (in binaryen_wasm, this is done by the initializeConstants method, called after the module is loaded). Overriding consts and enums in Typescript is not supported.
  • in the existing implementation, the Module 'class' wraps the entire module by adding methods to it, rather than using it from a separated class. This works in JS (and a manually built .d.ts file) but is impossible to mimic when generating JS from Typescript.
  • some components of the tool chain fail with the JS generated from Typescript.

Overall, working around all these issues would have led to breaking changes and extremely hacky code.

What is the proposed approach

The proposed approach in this PR is to create a new 'binaryen_wasm_ts' target, that works as follows:

  • it leaves the existing targets untouched - no breaking change
  • it enriches the generated binaryen_wasm_ts.js file with the minimal set of entry points required for running the Typescript target
  • it provides a Typescript file that loads the generated binaryen_wasm_ts.js file, such that consts and enums can be initialized with runtime values. Also, more classes that wrap binaryen entry points can be declared in a simple manner.

The new approach provides a significant benefit over binaryen_wasm in that the generated files can be used directly without packaging, which is pretty useful for testing. I have migrated my code base to use it and haven't found any issues yet. It also rationalizes the API a bit, by grouping and renaming some wrapper functions for improved clarity.

The GC related stuff (TypeBuilder and 'arrays' section) is obviously incomplete, this part is WIP and will be enhanced through future PRs (if this one is accepted).

A caveat is that one of the original goals i.e. a 'single source of truth' is not met, because we now have 2: binaryen_wasm and binaryen_wasm_ts. That said, the current situation is worse because the index.d.ts file in binaryen.js repo is incorrect (it contains inexistent API calls, and some are missing parameters). I don't know enough about who's using binaryen_wasm to anticipate how that will evolve, but I suspect that many JS users will be happy with the generated binaryen_ts.js file, whilst I'm quite sure that Typescript users will find it beneficial to use the new target. If that happens, the binaryen_wasm target could be dropped in the future.

Building the target is achieved as follows:
emcmake cmake . && emmake make binaryen_wasm_ts

Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My high-level question here is whether it would be possible to de-duplicate a lot of this structure with the normal binaryen.js build.

Comment on lines +544 to +549
target_link_libraries(binaryen_wasm_ts "-sFILESYSTEM")
target_link_libraries(binaryen_wasm_ts "-sEXPORT_NAME=Binaryen")
target_link_libraries(binaryen_wasm_ts "-sNODERAWFS=0")
target_link_libraries(binaryen_wasm_ts "-sEXPORT_ES6")
target_link_libraries(binaryen_wasm_ts "-sEXPORTED_RUNTIME_METHODS=allocateUTF8OnStack,stringToAscii")
target_link_libraries(binaryen_wasm_ts "-sEXPORTED_FUNCTIONS=_malloc,_free")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to avoid having to repeat all these flags from the vanilla binaryen_wasm build? Do we need another target at all? Can we unconditionally emit the TS wrapper for all binaryen_js and binaryen_wasm builds instead?

Copy link
Contributor Author

@ericvergnaud ericvergnaud Dec 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An option I haven't considered is to apply for JS the same approach than TS i.e. rather than embed the JS code in the generated JS, have it live aside and consume the wasm. That way we would have a single wasm target generating:

  • binaryen_wasm.js (a trimmed down version of the current target)
  • binaryen_wasm_js.js (complements binaryen_wasm.js with the trimmed down stuff above)
  • binaryen_wasm_ts.js
  • binaryen_wasm_ts.d.ts

That's a breaking change in that the JS now requires 2 files instead of 1, but I don't anticipate anyone complaining about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually tried out this approach as an experiment, and technically speaking it would work. Let me know if that's a better idea and I'll evolve this PR accordingly.

That said it not only impacts the binaryen_wasm.js target, but also the binaryen_js.js one, since they share the binaryen.js-post.js post-generation file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kripken, what do you think about this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding adding another file here, that seems possible in general but we do build for node.js using -sSINGLE_FILE. I'm actually not sure why we do that, but if we want to add another file here we should probably look into removing SINGLE_FILE first.

Separately, could another option be to always build for TS and let users use a TS-to-JS transpiler if they don't want TS? How is this kind of thing usually done in the JS/TS ecosystem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I'm keeping the current JS is backwards compatibility. The build script already creates a JS from TS, but it's a bit different from the current one. If binaryen_wasm only has a few consumers then it's obviously simpler to drop the current one and advertise the breaking changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it's done the usual way I.e. run tsc, which generates the JS file, a d.ts file and a .map file for debugging. Pretty standard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks.

When you say "a bit different" is that stuff like code size or does it also behave differently? Is it a drop-in replacement?

Copy link
Contributor Author

@ericvergnaud ericvergnaud Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new file behaves differently:

  1. the current file returns a function which must be awaited in order to access the various entry points, whereas the new one directly exposes the entry points (the await is done by the generated file)
  2. a few entry points have been moved or renamed for consistency
  3. xxxID constants have been dropped, purely out of lazyness - I can't think of a scenario where one would need them

I can't change 1, but I can fix 2 and 3 if breaking change 1 is acceptable and that leads do a single source of truth and just 2 targets (like we have today) instead of 3.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried turning off SINGLE_FILE. The purpose of this flag seems to be the following:

  • the first generated file is a .wasm file
  • if SINGLE_FILE is off, a .js file is generated, with code to consume the .wasm data, but locating the file is left to the application
  • if SINGLE_FILE is on, the binary .wasm is serialized to base64 and embedded in the .js
    (that doesn't affect the async loading requirement which is the blocking point for typescript, because you can't declare const and enums at runtime)

@@ -0,0 +1,4 @@
/* the purpose of this file is simply to enable local compilation of the binaryen_ts.ts file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "enable local compilation" mean here? Is this tricking some tool into thinking that the generated file always exists?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"enable local compilation" means running tsc for binaryen_ts.ts outside the build. This is particularly useful for checking the typescript code during development.

@ericvergnaud
Copy link
Contributor Author

Closing in favor of a new PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants