The HTML Highlighter is a JavaScript module that solves these problems:
- Display colorful highlights on words in a live Web page that are determined by either or both of these sources:
  - User selections, identified by a user dragging their pointer over a portion of a page, possibly covering multiple tags in the DOM tree.
  - Machine selections, identified by a program, which might run in the browser or in a server-side environment that processes the HTML and text of a page to decide which portions of content should be marked.
- Provide these offsets to either JavaScript or backend tools. StreamCorpus Pipeline is being extended to provide translation between the relative offsets generated by HTML Highlighter and the absolute character offsets used by many backend text processing tools.
- Provide objects isomorphic to JavaScript's Range object, which has character offsets relative to DOM nodes identified by XPaths:
  {
    start: {
      xpath: <string>, // unique address to DOM node
      offset: <int>    // relative character offset
    },
    end: {
      xpath: <string>, // unique address to DOM node
      offset: <int>    // relative character offset
    }
  }
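As an illustration of the descriptor shape above, the following minimal sketch converts anything shaped like a DOM Range into a start/end pair of XPath and offset. The helper names xpathOf and describeRange are hypothetical and are not part of the HTML Highlighter API, and the XPath style (1-based positions among siblings of the same kind) is an assumption; the library may address nodes differently.

```javascript
// Hypothetical helper, not part of HTML Highlighter's API: build an
// XPath-like address for a node by walking up through parentNode.
function xpathOf(node) {
  const steps = [];
  for (let cur = node; cur && cur.parentNode; cur = cur.parentNode) {
    // Text nodes (nodeType 3) are addressed as text(), elements by tag name.
    const isText = cur.nodeType === 3;
    const name = isText ? 'text()' : cur.nodeName.toLowerCase();
    // XPath positions are 1-based and count only siblings of the same kind.
    const same = Array.from(cur.parentNode.childNodes).filter(
      (n) => n.nodeType === cur.nodeType && n.nodeName === cur.nodeName
    );
    steps.unshift(`${name}[${same.indexOf(cur) + 1}]`);
  }
  return '/' + steps.join('/');
}

// Convert a Range-like object into the { start, end } descriptor shape.
function describeRange(range) {
  return {
    start: { xpath: xpathOf(range.startContainer), offset: range.startOffset },
    end: { xpath: xpathOf(range.endContainer), offset: range.endOffset },
  };
}
```

In a browser, a user selection could be passed in with describeRange(window.getSelection().getRangeAt(0)), assuming the selection is not empty.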
The inline comments and class documentation are sufficient for a JavaScript programmer to jump in and start using the library. To see an example, run:
$ git clone https://github.com/dossier/html-highlighter
$ cd html-highlighter
$ yarn install
$ yarn start
$ $BROWSER http://localhost:8080/examples/monolith/
Note the trailing slash in the URL path examples/monolith/. Omitting the slash results in a page that loads but does not run correctly.
The HTML Highlighter library relies on a number of dependencies that must be installed before any other step is taken. Install them with:
$ yarn install
When hacking on the HTML Highlighter, the following command removes the need to compile the code manually after each change: bundles are regenerated automatically as changes are made.
Generated assets can be accessed in a browser at http://localhost:8080; the port can be customized via the environment variable NODE_PORT.
$ yarn start
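To make the port override concrete, here is a minimal sketch of how a development server script might resolve its port. The variable name NODE_PORT and the default of 8080 come from the text above; the function name resolvePort and the validation rules are assumptions, and the project's actual configuration may read the variable differently.

```javascript
// Hypothetical helper: pick the dev-server port from the environment.
// Falls back to 8080 when NODE_PORT is unset or is not a valid TCP port.
function resolvePort(env = process.env) {
  const port = Number.parseInt(env.NODE_PORT, 10);
  const valid = Number.isInteger(port) && port > 0 && port < 65536;
  return valid ? port : 8080;
}

console.log(`Assets would be served at http://localhost:${resolvePort()}`);
```

From a shell, the variable can be set for a single run by prefixing the command, as in NODE_PORT=9000 yarn start.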
An alternative dynamic build mode relies on webpack's watch method to build bundles incrementally as changes are made. Assets are placed in the directory dist.
$ yarn start:watch
Finally, the command given below creates a static development build, composed of a minified bundle of the HTML Highlighter library with the suffix .min.js added to distinguish it from the development build artifact. All artifacts are placed inside the dist directory, and its existing contents are left untouched. Note that this command must be run each time changes are made to the code.
$ yarn build:min
Creating a production build takes a single command and builds only the HTML Highlighter library, fully optimized and minified. Everything else is omitted.
$ yarn prepublish
Tests run in a standard terminal environment. Execute them with the following command:
$ yarn test
Note that tests relying on the document.createRange function are skipped because jsdom, the virtual DOM environment used, does not provide an implementation.
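One way to express this kind of conditional skip is sketched below. It is not taken from the project's test suite: the helper names supportsCreateRange and rangeTest are hypothetical, and the sketch assumes a runner such as Mocha or Jest, whose it function exposes an it.skip variant.

```javascript
// True when the given document object can create DOM Ranges.
function supportsCreateRange(doc) {
  return Boolean(doc) && typeof doc.createRange === 'function';
}

// Hypothetical helper: return the runner's normal test function, or its
// `skip` variant when Range support is missing. `testFn` is assumed to be
// Mocha's or Jest's `it`, both of which expose `it.skip`.
function rangeTest(testFn, doc) {
  return supportsCreateRange(doc) ? testFn : testFn.skip;
}

// Intended use inside a test file (not executed here):
//   const itWithRange = rangeTest(it, global.document);
//   itWithRange('highlights a selection', () => { /* ... */ });
```

Feature detection keeps the same test file usable both in a real browser, where the Range tests run, and in an environment that lacks document.createRange, where they are reported as skipped rather than failed.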
To release this package to npm, run the following command, where <type> is the kind of version bump (for example patch, minor, or major):
$ npm version <type>
Once that command has been run, push the new commit and its tag:
$ git push --follow-tags
The automated build in CircleCI will then publish the package to NPM.