This repository was archived by the owner on Apr 16, 2021. It is now read-only.

docs: add blog posts for npm-on-ipfs #215

Closed
wants to merge 4 commits into from
122 changes: 122 additions & 0 deletions content/post/73-putting-npm-on-ipfs-part-1.md
@@ -0,0 +1,122 @@
---
date: 2019-03-03
title: Putting npm on IPFS Part 1 - The Registry
author: Alex Potsides
---

[npm](https://www.npmjs.com) is the de facto package manager for the JavaScript ecosystem and the largest registry in the world, with more than [900k](https://replicate.npmjs.com/_all_docs) packages and over 7 billion downloads a week.

Today npm is incredibly fast and reliable thanks to the hard work put in by the NPM, Inc team. However, we couldn't stop ourselves from wondering: what if we put the registry on the distributed web and took it to a whole new level?

Having dependencies on the distributed web makes development more resilient: multiple nodes are available to supply tarballs, and some may even be on your local network, which lowers bandwidth costs, is faster and can work without an internet connection. These benefits are still a work-in-progress, but with further iteration there's great potential to evolve this cool experiment into a more production-ready demo.

We're going to look at how we put npm onto IPFS in two parts - this is part one, all about the registry, including design goals, implementation details and infrastructure concerns. It will be followed by part two, which describes a next-generation client you can use to install dependencies from the distributed web.

## 🗄️ registry.js.ipfs.io

We're maintaining a complete mirror of npm at https://registry.js.ipfs.io - the difference being it adds [CID](https://docs.ipfs.io/guides/concepts/cid)s to the metadata for every package which you can use to fetch packages from IPFS!

You can use the registry today by specifying the `--registry` parameter to npm:

```console
$ npm --registry=https://registry.js.ipfs.io install
```

...or Yarn:

```console
$ yarn --registry=https://registry.js.ipfs.io
```

This will instruct it to use our registry mirror instead of the default registry. You can add this to your config via:

```console
$ npm config set registry https://registry.js.ipfs.io
$ yarn config set registry https://registry.js.ipfs.io
```

### 📦 GET your dependencies, proxy everything else

Only `GET` requests are honoured by our mirror; all other requests are forwarded on to the npm registry, so if you publish while using our mirror, your package will still end up on the public registry. Request/response content is not logged, so your credentials are never at rest on our servers.

Everything is also served over https, unlike some very old npm modules which the npm registry serves over http.

## 👷‍♀️ How we built it

A group of [ipfs-npm-registry-mirror](https://github.com/ipfs-shipyard/ipfs-npm-registry-mirror) nodes is running. Each contains a [js-ipfs](https://github.com/ipfs/js-ipfs) instance used to share and resolve modules and is connected to the wider IPFS network. They also have a small http server used to respond to requests for module metadata and tarballs from the npm cli. These all sit behind an http load balancer that distributes traffic to the mirrors.

![Network topology](/73-putting-npm-on-ipfs-part-1/network-topology.png)

### 🗃️ The datastore

Each mirror has an [IPFS Repository](https://github.com/ipfs/specs/tree/master/repo) that is backed by AWS S3 via an instance of [datastore-s3](https://github.com/ipfs/js-datastore-s3) - this lets us deploy the service as a set of immutable containers, handle huge amounts of data cost effectively and scale up and down as required at the price of a small amount of latency on transfers.

![Datastore S3](/73-putting-npm-on-ipfs-part-1/datastore-s3.png)

If you'd like to leverage S3 for your IPFS node, check out the example of how to configure IPFS to use S3 in the [datastore-s3 examples folder](https://github.com/ipfs/js-datastore-s3/tree/v0.2.3/examples/full-s3-repo).

### 📝 Module metadata

Each module has a set of metadata that describes the versions that are available, along with the tarballs that make up the release. It is a JSON document that we store in the [MFS](https://docs.ipfs.io/guides/concepts/mfs/) under the directory `/npm-registry`, so the module [`ipfs`](https://www.npmjs.com/package/ipfs), for example, would look like:

```console
$ jsipfs files read /npm-registry/ipfs
{"_id":"ipfs","_rev":"122-28686ac76345db3f398b88ae73346a15","name":"ipfs","description":"JavaScript implementation of the IPFS specification","di...
```

The metadata for a [module on registry.js.ipfs.io](https://registry.js.ipfs.io/ipfs) is almost identical to that [on the public registry](https://registry.npmjs.org/ipfs). The only difference is that we also store the CID and the original download location of each tarball:

```javascript
{
...
"name": "ipfs",
...
"versions": {
"0.34.4": {
"name": "ipfs",
...
"dist": {
"tarball": "https://registry.js.ipfs.io/ipfs/-/ipfs-0.34.4.tgz",
"source": "https://registry.npmjs.org/ipfs/-/ipfs-0.34.4.tgz",
"cid": "bafybeiafts7s65iodk4wsetucwhury3cpk4fge374wxhdu5vzc4zuli4xi"
}
},
...
```
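As a minimal sketch of how a consumer might read that extra field, the snippet below looks up the CID for one version of a module. The helper name `getTarballCid` is hypothetical and not part of ipfs-npm-registry-mirror; the only assumption is that the metadata has the `versions[version].dist.cid` shape shown above.

```javascript
// Hypothetical helper (not part of ipfs-npm-registry-mirror): return the
// CID recorded for one version of a module, or undefined if there is none.
function getTarballCid (packument, version) {
  const release = packument.versions && packument.versions[version]

  if (!release || !release.dist) {
    return undefined
  }

  return release.dist.cid
}

// Trimmed copy of the metadata shown above
const packument = {
  name: 'ipfs',
  versions: {
    '0.34.4': {
      name: 'ipfs',
      dist: {
        tarball: 'https://registry.js.ipfs.io/ipfs/-/ipfs-0.34.4.tgz',
        source: 'https://registry.npmjs.org/ipfs/-/ipfs-0.34.4.tgz',
        cid: 'bafybeiafts7s65iodk4wsetucwhury3cpk4fge374wxhdu5vzc4zuli4xi'
      }
    }
  }
}

console.log(getTarballCid(packument, '0.34.4'))
```

Returning `undefined` for a missing version or a missing `dist` block lets the caller decide whether to fall back to the `source` URL.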

The CID resolves to the tarball for a module, but while the module has been added to IPFS is not stored in the MFS so we resolve it using the CID:
Contributor

Suggested change
The CID resolves to the tarball for a module, but while the module has been added to IPFS is not stored in the MFS so we resolve it using the CID:
The CID resolves to the tarball for a module, but although the module has been added to IPFS is not stored in the MFS so we resolve it using the CID:

Explain why it's not necessary to store it in MFS.

Contributor

why is it not necessary to store it in MFS?


```console
$ jsipfs files read /ipfs/bafybeiafts7s65iodk4wsetucwhury3cpk4fge374wxhdu5vzc4zuli4xi
??V?0:O=E??g?ܢ?]n<?l?...
```

Contributor

You've just said it's not stored in MFS but have used a command we've documented as being for primary use with MFS to read it - perhaps using ipfs cat is more appropriate here?

### 🔍 Resolving a tarball

The npm CLI first requests the metadata for a given module - this contains all published versions and URLs to tarballs for that version. It is a JSON document the CLI uses to select which version to request that satisfies the [semver](https://semver.org/) range the developer has specified in their `package.json` file.

The CLI then requests a tarball. We use the URL to load the metadata for the package and, from that, look up the CID of the tarball. If the metadata contains a CID for the tarball, we use it to fetch the content from the IPFS network and relay it to the user. If the CID is not present, we request the tarball from the npm registry and add it to IPFS while streaming it back to the user. Once the stream is complete we update the manifest with the CID for future use.
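The decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the mirror's actual code: `createFakeIpfs`, `resolveTarball`, `fetchFromNpm` and `saveMetadata` are hypothetical names, the IPFS node is an in-memory stand-in rather than the js-ipfs API, and content is buffered instead of streamed to keep the example short.

```javascript
// In-memory stand-in for an IPFS node (hypothetical; not the js-ipfs API)
function createFakeIpfs () {
  const blocks = new Map()

  return {
    async add (content) {
      const cid = `fake-cid-${blocks.size}`
      blocks.set(cid, content)
      return cid
    },
    async cat (cid) {
      if (!blocks.has(cid)) {
        throw new Error(`Unknown CID ${cid}`)
      }
      return blocks.get(cid)
    }
  }
}

// Serve a tarball from IPFS when its CID is known, otherwise fetch it from
// the original location, add it to IPFS and remember the CID for next time
async function resolveTarball ({ metadata, version, ipfs, fetchFromNpm, saveMetadata }) {
  const release = metadata.versions[version]

  if (!release) {
    throw new Error(`No such version ${version}`)
  }

  const dist = release.dist

  if (dist.cid) {
    return ipfs.cat(dist.cid)
  }

  const tarball = await fetchFromNpm(dist.source)
  dist.cid = await ipfs.add(tarball)
  await saveMetadata(metadata)

  return tarball
}

async function demo () {
  let upstreamRequests = 0
  const options = {
    metadata: {
      name: 'ipfs',
      versions: {
        '0.34.4': { dist: { source: 'https://registry.npmjs.org/ipfs/-/ipfs-0.34.4.tgz' } }
      }
    },
    version: '0.34.4',
    ipfs: createFakeIpfs(),
    // Stand-in for an HTTP request to the public registry
    fetchFromNpm: async (url) => {
      upstreamRequests++
      return `contents of ${url}`
    },
    // Stand-in for persisting the updated metadata
    saveMetadata: async () => {}
  }

  const first = await resolveTarball(options) // fetched upstream, then added
  const second = await resolveTarball(options) // served via the stored CID

  return { first, second, upstreamRequests, cid: options.metadata.versions['0.34.4'].dist.cid }
}

demo().then(result => console.log(result))
```

Passing the IPFS node and the upstream fetcher in as arguments keeps the caching logic testable without a network connection.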

![Request sequence](/73-putting-npm-on-ipfs-part-1/request.png)

### 🆕 Updates

What about new modules? npm is constantly being updated with people publishing new modules every few minutes. npm [publishes a feed](https://replicate.npmjs.com/registry) that works like a CouchDB replication log - we have a [replication server](https://github.com/ipfs-shipyard/ipfs-npm-registry-mirror/tree/master/packages/replication-master) that watches this feed for new modules being published.

![Replication](/73-putting-npm-on-ipfs-part-1/replication.png)

When it sees a new module published, the replicator pulls down any new tarballs, adds them to IPFS and updates the metadata with the CIDs of the new module.
Contributor

...and explain that this is the reason we're using MFS?


It then uses [pubsub](https://blog.ipfs.io/25-pubsub/) to inform the mirrors of the update, which then update their local copies of the module metadata with the new versions & CIDs.
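To make the last step concrete, here is a small sketch of how a mirror might fold an update into its local copy of a module's metadata. The message shape (`{ name, versions }`) and the function name `applyUpdate` are assumptions made for illustration; the actual pubsub payload used by ipfs-npm-registry-mirror may differ.

```javascript
// Hypothetical update handler: merge the versions announced in an update
// message into the local metadata without mutating the local copy
function applyUpdate (localMetadata, update) {
  if (localMetadata.name !== update.name) {
    throw new Error(`Update for ${update.name} cannot be applied to ${localMetadata.name}`)
  }

  return {
    ...localMetadata,
    versions: {
      ...localMetadata.versions,
      ...update.versions
    }
  }
}

const local = {
  name: 'example-module',
  versions: {
    '1.0.0': { dist: { cid: 'fake-cid-a' } }
  }
}

// Assumed message shape, as it might arrive over pubsub after JSON parsing
const update = {
  name: 'example-module',
  versions: {
    '1.1.0': { dist: { cid: 'fake-cid-b' } }
  }
}

const merged = applyUpdate(local, update)

console.log(Object.keys(merged.versions))
```

Returning a new object rather than editing `local` in place means a request being served from the old metadata is not affected halfway through.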

### 🧳 Host your own version

The code for [ipfs-npm-registry-mirror](https://github.com/ipfs-shipyard/ipfs-npm-registry-mirror) is completely open source so you can host a version on your own infrastructure, or on a Raspberry Pi. If you try this, [let us know](https://github.com/ipfs-shipyard/ipfs-npm-registry-mirror/issues) how you get on!

## 🎁 What's next?

While you can use the registry today by specifying the `--registry=https://registry.js.ipfs.io` flag, you are still using HTTP to request modules even if they are then resolved internally via IPFS.

What if you could skip that part and just use IPFS?

Stay tuned for [Putting npm on IPFS Part 2 - The Client](/post/74-putting-npm-on-ipfs-part-2)
Contributor

Can't wait. Can we get this one merged and published 🚢🚢🚢??

Contributor

@achingbrain can has final review and finishing touches pls?

90 changes: 90 additions & 0 deletions content/post/74-putting-npm-on-ipfs-part-2.md
@@ -0,0 +1,90 @@
---
date: 2019-03-04
title: Putting npm on IPFS Part 2 - The Client
author: Alex Potsides
---

In [Putting npm on IPFS Part 1 - The Registry](/post/73-putting-npm-on-ipfs-part-1) we looked at [registry.js.ipfs.io](https://registry.js.ipfs.io) - a mirror of the npm registry that adds [CID](https://docs.ipfs.io/guides/concepts/cid)s to package metadata and uses IPFS to fetch tarballs.

Using the registry in this way works, but it still depends on HTTP and DNS to resolve your dependencies. So far so centralised.

What if there was a way to skip that bit and have it be distributed-web-turtles all the way down?

You would be able to speedily fetch dependencies from local peers, save hard drive space with a de-duplicated local data store, save on bandwidth costs and increase your resilience by not relying on central services!

** drumroll **

## 🙌 Introducing ipfs-npm

![npm on IPFS](https://github.com/ipfs-shipyard/npm-on-ipfs/raw/master/img/npm-on-ipfs.jpg)

[`ipfs-npm`](https://www.npmjs.com/package/ipfs-npm) is a command line tool that spins up an IPFS node and uses that to resolve the tarballs that make up your project's dependencies ([other configurations are available](https://www.npmjs.com/package/ipfs-npm#cli)). It works with npm and Yarn, and will add your dependencies to the [IPFS Repository](https://github.com/ipfs/specs/tree/master/repo) of your node so you can resolve them offline later on.

### 🚚 Installation

You can install it with:

```console
$ npm install -g ipfs-npm
```

### 🔧 Usage

`ipfs-npm` proxies for the npm cli, passing all arguments on, so use it as you would npm, just with a different command name:

```console
$ cd my-project
$ ipfs-npm install
👿 Spawning an in-process IPFS node using repo at /Users/alex/.jsipfs
Swarm listening on /ip4/127.0.0.1/tcp/57029/ipfs/QmZ7vEvXRdTVipb9o2p2Cmt3s1S8rqbo2ohscTjvTLgnpP
🗂️ Loading registry index from https://registry.js.ipfs.io
☎️ Dialling registry mirror /ip4/35.178.192.119/tcp/10015/ipfs/QmWBaYSnmgZi6F6D69JuZGhyL8rm6pt8GX5r7Atc6Gd7vR,/dns4/registry.js.ipfs.io/tcp/10015/ipfs/QmWBaYSnmgZi6F6D69JuZGhyL8rm6pt8GX5r7Atc6Gd7vR
📱️ Connected to registry
👩‍🚀 Starting local proxy
🚀 Server running on port 37847
🎁 Installing dependencies with /Users/alex/.nvm/versions/node/v10.15.1/bin/npm
...
```

If you prefer Yarn and have it installed globally, use the `ipfs-yarn` command (installed alongside `ipfs-npm`):

```console
$ cd my-project
$ ipfs-yarn
👿 Spawning an in-process IPFS node using repo at /Users/alex/.jsipfs
Swarm listening on /ip4/127.0.0.1/tcp/49905/ipfs/QmZ7vEvXRdTVipb9o2p2Cmt3s1S8rqbo2ohscTjvTLgnpP
🗂️ Loading registry index from https://registry.js.ipfs.io
☎️ Dialling registry mirror /ip4/35.178.192.119/tcp/10040/ipfs/QmfKqxieE71QoAchNk5e2MKmvWKjGdUnSifHqq1xZLEzyn,/dns4/registry.js.ipfs.io/tcp/10040/ipfs/QmfKqxieE71QoAchNk5e2MKmvWKjGdUnSifHqq1xZLEzyn
📱️ Connected to registry
👩‍🚀 Starting local proxy
🚀 Server running on port 49935
🎁 Installing dependencies with /Users/alex/.nvm/versions/node/v10.15.1/bin/yarn
yarn install v1.13.0
...
```

## 🙋 How does it work?

Behind the scenes `ipfs-npm` first starts a local http server and configures npm/Yarn to use it as the registry. When a module's [packument](https://github.com/zkat/pacote/tree/33c53cf10b080e78182bccc56ec1d5126f8b627e#packument) is requested, it uses the same package metadata as [registry.js.ipfs.io](https://registry.js.ipfs.io) to satisfy the request, also checking with the central registry for any updated versions and including them if available.

`npm`/`yarn` uses the package metadata to select a tarball to request that satisfies the [semver](https://semver.org/) range in the developer's `package.json` for the dependency. Once the tarball request is received, `ipfs-npm` uses the requested URL to look up the relevant CID in the module's metadata - if a CID is present, it uses IPFS to request the content, otherwise it downloads it from the public npm registry, adds it to IPFS and stores the CID for the next time it's requested.
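As a rough sketch of the "use the requested URL" step, the function below splits a tarball request path into a module name and version, which can then be used to find the CID in the metadata. It assumes the conventional `/<name>/-/<name>-<version>.tgz` layout seen in the tarball URLs in part 1; `parseTarballPath` is a hypothetical name and not taken from the `ipfs-npm` source.

```javascript
// Hypothetical parser for tarball request paths such as
// /ipfs/-/ipfs-0.34.4.tgz or /@scope/name/-/name-1.0.0.tgz
function parseTarballPath (path) {
  const [modulePart, filePart] = path.split('/-/')

  if (!modulePart || !filePart || !filePart.endsWith('.tgz')) {
    return null
  }

  // Strip the leading slash; scoped names keep their @scope/ prefix
  const name = decodeURIComponent(modulePart.replace(/^\//, ''))

  // The file name is prefixed with the unscoped module name
  const prefix = `${name.split('/').pop()}-`

  if (!filePart.startsWith(prefix)) {
    return null
  }

  const version = filePart.slice(prefix.length, filePart.length - '.tgz'.length)

  return { name, version }
}

console.log(parseTarballPath('/ipfs/-/ipfs-0.34.4.tgz'))
```

Returning `null` for anything that does not look like a tarball path lets the proxy treat the request as a metadata request or pass it through unchanged.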

You can use `ipfs-npm` with an existing IPFS node running on your computer or remotely, or fall back to the default configuration which is to run an IPFS node for the duration of the install.

![Request sequence](/74-putting-npm-on-ipfs-part-2/ipfs-npm-sequence.png)

If this sounds familiar it may be because you've just read [part 1](/post/73-putting-npm-on-ipfs-part-1) - indeed it's the exact same code fulfilling the exact same function, just locally on your computer instead of on a server somewhere.

The http server is necessary because at the time of writing neither npm nor Yarn supports IPFS as a transport - hopefully one day this won't be necessary, but as it stands no http traffic leaves your machine unless you request modules that don't have CIDs in their metadata.

It also dials the ipfs-npm-registry-mirror directly to improve the speed of content resolution - this is partly because we know that the mirror will have the content we are after but also because there is no [DHT](https://ipfs.io/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco/wiki/Distributed_hash_table.html) in js-IPFS (yet - it's coming!) so discovery would have to occur via the gateway nodes.

## 🎁 What's next?

So many things!

* We'd love (love!) to get IPFS into npm and Yarn as a first-class transport option
* Once the DHT is enabled in js-ipfs v0.36.0 we'll not need to contact the npm mirror, so a point of failure will be removed and startup will be faster
* How about a service you can run on your local network to further speed up package resolution with no need to get onto the Internet?

Publishing? Identity? The possibilities are endless - if you'd like to help out please visit [ipfs-shipyard/npm-on-ipfs](https://github.com/ipfs-shipyard/npm-on-ipfs), or check out the [📦 IPFS Package Managers Special Interest Group](https://github.com/ipfs/package-managers) for other research, demos, and experiments.
Binary file added static/73-putting-npm-on-ipfs-part-1/request.png