Start to add lots of documentation #899

Merged
merged 3 commits into from
Aug 2, 2017
104 changes: 104 additions & 0 deletions docs/ARCHITECTURE.md
@@ -0,0 +1,104 @@
# Architecture of Crates.io

This document is an intro to the codebase in this repo. If you want to work on a bug or a feature,
hopefully after reading this doc, you'll have a good idea of where to start looking for the code
you want to change.

This is a work in progress. Pull requests and issues to improve this document are very welcome!

## Documentation

Documentation about the codebase appears in these locations:

* `LICENSE-APACHE` and `LICENSE-MIT` - the terms under which this codebase is licensed.
Member

It looks like package.json only lists MIT. I'm not sure if that is an oversight or if we need to word this more carefully. We should probably explicitly call out dual licensing in README.md

Also see: https://docs.npmjs.com/files/package.json#license

* `README.md` - Important information we want to show on the GitHub front page.
* `docs/` - Long-form documentation.

## Backend - Rust

The backend of crates.io is written in Rust. Most of that code lives in the *src* directory. It
serves a JSON API over HTTP, and the HTTP server interface is provided by the [conduit][] crate and
related crates. More information about the backend is in
[`docs/BACKEND.md`](https://github.com/rust-lang/crates.io/blob/master/docs/BACKEND.md).

[conduit]: https://crates.io/crates/conduit

These files and directories have to do with the backend:

* `build.rs` - Cargo build script
* `Cargo.lock` - Locks dependencies to specific versions providing consistency across development
and deployment
* `Cargo.toml` - Defines the crate and its dependencies
* `migrations/` - Diesel migrations applied to the database during development and deployment
* `.rustfmt.toml` - Defines Rust coding style guidelines which are enforced by the CI environment
* `src/` - The backend's source code
* `target/` - Compiled output, including dependencies and final binary artifacts - (ignored in
`.gitignore`)
* `tmp/index-co` - The registry repository; in production this is cloned from GitHub and in
development from `tmp/index-bare` - (ignored in `.gitignore`)

The backend stores information in a Postgres database.

## Frontend - Ember.js

The frontend of crates.io is written in JavaScript using [Ember.js][]. More information about the
frontend is in [`docs/FRONTEND.md`](https://github.com/rust-lang/crates.io/blob/master/docs/FRONTEND.md).

[Ember.js]: https://emberjs.com/

These files have to do with the frontend:

* `app/` - The frontend's source code
* `config/{environment,targets}.js` - Configuration of the frontend
* `dist/` - Contains the distributable (optimized and self-contained) output of building the
frontend; served under the root `/` url - (ignored in `.gitignore`)
* `.ember-cli` - Settings for the `ember` command line interface
* `ember-cli-build.js` - Contains the build specification for Broccoli
* `.eslintrc.js` - Defines JavaScript coding style guidelines (enforced during CI???)
* `mirage/` - A mock backend used during development and testing
* `node_modules/` - npm dependencies - (ignored in `.gitignore`)
* `package.json` - Defines the npm package and its dependencies
* `package-lock.json` - Locks dependencies to specific versions providing consistency across
development and deployment
* `public/` - Static files that are merged into `dist/` during build
* `testem.js` - Integration with Test'em Scripts
* `tests/` - Frontend tests
* `vendor/` - Frontend dependencies not distributed by npm; not currently used

## Deployment - Heroku

Crates.io is deployed on [Heroku](https://heroku.com/). See [`docs/MIRROR.md`][] for info about
setting up your own instance on Heroku!

[`docs/MIRROR.md`]: https://github.com/rust-lang/crates.io/blob/master/docs/MIRROR.md

These files are Heroku-specific; if you're deploying the crates.io codebase on another platform,
there's useful information in these files that you might need to translate to a different format
for another platform.

* `app.json` - Configuration for Heroku Deploy Button
* `.buildpacks` - A list of buildpacks used during deployment
* `config/nginx.conf.erb` - Template used by the nginx buildpack
* `.diesel_version` - Used by diesel buildpack to install a specific version of Diesel CLI during
deployment
* `Procfile` - Contains process type declarations for Heroku

## Development

These files are mostly only relevant when running crates.io's code in development mode.

* `.editorconfig` - Coding style definitions supported by some IDEs // TODO: Reference extensions
for common editors
* `.env` - Environment variables loaded by the backend - (ignored in `.gitignore`)
* `.env.sample` - Example environment file checked into the repository
* `.git/` - The git repository; not available in all deployments (e.g. Heroku)
* `.gitignore` - Configures git to ignore certain files and folders
* `script/init-local-index.sh` - Creates registry repositories used during development
* `tmp/` - Temporary files created during development; when deployed on Heroku this is the only
writable directory - (ignored in `.gitignore`)
* `tmp/index-bare` - A bare git repository, used as the origin for `tmp/index-co` during
development - (ignored in `.gitignore`)
* `.travis.yml` - Configuration for continuous integration at [TravisCI][]
* `.watchmanconfig` - Used by Ember CLI to efficiently watch for file changes if you install watchman

[TravisCI]: https://travis-ci.org/rust-lang/crates.io
118 changes: 118 additions & 0 deletions docs/BACKEND.md
@@ -0,0 +1,118 @@
# Backend Overview

## Server

The code to actually run the server is in *src/bin/server.rs*. This is where most of the pieces of
the system are instantiated and configured, and can be thought of as the "entry point" to crates.io.

The server does the following things:

1. Initializes logging
2. Checks out the index git repository, if it isn't already checked out
3. Reads values from environment variables to configure a new instance of `cargo_registry::App`
4. Adds middleware to the app by calling `cargo_registry::middleware`
5. Syncs the categories defined in *src/categories.toml* with the categories in the database
6. Starts a [civet][] `Server` that uses the `cargo_registry::App` instance
7. Tells Nginx on Heroku that the application is ready to receive requests, if running on Heroku
8. Blocks forever (or until the process is killed) waiting to receive messages on a channel that no
messages are ever sent to, in order to outlive the civet `Server` threads

[civet]: https://crates.io/crates/civet
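
The last step can be illustrated with a minimal sketch using only the standard library. This is not the code from *src/bin/server.rs*; the function names are hypothetical, and the timeout variants exist only so the behaviour can be observed without hanging.

```rust
use std::sync::mpsc::{channel, RecvTimeoutError};
use std::time::Duration;

/// Blocks the calling thread until the process is killed.
///
/// The sender is bound to `_tx` (not `_`) so that it stays alive for the
/// whole function. Nothing is ever sent, so `recv` never returns.
pub fn wait_forever() {
    let (_tx, rx) = channel::<()>();
    let _ = rx.recv();
}

/// Same setup as `wait_forever`, but gives up after `timeout` so the
/// blocking behaviour can be checked in a test.
pub fn wait_with_live_sender(timeout: Duration) -> Result<(), RecvTimeoutError> {
    let (_tx, rx) = channel::<()>();
    rx.recv_timeout(timeout)
}

/// Counter-example: once every sender has been dropped, the receiver
/// reports a disconnect instead of blocking.
pub fn wait_with_dropped_sender(timeout: Duration) -> Result<(), RecvTimeoutError> {
    let (tx, rx) = channel::<()>();
    drop(tx);
    rx.recv_timeout(timeout)
}
```

The detail worth noticing is that the sending half has to be kept alive: if it is dropped, the receive call returns an error right away and the main thread would exit.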

## Routes

The API URLs that the server responds to (aka "routes") are defined in
*src/lib.rs*.

All of the `api_router` routes are mounted under the `/api/v1` path (see the
lines that look like `router.get("/api/v1/*path", R(api_router.clone()));`).

Each API route definition looks like this:

```rust
api_router.get("/crates", C(krate::index));
```

This line defines a route that responds to a GET request made to
`/api/v1/crates` with the results of calling the `krate::index` function. `C`
is a struct that holds a function and implements the [`conduit::Handler`][]
trait: if the function succeeds, its result becomes the response; if it fails,
the server returns an error response. The `C` struct exists to reduce
boilerplate.

[`conduit::Handler`]: https://docs.rs/conduit/0.8.1/conduit/trait.Handler.html
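
To make the role of `C` more concrete, here is a simplified sketch of the wrapper pattern. It does not use the real conduit types: `Request`, `Response`, `Handler`, and `Router` below are hypothetical stand-ins, the error type is a plain `String`, and the `index` and `failing` functions are made up for illustration.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the conduit request/response types.
pub struct Request {
    pub method: &'static str,
    pub path: String,
}

#[derive(Debug, PartialEq)]
pub struct Response {
    pub status: u16,
    pub body: String,
}

/// Simplified stand-in for the `conduit::Handler` trait.
pub trait Handler {
    fn call(&self, req: &Request) -> Response;
}

/// Wraps a fallible function so that it can be registered as a `Handler`.
pub struct C(pub fn(&Request) -> Result<Response, String>);

impl Handler for C {
    fn call(&self, req: &Request) -> Response {
        match (self.0)(req) {
            // Success: the function's result is the response.
            Ok(resp) => resp,
            // Failure: turn the error into an error response in one place.
            Err(msg) => Response { status: 500, body: msg },
        }
    }
}

/// A toy router keyed on (method, path).
pub struct Router {
    routes: HashMap<(&'static str, String), Box<dyn Handler>>,
}

impl Router {
    pub fn new() -> Router {
        Router { routes: HashMap::new() }
    }

    pub fn get<H: Handler + 'static>(&mut self, path: &str, handler: H) {
        self.routes.insert(("GET", path.to_string()), Box::new(handler));
    }

    pub fn dispatch(&self, req: &Request) -> Response {
        match self.routes.get(&(req.method, req.path.clone())) {
            Some(handler) => handler.call(req),
            None => Response { status: 404, body: "not found".to_string() },
        }
    }
}

// Hypothetical route functions, written without any error-handling boilerplate.
pub fn index(_req: &Request) -> Result<Response, String> {
    Ok(Response { status: 200, body: "{\"crates\":[]}".to_string() })
}

pub fn failing(_req: &Request) -> Result<Response, String> {
    Err("database unavailable".to_string())
}
```

With this shape, each route function only returns a `Result`, and the conversion from an error into an error response lives in the wrapper's single `Handler` implementation.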

## Code having to do with running a web application

These modules could *maybe* be refactored into another crate. Maybe not. But their primary purpose
is supporting the running of crates.io's web application parts, and they don't have much to do with
the crate registry purpose of the application.

### The `app` module

This contains the `App` struct, which holds these application-wide components:

- The database connection pools (there are two until we finish migrating the app to use Diesel
everywhere)
- The GitHub OAuth configuration
- The cookie session key given to [conduit-cookie][]
- The `git2::Repository` instance for the index repo checkout
- The `Config` instance

This module also contains `AppMiddleware`, which implements the `Middleware` trait in order to
inject the `App` instance into every request. That way, we can call `req.app()` to get to any of
these components.

[conduit-cookie]: https://crates.io/crates/conduit-cookie
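
The injection pattern can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in *src/app.rs*: the `Request` and `Middleware` types here are hypothetical stand-ins for the conduit ones, the app is stored in a plain `Option` field, and `App` is reduced to a single `Config` field.

```rust
use std::sync::Arc;

// Hypothetical, trimmed-down versions of the real structs.
pub struct Config {
    pub session_key: String,
}

pub struct App {
    pub config: Config,
}

/// Simplified request type with a slot for the shared `App`.
pub struct Request {
    pub path: String,
    app: Option<Arc<App>>,
}

impl Request {
    pub fn new(path: &str) -> Request {
        Request { path: path.to_string(), app: None }
    }
}

/// Simplified stand-in for a middleware trait: runs before the handler.
pub trait Middleware {
    fn before(&self, req: &mut Request);
}

/// Holds the shared `App` and attaches it to every request.
pub struct AppMiddleware {
    app: Arc<App>,
}

impl AppMiddleware {
    pub fn new(app: Arc<App>) -> AppMiddleware {
        AppMiddleware { app }
    }
}

impl Middleware for AppMiddleware {
    fn before(&self, req: &mut Request) {
        // Cloning an `Arc` only bumps a reference count; the `App` is shared.
        req.app = Some(Arc::clone(&self.app));
    }
}

/// Extension trait that provides `req.app()` to handlers.
pub trait RequestApp {
    fn app(&self) -> &Arc<App>;
}

impl RequestApp for Request {
    fn app(&self) -> &Arc<App> {
        self.app.as_ref().expect("AppMiddleware did not run for this request")
    }
}
```

Sharing the app through a reference-counted pointer means every request sees the same connection pools and configuration without copying them.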

### The `config` module

### The `db` module

### The `dist` module

### The `http` module

### The `model` module

### The `schema` module

### The `utils` module

## Code having to do with managing a registry of crates

These modules are specific to the domain of being a crate registry. These concepts would exist no
matter what language or framework crates.io was implemented in.

### The `krate` module

### The `users` module

### The `badge` module

### The `categories` module

### The `category` module

### The `dependency` module

### The `download` module

### The `git` module

### The `keyword` module

### The `owner` module

### The `upload` module

### The `uploaders` module

### The `version` module

## Database

## Tests

## Scripts
7 changes: 7 additions & 0 deletions docs/FRONTEND.md
@@ -0,0 +1,7 @@
# Frontend Overview

The frontend of crates.io is written in JavaScript using [Ember.js][]. Most of that code lives in
the *app* directory. We endeavor to follow Ember conventions and best practices, but we're Rust
developers, so we don't always live up to this goal :)

[Ember.js]: https://emberjs.com/
2 changes: 1 addition & 1 deletion package.json
@@ -5,7 +5,7 @@
"bugs": {
"url": "https://github.com/rust-lang/crates.io/issues"
},
- "license": "MIT",
+ "license": "(MIT OR Apache-2.0)",
"author": "",
"directories": {
"doc": "docs",
24 changes: 22 additions & 2 deletions src/app.rs
@@ -1,3 +1,5 @@
//! Application-wide components in a struct accessible from each request

use std::env;
use std::error::Error;
use std::path::PathBuf;
@@ -18,14 +20,19 @@ pub struct App {
/// The database connection pool
pub database: db::Pool,

- /// The database connection pool
+ /// The diesel database connection pool
pub diesel_database: db::DieselPool,

/// The GitHub OAuth2 configuration
pub github: oauth2::Config,

/// A unique key used with conduit_cookie to generate cookies
pub session_key: String,

/// The crate index git repository
pub git_repo: Mutex<git2::Repository>,

/// The location on disk of the checkout of the crate index git repository
pub git_repo_checkout: PathBuf,

/// The server configuration
@@ -38,14 +45,20 @@ pub struct AppMiddleware {
}

impl App {
/// Creates a new `App` with a given `Config`
///
/// Configures and sets up:
///
/// - GitHub OAuth
/// - Database connection pools
/// - A `git2::Repository` instance from the index repo checkout (that server.rs ensures exists)
pub fn new(config: &Config) -> App {
let mut github = oauth2::Config::new(
&config.gh_client_id,
&config.gh_client_secret,
"https://github.com/login/oauth/authorize",
"https://github.com/login/oauth/access_token",
);

github.scopes.push(String::from("read:org"));

let db_pool_size = match (env::var("DB_POOL_SIZE"), config.env) {
@@ -66,6 +79,7 @@ impl App {
_ => 1,
};

// We need two connection pools until we finish transitioning everything to use diesel.
let db_config = r2d2::Config::builder()
.pool_size(db_pool_size)
.min_idle(db_min_idle)
@@ -78,6 +92,7 @@ impl App {
.build();

let repo = git2::Repository::open(&config.git_repo_checkout).unwrap();

App {
database: db::pool(&config.db_url, db_config),
diesel_database: db::diesel_pool(&config.db_url, diesel_db_config),
@@ -89,6 +104,11 @@ impl App {
}
}

/// Returns a handle for making HTTP requests to upload crate files.
///
/// The handle will go through a proxy if the uploader being used has specified one, which
/// is only done in test mode in order to be able to record and inspect the HTTP requests
/// that tests make.
pub fn handle(&self) -> Easy {
let mut handle = Easy::new();
if let Some(proxy) = self.config.uploader.proxy() {