
Commit 77931ce

Merge #899
899: Start to add lots of documentation r=carols10cents I started this a while ago, but an email from @vignesh-sankaran prompted me to get what I have out there rather than bitrotting on my machine. I'd love a review of what I have so far, @vignesh-sankaran! What do you think of the outline of the architecture document? What kinds of things would make that document a more useful high-level overview? The doc comments should hopefully make the docs at https://docs.rs/cargo-registry/ better, I don't think the little bit I've added here warrants a version bump and publish yet, though.
2 parents 120f800 + 25383e5 commit 77931ce

File tree

7 files changed

+339
-12
lines changed

docs/ARCHITECTURE.md

Lines changed: 104 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,104 @@
1+
# Architecture of Crates.io
2+
3+
This document is an intro to the codebase in this repo. If you want to work on a bug or a feature,
4+
hopefully after reading this doc, you'll have a good idea of where to start looking for the code
5+
you want to change.
6+
7+
This is a work in progress. Pull requests and issues to improve this document are very welcome!
8+
9+
## Documentation
10+
11+
Documentation about the codebase appears in these locations:
12+
13+
* `LICENSE-APACHE` and `LICENSE-MIT` - the terms under which this codebase is licensed.
14+
* `README.md` - Important information we want to show on the GitHub front page.
15+
* `docs/` - Long-form documentation.
16+
17+
## Backend - Rust
18+
19+
The backend of crates.io is written in Rust. Most of that code lives in the *src* directory. It
20+
serves a JSON API over HTTP, and the HTTP server interface is provided by the [conduit][] crate and
21+
related crates. More information about the backend is in
22+
[`docs/BACKEND.md`](https://github.com/rust-lang/crates.io/blob/master/docs/BACKEND.md).
23+
24+
[conduit]: https://crates.io/crates/conduit
25+
26+
These files and directories have to do with the backend:
27+
28+
* `build.rs` - Cargo build script
29+
* `Cargo.lock` - Locks dependencies to specific versions providing consistency across development
30+
and deployment
31+
* `Cargo.toml` - Defines the crate and its dependencies
32+
* `migrations/` - Diesel migrations applied to the database during development and deployment
33+
* `.rustfmt.toml` - Defines Rust coding style guidelines which are enforced by the CI environment
34+
* `src/` - The backend's source code
35+
* `target/` - Compiled output, including dependencies and final binary artifacts - (ignored in
36+
`.gitignore`)
37+
* `tmp/index-co` - The registry repository; in production this is cloned from GitHub and in
38+
development from `tmp/index-bare` - (ignored in `.gitignore`)
39+
40+
The backend stores information in a Postgres database.
41+
42+
## Frontend - Ember.js
43+
44+
The frontend of crates.io is written in JavaScript using [Ember.js][]. More information about the
45+
frontend is in [`docs/FRONTEND.md`](https://github.com/rust-lang/crates.io/blob/master/docs/FRONTEND.md).
46+
47+
[Ember.js]: https://emberjs.com/
48+
49+
These files have to do with the frontend:
50+
51+
* `app/` - The frontend's source code
52+
* `config/{environment,targets}.js` - Configuration of the frontend
53+
* `dist/` - Contains the distributable (optimized and self-contained) output of building the
54+
frontend; served under the root `/` url - (ignored in `.gitignore`)
55+
* `.ember-cli` - Settings for the `ember` command line interface
56+
* `ember-cli-build.js` - Contains the build specification for Broccoli
57+
* `.eslintrc.js` - Defines JavaScript coding style guidelines (enforced during CI???)
58+
* `mirage/` - A mock backend used during development and testing
59+
* `node_modules/` - npm dependencies - (ignored in `.gitignore`)
60+
* `package.json` - Defines the npm package and its dependencies
61+
* `package-lock.json` - Locks dependencies to specific versions providing consistency across
62+
development and deployment
63+
* `public/` - Static files that are merged into `dist/` during build
64+
* `testem.js` - Integration with Test'em Scripts
65+
* `tests/` - Frontend tests
66+
* `vendor/` - frontend dependencies not distributed by npm; not currently used
67+
68+
## Deployment - Heroku
69+
70+
Crates.io is deployed on [Heroku](https://heroku.com/). See [`docs/MIRROR.md`][] for info about
71+
setting up your own instance on Heroku!
72+
73+
[`docs/MIRROR.md`]: https://github.com/rust-lang/crates.io/blob/master/docs/MIRROR.md
74+
75+
These files are Heroku-specific; if you're deploying the crates.io codebase on another platform,
76+
there's useful information in these files that you might need to translate to a different format
77+
for another platform.
78+
79+
* `app.json` - Configuration for Heroku Deploy Button
80+
* `.buildpacks` - A list of buildpacks used during deployment
81+
* `config/nginx.conf.erb` - Template used by the nginx buildpack
82+
* `.diesel_version` - Used by diesel buildpack to install a specific version of Diesel CLI during
83+
deployment
84+
* `Procfile` - Contains process type declarations for Heroku
85+
86+
## Development
87+
88+
These files are mostly only relevant when running crates.io's code in development mode.
89+
90+
* `.editorconfig` - Coding style definitions supported by some IDEs // TODO: Reference extensions
91+
for common editors
92+
* `.env` - Environment variables loaded by the backend - (ignored in `.gitignore`)
93+
* `.env.sample` - Example environment file checked into the repository
94+
* `.git/` - The git repository; not available in all deployments (e.g. Heroku)
95+
* `.gitignore` - Configures git to ignore certain files and folders
96+
* `script/init-local-index.sh` - Creates registry repositories used during development
97+
* `tmp/` - Temporary files created during development; when deployed on Heroku this is the only
98+
writable directory - (ignored in `.gitignore`)
99+
* `tmp/index-bare` - A bare git repository, used as the origin for `tmp/index-co` during
100+
development - (ignored in `.gitignore`)
101+
* `.travis.yml` - Configuration for continuous integration at [TravisCI][]
102+
* `.watchmanconfig` - Used by Ember CLI to efficiently watch for file changes if you install watchman
103+
104+
[TravisCI]: https://travis-ci.org/rust-lang/crates.io

docs/BACKEND.md

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
1+
# Backend Overview
2+
3+
## Server
4+
5+
The code to actually run the server is in *src/bin/server.rs*. This is where most of the pieces of
6+
the system are instantiated and configured, and can be thought of as the "entry point" to crates.io.
7+
8+
The server does the following things:
9+
10+
1. Initializes logging
11+
2. Checks out the index git repository, if it isn't already checked out
12+
3. Reads values from environment variables to configure a new instance of `cargo_registry::App`
13+
4. Adds middleware to the app by calling `cargo_registry::middleware`
14+
5. Syncs the categories defined in *src/categories.toml* with the categories in the database
15+
6. Starts a [civet][] `Server` that uses the `cargo_registry::App` instance
16+
7. Tells Nginx on Heroku that the application is ready to receive requests, if running on Heroku
17+
8. Blocks forever (or until the process is killed) waiting to receive messages on a channel that no
18+
messages are ever sent to, in order to outlive the civet `Server` threads
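
The last step in the list can be illustrated with a small standalone sketch. This is not the code in *src/bin/server.rs*; the helper names `keepalive_channel` and `still_waiting` are made up for illustration, and a timeout is used only so that the example terminates. The point it tries to show is that the sender has to stay alive: if it is dropped, the receiver stops blocking and reports a disconnect instead.

```rust
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError, Sender};
use std::time::Duration;

/// Hypothetical helper: creates a channel whose sender is kept around but
/// never used to send anything.
fn keepalive_channel() -> (Sender<()>, Receiver<()>) {
    channel()
}

/// Hypothetical helper: returns `true` if the receiver is still blocked after
/// `timeout`, i.e. no message arrived and the sender has not been dropped.
fn still_waiting(rx: &Receiver<()>, timeout: Duration) -> bool {
    matches!(rx.recv_timeout(timeout), Err(RecvTimeoutError::Timeout))
}
```

A server that wants to block indefinitely would presumably call `rx.recv()` with no timeout while holding on to the sender, so that the main thread never returns while the worker threads run.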
19+
20+
[civet]: https://crates.io/crates/civet
21+
22+
## Routes
23+
24+
The API URLs that the server responds to (aka "routes") are defined in
25+
*src/lib.rs*.
26+
27+
All of the `api_router` routes are mounted under the `/api/v1` path (see the
28+
lines that look like `router.get("/api/v1/*path", R(api_router.clone()));`).
29+
30+
Each API route definition looks like this:
31+
32+
```rust
33+
api_router.get("/crates", C(krate::index));
34+
```
35+
36+
This line defines a route that responds to a GET request made to
37+
`/api/v1/crates` with the results of calling the `krate::index` function. `C`
38+
is a struct that holds a function and implements the [`conduit::Handler`][]
39+
trait so that the results of the function are the response if the function
40+
succeeds, and that the server returns an error response if the function doesn't
41+
succeed. The `C` struct's purpose is to reduce some boilerplate.
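
To make the idea behind `C` more concrete, here is a minimal sketch of the wrapper pattern. It deliberately does not use the real conduit types: `Request`, `Response`, and the one-method `Handler` trait below are simplified stand-ins, the `index` function is a toy, and the 200/500 status codes are assumptions chosen for illustration rather than what crates.io actually returns.

```rust
/// Simplified stand-in for a request; not conduit's real type.
struct Request {
    path: String,
}

/// Simplified stand-in for a response; not conduit's real type.
struct Response {
    status: u16,
    body: String,
}

/// Simplified stand-in for the `conduit::Handler` trait.
trait Handler {
    fn call(&self, req: &Request) -> Response;
}

/// Holds a fallible function so that it can be used wherever a `Handler` is
/// expected.
struct C(fn(&Request) -> Result<String, String>);

impl Handler for C {
    fn call(&self, req: &Request) -> Response {
        match (self.0)(req) {
            // The function succeeded: its result becomes the response body.
            Ok(body) => Response { status: 200, body },
            // The function failed: turn the error into an error response.
            Err(message) => Response { status: 500, body: message },
        }
    }
}

/// Toy endpoint function standing in for something like `krate::index`.
fn index(req: &Request) -> Result<String, String> {
    if req.path == "/crates" {
        Ok(String::from("{\"crates\": []}"))
    } else {
        Err(format!("no route for {}", req.path))
    }
}
```

With this in place, `C(index)` can be handed to anything that accepts a `Handler`, and each endpoint function only has to return a `Result` instead of building error responses itself.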
42+
43+
[`conduit::Handler`]: https://docs.rs/conduit/0.8.1/conduit/trait.Handler.html
44+
45+
## Code having to do with running a web application
46+
47+
These modules could *maybe* be refactored into another crate. Maybe not. But their primary purpose
48+
is supporting the running of crates.io's web application parts, and they don't have much to do with
49+
the crate registry purpose of the application.
50+
51+
### The `app` module
52+
53+
This contains the `App` struct, which holds the application-wide
54+
components such as:
55+
56+
- The database connection pools (there are two until we finish migrating the app to use Diesel
57+
everywhere)
58+
- The GitHub OAuth configuration
59+
- The cookie session key given to [conduit-cookie][]
60+
- The `git2::Repository` instance for the index repo checkout
61+
- The `Config` instance
62+
63+
This module also contains `AppMiddleware`, which implements the `Middleware` trait in order to
64+
inject the `app` instance into every request. That way, we can call `req.app()` to get to any of
65+
these components.
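
The injection pattern described above can be sketched roughly as follows. The types here are simplified stand-ins, not the real conduit or crates.io definitions: the `Request` struct, the single-method `Middleware` trait, and the one-field `App` are assumptions made so that the example is self-contained.

```rust
use std::sync::Arc;

/// Simplified stand-in for the real `App`, with a single component.
struct App {
    session_key: String,
}

/// Simplified stand-in for a request that can carry a shared `App`.
struct Request {
    app: Option<Arc<App>>,
}

impl Request {
    /// Returns the `App` that the middleware attached to this request.
    fn app(&self) -> &Arc<App> {
        self.app.as_ref().expect("AppMiddleware has not run")
    }
}

/// Simplified stand-in for a middleware trait with a "before" hook.
trait Middleware {
    fn before(&self, req: &mut Request);
}

/// Holds the shared `App` and attaches it to every request it sees.
struct AppMiddleware {
    app: Arc<App>,
}

impl Middleware for AppMiddleware {
    fn before(&self, req: &mut Request) {
        // Cloning the `Arc` is cheap: every request shares the same `App`.
        req.app = Some(Arc::clone(&self.app));
    }
}
```

Once the middleware has run, a handler can write `req.app().session_key` (or reach any other component) without the `App` being passed to it explicitly.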
66+
67+
[conduit-cookie]: https://crates.io/crates/conduit-cookie
68+
69+
### The `config` module
70+
71+
### The `db` module
72+
73+
### The `dist` module
74+
75+
### The `http` module
76+
77+
### The `model` module
78+
79+
### The `schema` module
80+
81+
### The `utils` module
82+
83+
## Code having to do with managing a registry of crates
84+
85+
These modules are specific to the domain of being a crate registry. These concepts would exist no
86+
matter what language or framework crates.io was implemented in.
87+
88+
### The `krate` module
89+
90+
### The `users` module
91+
92+
### The `badge` module
93+
94+
### The `categories` module
95+
96+
### The `category` module
97+
98+
### The `dependency` module
99+
100+
### The `download` module
101+
102+
### The `git` module
103+
104+
### The `keyword` module
105+
106+
### The `owner` module
107+
108+
### The `upload` module
109+
110+
### The `uploaders` module
111+
112+
### The `version` module
113+
114+
## Database
115+
116+
## Tests
117+
118+
## Scripts

docs/FRONTEND.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
# Frontend Overview
2+
3+
The frontend of crates.io is written in JavaScript using [Ember.js][]. Most of that code lives in
4+
the *app* directory. We endeavor to follow Ember conventions and best practices, but we're Rust
5+
developers, so we don't always live up to this goal :)
6+
7+
[Ember.js]: https://emberjs.com/

package.json

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
55
"bugs": {
66
"url": "https://github.com/rust-lang/crates.io/issues"
77
},
8-
"license": "MIT",
8+
"license": "(MIT OR Apache-2.0)",
99
"author": "",
1010
"directories": {
1111
"doc": "docs",

src/app.rs

Lines changed: 22 additions & 2 deletions
@@ -1,3 +1,5 @@
1+
//! Application-wide components in a struct accessible from each request
2+
13
use std::env;
24
use std::error::Error;
35
use std::path::PathBuf;
@@ -18,14 +20,19 @@ pub struct App {
1820
/// The database connection pool
1921
pub database: db::Pool,
2022

21-
/// The database connection pool
23+
/// The diesel database connection pool
2224
pub diesel_database: db::DieselPool,
2325

2426
/// The GitHub OAuth2 configuration
2527
pub github: oauth2::Config,
2628

29+
/// A unique key used with conduit_cookie to generate cookies
2730
pub session_key: String,
31+
32+
/// The crate index git repository
2833
pub git_repo: Mutex<git2::Repository>,
34+
35+
/// The location on disk of the checkout of the crate index git repository
2936
pub git_repo_checkout: PathBuf,
3037

3138
/// The server configuration
@@ -38,14 +45,20 @@ pub struct AppMiddleware {
3845
}
3946

4047
impl App {
48+
/// Creates a new `App` with a given `Config`
49+
///
50+
/// Configures and sets up:
51+
///
52+
/// - GitHub OAuth
53+
/// - Database connection pools
54+
/// - A `git2::Repository` instance from the index repo checkout (that server.rs ensures exists)
4155
pub fn new(config: &Config) -> App {
4256
let mut github = oauth2::Config::new(
4357
&config.gh_client_id,
4458
&config.gh_client_secret,
4559
"https://github.com/login/oauth/authorize",
4660
"https://github.com/login/oauth/access_token",
4761
);
48-
4962
github.scopes.push(String::from("read:org"));
5063

5164
let db_pool_size = match (env::var("DB_POOL_SIZE"), config.env) {
@@ -66,6 +79,7 @@ impl App {
6679
_ => 1,
6780
};
6881

82+
// We need two connection pools until we finish transitioning everything to use diesel.
6983
let db_config = r2d2::Config::builder()
7084
.pool_size(db_pool_size)
7185
.min_idle(db_min_idle)
@@ -78,6 +92,7 @@ impl App {
7892
.build();
7993

8094
let repo = git2::Repository::open(&config.git_repo_checkout).unwrap();
95+
8196
App {
8297
database: db::pool(&config.db_url, db_config),
8398
diesel_database: db::diesel_pool(&config.db_url, diesel_db_config),
@@ -89,6 +104,11 @@ impl App {
89104
}
90105
}
91106

107+
/// Returns a handle for making HTTP requests to upload crate files.
108+
///
109+
/// The handle will go through a proxy if the uploader being used has specified one, which
110+
/// is only done in test mode in order to be able to record and inspect the HTTP requests
111+
/// that tests make.
92112
pub fn handle(&self) -> Easy {
93113
let mut handle = Easy::new();
94114
if let Some(proxy) = self.config.uploader.proxy() {
