Skip to content

Commit 167de29

Browse files
committed
Move README renderings off the web server
Rendering the README is a process that can take an arbitrarily long amount of time, and ends with uploading a file to S3 -- which also can take an arbitrarily long period of time. Since there are no README contents which we consider invalid (which I've indicated by changing the return type of the rendering functions to `String` instead of `Result`), there's no reason for us to do any of this work on the web server. Just like the index updates, we can offload this to our worker to be completed at some point in the future. This does mean that a crate will render in our UI before the README has finished rendering. During this brief period, we will behave the same as if the crate had no README file, showing the description instead. Eventually I think it'd make sense to have this job behave like the `render-readmes` worker, where it pulls the tar file down and reads the actual readme file. This would mean that Cargo can stop sending us the README as its own field when its already sending us a tar file which contains the README. I've left it alone for now though, since there's no benefit while Cargo is already sending us the README, and we can't have it stop doing that until we move to publish v2. A few implementation notes: - The `readme` column on `crates` was never used for anything in Rust, but is used on the DB side to generate the search index. I've left the column in place, but removed most uses of it on the Rust side. - The `readme_file` column on `crates` never used for anything. This commit does not include the migration, which must be deployed separately. This commit only includes code which works with or without the migration for deployment. - `background_jobs::Environment` gained two fields that are just straight up copied from `App`. I've opted to do this rather than just giving it `App`, since I do want to separate "the stuff you need to do background jobs" from "the stuff you need to be the web server". Since we do eventually plan on not having s3 uploads happen on the server, I want to bundle all of this into a single field and just have it live on `background_jobs::Environment` eventually. - I did some general cleanup of `krate::publish` while I was in there, removed a lot of pre-1.0 style code. - I've left the `render-readmes` binary alone for now since it's untested. I'll change it to just queue up jobs once we get to the point where jobs are doing all the tar file work as well.
1 parent 0d47f35 commit 167de29

17 files changed

+165
-165
lines changed

src/app.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@ pub struct App {
3333
/// In production this shares a single connection pool across requests. In tests
3434
/// this is either None (in which case any attempt to create an outgoing connection
3535
/// will panic) or a `Client` configured with a per-test replay proxy.
36-
http_client: Option<Client>,
36+
pub(crate) http_client: Option<Client>,
3737
}
3838

3939
impl App {

src/background_jobs.rs

Lines changed: 24 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,10 @@
11
use std::panic::AssertUnwindSafe;
22
use std::sync::{Mutex, MutexGuard, PoisonError};
3+
use swirl::PerformError;
34

45
use crate::db::{DieselPool, DieselPooledConn};
56
use crate::git::Repository;
7+
use crate::uploaders::Uploader;
68
use crate::util::errors::{CargoErrToStdErr, CargoResult};
79

810
impl<'a> swirl::db::BorrowedConnection<'a> for DieselPool {
@@ -23,18 +25,24 @@ pub struct Environment {
2325
pub credentials: Option<(String, String)>,
2426
// FIXME: https://github.com/sfackler/r2d2/pull/70
2527
pub connection_pool: AssertUnwindSafe<DieselPool>,
28+
pub uploader: Uploader,
29+
http_client: AssertUnwindSafe<Option<reqwest::Client>>,
2630
}
2731

2832
impl Environment {
2933
pub fn new(
3034
index: Repository,
3135
credentials: Option<(String, String)>,
3236
connection_pool: DieselPool,
37+
uploader: Uploader,
38+
http_client: Option<reqwest::Client>,
3339
) -> Self {
3440
Self {
3541
index: Mutex::new(index),
3642
credentials,
3743
connection_pool: AssertUnwindSafe(connection_pool),
44+
uploader,
45+
http_client: AssertUnwindSafe(http_client),
3846
}
3947
}
4048

@@ -44,13 +52,27 @@ impl Environment {
4452
.map(|(u, p)| (u.as_str(), p.as_str()))
4553
}
4654

47-
pub fn connection(&self) -> CargoResult<DieselPooledConn<'_>> {
48-
self.connection_pool.0.get().map_err(Into::into)
55+
pub fn connection(&self) -> Result<DieselPooledConn<'_>, PerformError> {
56+
self.connection_pool
57+
.get()
58+
.map_err(|e| CargoErrToStdErr(e).into())
4959
}
5060

5161
pub fn lock_index(&self) -> CargoResult<MutexGuard<'_, Repository>> {
5262
let repo = self.index.lock().unwrap_or_else(PoisonError::into_inner);
5363
repo.reset_head()?;
5464
Ok(repo)
5565
}
66+
67+
/// Returns a client for making HTTP requests to upload crate files.
68+
///
69+
/// # Panics
70+
///
71+
/// Panics if no client was given when constructing the environment. This
72+
/// will occur in tests unless `TestApp::with_proxy()` is called.
73+
pub(crate) fn http_client(&self) -> &reqwest::Client {
74+
self.http_client
75+
.as_ref()
76+
.expect("No HTTP client is configured. In tests, use `TestApp::with_proxy()`.")
77+
}
5678
}

src/bin/background-worker.rs

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,13 @@ fn main() {
3535

3636
let repository = Repository::open(&config.index_location).expect("Failed to clone index");
3737

38-
let environment = Environment::new(repository, credentials, db_pool.clone());
38+
let environment = Environment::new(
39+
repository,
40+
credentials,
41+
db_pool.clone(),
42+
config.uploader,
43+
Some(reqwest::Client::new()),
44+
);
3945

4046
let runner = swirl::Runner::builder(db_pool, environment)
4147
.thread_count(1)

src/bin/render-readmes.rs

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -114,7 +114,7 @@ fn main() {
114114
let mut tasks = Vec::with_capacity(page_size as usize);
115115
for (version, krate_name) in versions {
116116
let config = config.clone();
117-
version.record_readme_rendering(&conn).unwrap_or_else(|_| {
117+
Version::record_readme_rendering(version.id, &conn).unwrap_or_else(|_| {
118118
panic!(
119119
"[{}-{}] Couldn't record rendering time",
120120
krate_name, version.num
@@ -215,7 +215,6 @@ fn get_readme(
215215
.map_or("README.md", |e| &**e),
216216
manifest.package.repository.as_ref().map(|e| &**e),
217217
)
218-
.unwrap_or_else(|_| panic!("[{}-{}] Couldn't render README", krate_name, version.num))
219218
};
220219
return Some(rendered);
221220

src/controllers/krate/publish.rs

Lines changed: 39 additions & 52 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,6 @@
11
//! Functionality related to publishing a new crate or version of a crate.
22
33
use hex::ToHex;
4-
use std::collections::HashMap;
54
use std::sync::Arc;
65
use swirl::Job;
76

@@ -10,8 +9,8 @@ use crate::git;
109
use crate::models::dependency;
1110
use crate::models::{Badge, Category, Keyword, NewCrate, NewVersion, Rights, User};
1211
use crate::render;
13-
use crate::util::{internal, CargoError, ChainError, Maximums};
1412
use crate::util::{read_fill, read_le_u32};
13+
use crate::util::{CargoError, ChainError, Maximums};
1514
use crate::views::{EncodableCrateUpload, GoodCrate, PublishWarnings};
1615

1716
/// Handles the `PUT /crates/new` route.
@@ -39,29 +38,6 @@ pub fn publish(req: &mut dyn Request) -> CargoResult<Response> {
3938

4039
let (new_crate, user) = parse_new_headers(req)?;
4140

42-
let name = &*new_crate.name;
43-
let vers = &*new_crate.vers;
44-
let links = new_crate.links.clone();
45-
let repo = new_crate.repository.as_ref().map(|s| &**s);
46-
let features = new_crate
47-
.features
48-
.iter()
49-
.map(|(k, v)| {
50-
(
51-
k[..].to_string(),
52-
v.iter().map(|v| v[..].to_string()).collect(),
53-
)
54-
})
55-
.collect::<HashMap<String, Vec<String>>>();
56-
let keywords = new_crate
57-
.keywords
58-
.as_ref()
59-
.map(|kws| kws.iter().map(|kw| &***kw).collect())
60-
.unwrap_or_else(Vec::new);
61-
62-
let categories = new_crate.categories.as_ref().map(|s| &s[..]).unwrap_or(&[]);
63-
let categories: Vec<_> = categories.iter().map(|k| &***k).collect();
64-
6541
let conn = app.diesel_database.get()?;
6642

6743
let verified_email_address = user.verified_email(&conn)?;
@@ -75,15 +51,34 @@ pub fn publish(req: &mut dyn Request) -> CargoResult<Response> {
7551
// Create a transaction on the database, if there are no errors,
7652
// commit the transactions to record a new or updated crate.
7753
conn.transaction(|| {
54+
let name = new_crate.name;
55+
let vers = &*new_crate.vers;
56+
let links = new_crate.links;
57+
let repo = new_crate.repository;
58+
let features = new_crate
59+
.features
60+
.into_iter()
61+
.map(|(k, v)| (k.0, v.into_iter().map(|v| v.0).collect()))
62+
.collect();
63+
let keywords = new_crate
64+
.keywords
65+
.iter()
66+
.map(|s| s.as_str())
67+
.collect::<Vec<_>>();
68+
let categories = new_crate
69+
.categories
70+
.iter()
71+
.map(|s| s.as_str())
72+
.collect::<Vec<_>>();
73+
7874
// Persist the new crate, if it doesn't already exist
7975
let persist = NewCrate {
80-
name,
76+
name: &name,
8177
description: new_crate.description.as_ref().map(|s| &**s),
8278
homepage: new_crate.homepage.as_ref().map(|s| &**s),
8379
documentation: new_crate.documentation.as_ref().map(|s| &**s),
8480
readme: new_crate.readme.as_ref().map(|s| &**s),
85-
readme_file: new_crate.readme_file.as_ref().map(|s| &**s),
86-
repository: repo,
81+
repository: repo.as_ref().map(String::as_str),
8782
license: new_crate.license.as_ref().map(|s| &**s),
8883
max_upload_size: None,
8984
};
@@ -101,7 +96,7 @@ pub fn publish(req: &mut dyn Request) -> CargoResult<Response> {
10196
));
10297
}
10398

104-
if &krate.name != name {
99+
if krate.name != *name {
105100
return Err(human(&format_args!(
106101
"crate was previously named `{}`",
107102
krate.name
@@ -163,31 +158,33 @@ pub fn publish(req: &mut dyn Request) -> CargoResult<Response> {
163158
let ignored_invalid_badges = Badge::update_crate(&conn, &krate, new_crate.badges.as_ref())?;
164159
let max_version = krate.max_version(&conn)?;
165160

166-
// Render the README for this crate
167-
let readme = match new_crate.readme.as_ref() {
168-
Some(readme) => Some(render::readme_to_html(
169-
&**readme,
170-
new_crate.readme_file.as_ref().map_or("README.md", |s| &**s),
161+
if let Some(readme) = new_crate.readme {
162+
render::render_and_upload_readme(
163+
version.id,
164+
readme,
165+
new_crate
166+
.readme_file
167+
.unwrap_or_else(|| String::from("README.md")),
171168
repo,
172-
)?),
173-
None => None,
174-
};
169+
)
170+
.enqueue(&conn)
171+
.map_err(|e| CargoError::from_std_error(e))?;
172+
}
175173

176174
// Upload the crate, return way to delete the crate from the server
177175
// If the git commands fail below, we shouldn't keep the crate on the
178176
// server.
179-
let (cksum, mut crate_bomb, mut readme_bomb) = app
177+
let cksum = app
180178
.config
181179
.uploader
182-
.upload_crate(req, &krate, readme, maximums, vers)?;
183-
version.record_readme_rendering(&conn)?;
180+
.upload_crate(req, &krate, maximums, vers)?;
184181

185182
let mut hex_cksum = String::new();
186183
cksum.write_hex(&mut hex_cksum)?;
187184

188185
// Register this crate in our local git repo.
189186
let git_crate = git::Crate {
190-
name: name.to_string(),
187+
name: name.0,
191188
vers: vers.to_string(),
192189
cksum: hex_cksum,
193190
features,
@@ -197,17 +194,7 @@ pub fn publish(req: &mut dyn Request) -> CargoResult<Response> {
197194
};
198195
git::add_crate(git_crate)
199196
.enqueue(&conn)
200-
.map_err(|e| CargoError::from_std_error(e))
201-
.chain_error(|| {
202-
internal(&format_args!(
203-
"could not add crate `{}` to the git repo",
204-
name
205-
))
206-
})?;
207-
208-
// Now that we've come this far, we're committed!
209-
crate_bomb.path = None;
210-
readme_bomb.path = None;
197+
.map_err(|e| CargoError::from_std_error(e))?;
211198

212199
// The `other` field on `PublishWarnings` was introduced to handle a temporary warning
213200
// that is no longer needed. As such, crates.io currently does not return any `other`

src/git.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -165,7 +165,7 @@ pub fn yank(
165165
let repo = env.lock_index().map_err(std_error_no_send)?;
166166
let dst = repo.index_file(&krate);
167167

168-
let conn = env.connection().map_err(std_error_no_send)?;
168+
let conn = env.connection()?;
169169

170170
conn.transaction(|| {
171171
let yanked_in_db = versions::table

src/middleware/run_pending_background_jobs.rs

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -22,7 +22,13 @@ impl Middleware for RunPendingBackgroundJobs {
2222

2323
let connection_pool = app.diesel_database.clone();
2424
let repo = Repository::open(&app.config.index_location).expect("Could not clone index");
25-
let environment = Environment::new(repo, None, app.diesel_database.clone());
25+
let environment = Environment::new(
26+
repo,
27+
None,
28+
app.diesel_database.clone(),
29+
app.config.uploader.clone(),
30+
app.http_client.clone(),
31+
);
2632

2733
let runner = Runner::builder(connection_pool, environment)
2834
// We only have 1 connection in tests, so trying to run more than

src/models/krate.rs

Lines changed: 0 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -40,8 +40,6 @@ pub struct Crate {
4040
pub description: Option<String>,
4141
pub homepage: Option<String>,
4242
pub documentation: Option<String>,
43-
pub readme: Option<String>,
44-
pub readme_file: Option<String>,
4543
pub license: Option<String>,
4644
pub repository: Option<String>,
4745
pub max_upload_size: Option<i32>,
@@ -58,8 +56,6 @@ type AllColumns = (
5856
crates::description,
5957
crates::homepage,
6058
crates::documentation,
61-
crates::readme,
62-
crates::readme_file,
6359
crates::license,
6460
crates::repository,
6561
crates::max_upload_size,
@@ -74,8 +70,6 @@ pub const ALL_COLUMNS: AllColumns = (
7470
crates::description,
7571
crates::homepage,
7672
crates::documentation,
77-
crates::readme,
78-
crates::readme_file,
7973
crates::license,
8074
crates::repository,
8175
crates::max_upload_size,
@@ -101,7 +95,6 @@ pub struct NewCrate<'a> {
10195
pub homepage: Option<&'a str>,
10296
pub documentation: Option<&'a str>,
10397
pub readme: Option<&'a str>,
104-
pub readme_file: Option<&'a str>,
10598
pub repository: Option<&'a str>,
10699
pub max_upload_size: Option<i32>,
107100
pub license: Option<&'a str>,

src/models/version.rs

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -99,12 +99,12 @@ impl Version {
9999
})
100100
}
101101

102-
pub fn record_readme_rendering(&self, conn: &PgConnection) -> QueryResult<usize> {
102+
pub fn record_readme_rendering(version_id_: i32, conn: &PgConnection) -> QueryResult<usize> {
103103
use crate::schema::readme_renderings::dsl::*;
104104
use diesel::dsl::now;
105105

106106
diesel::insert_into(readme_renderings)
107-
.values(version_id.eq(self.id))
107+
.values(version_id.eq(version_id_))
108108
.on_conflict(version_id)
109109
.do_update()
110110
.set(rendered_at.eq(now))

0 commit comments

Comments
 (0)