Set up the groundwork for a port to Diesel #589
Conversation
Force-pushed from 1689966 to 0296ec4 (compare)
So just glancing through where the codebase is at now, a few notes in terms of feature support.
None of this would block finishing the port, but we will have a few places where we'll have to fall back to raw SQL. And I probably have 2 more releases of Diesel worth of work to do to support all this without raw SQL. :P
I'm fine with having to fall back to raw sql sometimes. THEY'RE GOOD QUERIES, SGROF
but but https://github.com/rust-lang/crates.io/blob/master/src/krate.rs#L497 I'd like to refactor that part, one because it could use some more structure and two to eventually support things like searching with a query within a keyword. I'll think about how we could rework this.
Yes, everything except the |
The DATABASE_URL env var set by heroku does not have
There's at least one place I know crates.io depends on this-- when uploading a new version, if the git updates or s3 uploads fail, we roll back the database additions. By "it's unnecessary most of the time", do you mean that Diesel has a way to wrap those requests where it is necessary in a transaction? @alexcrichton, do you know of any other places besides new crate and tests where we're relying on each request being wrapped in a transaction?
Yes. We can wrap in a transaction where it's needed (this PR does that in |
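For illustration, the explicit wrapping could look roughly like this. This is a minimal sketch using a Diesel 1.x-style `transaction` signature, and `publish_crate` and its body are placeholders rather than this PR's actual code:

```rust
use diesel::pg::PgConnection;
use diesel::prelude::*;
use diesel::result::Error;

// Minimal sketch: wrap only the endpoints that mutate data in an explicit
// transaction, instead of wrapping every request implicitly.
fn publish_crate(conn: &PgConnection) -> Result<(), Error> {
    conn.transaction::<_, Error, _>(|| {
        // ... insert the crate, version, and dependency rows here ...

        // If a later step (git index update, S3 upload) fails, return Err
        // and the whole transaction rolls back.
        Ok(())
    })
}
```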
I've updated to require SSL on Heroku. (Probably not even required, since the default value is |
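One way the URL tweak could be handled is sketched below; this is an assumption about the wiring rather than what this PR does, and it reuses the `HEROKU` env var the existing rust-postgres code keys off of:

```rust
use std::env;

// Illustrative: only require SSL when running on Heroku, and only if the
// URL doesn't already carry an sslmode parameter.
fn database_url() -> String {
    let url = env::var("DATABASE_URL").expect("DATABASE_URL must be set");
    if env::var("HEROKU").is_ok() && !url.contains("sslmode=") {
        format!("{}?sslmode=require", url)
    } else {
        url
    }
}
```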
Force-pushed from 8a61fc2 to d2cfce3 (compare)
Hm. Well so I wrote the whole web app on this assumption, so we may want to leave it for now :). In general there's lots of transactional points such as:
We currently rely on database transactions so if github fails the insertion is rolled back. Same for tons of API calls and such; the database is updated willy-nilly on the assumption that a failure down the line will roll back updates. Also yeah, I believe SSL is required on Heroku nowadays (I found it hard to find docs on that...). The ironic part is that although SSL is required they're not using valid certificates, so it's still MitM happy all over the place. I emailed them a while back about this and they said yeah, that's what most database drivers do, ignore certs. ;_;
To be clear, I have no issue with wrapping everything in a transaction, I just would rather do it in the requests that perform updates than around every request implicitly. Since our transactions are lexically scoped, it's difficult to establish the transaction lazily if we wrap each request.
Returning an error would still roll back the transaction, we just wouldn't wrap it in endpoints like |
Ah yeah so if errors still roll back transactions then that's fine by me, it was the only intention of the setup. (I thought that was standard "write a web app" practice, right?) Unfortunately crates.io doesn't have a lexical scope for the transaction, you can see that with the bit of unsafe code around managing the postgres transaction (I think it's still there at least...) Also FWIW one speed bump I hit with Heroku + Diesel in the past was that Diesel needed a db connection at compile-time but w/ Heroku you don't have that (only at runtime). Is that |
Errors rolling back definitely is. I haven't seen "wrap all requests in a transaction" too often though.
We can do it in a middleware, but it would require eagerly acquiring the database connection. And yeah, we'd have to do something similar to what you're doing with the postgres transaction (and rely on Diesel internals) to avoid that.
Also I thought you could get a DB connection at build time with Heroku now? Maybe I'm mistaken
Asked my friend who works at Heroku.
@alexcrichton So we can use |
Oh interesting, how do you roll back without a transaction? (I thought transactions were how that was done). Acquiring a db connection eagerly is probably ok, we can always investigate other routes if it turns out to be a problem. Ah yeah and that makes sense (first deploy not having a db). I'm fine either way w/ crates.io, but it does mean that redeployment may be difficult if we ever try to do it...
Sorry to be more clear, you do use a transaction, just one that's explicit (or implicit at a lower level. e.g. Rails wraps every
It was a problem in tests, but I can dig in and see if I can figure out why.
I'll go with the manual file for now, since getting that |
Force-pushed from d2cfce3 to a3decb3 (compare)
I've rebased and updated to not connect to the database at compile time.
👍
Ah, I've figured out what the issue with tests was. I'll go ahead and update this to retain the auto transaction wrapping for now. I'll continue to explicitly wrap anything that looks like it should be in a transaction though, so we can easily go to a lazy connection if we decide it's necessary.
Actually I think that's best left for a separate PR, since this one is quite large already. Are there any other concerns I need to address?
This PR changes the `users#show` endpoint to use Diesel instead of rust-postgres (and adds a test for that endpoint). I chose that endpoint as it seemed likely to touch as few things as possible, but enough to require setting up Diesel connection pooling and testing.

I have also ported over the migrations to use Diesel's migration infrastructure. The migration files (except the final migration) were generated programmatically. Any non-reversible migrations were not dumped, as these contained procedural data updates which aren't relevant to setting up a new database.

`cargo run --bin migrate` will move all the entries in the old migrations table into the one used by Diesel. That function is brittle since it relies on Diesel internals. However, it really only matters for existing contributors (and one deploy to Heroku), so when that function breaks we can just delete it.

I've added an additional migration to make the schema compatible with `infer_schema!`, which doesn't support tables that have no primary key. I'm not using `infer_schema!` just yet, as it doesn't work with non-core types and there are some tsvector columns.

I re-ordered the columns on the `User` struct, as Diesel emits an explicit select clause and then fetches columns by index rather than by name. This means that the order has to match how it was defined to Diesel (which in the case of `infer_schema!` will be the database definition order). If we don't want this restriction, we can replace the `infer_schema!` call with manual `table!` calls (which can be automatically generated by `diesel print-schema`), and have the columns ordered there to match the struct.

Differences to note
-------------------

The rust-postgres connection is set to require TLS when the env var `HEROKU` is set. In Diesel, the equivalent would be to add `?sslmode=require` onto the database URL. I'm unsure whether heroku-postgres URLs already have this or not, so I have not added any explicit code for that. It's a question that should be answered before this is deployed.

Additionally, I chose not to automatically wrap each request in a transaction. It's unnecessary most of the time, and it seems like it was only done to make testing easier. Since Diesel has explicit support for "run tests in a transaction that never commits", we don't need that here.
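As a side note on that last point, the "test transaction that never commits" support referred to above looks roughly like this; a minimal sketch, and the helper name is made up:

```rust
use diesel::pg::PgConnection;
use diesel::prelude::*;

// Illustrative helper: open a connection whose work is never committed, so
// each test sees a clean database. `begin_test_transaction` is the Diesel
// method that provides this.
fn test_conn(database_url: &str) -> PgConnection {
    let conn = PgConnection::establish(database_url)
        .expect("failed to connect to the test database");
    conn.begin_test_transaction()
        .expect("failed to start the test transaction");
    conn
}
```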
This isn't strictly necessary, but inferring the `crates` table requires a feature that Diesel won't have until 0.12, and this has the fewest unanswered questions.
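For reference, a manual `table!` definition of the kind `diesel print-schema` generates might look like the sketch below. The column list for `users` is an assumption pieced together from this thread, not the actual generated file:

```rust
// Assumes `#[macro_use] extern crate diesel;` so the `table!` macro is in scope.
table! {
    users (id) {
        id -> Integer,
        gh_access_token -> Varchar,
        api_token -> Varchar,
        gh_login -> Varchar,
        name -> Nullable<Varchar>,
        gh_avatar -> Nullable<Varchar>,
    }
}
```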
Force-pushed from a3decb3 to f5f3d85 (compare)
So I ran Also could you update the README with new setup instructions? Based on the travis changes, it looks like people need to |
Looks like it needs to be |
Yeah, by default it installs with all the supported backends so if you don't have mysql and sqlite installed you'll want |
Yes. It also migrated your old migrations table over to the Diesel one.
oops, found another spot-- could you update the Procfile too?
I've left the original migration binary in there for now, since we need to migrate the database to use Diesel's migration infrastructure on the first deploy. After this commit has been deployed it can be removed.
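For context, the data move that binary performs boils down to copying version identifiers from the old bookkeeping table into Diesel's. A rough sketch of the idea is below; the legacy table and column names are assumptions, and the real binary goes through Diesel's internals rather than a hand-written query:

```rust
use diesel::pg::PgConnection;
use diesel::prelude::*;
use diesel::result::Error;

// Illustrative only: copy already-applied migration versions from the legacy
// `schema_migrations` table into Diesel's `__diesel_schema_migrations`.
fn migrate_bookkeeping(conn: &PgConnection) -> Result<usize, Error> {
    conn.execute(
        "INSERT INTO __diesel_schema_migrations (version) \
         SELECT version FROM schema_migrations",
    )
}
```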
use app::{App, RequestApp};
use util::{CargoResult, LazyCell, internal};

pub type Pool = r2d2::Pool<PCM>;
pub type Config = r2d2::Config<pg::Connection, r2d2_postgres::Error>;
type PooledConnnection = r2d2::PooledConnection<PCM>;
pub type DieselPool = r2d2::Pool<ConnectionManager<PgConnection>>;
type DieselPooledConn = r2d2::PooledConnection<ConnectionManager<PgConnection>>;
Nice!
@@ -0,0 +1,171 @@
// This file can be regenerated with `diesel print-schema` |
YAYYYY A SCHEMA!!!!!!
pub gh_access_token: String,
pub api_token: String,
pub gh_login: String,
pub name: Option<String>,
pub avatar: Option<String>,
Wait, `Queryable` cares about the order of the fields on `User`, but it doesn't care that the field here is named `avatar` while the field in the db is named `gh_avatar`?
Correct. We will generate an explicit `select` clause so we know the order that the fields come back in, and then access them by order. This is because we can really only represent rows as a tuple; we can't express "has a field called `name` that is of this type" in the type system. We also don't assume that your `Queryable` structs are one-to-one with a database table. That said, I'm looking at ways to optionally allow by-name instead of by-index.
Ah, ok, gotcha.
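To make the by-index behavior concrete, a sketch of the derive is below. The field list is assumed from the snippet above; the point is that the struct's field order must match the column order Diesel knows about, while the field names themselves don't matter:

```rust
// Assumes `#[macro_use] extern crate diesel;` (or the equivalent derive import).
#[derive(Queryable)]
pub struct User {
    pub id: i32,
    pub gh_access_token: String,
    pub api_token: String,
    pub gh_login: String,
    pub name: Option<String>,
    // Matched to the `gh_avatar` column purely by position, not by name.
    pub avatar: Option<String>,
}
```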
Seems to work locally, just have that one question about |
wait how does heroku know it needs diesel-cli?
It doesn't. I'm making a buildpack now.
lolololol
@@ -42,6 +43,7 @@ addons:
env:
  global:
    - DATABASE_URL=postgres://postgres:@localhost/cargo_registry_test
    - TEST_DATABASE_URL=postgres://postgres:@localhost/cargo_registry_test
oh so if travis is using DATABASE_URL now, do we need to set TEST_DATABASE_URL anymore?
`DATABASE_URL` is just used by diesel for migrations by convention. The tests still run using `TEST_DATABASE_URL`.
I was also using `DATABASE_URL` for compilation before, but since we don't need that now I can just specify `--database-url` instead.
You mean just for `diesel database setup`, not migrations, right? Because it looks like `src/tests/all.rs` takes care of the migrations and uses `TEST_DATABASE_URL`? I just want to make sure I understand :)
Yes. `diesel database setup` runs migrations as well.
ah i see
We'll want to drop `schema_migrations` sometime soonish, but for now let's make sure that everything just works. We don't want to incur downtime if we have to roll back.
Force-pushed from 8a245a9 to 203555f (compare)
R2D2 tries to immediately populate the pool, and we're running out of connections when it does so. Since Diesel is only used on one endpoint, we don't need 10 connections.
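As a rough illustration of capping the pool, the sketch below uses the current `diesel::r2d2` builder API; the crates in use at the time of this PR configured the pool through a separate `r2d2::Config`, so treat the exact calls as assumptions:

```rust
use diesel::pg::PgConnection;
use diesel::r2d2::{ConnectionManager, Pool};

// Illustrative: a small pool for the single Diesel-backed endpoint, so the
// eager population on startup doesn't exhaust the database's connection limit.
fn diesel_pool(database_url: &str) -> Pool<ConnectionManager<PgConnection>> {
    let manager = ConnectionManager::<PgConnection>::new(database_url);
    Pool::builder()
        .max_size(2)
        .build(manager)
        .expect("failed to build the Diesel connection pool")
}
```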
omg it's working on staging, hold onto your butts
#588 (comment)