Commit 9e8af60

mark-i-m spastorino
authored and committed
Add Karrq's salsa chapter (#529)
* add Karrq's salsa chapter
* add youtu.be short url
1 parent 3984184 commit 9e8af60

3 files changed

+216
-1
lines changed

book.toml

+1-1
@@ -15,6 +15,6 @@ level = 1
1515

1616
[output.linkcheck]
1717
follow-web-links = true
18-
exclude = [ "crates\\.io", "gcc\\.godbolt\\.org", "youtube\\.com", "dl\\.acm\\.org" ]
18+
exclude = [ "crates\\.io", "gcc\\.godbolt\\.org", "youtube\\.com", "youtu\\.be", "dl\\.acm\\.org" ]
1919
cache-timeout = 172800
2020
warning-policy = "error"

src/SUMMARY.md

+1
@@ -38,6 +38,7 @@
3838
- [Incremental compilation](./queries/incremental-compilation.md)
3939
- [Incremental compilation In Detail](./queries/incremental-compilation-in-detail.md)
4040
- [Debugging and Testing](./incrcomp-debugging.md)
41+
- [Salsa](./salsa.md)
4142
- [Lexing and Parsing](./the-parser.md)
4243
- [`#[test]` Implementation](./test-implementation.md)
4344
- [Macro expansion](./macro-expansion.md)

src/salsa.md

+214
@@ -0,0 +1,214 @@
1+
# How Salsa works
2+
3+
This chapter is based on the explanation given by Niko Matsakis in this
4+
[video](https://www.youtube.com/watch?v=_muY4HjSqVw) about
5+
[Salsa](https://github.com/salsa-rs/salsa).
6+
7+
> Salsa is not used directly in rustc, but it is used extensively for
8+
> rust-analyzer and may be integrated into the compiler in the future.
9+
10+
## What is Salsa?
11+
12+
Salsa is a library for incremental recomputation. This means reusing
13+
computation that has already been done in the past to increase the efficiency
14+
of future computations.
15+
16+
The objectives of Salsa are:
17+
* Provide that functionality in an automatic way, so reusing old computations
18+
is done automatically by the library
19+
* Do so in a "sound", or "correct", way, therefore leading to the same
20+
results as if it had been done from scratch
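The two objectives above can be illustrated with a minimal, standard-library-only sketch. This is not Salsa's API: the `Memo` type, its `expensive` function, and the `computations` counter are hypothetical names chosen for illustration. The sketch only shows the basic idea of reusing an earlier result while returning the same answer a from-scratch computation would.

```rust
use std::collections::HashMap;

/// Hypothetical memo table (not a Salsa type): caches the result of a pure
/// function per input value.
struct Memo {
    cache: HashMap<u64, u64>,
    /// Counts how many times the expensive function actually ran.
    computations: usize,
}

impl Memo {
    fn new() -> Self {
        Memo { cache: HashMap::new(), computations: 0 }
    }

    /// Stand-in for an expensive, pure computation: the sum 1 + 2 + ... + n.
    fn expensive(n: u64) -> u64 {
        (1..=n).sum()
    }

    /// Returns the cached result if there is one; otherwise computes the
    /// value, stores it, and returns it.
    fn get(&mut self, n: u64) -> u64 {
        if let Some(&value) = self.cache.get(&n) {
            return value;
        }
        self.computations += 1;
        let value = Self::expensive(n);
        self.cache.insert(n, value);
        value
    }
}
```

Calling `get` twice with the same argument runs `expensive` only once, and the cached answer is identical to a fresh call to `expensive`, which is the "sound" property in miniature.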
21+
22+
Salsa's actual model is much richer, allowing many kinds of inputs and many
23+
different outputs.
24+
For example, integrating Salsa with an IDE could mean that the inputs could be
25+
the manifest (`Cargo.toml`), entire source files (`foo.rs`), snippets and so
26+
on; the outputs of such an integration could range from a binary executable, to
27+
lints, types (for example, if a user selects a certain variable and wishes to
28+
see its type), completions, etc.
29+
30+
## How does it work?
31+
32+
The first thing that Salsa has to do is identify the "base inputs" [^EN1].
33+
34+
Then Salsa also has to identify intermediate, "derived" values, which the
35+
library produces. For each derived value there's a
36+
"pure" function that computes it.
37+
38+
For example, there might be a function `ast(x: Path) -> AST`. The produced
39+
`AST` isn't a final value; it's an intermediate value that the library would
40+
use for the computation.
41+
42+
This means that when you try to compute with the library, Salsa is going to
43+
compute various derived values, and eventually read the input and produce the
44+
result for the asked computation.
45+
46+
In the course of computing, Salsa tracks which inputs were accessed and which
47+
values are derived. This information is used to determine what's going to
48+
happen when the inputs change: are the derived values still valid?
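As a rough illustration of tracking which inputs a computation accessed, here is a small sketch that uses only the standard library. It is not how Salsa is implemented: `Db`, `input`, `take_reads`, and `word_count` are hypothetical names, and real Salsa records dependencies per query rather than in a single shared set.

```rust
use std::cell::RefCell;
use std::collections::{BTreeSet, HashMap};

/// Hypothetical database (not a Salsa type) that records input accesses.
#[derive(Default)]
struct Db {
    inputs: HashMap<String, String>,
    /// Names of the inputs read since the last call to `take_reads`.
    reads: RefCell<BTreeSet<String>>,
}

impl Db {
    fn set_input(&mut self, name: &str, text: &str) {
        self.inputs.insert(name.to_string(), text.to_string());
    }

    /// Reads a base input and records that it was accessed.
    fn input(&self, name: &str) -> String {
        self.reads.borrow_mut().insert(name.to_string());
        self.inputs.get(name).cloned().unwrap_or_default()
    }

    /// Returns the set of inputs read so far and clears the record.
    fn take_reads(&self) -> BTreeSet<String> {
        std::mem::take(&mut *self.reads.borrow_mut())
    }
}

/// A "pure" derived function: its result depends only on what it reads
/// through `db`, so the recorded reads are exactly its dependencies.
fn word_count(db: &Db, file: &str) -> usize {
    db.input(file).split_whitespace().count()
}
```

After running `word_count(&db, "a.rs")`, the recorded set contains only `a.rs`, so a later change to any other input cannot affect that result.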
49+
50+
This doesn't necessarily mean that each computation downstream from the input
51+
is going to be checked, which could be costly. Salsa only needs to check each
52+
downstream computation until it finds one that isn't changed. At that point, it
53+
won't check other derived computations since they wouldn't need to change.
54+
55+
It is helpful to think about this as a graph with nodes. Each derived value
56+
has a dependency on other values, which could themselves be either base or
57+
derived. Base values don't have a dependency.
58+
59+
```ignore
60+
I <- A <- C ...
61+
|
62+
J <- B <--+
63+
```
64+
65+
When an input `I` changes, the derived value `A` could change. The derived
66+
value `B`, which does not depend on `I`, `A`, or any value derived from `A` or
67+
`I`, is not subject to change. Therefore, Salsa can reuse the computation done
68+
for `B` in the past, without having to compute it again.
69+
70+
The computation could also terminate early. Keeping the same graph as before,
71+
say that input `I` has changed in some way (and input `J` hasn't) but, when
72+
computing `A` again, it's found that `A` hasn't changed from the previous
73+
computation. This leads to an "early termination", because there's no need to
74+
check if `C` needs to change, since both of `C`'s direct inputs, `A` and `B`,
75+
haven't changed.
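The early termination described above can be sketched with revision counters, using only the standard library. Everything here is an assumption made for illustration, not Salsa's implementation: the `Memo`, `Db`, and `refresh` names, the revision scheme, and the concrete functions (`A = I / 2`, `B = J + 1`, `C = A + B`) are all hypothetical. The graph is the same as in the diagram.

```rust
/// A memoized derived value plus two revision stamps (hypothetical type).
#[derive(Clone, Copy)]
struct Memo {
    value: i64,
    changed_at: u64,  // last revision in which `value` actually changed
    verified_at: u64, // last revision in which `value` was known to be valid
}

/// Brings a derived value that depends on one input up to date.
/// `input` is (value, revision of the input's last change).
fn refresh(old: Option<Memo>, input: (i64, u64), now: u64,
           runs: &mut usize, f: fn(i64) -> i64) -> Memo {
    if let Some(m) = old {
        if input.1 <= m.verified_at {
            // The input has not changed since we last checked: reuse.
            return Memo { verified_at: now, ..m };
        }
    }
    *runs += 1;
    let value = f(input.0);
    // If the recomputed value equals the old one, keep the old `changed_at`
    // so that downstream values can still be reused.
    let changed_at = match old {
        Some(m) if m.value == value => m.changed_at,
        _ => now,
    };
    Memo { value, changed_at, verified_at: now }
}

struct Db {
    revision: u64,
    i: (i64, u64),
    j: (i64, u64),
    a: Option<Memo>,
    b: Option<Memo>,
    c: Option<Memo>,
    a_runs: usize,
    b_runs: usize,
    c_runs: usize,
}

impl Db {
    fn new(i: i64, j: i64) -> Self {
        Db { revision: 1, i: (i, 1), j: (j, 1), a: None, b: None, c: None,
             a_runs: 0, b_runs: 0, c_runs: 0 }
    }

    /// Changing an input starts a new revision.
    fn set_i(&mut self, value: i64) {
        self.revision += 1;
        self.i = (value, self.revision);
    }

    /// Computes `C`, recomputing it only if `A` or `B` really changed.
    fn c(&mut self) -> i64 {
        let a = refresh(self.a, self.i, self.revision, &mut self.a_runs, |i| i / 2);
        let b = refresh(self.b, self.j, self.revision, &mut self.b_runs, |j| j + 1);
        self.a = Some(a);
        self.b = Some(b);
        if let Some(m) = self.c {
            if a.changed_at <= m.verified_at && b.changed_at <= m.verified_at {
                self.c = Some(Memo { verified_at: self.revision, ..m });
                return m.value; // early termination: `C` is reused
            }
        }
        self.c_runs += 1;
        let value = a.value + b.value;
        self.c = Some(Memo { value, changed_at: self.revision, verified_at: self.revision });
        value
    }
}
```

With `I = 4` and `J = 10`, changing `I` to `5` reruns `A` (which is still `2`) but not `B` or `C`; changing `I` to `8` changes `A` and therefore forces `C` to run again.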
76+
77+
## Key Salsa concepts
78+
79+
### Query
80+
81+
A query is some value that Salsa can access in the course of computation. Each
82+
query can have a number of keys (from 0 to many), and all queries have a
83+
result, akin to functions. 0-key queries are called "input" queries.
84+
85+
### Database
86+
87+
The database is basically the context for the entire computation. It's meant to
88+
store Salsa's internal state, all intermediate values for each query, and
89+
anything else that the computation might need. The database must know all the
90+
queries that the library is going to do before it can be built, but they don't
91+
need to be specified in the same place.
92+
93+
After the database is formed, it can be accessed with queries that are very
94+
similar to functions. Since each query's result is stored in the database,
95+
when a query is invoked N times, it will return N **cloned** results, without
96+
having to recompute the query (unless the input has changed in such a way that
97+
it warrants recomputation).
98+
99+
For each input query (0-key), a "set" method is generated, allowing the user to
100+
change the output of such a query and trigger previously memoized values to be
101+
potentially invalidated.
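To make the behavior of stored results and "set" methods concrete, here is a hand-written sketch that uses only the standard library. It is not generated by Salsa and does not use Salsa's API: the `Database` struct, the `tokens` query, and the `tokens_runs` counter are hypothetical, and the invalidation is deliberately coarse (the memo is simply dropped), whereas Salsa only marks values as potentially invalid.

```rust
use std::collections::HashMap;

/// Hypothetical hand-rolled database (not a Salsa type).
#[derive(Default)]
struct Database {
    /// Storage for the input query `source_text`, keyed by file name.
    source_text: HashMap<String, String>,
    /// Memoized results of the derived query `tokens`.
    tokens_memo: HashMap<String, Vec<String>>,
    /// How many times `tokens` was actually computed.
    tokens_runs: usize,
}

impl Database {
    /// "set" method for the input query: stores the new value and drops the
    /// memo that depended on the old one.
    fn set_source_text(&mut self, name: &str, text: &str) {
        self.source_text.insert(name.to_string(), text.to_string());
        self.tokens_memo.remove(name);
    }

    /// Input query: returns a clone of the stored text.
    fn source_text(&self, name: &str) -> String {
        self.source_text.get(name).cloned().unwrap_or_default()
    }

    /// Derived query: returns a clone of the memoized result when one
    /// exists, and computes and stores it otherwise.
    fn tokens(&mut self, name: &str) -> Vec<String> {
        if let Some(memo) = self.tokens_memo.get(name) {
            return memo.clone();
        }
        self.tokens_runs += 1;
        let result: Vec<String> = self
            .source_text(name)
            .split_whitespace()
            .map(str::to_string)
            .collect();
        self.tokens_memo.insert(name.to_string(), result.clone());
        result
    }
}
```

Invoking `tokens` N times between two `set_source_text` calls yields N clones but only one computation; the next call after a `set_source_text` computes again.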
102+
103+
### Query Groups
104+
105+
A query group is a set of queries which have been defined together as a unit.
106+
The database is formed by combining query groups. Query groups are akin to
107+
"Salsa modules" [^EN2].
108+
109+
The queries in a query group are just methods in a trait.
110+
111+
To create a query group, a trait annotated with a specific attribute
112+
(`#[salsa::query_group(...)]`) has to be created.
113+
114+
An argument must also be provided to said attribute as it will be used by Salsa
115+
to create a struct to be used later when the database is created.
116+
117+
Example input query group:
118+
119+
```rust,ignore
120+
/// This attribute will process this tree, produce this tree as output, and produce
121+
/// a bunch of intermediate stuff that Salsa also uses. One of these things is a
122+
/// "StorageStruct", whose name we have specified in the attribute.
123+
///
124+
/// This query group is a bunch of **input** queries, that do not rely on any
125+
/// derived input.
126+
#[salsa::query_group(InputsStorage)]
127+
pub trait Inputs {
128+
/// This attribute (`#[salsa::input]`) indicates that this query is a base
129+
/// input, therefore `set_manifest` is going to be auto-generated
130+
#[salsa::input]
131+
fn manifest(&self) -> Manifest;
132+
133+
#[salsa::input]
134+
fn source_text(&self, name: String) -> String;
135+
}
136+
```
137+
138+
To create a **derived** query group, one must specify which other query groups
139+
this one depends on by specifying them as supertraits, as seen in the following
140+
example:
141+
142+
```rust,ignore
143+
/// This query group is going to contain queries that depend on derived values. A
144+
/// query group can access another query group's queries by specifying the
145+
/// dependency as a supertrait. Query groups can be stacked as much as needed using
146+
/// that pattern.
147+
#[salsa::query_group(ParserStorage)]
148+
pub trait Parser: Inputs {
149+
/// This query `ast` is not an input query, it's a derived query. This means
150+
/// that a definition is necessary.
151+
fn ast(&self, name: String) -> String;
152+
}
153+
```
154+
155+
When creating a derived query, the implementation of said query must be defined
156+
outside the trait. The definition must take a database parameter as an `impl
157+
Trait` (or `dyn Trait`), where `Trait` is the query group that the definition
158+
belongs to, in addition to the other keys.
159+
160+
```rust,ignore
161+
///This is going to be the definition of the `ast` query in the `Parser` trait.
162+
///So, when the query `ast` is invoked, and it needs to be recomputed, Salsa is going to call this function
163+
///and it's going to give it the database as `impl Parser`.
164+
///The function doesn't need to be aware of all the queries of all the query groups
165+
fn ast(db: &impl Parser, name: String) -> String {
166+
//! Note, `impl Parser` is used here but `dyn Parser` works just as well
167+
/* code */
168+
///By passing an `impl Parser`, this is allowed
169+
let source_text = db.source_text(name);
170+
/* do the actual parsing */
171+
return ast;
172+
}
173+
```
174+
175+
Eventually, after all the query groups have been defined, the database can be
176+
created by declaring a struct.
177+
178+
To specify which query groups are going to be part of the database an attribute
179+
(`#[salsa::database(...)]`) must be added. The argument of said attribute is a
180+
list of identifiers, specifying the query groups **storages**.
181+
182+
```rust,ignore
183+
///This attribute specifies which query groups are going to be in the database
184+
#[salsa::database(InputsStorage, ParserStorage)]
185+
#[derive(Default)] //optional!
186+
struct MyDatabase {
187+
///You also need this one field
188+
runtime: salsa::Runtime<MyDatabase>,
189+
}
190+
///And this trait has to be implemented
191+
impl salsa::Database for MyDatabase {
192+
fn salsa_runtime(&self) -> &salsa::Runtime<MyDatabase> {
193+
&self.runtime
194+
}
195+
}
196+
```
197+
198+
Example usage:
199+
200+
```rust,ignore
201+
fn main() {
202+
let mut db = MyDatabase::default();
203+
db.set_manifest(...);
204+
db.set_source_text(...);
205+
loop {
206+
db.ast(...); //will reuse results
207+
db.set_source_text(...);
208+
}
209+
}
210+
```
211+
212+
[^EN1]: "They are not something that you **inaudible** but something that you kinda get **inaudible** from the outside" [3:23](https://youtu.be/_muY4HjSqVw?t=203).
213+
214+
[^EN2]: What is a Salsa module?
