| 1 | +# How Salsa works |
| 2 | + |
| 3 | +This chapter is based on the explanation given by Niko Matsakis in this |
| 4 | +[video](https://www.youtube.com/watch?v=_muY4HjSqVw) about |
| 5 | +[Salsa](https://github.com/salsa-rs/salsa). |
| 6 | + |
| 7 | +> Salsa is not used directly in rustc, but it is used extensively for |
| 8 | +> rust-analyzer and may be integrated into the compiler in the future. |
| 9 | + |
| 10 | +## What is Salsa? |
| 11 | + |
| 12 | +Salsa is a library for incremental recomputation. This means reusing |
| 13 | +computations that have already been done in the past to increase the efficiency |
| 14 | +of future computations. |
| 15 | + |
| 16 | +The objectives of Salsa are: |
| 17 | + * Provide that functionality in an automatic way, so reusing old computations |
| 18 | + is done automatically by the library |
| 19 | + * Do so in a "sound", or "correct", way, therefore leading to the same |
| 20 | + results as if it had been done from scratch |
| 21 | + |
| 22 | +Salsa's actual model is much richer, allowing many kinds of inputs and many |
| 23 | +different outputs. |
| 24 | +For example, integrating Salsa with an IDE could mean that the inputs could be |
| 25 | +the manifest (`Cargo.toml`), entire source files (`foo.rs`), snippets and so |
| 26 | +on; the outputs of such an integration could range from a binary executable, to |
| 27 | +lints, types (for example, if a user selects a certain variable and wishes to |
| 28 | +see its type), completions, etc. |
| 29 | + |
| 30 | +## How does it work? |
| 31 | + |
| 32 | +The first thing that Salsa has to do is identify the "base inputs" [^EN1]. |
| 33 | + |
| 34 | +Then Salsa also has to identify intermediate, "derived" values, which are |
| 35 | +produced by the library; for each derived value there's a "pure" function |
| 36 | +that computes it. |
| 37 | + |
| 38 | +For example, there might be a function `ast(x: Path) -> AST`. The produced |
| 39 | +`AST` isn't a final value; it's an intermediate value that the library would |
| 40 | +use for the computation. |
| 41 | + |
| 42 | +This means that when you try to compute with the library, Salsa is going to |
| 43 | +compute various derived values, and eventually read the input and produce the |
| 44 | +result for the requested computation. |
| 45 | + |
| 46 | +In the course of computing, Salsa tracks which inputs were accessed and which |
| 47 | +values are derived. This information is used to determine what's going to |
| 48 | +happen when the inputs change: are the derived values still valid? |
| 49 | + |
| 50 | +This doesn't necessarily mean that each computation downstream from the input |
| 51 | +is going to be checked, which could be costly. Salsa only needs to check each |
| 52 | +downstream computation until it finds one that hasn't changed. At that point, it |
| 53 | +won't check other derived computations since they wouldn't need to change. |
| 54 | + |
| 55 | +It's helpful to think about this as a graph with nodes. Each derived value |
| 56 | +has a dependency on other values, which could themselves be either base or |
| 57 | +derived. Base values don't have any dependencies. |
| 58 | + |
| 59 | +```ignore |
| 60 | +I <- A <- C ... |
| 61 | +          | |
| 62 | +J <- B <--+ |
| 63 | +``` |
| 64 | + |
| 65 | +When an input `I` changes, the derived value `A` could change. The derived |
| 66 | +value `B`, which does not depend on `I`, `A`, or any value derived from `A` or |
| 67 | +`I`, is not subject to change. Therefore, Salsa can reuse the computation done |
| 68 | +for `B` in the past, without having to compute it again. |
| 69 | + |
| 70 | +The computation could also terminate early. Keeping the same graph as before, |
| 71 | +say that input `I` has changed in some way (and input `J` hasn't) but, when |
| 72 | +computing `A` again, it's found that `A` hasn't changed from the previous |
| 73 | +computation. This leads to an "early termination", because there's no need to |
| 74 | +check if `C` needs to change, since both of `C`'s direct inputs, `A` and `B`, |
| 75 | +haven't changed. |
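
The early termination described above can be sketched with a small hand-rolled model of the same graph. This is only an illustration under stated assumptions, not Salsa's implementation or API: the names `Memo`, `Db`, `refresh`, `changed_at` and `verified_at` are hypothetical, and the concrete computations (`A = I % 2`, `B = J + 1`, `C = A + B`) are made up for the example. Each memoized value remembers the revision in which it last changed, so a recomputation that yields an equal value does not disturb anything downstream.

```rust
// Hand-rolled model of the graph above: A reads I, B reads J, C reads A and B.
// Every name here is hypothetical; this is not the Salsa API.

#[derive(Clone, Copy)]
struct Memo {
    value: u32,
    changed_at: u64,  // revision in which the value last changed
    verified_at: u64, // revision in which the value was last checked
}

/// Brings a memo up to date. Returns the memo and whether `compute` ran.
fn refresh(
    old: Option<Memo>,
    deps_changed_at: u64,
    now: u64,
    compute: impl FnOnce() -> u32,
) -> (Memo, bool) {
    match old {
        // No dependency changed since the last check: reuse the stored value.
        Some(m) if deps_changed_at <= m.verified_at => (Memo { verified_at: now, ..m }, false),
        _ => {
            let value = compute();
            // An equal result keeps the old `changed_at`; this is what lets
            // downstream values terminate early.
            let changed_at = match old {
                Some(m) if m.value == value => m.changed_at,
                _ => now,
            };
            (Memo { value, changed_at, verified_at: now }, true)
        }
    }
}

#[derive(Default)]
struct Db {
    revision: u64,
    i: (u32, u64), // (value, revision in which it was set)
    j: (u32, u64),
    a: Option<Memo>,
    b: Option<Memo>,
    c: Option<Memo>,
    recomputed: Vec<&'static str>, // log of derived values that were recomputed
}

impl Db {
    fn set_i(&mut self, v: u32) {
        self.revision += 1;
        self.i = (v, self.revision);
    }

    fn set_j(&mut self, v: u32) {
        self.revision += 1;
        self.j = (v, self.revision);
    }

    fn a(&mut self) -> Memo {
        let (input, changed_at) = self.i;
        let (m, ran) = refresh(self.a, changed_at, self.revision, || input % 2);
        if ran {
            self.recomputed.push("A");
        }
        self.a = Some(m);
        m
    }

    fn b(&mut self) -> Memo {
        let (input, changed_at) = self.j;
        let (m, ran) = refresh(self.b, changed_at, self.revision, || input + 1);
        if ran {
            self.recomputed.push("B");
        }
        self.b = Some(m);
        m
    }

    fn c(&mut self) -> u32 {
        let a = self.a();
        let b = self.b();
        // C only needs recomputing if A or B changed after C was last checked.
        let deps_changed_at = a.changed_at.max(b.changed_at);
        let (m, ran) = refresh(self.c, deps_changed_at, self.revision, || a.value + b.value);
        if ran {
            self.recomputed.push("C");
        }
        self.c = Some(m);
        m.value
    }
}

fn main() {
    let mut db = Db::default();
    db.set_i(4);
    db.set_j(10);
    let c = db.c();
    println!("C = {c}, recomputed {:?}", db.recomputed);

    // I changes from 4 to 6, but A (the parity of I) stays the same.
    db.recomputed.clear();
    db.set_i(6);
    let c = db.c();
    println!("C = {c}, recomputed {:?}", db.recomputed);
}
```

In this sketch the second call to `c` recomputes `A`, finds it equal to the previous value, and reuses both `B` and `C` without running their computations again.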
| 76 | + |
| 77 | +## Key Salsa concepts |
| 78 | + |
| 79 | +### Query |
| 80 | + |
| 81 | +A query is some value that Salsa can access in the course of computation. Each |
| 82 | +query can have a number of keys (from 0 to many), and all queries have a result, |
| 83 | +akin to functions. Queries whose values are set from outside are called "input" queries. |
| 84 | + |
| 85 | +### Database |
| 86 | + |
| 87 | +The database is basically the context for the entire computation; it's meant to |
| 88 | +store Salsa's internal state, all intermediate values for each query, and |
| 89 | +anything else that the computation might need. The database must know all the |
| 90 | +queries that the library is going to do before it can be built, but they don't |
| 91 | +need to be specified in the same place. |
| 92 | + |
| 93 | +After the database is formed, it can be accessed with queries that are very |
| 94 | +similar to functions. Since each query's result is stored in the database, |
| 95 | +when a query is invoked N times, it will return N **cloned** results, without |
| 96 | +having to recompute the query (unless the input has changed in such a way that |
| 97 | +it warrants recomputation). |
| 98 | + |
| 99 | +For each input query, a "set" method is generated, allowing the user to |
| 100 | +change the output of that query and potentially invalidate previously |
| 101 | +memoized values. |
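
The interplay between memoized results, cloned return values and the generated "set" method can be sketched without Salsa. Everything below is a hypothetical stand-in, not the Salsa API: the `Db` struct, the `shouted` derived query and the `computations` counter exist only for this illustration. The sketch also drops the stored result eagerly in the setter, which is a simplification and not a claim about how Salsa invalidates values.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a database; this is not the Salsa API.
#[derive(Default)]
struct Db {
    source_text: HashMap<String, String>,  // input query, keyed by file name
    shouted_memo: HashMap<String, String>, // memoized results of the derived query
    computations: usize,                   // how often `shouted` really ran
}

impl Db {
    /// Setter for the input query: stores the new value and drops the
    /// memoized result that was computed from the old one.
    fn set_source_text(&mut self, name: &str, text: &str) {
        self.source_text.insert(name.to_string(), text.to_string());
        self.shouted_memo.remove(name);
    }

    /// Derived query: the upper-cased source text, returned as a clone of
    /// the stored result.
    fn shouted(&mut self, name: &str) -> String {
        if let Some(hit) = self.shouted_memo.get(name) {
            return hit.clone();
        }
        self.computations += 1;
        let result = self
            .source_text
            .get(name)
            .map(|s| s.to_uppercase())
            .unwrap_or_default();
        self.shouted_memo.insert(name.to_string(), result.clone());
        result
    }
}

fn main() {
    let mut db = Db::default();
    db.set_source_text("foo.rs", "fn main() {}");

    // Two invocations return two clones, but the computation runs once.
    let first = db.shouted("foo.rs");
    let second = db.shouted("foo.rs");
    assert_eq!(first, second);
    assert_eq!(db.computations, 1);

    // Setting the input again forces one more computation.
    db.set_source_text("foo.rs", "fn other() {}");
    assert_eq!(db.shouted("foo.rs"), "FN OTHER() {}");
    assert_eq!(db.computations, 2);
}
```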
| 102 | + |
| 103 | +### Query Groups |
| 104 | + |
| 105 | +A query group is a set of queries which have been defined together as a unit. |
| 106 | +The database is formed by combining query groups. Query groups are akin to |
| 107 | +"Salsa modules" [^EN2]. |
| 108 | + |
| 109 | +The queries in a query group are just the methods of a trait. |
| 110 | + |
| 111 | +To create a query group, a trait annotated with a specific attribute |
| 112 | +(`#[salsa::query_group(...)]`) has to be created. |
| 113 | + |
| 114 | +An argument must also be provided to said attribute as it will be used by Salsa |
| 115 | +to create a struct to be used later when the database is created. |
| 116 | + |
| 117 | +Example input query group: |
| 118 | + |
| 119 | +```rust,ignore |
| 120 | +/// This attribute will process this trait, produce this trait as output, and produce |
| 121 | +/// a bunch of intermediate stuff that Salsa also uses. One of these things is a |
| 122 | +/// "StorageStruct", whose name we have specified in the attribute. |
| 123 | +/// |
| 124 | +/// This query group is a bunch of **input** queries, that do not rely on any |
| 125 | +/// derived input. |
| 126 | +#[salsa::query_group(InputsStorage)] |
| 127 | +pub trait Inputs { |
| 128 | +    /// This attribute (`#[salsa::input]`) indicates that this query is a base |
| 129 | +    /// input, therefore `set_manifest` is going to be auto-generated. |
| 130 | +    #[salsa::input] |
| 131 | +    fn manifest(&self) -> Manifest; |
| 132 | + |
| 133 | +    #[salsa::input] |
| 134 | +    fn source_text(&self, name: String) -> String; |
| 135 | +} |
| 136 | +``` |
| 137 | + |
| 138 | +To create a **derived** query group, one must specify which other query groups |
| 139 | +this one depends on by specifying them as supertraits, as seen in the following |
| 140 | +example: |
| 141 | + |
| 142 | +```rust,ignore |
| 143 | +/// This query group is going to contain queries that depend on derived values. A |
| 144 | +/// query group can access another query group's queries by specifying the |
| 145 | +/// dependency as a supertrait. Query groups can be stacked as much as needed using |
| 146 | +/// that pattern. |
| 147 | +#[salsa::query_group(ParserStorage)] |
| 148 | +pub trait Parser: Inputs { |
| 149 | +    /// This query `ast` is not an input query; it's a derived query, which means |
| 150 | +    /// that a definition is necessary. |
| 151 | +    fn ast(&self, name: String) -> String; |
| 152 | +} |
| 153 | +``` |
| 154 | + |
| 155 | +When creating a derived query, the implementation of said query must be defined |
| 156 | +outside the trait. The definition must take a database parameter as an `impl |
| 157 | +Trait` (or `dyn Trait`), where `Trait` is the query group that the definition |
| 158 | +belongs to, in addition to the other keys. |
| 159 | + |
| 160 | +```rust,ignore |
| 161 | +/// This is going to be the definition of the `ast` query in the `Parser` trait. |
| 162 | +/// So, when the query `ast` is invoked, and it needs to be recomputed, Salsa is |
| 163 | +/// going to call this function and give it the database as `impl Parser`. |
| 164 | +/// The function doesn't need to be aware of all the queries of all the query groups. |
| 165 | +fn ast(db: &impl Parser, name: String) -> String { |
| 166 | +    // Note: `impl Parser` is used here, but `dyn Parser` works just as well. |
| 167 | +    /* code */ |
| 168 | +    // By passing an `impl Parser`, this call to an `Inputs` query is allowed. |
| 169 | +    let source_text = db.source_text(name); |
| 170 | +    /* do the actual parsing */ |
| 171 | +    return ast; |
| 172 | +} |
| 173 | +``` |
| 174 | + |
| 175 | +Eventually, after all the query groups have been defined, the database can be |
| 176 | +created by declaring a struct. |
| 177 | + |
| 178 | +To specify which query groups are going to be part of the database, an attribute |
| 179 | +(`#[salsa::database(...)]`) must be added. The argument of said attribute is a |
| 180 | +list of identifiers, specifying the query groups' **storages**. |
| 181 | + |
| 182 | +```rust,ignore |
| 183 | +/// This attribute specifies which query groups are going to be in the database |
| 184 | +#[salsa::database(InputsStorage, ParserStorage)] |
| 185 | +#[derive(Default)] // optional! |
| 186 | +struct MyDatabase { |
| 187 | +    /// You also need this one field |
| 188 | +    runtime: salsa::Runtime<MyDatabase>, |
| 189 | +} |
| 190 | +/// And this trait has to be implemented |
| 191 | +impl salsa::Database for MyDatabase { |
| 192 | +    fn salsa_runtime(&self) -> &salsa::Runtime<MyDatabase> { |
| 193 | +        &self.runtime |
| 194 | +    } |
| 195 | +} |
| 196 | +``` |
| 197 | + |
| 198 | +Example usage: |
| 199 | + |
| 200 | +```rust,ignore |
| 201 | +fn main() { |
| 202 | +    let mut db = MyDatabase::default(); |
| 203 | +    db.set_manifest(...); |
| 204 | +    db.set_source_text(...); |
| 205 | +    loop { |
| 206 | +        db.ast(...); // will reuse results |
| 207 | +        db.set_source_text(...); |
| 208 | +    } |
| 209 | +} |
| 210 | +``` |
| 211 | + |
| 212 | +[^EN1]: "They are not something that you **inaudible** but something that you kinda get **inaudible** from the outside" [3:23](https://youtu.be/_muY4HjSqVw?t=203). |
| 213 | + |
| 214 | +[^EN2]: What is a Salsa module? |