@@ -185,37 +185,36 @@ A regular expression program is essentially a sequence of opcodes produced by
 the compiler plus various facts about the regular expression (such as whether
 it is anchored, its capture names, etc.).

-### The regex! macro (or why `regex::internal` exists)
-
-The `regex!` macro is defined in the `regex_macros` crate as a compiler plugin,
-which is maintained in this repository. The `regex!` macro compiles a regular
-expression at compile time into specialized Rust code.
-
-The `regex!` macro was written when this library was first conceived and
-unfortunately hasn't changed much since then. In particular, it encodes the
-entire Pike VM into stack allocated space (no heap allocation is done). When
-`regex!` was first written, this provided a substantial speed boost over
-so-called "dynamic" regexes compiled at runtime, and in particular had much
-lower overhead per match. This was because the only matching engine at the
-time was the Pike VM. The addition of other matching engines has inverted
-the relationship; the `regex!` macro is almost never faster than the dynamic
-variant. (In fact, it is typically substantially slower.)
-
-In order to build the `regex!` macro this way, it must have access to some
-internals of the regex library, which is in a distinct crate. (Compiler plugins
-must be part of a distinct crate.) Namely, it must be able to compile a regular
-expression and access its opcodes. The necessary internals are exported as part
-of the top-level `internal` module in the regex library, but is hidden from
-public documentation. In order to present a uniform API between programs build
-by the `regex!` macro and their dynamic analoges, the `Regex` type is an enum
-whose variants are hidden from public documentation.
-
-In the future, the `regex!` macro should probably work more like Ragel, but
-it's not clear how hard this is. In particular, the `regex!` macro should be
-able to support all the features of dynamic regexes, which may be hard to do
-with a Ragel-style implementation approach. (Which somewhat suggests that the
-`regex!` macro may also need to grow conditional execution logic like the
-dynamic variants, which seems rather grotesque.)
+### The regex! macro
+
+The `regex!` macro no longer exists. It was developed in a bygone era as a
+compiler plugin during the infancy of the regex crate. Back then, the only
+matching engine in the crate was the Pike VM. The `regex!` macro was, itself,
+also a Pike VM. The only advantages it offered over the dynamic Pike VM that
+was built at runtime were the following:
+
+1. Syntax checking was done at compile time. Your Rust program wouldn't
+   compile if your regex didn't compile.
+2. Reduction of overhead that was proportional to the size of the regex.
+   For the most part, this overhead consisted of heap allocation, which
+   was nearly eliminated in the compiler plugin.
+
+The main takeaway here is that the compiler plugin was a marginally faster
+version of a slow regex engine. As the regex crate evolved, it grew other regex
+engines (DFA, bounded backtracker) and sophisticated literal optimizations.
+The regex macro didn't keep pace, and it therefore became (dramatically) slower
+than the dynamic engines. The only reason left to use it was for the compile
+time guarantee that your regex is correct. Fortunately, Clippy (the Rust lint
+tool) has a lint that checks your regular expression validity, which mostly
+replaces that use case.
+
+Additionally, the regex compiler plugin stopped receiving maintenance. Nobody
+complained. At that point, it seemed prudent to just remove it.
+
+Will a compiler plugin be brought back? The future is murky, but there is
+definitely an opportunity there to build something that is faster than the
+dynamic engines in some cases. But it will be challenging! As of now, there
+are no plans to work on this.


 ## Testing
@@ -236,7 +235,6 @@ the AT&T test suite) and code generate tests for each matching engine. The
 approach we use in this library is to create a Cargo.toml entry point for each
 matching engine we want to test. The entry points are:

-* `tests/test_plugin.rs` - tests the `regex!` macro
 * `tests/test_default.rs` - tests `Regex::new`
 * `tests/test_default_bytes.rs` - tests `bytes::Regex::new`
 * `tests/test_nfa.rs` - tests `Regex::new`, forced to use the NFA
@@ -261,10 +259,6 @@ entry points, it can take a while to compile everything. To reduce compile
 times slightly, try using `cargo test --test default`, which will only use the
 `tests/test_default.rs` entry point.

-N.B. To run tests for the `regex!` macro, use:
-
-    cargo test --manifest-path regex_macros/Cargo.toml
-

 ## Benchmarking

@@ -284,7 +278,6 @@ separately from the main regex crate.
 Benchmarking follows a similarly wonky setup as tests. There are multiple entry
 points:

-* `bench_rust_plugin.rs` - benchmarks the `regex!` macro
 * `bench_rust.rs` - benchmarks `Regex::new`
 * `bench_rust_bytes.rs` - benchmarks `bytes::Regex::new`
 * `bench_pcre.rs` - benchmarks PCRE