Documentation/Evolution/RegexTypeOverview.md
+5 -5 (5 additions & 5 deletions)
@@ -231,7 +231,7 @@ The result builder allows for inline failable value construction, which particip

 Swift regexes describe an unambiguous algorithm, where choice is ordered and effects can be reliably observed. For example, a `print()` statement inside the `TryCapture`'s transform function will run whenever the overall algorithm naturally dictates an attempt should be made. Optimizations can only elide such calls if they can prove it is behavior-preserving (e.g. "pure").

-`CustomMatchingRegexComponent`, discussed in [String Processing Algorithms][pitches], allows industrial-strength parsers to be used as regex components. This allows us to drop the overly-permissive pre-parsing step:
+`CustomPrefixMatchRegexComponent`, discussed in [String Processing Algorithms][pitches], allows industrial-strength parsers to be used as regex components. This allows us to drop the overly-permissive pre-parsing step:

 ```swift
 func processEntry(_ line: String) -> Transaction? {
@@ -431,7 +431,7 @@ Regular expressions have a deservedly mixed reputation, owing to their historica

 * "Regular expressions are bad because you should use a real parser"
   - In other systems, you're either in or you're out, leading to a gravitational pull to stay in when... you should get out
-  - Our remedy is interoperability with real parsers via `CustomMatchingRegexComponent`
+  - Our remedy is interoperability with real parsers via `CustomPrefixMatchRegexComponent`
   - Literals with refactoring actions provide an incremental off-ramp from regex syntax to result builders and real parsers
 * "Regular expressions are bad because ugly unmaintainable syntax"
   - We propose literals with source tools support, allowing for better syntax highlighting and analysis
@@ -516,7 +516,7 @@ Regex are compiled into an intermediary representation and fairly simple analysi

 ### Future work: parser combinators

-What we propose here is an incremental step towards better parsing support in Swift using parser-combinator style libraries. The underlying execution engine supports recursive function calls and mechanisms for library extensibility. `CustomMatchingRegexComponent`'s protocol requirement is effectively a [monadic parser](https://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/baastad.pdf), meaning `Regex` provides a regex-flavored combinator-like system.
+What we propose here is an incremental step towards better parsing support in Swift using parser-combinator style libraries. The underlying execution engine supports recursive function calls and mechanisms for library extensibility. `CustomPrefixMatchRegexComponent`'s protocol requirement is effectively a [monadic parser](https://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/baastad.pdf), meaning `Regex` provides a regex-flavored combinator-like system.

 One issue with traditional parser combinator libraries is the compilation barriers between call-site and definition, which result in excessive and overly-cautious backtracking traffic. These can be eliminated through better [compilation techniques](https://core.ac.uk/download/pdf/148008325.pdf). As mentioned above, Swift's support for custom static compilation is still under development.

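To illustrate what "monadic parser" means in the hunk above, here is a minimal sketch. It is not taken from the proposal: `Parser`, `digit`, `take`, and `lengthPrefixed` are hypothetical names. It assumes a parser is a function from the remaining input to an output plus the unconsumed rest. The `flatMap` operation is what makes it monadic, because the next parser can depend on the previous output.

```swift
// Hypothetical sketch, not part of the proposal: a parser consumes a prefix
// of its input and returns a value together with the unconsumed remainder.
struct Parser<Output> {
    let run: (Substring) -> (output: Output, rest: Substring)?

    // Monadic bind: the next parser is chosen based on the previous output.
    func flatMap<T>(_ next: @escaping (Output) -> Parser<T>) -> Parser<T> {
        Parser<T> { input in
            guard let (value, rest) = self.run(input) else { return nil }
            return next(value).run(rest)
        }
    }
}

// Parses a single decimal digit and produces its numeric value.
let digit = Parser<Int> { input in
    guard let first = input.first, let n = first.wholeNumberValue else { return nil }
    return (output: n, rest: input.dropFirst())
}

// Parses exactly `count` characters, failing if the input is too short.
func take(_ count: Int) -> Parser<Substring> {
    Parser { input in
        guard input.count >= count else { return nil }
        return (output: input.prefix(count), rest: input.dropFirst(count))
    }
}

// A length-prefixed field: the digit decides how many characters follow.
let lengthPrefixed = digit.flatMap(take)
```

Running `lengthPrefixed.run("3abcde")` would first parse the `3` and then take three characters. A classical regular expression cannot express this kind of context-sensitive choice, which is why a prefix-matching hook of this shape is more expressive than regex syntax alone.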
@@ -565,9 +565,9 @@ Regexes are often used for tokenization and tokens can be represented with Swift

 ### Future work: baked-in localized processing

-- `CustomMatchingRegexComponent` gives an entry point for localized processors
+- `CustomPrefixMatchRegexComponent` gives an entry point for localized processors
 - Future work includes (sub?)protocols to communicate localization intent
Documentation/Evolution/StringProcessingAlgorithms.md
+22 -21 (22 additions & 21 deletions)
@@ -8,9 +8,9 @@ We propose:

 1. New regex-powered algorithms over strings, bringing the standard library up to parity with scripting languages
 2. Generic `Collection` equivalents of these algorithms in terms of subsequences
-3. `protocol CustomMatchingRegexComponent`, which allows 3rd party libraries to provide their industrial-strength parsers as intermixable components of regexes
+3. `protocol CustomPrefixMatchRegexComponent`, which allows 3rd party libraries to provide their industrial-strength parsers as intermixable components of regexes

-This proposal is part of a larger [regex-powered string processing initiative](https://forums.swift.org/t/declarative-string-processing-overview/52459). Throughout the document, we will reference the still-in-progress [`RegexProtocol`, `Regex`](https://github.com/apple/swift-experimental-string-processing/blob/main/Documentation/Evolution/StronglyTypedCaptures.md), and result builder DSL, but these are in flux and not formally part of this proposal. Further discussion of regex specifics is out of scope of this proposal and better discussed in another thread (see [Pitch and Proposal Status](https://github.com/apple/swift-experimental-string-processing/issues/107) for links to relevant threads).
+This proposal is part of a larger [regex-powered string processing initiative](https://github.com/apple/swift-evolution/blob/main/proposals/0350-regex-type-overview.md); the status of each proposal is tracked [here](https://github.com/apple/swift-experimental-string-processing/blob/main/Documentation/Evolution/ProposalOverview.md). Further discussion of regex specifics is out of scope of this proposal and better discussed in the relevant reviews.

 ## Motivation

@@ -91,18 +91,18 @@ Note: Only a subset of Python's string processing API are included in this table

 ### Complex string processing

-Even with the API additions, more complex string processing quickly becomes unwieldy. Upcoming support for authoring regexes in Swift helps alleviate this somewhat, but string processing in the modern world involves dealing with localization, standards-conforming validation, and other concerns for which a dedicated parser is required.
+Even with the API additions, more complex string processing quickly becomes unwieldy. String processing in the modern world involves dealing with localization, standards-conforming validation, and other concerns for which a dedicated parser is required.

 Consider parsing the date field `"Date: Wed, 16 Feb 2022 23:53:19 GMT"` in an HTTP header as a `Date` type. The naive approach is to search for a substring that looks like a date string (`16 Feb 2022`), and attempt to post-process it as a `Date` with a date parser:
 Parsing a currency string such as `$3,020.85` with regex is also tricky, as it can contain localized and currency symbols in addition to accounting conventions. This is why Foundation provides industrial-strength parsers for localized strings.


-## Proposed solution
+## Proposed solution

 ### Complex string processing

-We propose a `CustomMatchingRegexComponent` protocol which allows types from outside the standard library to participate in regex builders and `RegexComponent` algorithms. This allows types, such as `Date.ParseStrategy` and `FloatingPointFormatStyle.Currency`, to be used directly within a regex:
+We propose a `CustomPrefixMatchRegexComponent` protocol which allows types from outside the standard library to participate in regex builders and `RegexComponent` algorithms. This allows types, such as `Date.ParseStrategy` and `FloatingPointFormatStyle.Currency`, to be used directly within a regex:

 ```swift
 let dateRegex = Regex {
-  capture(dateParser)
+  Capture(dateParser)
 }

 let date: Date = header.firstMatch(of: dateRegex).map(\.result.1)
 let amount: [Decimal] = statement.matches(of: currencyRegex).map(\.result.1)
@@ -167,24 +167,25 @@ We also propose the following regex-powered algorithms as well as their generic
 |`matches(of:)`| Returns a collection containing all matches of the specified `RegexComponent` |


-## Detailed design
+## Detailed design

-### `CustomMatchingRegexComponent`
+### `CustomPrefixMatchRegexComponent`

-`CustomMatchingRegexComponent` inherits from `RegexComponent` and satisfies its sole requirement; Conformers can be used with all of the string algorithms generic over `RegexComponent`.
+`CustomPrefixMatchRegexComponent` inherits from `RegexComponent` and satisfies its sole requirement. Conformers can be used with all of the string algorithms generic over `RegexComponent`.
   /// Process the input string within the specified bounds, beginning at the given index, and return
+  /// the end position (upper bound) of the match and the produced output.
   /// - Parameters:
   ///   - input: The string in which the match is performed.
   ///   - index: An index of `input` at which to begin matching.
   ///   - bounds: The bounds in `input` in which the match is performed.
   /// - Returns: The upper bound where the match terminates and a matched instance, or `nil` if
   ///   there isn't a match.
-  func match(
+  func consuming(
     _ input: String,
     startingAt index: String.Index,
     in bounds: Range<String.Index>
@@ -198,8 +199,8 @@ public protocol CustomMatchingRegexComponent : RegexComponent {
 We use Foundation `FloatingPointFormatStyle<Decimal>.Currency` as an example for protocol conformance. It would implement the `match` function with `Match` being a `Decimal`. It could also add a static function `.localizedCurrency(code:)` as a member of `RegexComponent`, so it can be referred to as `.localizedCurrency(code:)` in the `Regex` result builder:
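
As a rough illustration of the shape of such a conformance, the sketch below uses a hypothetical local stand-in protocol, `PrefixMatcherSketch`, rather than the proposed protocol itself. The return type and its tuple labels are assumptions inferred from the doc comment in the diff ("the upper bound where the match terminates and a matched instance"), and `DigitRun` is an invented conformer that parses a run of ASCII digits instead of a localized currency amount.

```swift
// Hypothetical stand-in for the protocol in this diff. The return type and
// the tuple labels are assumptions based on the doc comment, not the proposal.
protocol PrefixMatcherSketch {
    associatedtype Match
    func consuming(
        _ input: String,
        startingAt index: String.Index,
        in bounds: Range<String.Index>
    ) -> (upperBound: String.Index, match: Match)?
}

// Illustrative conformer: consumes a run of ASCII digits and produces an Int.
struct DigitRun: PrefixMatcherSketch {
    func consuming(
        _ input: String,
        startingAt index: String.Index,
        in bounds: Range<String.Index>
    ) -> (upperBound: String.Index, match: Int)? {
        var end = index
        var value = 0
        while end < bounds.upperBound,
              input[end].isASCII,
              let digit = input[end].wholeNumberValue {
            // Fail the match instead of trapping if the number overflows Int.
            let (scaled, overflowedMul) = value.multipliedReportingOverflow(by: 10)
            let (next, overflowedAdd) = scaled.addingReportingOverflow(digit)
            if overflowedMul || overflowedAdd { return nil }
            value = next
            end = input.index(after: end)
        }
        // No digits consumed means there is no match at this position.
        guard end > index else { return nil }
        return (upperBound: end, match: value)
    }
}

// Usage: match the digits that begin at the first "3" in the text.
let text = "total: 3020 USD"
let start = text.firstIndex(of: "3")!
let result = DigitRun().consuming(text, startingAt: start, in: start..<text.endIndex)
```

The conformer only reports how far it consumed and what value it produced. In this design, deciding where to attempt a match and what to do on failure would be left to the surrounding regex engine.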