Basically, something I often want to do is match nested groups of brackets. For example, given the string `hello ( a ( b ) c ) wo ( rl ) d (`, I want the matches
( a ( b ) c )
and
( rl )
but not the trailing `(`, as it has no matching closing bracket.
PCRE2 is already capable of doing this. If we let `BEGIN` be the start regex (in this case `\(`) and `END` be the end regex (i.e. `\)`), then you can do it with the regex:
BEGIN(?J)(?<name>(?:(?!BEGIN|END).|BEGIN(?&name)END)*)END
(In the above, `name` can be any name that is not a 'free variable' in `BEGIN` and `END`, i.e. the regex can be nested with the same `name`.)
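To make the substitution concrete, here is a minimal sketch that builds the pattern above from a given `BEGIN` and `END`. The helper name `expand_nested` is hypothetical, and wrapping each delimiter in `(?:...)` is my own assumption (so that a top-level `|` inside a delimiter stays contained); it is not part of the template. The function only builds the pattern string, which is meant for PCRE2; Python's built-in `re` module cannot compile it, because it does not support `(?&name)` recursion.

```python
def expand_nested(begin: str, end: str, name: str = "name") -> str:
    """Build the PCRE2 nested-delimiter pattern (hypothetical helper)."""
    # Assumption: wrap each delimiter in a non-capturing group so that an
    # alternation such as "a|b" inside BEGIN or END does not leak out.
    b = f"(?:{begin})"
    e = f"(?:{end})"
    # Same shape as the template in the issue:
    #   BEGIN(?J)(?<name>(?:(?!BEGIN|END).|BEGIN(?&name)END)*)END
    return f"{b}(?J)(?<{name}>(?:(?!{b}|{e}).|{b}(?&{name}){e})*){e}"


if __name__ == "__main__":
    # Print the pattern for round brackets, ready to paste into a PCRE2 tool.
    print(expand_nested(r"\(", r"\)"))
```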
The above is quite long and verbose and took me a while to work out (and I've since forgotten how it works), so it would be nice to have a syntax for this.
For example the syntax could be:
(?/BEGIN/END)
So to match round brackets: `(?/\(/\))`. To match round or curly brackets (e.g. `{ a ( b } )`), you could use `(?/[({]/[)}])`.
Another example, which I frequently use when processing LaTeX files, is matching curly brackets while ignoring those preceded by a backslash:
(?/(?<!\\)[{]/(?<!\\)[}])
The entire string `{ helo \{ world }` will match.
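For reference, the behaviour I am after can be approximated in plain Python with a depth counter instead of regex recursion. This is only a sketch of the intended semantics, not how PCRE2 would implement it: the function name `find_balanced` is made up, it assumes both delimiters match at least one character, and when a piece of text could be both `BEGIN` and `END` it is treated as `BEGIN`.

```python
import re


def find_balanced(text: str, begin: str, end: str) -> list[str]:
    """Return outermost balanced BEGIN...END spans (illustrative sketch)."""
    # One combined pattern so delimiters are found in order of appearance.
    # Assumption: BEGIN and END never match the empty string.
    token = re.compile(rf"(?P<open>{begin})|(?P<close>{end})")
    matches = []
    pos = 0
    while (first := token.search(text, pos)) is not None:
        if first.group("open") is None:
            pos = first.end()  # stray closing delimiter, skip it
            continue
        depth, scan = 1, first.end()
        while depth > 0:
            tok = token.search(text, scan)
            if tok is None:
                break  # ran out of input with delimiters still open
            depth += 1 if tok.group("open") is not None else -1
            scan = tok.end()
        if depth == 0:
            matches.append(text[first.start():scan])
            pos = scan
        else:
            pos = first.end()  # unbalanced opener, retry after it
    return matches


if __name__ == "__main__":
    print(find_balanced("hello ( a ( b ) c ) wo ( rl ) d (", r"\(", r"\)"))
    print(find_balanced("{ a ( b } )", r"[({]", r"[)}]"))
    print(find_balanced(r"{ helo \{ world }", r"(?<!\\)[{]", r"(?<!\\)[}]"))
```

Because `search` is called with a start position rather than on a sliced string, the lookbehind in the LaTeX example can still see the backslash before an escaped bracket.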