Description
Hi there,
I think the TZ (timezone) regexp is flawed. It currently reads TZ (?:[APMCE][SD]T|UTC) and is used in the DATESTAMP_RFC822 and DATESTAMP_OTHER expressions. RFC822 (ARPA Internet Text Messages) also lists values such as GMT and UT as valid identifiers. CET, CEST, STD, ... are all commonly used timezone identifiers that don't show up in the spec, but I think they should be matched nonetheless. TZ could be replaced with something like this:

([ABCDEFGHIJKLMNPSTUVWY][CDEHKMRWZOUAGLVFJNBISPY][DSOARNPWMUCLGVHKT][TA]|[ABCDEFGHIJKLMNPQRSTUVWXY][CDFMNPRSWOTAEGKLUVXJYHBI][TDKCB]|[ABCEHMUW][CZEHKAOLY][WOAHSVDLR][DSMU][T]|BORTST|[ABCDEFGHIKLMNOPQRSTUVWXYZ])

which would match everything in the PHP timezone abbreviations list. Or maybe a less complicated but more generic regexp such as [A-Z]{1,6}?
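To make the gap concrete, here is a minimal Python sketch that checks a few abbreviations against the current TZ pattern and against the generic alternative. The sample list and the helper name `matches` are my own, and `fullmatch` is used only to test each abbreviation in isolation; inside grok the pattern is embedded in a larger expression, so real behaviour may differ.

```python
import re

# Current TZ pattern as quoted in this issue.
CURRENT_TZ = re.compile(r"(?:[APMCE][SD]T|UTC)")
# The generic alternative suggested in this issue.
GENERIC_TZ = re.compile(r"[A-Z]{1,6}")

# Illustrative sample only: US zones, UTC, the RFC822 values GMT/UT,
# the European CET/CEST, and a single-letter military zone.
SAMPLES = ["PST", "EDT", "UTC", "GMT", "UT", "CET", "CEST", "Z"]


def matches(pattern, zone):
    """Return True if the whole abbreviation is matched by the pattern."""
    return pattern.fullmatch(zone) is not None


current_misses = [z for z in SAMPLES if not matches(CURRENT_TZ, z)]
generic_misses = [z for z in SAMPLES if not matches(GENERIC_TZ, z)]

print("missed by current TZ:", current_misses)
print("missed by [A-Z]{1,6}:", generic_misses)
```

The generic pattern accepts any run of one to six capital letters, so it trades precision for coverage; whether that is acceptable probably depends on how much surrounding context the date stamp pattern already provides.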
Also, I'm confused because the RFC822 spec wants a comma after %{DAY}, just like the RFC2822 spec, but this isn't reflected in the regexp.
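For illustration, this is roughly what I would expect an RFC822-style date pattern to accept, written as a plain Python regex rather than grok. It is a simplified sketch under my own assumptions: the sub-patterns `DAY`, `MONTH`, `TIME` and `ZONE` are stand-ins I made up, not the actual grok definitions, and the sample dates are invented. RFC822 defines date-time as `[ day "," ] date time`, so the day name is optional but, when present, is followed by a comma.

```python
import re

# Simplified stand-ins for the grok sub-patterns (assumptions, not the real ones).
DAY = r"(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)"
MONTH = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
TIME = r"\d{2}:\d{2}(?::\d{2})?"
ZONE = r"(?:[A-Z]{1,6}|[+-]\d{4})"

# Optional "Day, " prefix, then day-of-month, month, two-digit year, time, zone.
RFC822_DATE = re.compile(
    rf"(?:(?P<day>{DAY}), )?"
    rf"(?P<monthday>\d{{1,2}}) (?P<month>{MONTH}) (?P<year>\d{{2}}) "
    rf"(?P<time>{TIME}) (?P<zone>{ZONE})"
)

with_day = RFC822_DATE.fullmatch("Mon, 12 Oct 15 14:29:00 GMT")
without_day = RFC822_DATE.fullmatch("12 Oct 15 14:29:00 GMT")
no_comma = RFC822_DATE.fullmatch("Mon 12 Oct 15 14:29:00 GMT")

print(with_day is not None, without_day is not None, no_comma is not None)
```

In this sketch the first two inputs match and the comma-less one does not, which is the behaviour I read out of the spec.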
There is a merge request for CET and CEST from 2015, but I believe the date stamp patterns in general need some work. I know it's easy to work around this with custom patterns, but I think the current rules are confusing to end users.
Thanks for all the good work