User Details
- User Since
- Dec 26 2014, 6:51 PM (465 w, 5 d)
- Availability
- Available
- LDAP User
- Unknown
- MediaWiki User
- Popcorndude
Oct 27 2015
What if backtracking, grouping, and alternatives were all disabled, and each property could have multiple format constraints, and a value would pass if it matched any of them? Most patterns would need rewriting, but only about 18 of them would actually be difficult.
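A minimal sketch of the "multiple format constraints, pass if any matches" idea, in Python. The two patterns and the name `matches_any` are hypothetical and only illustrate how one pattern with alternation could be split into several simple ones; they are not taken from any real property.

```python
import re

# Hypothetical example: one property carrying two simple format constraints
# instead of a single pattern that relies on grouping and alternation.
FORMATS = [r"[A-Z]\d{4}", r"\d{6}"]


def matches_any(value, patterns=FORMATS):
    """Return True if the whole value matches at least one of the patterns."""
    return any(re.fullmatch(p, value) is not None for p in patterns)
```

With these assumed patterns, `matches_any("A1234")` and `matches_any("123456")` pass, while `matches_any("12345")` fails because neither constraint matches the whole value.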
I messed with the constraints a bit, and it would be pretty easy to get up to ~50% with the constraints you outlined (the numbers I gave before may have forgotten to skip newlines, lowering the count). Adding + and * covers 3/4, and most of the rest could be rewritten without too much trouble (other than P1793 and possibly a few others that are basically impossible).
Oct 25 2015
This matches the constraints I suggested:
^(?!.*?\.(\+|\*|\{\d+,\})\()(\\.|[^()\\\[\]]|\[([^\\\[\]]|\\.)*\]|\((?!\?)(\\.|[^()\\]|\[([^\\\[\]]|\\.)*\])*\))+$
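As a rough sketch of how that filter could be applied, the Python snippet below compiles the expression above unchanged and tests a few candidate patterns against it. The function name `passes_filter` and the sample patterns are assumptions made for illustration; Python's `re` module stands in for whatever engine was actually used.

```python
import re

# The filter expression from the comment above, copied verbatim.
FILTER = re.compile(
    r"^(?!.*?\.(\+|\*|\{\d+,\})\()(\\.|[^()\\\[\]]|\[([^\\\[\]]|\\.)*\]"
    r"|\((?!\?)(\\.|[^()\\]|\[([^\\\[\]]|\\.)*\])*\))+$"
)


def passes_filter(pattern):
    """Return True if a format pattern satisfies the filter."""
    return FILTER.match(pattern) is not None


# Hypothetical sample patterns, not taken from real properties.
for candidate in [r"\d{8}", r"(a|b)c", r"(?:a|b)", r"((a)b)", r".*(jpg|png)"]:
    print(candidate, passes_filter(candidate))
```

Under this reading, the filter accepts escapes, character classes, and single-level plain groups, and rejects `(?...)` constructs, nested groups, and a group placed directly after `.+`, `.*`, or `.{n,}`.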
My apologies. I eliminated those in my initial analysis and forgot to mention it. The full list of things with backslashes in front of them:
bdDpsSwx2()[]{}|^\/$?+*,-.
Those criteria accept 62 (8%) of the current constraints.
Adding character classes (\d is everywhere) brings it up to 166 (23%)
Oct 24 2015
matches 624 of the ok ones, and should only match ok ones, though some will fail.
I'm letting things like 0?\d{8} through the filter, and most of what's left is checking file extensions. I can make them not backtrack at all if commons filenames don't contain periods (I don't know what characters are allowed). They are generally of the form .*\.(<list of extensions>)
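A small sketch of the trade-off for the file extension patterns, in Python. The extension list and the names `GENERAL`, `NO_EXTRA_PERIODS`, and `check` are hypothetical. Replacing `.*` with `[^.]*` leaves the engine only one place to find the period, but it also rejects any filename that contains a period before the extension, which is why the question about allowed characters matters.

```python
import re

# Hypothetical extension list, for illustration only.
GENERAL = re.compile(r".*\.(jpg|png|svg)")               # may backtrack over periods
NO_EXTRA_PERIODS = re.compile(r"[^.]*\.(jpg|png|svg)")   # assumes no other periods


def check(name):
    """Return (general_result, no_extra_periods_result) for a filename."""
    return (
        GENERAL.fullmatch(name) is not None,
        NO_EXTRA_PERIODS.fullmatch(name) is not None,
    )
```

Here `check("photo.jpg")` passes both forms, while `check("foo.bar.jpg")` passes only the general form.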
Of those 6 properties, 2 have "optionally the same character twice", 2 have "does not start with", and the other 2 are actually non-capturing groups that I misidentified as lookarounds.
I did some analysis of what regex features are actually used: https://www.wikidata.org/wiki/User:Popcorndude/formats
Jul 29 2015
Maybe just load the first ~20 statements with a "Load More" button?