Description
We should decide on what characters are valid in names. Currently there are 3 definitions:
The spec says [\p{L}\p{N}:$]+ (I guess \p{L} should be \p{Alpha} and \p{N} should be \p{Digit}, but I'm not absolutely sure about that)
The parser accepts [a-zA-Z][a-zA-Z0-9:]* (no letters except a-zA-Z, has to start with a letter, no $)
Some definitions use [\p{Alpha}][\p{Alpha}\p{Digit}_:]* (same as above, but with all letters, and _ is also allowed)
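To make the differences concrete, here is a rough Python sketch of the three definitions. It is only an illustration under assumptions, not the actual parser or spec code: Python's standard `re` module has no `\p{...}` support, so `\p{L}` and `\p{N}` are emulated with Unicode general categories, and `\p{Alpha}` and `\p{Digit}` are read here as "any Unicode letter" and "any decimal digit" (the engine actually in use may define them differently). All function names are made up for the example.

```python
import re
import unicodedata


def is_letter(ch: str) -> bool:
    # Emulates \p{L}: any Unicode general category starting with "L".
    return unicodedata.category(ch).startswith("L")


def is_number(ch: str) -> bool:
    # Emulates \p{N}: any Unicode general category starting with "N".
    return unicodedata.category(ch).startswith("N")


def is_decimal_digit(ch: str) -> bool:
    # Assumed reading of \p{Digit}: Unicode decimal digits (category Nd).
    return unicodedata.category(ch) == "Nd"


def spec_accepts(name: str) -> bool:
    # Spec: [\p{L}\p{N}:$]+
    return bool(name) and all(
        is_letter(ch) or is_number(ch) or ch in ":$" for ch in name
    )


PARSER_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z0-9:]*")


def parser_accepts(name: str) -> bool:
    # Parser: [a-zA-Z][a-zA-Z0-9:]*
    return PARSER_PATTERN.fullmatch(name) is not None


def definitions_accept(name: str) -> bool:
    # Some definitions: [\p{Alpha}][\p{Alpha}\p{Digit}_:]*
    # (\p{Alpha} is assumed to mean "any Unicode letter" here.)
    if not name or not is_letter(name[0]):
        return False
    return all(
        is_letter(ch) or is_decimal_digit(ch) or ch in "_:" for ch in name[1:]
    )


if __name__ == "__main__":
    for sample in ["foo", "123", "$x", "a_b", "größe"]:
        print(
            sample,
            spec_accepts(sample),
            parser_accepts(sample),
            definitions_accept(sample),
        )
```

Under these readings, `123` and `$x` are accepted only by the spec pattern, `a_b` only by the third pattern, and `größe` by everything except the parser.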
I'd like to stay with the specced version, as it gives a lot of freedom for element names (especially numeric-only names, and names starting with _ or $, which aren't uncommon in programming languages). We should then be certain that we want to allow non-ASCII characters, though. That is probably good: it supports non-English languages. It might cause some compatibility issues, but we usually ignore those.
The question has also been raised whether more 'special characters' should be allowed. As stated on the mailing list, I think we could allow any of the following:
!#%&*+,/;?@^~
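As a quick way to try out what such an extension would accept, the following Python sketch adds the characters listed above to the spec's character set. It is a hypothetical check, not a proposal for the actual grammar: `\p{L}` and `\p{N}` are again emulated with Unicode general categories, the name `extended_accepts` is made up, and the sketch does not examine whether any of these characters clash with other syntax.

```python
import unicodedata

# Characters the spec pattern [\p{L}\p{N}:$]+ already allows besides letters and numbers.
BASE_SPECIALS = set(":$")

# Additional special characters proposed in this issue.
EXTRA_SPECIALS = set("!#%&*+,/;?@^~")


def extended_accepts(name: str) -> bool:
    # Non-empty, and every character is a Unicode letter (L*), a Unicode
    # number (N*), or one of the allowed special characters.
    return bool(name) and all(
        unicodedata.category(ch)[0] in ("L", "N")
        or ch in BASE_SPECIALS
        or ch in EXTRA_SPECIALS
        for ch in name
    )


if __name__ == "__main__":
    for sample in ["a+b", "what?", "is-ok", "a b"]:
        print(repr(sample), extended_accepts(sample))
```

Names such as `a+b` or `what?` would pass this check, while `is-ok` and `a b` would not, because the hyphen and the space are not in the proposed list.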