6ca8acd
spec: grammar ABNF
jkowalleck Aug 6, 2025
a405b2b
fix ABNF for `.` and `/` exceptions
jkowalleck Aug 6, 2025
1a31031
fix ABNF for `.` inclusion
jkowalleck Aug 6, 2025
3ccbab1
fix dot in subpath, like `...foo
jkowalleck Aug 7, 2025
92f0b32
fix qualifier-key
jkowalleck Aug 7, 2025
18f4ca3
fix
jkowalleck Aug 7, 2025
4f2a434
format
jkowalleck Aug 7, 2025
c678791
fix subpath
jkowalleck Aug 7, 2025
ccecacb
fix subpath
jkowalleck Aug 7, 2025
8a3a840
reformat
jkowalleck Aug 7, 2025
f1429f4
reformat
jkowalleck Aug 7, 2025
e8f95a7
reformat
jkowalleck Aug 7, 2025
47b5940
moved grammar to own file
jkowalleck Aug 8, 2025
012c1d8
fix: percent encoding spec applied to ABNF
jkowalleck Aug 12, 2025
9751879
fix: percent encoding spec applied to ABNF
jkowalleck Aug 12, 2025
8155567
layoput
jkowalleck Aug 12, 2025
b3c8bb3
docs
jkowalleck Aug 12, 2025
187f2b2
fix
jkowalleck Aug 12, 2025
e822787
fix dot being not encoded nowhere never-ever
jkowalleck Aug 12, 2025
eeb3688
style
jkowalleck Aug 12, 2025
83b5727
style
jkowalleck Aug 12, 2025
20f93c8
style
jkowalleck Aug 12, 2025
097da44
style
jkowalleck Aug 12, 2025
7598bfb
style
jkowalleck Aug 12, 2025
5ea4232
style
jkowalleck Aug 12, 2025
294be92
Merge branch 'main' into spec/grammar-ABNF
jkowalleck Aug 14, 2025
9f4aea6
add to TOC
jkowalleck Aug 14, 2025
424fcc8
fix typos
jkowalleck Aug 14, 2025
9cc7a42
duplicates as keywords
jkowalleck Aug 14, 2025
ab2c46c
typo and style
jkowalleck Sep 22, 2025
affade7
fix: non-canonical namespace
jkowalleck Oct 1, 2025
c8b72e9
fix: non-canonical qualifiers
jkowalleck Oct 1, 2025
d54d323
fix: non-canonical subpath
jkowalleck Oct 1, 2025
305cf6e
Merge branch 'main' into spec/grammar-ABNF
mjherzog Oct 17, 2025
0748fb5
Merge remote-tracking branch 'upstream/main' into spec/grammar-ABNF
jkowalleck Dec 23, 2025
8765ae4
docs
jkowalleck Dec 23, 2025
bf00df3
docs
jkowalleck Dec 23, 2025
eb98ca7
style
jkowalleck Dec 23, 2025
a3a12e2
make grammar part of standard
jkowalleck Dec 23, 2025
4c27377
docs
jkowalleck Dec 23, 2025
d57c009
style: max 80 characters per line
jkowalleck Dec 23, 2025
3ce66b3
docs
jkowalleck Dec 23, 2025
82a3ce0
style
jkowalleck Dec 23, 2025
8102c22
docs
jkowalleck Dec 23, 2025
c00f4df
style
jkowalleck Dec 23, 2025
1d2fc09
revisit according to spec
jkowalleck Dec 23, 2025
1b1a6c7
style
jkowalleck Dec 23, 2025
9a1e736
docs
jkowalleck Dec 23, 2025
e44bffb
Apply suggestion from @ppkarwasz
jkowalleck Dec 24, 2025
4e5b907
style
jkowalleck Dec 24, 2025
106 changes: 106 additions & 0 deletions docs/standard/grammar.md
@@ -0,0 +1,106 @@
# Package-URL Grammar

A PURL string adheres to the following grammar, written in the syntax defined by
[RFC5234: Augmented BNF for Syntax Specifications: ABNF](https://datatracker.ietf.org/doc/html/rfc5234).

```abnf
purl = scheme ":" *"/" type
       [ 1*"/" namespace ] 1*"/" name *"/"
       [ "@" version ] [ "?" qualifiers ]
       [ "#" *"/" subpath *"/" ]
       ; leading/trailing slashes allowed here and there
purl-canonical = scheme ":" type-canonical
                 [ "/" namespace-canonical ] "/" name
                 [ "@" version ] [ "?" qualifiers-canonical ]
                 [ "#" subpath-canonical ]


scheme = %x70.6B.67 ; lowercase string "pkg"
; Review comment by @jkowalleck (Member, Author), Dec 16, 2025:
; If you don't like this original ABNF case-sensitive notation,
; we could use RFC 7405's case-sensitive notation instead.
; Suggested change:
;   - scheme = %x70.6B.67 ; lowercase string "pkg"
;   + scheme = %s"pkg"


type = ALPHA *( ALPHA / DIGIT / "." / "-" )
type-canonical = LOWALPHA *( LOWALPHA / DIGIT / "." / "-" )

namespace = namespace-segment *( 1*"/" namespace-segment )
namespace-canonical = namespace-segment *( "/" namespace-segment )
namespace-segment = 1*namespace-sc
namespace-sc = PERM-ALPHANUM
             / PERM-PUNCTUATION
             / "%" ( PERM-ESCAPED-00-1F
                   / PERM-ESCAPED-20-2C
                   ; except general exclusion "-" (2D)
                   ; except general exclusion "." (2E)
                   ; except the separator "/" (2F)
                   / PERM-ESCAPED-30-FF )
             ; namespace safe characters

name = 1*PCT-ENCODED

version = 1*PCT-ENCODED

qualifiers = qualifier *( "&" qualifier )
qualifiers-canonical = qualifier-canonical *( "&" qualifier-canonical )
qualifier = qualifier-key "=" [ qualifier-value ]
qualifier-canonical = qualifier-key-canonical "=" qualifier-value
qualifier-key = ALPHA *( ALPHA / DIGIT / "." / "-" / "_" )
qualifier-key-canonical = LOWALPHA *( LOWALPHA / DIGIT / "." / "-" / "_" )
qualifier-value = 1*PCT-ENCODED

subpath = subpath-segment
          *( 1*"/" subpath-segment )
        / 0<subpath-sc> ; empty
subpath-canonical = subpath-segment-canonical
                    *( "/" subpath-segment-canonical )
                  / 0<subpath-sc> ; empty
subpath-segment = 1*( subpath-sc / "." / "%2E" )
subpath-segment-canonical = [ "." ] subpath-sc *( subpath-sc / "." )
                            ; prevent "." and ".." standalone
                          / "." "." 1*( subpath-sc / "." )
                            ; prevent ".." standalone
subpath-sc = PERM-ALPHANUM
           / "-" / "_" / "~" ; PERM-PUNCTUATION except "."
           / "%" ( PERM-ESCAPED-00-1F
                 / PERM-ESCAPED-20-2C
                 ; except general exclusion "-" (2D)
                 ; except the special char "." (2E)
                 ; except the separator "/" (2F)
                 / PERM-ESCAPED-30-FF )
           ; subpath safe characters


LOWALPHA = %x61-7A ; a-z

PCT-ENCODED = PERM-ALPHANUM
            / PERM-PUNCTUATION
            / ":" ; a specific separator that must not be encoded
            / PERM-ESCAPED

; permitted character classes
PERM-ALPHANUM = ALPHA / DIGIT
PERM-PUNCTUATION = "." / "-" / "_" / "~"
PERM-SEPARATOR = ":" / "/" / "@" / "?" / "=" / "&" / "#"
PERM-ESCAPED = "%" ( PERM-ESCAPED-00-1F
                   / PERM-ESCAPED-20-2C
                   / PERM-ESCAPED-2D-2F
                   / PERM-ESCAPED-30-FF )

; applied purl spec rules for general character encoding
PERM-ESCAPED-00-1F = %x30-31 HEXDIG ; 00-1F
PERM-ESCAPED-20-2C = %x32 ( DIGIT / "A" / "B" / "C" ) ; 20-2C
PERM-ESCAPED-2D-2F = ; except following characters: "-" (2D)
                     ; except following characters: "." (2E)
                     %x32 "F" ; 2F
PERM-ESCAPED-30-FF = ; except following characters: "0"-"9" (30-39)
                     ; except following characters: ":" (3A)
                     %x33 ( "B" / "C" / "D" / "E" / "F" ) ; 3B-3F
                   / %x34 %x30 ; 40
                     ; except following characters: "A"-"Z" (41-5A)
                   / %x35 ( "B" / "C" / "D" / "E" ) ; 5B-5E
                     ; except following characters: "_" (5F)
                   / %x36 %x30 ; 60
                     ; except following characters: "a"-"z" (61-7A)
                   / %x37 ( "B" / "C" / "D" ) ; 7B-7D
                     ; except following characters: "~" (7E)
                   / %x37 "F" ; 7F
                   / %x38-39 HEXDIG ; 80-9F
                   / ( "A" / "B" / "C" / "D" / "E" / "F" ) HEXDIG ; A0-FF
```
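
To make a few of the rules above more concrete, here is a minimal Python sketch that hand-translates `type`, `type-canonical`, `qualifier-key`, and `PERM-ESCAPED` into regular expressions and a small helper. It is not part of the specification and not a full PURL parser: the names `TYPE`, `TYPE_CANONICAL`, `QUALIFIER_KEY`, and `is_perm_escaped` are illustrative assumptions. It also assumes that hex digits may be matched in either case, since quoted ABNF strings are case-insensitive under RFC 5234.

```python
import re

# Hand-written regex translations of three token rules from the grammar.
# These names are illustrative only; the spec does not define them.
TYPE = re.compile(r"[A-Za-z][A-Za-z0-9.\-]*")            # type
TYPE_CANONICAL = re.compile(r"[a-z][a-z0-9.\-]*")        # type-canonical
QUALIFIER_KEY = re.compile(r"[A-Za-z][A-Za-z0-9.\-_]*")  # qualifier-key

# Byte values that the PERM-ESCAPED-* rules leave out: alphanumerics,
# ".", "-", "_", "~" and ":" must appear literally, never as %XX.
_NEVER_ESCAPED = set(
    b"abcdefghijklmnopqrstuvwxyz"
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"0123456789.-_~:"
)


def is_perm_escaped(token: str) -> bool:
    """Return True if token is a %XX triplet that PERM-ESCAPED would accept."""
    if not re.fullmatch(r"%[0-9A-Fa-f]{2}", token):
        return False
    return int(token[1:], 16) not in _NEVER_ESCAPED


if __name__ == "__main__":
    print(bool(TYPE.fullmatch("Maven")))            # mixed case is a valid type
    print(bool(TYPE_CANONICAL.fullmatch("Maven")))  # but it is not canonical
    print(is_perm_escaped("%40"))                   # "@" may be escaped
    print(is_perm_escaped("%2E"))                   # "." must stay literal
```

Using `fullmatch` rather than `match` keeps each check anchored to the whole token, which mirrors how an ABNF rule has to consume its entire input.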