draft ark uri scheme 00.xml
John A. Kunze edited this page Feb 18, 2026
<title>The ARK URI scheme</title>
17beta
ksenia@17beta.top
General
Internet Engineering Task Force
ark
archival resource key
URI scheme
This specification defines the Archival Resource Key (ARK) URI scheme that is especially suitable for persistent identifiers.
Persistent identifiers for the latest version of this document: .
Introduction
The ARK (Archival Resource Key) identifier scheme is flexible, dereferenceable and especially suitable for persistent identifiers. A founding principle of the design of the ARK scheme is that persistence is a matter of service and is not conferred by any particular identifier scheme; ARK is designed to ease the task of achieving persistence. This document specifies the technical details of the ARK system as a URI and IRI scheme and does not elaborate at length on the design rationale of the ARK system; for that see .
Conventions
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in .
The terms “identifier”, “resource”, “representation”, “information resource” and “non-information resource” are used as described in . For conciseness we use the term “referent” to mean the resource identified by an identifier. Note that identifiers are strings of characters, representations are strings of octets paired with an interpretation and resources are abstract concepts like the book “Alice’s Adventures in Wonderland” by Lewis Carroll or Zermelo-Fraenkel set theory.
The notation used to describe syntax is that described in extended as follows: A literal preceded by “~” matches any string that is equivalent when corresponding uppercase and lowercase codepoints in the range U+0000 to U+007F are taken as equivalent. The syntax is augmented with set difference, indicated by the operator “-”, whose precedence is between Alternative and Concatenation. All the syntactic terms defined in are referenced here.
Concepts
“ARK” stands for Archival Resource Key. The URI scheme defined in this document is named “ARK scheme”. Every identifier that uses this scheme is called an “ARK”. The ARK scheme is designed to ease the creation and maintenance of persistent and dereferenceable resource identifiers. An ARK may be used to identify either an information resource or a non-information resource. There are three forms of ARKs.
Status: permanent
Applications/protocols that use this scheme name: Existing ARK resolvers including the central resolver . Existing NAAs registered in .
Contact: Mario Xerxes Castelán Castro (Ksenia) regarding this specification; The ARK Maintenance Agency regarding the ARK system in general.
Change controller: ARK Maintenance Agency .
References: This document.
References
- Augmented BNF for Syntax Specifications
- ARK Maintenance Agency web site
- Cool URIs for the Semantic Web
- HTTP Strict Transport Security (HSTS)
- Internationalized Resource Identifiers (IRIs)
- The ARK Identifier Scheme
- Key words for use in RFCs to Indicate Requirement Levels
- Name Assigning Authority Number (NAAN) Registry
- Special-Use Domain Names
- Unicode Technical Report #36: Unicode Security Considerations, revision 15
- Uniform Resource Identifier (URI): Generic Syntax
- Architecture of the World Wide Web, Volume One, W3C
- Basic ARKs use the “ark” URI scheme and MUST NOT have a URI query or a URI fragment.
- Extended ARKs differ from Basic ARKs only in that URI queries and fragments are allowed; every Basic ARK is an Extended ARK.
- Embedded ARKs may use any URI scheme and can be used to make ARKs easily dereferenceable by users.
- ark:12345/ax20315
- ark:12345/ax20315/edition1
- ark:12345/ax20315/edition1/chapter5
- ark:12345/ax20315.en could indicate a version in the English language.
- ark:12345/ax20315.svg could indicate a Scalable Vector Graphics (SVG) version.
- ark:12345/ax20315.en.svg and ark:12345/ax20315.svg.en could both indicate a Scalable Vector Graphics (SVG) version in the English language.
- A publisher of books could be an NAA that assigns an ARK to each of its books and simultaneously be an NMA that operates an ARK resolver for the ARKs it assigned. Suppose that one book of this publisher were censored within its country; the publisher would then discontinue dereference service for the ARK of that book. An NMA operating in a different country (say, a memory institution) could provide service for that ARK.
- A space agency could assign an ARK to a composite high-resolution visible image of the surface of Earth. An independent organization could be an NMA for that ARK and offer access to that image in different media types, different resolutions, and different formats.
- Convert the scheme to lowercase.
- If the ARK starts with “ark:/” then replace that portion with “ark:”.
- Transform the substring other than the initial “ark:”, the query (including the question mark) and the fragment (including the hash symbol) as follows:
- Decode all percent-encoded characters that after decoding would match the ARK-unreserved production rule.
- Delete all instances of “-” (U+002D).
- Percent-encode all non-ASCII characters.
- If there is a VariantPath then separate it into individual matches of suffix, sort them in lexicographical order by codepoint without decoding any remaining percent-encoded characters, delete duplicate suffixes and join the remaining suffixes in that order; substitute this result for the original VariantPath.
- In the query and fragment: Decode all percent-encoded characters that match the unreserved rule in . Percent-encode all non-ASCII characters.
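The normalization steps above can be illustrated with the following non-normative Python sketch. It rests on several assumptions that are not taken from this document: the ARK-unreserved production is approximated by the unreserved set of the generic URI syntax (letters, digits, “-”, “.”, “_”, “~”), the same set is used for the query and fragment, and the VariantPath is taken to be the run of “.”-prefixed suffixes on the last path segment. The function and helper names are hypothetical.

```python
import re
from urllib.parse import quote

# Assumption: ARK-unreserved is approximated by ALPHA / DIGIT / "-" / "." / "_" / "~".
ARK_UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)
PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def _decode_unreserved(text):
    """Decode only those percent-escapes whose decoded character is unreserved."""
    def repl(match):
        ch = chr(int(match.group(1), 16))
        return ch if ch in ARK_UNRESERVED else match.group(0)
    return PCT.sub(repl, text)


def _encode_non_ascii(text):
    """Percent-encode (as UTF-8) every character outside the ASCII range."""
    return "".join(ch if ord(ch) < 128 else quote(ch, safe="") for ch in text)


def normalize_ark(ark):
    """Normalize a Basic or Extended ARK following the steps listed above."""
    rest, hash_mark, fragment = ark.partition("#")
    rest, question_mark, query = rest.partition("?")
    scheme, colon, body = rest.partition(":")

    scheme = scheme.lower()                      # lowercase the scheme
    if scheme == "ark" and body.startswith("/"):
        body = body[1:]                          # "ark:/" becomes "ark:"

    body = _decode_unreserved(body)
    body = body.replace("-", "")                 # delete every U+002D
    body = _encode_non_ascii(body)

    # Assumption: the VariantPath is the ".suffix" run on the last path segment.
    head, slash, last = body.rpartition("/")
    base, dot, variants = last.partition(".")
    if dot:
        suffixes = sorted({s for s in variants.split(".") if s})
        last = base + "".join("." + s for s in suffixes)
    body = head + slash + last

    query = _encode_non_ascii(_decode_unreserved(query))
    fragment = _encode_non_ascii(_decode_unreserved(fragment))
    return scheme + colon + body + question_mark + query + hash_mark + fragment
```

Under these assumptions, `normalize_ark("ARK:/12345/ax-20315.svg.en")` and `normalize_ark("ark:12345/ax20315.en.svg")` yield the same string, so two ARKs that differ only in case of the scheme, hyphens, or suffix order compare equal after normalization.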
- Any Extended ARK with a total length less than or equal to 255 characters.
- Any Extended ARK that has a Basic ARK part with length less than or equal to 255 characters and whose inflection is empty, ?, ?? or ?info.
- The HTTP status codes 301, 302, 307 and 308 signify that the referent of the ARK is available at the URI indicated by the Location header. If the Vary header is present in the response, then this location is specific to the parameters indicated by the semantics of the Vary header. An HTTP ARK resolver MUST use HTTP status code 302 or 307 instead of 301 or 308 because ARK resolvers provide a temporary location for the referent of the ARK, not a permanent relocation.
- The HTTP status code 303 signifies that a resource related to the referent of the ARK is tentatively available at the URI indicated by the Location header.
- The HTTP status code 404 signifies that this resolver does not possess a location for the referent of the ARK.
- The resolver SHOULD reply with HTTP status code 400 if the request-part is not a Basic ARK and the server is unwilling to process it. Note that this status code is not specific to the aforementioned condition; the HTTP semantics allow it to be used for other types of errors unrelated to the ARK system.
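As a non-normative illustration of the status codes listed above, the following Python sketch shows how a resolver might choose its response. The lookup tables, the function names and the simplified Basic ARK test (scheme “ark”, no query, no fragment) are assumptions made for this example and are not defined by this document.

```python
def is_basic_ark(request_part):
    """Simplified check (assumption): 'ark' scheme, no query and no fragment."""
    return (
        request_part.lower().startswith("ark:")
        and "?" not in request_part
        and "#" not in request_part
    )


def resolver_response(request_part, locations, related=None, accept_non_basic=False):
    """Return (status, headers) for a request-part.

    locations: hypothetical table mapping ARKs to a current location.
    related:   hypothetical table mapping ARKs to a related resource.
    """
    related = related or {}
    if not is_basic_ark(request_part) and not accept_non_basic:
        return 400, {}                      # unwilling to process a non-Basic ARK
    if request_part in locations:
        # 302 rather than 301/308: the location is temporary, not a relocation.
        return 302, {"Location": locations[request_part]}
    if request_part in related:
        return 303, {"Location": related[request_part]}
    return 404, {}                          # no location known to this resolver
```

A resolver that prefers to preserve the request method on redirection could return 307 where this sketch returns 302.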
- The Extended ARK to be resolved.
- The prefix of the ARK resolver. If none is specified by the user, the client that resolves the ARK SHOULD default to https://n2t.net.
- How many redirections are to be tolerated. MUST be at least 5.
- The HTTP method to use for resolution. MUST be either GET or HEAD.
- If URI is not an address with http or https scheme the algorithm ends with success. Otherwise send an HTTP request to the resource identified by URI using HTTP method method; if sending this request fails then return failure.
- Dispatch based on the HTTP status code obtained:
- If the HTTP status code was 301, 302, 303, 307 or 308 then set URI to the URI indicated in the Location HTTP header. If that HTTP header is missing or not a valid URI, then return failure. If the HTTP status was 303 then set state to the symbol related.
- If the HTTP status code is 200, 204, 206, 226 or 304 then the algorithm finishes with success. If state is direct then the ARK is located at URI and the representation obtained is a representation of the resource identified by the ARK. If state is related then URI identifies a resource related to the resource identified by the ARK and the representation obtained is related to the resource identified by the ARK.
- If the HTTP status code is in the range 400 to 599 then return failure.
- If the HTTP status code does not match any rule above then the behavior is implementation-defined.
- Decrement max_redirects by 1.
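The resolution steps above can be sketched in Python as follows. This is a non-normative illustration under stated assumptions: the initial URI is formed by joining the resolver prefix and the ARK with “/”, the initial state is direct, resolution fails once the redirect budget is exhausted, and status codes not covered by the rules are treated as failure. The names `resolve_ark`, `http_fetch` and the `fetch` parameter are hypothetical.

```python
import http.client
from urllib.parse import urlsplit

REDIRECTS = (301, 302, 303, 307, 308)
SUCCESS = (200, 204, 206, 226, 304)


def http_fetch(method, uri):
    """Send one request without following redirects; return (status, Location)."""
    parts = urlsplit(uri)
    cls = (http.client.HTTPSConnection if parts.scheme.lower() == "https"
           else http.client.HTTPConnection)
    conn = cls(parts.netloc, timeout=10)
    try:
        target = (parts.path or "/") + ("?" + parts.query if parts.query else "")
        conn.request(method, target)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def _is_absolute_uri(value):
    """Simplified validity check (assumption): non-empty and has a scheme."""
    try:
        return bool(value) and bool(urlsplit(value).scheme)
    except ValueError:
        return False


def resolve_ark(ark, prefix="https://n2t.net", max_redirects=5,
                method="GET", fetch=http_fetch):
    """Return (outcome, state, uri), where outcome is 'success' or 'failure'."""
    if method not in ("GET", "HEAD"):
        raise ValueError("method must be GET or HEAD")
    if max_redirects < 5:
        raise ValueError("max_redirects must be at least 5")
    uri = prefix.rstrip("/") + "/" + ark    # assumption: prefix + "/" + ARK
    state = "direct"                        # assumption: initial state
    while True:
        if urlsplit(uri).scheme.lower() not in ("http", "https"):
            return "success", state, uri
        try:
            status, location = fetch(method, uri)
        except (OSError, http.client.HTTPException):
            return "failure", state, uri
        if status in REDIRECTS:
            if not _is_absolute_uri(location) or max_redirects == 0:
                return "failure", state, uri
            uri = location
            if status == 303:
                state = "related"
            max_redirects -= 1
            continue
        if status in SUCCESS:
            return "success", state, uri
        # 400 to 599 is failure; other codes are implementation-defined,
        # and this sketch chooses to treat them as failure too.
        return "failure", state, uri
```

Passing a custom `fetch` callable makes the sketch usable with any HTTP client and lets it be exercised without network access.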