Skip to content

Afbase/aturi validation rsky syntax#39

Merged
rudyfraser merged 12 commits intoblacksky-algorithms:mainfrom
afbase:afbase/aturi-validation-rsky-syntax
Jan 26, 2025
Merged

Afbase/aturi validation rsky syntax#39
rudyfraser merged 12 commits intoblacksky-algorithms:mainfrom
afbase:afbase/aturi-validation-rsky-syntax

Conversation

@afbase
Copy link
Contributor

@afbase afbase commented Jan 22, 2025

Closes #38

Should we validate AT URIs in our codebase even though TypeScript doesn't? The validation code exists but isn't used.

Okay upon further testing and review, the ensureValidAtUri in the typscript codebase does not consider:

  1. AtURIs with the query parameters.
  2. As far as I can tell, it is only used to validate the "subject" URIs that you may find in repository records.
  3. My implementation conforms to the typescript capability and so I have decided to not incorporate it in AtUri::new code.

You'll still be able to:

// New Rust-style (one step)
let at_uri = ensure_valid_at_uri(uri)?;
// or...
let _ = ensure_valid_at_uri(uri)?;

and similarly in the other validation functions implementations, ensure_valid_datetime, if ok, will return a DateTime<FixedOffset>>. So you'll be able to:

// New Rust-style (one step)
let validated_datetime = ensure_valid_datetime(date_str)?;
// or...
let _ = ensure_valid_datetime(date_str)?;

I'm doing so since the code parses the string to a datetime object successfully. The other validation code evaluates strings and it's a bit redundant to return a string again.

expect_valid("at://did:plc:asdf123/com.atproto.feed.post/asdf123");
expect_valid("at://did:plc:asdf123/com.atproto.feed.post/a");
expect_valid("at://did:plc:asdf123/com.atproto.feed.post/%23");
expect_valid("at://did:plc:asdf123/com.atproto.feed.post/$@!*)(:,;~.sdf123");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these don't look like valid record keys to me. Per the docs:

Regardless of the type, Record Keys must fulfill some baseline syntax constraints:

- restricted to a subset of ASCII characters — the allowed characters are alphanumeric (A-Za-z0-9), period, dash, underscore, colon, or tilde (.-_:~)

dollar signs, at signs, parentheses, semi-colons, asterisks, and exclamation points shouldn't be allowed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you like me conform more closely to the docs or to what typescript has?

Typescript's validation is much more relaxed at this time 😞

expect_valid("at://did:plc:asdf123/com.atproto.feed.post/record");
expect_valid(&format!(
"at://did:plc:asdf123/com.atproto.feed.post/{}",
"o".repeat(800)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

must have at least 1 and at most 512 characters per the docs

@afbase
Copy link
Contributor Author

afbase commented Jan 23, 2025

@rudyfraser - give me a another day or so to work on this. There are some discrepancies between the typescript and atproto specifications.

  1. I'm going to make a markdown file of the discrepancies in the validation between this code, the typescript, and the atproto specs.
  2. I reworked the aturi validation but I'm going to go through the implementations and tests of the other files again to make sure I'm content with it.

I'll let you know when it's ready for review again

@rudyfraser
Copy link
Member

@rudyfraser - give me a another day or so to work on this. There are some discrepancies between the typescript and atproto specifications.

1. I'm going to make a markdown file of the discrepancies in the validation between this code, the typescript, and the atproto specs.

2. I reworked the aturi validation but I'm going to go through the implementations and tests of the other files again to make sure I'm content with it.

I'll let you know when it's ready for review again

Sounds good. Just to answer your other comment the atproto specs should be the source of truth where there are differences between the spec and the TS code. I imagine discrepancies should be rare (since it's the same org working on both, lol) but for a case like this it'd be the spec.

I'll keep an eye out for your next comment to review.

@afbase
Copy link
Contributor Author

afbase commented Jan 25, 2025

@rudyfraser - give me a another day or so to work on this. There are some discrepancies between the typescript and atproto specifications.

1. I'm going to make a markdown file of the discrepancies in the validation between this code, the typescript, and the atproto specs.

2. I reworked the aturi validation but I'm going to go through the implementations and tests of the other files again to make sure I'm content with it.

I'll let you know when it's ready for review again

Sounds good. Just to answer your other comment the atproto specs should be the source of truth where there are differences between the spec and the TS code. I imagine discrepancies should be rare (since it's the same org working on both, lol) but for a case like this it'd be the spec.

I'll keep an eye out for your next comment to review.

Okay so here's what I think is going on reviewing the typescript code and specification more thoroughly:

  1. It seems there are two AT-URI syntaxes, a general form and the more restricted form that is meant for the use with lexicon. This validation seems to follow the general form (and the typescript implementation as well) - which is a quite a bit more generous with what is acceptable.
  2. With that said, the AT-URI validation here is fairly weak. See this note in the typescript implementation:

NOTE: this package is currently more permissive than spec about AT URIs, so invalid cases are not errors

  1. The other components, i.e. DID, Handle, datetime, NSID, TID, and Record Key, seem to conform to the specifications on atproto.com.
  2. I made rsky-syntax/ATURI_VALIDATION.md which provides some details about this for others to get some idea about the details about this.

@afbase
Copy link
Contributor Author

afbase commented Jan 25, 2025

An idea: We could create a validation function that is more strict and conform to the specification. This would got beyond the reference implementation in Typescript.

@rudyfraser
Copy link
Member

For now could you open an issue for the more strict version? Since this conforms to the canonical one I think we're good to go here.

@rudyfraser rudyfraser merged commit 24befb0 into blacksky-algorithms:main Jan 26, 2025
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Proposal: Fully featured rsky-syntax

2 participants