feat: ✨ implement resolve uri by joelostblom · Pull Request #152 · seedcase-project/seedcase-flower

joelostblom · 2026-02-24T15:09:15Z

Description

Needs a thorough review.

Checklist

Ran just run-all

A bit ugly/verbose, but I couldn't find a faster way to have it both work as a type and be able to validate potential URLs the same way as HttpUrl, ie without calling other methods or functions.

joelostblom

This is ready for a first review! Mostly looking for feedback on the approach so I know if it makes sense. I'm figuring out how to add tests in the meanwhile.

joelostblom · 2026-02-25T13:52:24Z

src/seedcase_flower/cli.py

+    properties = _resolve_uri(uri)
+    # properties = _read_properties(path)


I commented this out for now since I don't think it makes sense to do all the changes required for _read_properties to pass type checking in this PR; that will be in the next one together with expanding that function fully

joelostblom · 2026-02-25T13:56:36Z

src/seedcase_flower/internals.py

+_AnnotatedHttps = Annotated[AnyUrl, UrlConstraints(allowed_schemes=["https"])]
+_adapter = TypeAdapter(_AnnotatedHttps)
+
+
+class HttpsUrl(str):
+    """Type and class with validation for https URLs."""
+
+    @classmethod
+    def __get_pydantic_core_schema__(cls, source, handler):  # type: ignore[no-untyped-def]
+        """Initialize adapter core schema."""
+        return _adapter.core_schema
+
+    def __new__(cls, value: str):  # type: ignore[no-untyped-def]
+        """Setup validation."""
+        validated = _adapter.validate_python(value)
+        return str.__new__(cls, validated)


This genAI-generated snippet seems somewhat verbose, but there does not appear to be another way of creating an HttpsUrl class that both works as a type and can be called (i.e. works the same as the existing HttpUrl class in pydantic). See the following links for discussion and some alternatives that all seem less desirable to me:

Write our own validator

Use parse_obj_as

Call .model_validate

Very open to suggestions
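For comparison, the "write our own validator" alternative could be sketched with only the standard library. This is not pydantic's behaviour and not the code in this PR: the check below assumes that "valid" only means an `https` scheme plus a non-empty host, and `fetch` is a hypothetical consumer added to show the class used as a type.

```python
from urllib.parse import urlsplit


class HttpsUrl(str):
    """A str subclass that only accepts https URLs (hypothetical stdlib variant)."""

    def __new__(cls, value: str) -> "HttpsUrl":
        parts = urlsplit(value)
        # Assumption: "valid" only means an https scheme plus a non-empty host.
        if parts.scheme != "https" or not parts.netloc:
            raise ValueError(f"Not an https URL: {value!r}")
        return super().__new__(cls, value)


def fetch(url: HttpsUrl) -> str:
    """Hypothetical consumer that uses the class as a type annotation."""
    return f"would fetch {url}"


url = HttpsUrl("https://example.org/datapackage.json")
print(fetch(url))
```

The trade-off is that this is far less strict than pydantic's URL validation, so it only helps if a scheme check is all that is needed.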

I have no idea what's going on here, nor what's the purpose. Is this to check that the URL is an actual URL?

So this is just to check the URL, right? As we'll fetch the JSON from the URL immediately, there's an argument for not trying to pre-check the URL at all. If it's bad, the request will fail anyway. (Of course we could add some nice error handling around that.)

Yes, it is both to check that strings adhere to the https URL specification and to use it as a type in a function signature. I'm fine with changing this and passing around strings, but see my reply below on why I thought we would prefer to create these specific types when we try to follow functional programming.

Ah, I think I understand it a bit more after reading your comment. As we only need to know whether an address is local or remote, a middle ground would be to have a class that captures that plus a standardised form of the URI (like so). That way we keep function signatures tidy but don't have to care about validating URIs beyond the match statement.

joelostblom · 2026-02-25T14:00:00Z

src/seedcase_flower/internals.py

-# TODO Extend to parse strings and return either URL or Path
-def _resolve_uri(uri: str) -> Path:
-    return Path(uri)
+type HttpsUrl_or_FileUrl = HttpsUrl | FileUrl


@lwjohnst86 As per our suggestion today, I defined a single type here instead of using | or Union in the function signature. I initially tried to achieve this with enum but it doesn't seem possible / I couldn't figure out how, so I went with type instead.

I'm unsure if the name HttpsUrl_or_FileUrl is actually that helpful, but it also does not seem correct to give it a name such as just URL or URI, since it is really a subset of these. Even something like URL_subset does not seem that helpful.

joelostblom · 2026-02-25T14:38:20Z

src/seedcase_flower/cli.py

    """
-    path: Path = _resolve_uri(uri)
-    properties: dict[str, Any] = _read_properties(path)
+    resolved_uri = _resolve_uri(uri)


Should resolved_uri be typed here or is it enough that the function return is typed already?

I'd prefer to explicitly write it out, at least within a bigger function like this one.

joelostblom · 2026-02-25T14:38:49Z

src/seedcase_flower/cli.py

-    path: Path = _resolve_uri(uri)
-    properties: dict[str, Any] = _read_properties(path)
+    resolved_uri = _resolve_uri(uri)
+    properties = _read_properties(resolved_uri)  # type: ignore


Ignoring type checking for now since this function will be implemented in a separate PR

lwjohnst86

Some small comments to start. Please include tests too! ☺️

lwjohnst86 · 2026-02-25T14:45:31Z

src/seedcase_flower/internals.py

-# TODO Extend to parse strings and return either URL or Path
-def _resolve_uri(uri: str) -> Path:
-    return Path(uri)
+type HttpsUrl_or_FileUrl = HttpsUrl | FileUrl


This can be an enum instead. And should be called a URI (where the default assumes a path URI).

I'm really surprised there aren't packages or something for handling URIs... 🤔

Suggested change

-type HttpsUrl_or_FileUrl = HttpsUrl | FileUrl
+class URI(Enum):
+    https = HttpsUrl
+    file = FileUrl

e.g. might look like above

Maybe I'm misunderstanding what _resolve_uri returns, but if it returns the address of the datapackage.json, then it cannot really return an enum. Enums are for fixed, finite sets of values. If _resolve_uri only determines whether the address is local or remote, then yeah it can return an enum.

Yes, I couldn't get this to work as an enum; it seems like it hides the type information from mypy. Take the following example that contains the same function, either with an enum, a type alias, or a union:

    # supertypes.py
    from enum import Enum


    class Number(Enum):
        """Doc."""
        INT = int
        FLOAT = float


    def fx(x: Number) -> int | float:
        """Doc."""
        x / 2
        return x


    type int_or_float = int | float


    def fy(y: int_or_float) -> int | float:
        """Doc."""
        y / 2
        return y


    def fz(z: int | float) -> int | float:
        """Doc."""
        z / 2
        return z

When running uv run mypy --pretty supertypes.py, we can see that mypy doesn't understand that a Number is a supertype of int and float, whereas this is understood when creating a new type (and of course also with the union):

    supertypes.py:13: error: Unsupported operand types for / ("Number" and "int") [operator]
        x / 2
        ^
    supertypes.py:14: error: Incompatible return value type (got "Number", expected "int | float") [return-value]
        return x
               ^
    Found 2 errors in 1 file (checked 1 source file)

That still gives me the same error for the division:

    supertypes.py:36: error: Unsupported operand types for / ("Number" and "int") [operator]
        return x / 2
               ^

And even if it would work, it doesn't seem that useful for combining types in general if we always need additional logic like that. Doesn't it seem that creating a new type/type alias with type/TypeAlias is more appropriate here?
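The difference can also be seen at runtime. The following is a minimal sketch, not code from this PR: it uses `typing.Union` instead of the `type` statement so that it also runs on interpreters older than Python 3.12, and `halve` is a hypothetical function name.

```python
from enum import Enum
from typing import Union


class Number(Enum):
    """Enum whose member *values* happen to be types."""

    INT = int
    FLOAT = float


# A plain union alias: any int or float is accepted where this is expected.
IntOrFloat = Union[int, float]


def halve(y: IntOrFloat) -> float:
    """Type checks, because int and float both support division."""
    return y / 2


# A member of Number is neither an int nor a float; it only stores a type.
print(isinstance(Number.INT, int))  # False
print(Number.INT.value is int)  # True
print(halve(3))  # 1.5
```

In other words, annotating a parameter as `Number` promises one of the two enum members, not a number, which is consistent with the mypy errors above.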

Reading into it more, it seems enums in Python are more basic and limited in features than I expected. So my suggestion here doesn't actually work the way I was envisioning. I was imagining you use the enum to restrict the type of output and assign the URI path to the relevant enum entry. We could get similar behaviour with a dataclass and treat it like an enum 😛 e.g.

    @dataclass(frozen = True)
    class URI:
        https: HttpsUrl
        file: FileUrl

    def ... -> URI
        match ...
            case "file"
                URI.file = ...
            case ""
                URI.file = _check_path(...)
            ...
        return URI

Hmm, do you have an MRE for what that would look like? When I try the following, I get the same errors as before:

    from dataclasses import dataclass


    @dataclass(frozen=True)
    class Number:
        """Doc."""

        INT = int
        FLOAT = float


    def fx(x: Number) -> int | float:
        """Doc."""
        x / 2
        return x


    def fw(x: Number) -> int | float:
        """Doc."""
        match Number:
            case INT:
                return x / 2
            case FLOAT:
                return x / 2

    supertypes.py:14: error: Unsupported operand types for / ("Number" and "int") [operator]
        x / 2
        ^
    supertypes.py:15: error: Incompatible return value type (got "Number", expected "int | float") [return-value]
        return x
               ^
    supertypes.py:22: error: Unsupported operand types for / ("Number" and "int") [operator]
        return x / 2
               ^
    Found 3 errors in 1 file (checked 1 source file)

Sorry to butt in, but a URI needs to know 2 things: whether it's local or remote and what its address is.
That corresponds more to:

    @dataclass
    class Uri:
        value: str
        local: bool

A single URI shouldn't have properties for both https and file?

Agree, but I thought the idea of this superclass was to take advantage of the validation and type specification that already exists in pydantic's FileUrl and our modified HttpsUrl. If we don't want that, we could use something like AnyUrl directly and implement our own validation? Or could we somehow use the validation from the pydantic classes inside our own Uri dataclass?

lwjohnst86 · 2026-02-25T14:46:01Z

src/seedcase_flower/internals.py

+type HttpsUrl_or_FileUrl = HttpsUrl | FileUrl
+
+
+def _resolve_uri(uri_or_path: str) -> HttpsUrl_or_FileUrl:


Suggested change

-def _resolve_uri(uri_or_path: str) -> HttpsUrl_or_FileUrl:
+def _resolve_uri(uri: str) -> URI:

See comment above.

Maybe a better word would be _parse_uri()...? 🤔

lwjohnst86 · 2026-02-25T14:50:36Z

src/seedcase_flower/internals.py

+    split_uri = parse.urlsplit(uri_or_path)
+    match split_uri.scheme:
+        case "":
+            return _check_path(uri_or_path)


Instead, prepend the "file" scheme and check it as if it is a file URI.

lwjohnst86

Just forgot to comment on some stuff ☺️

lwjohnst86 · 2026-02-25T15:30:34Z

src/seedcase_flower/cli.py

    """
-    path: Path = _resolve_uri(uri)
-    properties: dict[str, Any] = _read_properties(path)
+    resolved_uri = _resolve_uri(uri)


I'd prefer to explicitly write it out, at least within a bigger function like this one.

lwjohnst86 · 2026-02-25T15:31:33Z

src/seedcase_flower/internals.py

+_AnnotatedHttps = Annotated[AnyUrl, UrlConstraints(allowed_schemes=["https"])]
+_adapter = TypeAdapter(_AnnotatedHttps)
+
+
+class HttpsUrl(str):
+    """Type and class with validation for https URLs."""
+
+    @classmethod
+    def __get_pydantic_core_schema__(cls, source, handler):  # type: ignore[no-untyped-def]
+        """Initialize adapter core schema."""
+        return _adapter.core_schema
+
+    def __new__(cls, value: str):  # type: ignore[no-untyped-def]
+        """Setup validation."""
+        validated = _adapter.validate_python(value)
+        return str.__new__(cls, validated)


I have no idea what's going on here, nor what's the purpose. Is this to check that the URL is an actual URL?

lwjohnst86 · 2026-02-25T15:32:27Z

src/seedcase_flower/internals.py

+type HttpsUrl_or_FileUrl = HttpsUrl | FileUrl
+
+
+def _resolve_uri(uri_or_path: str) -> HttpsUrl_or_FileUrl:


Maybe a better word would be _parse_uri()...? 🤔

martonvago

I like this in general, just added some ideas about how we could simplify things a bit ☺️

martonvago · 2026-02-25T21:08:14Z

src/seedcase_flower/cli.py

+    resolved_uri = _resolve_uri(uri)
+    properties = _read_properties(resolved_uri)  # type: ignore


I don't know if this split between _resolve_uri and _read_properties is the simplest solution.
_resolve_uri can return either a local or a remote address, and these are loaded/fetched differently.
The way it is now, we first split the cases in the match statement, unite them in the output of _resolve_uri, and then presumably split them again in _read_properties. We could cut out the middle man and handle each case right when we split them. (I'll explain more there.)

That could work! And just drop the _resolve_uri function completely and keep the read properties one instead 👍👍

martonvago · 2026-02-25T21:12:47Z

src/seedcase_flower/internals.py

+_AnnotatedHttps = Annotated[AnyUrl, UrlConstraints(allowed_schemes=["https"])]
+_adapter = TypeAdapter(_AnnotatedHttps)
+
+
+class HttpsUrl(str):
+    """Type and class with validation for https URLs."""
+
+    @classmethod
+    def __get_pydantic_core_schema__(cls, source, handler):  # type: ignore[no-untyped-def]
+        """Initialize adapter core schema."""
+        return _adapter.core_schema
+
+    def __new__(cls, value: str):  # type: ignore[no-untyped-def]
+        """Setup validation."""
+        validated = _adapter.validate_python(value)
+        return str.__new__(cls, validated)


So this is just to check the URL, right? As we'll fetch the JSON from the URL immediately, there's an argument for not trying to pre-check the URL at all. If it's bad, the request will fail anyway. (Of course we could add some nice error handling around that.)

martonvago · 2026-02-25T21:19:50Z

src/seedcase_flower/internals.py

-# TODO Extend to parse strings and return either URL or Path
-def _resolve_uri(uri: str) -> Path:
-    return Path(uri)
+type HttpsUrl_or_FileUrl = HttpsUrl | FileUrl


Maybe I'm misunderstanding what _resolve_uri returns, but if it returns the address of the datapackage.json, then it cannot really return an enum. Enums are for fixed, finite sets of values. If _resolve_uri only determines whether the address is local or remote, then yeah it can return an enum.

martonvago · 2026-02-25T21:44:42Z

src/seedcase_flower/internals.py

+type HttpsUrl_or_FileUrl = HttpsUrl | FileUrl
+
+
+def _resolve_uri(uri_or_path: str) -> HttpsUrl_or_FileUrl:


So picking up from my earlier comment, how about looking forward to what we will do with the addresses and doing something like:

    def _resolve_uri(uri: str) -> dict[str, Any]:
        split_uri = parse.urlsplit(uri)
        match split_uri.scheme:
            case "":
                return _load_properties_from_path(uri)
            case "file":
                return _load_properties_from_path(split_uri.path)
            case "https":
                return _fetch_properties_from_url(split_uri.geturl())
            case "gh" | "github":
                return _fetch_properties_from_url(split_uri._replace(...).geturl())
            case _:
                raise ValueError(...)

Both _load_properties_from_path and _fetch_properties_from_url could take a string and try to load the JSON without checking that the string is a good address. Instead, we could anticipate common failure cases (file not found, not JSON, various network errors, etc.) and show nicely worded errors -- probably easier than typing and validating the input addresses.

(Maybe the one thing worth checking is that the gh-flavoured address contains just the organisation and the repo?)

I like 👍 just rename the whole function to _read_properties instead. And for the gh and github one, convert them into https and fetch like the other https

I agree with what you wrote in your comment above regarding the split between _resolve_uri and _read_properties. It seemed somewhat forced when I was implementing it too, exactly because of what you said with the uniting and splitting again. I'm totally fine with this alternative approach (or even ducktyping for error handling as long as the messages we get are clear), but my (mis?)understanding of Luke and/or functional programming in general is that it might clash on two accounts:

Each parameter should be assigned a type that as closely as possible matches what it represents. Preferably errors are handled by this type instead of e.g. pattern matching a string (I understand that we do pattern matching to assign the type, so it seems like the approaches are equally fine to me from this perspective, but not sure if they are equal from a functional programming perspective).

Each function should just do one thing. With this new approach, we would have a (parent) function that both resolves the URI and then loads the properties. Again, this would be totally fine with me personally, but I thought that in functional programming, we want a function to do one thing (e.g. convert from one type to another) and then pass the output type to another function so that they are more standalone and composable rather than nesting two functions within a parent function.

Again, I might very well have misunderstood both of these "principles" of functional programming and I'm happy with either solution here.

Hmm, I see what you're saying. What if instead we resolve the URI so a pure path is given the "file" prefix and the gh/github is converted into a full URL. And the https is output as is. Then the downstream read properties function only needs to do two switch statements rather than X number to actually convert them into correct URIs?

I think that's what the initial implementation suggestion is doing, isn't it? Paths and file URIs return FileUrl whereas https URLs and github URIs return HttpsUrl.

Yea! Well, almost, aside from how it is being output, that's the only tension point 😛

I mean I personally think that some problems lend themselves very nicely to one type of solution and some problems to another type. To me, the solution with nested functions feels cleaner and simpler in this case. But I do get the one function -- one responsibility hangup, though of course I could be cheeky and argue that loading the properties from a URI is one responsibility.

Or taking another angle, we could say that loading the properties from a local path and fetching them from a remote address are two distinct tasks. This is true because they expect different inputs (a path vs a string) and are implemented differently. If this is the case, then what we need before loading the properties is to determine whether we have a local or a remote address and to standardise each type of address.
Something like:

    @dataclass
    class Uri:
        value: str
        local: bool


    def _parse_uri(uri: str) -> Uri:
        split_uri = parse.urlsplit(uri)
        match split_uri.scheme:
            case "":
                return Uri(value=uri, local=True)
            case "file":
                return Uri(value=split_uri.path, local=True)
            case "https":
                return Uri(value=split_uri.geturl(), local=False)
            case "gh" | "github":
                return Uri(
                    value=split_uri._replace(...).geturl(),
                    local=False,
                )
            case _:
                raise ValueError(...)


    parsed_uri = _parse_uri(uri)
    if parsed_uri.local:
        properties = _load_properties_from_path(parsed_uri.value)
    else:
        properties = _fetch_properties_from_url(parsed_uri.value)

Yea, that's a good point about the distinction between local and remote, since the functions that read the properties from either source will very likely be implemented very differently!

joelostblom

@martonvago @lwjohnst86 Ready for another look! I've tried to update based on our discussion. If it looks largely good to you, I will add tests next on top of #155

joelostblom · 2026-02-27T15:03:56Z

src/seedcase_flower/internals.py

+    if split_source.scheme == "":
+        split_source = split_source._replace(scheme="file")


I moved this outside the match logic above to make the cases clearer and to not have to repeat the same functions for paths and file URIs

joelostblom · 2026-02-27T15:14:47Z

src/seedcase_flower/cli.py

 @app.command()
 def build(
-    uri: str = "datapackage.json",
+    source: str = "datapackage.json",


Another controversial change... but hear me out!

I think it was a bit confusing that we were calling the CLI arg a uri although it could also be a file or folder path. We then ended up having functions named _convert_to_*uri() that already took a uri variable as the input argument which semantically doesn't make much sense and is confusing to reason about, e.g. uri = convert_to_https_uri(uri)...

We could move this to its own issue if it warrants a lot of discussion but since it is directly tied to the naming of the functions introduced here, I thought it made sense to bring it up here.

Yes, move this to another issue, as this impacts the design ☺️

Renaming functions is easy, but we don't want this PR to be tied down more than it has been already.

So for now, keep as URI, not source (I personally am not sure even source is a good name, but that's something to discuss).

Thanks for considering it! I opened #167 and reverted the changes here

joelostblom · 2026-02-27T15:17:08Z

src/seedcase_flower/cli.py

    """
-    path: Path = _resolve_uri(uri)
-    properties: dict[str, Any] = _read_properties(path)
+    uri: Uri = _parse_source(source)


Same here as above, we're not just resolving a URI in this function, we're parsing an input string into components, editing as needed, then converting to our URI type.

You can still parse a URI to confirm it is a URI. Like parsing a JSON file, e.g. parse_json(path) or read_properties(path). Both contain JSON/properties, but we still parse it to get it into the right format for us.

Mostly agree... I think it would probably be more suitable to say you parse a URI candidate to check whether it is a URI. But that's a bit beside my main point here, which is that you cannot parse a path and say you are parsing a URI.

joelostblom · 2026-02-27T15:17:35Z

src/seedcase_flower/internals.py

+                "The source must be either a path to an existing file/folder "
+                "or a URI with one of the following URI prefixes: "
+                "`file:`, `https:`, `gh:`, `github:`"
+            )


I kept this error, but raised #159 for specific gh uri errors

joelostblom · 2026-02-27T15:20:34Z

src/seedcase_flower/internals.py

-def _resolve_uri(uri: str) -> Path:
-    return Path(uri)
+@dataclass(frozen=True)
+class Uri:


We used the URI name on the call so I kept it for now, but we could consider calling it URL instead since by the time we are using this class, we have already converted the gh URIs and paths to https URLs and File URLs, respectively. Of course, URI is still valid since it is a superset including URLs, but URL would be more precise at this point since non-URL URIs (gh:, file:, ...) will never be represented by this class.

URL is only for HTTP, URI is the umbrella term that includes file:. https://en.wikipedia.org/wiki/Uniform_Resource_Identifier. And this class would contain the value which contains paths and URLs. We could call it something else, but URL won't be accurate.

Hmmm, I don't think that is correct. My understanding is that a URI can be either a URN or a URL. The difference is whether it uniquely Names a resource, or Locates the resource with an actual address. HTTP is just the scheme. There are many other schemes that specify valid URLs, e.g. ftp, irc, and file.

The link that you posted actually corroborates this understanding as exemplified by this excerpt from the second paragraph:

URIs which provide a means of locating and retrieving information resources on a network (either on the Internet or on another private network, such as a computer file system or an Intranet) are Uniform Resource Locators (URLs). ... Other URIs provide only a unique name, without a means of locating or retrieving the resource or information about it; these are Uniform Resource Names (URNs).

Some more examples on this link.

Ok, we must be reading these entries in completely different ways because the way I read the wiki entry on URI is very different from how you are describing it. Like, a file is not a URL, as per wiki "web address". And a URN seems to be its own URI (urn: vs https:), and urn: is not file:.

Here it's called a file URI: https://en.wikipedia.org/wiki/File_URI_scheme

joelostblom · 2026-02-27T15:25:16Z

src/seedcase_flower/internals.py

+    else:
+        # TODO read from remote file
+        pass
+    return {"placeholder": uri.value}


I put up a draft of what I think this function will look like in #161 that I want to mention since it is so connected to the changes in this PR

lwjohnst86

Nice refactor! Some comments ☺️ Also, can you add a unit test for _parse_uri()? We can treat the "higher-level" functions within the build() and view() functions almost as the Python interface, so we can do unit testing of them to make sure they work fine. And the tests for build() and view() we treat as integration tests. Unfortunately, Python doesn't have a formal way of classifying/structuring unit tests from integration tests, so this is the best we have.

lwjohnst86 · 2026-02-27T16:02:48Z

src/seedcase_flower/cli.py

    """
-    path: Path = _resolve_uri(uri)
-    properties: dict[str, Any] = _read_properties(path)
+    uri: Uri = _parse_source(source)


You can still parse a URI to confirm it is a URI. Like parsing a JSON file, e.g. parse_json(path) or read_properties(path). Both contain JSON/properties, but we still parse it to get it into the right format for us.

lwjohnst86 · 2026-02-27T16:05:31Z

src/seedcase_flower/internals.py

-def _resolve_uri(uri: str) -> Path:
-    return Path(uri)
+@dataclass(frozen=True)
+class Uri:


URL is only for HTTP, URI is the umbrella term that includes file:. https://en.wikipedia.org/wiki/Uniform_Resource_Identifier. And this class would contain the value which contains paths and URLs. We could call it something else, but URL won't be accurate.

This is a first implementation of the tests we can use to check that the CLI is behaving correctly. I think this covers most of what we currently have in main, at least we reach 💯 % coverage! Merging this before some of the existing PRs e.g. #140 and #152 will make it easier to add new tests specifically for the functionality added there instead of convoluting it with testing the general functionality of the app which is tested here instead.

I tried to follow the cyclopts test docs as closely as possible, but let me know if you think something is missing.

I also experimented with using the Copilot CLI for the first time to help me get started and understanding how to write the tests. It was quite useful, particularly to get started, but I did have to delete about half of its suggestions because there was a lot of redundancy. Thought I would mention it in case someone catches something odd in here so I can blame it and not me =p

Needs a thorough review.

## Checklist

- [x] Ran `just run-all`

---------

Co-authored-by: Luke W. Johnston <lwjohnst86@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

tests/test_internals.py

…g sanitization Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

joelostblom

Thanks for the comments, ready for another look!

The added tests were created by copilot and then reviewed and edited by yours truly. Let me know if the style should be different, e.g. no docstring or testing more than one thing in a single test (they are very granular now, but in general it also seems good to test one thing per test and the small amount of code duplication does not seem to warrant the use of fixtures to me, but if you follow another style I can change it to match!).

It was also fun to see the automatically triggered github security review where one AI is correcting another xD

joelostblom · 2026-03-02T08:13:50Z

src/seedcase_flower/internals.py

-def _resolve_uri(uri: str) -> Path:
-    return Path(uri)
+@dataclass(frozen=True)
+class Uri:


Hmmm, I don't think that is correct. My understanding is that a URI can be either a URN or a URL. The difference is whether it uniquely Names a resource, or Locates the resource with an actual address. HTTP is just the scheme. There are many other schemes that specify valid URLs, e.g. ftp, irc, and file.

The link that you posted actually corroborates this understanding as exemplified by this excerpt from the second paragraph:

URIs which provide a means of locating and retrieving information resources on a network (either on the Internet or on another private network, such as a computer file system or an Intranet) are Uniform Resource Locators (URLs). ... Other URIs provide only a unique name, without a means of locating or retrieving the resource or information about it; these are Uniform Resource Names (URNs).

Some more examples on this link.

joelostblom · 2026-03-02T08:22:07Z

src/seedcase_flower/cli.py

    """
-    path: Path = _resolve_uri(uri)
-    properties: dict[str, Any] = _read_properties(path)
+    uri: Uri = _parse_source(source)


Mostly agree... I think it would probably be more suitable to say you parse a URI candidate to check whether it is a URI. But that's a bit beside my main point here, which is that you cannot parse a path and say you are parsing a URI.

joelostblom · 2026-03-02T08:26:26Z

src/seedcase_flower/cli.py

 @app.command()
 def build(
-    uri: str = "datapackage.json",
+    source: str = "datapackage.json",


Thanks for considering it! I opened #167 and reverted the changes here

joelostblom · 2026-03-02T09:20:00Z

src/seedcase_flower/cli.py

    """
-    path: Path = _resolve_uri(uri)
-    properties: dict[str, Any] = _read_properties(path)
+    uri: Uri = _parse_uri(uri)  # type: ignore # TODO fix in read_prop PR


I think a good example of why the current naming is wonky is the fact that we are reassigning the uri variable here. uri is indeed the most appropriate name at this point because it is now an instance of our Uri class. But before this line uri doesn't actually contain a uri, which is confusing to me at least.

mypy also complains about this:

    src/seedcase_flower/cli.py:54: error: Name "uri" already defined on line 33 [no-redef]
        uri: Uri = _parse_uri(uri)
        ^~~
    Found 1 error in 1 file (checked 13 source files)

One solution to this mypy thing is to use parse_uri within read_properties() e.g. read_properties(parse_uri(uri)). (maybe this is also why I liked resolve as a word better, but I don't have a strong preference). (this is where I really miss the ability to pipe...).

lwjohnst86

Just some initial comments, will look at the tests later :)

lwjohnst86 · 2026-03-02T10:01:07Z

src/seedcase_flower/cli.py

    """
-    path: Path = _resolve_uri(uri)
-    properties: dict[str, Any] = _read_properties(path)
+    uri: Uri = _parse_uri(uri)  # type: ignore # TODO fix in read_prop PR


One solution to this mypy thing is to use parse_uri within read_properties() e.g. read_properties(parse_uri(uri)). (maybe this is also why I liked resolve as a word better, but I don't have a strong preference). (this is where I really miss the ability to pipe...).

lwjohnst86 · 2026-03-02T10:03:00Z

src/seedcase_flower/cli.py

+        uri: The path to a local `datapackage.json` file or its parent folder.
+            Can also be an `https:` URL to a remote `datapackage.json` or a
+            `github:` / `gh:` URI pointing to a repo with a `datapackage.json`
+            in the repo root (in the format `gh:org/repo`).


Suggested change

-            in the repo root (in the format `gh:org/repo`).
+            in the repo root (in the format `gh:org/repo`). Can also take a reference to a tag or branch for `gh:` or `github:` URIs (e.g. `gh:org/repo@main` or `gh:org/repo@1.0.1`).

Just to highlight that we also accept that.

lwjohnst86 · 2026-03-02T10:09:36Z

src/seedcase_flower/internals.py

-def _resolve_uri(uri: str) -> Path:
-    return Path(uri)
+@dataclass(frozen=True)
+class Uri:


Ok, we must be reading these entries in completely different ways because the way I read the wiki entry on URI is very different from how you are describing it. Like, a file is not a URL, as per wiki "web address". And a URN seems to be its own URI (urn: vs https:), and urn: is not file:.

Here it's called a file URI: https://en.wikipedia.org/wiki/File_URI_scheme

joelostblom added 4 commits February 23, 2026 18:10

feat: ✨ Move over relevant code from sprout and cdp

5f6fae2

feat: ✨ Add Https type and class validator

6f0ef5d

A bit ugly/verbose, but I couldn't find a faster way to have it both work as a type and be able to validate potential URLs the same way as HttpUrl, ie without calling other methods or functions.

feat: ✨ Implement resolve_uri

f943218

refactor: ♻️ Use case/match for readabilty

a89633d

add-to-board-token bot added this to Iteration planning Feb 24, 2026

github-actions bot assigned joelostblom Feb 24, 2026

github-project-automation bot moved this to Todo in Iteration planning Feb 24, 2026

joelostblom added 9 commits February 24, 2026 16:15

refactor: ♻️ Make variable names more precise

8072033

fix: 🐛 Remove unnused import

7763097

fix: 🐛 Fix mypy errors

c375a30

chore: 🔧 Fix typing

26ad174

fix: 🐛 Remove old tmp code

662c3d6

fix: 🐛 Remove unused import

fd70520

fix: 🐛 Pass vulture checks

b7ad9a0

test: ✅ Comment code to be updated in separate PR to pass tests

df338f3

fix: 🐛 Remove check datapacakge from this PR

0044b9c

signekb moved this from Todo to In Progress in Iteration planning Feb 25, 2026

fix: 🐛 Create type alias

05fbaa8

joelostblom commented Feb 25, 2026

joelostblom marked this pull request as ready for review February 25, 2026 14:05

joelostblom requested a review from a team as a code owner February 25, 2026 14:05

joelostblom moved this from In Progress to In Review in Iteration planning Feb 25, 2026

fix: 🐛 Ignore line in mypy instead of changing logic

5aae406

joelostblom commented Feb 25, 2026

lwjohnst86 requested changes Feb 25, 2026

github-project-automation bot moved this from In Review to In Progress in Iteration planning Feb 25, 2026

lwjohnst86 reviewed Feb 25, 2026

joelostblom mentioned this pull request Feb 25, 2026

test: ✅ mocking of basic app functionality #155

Merged

1 task

martonvago reviewed Feb 25, 2026

joelostblom requested a review from lwjohnst86 February 26, 2026 08:02

joelostblom added 7 commits February 27, 2026 15:38

docs: 📝 Clarify what source can be

78da642

fix: 🐛 Avoid changes to read_properties in this PR

c1bcb3f

Revert "fix: 🐛 Avoid changes to read_properties in this PR"

2b5aeea

This reverts commit c1bcb3f.

feat: ✨ Check types

43a974e

fix: 🐛 Remove redundant parsin of gh uri

39a8c76

feat: ✨ Improve naming

3c091bf

feat: ✨ Add read_prop skeleton

030789e

joelostblom commented Feb 27, 2026

joelostblom requested a review from lwjohnst86 February 27, 2026 15:41

joelostblom moved this from In Progress to In Review in Iteration planning Feb 27, 2026

lwjohnst86 requested changes Feb 27, 2026

github-project-automation bot moved this from In Review to In Progress in Iteration planning Feb 27, 2026

joelostblom mentioned this pull request Mar 2, 2026

How to test the internal functionality of build and view? #156

Open

joelostblom added 7 commits March 2, 2026 09:48

refactor: ♻️ Revert to less precise name until we have discussed more

fd8c748

Merge branch 'main' into feat/resolve-uri

0b5d5d6

test: ✅ Update test to match new docstring

6a12807

fix: 🐛 Restore local file reading behavior to not break former test

7d1b019

refactor: ♻️ Rename test function to match refactor

a5cc597

chore: 🔧 Ignore mypy on lines to be fixed in read_prop PR

d2fb845

test: ✅ Add internal tests for _parse_uri

f5a206c

github-advanced-security bot found potential problems Mar 2, 2026

tests/test_internals.py Fixed

tests/test_internals.py Fixed

joelostblom and others added 2 commits March 2, 2026 10:32

build: 🔨 uv update

115b19e

Potential fix for code scanning alert no. 14: Incomplete URL substring sanitization

779f627

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

joelostblom requested a review from lwjohnst86 March 2, 2026 09:37

joelostblom moved this from In Progress to In Review in Iteration planning Mar 2, 2026

joelostblom commented Mar 2, 2026

joelostblom mentioned this pull request Mar 2, 2026

Name URI argument more precisely #167

Open

lwjohnst86 requested changes Mar 2, 2026

github-project-automation bot moved this from In Review to In Progress in Iteration planning Mar 2, 2026

