@kbroch-rivosinc
Collaborator

@kbroch-rivosinc kbroch-rivosinc commented Jan 9, 2025

Just a proof of concept of this tool: https://github.com/sourcemeta/jsonschema
Don't expect this PR to go anywhere; I'm mostly just capturing results.

It has a lint feature that found some anti-patterns in the schema files. The --fix option also formats the files, so I did that in a separate commit to make the anti-pattern fixes easier to see.

Here's the log output:

jsonschema lint schemas/.
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/cert_class_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/$schema"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/cert_class_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/kind"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/cert_class_schema.json:
  Setting `type` alongside `enum` is considered an anti-pattern, as the enumeration choices already imply their respective types (enum_with_type)
    at schema location "/properties/processor_kind"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/cert_class_schema.json:
  Setting `type` alongside `enum` is considered an anti-pattern, as the enumeration choices already imply their respective types (enum_with_type)
    at schema location "/properties/mandatory_priv_modes/items"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/cert_model_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/kind"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/cert_model_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/$schema"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/cert_model_schema.json:
  Setting `type` alongside `enum` is considered an anti-pattern, as the enumeration choices already imply their respective types (enum_with_type)
    at schema location "/properties/base"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/config_schema.json:
  Setting `type` alongside `enum` is considered an anti-pattern, as the enumeration choices already imply their respective types (enum_with_type)
    at schema location "/properties/type"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/config_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/$schema"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/config_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/kind"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/json-schema-draft-07.json:
  Setting the `items` keyword to the true schema does not add any further constraint (items_schema_default)
    at schema location "/properties/examples"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/json-schema-draft-07.json:
  Setting the `items` keyword to the true schema does not add any further constraint (items_schema_default)
    at schema location "/properties/enum"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/manual_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/$schema"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/manual_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/kind"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/manual_version_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/$schema"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/manual_version_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/kind"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/profile_class_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/$schema"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/profile_class_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/kind"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/profile_class_schema.json:
  Setting `type` alongside `enum` is considered an anti-pattern, as the enumeration choices already imply their respective types (enum_with_type)
    at schema location "/properties/processor_kind"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/profile_release_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/$schema"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/profile_release_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/kind"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/profile_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/$schema"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/profile_schema.json:
  Setting `type` alongside `const` is considered an anti-pattern, as the constant already implies its respective type (const_with_type)
    at schema location "/properties/kind"

@kbroch-rivosinc kbroch-rivosinc self-assigned this Jan 9, 2025
@kbroch-rivosinc kbroch-rivosinc marked this pull request as draft January 9, 2025 06:47
@dhower-qc
Collaborator

Cool tool. Regarding the anti-pattern it's pointing out, we do actually (currently) depend on both "type" and "const" being present in some cases. For example, when printing the type of a parameter in generated documentation, we always look at "type" rather than calculating the inferred type from "const".

That said, I see this tool also has a canonicalize command. I wonder if that would add "type" to a "const"-only schema?

@kbroch-rivosinc
Collaborator Author

> Cool tool. Regarding the anti-pattern it's pointing out, we do actually (currently) depend on both "type" and "const" being present in some cases. For example, when printing the type of a parameter in generated documentation, we always look at "type" rather than calculating the inferred type from "const".
>
> That said, I see this tool also has a canonicalize command. I wonder if that would add "type" to a "const"-only schema?

Good to know the usage. In that case, a check that makes sure both are present might actually be useful.
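A minimal sketch of what such a check could look like, in Python. This is not part of the PR or of either tool: the function name `find_missing_type`, the keyword sets, and the sample schema are all assumptions made for illustration. It walks the usual Draft 7 subschema locations and reports every place that uses `const` or `enum` without a sibling `type`. Property names are not JSON Pointer escaped.

```python
from typing import Any, Iterator

# Keywords whose values are maps of name -> subschema.
SCHEMA_MAPS = {"properties", "patternProperties", "definitions", "$defs"}
# Keywords whose values are lists of subschemas.
SCHEMA_LISTS = {"allOf", "anyOf", "oneOf"}
# Keywords whose value is a single subschema.
SCHEMA_VALUES = {
    "items", "additionalProperties", "not", "if", "then", "else",
    "contains", "propertyNames",
}


def find_missing_type(schema: Any, pointer: str = "") -> Iterator[str]:
    """Yield locations that declare `const`/`enum` but no `type`."""
    if not isinstance(schema, dict):
        return  # boolean schemas carry no keywords
    if ("const" in schema or "enum" in schema) and "type" not in schema:
        yield pointer
    for key, value in schema.items():
        if key in SCHEMA_MAPS and isinstance(value, dict):
            for name, sub in value.items():
                yield from find_missing_type(sub, f"{pointer}/{key}/{name}")
        elif isinstance(value, list) and (key in SCHEMA_LISTS or key == "items"):
            for index, sub in enumerate(value):
                yield from find_missing_type(sub, f"{pointer}/{key}/{index}")
        elif key in SCHEMA_VALUES:
            yield from find_missing_type(value, f"{pointer}/{key}")


# Hypothetical sample schema, not taken from the repository.
sample = {
    "properties": {
        "kind": {"const": "profile"},
        "name": {"type": "string"},
        "modes": {"type": "array", "items": {"enum": ["M", "S"]}},
        "base": {"type": "integer", "enum": [32, 64]},
    }
}
missing = list(find_missing_type(sample))
```

A hook built on something like this would fail when `missing` is non-empty, which is the opposite of what the const_with_type and enum_with_type lint rules ask for.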

I tried canonicalize and got something else entirely (NOTE: starting point is from branch with type stripped):

❯ diff profile_schema.json profile_schema.canonicalize.json
8a9
>   "minProperties": 3,
11c12,14
<       "const": "profile_schema.json#"
---
>       "enum": [
>         "profile_schema.json#"
>       ]
14c17,19
<       "const": "profile"
---
>       "enum": [
>         "profile"
>       ]
18c23,24
<       "type": "string"
---
>       "type": "string",
>       "minLength": 0

But the help description says this command is related to JSON BinPack, which probably explains it:

   canonicalize <schema.json>

       Pre-process a JSON Schema into JSON BinPack's canonical form
       for static analysis.

And for fun, here is lint on that new result:

❯ sm-jsonschema lint profile_schema.canonicalize.json
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/profile_schema.canonicalize.json:
  Setting `minProperties` to a number less than `required` does not add any further constraint (unsatisfiable_min_properties)
    at schema location ""
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/profile_schema.canonicalize.json:
  An `enum` of a single value can be expressed as `const` (enum_to_const)
    at schema location "/properties/$schema"
/Users/kbroch/rvi/repos/riscv-software-src/riscv-unified-db/schemas/profile_schema.canonicalize.json:
  An `enum` of a single value can be expressed as `const` (enum_to_const)
    at schema location "/properties/kind"
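The round trip above (canonicalize turns `const` into a one-element `enum`, and lint then suggests turning it back) works because the two forms accept exactly the same value. A small Python sketch of the enum_to_const direction follows. The helper name is an assumption, it handles a single subschema only, and it is not the linter's implementation.

```python
def enum_to_const(subschema: dict) -> dict:
    """Return a copy in which a single-value `enum` is rewritten as `const`."""
    result = dict(subschema)
    choices = result.get("enum")
    if isinstance(choices, list) and len(choices) == 1 and "const" not in result:
        del result["enum"]
        result["const"] = choices[0]
    return result


# The canonicalized form of /properties/kind from the diff above.
canonical_kind = {"enum": ["profile"]}
restored_kind = enum_to_const(canonical_kind)
```

Note that `const` only exists from Draft 6 onward, so this rewrite assumes a Draft 6 or newer dialect.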

@dhower-qc
Collaborator

@kbroch-rivosinc, can we re-run this on main, and then work on getting it in the regression flow?

@kbroch-rivosinc
Collaborator Author

> @kbroch-rivosinc, can we re-run this on main, and then work on getting it in the regression flow?

Sure. Have you considered putting it in the pre-commit tool flow instead? We could put it in as a local hook similar to shellcheck if there isn't an existing hook defined.

@dhower-qc
Collaborator

> Sure. Have you considered putting it in the pre-commit tool flow instead? We could put it in as a local hook similar to shellcheck if there isn't an existing hook defined.

I was already assuming it was there, so yes ;)

@kbroch-rivosinc
Collaborator Author

> > Sure. Have you considered putting it in the pre-commit tool flow instead? We could put it in as a local hook similar to shellcheck if there isn't an existing hook defined.
>
> I was already assuming it was there, so yes ;)

Right, pre-commit runs in regression-precommit :). OK, I started with this: sourcemeta/jsonschema#263

Otherwise, I'll do a local hook(s).

@jviotti

jviotti commented Mar 26, 2025

Hey there! Author of the Sourcemeta CLI here!

> we do actually (currently) depend on both "type" and "const" being present in some cases. For example, when printing the type of a parameter in generated documentation, we always look at "type" rather than calculating the inferred type from "const".

That is interesting. Which tool do you use? In general, type does nothing when the const keyword is already present (which takes precedence over the former), hence the warning.

If it is a blocker, we should definitely support a way to disable certain rules, which we don't support at the moment, as 99% of people out there seemed fine with the defaults.

> I tried canonicalize and got something else entirely

Yes, this is something else, and we deprecated that functionality as it's mostly for internal use. The idea is to denormalise schemas for advanced static analysis.


Overall, happy to work and collaborate on anything you guys need to make the CLI a good fit for your use case. Just let me know about anything that is problematic right now! As a member of the JSON Schema TSC, I can say we are making a big push to make sure certain tools (like the CLI) solve real users' problems, so any feedback is REALLY appreciated.

@jviotti

jviotti commented Apr 1, 2025

Check out v7.2.0 (https://github.com/sourcemeta/jsonschema/releases/tag/v7.2.0). It adds a --disable option to the lint command so you can disable the const_with_type and enum_with_type rules that conflict with your documentation generator.

@dhower-qc
Collaborator

> That is interesting. Which tool do you use?

It's in UnifiedDB, not a separate tool. It's a place where we are trying to generate an English description of a schema field. We'll use "type" to say something like "it's a string" and "const" to say "that must be either 'blah', 'foo', or 'bar'". We could probably be smarter, but we aren't ;)
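For illustration only, here is a Python sketch of the "smarter" variant: describe a field from `type` when it is declared, and otherwise fall back to the type implied by `const` or `enum`. This is not UnifiedDB's generator; the function names and the wording of the output are assumptions.

```python
import json
from typing import Any


def json_type_of(value: Any) -> str:
    """Map a decoded JSON value to its JSON Schema type name."""
    if value is None:
        return "null"
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def describe(field: dict) -> str:
    """Describe a schema field, inferring the type when `type` is absent."""
    declared = field.get("type")
    if declared is not None:
        types = declared if isinstance(declared, list) else [declared]
    elif "const" in field:
        types = [json_type_of(field["const"])]
    elif "enum" in field:
        types = sorted({json_type_of(v) for v in field["enum"]})
    else:
        return "any value"
    text = " or ".join(types)
    if "const" in field:
        text += f", must be {json.dumps(field['const'])}"
    elif "enum" in field:
        choices = ", ".join(json.dumps(v) for v in field["enum"])
        text += f", must be one of {choices}"
    return text
```

With a fallback along these lines, `describe({"const": "profile"})` and `describe({"type": "string", "const": "profile"})` produce the same text, so the generated documentation would no longer depend on the redundant `type`.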

@dhower-qc
Collaborator

> Check out v7.2.0 (https://github.com/sourcemeta/jsonschema/releases/tag/v7.2.0). It adds a --disable option to the lint command so you can disable the const_with_type and enum_with_type rules that conflict with your documentation generator.

Cool, thanks

@jviotti

jviotti commented Apr 1, 2025

> It's in UnifiedDB, not a separate tool. It's a place where we are trying to generate an English description of a schema field. We'll use "type" to say something like "it's a string" and "const" to say "that must be either 'blah', 'foo', or 'bar'". We could probably be smarter, but we aren't ;)

Fair enough. The new --disable option was a much needed feature, and hopefully accommodates those cases pretty well.

@kbroch-rivosinc kbroch-rivosinc changed the title from "Dev/kbroch/poc sourcemeta jsonschema tool" to "add sourcemeta jsonschema linting to pre-commit and fix problems" Apr 2, 2025
@kbroch-rivosinc kbroch-rivosinc changed the title from "add sourcemeta jsonschema linting to pre-commit and fix problems" to "add sourcemeta jsonschema linting to pre-commit and fix lint issues" Apr 2, 2025
kbroch-rivosinc added a commit that referenced this pull request Apr 2, 2025
@kbroch-rivosinc kbroch-rivosinc force-pushed the dev/kbroch/poc-sourcemeta-jsonschema-tool branch from a8ed35f to c6fee7d Compare April 2, 2025 15:26
kbroch-rivosinc added a commit that referenced this pull request Apr 2, 2025
@kbroch-rivosinc kbroch-rivosinc force-pushed the dev/kbroch/poc-sourcemeta-jsonschema-tool branch 2 times, most recently from 436d2be to b66d9a8 Compare April 2, 2025 17:14
kbroch-rivosinc added a commit that referenced this pull request Apr 2, 2025
kbroch-rivosinc added a commit that referenced this pull request Apr 2, 2025
@kbroch-rivosinc kbroch-rivosinc force-pushed the dev/kbroch/poc-sourcemeta-jsonschema-tool branch from b66d9a8 to f977660 Compare April 2, 2025 17:18
@kbroch-rivosinc
Collaborator Author

@dhower-qc: I'm not sure what's going on here. It looks like jsonschema is installed, but I'm not getting the version that has the --disable feature. I deleted some caches relating to the PR, thinking that was the problem, but that didn't help. What am I missing?

When I run it locally, it works as intended:

[email protected]:~/rvi/non-repo-repos/riscv-unified-db on  dev/kbroch/poc-sourcemeta-jsonschema-tool:main [⇡] via  v22.14.0 via 🐍 v3.13.2 via 💎 v3.2.3
❯ jsonschema | head
JSON Schema CLI - v7.2.3
Usage: jsonschema-darwin-arm64 <command> [arguments...]

Global Options:

   --verbose, -v    Enable verbose output
   --resolve, -r    Import the given JSON Schema (or directory of schemas)
                    into the resolution context

Commands:

[email protected]:~/rvi/non-repo-repos/riscv-unified-db on  dev/kbroch/poc-sourcemeta-jsonschema-tool:main [⇡] via  v22.14.0 via 🐍 v3.13.2 via 💎 v3.2.3
❯ which jsonschema
/Users/kbroch/.local/share/mise/installs/node/22.14.0/bin/jsonschema

[email protected]:~/rvi/non-repo-repos/riscv-unified-db on  dev/kbroch/poc-sourcemeta-jsonschema-tool:main [⇡] via  v22.14.0 via 🐍 v3.13.2 via 💎 v3.2.3
❯ jsonschema | head
JSON Schema CLI - v7.2.3
Usage: jsonschema-darwin-arm64 <command> [arguments...]

Global Options:

   --verbose, -v    Enable verbose output
   --resolve, -r    Import the given JSON Schema (or directory of schemas)
                    into the resolution context

Commands:

[email protected]:~/rvi/non-repo-repos/riscv-unified-db on  dev/kbroch/poc-sourcemeta-jsonschema-tool:main [⇡] via  v22.14.0 via 🐍 v3.13.2 via 💎 v3.2.3
❯ pre-commit run sourcemeta-jsonschema-lint --all-files
sourcemeta-jsonschema-lint...............................................Passed

@dhower-qc
Collaborator

It looks like pre-commit is installing an older version of jsonschema.

You updated jsonschema in package.json, but that only applies if you run pre-commit with the container env (e.g., with ./bin/pre-commit). When you run pre-commit through a GitHub action, it's running in the base container (runs-on: ubuntu-latest).

I think we'll need to either:

  • Update the action in regress.yml to use ./bin/pre-commit instead of pre-commit/[email protected], or
  • Add the jsonschema version to the pre-commit config so it gets the right version

kbroch-rivosinc added a commit that referenced this pull request Apr 3, 2025
@kbroch-rivosinc kbroch-rivosinc force-pushed the dev/kbroch/poc-sourcemeta-jsonschema-tool branch from f977660 to 308d14c Compare April 3, 2025 15:55
kbroch-rivosinc added a commit that referenced this pull request Apr 8, 2025
@kbroch-rivosinc kbroch-rivosinc force-pushed the dev/kbroch/poc-sourcemeta-jsonschema-tool branch from 308d14c to 3db0b1f Compare April 8, 2025 22:20
kbroch-rivosinc added a commit that referenced this pull request Apr 8, 2025
@kbroch-rivosinc kbroch-rivosinc force-pushed the dev/kbroch/poc-sourcemeta-jsonschema-tool branch from c3875e5 to 44d220e Compare April 8, 2025 22:47
kbroch-rivosinc added a commit that referenced this pull request Apr 8, 2025
@kbroch-rivosinc kbroch-rivosinc force-pushed the dev/kbroch/poc-sourcemeta-jsonschema-tool branch from 44d220e to c447deb Compare April 8, 2025 22:56
@kbroch-rivosinc
Collaborator Author

I figured out that this is a namespace collision between the jsonschema Python tool and sourcemeta's tool (the hook ends up calling the Python one). Working on a solution.

@jviotti

jviotti commented Apr 18, 2025

@kbroch-rivosinc Ah, I heard that before. For what it's worth, the jsonschema Python CLI is deprecated (https://github.com/python-jsonschema/jsonschema/blob/main/jsonschema/cli.py#L26) in favor of its new name: check-jsonschema. The jsonschema one was even removed from Homebrew some time ago.

kbroch-rivosinc added a commit that referenced this pull request Apr 18, 2025
@kbroch-rivosinc kbroch-rivosinc force-pushed the dev/kbroch/poc-sourcemeta-jsonschema-tool branch from 9f3c84b to af4185c Compare April 18, 2025 20:46
@kbroch-rivosinc kbroch-rivosinc force-pushed the dev/kbroch/poc-sourcemeta-jsonschema-tool branch from af4185c to 22a00dc Compare April 18, 2025 20:48
@kbroch-rivosinc
Collaborator Author

> @kbroch-rivosinc Ah, I heard that before. For what it's worth, the jsonschema Python CLI is deprecated (https://github.com/python-jsonschema/jsonschema/blob/main/jsonschema/cli.py#L26) in favor of its new name: check-jsonschema. The jsonschema one was even removed from Homebrew some time ago.

Yeah, I'm using check-jsonschema everywhere, but the deprecated jsonschema is still on the path, so it ends up getting called when the sourcemeta jsonschema hook runs. Once it goes from deprecated to removed, I won't have this problem.
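One way to confirm this kind of collision is to list every executable named `jsonschema` on the PATH in lookup order, since the first hit is the one a hook will run. The Python sketch below does that; the helper name `executables_on_path` is an assumption and is not part of either tool. The standard library's `shutil.which` returns only the first match, which is why the loop is written out.

```python
import os
from pathlib import Path
from typing import List, Optional


def executables_on_path(name: str, path: Optional[str] = None) -> List[str]:
    """Return every executable called `name` on PATH, in lookup order."""
    search = os.environ.get("PATH", "") if path is None else path
    found = []
    for directory in search.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = Path(directory) / name
        if candidate.is_file() and os.access(candidate, os.X_OK):
            found.append(str(candidate))
    return found


if __name__ == "__main__":
    # The first line printed is the binary that a bare `jsonschema` resolves to.
    for location in executables_on_path("jsonschema"):
        print(location)
```

If more than one path is printed, the environment has both tools installed under the same name, and the order of the PATH entries decides which one runs.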

@dhower-qc
Collaborator

Circling back here: I tried to install and run sourcemeta jsonschema, but it seemed to hit a bug on our schemas. Closing this PR for now.

@dhower-qc dhower-qc closed this Aug 20, 2025
@jviotti

jviotti commented Aug 20, 2025

@dhower-qc What is the bug you hit and what version of the CLI are you running? Do you have an example I can locally try?

kbroch-rivosinc added a commit that referenced this pull request Aug 20, 2025
@jviotti

jviotti commented Aug 21, 2025

I tried running the JSON Schema CLI v11.1.1 (latest) on the tip of main (04cde30c2d352e1bb63057c917700e695a88be11) based on how you tried running it here (40a4cc8) and it seems to work OK:

$ jsonschema lint --exclude const_with_type spec/schemas --ignore spec/schemas/json-schema-draft-07.json
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/csr_schema.json:
  In Draft 7 and older dialects, keywords sibling to $ref are never evaluated (draft_ref_siblings)
    at schema location ""
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/ext_schema.json:
  In Draft 7 and older dialects, keywords sibling to $ref are never evaluated (draft_ref_siblings)
    at schema location ""
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/inst_schema.json:
  In Draft 7 and older dialects, keywords sibling to $ref are never evaluated (draft_ref_siblings)
    at schema location ""
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/inst_subtype_schema.json:
  In Draft 7 and older dialects, keywords sibling to $ref are never evaluated (draft_ref_siblings)
    at schema location "/properties/name"
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/inst_var_type_schema.json:
  Wrapping any keyword other than `$ref` in `allOf` is unnecessary and may even introduce a minor evaluation performance overhead (unnecessary_allof_wrapper_draft)
    at schema location ""
    - /allOf/0/if
    - /allOf/0/then
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/manual_schema.json:
  In Draft 7 and older dialects, keywords sibling to $ref are never evaluated (draft_ref_siblings)
    at schema location "/properties/state"
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/manual_version_schema.json:
  In Draft 7 and older dialects, keywords sibling to $ref are never evaluated (draft_ref_siblings)
    at schema location "/properties/state"
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/manual_version_schema.json:
  In Draft 7 and older dialects, keywords sibling to $ref are never evaluated (draft_ref_siblings)
    at schema location "/properties/version"
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/proc_cert_class_schema.json:
  Setting `type` alongside `enum` is considered an anti-pattern, as the enumeration choices already imply their respective types (enum_with_type)
    at schema location "/properties/processor_kind"
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/proc_cert_model_schema.json:
  Setting `type` alongside `enum` is considered an anti-pattern, as the enumeration choices already imply their respective types (enum_with_type)
    at schema location "/properties/base"
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/proc_cert_model_schema.json:
  Setting `type` alongside `enum` is considered an anti-pattern, as the enumeration choices already imply their respective types (enum_with_type)
    at schema location "/properties/in_scope_priv_modes/items"
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/profile_family_schema.json:
  In Draft 7 and older dialects, keywords sibling to $ref are never evaluated (draft_ref_siblings)
    at schema location "/properties/description"
/Users/jviotti/Projects/playground/riscv-unified-db/spec/schemas/profile_family_schema.json:
  Setting `type` alongside `enum` is considered an anti-pattern, as the enumeration choices already imply their respective types (enum_with_type)
    at schema location "/properties/processor_kind"

New releases tend to come with many bug fixes, so maybe you are just hitting some oddity from the old v9 version that was fixed already?

As an aside, we have a Google Summer of Code project on the JSON Schema organisation going on right now specifically on the linter, so if you have any comments or any feedback, this is the perfect time! :)
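For readers unfamiliar with the draft_ref_siblings finding in the log above: in Draft 7 and older, a subschema that contains `$ref` is treated as the reference alone, so keywords next to it are ignored. A common workaround is to move the `$ref` into an `allOf` so that the siblings are evaluated. The Python sketch below illustrates both the detection and the rewrite. It is a simplification under stated assumptions: the helper names and the sample subschema are hypothetical, and the linter's actual rule may treat some keywords differently.

```python
from typing import List


def ref_siblings(subschema: dict) -> List[str]:
    """Keywords that sit next to `$ref` in this subschema."""
    if "$ref" not in subschema:
        return []
    return sorted(key for key in subschema if key != "$ref")


def wrap_ref_in_allof(subschema: dict) -> dict:
    """Move `$ref` into `allOf` so that sibling keywords are evaluated."""
    if not ref_siblings(subschema):
        return dict(subschema)
    rest = {key: value for key, value in subschema.items() if key != "$ref"}
    existing = rest.pop("allOf", [])
    return {"allOf": [{"$ref": subschema["$ref"]}, *existing], **rest}


# Hypothetical subschema; the reference target is made up for illustration.
state = {"$ref": "#/definitions/state", "description": "Ratification state"}
siblings = ref_siblings(state)
fixed = wrap_ref_in_allof(state)
```

Wrapping a `$ref` is the case that the unnecessary_allof_wrapper_draft message above carves out, since it only flags `allOf` wrappers around keywords other than `$ref`.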
