Collaborator

@pdgendt pdgendt commented Jan 12, 2026

  • Convert pykwalify schema with jsonschema
  • Remove some manual testing in Python code with schema validation
  • Make name property required in west extension commands (would have raised a KeyError before if it was missing)

Fixes #807


codecov bot commented Jan 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.75%. Comparing base (11302e1) to head (16c003e).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #904      +/-   ##
==========================================
- Coverage   85.95%   85.75%   -0.20%     
==========================================
  Files          11       11              
  Lines        3453     3440      -13     
==========================================
- Hits         2968     2950      -18     
- Misses        485      490       +5     
Files with missing lines Coverage Δ
src/west/commands.py 95.60% <100.00%> (+0.06%) ⬆️
src/west/manifest.py 94.86% <100.00%> (-0.58%) ⬇️

@pdgendt pdgendt force-pushed the json-schema branch 4 times, most recently from 4104153 to b4de220 Compare January 13, 2026 08:27
@pdgendt pdgendt marked this pull request as ready for review January 13, 2026 15:42
@pdgendt pdgendt requested review from kartben and marc-hb January 13, 2026 15:42
Collaborator

@marc-hb marc-hb left a comment

Thanks!
Can this catch unquoted strings that look like numbers in YAML files and force users to quote them?

Just curious because we just saw two instances of this back to back in Zephyr.

I didn't look at the PR yet sorry.

--------------------------------------------------------------
DO NOT CHANGE THIS FILE WITHOUT UPDATING THE DOCUMENTATION!
Contributor

Did you update the documentation (since name is now required)?

Collaborator Author

It "was" already required, as it would throw an error if users did not provide it.

But we probably should bump the manifest version (and docs), so that users could keep compatibility with versions that depend on pykwalify. Will update.

Collaborator

But in order for users to target a version without the tooling change,
we need to bump the schema version.

I don't understand this sorry. Can you elaborate the use case?

If the schema is 100% compatible, why would the schema version be bumped?

MAINTAINERS.rst:   Decide if west.manifest.SCHEMA_VERSION needs an update:
MAINTAINERS.rst-
MAINTAINERS.rst:   - SCHEMA_VERSION should be updated to X.Y if release vX.Y will have manifest
MAINTAINERS.rst-     syntax changes that earlier versions of west cannot parse.
MAINTAINERS.rst-
MAINTAINERS.rst:   - SCHEMA_VERSION should *not* be changed for west vX.Y if the manifest
MAINTAINERS.rst-     syntax is fully compatible with what west vX.(Y-1) can handle.

Contributor

But in order for users to target a version without the tooling change,
we need to bump the schema version.

I don't understand this sorry. Can you elaborate the use case?

If the schema is 100% compatible, why would the schema version be bumped?

MAINTAINERS.rst:   Decide if west.manifest.SCHEMA_VERSION needs an update:
MAINTAINERS.rst-
MAINTAINERS.rst:   - SCHEMA_VERSION should be updated to X.Y if release vX.Y will have manifest
MAINTAINERS.rst-     syntax changes that earlier versions of west cannot parse.
MAINTAINERS.rst-
MAINTAINERS.rst:   - SCHEMA_VERSION should *not* be changed for west vX.Y if the manifest
MAINTAINERS.rst-     syntax is fully compatible with what west vX.(Y-1) can handle.

The schema file has an entirely different syntax so... Yeah :)
It's the semantics that's meant to remain identical.

Collaborator Author

The manifest syntax is the same though? I get Marc's point, but I think it would be beneficial to have different (minor) versions in this case? Not sure how to proceed here.

Contributor

FWIW in Zephyr I ended up picking a different name for the new manifest (one was .yml the other .yaml) - would doing something similar help?

Collaborator Author

I did the same here :)

Collaborator

I submitted some SCHEMA_VERSION clarifications in PR #909. I hope they explain why I think this should not be bumped.

FWIW in Zephyr I ended up picking a different name for the new manifest (one was .yml the other .yaml)

You meant the schema.y[a]ml, right? Not the west.yml manifest which should hopefully not be affected.

If I got this correctly, then I don't see how a mere rename of the schema file affects the SCHEMA_VERSION. It would be a very different story if pykwalify and jsonschema were not 100% compatible with each other! But I don't think the mere name of the schema file matters here in any case.

Collaborator Author

pdgendt commented Jan 13, 2026

Thanks! Can this catch unquoted strings that look like numbers in YAML files and force users to quote them?

Just curious because we just saw two instances of this back to back in Zephyr.

I didn't look at the PR yet sorry.

Not really, no. The version property, for example, currently allows both string and number, to support something like:

version: 0.8

Is this a problem? I think we would break a lot of users if we take this away?

EDIT: Users could already be forced by setting the type to string, no?

Collaborator

marc-hb commented Jan 13, 2026

The version property, for example, currently allows both string and number, to support something like: version: 0.8
Is this a problem?

It depends where. I didn't mean "everywhere always", sorry for the confusion.

It's not a problem at all where the code is ready and expects both. But it has recently been a problem in Zephyr in two places (zephyrproject-rtos/zephyr#101642 and zephyrproject-rtos/zephyr#101888 (comment)) where the code expects a string and the user (or file generator) entered a string that unfortunately looks like a valid number.

I think we would break a lot of users if we take this away?

Again, it depends where. This new parser can also save users time by "failing fast" and giving them a clear error message that they must quote the strings that look like numbers, instead of some obscure stack trace much later.

Collaborator

marc-hb commented Jan 13, 2026

So let's pretend for instance that:

  • A user uses a path that is also a valid integer. If it's not quoted, then yaml.load() will automagically return that integer instead of a string. Very debatable YAML design choice, but it is what it is.
  • The west code "naively" expects a string and a string only. Why would it expect anything else? It's typed as a string in the manifest.

This is exactly what just happened in Zephyr and it could happen anywhere in west too.

In that case, it would be a better experience that jsonschema "fails fast" and pinpoints the problematic integer with a useful quoting suggestion rather than west blowing up much later with a stack trace difficult to make sense of.

Can jsonschema do that? It should, considering how many people on the Internet complain about this common yaml.load() problem... Maybe pykwalify already did that? I don't know, sorry.
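
Editorial sketch of the "fail fast with a quoting hint" idea described in this comment. It is not west, pykwalify, or jsonschema code: the helper name `require_string` is hypothetical, and the `parsed` dictionary only stands in for what a YAML loader would return for an unquoted `remote: 12345` entry.

```python
# Hypothetical helper illustrating a "fail fast" check with a quoting hint.
# No YAML library is imported; `parsed` mimics a loader's result for an
# unquoted scalar that happens to look like an integer.

def require_string(value, location):
    """Return value if it is a str, else raise with a quoting suggestion."""
    if isinstance(value, str):
        return value
    raise ValueError(
        f"{location}: expected a string but got "
        f"{type(value).__name__} {value!r}; "
        f'if a string was intended, quote it: "{value}"'
    )


parsed = {"defaults": {"remote": 12345}}  # unquoted scalar became an int

try:
    require_string(parsed["defaults"]["remote"], "/defaults/remote")
except ValueError as err:
    message = str(err)
    print(message)
```

The check runs right after loading, so the user sees the offending location and a suggested fix instead of a later stack trace from code that assumed a string.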

{ url = "https://files.pythonhosted.org/packages/15/18/b0e1fafe59051de9e79cdd431863b03593ecfa8341c110affad7c8121efc/ruamel.yaml.clib-0.2.14-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:e7cb9ad1d525d40f7d87b6df7c0ff916a66bc52cb61b66ac1b2a16d0c1b07640", size = 764456, upload-time = "2025-09-22T19:51:11.736Z" },
{ url = "https://files.pythonhosted.org/packages/e7/cd/150fdb96b8fab27fe08d8a59fe67554568727981806e6bc2677a16081ec7/ruamel_yaml_clib-0.2.14-cp314-cp314-win32.whl", hash = "sha256:9b4104bf43ca0cd4e6f738cb86326a3b2f6eef00f417bd1e7efb7bdffe74c539", size = 102394, upload-time = "2025-11-14T21:57:36.703Z" },
{ url = "https://files.pythonhosted.org/packages/bd/e6/a3fa40084558c7e1dc9546385f22a93949c890a8b2e445b2ba43935f51da/ruamel_yaml_clib-0.2.14-cp314-cp314-win_amd64.whl", hash = "sha256:13997d7d354a9890ea1ec5937a219817464e5cc344805b37671562a401ca3008", size = 122673, upload-time = "2025-11-14T21:57:38.177Z" },
{ url = "https://files.pythonhosted.org/packages/06/0c/0c411a0ec64ccb6d104dcabe0e713e05e153a9a2c3c2bd2b32ce412166fe/rpds_py-0.30.0-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:679ae98e00c0e8d68a7fda324e16b90fd5260945b45d3b824c892cec9eea3288", size = 370490, upload-time = "2025-11-30T20:21:33.256Z" },
Collaborator

Could wheels be upgraded in a separate, routine PR to make the dependency changes reviewable?

Collaborator Author

It's automated when packages are added/removed, where uv needs to resolve dependencies. I can't move to a separate PR, but I've split the commits as much as possible.

Collaborator

It's automated when packages are added/removed, where uv needs to resolve dependencies.

OK but how come switching from pykwalify to jsonschema changes the wheels version?

I can't move to a separate PR, but I've split the commits as much as possible.

Good enough.

Add the jsonschema package as a project dependency and update uv.lock.

Signed-off-by: Pieter De Gendt <[email protected]>
Collaborator Author

pdgendt commented Jan 14, 2026

So let's pretend for instance that:

  • A user uses a path that is also a valid integer. If it's not quoted, then yaml.load() will automagically return that integer instead of a string. Very debatable YAML design choice, but it is what it is.
  • The west code "naively" expects a string and a string only. Why would it expect anything else? It's typed as a string in the manifest.

This is exactly what just happened in Zephyr and it could happen anywhere in west too.

In that case, it would be a better experience that jsonschema "fails fast" and pinpoints the problematic integer with a useful quoting suggestion rather than west blowing up much later with a stack trace difficult to make sense of.

Can jsonschema do that? It should, considering how many people on the Internet complain about this common yaml.load() problem... Maybe pykwalify already did that? I don't know, sorry.

Both pykwalify and jsonschema would fail fast if a property needs to be of a string type. If you could point to a particular property or an issue, I could maybe try to improve. But it's outside the scope of this PR.

The same can be said for the revision string property, if by coincidence the sha is a number like 123456890, the schema validation would fail and quotes would be needed.

Collaborator

marc-hb commented Jan 14, 2026

Both pykwalify and jsonschema would fail fast if a property needs to be of a string type. If you could point to a particular property or an issue, I could maybe try to improve. But it's outside the scope of this PR.

Thanks! A change about a particular property would be outside the scope of this issue, agreed. I was looking for examples to illustrate my point. I agree this PR should not make a schema change at the same time.

On the other hand, gaining or losing that sort of validation capability in general would definitely be in the scope of this PR! That's what I've been wondering the whole time. I understand you tested that and made sure user feedback is (still) good when jsonschema catches and reports a string that looks like a (floating?!?) number?

Collaborator Author

pdgendt commented Jan 14, 2026

On the other hand, gaining or losing that sort of validation capability in general would definitely be in the scope of this PR! That's what I've been wondering the whole time. I understand you tested that and made sure user feedback is (still) good when jsonschema catches and reports a string that looks like a (floating?!?) number?

Which property are you referring to? If you refer to the version, do you require more testing than that already present in tests/test_manifest.py?

Collaborator

marc-hb commented Jan 14, 2026

Which property are you referring to?

Again, I'm just looking for examples, I'm not interested in any particular property. Apologies for the too deep digression on version.

This is about testing jsonschema itself, not about testing west.

If you refer to the version, do you require more testing than that already present in tests/test_manifest.py?

I wasn't focused on "version". I took a quick look at tests/test_manifest.py and I didn't find anything testing jsonschema (resp. pykwalify) itself. And that's OK, we can hopefully just trust jsonschema. Trust... but verify :-)

Also, I'm interested in the interactive feedback when validation fails. It's not realistic to expect automated testing to guarantee a "user experience" (we're lucky enough when there is any test coverage at all for error handling...).

So, I just went ahead and tested this myself, before and after this PR. As you predicted, the experience is OK in both cases. Neither pykwalify nor jsonschema suggests quoting numbers, but they are close enough:

 zephyr]$ git diff
diff --git a/west.yml b/west.yml
index a11238717de4..4a53b99fa445 100644
--- a/west.yml
+++ b/west.yml
@@ -16,10 +16,10 @@
 
 manifest:
   defaults:
-    remote: upstream
+    remote: 12345
 
   remotes:
-    - name: upstream
+    - name: 12345
       url-base: https://github.com/zephyrproject-rtos

With pykwalify:

west update

FATAL ERROR: Malformed manifest file: zephyr/west.yml 
  Schema file: src/west/manifest-schema.yml
  Hint: Schema validation failed:
 - Value '12345' is not of type 'str'. Path: '/defaults/remote'.
 - Value '12345' is not of type 'str'. Path: '/remotes/0/name'.

With jsonschema:

west update

FATAL ERROR: Malformed manifest file: zephyr/west.yml 
  Schema file: src/west/manifest-schema.yaml
  Hint: 12345 is not of type 'string'

As expected, remote: "12345" works.

Note jsonschema does not give any location hint. Acceptable regression?

- Convert pykwalify schema with jsonschema
- Remove some manual testing in Python code with schema validation
- Make name property required in west extension commands (would have
  raised a KeyError before if it was missing)

Signed-off-by: Pieter De Gendt <[email protected]>
Remove the pykwalify package dependency from the project and update
uv.lock.

Signed-off-by: Pieter De Gendt <[email protected]>
We can update the check for empty group filters using jsonschema instead
of Python code.

Signed-off-by: Pieter De Gendt <[email protected]>
Move mutually exclusive remote/url and repo-path/url to jsonschema.

Signed-off-by: Pieter De Gendt <[email protected]>
While the schema validation tooling has changed, the schema validation
itself hasn't.
But in order for users to target a version without the tooling change,
we need to bump the schema version.

Signed-off-by: Pieter De Gendt <[email protected]>
Collaborator Author

pdgendt commented Jan 15, 2026

Note jsonschema does not give any location hint. Acceptable regression?

Updated the error message so it now provides more details:

$ west update
FATAL ERROR: Malformed manifest file: /home/pdgendt/zephyrproject/zephyr/west.yml
  Schema file: /home/pdgendt/west/src/west/manifest-schema.yaml
  Hint: 12345 is not of type 'string'

Failed validating 'type' in schema['properties']['defaults']['properties']['remote']:
    {'type': 'string'}

On instance['defaults']['remote']:
    12345
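
For readers comparing the two outputs: the older pykwalify message printed locations such as '/remotes/0/name', while the message above shows instance['defaults']['remote']. A minimal sketch of producing the slash-separated form, assuming the validator reports the failing location as a sequence of keys and list indices (to my understanding jsonschema's ValidationError offers this as `absolute_path`, but the helper below takes a plain list, and its name `format_location` is hypothetical):

```python
def format_location(path):
    """Join mapping keys and list indices into a '/a/0/b' style path."""
    return "/" + "/".join(str(part) for part in path)


# Locations taken from the error reports quoted in this thread.
print(format_location(["defaults", "remote"]))
print(format_location(["remotes", 0, "name"]))
```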

Collaborator

@marc-hb marc-hb left a comment

I don't have any alternative to suggest, and the interwebs don't seem to have any either, but I find it sad to discover that jsonschema seems to introduce the very first, 3rd party, mandatory binary dependency for west, correct? This looks like a transparency and security regression to me. The 100+ lines of unreviewable checksums and git pollution in uv.lock are the icing on the cake :-(

submodules:
  oneOf:
    - type: boolean
    - type: array
Collaborator

Nice but that does not seem to match your "status quo" intention, does it?

Move to a separate commit?

additionalProperties: false
properties:
  path:
    type: ["string", "null"]
Collaborator

Why "null"? Does this mean empty?

- type: array
  items:
    type: string
- $ref: "#/$defs/import-object"
Collaborator

As above, a bit cryptic, add a comment?

# If present, a list of project groups to enable and disable. Prefix
# a group name with "-" to disable it; prefix with "+" to enable it.
group-filter:
  $ref: "#/$defs/groups"
Collaborator

Is this just a textual replacement or something smarter? I find the syntax quite cryptic, can you add a one-line comment?

items:
type: string

groups:
Collaborator

Does this have to be at the bottom? It was at the top with pykwalify, and moving it makes the side-by-side (and tedious) comparison significantly more time-consuming.

userdata: {}

$defs:
import-object:
Collaborator

Was this in pykwalify? I could not find it.

Collaborator Author

pdgendt commented Jan 20, 2026

jsonschema seems to introduce the very first, 3rd party, mandatory binary dependency for west, correct? This looks like a transparency and security regression to me.

Is this a deal-breaker?

Collaborator

marc-hb commented Jan 20, 2026

Is this a deal-breaker?

No. Do we have a choice anyway? As you found, pykwalify seems orphaned and there is no alternative, is there?

Collaborator

marc-hb commented Jan 20, 2026

Actually, could we make validation optional?

Something like:

   try:
       import jsonschema
       try:
           jsonschema.validate()
       ...
   except ImportError:
       self.wrn(f"{manifest} not validated, UNSUPPORTED do not report bugs")

I wouldn't suggest making this optional if west were used by Zephyr only. But west wants to be more generic and universal and maybe run on more "exotic" systems that Zephyr development would not be compatible with.
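
Editorial sketch of the optional-validation idea above, written so that it runs whether or not jsonschema is installed. The function name `validate_if_available` and the `module_name` parameter are hypothetical (the parameter only makes the skip path easy to exercise); the one library call used, `jsonschema.validate(instance=..., schema=...)`, is only reached when the import succeeds.

```python
import importlib
import warnings


def validate_if_available(data, schema, module_name="jsonschema"):
    """Validate data against schema when the validator can be imported.

    Returns True if validation ran and False if it was skipped.
    Validation errors raised by the module propagate to the caller.
    """
    try:
        validator = importlib.import_module(module_name)
    except ImportError:
        warnings.warn(
            f"{module_name} is not installed; manifest not validated "
            "(unsupported configuration)"
        )
        return False
    validator.validate(instance=data, schema=schema)
    return True
```

Keeping the import inside the function means the dependency is only needed when validation is actually attempted; the trade-off is that a malformed manifest on such a system fails later and less clearly.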

marc-hb added a commit to marc-hb/west that referenced this pull request Jan 21, 2026
As found in the review of zephyrproject-rtos#904 (migration from pykwalify to jsonschema),
the schema version check can be confusing.

- Add a couple sentences in MAINTAINERS.rst to clearly state what this
  version check actually performs and achieves and when (not) to bump the
  version.

- Rename "min_version" to "manifest_version" and swap the perspective

"min" and "max" are relative terms, ambiguous when losing track of the
point of view. "min_version" stood for
"minimum_west_version_needed_to_read_this_manifest". But this is the
manifest perspective, which is confusing when reading west code which is
the opposite point of view.  A given _version_ of west code is never going
to change when reading it or running it! So, flip the perspective and
look at things from the west point of view when in the west code: rename
the also vague _SCHEMA_VER to _MAX_SUPPORTED_SCHEMA_VER.

Signed-off-by: Marc Herbert <[email protected]>
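
Editorial sketch of the perspective this commit message describes: the tool holds a fixed maximum supported schema version and compares each manifest's declared version against it. The constant value, `parse_version`, and `can_parse` are hypothetical illustrations, not west's actual code or version numbers.

```python
# Illustrative value only; this is not west's real schema version.
MAX_SUPPORTED_SCHEMA_VER = (1, 2)


def parse_version(text):
    """Turn an 'X.Y' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def can_parse(manifest_version):
    """True if the supported schema range covers the manifest's version."""
    return parse_version(manifest_version) <= MAX_SUPPORTED_SCHEMA_VER


print(can_parse("0.8"))  # manifest older than the supported maximum
print(can_parse("1.3"))  # manifest newer than the supported maximum
```

Comparing integer tuples rather than floats keeps "0.10" newer than "0.9", which is one reason a quoted version string is safer than a bare YAML number.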
@bjarki-andreasen bjarki-andreasen left a comment

We need this sooner rather than later: Zephyr is currently failing to run west packages pip --install because it's missing jsonschema.

Collaborator Author

pdgendt commented Feb 6, 2026

We need this sooner rather than later: Zephyr is currently failing to run west packages pip --install because it's missing jsonschema.

The issue is larger: the west packages Zephyr extension relies on whatever https://github.com/zephyrproject-rtos/zephyr/blob/0bb29433349fa8d7c472eddc3e462123a87e5138/scripts/zephyr_module.py has as dependencies. It shouldn't rely on either pykwalify or jsonschema to be present.

Collaborator

marc-hb commented Feb 11, 2026

We need this sooner rather than later: Zephyr is currently failing to run west packages pip --install because it's missing jsonschema.

I don't think this would have been enough because west and Zephyr are separate from each other and you don't always have to upgrade west when you upgrade Zephyr. Sometimes yes, but not systematically and not unnecessarily. They don't have to be "in sync". In other words, it must be possible to combine a cutting-edge Zephyr that has a newer jsonschema dependency with a slightly older west that does not require jsonschema yet.

Anyway, zephyrproject-rtos/zephyr#103671 made neither pykwalify nor jsonschema a hard zephyr_module.py requirement, which is even better.


Actually, could we make validation optional?
Something like...

... which now looks a lot like zephyrproject-rtos/zephyr@a0a71dabf8401ab4d :-) Can we have the same here?

Development

Successfully merging this pull request may close these issues.

Deprecate pyKwalify for YAML validation

5 participants