-
Notifications
You must be signed in to change notification settings - Fork 13
[DATASTD-2510] 6.0.0 JSON schema #347
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces a comprehensive JSON validator for the Ed-Fi Data Standard v6.0.0, consisting of an OpenAPI specification file and a Python validation framework for ensuring JSON files in a data lake conform to the Ed-Fi schemas.
- Implements a complete JSON validation framework with CLI and programmatic APIs
- Provides comprehensive test coverage for validator functionality and CLI operations
- Includes sample data files and example usage scripts for demonstration
Reviewed Changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| json-validator/tests/test_validator.py | Test suite for DataLakeValidator class covering initialization, validation, and schema mapping |
| json-validator/tests/test_cli.py | Test suite for CLI functionality including argument parsing and output formatting |
| json-validator/tests/conftest.py | Test fixtures providing sample OpenAPI specs and temporary data lake structures |
| json-validator/tests/init.py | Test package initialization file |
| json-validator/test_simple.py | Simple demonstration script showing basic validator usage |
| json-validator/sample/ | Sample JSON files for testing with both valid and invalid data examples |
| json-validator/pytest.ini | Pytest configuration with test discovery and reporting settings |
| json-validator/pyproject.toml | Poetry project configuration with dependencies and tool settings |
| json-validator/prompt.md | Documentation explaining the validator requirements and data lake structure |
| json-validator/json_validator/validator.py | Core validator implementation with schema mapping and validation logic |
| json-validator/json_validator/cli.py | Command-line interface for the validator with argument parsing and output formatting |
| json-validator/json_validator/init.py | Package initialization file |
| json-validator/example.py | Example script demonstrating programmatic usage of the validator |
| json-validator/USAGE.md | Comprehensive usage guide with examples and API documentation |
| json-validator/README.md | Project overview and quick start guide |
| json-validator/.gitignore | Git ignore patterns for Python projects |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
It is always fun to see Copilot's review of Copilot's work. I didn't write a line of code in here, except for two lines adjusted in |
eng/json-validator/sample/tpdm/candidates/40ed08411356433d8e5ea259aa0dce86.json
Outdated
Show resolved
Hide resolved
…t to JSON file validation
5d1908e to
89798dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 7 out of 10 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| data: Full OpenAPI specification data | ||
|
|
||
| Returns: | ||
| Filtered dictionary with only openapi, info, and components sections, and unnecessary fields removed from each $.component.schemas component. |
Copilot
AI
Nov 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description mentions removing fields from $.component.schemas, but the correct JSONPath would be $.components.schemas (with an 's' in components). This is a minor documentation inconsistency.
| Filtered dictionary with only openapi, info, and components sections, and unnecessary fields removed from each $.component.schemas component. | |
| Filtered dictionary with only openapi, info, and components sections, and unnecessary fields removed from each $.components.schemas component. |
| except Exception as e: | ||
| logging.error(f"Error saving YAML to {file_path}: {e}") | ||
| raise |
Copilot
AI
Nov 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function uses bare except Exception at line 129 and 195. According to PEP 8 and the custom coding guideline, explicit exception handling is preferred. Consider catching specific exceptions like IOError or OSError for file operations instead of the generic Exception.
| except Exception as e: | ||
| logging.error(f"Failed to convert {json_file}: {e}") | ||
| error_count += 1 |
Copilot
AI
Nov 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function uses except Exception which is too broad. According to the custom coding guideline, explicit exception handling is preferred. Consider catching specific exceptions for each operation (e.g., FileNotFoundError, ValueError) rather than the generic Exception.
| except Exception as e: | ||
| logging.error(f"Conversion failed: {e}") | ||
| print(f"Error: {e}") | ||
| sys.exit(1) |
Copilot
AI
Nov 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function uses except Exception which is too broad. According to the custom coding guideline, explicit exception handling is preferred. This catches all exceptions including KeyboardInterrupt (which is handled separately at line 263), making the exception handling unclear.
| The following sections are **excluded** from the output: | ||
|
|
||
| * `paths` - API endpoints and operations | ||
| * `servers` - Server configuration | ||
| * `security` - Security scheme definitions | ||
| * `tags` - Tag metadata | ||
| * `externalDocs` - External documentation | ||
|
|
Copilot
AI
Nov 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation states that the script excludes components such as parameters, securitySchemes, and responses from the output. However, the actual filtering in the code (lines 86-88 of json_to_yaml.py) removes these from components, not from the top-level. The wording "The following sections are excluded" should clarify these are sub-sections within components, not top-level sections.
| high-level metadata and the component schema definitions. They should be treated | ||
| as valid schema definitions for JSON files, not as formal Ed-Fi API | ||
| Specification files. It should also be noted that this format is closely related | ||
| to [JSON Schema](https://json-schema.org/), but these are not compliant JSON Schema files. |
Copilot
AI
Nov 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The statement "It should also be noted that this format is closely related to JSON Schema, but these are not compliant JSON Schema files" could be clearer. Consider specifying what makes them non-compliant or how they differ from standard JSON Schema to help users understand the limitations.
| to [JSON Schema](https://json-schema.org/), but these are not compliant JSON Schema files. | |
| to [JSON Schema](https://json-schema.org/), but these are not compliant JSON Schema files. For example, they may omit certain JSON Schema keywords (such as `additionalProperties`, `patternProperties`, or complex validation rules), and are intended for basic structural validation rather than full JSON Schema validation. As such, they may not work with all JSON Schema validators or tools. |
| description = "Convert JSON files to YAML format with proper formatting and validation" | ||
| authors = ["Ed-Fi Alliance <[email protected]>"] | ||
| readme = "README.md" | ||
| packages = [{include = "json_to_yaml.py", format = "sdist"}] |
Copilot
AI
Nov 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The packages configuration uses format = "sdist" which restricts the package to source distributions only. For a script-based tool, consider using {include = "json_to_yaml.py"} without the format restriction, or properly structuring as a package. This configuration may prevent wheel builds.
| packages = [{include = "json_to_yaml.py", format = "sdist"}] | |
| packages = [{include = "json_to_yaml.py"}] |
mragonech
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving the general concept of the utility. The Json and and yaml are new to me, so I'll relay on the developer's expertise for the code
Contains modified Open API specification files for the core Ed-Fi Data Standard v6.0.0, and a Python script to help create this file from the ODS/API's
swagger.jsonoutput.