Commit 175ef95

Merge pull request #14 from linkml/up2
New Value Sets and Infrastructure Improvements
2 parents 223c645 + a29175d commit 175ef95

121 files changed, +59265 -15966 lines changed

.claude/hooks/README.md

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+# Claude Code Hooks
+
+This directory contains hooks that integrate with Claude Code to provide automated validation and other features.
+
+## validate_schema_hook.py
+
+This hook automatically validates LinkML schema files when they are written or edited using Claude Code.
+
+### Features
+- Automatically runs validation when saving YAML files in the schema directory
+- Blocks file modifications if validation fails
+- Shows detailed validation output in the Claude Code interface
+- Filters out noise from warning messages for cleaner output
+- Uses the project's `just validate-schema PATH` command for validation
+
+### How it works
+1. Intercepts Write, Edit, and MultiEdit operations on YAML files containing "schema" in the path
+2. Runs the validation command: `just validate-schema <file>`
+3. Displays validation results with filtered output for readability
+4. Returns exit code 2 to block the operation if validation fails
+
+### Validation Command
+The hook uses the existing `just validate-schema PATH` command which:
+- Validates ontology mappings in enum definitions
+- Checks for label mismatches between expected and actual ontology terms
+- Uses the configured OAK adapters for strict validation of configured prefixes
+- Treats label mismatches as errors for configured ontologies (NCIT, GO, CHEBI, etc.)
+
+### Configuration
+The hook is configured in `.claude/settings.json` as a PostToolUse hook that runs after Write, Edit, and MultiEdit operations.
+
+### Testing
+You can test the hook by editing any schema file and seeing if validation runs automatically. The hook will:
+- ✅ Allow valid schema modifications
+- ❌ Block invalid schema modifications with validation errors
+- 📋 Show helpful validation output including ontology label mismatches
+
+### Exit Codes
+- **Exit 0**: Validation passed, allow operation
+- **Exit 2**: Validation failed, block operation (see [Claude Code hooks documentation](https://docs.claude.com/en/docs/claude-code/hooks#exit-code-2-behavior))
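
The scope check in step 1 of "How it works" can be sketched as a small pure function, which makes the filter easy to test without running Claude Code. This is a minimal sketch, not the hook itself: `should_validate`, `WATCHED_TOOLS`, and the example path `src/schema/example.yaml` are hypothetical names chosen for illustration, and the payload fields `tool_name` and `tool_input.file_path` are the ones the hook script in this commit reads from stdin.

```python
import json

# Tool names the README says the hook intercepts.
WATCHED_TOOLS = {"Write", "Edit", "MultiEdit"}


def should_validate(payload: dict) -> bool:
    """Return True when the hook payload targets a schema YAML file."""
    tool_name = payload.get("tool_name", "")
    file_path = payload.get("tool_input", {}).get("file_path", "")
    return (
        tool_name in WATCHED_TOOLS
        and file_path.endswith(".yaml")
        and "schema" in file_path
    )


# Hypothetical payload, shaped like the JSON the hook reads from stdin.
example = json.loads(
    '{"tool_name": "Edit", "tool_input": {"file_path": "src/schema/example.yaml"}}'
)
in_scope = should_validate(example)
```

Under this check, files ending in `.yml` are skipped, and any path containing the substring `schema` counts as being in scope.
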
.claude/hooks/validate_schema_hook.py

Lines changed: 124 additions & 0 deletions
@@ -0,0 +1,124 @@
+#!/usr/bin/env python3
+"""
+Hook to automatically validate LinkML schema files after they are written or edited.
+This hook runs `just validate-schema PATH` and displays the results to provide immediate feedback.
+
+**NOTE**
+
+Be sure to exit with code 2 if you want to block the operation.
+https://docs.claude.com/en/docs/claude-code/hooks#exit-code-2-behavior
+"""
+
+import sys
+import json
+import subprocess
+import os
+from pathlib import Path
+
+
+def main():
+    # Read the hook input from stdin
+    data = json.load(sys.stdin)
+
+    # Extract the file path from the tool input
+    tool_name = data.get("tool_name", "")
+    file_path = data.get("tool_input", {}).get("file_path", "")
+
+    # Only process Write and Edit tool calls
+    if tool_name not in ["Write", "Edit", "MultiEdit"]:
+        sys.exit(0)
+
+    # Check if this is a YAML file in the schema directory
+    if not file_path.endswith(".yaml") or "schema" not in file_path:
+        sys.exit(0)
+
+    # Convert to Path object for easier manipulation
+    file_path = Path(file_path)
+
+    # Check if the file exists (it should after Write/Edit)
+    if not file_path.exists():
+        print(f"⚠️ File not found: {file_path}", file=sys.stderr)
+        sys.exit(0)
+
+    # Run the validation command
+    try:
+        # Build the validation command
+        cmd = ["just", "validate-schema", str(file_path)]
+
+        # Run the command and capture output
+        result = subprocess.run(
+            cmd,
+            capture_output=True,
+            text=True,
+            cwd=os.path.dirname(
+                os.path.dirname(os.path.dirname(__file__))
+            ),  # Project root
+        )
+
+        # Display the validation output
+        print("\n" + "=" * 60, file=sys.stderr)
+        print(f"🔍 Schema Validation Results for {file_path.name}", file=sys.stderr)
+        print("=" * 60, file=sys.stderr)
+
+        # Show stdout (the actual validation results)
+        if result.stdout:
+            # Filter out noise from warning messages
+            lines = result.stdout.split("\n")
+            filtered_lines = []
+            for line in lines:
+                # Filter out common noise patterns
+                if any(pattern in line for pattern in [
+                    "/eutils/__init__.py",
+                    "UserWarning",
+                    "pkg_resources is deprecated",
+                    "RuntimeWarning: 'src.valuesets.validators.enum_evaluator'",
+                    "found in sys.modules after import"
+                ]):
+                    continue
+                else:
+                    filtered_lines.append(line)
+
+            output = "\n".join(filtered_lines).strip()
+            if output:
+                print(output, file=sys.stderr)
+
+        # Show any errors
+        if result.returncode != 0 and result.stderr:
+            # Filter stderr similarly
+            lines = result.stderr.split("\n")
+            filtered_lines = []
+            for line in lines:
+                if not any(pattern in line for pattern in [
+                    "/eutils/__init__.py",
+                    "UserWarning",
+                    "pkg_resources is deprecated"
+                ]):
+                    filtered_lines.append(line)
+
+            error_output = "\n".join(filtered_lines).strip()
+            if error_output:
+                print("\n⚠️ Schema validation errors:", file=sys.stderr)
+                print(error_output, file=sys.stderr)
+
+        print("=" * 60 + "\n", file=sys.stderr)
+
+        # Return non-zero exit code if validation failed
+        if result.returncode != 0:
+            print("❌ Schema validation failed - blocking file modification", file=sys.stderr)
+            print("Fix validation errors before saving the file.", file=sys.stderr)
+            sys.exit(2)  # Block the operation
+
+    except subprocess.CalledProcessError as e:
+        print(f"❌ Failed to run schema validation: {e}", file=sys.stderr)
+        sys.exit(2)  # Block on validation errors
+    except Exception as e:
+        print(f"❌ Unexpected error during schema validation: {e}", file=sys.stderr)
+        # Block on hook failures to ensure schema integrity
+        sys.exit(2)
+
+    # Exit 0 if validation passed
+    sys.exit(0)
+
+
+if __name__ == "__main__":
+    main()
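
The hook filters stdout and stderr with two near-identical loops. One possible way to share that logic is a single helper, sketched below. This is an illustration under assumptions, not part of the commit: `filter_noise` and `NOISE_PATTERNS` are hypothetical names, the pattern list is the three entries common to both loops, and the sample text is made up rather than real validator output.

```python
# Substrings that mark a line as noise; taken from the patterns shared by
# both filter loops in the hook.
NOISE_PATTERNS = (
    "/eutils/__init__.py",
    "UserWarning",
    "pkg_resources is deprecated",
)


def filter_noise(text: str, patterns=NOISE_PATTERNS) -> str:
    """Drop every line containing any noise pattern, then trim whitespace."""
    kept = [
        line
        for line in text.split("\n")
        if not any(pattern in line for pattern in patterns)
    ]
    return "\n".join(kept).strip()


# Made-up sample: one noisy warning line followed by one meaningful line.
sample = "\n".join(
    [
        "site-packages/eutils/__init__.py:5: UserWarning: pkg_resources is deprecated",
        "ERROR: label mismatch for EXAMPLE:0000001",
        "",
    ]
)
cleaned = filter_noise(sample)
```

With a helper like this, the stdout branch could pass a longer pattern tuple that also includes the two `RuntimeWarning` entries, while the stderr branch keeps the default.
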

.claude/settings.json

Lines changed: 13 additions & 0 deletions
@@ -10,5 +10,18 @@
       "WebSearch",
       "Write"
     ]
+  },
+  "hooks": {
+    "PostToolUse": [
+      {
+        "matcher": "Edit|MultiEdit|Write",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/validate_schema_hook.py"
+          }
+        ]
+      }
+    ]
   }
 }
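
To check which commands a settings file registers for a given hook event, the nested structure above can be walked with plain JSON parsing. The sketch below is only an illustration: `commands_for` and `SETTINGS_JSON` are hypothetical names, and the embedded JSON reproduces just the `hooks` block added in this diff rather than the full settings file.

```python
import json

# Only the "hooks" block from the diff; the rest of the settings file is omitted.
SETTINGS_JSON = """
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|MultiEdit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/validate_schema_hook.py"
          }
        ]
      }
    ]
  }
}
"""


def commands_for(settings: dict, event: str) -> list[tuple[str, str]]:
    """Return (matcher, command) pairs registered for one hook event."""
    pairs = []
    for entry in settings.get("hooks", {}).get(event, []):
        for hook in entry.get("hooks", []):
            if hook.get("type") == "command":
                pairs.append((entry.get("matcher", ""), hook["command"]))
    return pairs


registered = commands_for(json.loads(SETTINGS_JSON), "PostToolUse")
```

Here `registered` holds a single pair: the `Edit|MultiEdit|Write` matcher and the path to `validate_schema_hook.py`.
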

CONTRIBUTING.md

Lines changed: 17 additions & 0 deletions
@@ -51,6 +51,23 @@ Please use our [Discussions forum][discussions] to ask general questions or cont
 
 Please submit a [Pull Request][pulls] to submit a new term for consideration.
 
+#### Term Caching System
+
+This project uses an ontology term caching system to improve validation performance and reduce external API calls. When you contribute new ontology mappings:
+
+1. **Cache Updates**: Adding new ontology mappings may result in changes to the cache files in the `cache/` directory
+2. **Include Cache Changes**: These cache updates should be included in your Pull Request
+3. **Validation Process**: Run `just validate` before submitting to ensure all ontology mappings are valid
+4. **Cache Structure**: The cache organizes terms by ontology prefix (e.g., `cache/ncit/`, `cache/vo/`) for efficient lookup
+
+**Standard Operating Procedure for Contributors:**
+
+- When adding enums with `meaning:` annotations pointing to ontology terms:
+  - Run validation locally with `just validate`
+  - Include any generated cache files in your commit
+  - Ensure all ontology IDs are correct (never guess - use [OLS](https://www.ebi.ac.uk/ols4/) to verify)
+  - Follow the project's naming conventions (e.g., `UPPER_CASE` for enum values)
+
 <a id="best-practices"></a>
 
 ## Best Practices
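
The per-prefix cache layout described under "Cache Structure" suggests a simple mapping from a CURIE to the file a contributor should expect to change. The sketch below assumes the prefix is lowercased to form the directory name and that each directory holds a `terms.csv`, as `cache/afo/terms.csv` in this commit does; `cache_file_for` is a hypothetical helper, not a function from the project.

```python
from pathlib import PurePosixPath


def cache_file_for(curie: str, cache_root: str = "cache") -> PurePosixPath:
    """Map a CURIE such as 'AFO:AFQ_0000112' to its per-prefix cache file."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    # Assumption: directory names are the lowercased ontology prefix.
    return PurePosixPath(cache_root) / prefix.lower() / "terms.csv"


path = cache_file_for("AFO:AFQ_0000112")
```

A contributor adding an `AFO` mapping could use this to confirm that the corresponding cache file shows up in `git status` before committing.
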

cache/afo/terms.csv

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+curie,label,retrieved_at
+AFO:AFQ_0000112,,2025-10-19T08:36:02.778895
+AFO:AFQ_0000113,,2025-10-19T08:36:02.779656
+AFO:AFQ_0000114,,2025-10-19T08:36:02.779835
+AFO:AFQ_0000115,,2025-10-19T08:36:02.779968
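
All four cached AFO rows have an empty `label` column. A short script can list such rows so they can be reviewed before the cache is committed. This is a sketch using only the standard library; `curies_without_label` is a hypothetical helper, and reading an empty label as "the lookup returned no label" is an assumption, not something the commit states.

```python
import csv
import io

# The cache file contents exactly as added in this commit.
CACHE_CSV = """curie,label,retrieved_at
AFO:AFQ_0000112,,2025-10-19T08:36:02.778895
AFO:AFQ_0000113,,2025-10-19T08:36:02.779656
AFO:AFQ_0000114,,2025-10-19T08:36:02.779835
AFO:AFQ_0000115,,2025-10-19T08:36:02.779968
"""


def curies_without_label(csv_text: str) -> list[str]:
    """Return the CURIEs whose label column is empty or whitespace."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [row["curie"] for row in rows if not row["label"].strip()]


unlabeled = curies_without_label(CACHE_CSV)
```

For a file on disk, the same function works on the text returned by `Path("cache/afo/terms.csv").read_text()`.
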
