Commit 6e826fd

Add a github workflow for updating language data
* Also update the Readme
1 parent 8cefa89 commit 6e826fd

2 files changed, 191 additions and 5 deletions
Lines changed: 171 additions & 0 deletions
@@ -0,0 +1,171 @@
+name: Update Embedded Writing System Data
+
+on:
+  workflow_dispatch:
+    inputs:
+      use_staging:
+        description: 'Use SLDR staging data for testing'
+        required: false
+        default: false
+        type: boolean
+      update_langtags:
+        description: 'Update langtags.json'
+        required: false
+        default: true
+        type: boolean
+      update_iana:
+        description: 'Update ianaSubtagRegistry.txt'
+        required: false
+        default: true
+        type: boolean
+
+env:
+  LANGTAGS_PRODUCTION_URL: "https://ldml.api.sil.org/index.html?query=langtags&ext=json"
+  LANGTAGS_STAGING_URL: "https://ldml.api.sil.org/index.html?query=langtags&ext=json&staging=1"
+  IANA_URL: "https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry"
+
+jobs:
+  update-langtags:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          token: ${{ secrets.GITHUB_TOKEN }}
+          fetch-depth: 0
+
+      - name: Setup .NET
+        uses: actions/setup-dotnet@v4
+        with:
+          dotnet-version: '8.0.x' # Adjust version as needed
+
+      - name: Download latest langtags.json
+        id: langtags
+        run: |
+          if [[ "${{ github.event.inputs.use_staging }}" == "true" ]]; then
+            url="$LANGTAGS_STAGING_URL"
+          else
+            url="$LANGTAGS_PRODUCTION_URL"
+          fi
+          echo "Downloading from: $url"
+          curl -f -o "langtags.json.new" "$url"
+
+          # Validate JSON format
+          if ! jq empty "langtags.json.new" 2>/dev/null; then
+            echo "Error: Downloaded file is not valid JSON"
+            exit 1
+          fi
+
+          echo "Successfully downloaded and validated JSON file"
+          echo "url=$url" >> $GITHUB_OUTPUT
+
+      - name: Download latest iana language-subtag-registry
+        run: |
+          echo "Downloading from: ${{ env.IANA_URL }}"
+          curl -f -o "ianaSubtagRegistry.txt.new" "$IANA_URL"
+
+          # Validate registry format: the file is plain text (not JSON) and starts with a File-Date record
+          if ! head -n 1 "ianaSubtagRegistry.txt.new" | grep -q '^File-Date: '; then
+            echo "Error: Downloaded file does not look like an IANA subtag registry"
+            exit 1
+          fi
+
+          echo "Successfully downloaded and validated registry file"
+
+      - name: Check for changes
+        id: changes
+        run: |
+          if ! cmp -s "SIL.WritingSystems/Resources/langtags.json" "langtags.json.new"; then
+            echo "Changes detected in langtags.json"
+            echo "has_changes=true" >> $GITHUB_OUTPUT
+          fi
+          if ! cmp -s "SIL.WritingSystems/Resources/ianaSubtagRegistry.txt" "ianaSubtagRegistry.txt.new"; then
+            echo "Changes detected in ianaSubtagRegistry.txt"
+            echo "has_changes=true" >> $GITHUB_OUTPUT
+          fi
+
+      - name: Update data files
+        if: steps.changes.outputs.has_changes == 'true'
+        run: |
+          mv "langtags.json.new" "SIL.WritingSystems/Resources/langtags.json"
+          mv "ianaSubtagRegistry.txt.new" "SIL.WritingSystems/Resources/ianaSubtagRegistry.txt"
+
+      - name: Restore, Build, and Test SIL.WritingSystems.Tests
+        if: steps.changes.outputs.has_changes == 'true'
+        run: |
+          echo "## Test Summary" >> $GITHUB_STEP_SUMMARY
+          echo "Restoring, building, and testing SIL.WritingSystems.Tests..." >> $GITHUB_STEP_SUMMARY
+          dotnet restore
+          dotnet build SIL.WritingSystems.Tests/SIL.WritingSystems.Tests.csproj --configuration Release
+          if dotnet test SIL.WritingSystems.Tests/SIL.WritingSystems.Tests.csproj --no-build --configuration Release --logger trx --results-directory TestResults; then
+            echo "✅ Tests passed." >> $GITHUB_STEP_SUMMARY
+          else
+            echo "❌ Tests failed." >> $GITHUB_STEP_SUMMARY
+            exit 1
+          fi
+
+      - name: Configure Git as triggering user
+        if: steps.changes.outputs.has_changes == 'true'
+        run: |
+          ACTOR="${{ github.actor }}"
+          EMAIL="${ACTOR}@users.noreply.github.com"
+          git config --local user.name "$ACTOR"
+          git config --local user.email "$EMAIL"
+
+      - name: Create Pull Request
+        if: steps.changes.outputs.has_changes == 'true'
+        uses: peter-evans/create-pull-request@v5
+        with:
+          base: main-copy-for-testing
+          token: ${{ secrets.GITHUB_TOKEN }}
+          commit-message: |
+            Update embedded writing system data
+
+            - Updated: $(date -u '+%Y-%m-%d %H:%M:%S UTC')
+            - langtags.json: ${{ github.event.inputs.update_langtags == 'true' && 'Updated' || 'No changes' }}
+            - ianaSubtagRegistry.txt: ${{ github.event.inputs.update_iana == 'true' && 'Updated' || 'No changes' }}
+            - SLDR staging: ${{ github.event.inputs.use_staging }}
+            - All WritingSystems tests passed
+          title: "Update embedded writing system data"
+          body: |
+            ## Automated Writing System Data Update
+
+            This PR updates the embedded writing system data files as described in `SIL.WritingSystems/Readme.md`.
+
+            **Workflow Run:** [View Summary](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})
+
+            **Files Updated:**
+            - langtags.json: ${{ github.event.inputs.update_langtags == 'true' && format('✅ Updated from {0}', steps.langtags.outputs.url) || '⏭️ Skipped' }}
+            - ianaSubtagRegistry.txt: ${{ github.event.inputs.update_iana == 'true' && '✅ Updated from https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry' || '⏭️ Skipped' }}
+
+            **Next Steps:**
+            - Review the changes
+            - Run any additional manual tests if needed
+            - Merge when ready
+          branch: update-writing-system-data
+          delete-branch: true
+
+      - name: Create summary
+        if: always()
+        run: |
+          echo "## Writing System Data Update Summary" >> $GITHUB_STEP_SUMMARY
+          echo "- **SLDR Staging Mode**: ${{ github.event.inputs.use_staging }}" >> $GITHUB_STEP_SUMMARY
+          echo "- **Changes Detected**: ${{ steps.changes.outputs.has_changes }}" >> $GITHUB_STEP_SUMMARY
+          echo "- **Update langtags.json**: ${{ github.event.inputs.update_langtags }}" >> $GITHUB_STEP_SUMMARY
+          echo "- **Update ianaSubtagRegistry.txt**: ${{ github.event.inputs.update_iana }}" >> $GITHUB_STEP_SUMMARY
+
+          if [[ "${{ steps.changes.outputs.has_changes }}" == "true" ]]; then
+            echo "- **Action Taken**: Files updated, tests run, PR created" >> $GITHUB_STEP_SUMMARY
+          else
+            echo "- **Action Taken**: No changes detected, no updates needed" >> $GITHUB_STEP_SUMMARY
+          fi
+
+          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "### Data Sources" >> $GITHUB_STEP_SUMMARY
+          if [[ "${{ github.event.inputs.update_langtags }}" == "true" ]]; then
+            echo "- **langtags.json**: ${{ steps.langtags.outputs.url }}" >> $GITHUB_STEP_SUMMARY
+          fi
+          if [[ "${{ github.event.inputs.update_iana }}" == "true" ]]; then
+            echo "- **ianaSubtagRegistry.txt**: https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry" >> $GITHUB_STEP_SUMMARY
+          fi
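
The download logic of this workflow can be tried locally before dispatching it. The following is a minimal sketch of the intended behaviour, not part of the commit: `select_langtags_url` and `looks_like_iana_registry` are hypothetical helper names, the URLs are copied from the workflow's `env` block, and the `File-Date:` check assumes the registry keeps starting with a `File-Date` record.

```shell
# Hypothetical helpers sketching the workflow's download logic; not part of the commit.

LANGTAGS_PRODUCTION_URL="https://ldml.api.sil.org/index.html?query=langtags&ext=json"
LANGTAGS_STAGING_URL="https://ldml.api.sil.org/index.html?query=langtags&ext=json&staging=1"

# Print the langtags URL for the given use_staging value ("true" selects staging).
select_langtags_url() {
  if [ "$1" = "true" ]; then
    echo "$LANGTAGS_STAGING_URL"
  else
    echo "$LANGTAGS_PRODUCTION_URL"
  fi
}

# Succeed only if the first line of file $1 is a File-Date record
# (assumed layout of the IANA registry, which is plain text, not JSON).
looks_like_iana_registry() {
  head -n 1 "$1" | grep -q '^File-Date: '
}

# Usage (requires network access and curl):
#   curl -f -o langtags.json.new "$(select_langtags_url false)"
```

Keeping the URL choice in one small function makes the staging/production branch easy to check in isolation, which is where a swapped condition would otherwise go unnoticed.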

SIL.WritingSystems/Readme.md

Lines changed: 20 additions & 5 deletions
@@ -1,17 +1,32 @@
-## SIL.WritingSystems Library
+# SIL.WritingSystems Library
 
 This library contains many classes that make working with writing systems and language tags easier
 
-### Updating langtags.json
+## SIL Locale Data Repository
 
-To update langtags.json to the latest follow the following steps:
+Much of the writing system data that this library provides comes from the [SIL Locale Data Repository (SLDR)](https://github.com/silnrsi/sldr?tab=readme-ov-file#sil-locale-data-repository-sldr).
+To test with updated SLDR data from the staging area, set the environment variable
+
+`SLDR_USE_STAGING=true`
+
+## Updating embedded writing system and language data
+
+A GitHub Actions workflow can be run to update the `langtags.json` and `ianaSubtagRegistry.txt` files, which are embedded in the library.
+It downloads the latest versions and, after a successful run of the WritingSystems tests, creates a PR to update both files.
+
+### langtags.json
+
+The list of language tag identifiers is curated by the Writing Systems Technology group and provided in a `langtags.json` file.
+The embedded copy is used as the final fallback in case of problems with the data served from https://ldml.api.sil.org/langtags.json
+
+To manually update langtags.json to the latest version, follow these steps:
 
 1. Run the unit test suite by hand and note (or fix) any failures to ByHand and SkipOnTeamCity category tests
 1. Replace `Resources\langtags.json` with the content from https://ldml.api.sil.org/langtags.json
 1. Run the unit test suite by hand and fix any tests that relied on old langtags data
 1. Commit the changes
 
-### Updating ianaSubtagRegistry.txt
-To update ianaSubtagRegistry.txt to the latest, replace `Resources\ianaSubtagRegistry.txt` with
+### ianaSubtagRegistry.txt
+To manually update ianaSubtagRegistry.txt to the latest version, replace `Resources\ianaSubtagRegistry.txt` with
 the content from https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
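
The manual replacement described in the Readme amounts to "download, compare, replace". A minimal sketch follows, assuming it is run from the repository root on a system with `cmp` and `curl`; `replace_if_changed` is a hypothetical helper name and is not part of the repository.

```shell
# Hypothetical helper: replace an embedded resource only when the download differs.
# $1 = freshly downloaded file, $2 = embedded file to update.
replace_if_changed() {
  if cmp -s "$1" "$2"; then
    echo "unchanged: $2"
    rm -f "$1"
  else
    mv "$1" "$2"
    echo "updated: $2"
  fi
}

# Usage from the repository root (requires network access and curl):
#   curl -f -o ianaSubtagRegistry.txt.new \
#     "https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry"
#   replace_if_changed ianaSubtagRegistry.txt.new \
#     SIL.WritingSystems/Resources/ianaSubtagRegistry.txt
```

Comparing before replacing avoids touching the working tree when the upstream data has not changed, so the unit tests only need a rerun after an "updated" result.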