Commit 2bc85b8

Add a github workflow for updating language data
* Also update the Readme
1 parent 8cefa89 commit 2bc85b8

2 files changed: +189 -5 lines changed

Lines changed: 169 additions & 0 deletions
@@ -0,0 +1,169 @@
+name: Update Embedded Writing System Data
+
+on:
+  workflow_dispatch:
+    inputs:
+      use_staging:
+        description: 'Use SLDR staging data for testing'
+        required: false
+        default: false
+        type: boolean
+      update_langtags:
+        description: 'Update langtags.json'
+        required: false
+        default: true
+        type: boolean
+      update_iana:
+        description: 'Update ianaSubtagRegistry.txt'
+        required: false
+        default: true
+        type: boolean
+
+env:
+  LANGTAGS_PRODUCTION_URL: "https://ldml.api.sil.org/index.html?query=langtags&ext=json"
+  LANGTAGS_STAGING_URL: "https://ldml.api.sil.org/index.html?query=langtags&ext=json&staging=1"
+  IANA_URL: "https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry"
+
+jobs:
+  update-langtags:
+    runs-on: windows-latest
+    defaults: { run: { shell: bash } } # run steps use bash syntax; windows-latest defaults to pwsh
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          token: ${{ secrets.GITHUB_TOKEN }}
+          fetch-depth: 0
+
+      - name: Setup .NET
+        uses: actions/setup-dotnet@v4
+        with:
+          dotnet-version: '8.0.x' # Adjust version as needed
+
+      - name: Download latest langtags.json
+        id: langtags
+        run: |
+          if [[ "${{ github.event.inputs.use_staging }}" == "true" ]]; then
+            url="$LANGTAGS_STAGING_URL"
+          else
+            url="$LANGTAGS_PRODUCTION_URL"
+          fi
+          echo "Downloading from: $url"
+          curl -f -o "langtags.json.new" "$url"
+
+          # Validate JSON format
+          if ! jq empty "langtags.json.new" 2>/dev/null; then
+            echo "Error: Downloaded file is not valid JSON"
+            exit 1
+          fi
+
+          echo "Successfully downloaded and validated JSON file"
+          echo "url=$url" >> $GITHUB_OUTPUT
+
+      - name: Download latest iana language-subtag-registry
+        run: |
+          echo "Downloading from: ${{ env.IANA_URL }}"
+          curl -f -o "ianaSubtagRegistry.txt.new" "$IANA_URL"
+
+          # Validate registry format (plain text, not JSON): the first line is "File-Date: YYYY-MM-DD"
+          if ! head -n 1 "ianaSubtagRegistry.txt.new" | grep -q '^File-Date: '; then
+            echo "Error: Downloaded file is not a valid IANA subtag registry"
+            exit 1
+          fi
+
+          echo "Successfully downloaded and validated registry file"
+
+      - name: Check for changes
+        id: changes
+        run: |
+          if ! cmp -s "SIL.WritingSystems/Resources/langtags.json" "langtags.json.new"; then
+            echo "Changes detected in langtags.json"
+            echo "has_changes=true" >> $GITHUB_OUTPUT
+          fi
+          if ! cmp -s "SIL.WritingSystems/Resources/ianaSubtagRegistry.txt" "ianaSubtagRegistry.txt.new"; then
+            echo "Changes detected in ianaSubtagRegistry.txt"
+            echo "has_changes=true" >> $GITHUB_OUTPUT
+          fi
+
+      - name: Update data files
+        if: steps.changes.outputs.has_changes == 'true'
+        run: |
+          mv "langtags.json.new" "SIL.WritingSystems/Resources/langtags.json"
+          mv "ianaSubtagRegistry.txt.new" "SIL.WritingSystems/Resources/ianaSubtagRegistry.txt"
+
+      - name: Restore, Build, and Test SIL.WritingSystems.Tests
+        if: steps.changes.outputs.has_changes == 'true'
+        run: |
+          echo "## Test Summary" >> $GITHUB_STEP_SUMMARY
+          echo "Restoring, building, and testing SIL.WritingSystems.Tests..." >> $GITHUB_STEP_SUMMARY
+          if dotnet test SIL.WritingSystems.Tests/SIL.WritingSystems.Tests.csproj --configuration Release --logger trx --results-directory TestResults; then
+            echo "✅ Tests passed." >> $GITHUB_STEP_SUMMARY
+          else
+            echo "❌ Tests failed." >> $GITHUB_STEP_SUMMARY
+            exit 1
+          fi
+
+      - name: Configure Git as triggering user
+        if: steps.changes.outputs.has_changes == 'true'
+        run: |
+          ACTOR="${{ github.actor }}"
+          EMAIL="${ACTOR}@users.noreply.github.com"
+          git config --local user.name "$ACTOR"
+          git config --local user.email "$EMAIL"
+
+      - name: Create Pull Request
+        if: steps.changes.outputs.has_changes == 'true'
+        uses: peter-evans/create-pull-request@v5
+        with:
+          base: main-copy-for-testing
+          token: ${{ secrets.GITHUB_TOKEN }}
+          commit-message: |
+            Update embedded writing system data
+
+            - Updated: $(date -u '+%Y-%m-%d %H:%M:%S UTC')
+            - langtags.json: ${{ github.event.inputs.update_langtags == 'true' && 'Updated' || 'No changes' }}
+            - ianaSubtagRegistry.txt: ${{ github.event.inputs.update_iana == 'true' && 'Updated' || 'No changes' }}
+            - SLDR staging: ${{ github.event.inputs.use_staging }}
+            - All WritingSystems tests passed
+          title: "Update embedded writing system data"
+          body: |
+            ## Automated Writing System Data Update
+
+            This PR updates the embedded writing system data files as described in `SIL.WritingSystems/Readme.md`.
+
+            **Workflow Run:** [View Summary](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})
+
+            **Files Updated:**
+            - langtags.json: ${{ github.event.inputs.update_langtags == 'true' && format('✅ Updated from {0}', steps.langtags.outputs.url) || '⏭️ Skipped' }}
+            - ianaSubtagRegistry.txt: ${{ github.event.inputs.update_iana == 'true' && '✅ Updated from https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry' || '⏭️ Skipped' }}
+
+            **Next Steps:**
+            - Review the changes
+            - Run any additional manual tests if needed
+            - Merge when ready
+          branch: update-writing-system-data
+          delete-branch: true
+
+      - name: Create summary
+        if: always()
+        run: |
+          echo "## Writing System Data Update Summary" >> $GITHUB_STEP_SUMMARY
+          echo "- **SLDR Staging Mode**: ${{ github.event.inputs.use_staging }}" >> $GITHUB_STEP_SUMMARY
+          echo "- **Changes Detected**: ${{ steps.changes.outputs.has_changes }}" >> $GITHUB_STEP_SUMMARY
+          echo "- **Update langtags.json**: ${{ github.event.inputs.update_langtags }}" >> $GITHUB_STEP_SUMMARY
+          echo "- **Update ianaSubtagRegistry.txt**: ${{ github.event.inputs.update_iana }}" >> $GITHUB_STEP_SUMMARY
+
+          if [[ "${{ steps.changes.outputs.has_changes }}" == "true" ]]; then
+            echo "- **Action Taken**: Files updated, tests run, PR created" >> $GITHUB_STEP_SUMMARY
+          else
+            echo "- **Action Taken**: No changes detected, no updates needed" >> $GITHUB_STEP_SUMMARY
+          fi
+
+          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "### Data Sources" >> $GITHUB_STEP_SUMMARY
+          if [[ "${{ github.event.inputs.update_langtags }}" == "true" ]]; then
+            echo "- **langtags.json**: ${{ steps.langtags.outputs.url }}" >> $GITHUB_STEP_SUMMARY
+          fi
+          if [[ "${{ github.event.inputs.update_iana }}" == "true" ]]; then
+            echo "- **ianaSubtagRegistry.txt**: https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry" >> $GITHUB_STEP_SUMMARY
+          fi
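
The URL selection and change detection performed by the `Download latest langtags.json` and `Check for changes` steps can be tried outside of GitHub Actions. The following is a minimal POSIX shell sketch, not the workflow's actual code: the helper names `select_langtags_url` and `file_changed` are hypothetical, while the two URLs are copied from the workflow's `env` block.

```shell
#!/bin/sh
# Sketch only: select_langtags_url and file_changed are hypothetical helper
# names used for illustration; they do not exist in the repository.

LANGTAGS_PRODUCTION_URL="https://ldml.api.sil.org/index.html?query=langtags&ext=json"
LANGTAGS_STAGING_URL="https://ldml.api.sil.org/index.html?query=langtags&ext=json&staging=1"

# Print the staging URL only when the first argument is exactly "true";
# any other value (including an empty string) selects production.
select_langtags_url() {
  if [ "$1" = "true" ]; then
    printf '%s\n' "$LANGTAGS_STAGING_URL"
  else
    printf '%s\n' "$LANGTAGS_PRODUCTION_URL"
  fi
}

# Succeed (exit status 0) when the two files differ or either file is missing.
# cmp -s is silent and returns non-zero in both of those cases.
file_changed() {
  ! cmp -s "$1" "$2"
}
```

For example, `select_langtags_url true` prints the staging URL, and `file_changed SIL.WritingSystems/Resources/langtags.json langtags.json.new` succeeds only when the downloaded file differs from the embedded copy or the embedded copy is missing.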

SIL.WritingSystems/Readme.md

Lines changed: 20 additions & 5 deletions
@@ -1,17 +1,32 @@
-## SIL.WritingSystems Library
+# SIL.WritingSystems Library
 
 This library contains many classes that make working with writing systems and language tags easier
 
-### Updating langtags.json
+## SIL Locale Data Repository
 
-To update langtags.json to the latest follow the following steps:
+Much of the writing system data that this library provides comes from the [SIL Locale Data repository (SLDR)](https://github.com/silnrsi/sldr?tab=readme-ov-file#sil-locale-data-repository-sldr).
+To test with updated SLDR data from the staging area, set the environment variable:
+
+`SLDR_USE_STAGING=true`
+
+## Updating embedded writing system and language data
+
+There is a GitHub Actions workflow that can be run to update `langtags.json` and `ianaSubtagRegistry.txt`, which are embedded in the library.
+It downloads the latest versions and, after a successful run of the WritingSystems tests, creates a PR to update both files.
+
+### langtags.json
+
+The list of language tag identifiers is curated by the Writing Systems Technology group and provided in a `langtags.json` file.
+The copy embedded in this library is used as the final fallback in case of problems with the data served from https://ldml.api.sil.org/langtags.json
+
+To manually update langtags.json to the latest, follow these steps:
 
 1. Run the unit test suite by hand and note (or fix) any failures to ByHand and SkipOnTeamCity category tests
 1. Replace `Resources\langtags.json` with the content from https://ldml.api.sil.org/langtags.json
 1. Run the unit test suite by hand and fix any tests that relied on old langtags data
 1. Commit the changes
 
-### Updating ianaSubtagRegistry.txt
-To update ianaSubtagRegistry.txt to the latest, replace `Resources\ianaSubtagRegistry.txt` with
+### ianaSubtagRegistry.txt
+To manually update ianaSubtagRegistry.txt to the latest, replace `Resources\ianaSubtagRegistry.txt` with
 the content from https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
 
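
When replacing `Resources\ianaSubtagRegistry.txt` by hand, a quick format check on the downloaded file can catch an HTML error page or a truncated download before it is committed. The sketch below assumes the registry's plain-text layout (a leading `File-Date:` line followed by records separated by `%%` lines). `looks_like_iana_registry` is a hypothetical helper name and not part of this repository.

```shell
#!/bin/sh
# Sketch only: looks_like_iana_registry is a hypothetical helper name.
# Assumption: the registry starts with "File-Date: YYYY-MM-DD" and separates
# its records with lines beginning with "%%".

looks_like_iana_registry() {
  # Check the date header on the first line, then require at least one
  # record separator somewhere in the file.
  head -n 1 "$1" | grep -q '^File-Date: [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]' &&
    grep -q '^%%' "$1"
}
```

A call such as `looks_like_iana_registry ianaSubtagRegistry.txt.new && echo "looks OK"` could be run after the download and before overwriting the embedded file. The check is a sanity test, not a full parse of the registry.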
