Commit b579b88

Merge pull request #128 from Hack23/copilot/analyse-cia-data-download
Fix translations for all 14 languages and complete dynamic statistics implementation
2 parents 3c5051d + dd0310d commit b579b88

28 files changed, +2904 -109 lines changed

.github/skills/iso-27001-controls/SKILL.md

Lines changed: 13 additions & 3 deletions
@@ -84,9 +84,19 @@ Access: Open to all users
 ```
 
 **A.8.11 - Data Masking**
-- ✅ No personal data collected
-- ✅ No cookies or tracking
-- ✅ External links only (no PII stored)
+- ❌ NOT applicable to Riksdagsmonitor
+- **Reason**: A.8.11 applies to masking sensitive data in non-production environments
+- **Context**: Riksdagsmonitor processes only **public government data** (Swedish Offentlighetsprincipen)
+- **Data Type**: Public officials in official capacity (MPs, ministers, voting records, parliamentary documents)
+- **Legal Basis**: GDPR Article 6(1)(e) public interest, Article 9(2)(e) manifestly public political opinions
+- **Journalist Exemption**: Swedish Press Freedom Act (Tryckfrihetsförordningen)
+- **No Masking Needed**: All data is public, no test environments with production data copies
+
+**More Appropriate Controls**:
+- ✅ **A.5.33** - Protection of records (source attribution, audit trails via Git)
+- ✅ **A.5.34** - Privacy and protection of PII (public officials, official capacity only)
+- ✅ **A.8.10** - Information deletion (retention policies, no excessive storage)
+- ✅ **A.8.19** - Security of information in use (HTTPS-only, CSP headers)
 
 **A.8.23 - Web Filtering**
 ```html

.github/skills/osint-methodologies/SKILL.md

Lines changed: 24 additions & 6 deletions
@@ -64,14 +64,32 @@ The **riksdag-regering-mcp** MCP server provides 32 specialized tools:
 ## ISMS Compliance
 
 ### ISO 27001:2022
-- **A.5.10**: Acceptable use (objective, non-partisan)
-- **A.5.33**: Protection of records (source attribution)
-- **A.8.8**: Technical vulnerabilities (API monitoring)
+- **A.5.10**: Acceptable use (objective, non-partisan journalism)
+- **A.5.33**: Protection of records (source attribution, audit trails)
+- **A.5.34**: Privacy and PII (public officials in official capacity only)
+- **A.8.8**: Technical vulnerabilities (API monitoring, dependency scanning)
+- **A.8.10**: Information deletion (documented retention policies)
+- **A.8.19**: Security in use (HTTPS-only, CSP headers)
+
+**Not Applicable**:
+- **A.8.11 (Data Masking)**: NOT applicable - processes only public government data,
+no sensitive data requiring masking, journalist/OSINT platform covering public officials
 
 ### NIST CSF 2.0
-- **ID.AM-5**: Resources prioritized by classification
-- **PR.DS-5**: Protections against data leaks
-- **DE.CM-1**: Network monitored for events
+- **ID.AM-5**: Resources prioritized by classification (PUBLIC data only)
+- **PR.DS-5**: Protections against data leaks (HTTPS-only, public data)
+- **DE.CM-1**: Network monitored for events (CI/CD security scanning)
+
+### GDPR Compliance
+- **Article 6(1)(e)**: Public interest processing (democratic transparency, political accountability)
+- **Article 6(1)(f)**: Legitimate interests (journalistic purposes, transparency)
+- **Article 9(2)(e)**: Political opinions manifestly made public (voting records, party affiliation)
+- **Article 9(2)(g)**: Processing for substantial public interest (journalism exemption)
+
+### Swedish Law
+- **Offentlighetsprincipen**: Constitutional right to access public documents (Tryckfrihetsförordningen 2:1)
+- **Press Freedom Act**: Journalist exemption for covering public officials
+- **Public Officials**: Reduced privacy expectations for official government activities
 
 ## References
 

.github/workflows/update-cia-stats.yml

Lines changed: 197 additions & 0 deletions
@@ -0,0 +1,197 @@
+name: Update CIA Production Statistics
+
+# Schedule: Daily at 03:00 CET (after CIA extraction at 02:57)
+# Manual trigger: Allow manual workflow dispatch for testing
+on:
+  schedule:
+    # Run at 03:00 CET (02:00 UTC winter, 01:00 UTC summer due to DST)
+    # Using 02:00 UTC for consistency (safe buffer after 02:57 extraction)
+    - cron: '0 2 * * *'
+  workflow_dispatch: # Allow manual trigger
+    inputs:
+      force_update:
+        description: 'Force update even if cache is fresh'
+        required: false
+        type: boolean
+        default: false
+
+permissions:
+  contents: write # Required to commit and push updated statistics
+
+jobs:
+  update-stats:
+    name: Fetch and Update CIA Production Statistics
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Harden Runner
+        uses: step-security/harden-runner@0080882f6c36860b6ba35c610c98ce87d4e2f26f # v2.10.2
+        with:
+          egress-policy: audit # Monitor network activity
+          allowed-endpoints: >
+            api.github.com:443
+            github.com:443
+            raw.githubusercontent.com:443
+            registry.npmjs.org:443
+            objects.githubusercontent.com:443
+
+      - name: Checkout repository
+        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+        with:
+          token: ${{ secrets.GITHUB_TOKEN }}
+          fetch-depth: 1
+
+      - name: Setup Node.js
+        uses: actions/setup-node@39370e3970a6d050c480ffad4ff0ed4d3fdee5af # v4.1.0
+        with:
+          node-version: '24'
+          cache: 'npm'
+
+      - name: Install dependencies
+        run: npm ci
+
+      - name: Fetch CIA production statistics
+        id: fetch_stats
+        run: |
+          echo "Fetching CIA production statistics..."
+          node scripts/load-cia-stats.js
+
+          # Check if stats file was created/updated
+          if [ -f "cia-data/production-stats.json" ]; then
+            echo "stats_fetched=true" >> $GITHUB_OUTPUT
+
+            # Extract key statistics for summary
+            TOTAL_PERSONS=$(jq -r '.counts.total_persons' cia-data/production-stats.json)
+            TOTAL_VOTES=$(jq -r '.counts.total_votes' cia-data/production-stats.json)
+            LAST_UPDATED=$(jq -r '.metadata.last_updated' cia-data/production-stats.json)
+
+            echo "total_persons=$TOTAL_PERSONS" >> $GITHUB_OUTPUT
+            echo "total_votes=$TOTAL_VOTES" >> $GITHUB_OUTPUT
+            echo "last_updated=$LAST_UPDATED" >> $GITHUB_OUTPUT
+
+            echo "✅ Statistics fetched successfully"
+            echo " Total Persons: $TOTAL_PERSONS"
+            echo " Total Votes: $TOTAL_VOTES"
+            echo " Last Updated: $LAST_UPDATED"
+          else
+            echo "stats_fetched=false" >> $GITHUB_OUTPUT
+            echo "❌ Failed to fetch statistics"
+            exit 1
+          fi
+
+      - name: Update website files
+        id: update_files
+        if: steps.fetch_stats.outputs.stats_fetched == 'true'
+        run: |
+          echo "Updating website files with new statistics..."
+          node scripts/update-stats-from-cia.js
+
+          # Check if any files were modified
+          if git diff --quiet; then
+            echo "files_changed=false" >> $GITHUB_OUTPUT
+            echo "ℹ️ No changes needed - statistics are already up to date"
+          else
+            echo "files_changed=true" >> $GITHUB_OUTPUT
+
+            # Count changed files
+            CHANGED_FILES=$(git diff --name-only | wc -l)
+            echo "changed_count=$CHANGED_FILES" >> $GITHUB_OUTPUT
+
+            echo "✅ Updated $CHANGED_FILES files"
+            git diff --stat
+          fi
+
+      - name: Commit and push changes
+        if: steps.update_files.outputs.files_changed == 'true'
+        run: |
+          git config --global user.name "github-actions[bot]"
+          git config --global user.email "github-actions[bot]@users.noreply.github.com"
+
+          git add cia-data/production-stats.json
+          git add index*.html
+
+          # Create commit message with statistics
+          cat > /tmp/commit_msg.txt << EOF
+          Update statistics from CIA production database
+
+          Automated daily update from extraction_summary_report.csv
+
+          Statistics:
+          - Total Persons: ${{ steps.fetch_stats.outputs.total_persons }}
+          - Total Votes: ${{ steps.fetch_stats.outputs.total_votes }}
+          - Last Updated: ${{ steps.fetch_stats.outputs.last_updated }}
+          - Files Changed: ${{ steps.update_files.outputs.changed_count }}
+
+          Source: https://github.com/Hack23/cia/blob/master/service.data.impl/sample-data/extraction_summary_report.csv
+          Workflow: .github/workflows/update-cia-stats.yml
+          EOF
+
+          git commit -F /tmp/commit_msg.txt
+          git push
+
+          echo "✅ Changes committed and pushed"
+
+      - name: Create summary
+        if: always()
+        run: |
+          echo "## CIA Production Statistics Update" >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY
+
+          if [ "${{ steps.fetch_stats.outputs.stats_fetched }}" == "true" ]; then
+            echo "### ✅ Statistics Fetched Successfully" >> $GITHUB_STEP_SUMMARY
+            echo "" >> $GITHUB_STEP_SUMMARY
+            echo "| Metric | Value |" >> $GITHUB_STEP_SUMMARY
+            echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY
+            echo "| Total Persons | ${{ steps.fetch_stats.outputs.total_persons }} |" >> $GITHUB_STEP_SUMMARY
+            echo "| Total Votes | ${{ steps.fetch_stats.outputs.total_votes }} |" >> $GITHUB_STEP_SUMMARY
+            echo "| Last Updated | ${{ steps.fetch_stats.outputs.last_updated }} |" >> $GITHUB_STEP_SUMMARY
+            echo "" >> $GITHUB_STEP_SUMMARY
+
+            if [ "${{ steps.update_files.outputs.files_changed }}" == "true" ]; then
+              echo "### 📝 Files Updated" >> $GITHUB_STEP_SUMMARY
+              echo "" >> $GITHUB_STEP_SUMMARY
+              echo "Updated ${{ steps.update_files.outputs.changed_count }} files with new statistics." >> $GITHUB_STEP_SUMMARY
+              echo "" >> $GITHUB_STEP_SUMMARY
+              echo "Changes committed and pushed to repository." >> $GITHUB_STEP_SUMMARY
+            else
+              echo "### ℹ️ No Changes Needed" >> $GITHUB_STEP_SUMMARY
+              echo "" >> $GITHUB_STEP_SUMMARY
+              echo "Statistics are already up to date." >> $GITHUB_STEP_SUMMARY
+            fi
+          else
+            echo "### ❌ Statistics Fetch Failed" >> $GITHUB_STEP_SUMMARY
+            echo "" >> $GITHUB_STEP_SUMMARY
+            echo "Failed to fetch statistics from CIA production database." >> $GITHUB_STEP_SUMMARY
+            echo "" >> $GITHUB_STEP_SUMMARY
+            echo "Check workflow logs for details." >> $GITHUB_STEP_SUMMARY
+          fi
+
+          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "**Source:** [extraction_summary_report.csv](https://github.com/Hack23/cia/blob/master/service.data.impl/sample-data/extraction_summary_report.csv)" >> $GITHUB_STEP_SUMMARY
+
+      - name: Notify on failure
+        if: failure()
+        run: |
+          echo "❌ Workflow failed - check logs for details"
+          echo "This may indicate:"
+          echo " - Network connectivity issues"
+          echo " - CIA extraction_summary_report.csv unavailable"
+          echo " - Parse errors in CSV format"
+          echo " - File write permission issues"
+
+# ISMS Compliance
+# - ISO 27001:2022 A.5.33 - Protection of records (audit trails via Git, source attribution)
+# - ISO 27001:2022 A.8.3 - Information lifecycle management (automated daily updates)
+# - ISO 27001:2022 A.8.10 - Information deletion (proper retention policies, no excessive storage)
+# - ISO 27001:2022 A.8.19 - Security in use (HTTPS-only data transmission)
+# - NIST CSF 2.0 PR.DS-5 - Data integrity (automated validation checks)
+# - NIST CSF 2.0 DE.CM-1 - Network monitoring (harden-runner egress auditing)
+# - CIS Control 3.1 - Data inventory (documented public data sources)
+# - CIS Control 3.14 - Data integrity validation (automated verification)
+# - GDPR Article 6(1)(e) - Public interest processing (democratic transparency, political accountability)
+# - GDPR Article 9(2)(e) - Political opinions manifestly made public (voting records, party affiliation)
+# - Swedish Offentlighetsprincipen - Public access to government information (constitutional right)
+#
+# Note: A.8.11 (Data Masking) NOT applicable - processes only public government data from
+# Swedish Riksdag/Government. No sensitive data requiring masking. Journalist/OSINT platform
+# covering public officials in official capacity, protected by Press Freedom Act (Tryckfrihetsförordningen).
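
The schedule comments in this workflow describe the run as "03:00 CET", while the cron expression itself is fixed at 02:00 and the comments treat it as UTC. A minimal sketch, assuming Node.js with full ICU time zone data, of how the local Stockholm time of that trigger can be checked for a winter and a summer date. The helper name `stockholmTime` and the sample dates are illustrative and not part of the commit:

```javascript
// Hypothetical helper (not part of the commit): format a UTC instant as
// wall-clock HH:MM in the Europe/Stockholm time zone.
function stockholmTime(utcIso) {
  const parts = new Intl.DateTimeFormat('en-GB', {
    timeZone: 'Europe/Stockholm',
    hour: '2-digit',
    minute: '2-digit',
    hourCycle: 'h23',
  }).formatToParts(new Date(utcIso));
  const get = (type) => parts.find((p) => p.type === type).value;
  return `${get('hour')}:${get('minute')}`;
}

// The cron '0 2 * * *' fires at 02:00 UTC on every day of the year.
console.log(stockholmTime('2025-01-15T02:00:00Z')); // winter (CET):  03:00
console.log(stockholmTime('2025-07-15T02:00:00Z')); // summer (CEST): 04:00
```

Under these assumptions the job lands at 03:00 local time only in winter. During daylight saving time the same UTC trigger corresponds to 04:00 in Stockholm, which still falls after the 02:57 extraction mentioned in the comments.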

README.md

Lines changed: 25 additions & 3 deletions
@@ -119,10 +119,12 @@ See [SECURITY_ARCHITECTURE.md](SECURITY_ARCHITECTURE.md) for detailed security c
 
 ## ✨ Features
 
-- **349 Members of Parliament** - Individual MP tracking and performance metrics
+- **349 Current MPs** - Individual MP tracking and performance metrics
+- **2,494 Historical Politicians** - Complete database from 1971-2024 (50+ years)
 - **8 Political Parties** - Party performance, coalition dynamics, voting patterns
 - **45 Risk Rules** - Systematic transparency through behavioral analysis
-- **50+ Years of Data** - Historical trends and longitudinal analysis (1971-2024)
+- **3.5+ Million Votes** - Comprehensive voting record analysis
+- **109,000+ Documents** - Parliamentary documents processed and analyzed
 
 ## 🌐 Live Platform
 
@@ -135,7 +137,27 @@ See [SECURITY_ARCHITECTURE.md](SECURITY_ARCHITECTURE.md) for detailed security c
 
 ## 📊 CIA Data Products Integration
 
-Riksdagsmonitor integrates with the CIA platform through automated schema validation and data quality assurance.
+Riksdagsmonitor integrates with the CIA platform through automated data pipelines, schema validation, and daily statistics updates.
+
+### Production Database Statistics
+
+**Live Statistics** (Updated Daily at 03:00 CET):
+- **2,494 Politicians** - Complete historical database (1971-2024)
+- **349 Current MPs** - Active Members of Parliament
+- **3.5+ Million Votes** - Comprehensive voting records
+- **109,000+ Documents** - Parliamentary documents processed
+- **8,740 Committee Documents** - Committee work tracked
+- **2,308 Rule Violations** - Transparency issues identified
+
+**Data Source**: [extraction_summary_report.csv](https://github.com/Hack23/cia/blob/master/service.data.impl/sample-data/extraction_summary_report.csv)
+**Update Schedule**: Daily automated fetch via GitHub Actions
+**Last Extraction**: See `cia-data/production-stats.json` → `metadata.last_updated` (updated daily)
+
+**Implementation**:
+- `scripts/load-cia-stats.js` - Fetches and parses production statistics
+- `scripts/update-stats-from-cia.js` - Updates website files
+- `.github/workflows/update-cia-stats.yml` - Automated daily workflow
+- `cia-data/production-stats.json` - Cached statistics (24-hour freshness)
 
 ### Schema Integration
 - **Automated Validation** - All CIA exports validated against JSON schemas