Commit 7d84182
fix: correct data quality bugs in 34 database view definitions and ensure JPA backward compatibility (#8486)
* Initial plan
* fix: correct data quality issues in 4 database views (party_summary, behavioral_trends, role_member views)
ROOT CAUSE 5: Wrong join in view_riksdagen_party_summary - used dsc.hjid = dprc.hjid
(PK coincidence) instead of correct FK dsc.document_person_reference_co_1 = dprc.hjid.
This caused total_documents ~6 for M party instead of expected ~292K.
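The PK-coincidence join can be reproduced on a toy schema. The sketch below is an illustration only: it uses SQLite through Python's `sqlite3` module instead of PostgreSQL, and the table names and the `person_ref_container_fk` column are simplified, hypothetical stand-ins for the real tables and for `document_person_reference_co_1`.

```python
import sqlite3

# Toy schema: names are simplified stand-ins, not the real project schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person_ref_container (hjid INTEGER PRIMARY KEY);
CREATE TABLE document_status (
    hjid INTEGER PRIMARY KEY,
    person_ref_container_fk INTEGER REFERENCES person_ref_container(hjid)
);
INSERT INTO person_ref_container (hjid) VALUES (1), (2), (500);
-- Document 1 shares its PK value with container 1 but points at container 500.
INSERT INTO document_status (hjid, person_ref_container_fk)
VALUES (1, 500), (10, 1), (11, 1), (12, 2), (13, 500);
""")


def count_matches(join_condition: str) -> int:
    """Count joined rows for a given ON clause."""
    sql = (
        "SELECT COUNT(*) FROM document_status dsc "
        "JOIN person_ref_container dprc ON " + join_condition
    )
    return conn.execute(sql).fetchone()[0]


# PK = PK: only rows whose primary keys happen to coincide survive the join.
wrong = count_matches("dsc.hjid = dprc.hjid")
# FK = PK: every document is matched to the container it references.
right = count_matches("dsc.person_ref_container_fk = dprc.hjid")
print(wrong, right)
```

The PK-to-PK join silently returns a small, plausible-looking number of rows instead of failing, which is why the undercount was easy to miss.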
ROOT CAUSE 6: Wrong motion type filter - used label LIKE '%motion%' but actual labels
are codes like 'MJ408'. Now uses sub_type: Partimotion, Enskild motion, Kommittémotion.
ROOT CAUSE 7: Hardcoded zeros for total_collaborative_motions, total_follow_up_motions,
party/committee/individual_focused_members, highly_collaborative_members.
Now computed from actual sub_type and document profile data.
ROOT CAUSE 8: Wrong status filter in view_politician_behavioral_trends - used
rule_violation.status = 'ACTIVE' but actual enum is OK/MINOR/MAJOR/CRITICAL.
ROOT CAUSE 9: Non-existent document types in role_member views - 'ip'/'frs' don't
exist, 'bet'/'yttr' have no person linkage. Replaced with sub_type classification.
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/cia/sessions/2fe8d220-26b9-4554-8614-0d9c6ce7ab3a
* fix: correct data quality in 11 additional database views (vote case, doc joins, type filters)
Session 2 adds changesets 1.79-005 through 1.79-015 fixing:
ROOT CAUSE 10: Vote case sensitivity in mv_annual_voting_metrics (materialized view)
- 'Ja'/'Nej'/'Avstår'/'Frånvarande' never match UPPERCASE data
- Cascades to view_riksdagen_election_year_behavioral_patterns
- Fixed: avg_yes_rate, avg_no_rate, avg_abstain_rate now populated
ROOT CAUSE 11: Vote case sensitivity in seasonal quarterly activity
- 'Frånvarande' never matches UPPERCASE, attendance_rate always ~100%
- q_stddev_attendance = 0, attendance_z_score = 0
- Cascades to seasonal_activity_patterns, seasonal_anomaly_detection,
q4_election_year_comparison
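The effect of a title-case literal against UPPERCASE data can be shown in a few lines. This is a minimal sketch using SQLite via Python's `sqlite3`, not the actual view SQL; the `vote_data` table here is a one-column toy and `absence_rate` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vote_data (vote TEXT)")
# The stored values are UPPERCASE, as described in the commit message.
conn.executemany(
    "INSERT INTO vote_data (vote) VALUES (?)",
    [("JA",), ("JA",), ("NEJ",), ("FRÅNVARANDE",)],
)


def absence_rate(absent_literal: str) -> float:
    """Share of rows whose vote equals the literal (exact, case-sensitive match)."""
    row = conn.execute(
        "SELECT AVG(CASE WHEN vote = ? THEN 1.0 ELSE 0.0 END) FROM vote_data",
        (absent_literal,),
    ).fetchone()
    return row[0]


title_case = absence_rate("Frånvarande")  # never matches, so absence looks like 0
upper_case = absence_rate("FRÅNVARANDE")  # matches the stored value
print(title_case, upper_case)
```

With the title-case literal the absence rate is 0 for every row group, so attendance appears to be ~100% and any standard deviation or z-score derived from it collapses to 0.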
ROOT CAUSE 12: Wrong document join (dsc.hjid = dprc.hjid) in 5 more views
- view_risk_score_evolution, view_ministry_effectiveness_trends,
view_ministry_risk_evolution, view_ministry_productivity_matrix,
view_politician_risk_summary (stale in schema)
- Fixed: document_count now populated in risk_score_evolution
ROOT CAUSE 13: Non-existent document type 'ds' in ministry views
- document_data only has: mot, bet, prop, kammakt
- Fixed: replaced 'ds' with 'bet' (committee reports)
ROOT CAUSE 14: Wrong sub_type filters in committee_productivity views
- committee_document_data sub_types: bet, (empty), utl, ap
- Not 'mot' or 'prop'
ROOT CAUSE 15: Vote case sensitivity in crisis resilience & momentum
- view_riksdagen_crisis_resilience_indicators: crisis_absence_rate = 0
- view_riksdagen_party_momentum_analysis: participation_rate = 0
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/cia/sessions/56b292be-89fa-4381-8f3c-c8f89326b6cc
* fix: add 3 more view fixes (role_tier case, network hardcoded zeros) and regenerate schema via pg_dump
Session 3 adds changesets 1.79-016 through 1.79-018:
ROOT CAUSE 16: role_tier case sensitivity in 2 views
- view_riksdagen_politician_role_evolution generates UPPERCASE role_tier
(MINISTER, SPEAKER, PARTY_LEADER, COMMITTEE_CHAIR, COMMITTEE_MEMBER)
- view_riksdagen_pre_election_quarterly_activity compared lowercase
→ politicians_with_new_roles and leadership_appointments always 0
- view_riksdagen_election_proximity_trends compared lowercase
→ leadership_role_count and committee_assignment_count always 0
ROOT CAUSE 17: hardcoded placeholder zeros in network analysis
- view_election_cycle_network_analysis had literal (0)::bigint for
influential_politicians, avg_network_centrality, power_broker_count
- Replaced with calculated values from vote participation data
All 18 changesets applied via Liquibase (mvn liquibase:update).
full_schema.sql regenerated strictly via pg_dump per maintenance guide.
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/cia/sessions/f18a1e39-e11f-4b98-97f2-13d2f9bc52f8
* fix: correct integer division in power_broker_count calculation and regenerate schema via pg_dump
Code review identified integer division truncation in 1.79-018, where
LEAST(a,b)/GREATEST(a,b) truncates to 0 whenever the two values differ
(and yields 1 only when they are equal).
Fixed to use LEAST(a,b) directly as the power broker count metric.
full_schema.sql regenerated strictly via pg_dump per maintenance guide.
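The truncation is easy to confirm in isolation. The sketch below runs against SQLite through Python's `sqlite3`, whose integer division truncates like PostgreSQL's; SQLite's multi-argument `MIN`/`MAX` are used here as stand-ins for `LEAST`/`GREATEST`, and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def truncated_ratio(a: int, b: int) -> int:
    """Integer / integer: the result is truncated toward zero."""
    return conn.execute(
        "SELECT MIN(?, ?) / MAX(?, ?)", (a, b, a, b)
    ).fetchone()[0]


def least_value(a: int, b: int) -> int:
    """The replacement metric described above: the smaller value, used directly."""
    return conn.execute("SELECT MIN(?, ?)", (a, b)).fetchone()[0]


print(truncated_ratio(3, 7))  # smaller / larger with integers
print(truncated_ratio(5, 5))  # equal values
print(least_value(3, 7))
```

Casting one operand to a numeric type would preserve the ratio instead; that is an alternative, not what this changeset does.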
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/cia/sessions/f18a1e39-e11f-4b98-97f2-13d2f9bc52f8
* fix: replace non-existent doc type 'frs' with 'kammakt' in committee_role_member view and regenerate schema via pg_dump
Changeset 1.79-019 fixes view_riksdagen_committee_role_member where the
initiatives count used non-existent document type 'frs' (written questions).
Only mot, bet, prop, kammakt exist in document_data. Replaced with 'kammakt'.
full_schema.sql regenerated strictly via pg_dump per maintenance guide.
All 110 views now pass scan for known data quality issues:
- No title-case vote comparisons (all use UPPERCASE JA/NEJ/FRÅNVARANDE/AVSTÅR)
- No non-existent document types (ip/frs/yttr/ds)
- No wrong rule_violation status ('ACTIVE')
- No wrong document-to-person join chains
- No lowercase role_tier comparisons
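A scan of this kind could look roughly like the following. This is a hypothetical sketch, not the tool actually used: the pattern set is derived from the checks listed above, and simple regular expressions like these can produce false positives (for example, `'ds'` appearing as an unrelated literal).

```python
import re

# Hypothetical patterns derived from the checks listed in the commit message.
KNOWN_BAD_PATTERNS = {
    "title-case vote literal": re.compile(r"'(Ja|Nej|Avstår|Frånvarande)'"),
    "non-existent document type": re.compile(r"'(ip|frs|yttr|ds)'"),
    "wrong rule_violation status": re.compile(r"status\s*=\s*'ACTIVE'"),
    "PK-to-PK document join": re.compile(r"dsc\.hjid\s*=\s*dprc\.hjid"),
}


def scan_view(sql: str) -> list[str]:
    """Return the labels of every known-bad pattern found in a view definition."""
    return [
        label
        for label, pattern in KNOWN_BAD_PATTERNS.items()
        if pattern.search(sql)
    ]


bad_view = "SELECT count(*) FROM vote_data WHERE vote = 'Ja'"
good_view = "SELECT count(*) FROM vote_data WHERE vote = 'JA'"
print(scan_view(bad_view))
print(scan_view(good_view))
```

In practice the view definitions would be read from the database catalog or from `full_schema.sql`; here they are inline strings to keep the example self-contained.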
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/cia/sessions/6d225996-45b0-4582-a083-7557a830c95a
* fix: replace non-temporal view_party_performance_metrics joins with temporal data in 4 views and fix career trajectory cross-join
Changesets 1.79-020 to 1.79-023:
- view_riksdagen_party_electoral_trends: seat_count_proxy and documents_produced
were constant per party (from non-temporal view_party_performance_metrics),
making stddev_seats_party=0 and all seat_change/documents_change columns 0.
Fixed: derive from party_total_votes/number_ballots (temporal) and
politician_document matview per year.
- view_riksdagen_party_longitudinal_performance: same non-temporal join caused
membership_change, documents_change, discipline_change to always be 0.
Fixed: same temporal data approach.
- view_riksdagen_politician_career_trajectory: empty (0 rows) due to expensive
cross-joins between vote_data (3.7M rows), assignment_data, and doc refs
without time constraints. Fixed: pre-aggregate leadership_roles and
documents_authored in separate CTEs to avoid Cartesian product.
- view_election_cycle_comparative_analysis: CROSS JOIN to non-temporal
view_party_performance_metrics made change_performance_pct and
change_documents_pct always 0. Fixed: temporal data from ballot/doc views.
full_schema.sql regenerated via pg_dump per maintenance guide.
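The pre-aggregation used for the career trajectory view can be illustrated on toy data. The sketch below uses SQLite via Python's `sqlite3`; the three single-column tables are hypothetical stand-ins for vote_data, assignment_data, and the document references.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE votes (person_id TEXT);
CREATE TABLE roles (person_id TEXT);
CREATE TABLE docs  (person_id TEXT);
INSERT INTO votes VALUES ('p1'), ('p1'), ('p1');
INSERT INTO roles VALUES ('p1'), ('p1');
INSERT INTO docs  VALUES ('p1'), ('p1'), ('p1'), ('p1');
""")

# Naive approach: joining all three detail tables multiplies the rows (3 * 2 * 4).
naive_rows = conn.execute("""
SELECT COUNT(*)
FROM votes v
JOIN roles r ON r.person_id = v.person_id
JOIN docs  d ON d.person_id = v.person_id
""").fetchone()[0]

# Pre-aggregated approach: one row per person in each CTE, then a cheap join.
pre_aggregated = conn.execute("""
WITH vote_counts AS (
    SELECT person_id, COUNT(*) AS n FROM votes GROUP BY person_id
),
role_counts AS (
    SELECT person_id, COUNT(*) AS n FROM roles GROUP BY person_id
),
doc_counts AS (
    SELECT person_id, COUNT(*) AS n FROM docs GROUP BY person_id
)
SELECT vc.n, COALESCE(rc.n, 0), COALESCE(dc.n, 0)
FROM vote_counts vc
LEFT JOIN role_counts rc ON rc.person_id = vc.person_id
LEFT JOIN doc_counts  dc ON dc.person_id = vc.person_id
""").fetchone()

print(naive_rows, pre_aggregated)
```

The intermediate row count of the naive join grows with the product of the per-person row counts, which is what becomes prohibitive against a multi-million-row vote table.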
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/cia/sessions/75b03ed1-2a37-439b-b6af-c4773b4a6c6f
* fix: address PR review comments — fix column aliases, add missing JPA columns, recreate index
Changesets 1.79-024 to 1.79-029:
- view_committee_productivity (1.79-024): fix column alias mapping so
reports→reports_count, other_documents→motions_count matches JPA entity
- view_riksdagen_party_role_member (1.79-025): document that
total_interpellations/total_written_questions now contain
party_motions/committee_motions since interpellations don't exist in data
- view_riksdagen_committee_role_member (1.79-026): same documentation fix
- view_riksdagen_party_electoral_trends (1.79-027): add 13 missing columns
required by JPA entity (rank_by_engagement, rank_by_effectiveness,
quartile_by_performance, electoral_trend, party_size_category,
seat_forecast, performance_forecast, election_readiness_score,
is_pre_election_period, is_election_period, is_post_election_period,
trend_position_seats, electoral_tier)
- view_riksdagen_party_longitudinal_performance (1.79-028): add 29 missing
columns required by JPA entity (rank_by_win_rate, rank_by_participation,
percentile_win_rate, percentile_participation, percentile_approval,
quartile_by_win_rate, quartile_by_overall_performance, etc.)
- Recreate missing index idx_mv_annual_voting_metrics_year (1.79-029)
full_schema.sql regenerated via pg_dump per maintenance guide.
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/cia/sessions/fb2bc0cc-7181-44bd-a83e-42e636f4c200
* fix: address second round PR review — fix JPA type mismatches, midterm z-score, and rebellion detection
Changesets 1.79-030 to 1.79-033:
- view_riksdagen_party_electoral_trends (1.79-030): Fix JPA type mismatches:
seat_change_absolute→bigint (JPA Long), seat_forecast→text (JPA String),
performance_forecast→text (JPA String), trend_position_seats→text (JPA String),
projected_seat_change→numeric (JPA BigDecimal)
- view_riksdagen_party_longitudinal_performance (1.79-031): Fix JPA type mismatches:
membership_change→bigint (JPA Long), trend_position→text (JPA String)
- view_election_cycle_comparative_analysis (1.79-032): Add midterm_stddev_docs to
election_baseline and use it for non-election-year document z-score calculation
instead of incorrectly using election_stddev_docs for midterm years
- view_riksdagen_crisis_resilience_indicators (1.79-033): Fix meaningless rebellion
detection that compared vote value (JA/NEJ) to party code (S/M/SD). Now uses the
proper rebel flag from view_riksdagen_vote_data_ballot_politician_summary which
correctly determines rebellion by comparing individual vote against party majority.
full_schema.sql regenerated via pg_dump per maintenance guide.
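The difference between the two rebellion checks can be sketched as follows. This is an illustration under assumptions, using SQLite via Python's `sqlite3`: the toy `vote_data` table and the majority CTE are hypothetical and do not reproduce the logic of view_riksdagen_vote_data_ballot_politician_summary, and the exact form of the old comparison is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vote_data (ballot_id TEXT, party TEXT, person_id TEXT, vote TEXT);
INSERT INTO vote_data VALUES
    ('b1', 'S', 'p1', 'JA'),
    ('b1', 'S', 'p2', 'JA'),
    ('b1', 'S', 'p3', 'NEJ');
""")

# Assumed shape of the old check: a vote value never equals a party code.
vote_equals_party = conn.execute(
    "SELECT COUNT(*) FROM vote_data WHERE vote = party"
).fetchone()[0]

# Rebel = individual vote differs from the party's majority vote on that ballot.
# Ties between vote options are not handled in this sketch.
rebels = conn.execute("""
WITH counts AS (
    SELECT ballot_id, party, vote, COUNT(*) AS n
    FROM vote_data
    GROUP BY ballot_id, party, vote
),
majority AS (
    SELECT c.ballot_id, c.party, c.vote AS majority_vote
    FROM counts c
    WHERE c.n = (
        SELECT MAX(c2.n) FROM counts c2
        WHERE c2.ballot_id = c.ballot_id AND c2.party = c.party
    )
)
SELECT v.person_id, v.vote <> m.majority_vote AS rebel
FROM vote_data v
JOIN majority m ON m.ballot_id = v.ballot_id AND m.party = v.party
ORDER BY v.person_id
""").fetchall()

print(vote_equals_party, rebels)
```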
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/cia/sessions/0e2de8c8-1029-478a-91e4-95713201c8c5
* fix: add midterm_stddev_docs to behavioral patterns view and restore view comments
Changeset 1.79-034:
- view_riksdagen_election_year_behavioral_patterns: Added midterm_stddev_docs to
election_baseline CTE and used it for non-election-year document z-score
calculation instead of incorrectly using election_stddev_docs for midterm years.
Mirrors the fix applied to view_election_cycle_comparative_analysis in 1.79-032.
- Added COMMENT ON VIEW for view_riksdagen_election_year_behavioral_patterns and
view_riksdagen_politician_career_trajectory for DB documentation.
full_schema.sql regenerated via pg_dump per maintenance guide — no manual edits.
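Why the baseline matters for the z-score can be seen with made-up numbers. The sketch below is plain Python, not the view SQL; the document counts are invented, and it assumes the old calculation already used the midterm mean and differed only in the standard deviation. `statistics.stdev` is the sample standard deviation, which is what PostgreSQL's `stddev` computes.

```python
from statistics import mean, stdev

# Invented yearly document counts, for illustration only.
election_docs = [100, 140, 120]
midterm_docs = [50, 60, 70]


def z_score(value: float, baseline_mean: float, baseline_stddev: float) -> float:
    """Standard score; returns 0.0 when the baseline has no spread."""
    if baseline_stddev == 0:
        return 0.0
    return (value - baseline_mean) / baseline_stddev


midterm_year_docs = 80

# Mixed baseline: midterm mean with the election-year stddev (the reported bug).
mixed = z_score(midterm_year_docs, mean(midterm_docs), stdev(election_docs))
# Consistent baseline: midterm mean with the midterm stddev (the fix).
consistent = z_score(midterm_year_docs, mean(midterm_docs), stdev(midterm_docs))

print(mixed, consistent)
```

With these numbers the mixed baseline halves the z-score, so a midterm year that is two standard deviations above its own baseline would be reported as only one.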
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/cia/sessions/ff4d9fee-b7ed-4efc-b435-b1c9357447a7
* fix: regenerate full_schema.sql via pg_dump after validating all 34 changesets applied
All 34 changesets (1.79-001 through 1.79-034) validated as applied.
Liquibase status confirms database is up to date.
full_schema.sql regenerated strictly via pg_dump per maintenance guide — never manually edited.
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Hack23/cia/sessions/15ef740e-875d-4797-ba50-e9fc6c077b01
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
1 parent d1f3dd8, commit 7d84182
3 files changed (+5919 -1237 lines) in service.data.impl/src/main/resources