@@ -14,7 +14,7 @@ This directory contains static reference data used by the IQB prototype.
 
 We maintain two data formats in `./cache/`:
 
-### v0 - JSON Format
+### v0 - JSON Format (Golden Files)
 
 Per-country JSON files with pre-aggregated percentiles:
 
@@ -44,9 +44,8 @@ Raw query results stored efficiently for flexible analysis:
   - `stats.json` - Query metadata (start time, duration, bytes processed/billed, template hash)
 - **Use case**: Efficient filtering, large-scale analysis, direct PyArrow/Pandas processing
 
-**Migration**: We're transitioning to v1 as the primary format. v0 remains available for
-backward compatibility and casual use. If Parquet proves too heavy for some workflows,
-v0 will continue to be maintained.
+**Migration**: The [../library](../library) `IQBCache` uses v1. We are keeping v0 data
+around as golden files, for backward compatibility, and casual use.
 
 ## How This Data Was Generated
 
@@ -159,8 +158,8 @@ for details.
 
 ## Future Improvements (Phase 2+)
 
-- Direct Parquet reading in cache.py (PyArrow predicate pushdown for efficient filtering)
+- Finer geographic resolution (cities, provinces, ASNs) - IN PROGRESS
+- Remote storage for `cache/v1` data (GitHub releases)
 - Additional datasets (Ookla, Cloudflare)
-- Finer geographic resolution (cities, provinces, ASNs)
 - Finer time granularity (daily, weekly)
-- Remote storage for v1 cache (GitHub releases, GCS buckets)
+- Remote storage for `cache/v1` data (GCS buckets)