Date: 2025-10-31 Goal: Load map dots as quickly as possible
Instead of expensive pre-computation OR slow classification on every load, we use progressive enhancement:
- Fast initial load: Show all dots immediately (no classification)
- Optional refinement: Button to classify and color-code by type
-- Expensive CTE with JOIN + GROUP BY
WITH geo_classification AS (
SELECT
geo.pid, geo.latitude, geo.longitude,
MAX(CASE WHEN e.p = 'sample_location' THEN 1 ELSE 0 END) as is_sample_location,
MAX(CASE WHEN e.p = 'site_location' THEN 1 ELSE 0 END) as is_site_location
FROM nodes geo
JOIN nodes e ON (geo.row_id = e.o[1])
WHERE geo.otype = 'GeospatialCoordLocation'
GROUP BY geo.pid, geo.latitude, geo.longitude
)
SELECT pid, latitude, longitude, CASE ... END as location_type
FROM geo_classificationLoad Time: ~7 seconds query + 0.4s render = ~7.5 seconds total
-- Simple DISTINCT query (no joins!)
SELECT DISTINCT pid, latitude, longitude
FROM nodes
WHERE otype = 'GeospatialCoordLocation'Load Time: ~1-2 seconds query + 0.4s render = ~2 seconds total 🎉
Speedup: 3-4x faster!
- User navigates to page
- Globe appears immediately
- "Loading geocodes..." (1-2 seconds)
- "Rendering geocodes... 500/198,433 (0%)" with progress bar
- All ~198,000 dots appear in blue (single color)
- Page fully interactive in ~2 seconds
- User clicks "Color-code by type (sample/site/both)" button
- Classification query runs (~7 seconds, same as old initial load)
- Dots recolor:
- Blue (small): sample_location_only - field collection points
- Purple (large): site_location_only - administrative markers
- Orange (medium): both - dual-purpose locations
- Type: Simple SELECT DISTINCT
- Scan: GeospatialCoordLocation nodes only (no joins)
- Time: ~1-2 seconds (vs 7 seconds before)
- Output: 198,433 geocodes
- Type: CTE with JOIN + GROUP BY
- Scan: Full edge traversal to determine types
- Time: ~7 seconds (same as old query, but user opted in)
- Output: Classification map (pid → type)
- Action: Recolors existing points in-place (no re-render needed)
- Chunk size: 500 points per batch
- Yields: Every 500 points to keep browser responsive
- Progress: Dynamic indicator shows X/Y (Z%)
- Telemetry: Console logs with performance measurements
Query executed in 1847ms - retrieved 198433 locations
Rendering completed in 423ms
Total time (query + render): 2270msClassifying dots by type...
Classification completed in 6892ms - updated 198433 points
- Blue (sample_location_only): field collection points
- Purple (site_location_only): administrative markers
- Orange (both): dual-purpose locationsPath 1 query executed in 1523ms - retrieved 5 samples
Path 2 query executed in 892ms - retrieved 0 samples
Eric's query executed in 1401ms - retrieved 5 samplesviewof classifyDots = Inputs.button("Color-code by type (sample/site/both)", {
value: null,
reduce: () => Date.now()
});- Runs classification query on demand
- Builds Map of pid → location_type
- Updates existing point colors and sizes
- Logs telemetry to console
- Open
http://localhost:5860/tutorials/parquet_cesium.html - Open browser console (F12)
- Watch for timing logs
- Expect: ~2 seconds until all blue dots visible
- Once dots are loaded, click "Color-code by type (sample/site/both)" button
- Watch console for "Classifying dots by type..." message
- Expect: Dots recolor after ~7 seconds
- Most dots stay blue (sample_location_only)
- Some become purple (site_location_only)
- Some become orange (both)
- Click any dot on globe
- Expect: Three tables render with sample data
- Console shows timing for each query
- ✅ Would also load in ~1 second
⚠️ Requires maintenance (regenerate when source updates)⚠️ Requires file hosting (upload to Google Cloud Storage)⚠️ Another file to manage (~6MB)
- ✅ Loads in ~2 seconds (acceptable!)
- ✅ Zero maintenance (no derived files)
- ✅ Zero hosting (no additional uploads)
- ✅ User choice (classify only if needed)
- ✅ Works with any future data updates automatically
Before: "This page is SO SLOW! 😩"
- 7+ seconds staring at loading indicator
- No feedback on progress
- Browser frozen
After: "Much better! The dots show up right away 👍"
- ~2 seconds to interactive
- Progress indicator shows work happening
- Can click dots immediately
- Optional classification if user wants color-coding
If ~2 seconds is still too slow, we can pursue:
- Create
oc_geocodes_simple.parquetwith just pid/lat/lon - Skip query entirely, load directly
- Expected: <1 second load time
- Tradeoff: Maintenance burden
- Use DuckDB spatial extensions
- Create R-tree index on coordinates
- Faster viewport-based queries
- Tradeoff: Complexity
File: tutorials/parquet_cesium.qmd
Changes:
- Lines 131-218: Simplified locations query (removed classification CTE)
- Lines 50-56: Added classification button
- Lines 769-845: Added classification handler
- All queries: Added performance telemetry
Net Impact: ~100 lines changed Performance Gain: 3-4x faster initial load User Benefit: Page feels responsive immediately
Before Optimization:
- Initial load: 7+ seconds
- User perception: "Slow and frozen"
- Time to interactive: 7+ seconds
After Optimization:
- Initial load: ~2 seconds
- User perception: "Fast and responsive"
- Time to interactive: ~2 seconds
- Improvement: 71% faster! 🎉
The expensive part was classification (JOIN + GROUP BY), not geocode retrieval.
By deferring classification to an optional button:
- Fast initial load (no classification)
- Progressive enhancement (classify if needed)
- Zero maintenance overhead
Best of both worlds! ✨