Import the USDA FoodData Central database into the `product_catalog` and `product_barcodes` tables.

| Dataset | Records | Size | Description |
|---|---|---|---|
| Foundation Foods | ~365 | 6.8 MB | Generic foods with comprehensive nutrients |
| Branded Foods | ~1.5M | 3.3 GB | Products with barcodes, brands, nutrition |
- Download the USDA data files from https://fdc.nal.usda.gov/download-datasets/
  - Download "Foundation Foods" JSON
  - Download "Branded Foods" JSON
- Place the files in the data directory:

      cp ~/Downloads/FoodData_Central_foundation_food_json_*.json ./data/
      cp ~/Downloads/FoodData_Central_branded_food_json_*.json ./data/

- Run database migrations (from project root):

      supabase db push

- Set environment variables:

      # Option 1: Create .env file
      echo "SUPABASE_URL=your-project-url" > .env
      echo "SUPABASE_SERVICE_ROLE_KEY=your-service-role-key" >> .env

      # Option 2: Export directly
      export SUPABASE_URL="your-project-url"
      export SUPABASE_SERVICE_ROLE_KEY="your-service-role-key"
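As a rough illustration of how an importer might consume these two variables, the sketch below validates them before any database work starts. The function name `readSupabaseConfig` and the error wording are hypothetical and are not taken from this project's `index.js`.

```javascript
// Hypothetical helper: not part of this project's index.js.
// Reads the two variables named in the setup step from an env-like object
// and fails early, listing every missing name at once.
function readSupabaseConfig(env) {
  const required = ["SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY"];
  const missing = required.filter(
    (name) => !env[name] || env[name].trim() === ""
  );
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    url: env.SUPABASE_URL.trim(),
    serviceRoleKey: env.SUPABASE_SERVICE_ROLE_KEY.trim(),
  };
}

// Example with an in-memory object; a real script would pass process.env.
const config = readSupabaseConfig({
  SUPABASE_URL: "https://example.supabase.co",
  SUPABASE_SERVICE_ROLE_KEY: "service-role-key-placeholder",
});
console.log(config.url); // https://example.supabase.co
```

Note that a `.env` file only reaches `process.env` if something loads it, for example the `dotenv` package or Node's `--env-file` flag. Whether this project does so is not stated here, so exporting the variables directly is the safer option when in doubt.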
Install dependencies, then run the importer:

    # Install dependencies
    npm install

    # Import Foundation Foods only (~30 seconds)
    npm run import:foundation
    # or: node index.js --foundation

    # Import Branded Foods only (~2-4 hours)
    npm run import:branded
    # or: node index.js --branded

    # Import both datasets
    npm run import:all
    # or: node index.js --all

    # Start fresh (ignore checkpoint)
    node index.js --branded --no-resume

- Streaming: Branded Foods uses a streaming JSON parser for memory efficiency
- Batching: Inserts in batches of 1000 for performance
- Resume: Automatically resumes from checkpoint if interrupted
- Barcodes: Inserts GTINs into the `product_barcodes` table with a foreign key
- Deduplication: Uses `ON CONFLICT` for upsert semantics

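The batching and resume behaviour listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `batches` and `importRecords`, the injected `insertBatch` and `saveCheckpoint` callbacks, and the `{ processed }` checkpoint shape are all hypothetical, and the actual format of `checkpoint.json` is not documented here. The sketch uses an in-memory array to stay self-contained; the streaming parser is not shown.

```javascript
// Hypothetical sketch: names and checkpoint shape are illustrative only.

// Groups any iterable into arrays of at most `size` items.
function* batches(items, size) {
  let batch = [];
  for (const item of items) {
    batch.push(item);
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch;
}

// Feeds batches to `insertBatch`, skipping batches that a previous run
// already completed, and reports progress through `saveCheckpoint`.
async function importRecords(
  records,
  { batchSize = 1000, resumeFrom = 0, insertBatch, saveCheckpoint }
) {
  let processed = 0;
  for (const batch of batches(records, batchSize)) {
    if (processed + batch.length <= resumeFrom) {
      processed += batch.length; // handled before the interruption
      continue;
    }
    await insertBatch(batch);
    processed += batch.length;
    await saveCheckpoint({ processed });
  }
  return processed;
}

// Example: 2500 fake records, resuming after 1000 were already imported.
const inserted = [];
const checkpoints = [];
const records = Array.from({ length: 2500 }, (_, i) => ({ fdc_id: i + 1 }));
const done = importRecords(records, {
  resumeFrom: 1000,
  insertBatch: async (batch) => { inserted.push(batch.length); },
  saveCheckpoint: async (cp) => { checkpoints.push(cp.processed); },
});
done.then((total) => console.log(total, inserted, checkpoints));
// 2500 [ 1000, 500 ] [ 2000, 2500 ]
```

Because the checkpoint is written only after a batch succeeds, an interrupted batch is retried in full on the next run. That is safe only if inserts are idempotent, which is what the `ON CONFLICT` upsert provides.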
USDA nutrient IDs map to columns as follows:

| USDA Nutrient ID | Column |
|---|---|
| 1008 | calories |
| 1003 | protein |
| 1005 | carbs |
| 1004 | fat |
| 1079 | fiber |
| 2000 | sugar |
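A minimal sketch of how this mapping might be applied to one food record is shown below. It assumes each nutrient entry has the shape `{ nutrient: { id }, amount }`; that shape, and the names `NUTRIENT_COLUMNS` and `mapMacros`, are assumptions for illustration rather than code from this project.

```javascript
// Nutrient ID -> column name, copied from the mapping table.
const NUTRIENT_COLUMNS = {
  1008: "calories",
  1003: "protein",
  1005: "carbs",
  1004: "fat",
  1079: "fiber",
  2000: "sugar",
};

// Assumption: each entry looks like { nutrient: { id }, amount }.
// Entries with unmapped IDs or non-numeric amounts are ignored.
function mapMacros(foodNutrients) {
  const row = {};
  for (const entry of foodNutrients ?? []) {
    const column = NUTRIENT_COLUMNS[entry?.nutrient?.id];
    if (column !== undefined && typeof entry.amount === "number") {
      row[column] = entry.amount;
    }
  }
  return row;
}

const row = mapMacros([
  { nutrient: { id: 1008 }, amount: 52 },
  { nutrient: { id: 1003 }, amount: 0.26 },
  { nutrient: { id: 9999 }, amount: 6 }, // made-up ID with no dedicated column
]);
console.log(row); // { calories: 52, protein: 0.26 }
```

Columns with no matching nutrient are simply left out of the result, so the caller can decide whether to store them as `NULL` or as zero.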
Additional nutrient data is stored as JSON in the `micros` column with this structure:

    {
      "usda_fdc_id": "12345",
      "calcium": { "amount": 100, "unit": "mg" },
      "iron": { "amount": 2.5, "unit": "mg" },
      "sodium": { "amount": 500, "unit": "mg" }
    }

If an import fails, check the following:

- Verify `SUPABASE_URL` and `SUPABASE_SERVICE_ROLE_KEY` are correct
- Check that migrations have been applied
- Ensure JSON files are in the `./data/` directory
- File names must match the pattern `FoodData_Central_*_json_*.json`
- Branded Foods uses streaming, but ensure you have ~2 GB of RAM available
- Reduce the batch size in `config.js` if needed
- Check that `checkpoint.json` exists and is readable
- Use `--no-resume` to start fresh
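To check whether a downloaded file would match the file-name pattern mentioned above, something like the following sketch could be used. The helpers `globToRegExp` and `isUsdaDataFile` are hypothetical, the importer's own matching logic may differ, and the file names in the example are made up.

```javascript
// Hypothetical check mirroring the glob FoodData_Central_*_json_*.json.
// Each * becomes ".*" and every other character is matched literally.
function globToRegExp(glob) {
  const escaped = glob
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

const USDA_FILE_PATTERN = globToRegExp("FoodData_Central_*_json_*.json");

function isUsdaDataFile(fileName) {
  return USDA_FILE_PATTERN.test(fileName);
}

// Made-up file names for illustration.
console.log(isUsdaDataFile("FoodData_Central_foundation_food_json_2024-01-01.json")); // true
console.log(isUsdaDataFile("foundation.json")); // false
```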