- Root directory cluttered with 50+ files
- Mix of utilities, documentation, and one-off scripts
- Credentials in `conf.py` (gitignored but risky)
- No clear structure for automation vs utilities
pypubpub/
├── pypubpub/ # Main package (existing)
│ ├── __init__.py
│ ├── Pubv6.py
│ └── utils.py
│
├── scripts/ # Automation scripts
│ ├── coda_integration/
│ │ ├── fetch_from_coda.py
│ │ ├── setup_coda.py
│ │ └── test_coda_connection.py
│ │
│ ├── pubpub_automation/
│ │ ├── create_eval_package.py
│ │ └── setup_credentials.py
│ │
│ └── utilities/
│   ├── fix_backslash_urls.py
│   ├── fix_doi_periods.py
│   ├── scan_links.py
│   └── ... (other utility scripts)
│
├── docs/ # Documentation
│ ├── AUTOMATION_GUIDE.md
│ ├── CODA_WORKFLOW.md
│ ├── CODA_SETUP_GUIDE.md
│ ├── QUICKSTART_CODA.md
│ └── SETUP_SUMMARY.md
│
├── examples/ # Example scripts
│ ├── evaluation_packages/
│ │ └── scale_use_heterogeneity/
│ │   ├── create_package.py
│ │   └── README.md
│ └── README.md
│
├── evaluation_data/ # Data (GITIGNORED)
│ ├── public/ # Safe to commit
│ └── confidential/ # NEVER commit
│
├── tests/ # Tests (existing)
│
├── unjournalpubpub_production_moved/ # Keep as is
│ └── conf.py # GITIGNORED
│
├── .env # GITIGNORED - Coda/API keys
├── .env.example # Safe template
├── .gitignore # Updated
├── README.md # Clear overview
├── CLAUDE.md # Project guide
└── requirements.txt
Coda Integration:
- `fetch_from_coda.py` → `scripts/coda_integration/`
- `setup_coda.py` → `scripts/coda_integration/`
- `test_coda_connection.py` → `scripts/coda_integration/`
PubPub Automation:
- `create_eval_scale_use.py` → `examples/evaluation_packages/scale_use_heterogeneity/`
- `setup_credentials.py` → `scripts/pubpub_automation/`
- `extract_prati_ratings.py` → `examples/evaluation_packages/scale_use_heterogeneity/`
- `extract_pdf_ratings.py` → `examples/evaluation_packages/scale_use_heterogeneity/`
Utilities:
- `fix_*.py` → `scripts/utilities/`
- `scan_links.py` → `scripts/utilities/`
- `check_*.py` → `scripts/utilities/`
- `add_to_collection.py` → `scripts/utilities/`
- `audit_collections.py` → `scripts/utilities/`
- `delete_untitled_pubs.py` → `scripts/utilities/`
- `restore_dois*.py` → `scripts/utilities/`
- `test_*.py` (non-pytest) → `scripts/utilities/`
Documentation:
- `AUTOMATION_GUIDE.md` → `docs/`
- `CODA_WORKFLOW.md` → `docs/`
- `SETUP_SUMMARY.md` → `docs/`
- `FIXING_BACKSLASHES_GUIDE.md` → `docs/`
- `QUICKSTART_FIX_BACKSLASHES.md` → `docs/`
- `*_REPORT.md` → `docs/reports/` (or delete if not needed)
- `README_LINK_FIXES.md` → `docs/`
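
The move mapping above could be applied with a small script rather than by hand. The following is a minimal sketch, not the project's actual tooling: `MOVES`, `plan_moves`, and `apply_moves` are hypothetical names, only a subset of the mapping is listed, and it assumes the files sit directly in the repository root. It defaults to a dry run so the plan can be reviewed first.

```python
from pathlib import Path
import shutil

# Glob pattern (relative to the repo root) -> destination directory.
# Only part of the mapping is shown here; extend it as needed.
MOVES = {
    "fetch_from_coda.py": "scripts/coda_integration",
    "setup_coda.py": "scripts/coda_integration",
    "test_coda_connection.py": "scripts/coda_integration",
    "setup_credentials.py": "scripts/pubpub_automation",
    "fix_*.py": "scripts/utilities",
    "scan_links.py": "scripts/utilities",
    "AUTOMATION_GUIDE.md": "docs",
    "*_REPORT.md": "docs/reports",
}


def plan_moves(root: Path, moves: dict[str, str]) -> list[tuple[Path, Path]]:
    """Return (source, destination) pairs for files directly under root."""
    planned = []
    for pattern, dest_dir in moves.items():
        for src in sorted(root.glob(pattern)):
            if src.is_file():
                planned.append((src, root / dest_dir / src.name))
    return planned


def apply_moves(planned: list[tuple[Path, Path]], dry_run: bool = True) -> None:
    """Print each planned move; perform it only when dry_run is False."""
    for src, dest in planned:
        print(f"{src} -> {dest}")
        if not dry_run:
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dest))
```

Calling `apply_moves(plan_moves(Path("."), MOVES))` from the repository root prints the plan without touching any files; pass `dry_run=False` to perform the moves. `shutil.move` does not stage anything in git, so running `git mv` for each pair instead is an equally reasonable choice.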
- `README.md` - Main project overview
- `CLAUDE.md` - Project guide for Claude
- `.gitignore`
- `.env.example`
- `requirements.txt`
- `pyproject.toml`
- `LICENSE`
- JSON reports (move to `evaluation_data/public` or delete)
- `scan_output.log`
- One-off test scripts
Before committing:
- Verify `conf.py` is gitignored
- Verify `.env` is gitignored
- Verify `evaluation_data/confidential/` is gitignored
- Create `.env.example` without real secrets
- Remove any hardcoded API keys/passwords from scripts
- Add security warning to README
- Update all scripts to use environment variables
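
The three "is gitignored" checks above can be automated with `git check-ignore`, which exits with 0 when a path is ignored and 1 when it is not. The sketch below is one possible way to do that; `MUST_BE_IGNORED`, `is_gitignored`, and `unignored_paths` are hypothetical names, and the `conf.py` path is taken from the target layout. How `git check-ignore` treats a directory that does not yet exist on disk is not verified here, so it is safest to run this after the directories have been created.

```python
import subprocess

# Paths from the checklist above, relative to the repository root.
MUST_BE_IGNORED = [
    "unjournalpubpub_production_moved/conf.py",
    ".env",
    "evaluation_data/confidential/",
]


def is_gitignored(path: str, repo_dir: str = ".") -> bool:
    """Return True if git reports the path as ignored in repo_dir."""
    result = subprocess.run(
        ["git", "check-ignore", "-q", path],
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    # 0 = ignored, 1 = not ignored; anything else is a git error.
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.strip() or "git check-ignore failed")
    return result.returncode == 0


def unignored_paths(paths: list[str], repo_dir: str = ".") -> list[str]:
    """Return the subset of paths that git does not ignore."""
    return [p for p in paths if not is_gitignored(p, repo_dir)]
```

An empty result from `unignored_paths(MUST_BE_IGNORED)` means every listed path is ignored. A file that is already tracked is reported as not ignored, so a `conf.py` that was committed at some point would also be flagged here.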
- Create directory structure
- Move files to new locations
- Update import paths in moved scripts
- Update .gitignore with additional patterns
- Create new README
- Test that automation still works
- Commit changes
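
Before the final commit, the "remove any hardcoded API keys/passwords" item can be supported by a rough scan of the moved scripts. This is only a heuristic sketch: `SECRET_PATTERN` and `find_hardcoded_secrets` are hypothetical names, and the regular expression is an assumed pattern (a name containing key/token/secret/password assigned a quoted literal of at least 8 characters) that will miss some secrets and flag some harmless lines.

```python
import re
from pathlib import Path

# Heuristic only: suspicious variable name, then = or :, then a quoted
# literal of 8+ characters. Not an exhaustive secret detector.
SECRET_PATTERN = re.compile(
    r"""\b\w*(?:api_?key|token|secret|password|passwd)\w*\s*[:=]\s*(["'])[^"']{8,}\1""",
    re.IGNORECASE,
)


def find_hardcoded_secrets(root: Path, pattern: str = "*.py") -> list[tuple[Path, int]]:
    """Return (file, line number) pairs for lines that look like hardcoded secrets."""
    hits = []
    for path in sorted(root.rglob(pattern)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if SECRET_PATTERN.search(line):
                # Report only the location, never the matched value.
                hits.append((path, lineno))
    return hits
```

Running `find_hardcoded_secrets(Path("scripts"))` lists locations to review by hand. Lines that read a value from `os.environ` are not matched, because the assignment is not followed directly by a quoted literal.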
- Keep `unjournalpubpub_production_moved/` as-is (legacy)
- `conf.py` must NEVER be committed (already gitignored)
- All credentials must be in `.env` or environment variables
- No real API keys in any committed code
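
As an illustration of the "credentials in `.env` or environment variables" rule, a script can read each credential from the environment and fail early when it is missing. This is a minimal sketch: `require_env` and `MissingCredentialError` are hypothetical names, and `CODA_API_KEY` in the usage note is an assumed variable name, not one confirmed by the project.

```python
import os


class MissingCredentialError(RuntimeError):
    """Raised when a required credential is not set in the environment."""


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise MissingCredentialError(
            f"Set {name} in .env or the environment before running this script."
        )
    return value
```

A script would then call, for example, `api_key = require_env("CODA_API_KEY")` instead of importing a value from `conf.py`. Note that Python does not read `.env` files on its own; the values must already be in the process environment, for instance via a loader such as python-dotenv's `load_dotenv()` if the project chooses to depend on it.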