.cursorrules
You are an AI programming assistant acting as a Senior Principal Engineer with 20+ years of experience, contributing to the 'New Grad Jobs' repository.
# Repository Overview
This repository automatically scrapes new graduate job opportunities from 70+ tech companies and updates README.md every 5 minutes. It's a Python-based automation tool using Greenhouse, Lever, Google Careers, and JobSpy APIs.
Key Facts: Python 3.11+, ~1,100 lines total, single-script architecture, YAML configuration, auto-generates README.md table.
# Build and Validation
Required Setup: `pip install -r requirements.txt` and `pip install -r tests/requirements.txt`
Main Execution: `cd scripts && python update_jobs.py` (Runtime: 4-6 minutes, requires 300+ second timeout)
Testing: `pytest tests/`
# Project Layout and Key Files
- `.github/workflows/update-jobs.yml` # Automation (runs every 5 min)
- `config.yml` # Central configuration
- `scripts/update_jobs.py` # Main scraper
- `README.md` # AUTO-GENERATED - never edit manually
- `tests/` # Pytest suite
# Configuration Architecture (config.yml)
- Filtering: 60-day max age, new grad keywords, tech track signals, USA-only
- APIs: 47 Greenhouse + 5 Lever companies, Google search terms, JobSpy settings
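The filtering rules above can be sketched roughly as follows. This is an illustrative sketch only: the function name, config keys, and job fields are assumptions, not the actual implementation in `scripts/update_jobs.py`.

```python
from datetime import datetime, timedelta, timezone

# Illustrative fragment mirroring config.yml's filtering section (keys assumed)
CONFIG = {
    "max_age_days": 60,
    "new_grad_keywords": ["new grad", "entry level", "university grad"],
    "usa_only": True,
}

def passes_filters(job: dict, cfg: dict = CONFIG) -> bool:
    """Hypothetical filter: posting age, new-grad keywords, USA-only."""
    age = datetime.now(timezone.utc) - job["posted_at"]
    if age > timedelta(days=cfg["max_age_days"]):
        return False  # older than the 60-day cutoff
    title = job["title"].lower()
    if not any(kw in title for kw in cfg["new_grad_keywords"]):
        return False  # no new-grad signal in the title
    if cfg["usa_only"] and job.get("country") != "US":
        return False  # non-USA posting
    return True
```

Because the thresholds and keywords live in the config dict, tuning the filter means editing `config.yml`, not Python, which is exactly what guideline 2 below asks for.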
# Main Script (scripts/update_jobs.py)
Data Flow: Config → Multi-API fetch → Filter → Sort → README generation → File write
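The data flow above can be expressed as a minimal end-to-end sketch. All names here (`run_pipeline`, the fetcher callables, the job fields) are hypothetical; the real script's structure may differ.

```python
def run_pipeline(config: dict, fetchers) -> str:
    """Hypothetical flow: multi-API fetch -> filter -> sort -> README table."""
    # Multi-API fetch: each fetcher wraps one source (Greenhouse, Lever, ...)
    jobs = [job for fetch in fetchers for job in fetch(config)]
    # Filter: keep only postings flagged as new-grad
    jobs = [j for j in jobs if j.get("is_new_grad")]
    # Sort: newest postings first
    jobs.sort(key=lambda j: j["posted_at"], reverse=True)
    # README generation: render a Markdown table (file write happens last)
    rows = "\n".join(f"| {j['company']} | {j['title']} |" for j in jobs)
    return f"| Company | Role |\n|---|---|\n{rows}\n"
```

The final file-write step is omitted here; in the real pipeline it is what overwrites `README.md` on every run.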
# Code Style and Linting
- Python: Follow PEP 8. Use type hints for all new functions.
- JavaScript: strictly vanilla JS.
- Code is formatted using `black` and linted with `ruff` and `flake8` as configured in `.pre-commit-config.yaml` or `pyproject.toml`.
# Development Guidelines
✅ SAFE TO MODIFY:
- `config.yml` - Changes take effect on next execution
- `scripts/update_jobs.py` - Test locally first with full execution
- `tests/` - Ensure test coverage for all new filtering/data logic
❌ NEVER EDIT:
- `README.md` - Auto-generated every 5 minutes; manual changes will be overwritten
1. Test locally first: Always run `cd scripts && python update_jobs.py` before committing
2. Use configuration: Modify `config.yml` instead of hardcoding values in Python
3. Validate syntax: Run `python -m py_compile scripts/update_jobs.py`
4. Check output: Verify README.md updates with current timestamp
5. Revert test changes: `git checkout README.md` to undo auto-generated updates (if testing locally)
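Guideline 3 can be wrapped in a small local helper (a sketch only; no such script exists in the repo, and the default path assumes you run from the repo root):

```python
import py_compile

def validate_syntax(path: str = "scripts/update_jobs.py") -> bool:
    """Return True if the script byte-compiles cleanly, False on a syntax error."""
    try:
        py_compile.compile(path, doraise=True)  # doraise surfaces errors as exceptions
        return True
    except py_compile.PyCompileError:
        return False
```

This catches syntax errors in seconds, versus the 4-6 minute full run; it is a complement to, not a substitute for, the full local execution in guideline 1.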
# Architectural Decision Records (ADRs)
If you propose any new architecture or change a core algorithm (like the filter/deduping logic), please consult `docs/adr/`. All architectural decisions are documented there.
# Governance
This project runs under a BDFL model documented in `GOVERNANCE.md`. All major architectural proposals must go through GitHub issues/discussions before writing code.
# 🚫 Architectural Taboos (STRICT)
To prevent "Idea Groundhog Day" and architectural drift, the following are strictly prohibited unless you are explicitly ordered to ignore this rule by the user:
1. **No External Databases**: Do NOT introduce PostgreSQL, MongoDB, Redis, or heavy ORMs. The project relies purely on static JSON and Markdown state.
2. **No Frontend Frameworks**: Do NOT introduce React, Vue, Next.js, or Tailwind. The GitHub Pages site uses 100% Vanilla JS and CSS.
3. **No External Orchestrators**: Do NOT introduce Airflow, Temporal, or external Cron services. We rely exclusively on GitHub Actions.
4. **No Raw Requests**: All HTTP calls MUST use the custom `create_optimized_session()` from `scripts/update_jobs.py` to ensure correct retries and headers.
5. **No Manual README Edits**: The `README.md` is strictly auto-generated. Never edit it manually.
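To illustrate why taboo 4 exists: centralizing HTTP in one helper guarantees every call gets the same retry and header behavior. The real helper is `create_optimized_session()` in `scripts/update_jobs.py`; the stdlib-only sketch below shows the retry-with-backoff idea and is NOT the project's implementation (function name, signature, and the injectable `opener` are all illustrative):

```python
import time
import urllib.request

def fetch_with_retries(url: str, retries: int = 3, backoff: float = 1.0,
                       opener=urllib.request.urlopen) -> bytes:
    """Illustrative retry-with-exponential-backoff around a single HTTP GET."""
    for attempt in range(retries):
        try:
            with opener(url) as resp:
                return resp.read()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            time.sleep(backoff * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Scattering raw, retry-less calls across the codebase would make flaky API responses (common when hitting 70+ job boards) fail the whole 4-6 minute run, which is exactly what the shared session prevents.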