Tanvir/website database api routes #4657
Merged
01a17ba to 9925581
- Extract crawl_website_job to utils/website/jobs.py for better separation
- Add WebsiteCrawlConfig domain model with default values
- Implement selective sync functions for websites (sync_websites_to_tpuf, sync_websites_to_query_index)
- Track website IDs during crawl for incremental syncing
- Update delete operations to use selective deletion
- Add comprehensive test suite (12 route tests + 10 sync tests)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
…ation
- Fixed af951c45da91 to reference 1a06a4d351f9 instead of missing 2d743e49aaa1
- Created merge migration to combine two branches from initial schema
- Regenerated websites table migration with proper revision chain
- Migration chain: 1a06a4d351f9 -> [af951c45da91, 62afaf912daa] -> 7440621afbb0 -> 8e63cf285ea3

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
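For readers unfamiliar with how two migration branches are joined, here is a minimal sketch of what a merge revision module typically looks like under Alembic's conventions. It assumes the project uses Alembic and that 7440621afbb0 is the merge revision, which is only inferred from the chain in the commit message; the actual file in this PR may differ.

```python
"""Sketch of a merge migration (assumed layout, not the file from this PR)."""

# Revision identifiers. Treating 7440621afbb0 as the merge point is an
# assumption read off the chain in the commit message.
revision = "7440621afbb0"
# A tuple of parents is what marks a revision as a merge of two branches.
down_revision = ("af951c45da91", "62afaf912daa")
branch_labels = None
depends_on = None


def upgrade():
    # A pure merge revision carries no schema changes of its own.
    pass


def downgrade():
    # Nothing to undo; downgrading simply re-splits the two branches.
    pass
```

With both parents listed in `down_revision`, the revision graph has a single head again, so the later websites table migration can hang off one linear chain.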
0879e22 to 46c7b9f
Summary
Details
This PR builds on the website crawler infrastructure (#4656) to add the API and database layer:
Database:
- `websites` table migration

API Endpoints:
- `POST /sources/website/{domain}/index` - Start website crawling
- `GET /sources/website/{domain}/status` - Check crawl job status
- `GET /sources/website/{domain}/{website_id}` - Get specific page
- `GET /sources/website/{domain}` - List all indexed pages
- `POST /sources/website/{domain}/reindex` - Re-crawl website
- `DELETE /sources/website/{domain}/delete` - Delete specific website
- `DELETE /sources/website/{domain}/delete-all` - Delete all websites

Features:
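As a rough illustration of how the endpoints listed above fit together, the following sketch maps each action to its HTTP method and path. The route table is copied from the list in this description; the `ROUTES` mapping, the action names, and the `build_request` helper are hypothetical and exist only for this example. Request bodies, authentication, and the base URL are not described in the PR and are left out.

```python
from urllib.parse import quote

# (method, path template) pairs copied from the endpoint list in this PR.
# The action names on the left are made up for this sketch.
ROUTES = {
    "index": ("POST", "/sources/website/{domain}/index"),
    "status": ("GET", "/sources/website/{domain}/status"),
    "get_page": ("GET", "/sources/website/{domain}/{website_id}"),
    "list_pages": ("GET", "/sources/website/{domain}"),
    "reindex": ("POST", "/sources/website/{domain}/reindex"),
    "delete": ("DELETE", "/sources/website/{domain}/delete"),
    "delete_all": ("DELETE", "/sources/website/{domain}/delete-all"),
}


def build_request(action, domain, website_id=None):
    """Return (method, path) for an action (hypothetical helper)."""
    method, template = ROUTES[action]
    # Percent-encode path segments so unusual characters cannot alter the path.
    params = {"domain": quote(domain, safe="")}
    if "{website_id}" in template:
        if website_id is None:
            raise ValueError(f"{action!r} requires a website_id")
        params["website_id"] = quote(str(website_id), safe="")
    return method, template.format(**params)


if __name__ == "__main__":
    for action in ROUTES:
        print(build_request(action, "example.com", website_id="123"))
```

One thing the table makes visible: `/{domain}/status` and `/{domain}/{website_id}` share a shape, so in a router that matches routes in declaration order the literal paths would need to be registered before the parameterized one.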
Dependencies
Test plan