[Portal] Separate LLM content extraction from search data extraction #6854
Codecov Report
All modified and coverable lines are covered by tests ✅
@@           Coverage Diff           @@
##             main    #6854   +/-   ##
=======================================
  Coverage   55.31%   55.31%
=======================================
  Files         896      896
  Lines       57023    57024     +1
  Branches     3971     3968     -3
=======================================
+ Hits        31541    31542     +1
  Misses      25385    25385
  Partials       97       97
PR-Codex overview
This PR focuses on enhancing the extraction of LLM content from HTML files for documentation purposes and refactoring related functionalities.
Detailed summary
- `extractLLMData.ts`: extracts LLM content and writes it to files.
- `extractSearchData.ts`: no longer performs LLM content extraction.
- `package.json`: includes a new script for LLM content extraction.
- `extractContent` in `index.ts`: excludes LLM content.
- `extractContentForLLM` in `llm-extract.ts`: handles the LLM extraction logic.
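
As a rough illustration of the split this summary describes, the sketch below keeps search extraction and LLM extraction in two independent functions. Only the names `extractContent` and `extractContentForLLM` come from the PR; the signatures, the `SearchSection` type, the `stripTags` helper, and the regex-based parsing are assumptions made for the example and are not the PR's actual implementation, which may well use a proper HTML parser.

```typescript
// Hypothetical sketch: function names mirror the PR summary,
// but every signature and body here is an assumption.

type SearchSection = { title: string; content: string };

// Assumed helper: drop tags and collapse whitespace.
// A crude stand-in for a real HTML parser.
function stripTags(html: string): string {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// Search side: returns heading/content pairs for indexing only.
// It produces no LLM-oriented output.
function extractContent(html: string): SearchSection[] {
  const sections: SearchSection[] = [];
  // A heading, then everything up to the next heading or end of input.
  const re = /<h([1-6])[^>]*>([\s\S]*?)<\/h\1>([\s\S]*?)(?=<h[1-6][^>]*>|$)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    sections.push({ title: stripTags(m[2]), content: stripTags(m[3]) });
  }
  return sections;
}

// LLM side: flattens the page into plain text with markdown-style headings.
function extractContentForLLM(html: string): string {
  return html
    // Drop elements that carry no documentation content.
    .replace(/<(script|style|nav)[^>]*>[\s\S]*?<\/\1>/g, "")
    // Turn <hN> headings into lines prefixed with N '#' characters.
    .replace(
      /<h([1-6])[^>]*>([\s\S]*?)<\/h\1>/g,
      (_m: string, level: string, text: string) =>
        `\n${"#".repeat(Number(level))} ${stripTags(text)}\n`,
    )
    // Remove remaining tags, then tidy whitespace line by line.
    .replace(/<[^>]*>/g, " ")
    .replace(/[ \t]+/g, " ")
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .join("\n");
}
```

Because the two functions share only the small helper, a separate script (such as the one the PR adds to `package.json`) could call the LLM extractor and write its output to files without touching the search index path.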