Document avendehut cli tool #1
base: main
Conversation
Co-authored-by: isisosirishorus <[email protected]>
Co-authored-by: isisosirishorus <[email protected]>
.env.example
Outdated
| @@ -0,0 +1,10 @@ | |||
| # OneDrive / Microsoft Graph configuration | |||
| ONEDRIVE_CLIENT_ID= | |||
@cursoragent add TODOs as placeholders
I significantly enhanced the application by implementing full OneDrive integration, allowing it to list and process files directly from Microsoft Graph. This involved adding new dependencies and updating configuration examples, alongside minor CLI improvements.
- I replaced the OneDrive stub in `avendehut/utils/onedrive.py` with a functional implementation using `msgraph-sdk` and `azure-identity`. This includes methods for authenticating via `DeviceCodeCredential`, handling paginated results from the Graph API, and recursively listing files from a specified OneDrive path.
- I added `azure-identity` and `msgraph-sdk` to `pyproject.toml` to support the new OneDrive integration.
- I updated `.env.example` to include `ONEDRIVE_TENANT_ID` and added `TODO` comments for clearer setup instructions.
- I standardized the help options across all `click` commands in `avendehut/cli.py` and `avendehut/commands/*.py` by adding `"-help"` to `context_settings`.
- I updated the repository URL in `README.md` to reflect the correct GitHub path.
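For readers following the thread, the paginated, recursive listing described above can be sketched without the SDK. This is only an illustration, not the code in `avendehut/utils/onedrive.py`: `fetch_page` is a hypothetical callable that returns a Graph-style dict with a `value` list and an optional `@odata.nextLink`, and the `children_url` key is an invented stand-in for however a folder's children are addressed.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

Page = Dict[str, Any]


def iter_files(fetch_page: Callable[[str], Page], start_url: str) -> Iterator[str]:
    """Yield file paths, following @odata.nextLink within every folder."""
    # Stack of (listing URL, path prefix) pairs still to be visited.
    pending: List[Tuple[str, str]] = [(start_url, "")]
    while pending:
        url, prefix = pending.pop()
        next_url: Optional[str] = url
        while next_url:
            page = fetch_page(next_url)
            for item in page.get("value", []):
                path = f"{prefix}/{item['name']}"
                if "folder" in item:
                    # Hypothetical key: where this folder's children are listed.
                    pending.append((item["children_url"], path))
                else:
                    yield path
            # A missing nextLink means this folder has no more pages.
            next_url = page.get("@odata.nextLink")
```

Using an explicit stack instead of recursion keeps deep folder trees from hitting Python's recursion limit, and the inner loop is the only place that needs to know about paging.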
README.md
Outdated
| 1. Clone the repository. | ||
| ```bash | ||
| git clone https://github.com/yourusername/avendehut.git |
@cursoragent Put the right path
avendehut/commands/build.py
Outdated
| @click.command() | ||
| @click.option("--src", type=click.Path(exists=True, file_okay=False, path_type=Path), required=True, help="Source folder (local or onedrive:/path)") |
@cursoragent for all the commands, add a -help parameter to describe the options available
avendehut/utils/onedrive.py
Outdated
| def list_onedrive_files(prefix_path: str) -> Iterable[Path]: # pragma: no cover - stub | ||
| raise NotImplementedError("OneDrive listing not implemented. Use local paths for now.") |
@cursoragent implement this. Use the official SDK to iterate pages and return the resulting files in each folder
Co-authored-by: isisosirishorus <[email protected]>
avendehut/utils/onedrive.py
Outdated
| def _get_graph_client() -> GraphServiceClient: | ||
| client_id = os.environ["ONEDRIVE_CLIENT_ID"] | ||
| tenant_id = os.environ.get("ONEDRIVE_TENANT_ID", "consumers") |
@cursoragent we are using client ID/client secret, not interactive
avendehut/utils/onedrive.py
Outdated
| try: | ||
| token = credential.get_token("Files.Read.All") | ||
| except Exception: | ||
| cred = DeviceCodeCredential(client_id=os.environ["ONEDRIVE_CLIENT_ID"], tenant_id=os.environ.get("ONEDRIVE_TENANT_ID", "consumers")) |
@cursoragent don't repeat code. Do a clean and better iteration for this
pyproject.toml
Outdated
| name = "avendehut" | ||
| version = "0.1.0" | ||
| description = "CLI to scan books, extract metadata, and generate a searchable HTML catalog." | ||
| authors = ["Your Name <[email protected]>"] |
@cursoragent populate with my data. You know which repo you are in
I significantly refactored the OneDrive integration to use app-only authentication via ClientSecretCredential, enhancing automation capabilities by removing interactive user prompts. I also improved the README.md with practical usage examples for the CLI commands.
- I updated `avendehut/utils/onedrive.py` to switch from `DeviceCodeCredential` (interactive) to `ClientSecretCredential` for OneDrive authentication, enabling app-only access. This now requires `ONEDRIVE_CLIENT_SECRET` and `ONEDRIVE_TENANT_ID` environment variables.
- I refactored the OneDrive file listing logic in `_iterate_children` within `avendehut/utils/onedrive.py` to leverage the Microsoft Graph SDK's built-in pagination, removing the need for manual `requests` calls to follow `@odata.nextLink`.
- I added a new "Usage" section to `README.md` with `bash` examples for `avendehut build`, `open`, `search`, and `export` commands to provide clear guidance.
- I updated the `authors`, `homepage`, and `repository` fields in `pyproject.toml` to reflect the `khnumdev` profile.
4 tasks completed
- Refactor OneDrive to client secret and clean pagination
- Add basic Usage section to README
- Update pyproject authors and repo to khnumdev
- Run test suite after changes
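Since app-only authentication depends on three environment variables, one way to fail early with a clear message is to validate them in a single place. The sketch below is an assumption about how that could look, not the PR's code: `OneDriveConfig` and `load_onedrive_config` are hypothetical names, and treating `ONEDRIVE_TENANT_ID` as strictly required (with no fallback) is a choice made for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_VARS = ("ONEDRIVE_CLIENT_ID", "ONEDRIVE_CLIENT_SECRET", "ONEDRIVE_TENANT_ID")


@dataclass(frozen=True)
class OneDriveConfig:
    """Hypothetical container for the app-only credentials."""

    client_id: str
    client_secret: str
    tenant_id: str


def load_onedrive_config(env: Mapping[str, str] = os.environ) -> OneDriveConfig:
    """Read the three settings once and report every missing one together."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing OneDrive settings: " + ", ".join(missing))
    return OneDriveConfig(
        client_id=env["ONEDRIVE_CLIENT_ID"],
        client_secret=env["ONEDRIVE_CLIENT_SECRET"],
        tenant_id=env["ONEDRIVE_TENANT_ID"],
    )
```

The resulting values would then be handed to `azure.identity.ClientSecretCredential(tenant_id, client_id, client_secret)`; keeping the lookup in one function also addresses the earlier "don't repeat code" comment about duplicated `os.environ` reads.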
| - `OUT_FOLDER` (local output path) | ||
| - Register an app and configure permissions by following Microsoft's official docs: [Register an application](https://learn.microsoft.com/en-us/graph/auth-register-app-v2). | ||
| ### CLI Commands |
@cursoragent add a basic "usage" section
@cursoragent add a basic usage section, covering processing from local folders and from OneDrive
Co-authored-by: isisosirishorus <[email protected]>
| ### Installation | ||
| 1. Clone the repository. |
@cursoragent This should be for development.
Ideally, I'd like to have this as a package published.
- Prepare CI with an action to publish package in some python repository
- Add here instructions to install that package
@cursoragent still pending
You need to increase your spend limit or enable usage-based billing to run background agents. Go to Cursor
| reader = PdfReader(str(path)) | ||
| info = reader.metadata or {} | ||
| title = (getattr(info, "title", None) or path.stem) | ||
| author = getattr(info, "author", None) |
@cursoragent use other attributes besides these, such as:
- Subject: A brief description or the central idea of the document.
- Keywords/Tags: Terms that help categorize and search for the document's content.
- Creation Date: The date and time the document was first created.
- Modification Date: The date and time the document was last changed.
- Creator/Application: The software that created the original document (e.g., Microsoft Word).
- Producer: The software that converted the document into its PDF format.
- File Name: The name of the file itself.
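A possible shape for this request, following the `getattr` pattern already used in the diff above: collect every field in one pass and fall back to `None` when the metadata object lacks it. The attribute names below are assumed to match what the PDF library exposes on `reader.metadata`; `describe_pdf` and `FIELDS` are hypothetical names, and any attribute the installed library version does not provide simply comes back as `None`.

```python
from pathlib import Path
from typing import Any, Dict

# Assumed attribute names on the metadata object; missing ones yield None.
FIELDS = (
    "title",
    "author",
    "subject",
    "keywords",
    "creator",
    "producer",
    "creation_date",
    "modification_date",
)


def describe_pdf(info: Any, path: Path) -> Dict[str, Any]:
    """Build a flat metadata record from a metadata object and the file path."""
    record: Dict[str, Any] = {name: getattr(info, name, None) for name in FIELDS}
    # Same fallback as the existing code: use the file stem when no title is set.
    record["title"] = record["title"] or path.stem
    record["file_name"] = path.name
    return record
```

Because the function only relies on attribute access, it also works with the `{}` fallback from `reader.metadata or {}`: every field is `None` except the title and file name derived from the path.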
| title: str | ||
| authors: List[str] | ||
| year: Optional[int] | ||
| language: Optional[str] |
@cursoragent it may happen that the same book is present in different formats, like EPUB or PDF. Normally the file name is the same for both, but with different extensions.
Probably the model should be updated to a "book" that holds its different formats, each with a link to the file
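To make the suggestion concrete, here is a minimal sketch of such a model, assuming files that share a stem (compared case-insensitively) are the same book. `Book`, `formats`, and `group_by_stem` are hypothetical names for illustration and are not taken from the PR.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, Iterable, List


@dataclass
class Book:
    """One logical book with a file per available format."""

    title: str
    # Maps a lowercase extension without the dot ("pdf", "epub") to its file.
    formats: Dict[str, Path] = field(default_factory=dict)


def group_by_stem(paths: Iterable[Path]) -> List[Book]:
    """Merge files that differ only by extension into a single Book."""
    books: Dict[str, Book] = {}
    for path in paths:
        key = path.stem.lower()
        if key not in books:
            books[key] = Book(title=path.stem)
        books[key].formats[path.suffix.lower().lstrip(".")] = path
    return list(books.values())
```

Grouping on the stem alone is a simplification: two unrelated books with the same file name in different folders would be merged, so a real implementation might also compare extracted titles or authors.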
| def iter_source_files(src: Path) -> Iterable[Path]: | ||
| for root, _dirs, files in os.walk(src): |
@cursoragent is this using OneDrive or local? It should be chosen based on the path provided
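One way to pick the backend from the path, sketched under the assumption that remote sources use the `onedrive:/path` form mentioned in the `--src` help text. The `list_remote` parameter is a hypothetical hook standing in for the OneDrive listing function, so this example runs without the SDK.

```python
import os
from pathlib import Path
from typing import Callable, Iterable, Iterator

ONEDRIVE_PREFIX = "onedrive:"


def iter_source_files(
    src: str, list_remote: Callable[[str], Iterable[Path]]
) -> Iterator[Path]:
    """Yield files from OneDrive when src has the prefix, else walk locally."""
    if src.startswith(ONEDRIVE_PREFIX):
        # Strip the scheme-like prefix and delegate to the remote lister.
        yield from list_remote(src[len(ONEDRIVE_PREFIX):])
        return
    for root, _dirs, files in os.walk(src):
        for name in files:
            yield Path(root) / name
```

Note that this takes `src` as a plain string: a `click.Path(exists=True)` option would reject an `onedrive:/...` value before the function is ever called.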
Co-authored-by: isisosirishorus <[email protected]>
…nt-fixes-e297 Apply pending review comment fixes
…Drive opt-in.
- Support -h/-help/--help across commands; document help usage.
- Fix README repo path and clarify OneDrive env and setup.
- Add .env.example placeholders and tidy variables.
- Make OneDrive deps optional via lazy imports; clarify build src type.
- Minor robustness in build: validate src, error on onedrive path until sync.
- Ensure EPUB parsing lazy-imports ebooklib to speed tests.
@copilot apply pending feedback and finish the PR. Ensure this is working; run any tests required.
Add initial `README.md` and `CONTRIBUTORS.md` documentation for the `avendehut` CLI tool.