-
Notifications
You must be signed in to change notification settings - Fork 264
attachment listing,download and reading from gmail inbox functionality added #151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
attachment listing,download and reading from gmail inbox functionality added #151
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds Gmail attachment handling functionality to enable users to interact with email attachments programmatically. The changes introduce comprehensive attachment management capabilities including listing, downloading, and reading attachment content from various file formats.
- Added attachment listing and metadata extraction from Gmail messages
- Implemented attachment download functionality with security controls and file size limits
- Added content reading support for multiple file formats (PDF, DOCX, XLSX, CSV, HTML, JSON, text files)
Reviewed Changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| pyproject.toml | Added dependencies for document processing libraries (beautifulsoup4, mammoth, pandas, pdfplumber, pyxlsb) |
| gmail/gmail_tools.py | Implemented comprehensive attachment handling with 6 new functions and support for multiple file formats |
Comments suppressed due to low confidence (1)
gmail/gmail_tools.py:833
- The variable
attachment_idmay be undefined in the error case if the attachment lookup failed before assignment. Initializeattachment_idat the beginning of the function or use a default value.
'attachment_id': attachment_id,
|
|
||
| return body_data | ||
|
|
||
|
|
Copilot
AI
Aug 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function _extract_message_body is defined twice (lines 76 and 111). The second definition will override the first, potentially causing unexpected behavior. Remove the duplicate function definition.
| # (Lines 111-144 removed; no replacement needed) |
| import base64 | ||
| import ssl | ||
| from typing import Optional, List, Dict, Literal | ||
| from typing import Optional, List, Dict, Literal , Any , Tuple |
Copilot
AI
Aug 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Inconsistent spacing around commas in the import statement. Should be consistent spacing: Literal, Any, Tuple instead of Literal , Any , Tuple.
| from typing import Optional, List, Dict, Literal , Any , Tuple | |
| from typing import Optional, List, Dict, Literal, Any, Tuple |
| attachment_info = att | ||
| break | ||
|
|
||
| logger.info(f"Found {attachment_info} attachments in message {message_id}") |
Copilot
AI
Aug 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log message is incorrect. attachment_info is a single attachment object, not a count. Should be len(attachments) to log the actual count of attachments.
| logger.info(f"Found {attachment_info} attachments in message {message_id}") | |
| logger.info(f"Found {len(attachments)} attachments in message {message_id}") |
| attachment_id = attachment_info['attachment_id'] # Update to correct ID | ||
| else: | ||
| raise Exception(f"Attachment {attachment_id} not found in message {message_id}") | ||
|
|
Copilot
AI
Aug 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line will raise a TypeError if attachment_info is None (when attachment is not found), but the None check happens after this assignment. Move this assignment inside the if block after the None check.
| if not attachment_info: | |
| logger.error(f"Attachment {attachment_name} not found. Available attachments: {[att['attachment_id'] for att in attachments]}") | |
| # Try to find by filename if attachment_id doesn't match | |
| if attachments: | |
| logger.info(f"Using first available attachment: {attachments[0]['filename']}") | |
| attachment_info = attachments[0] | |
| else: | |
| raise Exception(f"Attachment {attachment_name} not found in message {message_id}") | |
| attachment_id = attachment_info['attachment_id'] |
| attachment_id = attachment_info['attachment_id'] #if attachment_info else attachment_name | ||
|
|
||
| if not attachment_info: | ||
| logger.error(f"Attachment {attachment_id} not found. Available attachments: {[att['attachment_id'] for att in attachments]}") |
Copilot
AI
Aug 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable attachment_id may be undefined if attachment_info is None and the previous line failed. Use attachment_name instead since that's the parameter that was passed in.
| result = await read_gmail_attachment_content( | ||
| service=service, | ||
| message_id=message_id, | ||
| attachment_id=attachment['attachment_id'], |
Copilot
AI
Aug 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function read_gmail_attachment_content expects attachment_name parameter, but attachment_id is being passed. This will cause the function call to fail.
| attachment_id=attachment['attachment_id'], | |
| attachment_name=attachment['filename'], |
| CSV_AVAILABLE = False | ||
|
|
||
|
|
||
|
|
Copilot
AI
Aug 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The csv module is part of Python's standard library and should not be wrapped in a try-except ImportError block. This will never raise an ImportError.
| import csv | |
| CSV_AVAILABLE = True |
|
|
||
| result['saved_path'] = str(save_path) | ||
| logger.info(f"Attachment saved to {save_path}") | ||
|
|
Copilot
AI
Aug 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Path traversal protection is incomplete. The check only ensures the resolved path is under the base directory, but malicious filenames could still overwrite important files in subdirectories. Consider sanitizing the filename itself and using a dedicated downloads directory.
| # Use a dedicated downloads directory | |
| downloads_dir = Path.cwd() / "downloads" | |
| downloads_dir.mkdir(parents=True, exist_ok=True) | |
| # Sanitize the filename | |
| def sanitize_filename(filename): | |
| # Remove any path components | |
| filename = os.path.basename(filename) | |
| # Allow only safe characters: letters, numbers, dot, dash, underscore | |
| filename = re.sub(r'[^A-Za-z0-9._-]', '_', filename) | |
| return filename | |
| safe_filename = sanitize_filename(str(save_path)) | |
| final_path = downloads_dir / safe_filename | |
| with open(final_path, 'wb') as f: | |
| f.write(file_data) | |
| result['saved_path'] = str(final_path) | |
| logger.info(f"Attachment saved to {final_path}") |
|
Love the idea, thanks for the PR! I'm a bit concerned about how much overhead this adds though - it's introducing a huge number of new and in some cases (ie pandas) very heavy dependencies. Also looks like some redundant code. Maybe a refac is in order? |
Added Gmail Attachment Handling Functionality
Implemented listing of email attachments from the Gmail inbox
Added functionality to download attachments
Enabled reading and processing of attachment content