-
Notifications
You must be signed in to change notification settings - Fork 122
feat: Google drive impersonator #733
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
…ithub.com/deepsense-ai/ragbits into extension/source/googledrive/impersonator
…r environment variable checks
Code Coverage Summary
Diff against main
Results for commit: 6316196 Minimum allowed coverage is ♻️ This comment has been updated with latest results |
Trivy scanning results. Report Summary ┌───────────────────┬──────┬─────────────────┬─────────┐
For OSS Maintainers: VEX NoticeIf you're an OSS maintainer and Trivy has detected vulnerabilities in your project that you believe are not actually exploitable, consider issuing a VEX (Vulnerability Exploitability eXchange) statement. To disable this notice, set the TRIVY_DISABLE_VEX_NOTICE environment variable. uv.lock (uv)Total: 22 (MEDIUM: 16, HIGH: 4, CRITICAL: 2) ┌──────────────┬────────────────┬──────────┬────────┬───────────────────┬───────────────┬──────────────────────────────────────────────────────────────┐ |
…gleDriveSource - Introduced GoogleDriveExportFormat enum to standardize export MIME types. - Updated _GOOGLE_EXPORT_MIME_MAP and _EXPORT_EXTENSION_MAP to use the new enum. - Modified fetch method to accept an optional export_format parameter for overriding MIME types. - Enhanced _determine_file_extension method to support MIME type overrides. - Added unit test for verifying correct file extension determination with overridden MIME types.
Co-authored-by: Konrad Czarnota <[email protected]> Co-authored-by: GlockPL <[email protected]> Co-authored-by: GlockPL <[email protected]>
|
||
@classmethod | ||
def set_credentials_file_path(cls, path: str) -> None: | ||
"""Set the path to the service account credentials file.""" | ||
cls._credentials_file_path = path | ||
|
||
@classmethod | ||
def set_impersonation_target(cls, target_mail: str) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be instance level property, not on a class level.
Why? Because user may want to impersonate as different account to reach specific file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please modify this impersonate_target_email to make it an instance property.
For other stuff I'm open to discussion
from ragbits.core.sources.google_drive import GoogleDriveSource | ||
|
||
target_email = "[email protected]" | ||
credentials_file = "service-account-key.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd put link to the part about creating this credential_file here.
This how-to is very long and I believe someone may forgot what this credential file is all about long ago. Or if they come directly to this place, they're cooked
ValueError: If the provided email address is invalid (empty or missing '@'). | ||
""" | ||
# check if email is a valid email. | ||
if not target_mail or "@" not in target_mail: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use a regex for this @.@?
# check if email is a valid email. | ||
if not target_mail or "@" not in target_mail: | ||
raise ValueError("Invalid email address provided for impersonation.") | ||
cls.impersonate = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this variable at all, couldn't we just see if impersonate_target_email is None below?
async def fetch( | ||
self, | ||
*, | ||
export_format: "GoogleDriveExportFormat | None" = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this in quotes?
@@ -186,7 +249,8 @@ async def fetch(self) -> Path: | |||
file_local_dir = local_dir / self.file_id | |||
file_local_dir.mkdir(parents=True, exist_ok=True) | |||
|
|||
export_mime_type, file_extension = self._determine_file_extension() | |||
override_mime = export_format.value if export_format else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we extract value from export_format? Can't we pass it directly to _determine_file_extension, and expect this Enum type?
I don't get it because in _determine_file_extension we make it Enum again
@@ -62,12 +86,31 @@ class GoogleDriveSource(Source): | |||
|
|||
_google_drive_client: ClassVar["GoogleAPIResource | None"] = None | |||
_credentials_file_path: ClassVar[str | None] = None | |||
impersonate: ClassVar[bool | None] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if we want to make this instance property. Impersonate (imo not needed) and impersonate_target_email shouldn't be class variables here anymore as then this will be non thread safe.
@@ -186,7 +249,8 @@ async def fetch(self) -> Path: | |||
file_local_dir = local_dir / self.file_id | |||
file_local_dir.mkdir(parents=True, exist_ok=True) | |||
|
|||
export_mime_type, file_extension = self._determine_file_extension() | |||
override_mime = export_format.value if export_format else None | |||
export_mime_type, file_extension = self._determine_file_extension(override_mime=override_mime) | |||
local_file_name = f"{self.file_name}{file_extension}" | |||
path = file_local_dir / local_file_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If my file has name file.pdf
will it be saved as file.pdf.pdf
?
export_mime_type = override_mime | ||
try: | ||
export_format = GoogleDriveExportFormat(override_mime) | ||
file_extension = _EXPORT_EXTENSION_MAP.get(export_format, ".bin") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not for this Pr but why do we assume .bin is a default extension?
…se-ai/ragbits into feat/gdrive-impersonator
Co-authored-by: Konrad Czarnota <[email protected]> Co-authored-by: GlockPL <[email protected]> Co-authored-by: GlockPL <[email protected]>
Co-authored-by: GlockPL <[email protected]>
…el defaults and instance-specific settings - Introduced class-level default impersonation target for GoogleDriveSource. - Added options for setting impersonation on specific instances and during source creation. - Improved email validation for impersonation targets. - Updated tests to reflect changes in impersonation handling.
…se-ai/ragbits into feat/gdrive-impersonator
- Introduced `set_default_impersonation_target` class method to set a default email for impersonation. - Added email validation to ensure the provided address is valid. - Updated the client initialization to use the impersonation target email. - Adjusted unit tests to reflect changes in method signatures.
…o feat/gdrive-impersonator
…or (#769) Co-authored-by: GlockPL <[email protected]> Co-authored-by: Konrad Czarnota <[email protected]> Co-authored-by: GlockPL <[email protected]> Co-authored-by: akotyla <[email protected]> Co-authored-by: jakubduda-dsai <[email protected]> Co-authored-by: ds-sebastianchwilczynski <[email protected]> Co-authored-by: dazy-ds <[email protected]>
Co-authored-by: GlockPL <[email protected]> Co-authored-by: Konrad Czarnota <[email protected]> Co-authored-by: akotyla <[email protected]> Co-authored-by: jakubduda-dsai <[email protected]> Co-authored-by: ds-sebastianchwilczynski <[email protected]> Co-authored-by: dazy-ds <[email protected]>
Co-authored-by: GlockPL <[email protected]> Co-authored-by: Mateusz Hordyński <[email protected]>
No description provided.