leftler commented Oct 28, 2025

Add save_drive_file_to_disk tool for downloading files directly to filesystem

Closes #254

Problem

The existing get_drive_file_content tool downloads Google Drive files to memory (io.BytesIO) and attempts to return content as text. This approach has several limitations:

  • Memory constraints: Large files consume excessive memory by loading entire contents into RAM
  • Binary file handling: Binary files cannot be properly handled or saved locally
  • Context window pollution: AI context gets filled with file content that may not be analyzable
  • No local file access: No way to save files directly to disk for subsequent processing by other tools

Solution

This PR adds a new save_drive_file_to_disk MCP tool that downloads Google Drive files directly to the local filesystem without streaming content through the AI context.

Features

  • Direct disk download: Uses MediaIoBaseDownload with a file handle to write chunks directly to disk
  • Smart path handling: Resolves relative paths against the workspace root automatically
  • Native file support: Handles both Google native files (via export) and regular files
  • Automatic directory creation: Creates missing parent directories as needed
  • Shared drive support: Full support for shared drives with supportsAllDrives=True
  • Detailed feedback: Returns a success message with file metadata and the saved location
  • Export format mapping: Automatically exports Google Docs/Sheets/Slides to Office formats

Technical Implementation

New function: save_drive_file_to_disk()

  • Location: gdrive/drive_tools.py (lines 412-509)
  • Parameters:
    • user_google_email: User's Google email (required)
    • file_id: Google Drive file ID (required)
    • local_file_path: Local destination path (required)
  • Tool tier: Added to extended tier in core/tool_tiers.yaml

Export format mappings:

  • Google Docs → .docx (Word)
  • Google Sheets → .xlsx (Excel)
  • Google Slides → .pptx (PowerPoint)
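
For readers who want to see the shape of the download path, here is a minimal sketch of the flow described above. It is not the code from this PR: the names EXPORT_FORMATS, build_download_request, and download_to_disk are hypothetical, and `service` is assumed to be an already-authenticated Drive v3 client from google-api-python-client. The googleapiclient import is deferred only so the sketch can be loaded without the client library installed; in the repo it would sit at the top of the module.

```python
import os

# Google-native MIME type -> (export MIME type, file extension).
# Mirrors the Docs/Sheets/Slides mapping listed above.
EXPORT_FORMATS = {
    "application/vnd.google-apps.document": (
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        ".docx",
    ),
    "application/vnd.google-apps.spreadsheet": (
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        ".xlsx",
    ),
    "application/vnd.google-apps.presentation": (
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",
        ".pptx",
    ),
}


def build_download_request(service, file_id, mime_type):
    """Return an export request for native files, a media request otherwise."""
    if mime_type in EXPORT_FORMATS:
        export_mime, _ext = EXPORT_FORMATS[mime_type]
        return service.files().export_media(fileId=file_id, mimeType=export_mime)
    return service.files().get_media(fileId=file_id, supportsAllDrives=True)


def download_to_disk(service, file_id, mime_type, destination):
    """Stream a Drive file to `destination` chunk by chunk; return bytes written."""
    # Deferred import (sketch only): keeps the mapping usable without the library.
    from googleapiclient.http import MediaIoBaseDownload

    parent = os.path.dirname(destination)
    if parent:
        os.makedirs(parent, exist_ok=True)  # automatic directory creation

    request = build_download_request(service, file_id, mime_type)
    with open(destination, "wb") as fh:
        # Each next_chunk() call writes one chunk to fh, so the whole file
        # is never held in memory at once.
        downloader = MediaIoBaseDownload(fh, request)
        done = False
        while not done:
            _status, done = downloader.next_chunk()
    return os.path.getsize(destination)
```

The difference from get_drive_file_content is only the target of MediaIoBaseDownload: a file opened in "wb" mode instead of io.BytesIO.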

Use Cases

This feature is particularly useful for:

  • 📦 Downloading large datasets or media files for local processing
  • 🖼️ Saving binary files (images, PDFs, archives) that cannot be text-analyzed
  • 🤖 Building automation workflows that need files on disk
  • ⚙️ Batch processing operations on multiple Drive files
  • 💾 Archiving or backup operations

Code Changes

Files modified:

  • gdrive/drive_tools.py: +97 lines (new function)
  • core/tool_tiers.yaml: +1 line (tool tier registration)

No conflicts: This PR adds new functionality without modifying existing tools. It complements recent upload improvements from PR #239.

Testing

Tested with:

  • ✅ Large files (>100MB)
  • ✅ Binary files (images, PDFs)
  • ✅ Google native files (Docs, Sheets, Slides)
  • ✅ Shared drive files
  • ✅ Relative and absolute paths
  • ✅ Directory auto-creation

Example Usage

# Download a large image file
save_drive_file_to_disk(
    user_google_email="[email protected]",
    file_id="1ABC123xyz",
    local_file_path="downloads/image.jpg"
)

# Export Google Doc to Word
save_drive_file_to_disk(
    user_google_email="[email protected]", 
    file_id="1DEF456xyz",
    local_file_path="docs/report.docx"
)

Checklist

  • Code follows project style guidelines
  • Function properly decorated with MCP server registration
  • Error handling implemented with @handle_http_errors
  • Authentication handled via @require_google_service
  • Logging added for debugging
  • No conflicts with existing functionality
  • Tool tier configuration updated
  • Rebased on latest main branch

Related

taylorwilsdon requested review from Copilot and taylorwilsdon, and removed request for Copilot, November 3, 2025 19:33
taylorwilsdon self-assigned this Nov 3, 2025
taylorwilsdon (Owner)

Appreciate this, @leftler. A couple of nits to pick, but we can get this in once they're fixed!

    Returns:
        str: Success message with file details and saved location.
    """
    import os
taylorwilsdon (Owner):

imports go at the top

    if not os.path.isabs(local_file_path):
        # Server runs from google-workspace-mcp-dev/google_workspace_mcp
        # This file is in gdrive/drive_tools.py, so workspace root is ../../../ from here
        workspace_root = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "..", ".."))
taylorwilsdon (Owner):

This looks specific to your environment: there is no established pattern of nesting google_workspace_mcp inside another folder, and this calculates the workspace root incorrectly. The file is at gdrive/drive_tools.py, so:

  • os.path.dirname(__file__) → gdrive/
  • "..", "..", ".." → goes up 3 levels from gdrive/, but gdrive/ is only 1 level deep in the actual repo structure, so the result lands two directories above the repo root
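
The path arithmetic in this comment can be checked with a small sketch. The checkout location below is hypothetical and chosen only for illustration; posixpath is used so the result is the same on any platform.

```python
import posixpath

# Hypothetical checkout location, for illustration only.
module_file = "/home/user/projects/google_workspace_mcp/gdrive/drive_tools.py"
module_dir = posixpath.dirname(module_file)  # .../google_workspace_mcp/gdrive


def up(path, levels):
    """Return `path` after climbing `levels` parent directories."""
    return posixpath.normpath(posixpath.join(path, *([".."] * levels)))


one_up = up(module_dir, 1)    # the repo root
three_up = up(module_dir, 3)  # what the PR currently computes

print(one_up)    # /home/user/projects/google_workspace_mcp
print(three_up)  # /home/user
```

With the flat layout of the actual repo, a single ".." from gdrive/ already reaches the repo root.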

        # Server runs from google-workspace-mcp-dev/google_workspace_mcp
        # This file is in gdrive/drive_tools.py, so workspace root is ../../../ from here
        workspace_root = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "..", ".."))
        local_file_path = os.path.join(workspace_root, local_file_path)
taylorwilsdon (Owner):

We should probably restrict or validate the resolved path somehow. As it stands, this introduces a major attack vector: a malicious payload could have it write to /etc/passwd, firewall config, sudoers files, etc.
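
One possible shape for such a check is sketched below. This is an assumption about how the validation could look, not something proposed in the thread: resolve_within is a hypothetical helper, and the allowed base directory would need to come from configuration the project does not define here. It resolves the requested path against the base and rejects anything that ends up outside it, including absolute paths and ".." traversal.

```python
from pathlib import Path


def resolve_within(base_dir, requested_path):
    """Resolve `requested_path` under `base_dir`, or raise ValueError.

    Hypothetical helper: rejects absolute paths and '..' traversal that
    would place the final path outside `base_dir`.
    """
    base = Path(base_dir).resolve()
    # Joining with an absolute path discards `base`, so "/etc/passwd"
    # stays "/etc/passwd" and fails the containment check below.
    candidate = (base / requested_path).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(
            f"{requested_path!r} resolves outside {base}"
        ) from None
    return candidate
```

Because resolve() also follows symlinks that already exist, a link inside the base directory that points elsewhere is rejected too. A file created between the check and the write is not covered by this sketch.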

leftler (Author) commented Nov 4, 2025 via email
