pasie15/google-drive-video-copier

Google Drive Video Copier

A Python tool that copies video files from one Google Drive folder to another. It supports service account and OAuth client authentication, recursive folder traversal, and configurable rate limiting, recovery, and logging.

Features

  • Dual Authentication Support: Use either a service account or OAuth client for authentication
  • Recursive Folder Traversal: Search for video files in subfolders
  • File Copy Functionality:
    • Conflict resolution (skip existing files)
    • Progress logging
    • Advanced rate limiting with exponential backoff, jitter, and circuit breaker pattern
    • Recovery mechanism for failed operations
  • Command-line Interface with various options
  • Integration with MCP Tools for error reporting
  • Unit Test Scaffolding for core functionality

Requirements

  • Python 3.6+
  • Google API Python Client
  • Google Auth Library
  • tqdm (for progress bars)

Install the dependencies with pip:

pip install google-api-python-client google-auth google-auth-oauthlib google-auth-httplib2 tqdm

Authentication Setup

Service Account Authentication

  1. Create a service account in the Google Cloud Console
  2. Download the service account JSON key file
  3. Share the Google Drive folders with the service account email

OAuth Client Authentication

  1. Create an OAuth client ID in the Google Cloud Console
  2. Download the client secrets JSON file
  3. On the first run, the tool opens a browser window for authentication and stores the resulting token locally (token.json by default)
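
The two authentication paths above could be sketched roughly as follows. This is an illustration, not the code in auth.py: the helper name load_credentials, the full-Drive scope, and the way the token is cached are all assumptions. The Google libraries are imported lazily so that the function can be defined without them installed.

```python
SCOPES = ["https://www.googleapis.com/auth/drive"]  # assumed scope


def load_credentials(kind,
                     service_account_file="service_account.json",
                     client_secrets_file="google_desktop_client1.json",
                     token_file="token.json"):
    """Return Google credentials for the given type (hypothetical helper)."""
    if kind == "service_account":
        # Lazy import: only needed when this branch is taken.
        from google.oauth2 import service_account
        return service_account.Credentials.from_service_account_file(
            service_account_file, scopes=SCOPES)

    if kind == "desktop_client":
        import os
        from google.oauth2.credentials import Credentials
        from google_auth_oauthlib.flow import InstalledAppFlow

        # Reuse a cached token if one exists. Refreshing an expired
        # token is not handled in this sketch.
        if os.path.exists(token_file):
            return Credentials.from_authorized_user_file(token_file, SCOPES)

        flow = InstalledAppFlow.from_client_secrets_file(
            client_secrets_file, SCOPES)
        creds = flow.run_local_server(port=0)  # opens a browser window
        with open(token_file, "w") as fh:
            fh.write(creds.to_json())
        return creds

    raise ValueError(f"unknown credentials type: {kind!r}")
```

The returned object can be passed to googleapiclient.discovery.build("drive", "v3", credentials=creds) to obtain a Drive client.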

Usage

python gdrive_copier.py --source-id <source_folder_id> --dest-id <destination_folder_id> [options]

Required Arguments

  • --source-id: ID of the source folder in Google Drive
  • --dest-id: ID of the destination folder in Google Drive

Authentication Options

  • --credentials: Type of credentials to use (service_account or desktop_client, default: service_account)
  • --service-account-file: Path to the service account JSON file (default: service_account.json)
  • --client-secrets-file: Path to the client secrets JSON file (default: google_desktop_client1.json)
  • --token-file: Path to the token file for OAuth client (default: token.json)

Operation Options

  • --dry-run: Perform a dry run without copying files
  • --no-skip-existing: Do not skip existing files in the destination folder
  • --no-recursive: Do not search recursively in subfolders
  • --mime-type: MIME type filter (default: video/)
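
To make the MIME type filter concrete, here is one way a Drive API v3 search query for a folder might be built. The function names are hypothetical and the tool's actual query may differ. The sketch relies only on the documented Drive query syntax (in parents, mimeType contains, trashed).

```python
FOLDER_MIME = "application/vnd.google-apps.folder"


def _escape(value):
    """Escape backslashes and single quotes for a Drive query string."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def build_file_query(folder_id, mime_prefix="video/"):
    """Query for non-trashed files in folder_id whose MIME type matches."""
    return (f"'{_escape(folder_id)}' in parents"
            f" and mimeType contains '{_escape(mime_prefix)}'"
            " and trashed = false")


def build_subfolder_query(folder_id):
    """Query for subfolders of folder_id, used for recursive traversal."""
    return (f"'{_escape(folder_id)}' in parents"
            f" and mimeType = '{FOLDER_MIME}'"
            " and trashed = false")
```

Either string would be passed as the q parameter of files().list(). With recursion disabled, only the first query is needed.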

Rate Limiting Options

  • --no-rate-limit: Disable rate limiting
  • --initial-delay: Initial delay in seconds for rate limiting (default: 1.0)
  • --max-delay: Maximum delay in seconds for rate limiting (default: 60.0)
  • --max-retries: Maximum number of retries for rate limiting (default: 10)
  • --batch-size: Number of operations per batch (default: 10)
  • --inter-batch-delay: Delay between batches in seconds (default: 60.0)
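
As a rough illustration of how --batch-size and --inter-batch-delay might interact, the sketch below processes items in fixed-size batches and pauses between them. The function run_in_batches and its sleep parameter are assumptions made for this example, not part of the tool.

```python
import time


def run_in_batches(items, action, batch_size=10, inter_batch_delay=60.0,
                   sleep=time.sleep):
    """Apply action to each item, pausing between batches.

    The sleep function is injectable so the pause can be replaced
    in tests.
    """
    results = []
    for start in range(0, len(items), batch_size):
        if start > 0:
            # Pause before every batch except the first.
            sleep(inter_batch_delay)
        for item in items[start:start + batch_size]:
            results.append(action(item))
    return results
```

With the defaults, 25 files would run as batches of 10, 10, and 5, with two 60-second pauses in between.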

Circuit Breaker Options

  • --no-circuit-breaker: Disable circuit breaker pattern

Recovery Options

  • --no-recovery: Disable recovery logging
  • --recovery-log-file: Path to the recovery log file (default: failed_copies.json)
  • --recover: Attempt to recover failed copy operations from the recovery log file
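
The recovery options could work along these lines. This is a sketch under assumptions: the JSON layout (a list of objects with file_id, name, and error keys) and all function names are hypothetical, and the real failed_copies.json format may differ.

```python
import json
import os


def load_failures(log_path):
    """Return the list of logged failures, or [] if no log exists."""
    if not os.path.exists(log_path):
        return []
    with open(log_path) as fh:
        return json.load(fh)


def _write_failures(log_path, entries):
    with open(log_path, "w") as fh:
        json.dump(entries, fh, indent=2)


def record_failure(log_path, file_id, name, error):
    """Append one failed copy to the recovery log."""
    entries = load_failures(log_path)
    entries.append({"file_id": file_id, "name": name, "error": str(error)})
    _write_failures(log_path, entries)


def retry_failures(log_path, copy_file):
    """Retry every logged failure; keep only those that fail again."""
    remaining = []
    for entry in load_failures(log_path):
        try:
            copy_file(entry["file_id"])
        except Exception as exc:
            entry["error"] = str(exc)
            remaining.append(entry)
    _write_failures(log_path, remaining)
    return len(remaining)
```

Here copy_file stands in for whatever function performs the actual Drive copy.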

Logging Options

  • --log-level: Logging level (debug, info, warning, error, critical, default: info)
  • --log-file: Path to the log file (default: gdrive_copier.log)
  • --no-console-log: Disable logging to console
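
For illustration, the logging options above map naturally onto Python's standard logging module. The sketch below is one possible setup. The function name, logger name, and message format are assumptions.

```python
import logging


def setup_logging(level="info", log_file="gdrive_copier.log", console=True):
    """Configure a logger that writes to a file and optionally the console."""
    logger = logging.getLogger("gdrive_copier")
    logger.setLevel(getattr(logging, level.upper()))
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")

    file_handler = logging.FileHandler(log_file)
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)

    if console:
        console_handler = logging.StreamHandler()
        console_handler.setFormatter(formatter)
        logger.addHandler(console_handler)

    return logger
```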

Configuration Options

  • --config-file: Path to the configuration file
  • --save-config: Save configuration to the specified file

Examples

Basic Usage

python gdrive_copier.py --source-id 1abc123def456 --dest-id 7xyz890uvw321

Using OAuth Client Authentication

python gdrive_copier.py --source-id 1abc123def456 --dest-id 7xyz890uvw321 --credentials desktop_client

Dry Run

python gdrive_copier.py --source-id 1abc123def456 --dest-id 7xyz890uvw321 --dry-run

Save Configuration

python gdrive_copier.py --source-id 1abc123def456 --dest-id 7xyz890uvw321 --save-config config.json

Use Saved Configuration

python gdrive_copier.py --config-file config.json

Running Tests

python test_gdrive_copier.py

Project Structure

  • gdrive_copier.py: Main script with command-line interface
  • auth.py: Authentication module supporting both service account and OAuth client
  • drive_operations.py: Module for Google Drive operations (search, copy, etc.)
  • config.py: Configuration and settings
  • utils.py: Utility functions (logging, error handling, etc.)
  • test_gdrive_copier.py: Unit tests
  • .gitignore: Prevents sensitive files from being committed to Git

Security Considerations

  • Credential Files: Never commit your credential files (service_account.json, google_desktop_client1.json, token.json) to version control. These file names are listed in .gitignore so that Git ignores them.
  • Access Control: Ensure that the Google Drive folders are shared with the appropriate permissions. The service account or user account should have at least "Editor" access to both source and destination folders.
  • Token Storage: OAuth tokens are stored locally in token.json. Keep this file secure and do not share it.

Rate Limiting and Error Handling

Advanced Rate Limiting

The tool combines several mechanisms to stay within Google Drive API quotas and limits:

  • Exponential Backoff with Jitter: Automatically increases wait time between retries with randomized jitter to prevent synchronized retries
  • Circuit Breaker Pattern: Prevents repeated calls to failing services by temporarily blocking requests after multiple failures
  • Error Type Differentiation: Distinguishes between different types of rate limit errors for appropriate handling
  • Batch Processing: Limits the number of operations per time period to stay within API quotas
  • Recovery Mechanism: Logs failed operations to a recovery file for later retry
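
Exponential backoff with jitter, as described above, can be sketched as follows. The parameter defaults mirror the documented command-line defaults. The function names, the is_retryable hook, and the choice to cap the delay after applying jitter are assumptions, not necessarily what the tool does.

```python
import random
import time


def backoff_delay(attempt, initial_delay=1.0, max_delay=60.0, rng=random):
    """Delay for a retry attempt (0-based): base 2 growth, ±50% jitter."""
    base = initial_delay * (2 ** attempt)
    jittered = base * rng.uniform(0.5, 1.5)
    return min(max_delay, jittered)  # cap applied after jitter (assumption)


def call_with_retries(func, max_retries=10, is_retryable=lambda exc: True,
                      sleep=time.sleep, initial_delay=1.0, max_delay=60.0):
    """Call func, retrying retryable errors with backoff between attempts."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception as exc:
            # Give up on the last attempt or on non-retryable errors.
            if attempt == max_retries or not is_retryable(exc):
                raise
            sleep(backoff_delay(attempt, initial_delay, max_delay))
```

The is_retryable hook is where different error types could be told apart, for example retrying quota errors but not permission errors.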

Configuration Parameters

  • Exponential backoff base: 2 with ±50% jitter
  • Max retries: 10
  • Circuit breaker threshold: 5 failures within 2 minutes
  • Batch size: 10 files with 60 seconds between batches
  • Recovery file: failed_copies.json
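
The circuit breaker threshold above (5 failures within 2 minutes) could be implemented with a sliding window of failure timestamps. The sketch below is illustrative only: the class, its method names, and the 60-second cooldown before requests are allowed again are assumptions, since the cooldown is not documented here.

```python
import time
from collections import deque


class CircuitBreaker:
    """Blocks calls after too many failures inside a time window."""

    def __init__(self, threshold=5, window=120.0, cooldown=60.0,
                 clock=time.monotonic):
        self.threshold = threshold
        self.window = window
        self.cooldown = cooldown  # assumed value, not from the tool
        self.clock = clock
        self.failures = deque()
        self.opened_at = None

    def allow(self):
        """Return True if a request may proceed."""
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown:
            # Cooldown elapsed: close the circuit and start fresh.
            self.opened_at = None
            self.failures.clear()
            return True
        return False

    def record_failure(self):
        """Register a failure and open the circuit at the threshold."""
        now = self.clock()
        self.failures.append(now)
        # Drop failures that fell out of the sliding window.
        while self.failures and now - self.failures[0] > self.window:
            self.failures.popleft()
        if len(self.failures) >= self.threshold:
            self.opened_at = now
```

A caller would check allow() before each Drive request and call record_failure() whenever a request fails.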

Error Handling and Logging

  • All operations are logged to both console and file (unless disabled)
  • Errors during file operations are captured in error_report.json
  • Failed copy operations are logged in failed_copies.json for recovery
  • Transfer operations are logged in transfer_log.csv

Troubleshooting

Common Issues

  1. Authentication Errors:
    • Ensure your credential files are valid and have the correct permissions
    • For service accounts, verify the folders are shared with the service account email
    • For OAuth, try deleting token.json and re-authenticating

  2. File Not Found Errors:
    • Verify the folder IDs are correct
    • Check that the source folder contains video files
    • Ensure the user/service account has access to both folders

  3. Rate Limiting:
    • The tool implements exponential backoff with jitter and circuit breaker pattern
    • If you still hit quota limits, try:
      • Reducing batch size (--batch-size)
      • Increasing inter-batch delay (--inter-batch-delay)
      • Running with fewer files or spreading operations over time
    • Use the recovery feature (--recover) to retry failed operations later

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
