CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

R Package Development Commands

This is an R package providing access to the YouTube Data API v3. Use standard R package development workflow:

Build and Check:

# Build package
devtools::build()

# Check package (R CMD check)
devtools::check()

# Install development version
devtools::install()

# Load for development
devtools::load_all()

Testing:

# Run all tests
devtools::test()

# Run specific test file
testthat::test_file("tests/testthat/test-get-all-comments.R")

# Run mock tests (no API required)
testthat::test_file("tests/testthat/test-pagination-mocks.R")

# Run integration tests (requires API key)
yt_set_key("YOUR_API_KEY")
testthat::test_file("tests/testthat/test-integration-ready.R")

Documentation:

# Update documentation from roxygen comments
devtools::document()

# Build pkgdown site
pkgdown::build_site()

# Check package with lintr (if available)
lintr::lint_package()

Core Architecture

HTTP Layer (`tuber.R`)

The foundation is built on tuber_GET, tuber_POST, tuber_DELETE functions that handle all YouTube API communication. These functions:

Handle authentication (OAuth2 tokens or API keys)
Manage HTTP errors and API responses
Include user-agent headers and proper error checking

Dual Authentication System

Functions support two authentication methods via the auth parameter:

auth = "token" (default): OAuth2 via yt_oauth(), stored in .httr-oauth, requires yt_check_token()
auth = "key": API keys via yt_set_key()/yt_get_key(), stored in .Renviron as YOUTUBE_KEY, requires yt_check_key()

OAuth2 is required for write operations (uploads, playlist management), while API keys work for most read operations.

Function Categories

Functions are organized by YouTube resource types (see _pkgdown.yml for complete categorization):

Channel functions: get_channel_stats, list_channel_videos, etc.
Video functions: get_video_details, get_stats, etc.
Playlist functions: get_playlist_items, create_playlist, etc.
Comment functions: get_all_comments, get_comment_threads, etc.
Search functions: yt_search, yt_topic_search
Caption functions: get_captions, list_captions, etc.

Standardized Pagination Patterns

CRITICAL: All pagination functions follow consistent patterns after recent fixes:

Use nextPageToken in while loops with proper null checks: while (!is.null(page_token) && is.character(page_token))
Accumulate res$items arrays, not entire response objects
Use efficient list indexing: list[[length(list) + 1]] <- new_item instead of c(list, new_item)
Include safety breaks: if (length(items) >= max_results) break

Fixed Functions: get_playlist_items, get_all_comments, get_comment_threads all now use proper pagination.

Response Processing Architecture

Functions typically follow this pattern:

Build query parameters with validation
Make initial tuber_GET call
Handle pagination if max_results > API_limit or equivalent
Apply simplify parameter logic to convert nested JSON to data.frames
Add result metadata as attributes (total_results, actual_results, api_limit_reached)

YouTube API Constraints

Quota limits: 10,000 units/day default, search costs 100 units each
Result limits: Search ~500 items max, playlists 50/request, comments 100/request
Rate limiting: Functions should handle HTTP 403/429 responses gracefully
Date formats: Must use RFC 339 format (YYYY-MM-DDTHH:MM:SSZ)

Testing Infrastructure

Three-tier testing system:

Mock tests (test-pagination-mocks.R): Test pagination logic without API calls using with_mocked_bindings()
Unit tests (existing files): Test basic functionality with encrypted token
Integration tests (test-integration-ready.R): Full API testing with user-provided API key

Tests automatically skip when authentication is unavailable. Use yt_set_key() for API key testing.

Parameter Validation Patterns

Most functions validate:

Required string parameters with is.character() checks
Enumerated values with %in% checks
API-specific constraints (e.g., max_results ranges per endpoint)
RFC 339 date format validation
Video-specific parameters only when type = "video"

Key Development Notes

All API calls must go through tuber_GET/tuber_POST/tuber_DELETE - never call httr directly
Functions use both httr (OAuth) and httr2 (API key) for different authentication methods
Many functions support both simplified data.frame output and raw list output via simplify parameter
When implementing new pagination, follow the standardized pattern in yt_search.R

Issues Fixed

Issue #107: `get_comment_threads` pagination problems ✅ FIXED

Problem: When max_results > 100, function repeated the most recent 100 comments instead of fetching unique additional comments
Solution: Implemented proper deduplication during pagination loop using collected comment IDs tracking and standardized pagination pattern from yt_search.R
Status: Fixed and tested

Issue #95: `get_playlist_items` returns count instead of items ✅ FIXED

Problem: Function returned only playlist metadata instead of actual video items due to incorrect response processing
Solution: Rewrote pagination and simplification logic to properly handle API response structure, improved error handling for empty responses
Status: Fixed and tested

Issue #112: missing TRUE/FALSE error ✅ ALREADY FIXED

Problem: Error in get_all_comments() with if (n_replies[i] == 1) when n_replies was undefined
Status: Already fixed in current codebase - the problematic code pattern no longer exists

Issue #118: 404 error with autogenerated channels ✅ FIXED

Problem: list_channel_videos() failed with 404 error for autogenerated YouTube channels (e.g., music topic channels)
Solution: Added intelligent detection of autogenerated channels and provide helpful error message with alternative approach using yt_search()
Status: Fixed with improved error handling

Issue #103/#122: Authentication token issues ✅ LIKELY FIXED

Problem: Users unable to read OAuth tokens from .httr-oauth file
Status: Improved error handling in yt_oauth() function already implemented - can likely be closed

Issue #115: Quota management ✅ FIXED

Status: Comprehensive quota management system implemented in quota_management.R
Features: Automatic quota tracking, daily reset, rate limiting warnings
Status: Complete - can be closed

Issue #88: `list_channel_videos` function error ✅ FIXED

Problem: Function returned only pageInfo instead of video details
Root cause: Same underlying issue as #95 - uses get_playlist_items internally
Status: Fixed with #95 resolution

Issue #87: Extracting values from `get_video_details` ✅ DOCUMENTATION IMPROVED

Problem: Users didn't know how to extract specific fields like video title from function response
Solution: Added comprehensive examples showing field extraction for common use cases, including Shiny applications
Status: Documentation enhanced

Issue #81: Resumable uploads feature request ⏳ FEATURE REQUEST

Problem: Large file uploads fail due to timeouts on slow connections
Current status: Basic resumable upload endpoint used but full resumption logic not implemented
Recommendation: Complex feature requiring chunked uploads, progress tracking, and resume capability - suitable for future major version

Issue #80: Unlisted video statistics ✅ ALREADY SUPPORTED

Problem: User wanted to access unlisted video statistics
Status: Already supported with OAuth authentication - documented this capability
Solution: Added clear documentation about OAuth requirement for unlisted content

Issue #111: Use of comments for research ℹ️ NON-TECHNICAL

Status: Legal/ethical question about terms of service - not a code issue

Function-Specific Debugging Notes

Comment Functions

Both get_comment_threads.R and get_all_comments.R use similar pagination patterns but may have different edge case handling
Unicode text processing is handled via clean_youtube_text() function
Always check for duplicate comment IDs when paginating

Playlist Functions

get_playlist_items() uses older pagination pattern (while loop with length(res$items)) vs newer pattern in search functions
Consider updating to match standardized pagination pattern from yt_search.R

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CLAUDE.md

R Package Development Commands

Core Architecture

HTTP Layer (`tuber.R`)

Dual Authentication System

Function Categories

Standardized Pagination Patterns

Response Processing Architecture

YouTube API Constraints

Testing Infrastructure

Parameter Validation Patterns

Key Development Notes

Issues Fixed

Issue #107: `get_comment_threads` pagination problems ✅ FIXED

Issue #95: `get_playlist_items` returns count instead of items ✅ FIXED

Issue #112: missing TRUE/FALSE error ✅ ALREADY FIXED

Issue #118: 404 error with autogenerated channels ✅ FIXED

Issue #103/#122: Authentication token issues ✅ LIKELY FIXED

Issue #115: Quota management ✅ FIXED

Issue #88: `list_channel_videos` function error ✅ FIXED

Issue #87: Extracting values from `get_video_details` ✅ DOCUMENTATION IMPROVED

Issue #81: Resumable uploads feature request ⏳ FEATURE REQUEST

Issue #80: Unlisted video statistics ✅ ALREADY SUPPORTED

Issue #111: Use of comments for research ℹ️ NON-TECHNICAL

Function-Specific Debugging Notes

Comment Functions

Playlist Functions

FilesExpand file tree

CLAUDE.md

Latest commit

History

CLAUDE.md

File metadata and controls

CLAUDE.md

R Package Development Commands

Core Architecture

HTTP Layer (tuber.R)

Dual Authentication System

Function Categories

Standardized Pagination Patterns

Response Processing Architecture

YouTube API Constraints

Testing Infrastructure

Parameter Validation Patterns

Key Development Notes

Issues Fixed

Issue #107: get_comment_threads pagination problems ✅ FIXED

Issue #95: get_playlist_items returns count instead of items ✅ FIXED

Issue #112: missing TRUE/FALSE error ✅ ALREADY FIXED

Issue #118: 404 error with autogenerated channels ✅ FIXED

Issue #103/#122: Authentication token issues ✅ LIKELY FIXED

Issue #115: Quota management ✅ FIXED

Issue #88: list_channel_videos function error ✅ FIXED

Issue #87: Extracting values from get_video_details ✅ DOCUMENTATION IMPROVED

Issue #81: Resumable uploads feature request ⏳ FEATURE REQUEST

Issue #80: Unlisted video statistics ✅ ALREADY SUPPORTED

Issue #111: Use of comments for research ℹ️ NON-TECHNICAL

Function-Specific Debugging Notes

Comment Functions

Playlist Functions

HTTP Layer (`tuber.R`)

Issue #107: `get_comment_threads` pagination problems ✅ FIXED

Issue #95: `get_playlist_items` returns count instead of items ✅ FIXED

Issue #88: `list_channel_videos` function error ✅ FIXED

Issue #87: Extracting values from `get_video_details` ✅ DOCUMENTATION IMPROVED