This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This is an R package providing access to the YouTube Data API v3. Use standard R package development workflow:
Build and Check:
# Build package
devtools::build()
# Check package (R CMD check)
devtools::check()
# Install development version
devtools::install()
# Load for development
devtools::load_all()Testing:
# Run all tests
devtools::test()
# Run specific test file
testthat::test_file("tests/testthat/test-get-all-comments.R")
# Run mock tests (no API required)
testthat::test_file("tests/testthat/test-pagination-mocks.R")
# Run integration tests (requires API key)
yt_set_key("YOUR_API_KEY")
testthat::test_file("tests/testthat/test-integration-ready.R")Documentation:
# Update documentation from roxygen comments
devtools::document()
# Build pkgdown site
pkgdown::build_site()
# Check package with lintr (if available)
lintr::lint_package()The foundation is built on tuber_GET, tuber_POST, tuber_DELETE functions that handle all YouTube API communication. These functions:
- Handle authentication (OAuth2 tokens or API keys)
- Manage HTTP errors and API responses
- Include user-agent headers and proper error checking
Functions support two authentication methods via the auth parameter:
auth = "token"(default): OAuth2 viayt_oauth(), stored in.httr-oauth, requiresyt_check_token()auth = "key": API keys viayt_set_key()/yt_get_key(), stored in.RenvironasYOUTUBE_KEY, requiresyt_check_key()
OAuth2 is required for write operations (uploads, playlist management), while API keys work for most read operations.
Functions are organized by YouTube resource types (see _pkgdown.yml for complete categorization):
- Channel functions:
get_channel_stats,list_channel_videos, etc. - Video functions:
get_video_details,get_stats, etc. - Playlist functions:
get_playlist_items,create_playlist, etc. - Comment functions:
get_all_comments,get_comment_threads, etc. - Search functions:
yt_search,yt_topic_search - Caption functions:
get_captions,list_captions, etc.
CRITICAL: All pagination functions follow consistent patterns after recent fixes:
- Use
nextPageTokenin while loops with proper null checks:while (!is.null(page_token) && is.character(page_token)) - Accumulate
res$itemsarrays, not entire response objects - Use efficient list indexing:
list[[length(list) + 1]] <- new_iteminstead ofc(list, new_item) - Include safety breaks:
if (length(items) >= max_results) break
Fixed Functions: get_playlist_items, get_all_comments, get_comment_threads all now use proper pagination.
Functions typically follow this pattern:
- Build query parameters with validation
- Make initial
tuber_GETcall - Handle pagination if
max_results > API_limitor equivalent - Apply
simplifyparameter logic to convert nested JSON to data.frames - Add result metadata as attributes (
total_results,actual_results,api_limit_reached)
- Quota limits: 10,000 units/day default, search costs 100 units each
- Result limits: Search ~500 items max, playlists 50/request, comments 100/request
- Rate limiting: Functions should handle HTTP 403/429 responses gracefully
- Date formats: Must use RFC 339 format (YYYY-MM-DDTHH:MM:SSZ)
Three-tier testing system:
- Mock tests (
test-pagination-mocks.R): Test pagination logic without API calls usingwith_mocked_bindings() - Unit tests (existing files): Test basic functionality with encrypted token
- Integration tests (
test-integration-ready.R): Full API testing with user-provided API key
Tests automatically skip when authentication is unavailable. Use yt_set_key() for API key testing.
Most functions validate:
- Required string parameters with
is.character()checks - Enumerated values with
%in%checks - API-specific constraints (e.g.,
max_resultsranges per endpoint) - RFC 339 date format validation
- Video-specific parameters only when
type = "video"
- All API calls must go through
tuber_GET/tuber_POST/tuber_DELETE- never call httr directly - Functions use both httr (OAuth) and httr2 (API key) for different authentication methods
- Many functions support both simplified data.frame output and raw list output via
simplifyparameter - When implementing new pagination, follow the standardized pattern in
yt_search.R
- Problem: When
max_results > 100, function repeated the most recent 100 comments instead of fetching unique additional comments - Solution: Implemented proper deduplication during pagination loop using collected comment IDs tracking and standardized pagination pattern from
yt_search.R - Status: Fixed and tested
- Problem: Function returned only playlist metadata instead of actual video items due to incorrect response processing
- Solution: Rewrote pagination and simplification logic to properly handle API response structure, improved error handling for empty responses
- Status: Fixed and tested
- Problem: Error in
get_all_comments()withif (n_replies[i] == 1)whenn_replieswas undefined - Status: Already fixed in current codebase - the problematic code pattern no longer exists
- Problem:
list_channel_videos()failed with 404 error for autogenerated YouTube channels (e.g., music topic channels) - Solution: Added intelligent detection of autogenerated channels and provide helpful error message with alternative approach using
yt_search() - Status: Fixed with improved error handling
- Problem: Users unable to read OAuth tokens from
.httr-oauthfile - Status: Improved error handling in
yt_oauth()function already implemented - can likely be closed
- Status: Comprehensive quota management system implemented in
quota_management.R - Features: Automatic quota tracking, daily reset, rate limiting warnings
- Status: Complete - can be closed
- Problem: Function returned only pageInfo instead of video details
- Root cause: Same underlying issue as #95 - uses
get_playlist_itemsinternally - Status: Fixed with #95 resolution
- Problem: Users didn't know how to extract specific fields like video title from function response
- Solution: Added comprehensive examples showing field extraction for common use cases, including Shiny applications
- Status: Documentation enhanced
- Problem: Large file uploads fail due to timeouts on slow connections
- Current status: Basic resumable upload endpoint used but full resumption logic not implemented
- Recommendation: Complex feature requiring chunked uploads, progress tracking, and resume capability - suitable for future major version
- Problem: User wanted to access unlisted video statistics
- Status: Already supported with OAuth authentication - documented this capability
- Solution: Added clear documentation about OAuth requirement for unlisted content
- Status: Legal/ethical question about terms of service - not a code issue
- Both
get_comment_threads.Randget_all_comments.Ruse similar pagination patterns but may have different edge case handling - Unicode text processing is handled via
clean_youtube_text()function - Always check for duplicate comment IDs when paginating
get_playlist_items()uses older pagination pattern (while loop withlength(res$items)) vs newer pattern in search functions- Consider updating to match standardized pagination pattern from
yt_search.R