source-youtube-data: Multiple channel_ids causes videos/video/comments streams to return no data #72638

@zjaniak

Bug Description

When configuring the YouTube Data source connector with multiple channel_ids, the videos, video, and comments streams return no data. The streams only work correctly when a single channel ID is provided.

Environment

  • Connector: source-youtube-data
  • Connector Version: 0.0.45
  • Airbyte Platform: Self-hosted (Airbyte OSS)

Steps to Reproduce

  1. Create a YouTube Data source with multiple channel IDs:
    {
      "credentials": {
        "auth_method": "api_key",
        "api_key": "YOUR_API_KEY"
      },
      "channel_ids": ["UCJr72fY4cTaNZv7WPbvjaSw", "UC8lxnUR_CzruT2KA6cb7p0Q"]
    }
  2. Run a sync
  3. Observe that the videos and video streams return 0 records

Expected Behavior

All configured channels should have their videos discovered and synced.

Actual Behavior

  • With 1 channel: videos and video streams work correctly
  • With 2+ channels: videos and video streams return no data

Root Cause Analysis

In manifest.yaml, the videos stream passes the entire channel_ids array to the channelId parameter of the YouTube Search API:

videos:
  type: DeclarativeStream
  name: videos
  retriever:
    requester:
      path: search
      request_parameters:
        channelId: "{{ config.channel_ids }}"  # BUG: passes array

However, the YouTube Data API v3 Search endpoint accepts only a single channelId value, not an array. When an array is passed, the API likely rejects or ignores the invalid value, which would explain why the stream returns no records.

In contrast, the channels stream correctly uses a ListPartitionRouter to iterate over the channel IDs, one partition per ID:

channels:
  retriever:
    partition_router:
      type: ListPartitionRouter
      values: "{{ config.channel_ids }}"
      request_option:
        field_name: id

Suggested Fix

Add a ListPartitionRouter to the videos stream, and to the channel_comments stream, which has the same issue:

videos:
  type: DeclarativeStream
  name: videos
  retriever:
    type: SimpleRetriever
    requester:
      $ref: "#/definitions/base_requester"
      path: search
      http_method: GET
      request_parameters:
        type: video
    record_selector:
      type: RecordSelector
      extractor:
        type: DpathExtractor
        field_path:
          - items
          - "*"
          - id
    paginator:
      type: DefaultPaginator
      page_token_option:
        type: RequestOption
        inject_into: request_parameter
        field_name: pageToken
      page_size_option:
        type: RequestOption
        inject_into: request_parameter
        field_name: maxResults
      pagination_strategy:
        type: CursorPagination
        page_size: 50
        cursor_value: "{{ response.nextPageToken }}"
        stop_condition: "{{ not response.get('nextPageToken') }}"
    partition_router:
      type: ListPartitionRouter
      values: "{{ config.channel_ids }}"
      cursor_field: channel_id
      request_option:
        type: RequestOption
        inject_into: request_parameter
        field_name: channelId
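
The same change could be applied to channel_comments. The fragment below is only a sketch of the partition router part, not a copy of the current manifest: the commentThreads path is an assumption based on the YouTube Data API endpoint that takes allThreadsRelatedToChannelId, and the record_selector and paginator are omitted and would stay as they are in the existing stream definition.

```yaml
# Sketch only. Fields not shown (record_selector, paginator, other
# request_parameters) are assumed to remain unchanged from manifest.yaml.
channel_comments:
  type: DeclarativeStream
  name: channel_comments
  retriever:
    type: SimpleRetriever
    requester:
      $ref: "#/definitions/base_requester"
      path: commentThreads  # assumed endpoint path, verify against manifest.yaml
      http_method: GET
      # allThreadsRelatedToChannelId is no longer set in request_parameters;
      # the partition router below injects it once per channel ID.
    partition_router:
      type: ListPartitionRouter
      values: "{{ config.channel_ids }}"
      cursor_field: channel_id
      request_option:
        type: RequestOption
        inject_into: request_parameter
        field_name: allThreadsRelatedToChannelId
```

With this shape each entry in channel_ids becomes its own partition, so every request carries exactly one channel ID, matching how the channels stream already behaves.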

Affected Streams

  • videos - passes array to channelId parameter
  • video - depends on videos as parent stream, so also broken
  • comments - depends on videos as parent stream, so also broken
  • channel_comments - same bug pattern with allThreadsRelatedToChannelId

Workaround

Create separate Airbyte sources for each YouTube channel instead of using a single source with multiple channel IDs.

Additional Context


Internal Tracking: https://github.com/airbytehq/oncall/issues/11141
