diff --git a/src/content/changelog/stream/2025-07-22-media-transformations-audio-mode.mdx b/src/content/changelog/stream/2025-07-22-media-transformations-audio-mode.mdx
new file mode 100644
index 000000000000000..07b810af773e3d1
--- /dev/null
+++ b/src/content/changelog/stream/2025-07-22-media-transformations-audio-mode.mdx
@@ -0,0 +1,18 @@
+---
+title: Audio mode for Media Transformations
+description: >
+  Media Transformations now supports `audio` mode, which extracts audio from a video.
+date: 2025-07-22
+---
+
+With this feature, you can extract audio from a source video and output
+an M4A file to use in downstream workflows like AI inference, content moderation, or transcription.
+
+For example:
+
+``` text title="Example URL"
+https://example.com/cdn-cgi/media//
+https://example.com/cdn-cgi/media/mode=audio,time=3s,duration=60s/
+```
+
+For more information, learn about [Transforming Videos](/stream/transform-videos/).
diff --git a/src/content/docs/stream/transform-videos/index.mdx b/src/content/docs/stream/transform-videos/index.mdx
index e2fdad085c3efaf..2b966cb6b399b01 100644
--- a/src/content/docs/stream/transform-videos/index.mdx
+++ b/src/content/docs/stream/transform-videos/index.mdx
@@ -54,12 +54,13 @@ Specifies the kind of output to generate.
 - `video`: Outputs an H.264/AAC optimized MP4 file.
 - `frame`: Outputs a still image.
 - `spritesheet`: Outputs a JPEG with multiple frames.
+- `audio`: Outputs an AAC-encoded M4A file.
 
 ### `time`
 
 Specifies when to start extracting the output in the input file. Depends on `mode`:
 
-- When `mode` is `spritesheet` or `video`, specifies the timestamp where the output will start.
+- When `mode` is `spritesheet`, `video`, or `audio`, specifies the timestamp where the output will start.
 - When `mode` is `frame`, specifies the timestamp from which to extract the still image.
 - Formats as a time string, for example: 5s, 2m
 - Acceptable range: 0 – 10m
@@ -69,7 +70,7 @@ Specifies when to start extracting the output in the input file. Depends on `mod
 
 The duration of the output video or spritesheet. Depends on `mode`:
 
-- When `mode` is `video`, specifies the duration of the output.
+- When `mode` is `video` or `audio`, specifies the duration of the output.
 - When `mode` is `spritesheet`, specifies the time range from which to select frames.
 - Acceptable range: 1s - 60s (or 1m)
 - Default: input duration or 60 seconds, whichever is shorter
@@ -102,12 +103,18 @@ When `mode` is `video`, specifies whether or not to include the source audio in
 - `false`: Output will be silent.
 - Default: `true`
 
+When `mode` is `audio`, `audio` cannot be set to `false`.
+
 ### `format`
 
 If `mode` is `frame`, specifies the image output format.
 
 - Acceptable options: `jpg`, `png`
 
+If `mode` is `audio`, specifies the audio output format.
+
+- Acceptable options: `m4a` (default)
+
 ## Source video requirements
 
 Input video must be less than 100MB.
@@ -128,6 +135,6 @@ Media Transformations will be free for all customers while in beta. After that,
 Media Transforamtions and Image Transformations will use the same subscriptions and usage metrics.
 
 - Generating a still frame (single image) from a video counts as 1 transformation.
-- Generating an optimized video counts as 1 transformation _per second of the output_ video.
+- Generating an optimized video or extracting audio counts as 1 transformation _per second of the output_ content.
 - Each unique transformation is only billed once per month.
 - All Media and Image Transformations cost $0.50 per 1,000 monthly unique transformation operations, with a free monthly allocation of 5,000.
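
To illustrate how the documented `audio` mode options and pricing fit together, here is a minimal Python sketch. It is not part of the change and not an official client. The helper names (`parse_time_string`, `build_audio_url`, `estimate_monthly_cost`) are hypothetical, and the sketch assumes that time strings are whole numbers followed by `s` or `m`, that the source video location follows the options segment (the example URL in the changelog does not show it), and that usage above the free allocation is prorated rather than billed in blocks of 1,000.

```python
import re
from typing import Optional

# Limits copied from the parameter descriptions in the diff above.
MAX_START_SECONDS = 10 * 60   # `time`: 0 to 10m
MIN_DURATION_SECONDS = 1      # `duration`: 1s to 60s
MAX_DURATION_SECONDS = 60

# Pricing figures copied from the billing list in the diff above.
FREE_MONTHLY_TRANSFORMATIONS = 5_000
PRICE_PER_THOUSAND_USD = 0.50

# Assumption: only whole-number values with an `s` or `m` suffix are handled.
_TIME_RE = re.compile(r"^(\d+)(s|m)$")


def parse_time_string(value: str) -> int:
    """Convert a time string such as '5s' or '2m' into whole seconds."""
    match = _TIME_RE.match(value)
    if match is None:
        raise ValueError(f"unsupported time string: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return amount * 60 if unit == "m" else amount


def build_audio_url(zone: str, source: str,
                    time: Optional[str] = None,
                    duration: Optional[str] = None) -> str:
    """Build a hypothetical audio-extraction URL, validating documented ranges.

    `source` is appended after the options segment; that placement is an
    assumption, since the changelog example leaves the source out.
    """
    options = ["mode=audio"]
    if time is not None:
        if not 0 <= parse_time_string(time) <= MAX_START_SECONDS:
            raise ValueError("time must be between 0 and 10m")
        options.append(f"time={time}")
    if duration is not None:
        seconds = parse_time_string(duration)
        if not MIN_DURATION_SECONDS <= seconds <= MAX_DURATION_SECONDS:
            raise ValueError("duration must be between 1s and 60s")
        options.append(f"duration={duration}")
    return f"https://{zone}/cdn-cgi/media/{','.join(options)}/{source}"


def estimate_monthly_cost(unique_transformations: int) -> float:
    """Estimate the monthly cost in USD, prorating usage above the free tier."""
    billable = max(0, unique_transformations - FREE_MONTHLY_TRANSFORMATIONS)
    return billable / 1000 * PRICE_PER_THOUSAND_USD
```

Because audio extraction counts as 1 transformation per second of output, a 60-second clip would count as 60 transformations under this reading, so `estimate_monthly_cost` takes the monthly total of unique transformations rather than the number of requests.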