Add Video metadata creation sample with Gemini and Media3 #71

MayuriKhinvasara · 2025-08-11T20:06:09Z

Add Gemini Video Metadata Creation Sample

A new sample demonstrating how to use the Gemini API with Firebase and Media3 to extract metadata from videos.

The sample includes:

UI for selecting a video from a predefined list or a custom URL.
A video player using ExoPlayer to display the selected video.
Buttons to trigger different metadata extraction tasks: Thumbnails, Description, Hashtags, Account Tags, Chapters, and Links.
Displays the generated text metadata and extracted thumbnail images.
Utilizes media3-transformer to extract HDR thumbnails from the video.
Includes utility functions for timestamp conversion and managing the video list.
Implements a ViewModel to handle the interaction with the Gemini API and manage UI state.


gemini-code-assist

Summary of Changes

Hello @MayuriKhinvasara, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new Android AI sample application named "Gemini Video Metadata Creation." This sample showcases how to use the Gemini API, integrated with Firebase, to generate various types of metadata for videos, such as descriptions, hashtags, account tags, chapters, and links. It also demonstrates the extraction of HDR quality thumbnails from videos based on timestamps provided by the Gemini model, utilizing Media3's experimental frame extractor. The changes encompass all necessary UI components, ViewModel logic, and utility functions to provide a comprehensive example of AI-powered video content analysis on Android.

Highlights

New Sample Application: A new sample application, "Gemini Video Metadata Creation," has been added to demonstrate AI-powered video metadata generation.
Gemini API Integration: The sample integrates with the Gemini 2.0 Flash model via Firebase AI to generate various types of metadata, including descriptions, hashtags, account tags, chapters, and links, directly from video content.
Video Playback and Selection: The application now includes video playback functionality using ExoPlayer and a user-friendly dropdown menu for selecting different sample videos.
Dynamic Metadata Generation: Users can dynamically choose the type of metadata they wish to generate, with the application sending corresponding prompts to the Gemini model.
Thumbnail Extraction: A key feature is the ability to extract high-dynamic-range (HDR) quality video thumbnails based on timestamps provided by the Gemini model, leveraging Media3's experimental frame extractor.
New UI Components: Dedicated Compose UI components have been introduced for video selection, playback, metadata type selection (via a button grid), and displaying the generated text and thumbnails.
Dependency Updates: The Media3 library has been updated to version 1.8.0, and the media3-transformer dependency has been added to support advanced video processing capabilities.


Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a new sample application for video metadata creation using Gemini. The implementation is generally well-structured, utilizing modern Android development practices like Jetpack Compose, Hilt for dependency injection, and StateFlow for UI state management, which aligns with the provided architectural guidelines. However, I've identified several areas for improvement. There is a critical resource leak in the thumbnail extraction logic that needs to be addressed, a misconfiguration in the build script's namespace, and a few violations of type safety and the project's style guide. Additionally, there are opportunities to remove unused code. Addressing these points will significantly enhance the quality and robustness of the sample.

.../main/java/com/android/ai/samples/geminivideometadatacreation/player/ExtractHDRThumbnails.kt

ai-catalog/samples/gemini-video-metadata-creation/build.gradle.kts

...ation/src/main/java/com/android/ai/samples/geminivideometadatacreation/ui/ThumbnailScreen.kt

gemini-code-assist · 2025-08-11T20:08:14Z

...m/android/ai/samples/geminivideometadatacreation/viewmodel/VideoMetadataCreationViewModel.kt

+ * [StateFlow].
+ */
+@HiltViewModel
+class VideoMetadataCreationViewModel @Inject constructor(private val application: Application) : ViewModel() {


Injecting Application directly into the ViewModel violates the architectural principle of keeping ViewModels framework-agnostic.¹² This makes the ViewModel harder to test and can lead to issues if not handled carefully. The Context dependency required for extractListOfThumbnails should be provided through a lower layer, such as a repository or a use case, which can be injected into the ViewModel.

Style Guide References

Footnotes

¹ The style guide strongly recommends that ViewModels should be agnostic of the Android lifecycle and should not hold references to Context or other lifecycle-related types.

² The style guide recommends against using AndroidViewModel and against using the Application class in ViewModels, suggesting that such dependencies be moved to the UI or data layer.

I agree, the context should not be passed. I'm figuring out a way to still share the context with the ExperimentalFrameExtractor so it can extract thumbnails asynchronously.

I don't think the style guide is correct there. You can use AndroidViewModel to get an application context in your viewmodel

In your situation, a real app would probably create a class called something like "FrameExtractorService" that takes an application context in its constructor. You would then inject that service into your ViewModel, and your ViewModel wouldn't have to deal with contexts at all.
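The service idea above could look roughly like the following. This is a minimal sketch in plain Kotlin, not the sample's actual code: `AppContext` is a stand-in for `android.content.Context` so the snippet compiles without the Android SDK, and `FrameExtractorService`, `VideoMetadataViewModelSketch`, and the placeholder extraction logic are hypothetical names used only for illustration.

```kotlin
// Stand-in for android.content.Context so this sketch compiles without the Android SDK.
// In a real app this would be the application Context.
class AppContext(val packageName: String)

// Hypothetical service: it owns the application context,
// so callers never touch a Context themselves.
class FrameExtractorService(private val appContext: AppContext) {
    // Placeholder for real frame extraction; returns one label per timestamp.
    fun extractThumbnails(videoUri: String, timestampsMs: List<Long>): List<String> =
        timestampsMs.map { ms -> "${appContext.packageName}/thumb?video=$videoUri&t=$ms" }
}

// The ViewModel depends only on the service, which is easy to replace with a fake in tests.
class VideoMetadataViewModelSketch(private val frameExtractor: FrameExtractorService) {
    var thumbnails: List<String> = emptyList()
        private set

    fun onThumbnailsRequested(videoUri: String, timestampsMs: List<Long>) {
        thumbnails = frameExtractor.extractThumbnails(videoUri, timestampsMs)
    }
}

fun main() {
    val service = FrameExtractorService(AppContext("com.example.sample"))
    val viewModel = VideoMetadataViewModelSketch(service)
    viewModel.onThumbnailsRequested("video.mp4", listOf(1_000L, 65_000L))
    println(viewModel.thumbnails.size)
}
```

With Hilt, the service would presumably receive the application context through constructor injection, so the ViewModel constructor would only list the service.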

+1 I think for this sample using an AndroidViewModel is fine.

...m/android/ai/samples/geminivideometadatacreation/viewmodel/VideoMetadataCreationViewModel.kt

ai-catalog/samples/gemini-video-metadata-creation/src/main/res/values/strings.xml


lethargicpanda · 2025-08-11T21:45:18Z

...m/android/ai/samples/geminivideometadatacreation/viewmodel/VideoMetadataCreationViewModel.kt

+            try {
+                val generativeModel =
+                    Firebase.ai(backend = GenerativeBackend.vertexAI())
+                        .generativeModel("gemini-2.0-flash")


Can we use Gemini 2.5 Flash instead?

Agreed. Will update.

JolandaVerhoef · 2025-08-12T08:49:20Z

ai-catalog/app/src/main/java/com/android/ai/catalog/ui/domain/SampleCatalog.kt

 import com.android.ai.samples.geminivideosummary.VideoSummarizationScreen
 import com.android.ai.samples.genai_image_description.GenAIImageDescriptionScreen
 import com.android.ai.samples.genai_summarization.GenAISummarizationScreen
 import com.android.ai.samples.genai_writing_assistance.GenAIWritingAssistanceScreen
 import com.android.ai.samples.imagen.ui.ImagenScreen
 import com.android.ai.samples.magicselfie.ui.MagicSelfieScreen

+@SuppressLint("UnsafeOptInUsageError", "NewApi")


Instead of suppressing these (valid) lint errors here at the top level, wdyt about checking for API level inside the VideoMetadataCreationScreen? That way you can show a nice "not supported" message on that screen instead of it crashing on lower API levels.

This is a very valid point. Updated accordingly
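One way to express the suggested check is sketched below in plain Kotlin. The names `ScreenContent` and `resolveScreenContent` are hypothetical, and the minimum API level of 34 is an assumption (the thread does not state which level the screen actually needs). On Android, the `deviceApiLevel` argument would come from `Build.VERSION.SDK_INT`.

```kotlin
// Assumption: the minimum API level the screen needs; the review does not state it.
val assumedMinApiLevel = 34

// What the screen should render for a given device.
sealed interface ScreenContent {
    object Sample : ScreenContent
    data class NotSupported(val message: String) : ScreenContent
}

// Decide once, at the top of the screen, instead of suppressing the lint warning.
fun resolveScreenContent(deviceApiLevel: Int, minApiLevel: Int = assumedMinApiLevel): ScreenContent =
    if (deviceApiLevel >= minApiLevel) {
        ScreenContent.Sample
    } else {
        ScreenContent.NotSupported("This sample requires API level $minApiLevel or higher.")
    }

fun main() {
    println(resolveScreenContent(deviceApiLevel = 30))
    println(resolveScreenContent(deviceApiLevel = 35))
}
```

The composable can then branch on the result and show the message as plain text on unsupported devices.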

JolandaVerhoef · 2025-08-12T08:59:33Z

.../main/java/com/android/ai/samples/geminivideometadatacreation/player/ExtractHDRThumbnails.kt

+
+    return try {
+        withContext(Dispatchers.IO) {
+            // Enable HDR frames fi=or better image quality


JolandaVerhoef · 2025-08-12T15:37:39Z

ai-catalog/samples/gemini-video-metadata-creation/src/main/res/values/strings.xml

+    <string name="create_metadata_button">Create Metadata</string>
+    <string name="video_metadata_creation_title">Video Metadata Creation</string>
+    <string name="output_text_combined">%s%s</string>
+    <string name="output_text_generated_placeholder">"Text generated with Gemini : "</string>


nit: remove the space between "Gemini" and ":"

JolandaVerhoef · 2025-08-12T15:38:24Z

ai-catalog/samples/gemini-video-metadata-creation/src/main/res/values/strings.xml

+    <string name="select_video_placeholder">Select Video</string>
+    <string name="create_metadata_button">Create Metadata</string>
+    <string name="video_metadata_creation_title">Video Metadata Creation</string>
+    <string name="output_text_combined">%s%s</string>


This is a bit weird - why not "Text generated with Gemini: %s" and then just replace the dynamic part of the string? Or is there a case where the first %s would resolve to something else?

Good catch. This was a typo.
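The single-placeholder approach can be sketched in plain Kotlin as follows. The template mirrors the string proposed in the review; `outputTextTemplate` and `formatOutputText` are hypothetical names. On Android the same substitution would typically go through `getString(resId, generatedText)` on a string resource.

```kotlin
// Mirrors a string resource whose value is "Text generated with Gemini: %s".
val outputTextTemplate = "Text generated with Gemini: %s"

// Only the dynamic part is substituted; the label stays in one place.
fun formatOutputText(generatedText: String): String =
    outputTextTemplate.format(generatedText)

fun main() {
    println(formatOutputText("A short clip about a rabbit."))
}
```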

JolandaVerhoef · 2025-08-12T15:50:33Z

ai-catalog/samples/gemini-video-metadata-creation/src/main/res/values/strings.xml

+    <!--Video titles for list of sample videos-->
+    <string name="video_title_big_buck_bunny">Big Buck Bunny</string>
+    <string name="video_title_android_spotlight_shorts">Android Spotlight Week (Shorts video)</string>
+    <string name="video_title_rio_de_janeiro">Rio De Janerio</string>


JolandaVerhoef · 2025-08-12T15:55:38Z

...m/android/ai/samples/geminivideometadatacreation/viewmodel/VideoMetadataCreationViewModel.kt

+ * [StateFlow].
+ */
+@HiltViewModel
+class VideoMetadataCreationViewModel @Inject constructor(private val application: Application) : ViewModel() {


+1 I think for this sample using an AndroidViewModel is fine.

...m/android/ai/samples/geminivideometadatacreation/viewmodel/VideoMetadataCreationViewModel.kt

JolandaVerhoef · 2025-08-12T16:00:49Z

...m/android/ai/samples/geminivideometadatacreation/viewmodel/VideoMetadataCreationViewModel.kt

+            "Provide a compelling and concise description for this video, suitable for a YouTube video description in about 7-8 lines." +
+                "The description should be engaging and accurately reflect the video\'s content. Don't assume if you don't know"
+        MetadataType.THUMBNAILS ->
+            "Get three thumbnails for this video. Return only a comma separated list of timestamps in format \"hh:mm:ss\". Don\'t return any other text."


Just for my understanding: is there any logic behind which timestamps are returned here? Does Gemini look for "good" thumbnails or does it just pick randomly?

Good catch. The original prompt got rewritten somehow. Updated.

JolandaVerhoef · 2025-08-12T16:01:08Z

...-creation/src/main/java/com/android/ai/samples/geminivideometadatacreation/util/VideoList.kt

+    ),
+    VideoItem(
+        R.string.video_title_rio_de_janeiro,
+        "gs://cloud-samples-data/generative-ai/video/rio_de_janeiro_beyond_the_map_rio.mp4".toUri(),


This doesn't load for me

There is some error on the GCP side. Removed the ones that don't load.

This change also refactors the prompts into a dedicated file, improves timestamp parsing to support both `hh:mm:ss` and `mm:ss` formats, and makes HDR thumbnail extraction conditional on Android 14 and above. Additionally, unused video samples and annotations have been removed.
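A parser that accepts both formats might look like the sketch below. It is not the code from this PR: `timestampToMillis` and `parseTimestampList` are hypothetical names, and silently skipping malformed entries is an assumption about the desired behavior.

```kotlin
// Converts "hh:mm:ss" or "mm:ss" into milliseconds; returns null for anything else.
fun timestampToMillis(timestamp: String): Long? {
    val parts = timestamp.trim().split(":").map { it.toLongOrNull() ?: return null }
    val (hours, minutes, seconds) = when (parts.size) {
        3 -> Triple(parts[0], parts[1], parts[2])
        2 -> Triple(0L, parts[0], parts[1])
        else -> return null
    }
    if (hours < 0L || minutes !in 0L..59L || seconds !in 0L..59L) return null
    return ((hours * 60 + minutes) * 60 + seconds) * 1000
}

// Parses a comma-separated model response, skipping entries that are not valid timestamps.
fun parseTimestampList(response: String): List<Long> =
    response.split(",").mapNotNull { timestampToMillis(it) }

fun main() {
    println(parseTimestampList("00:00:05, 01:30, not-a-timestamp"))
}
```

Dropping invalid entries keeps thumbnail extraction working when the model returns extra text, at the cost of hiding malformed responses.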

This commit migrates the video player from an `AndroidView`-wrapped `PlayerView` to the new `PlayerSurface` composable from the `media3-ui-compose` library. The screen layout is also updated with weights to better manage the space between the player and the generated metadata.

lethargicpanda · 2025-08-13T22:55:32Z

...creation/src/main/java/com/android/ai/samples/geminivideometadatacreation/util/PromptList.kt

+val promptList = listOf(
+    Prompt(
+        metadataType = MetadataType.DESCRIPTION,
+        text = "Provide a compelling and concise description for this video, suitable for a YouTube video description in about 7-8 lines." +


nit: I would add `Return just the description, nothing else.` to the prompt.

lethargicpanda · 2025-08-13T23:08:43Z

.../main/java/com/android/ai/samples/geminivideometadatacreation/VideoMetadataCreationScreen.kt

+            selectedMetadataType = uiState.selectedMetadataType,
+            onMetadataCreationClicked = onMetadataTypeClicked,
+        )
+


Can we reset the content of OutputTextDisplay when the user selects a different video?
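A reset like this can be modeled as a pure state transition. The sketch below is plain Kotlin with hypothetical names (`UiStateSketch`, `onVideoSelected`); the fields are assumptions and do not mirror the sample's actual UI state class.

```kotlin
// Hypothetical UI state: the selected video plus everything generated for it.
data class UiStateSketch(
    val selectedVideo: String? = null,
    val generatedText: String = "",
    val thumbnails: List<String> = emptyList(),
)

// Selecting a different video drops the previous output; reselecting the same one keeps it.
fun onVideoSelected(current: UiStateSketch, newVideo: String): UiStateSketch =
    if (newVideo == current.selectedVideo) current
    else UiStateSketch(selectedVideo = newVideo)

fun main() {
    val before = UiStateSketch("a.mp4", generatedText = "Old description", thumbnails = listOf("t1"))
    println(onVideoSelected(before, "b.mp4"))
}
```

In a ViewModel backed by a `MutableStateFlow`, the same function could be applied inside an `update { }` call.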

MayuriKhinvasara added 7 commits August 11, 2025 16:58

8f853dc Create a new video-metatdata-creation-sample with Gemini, Firebase and Media3

5a3d33a Create a new video-metatdata-creation-sample with Gemini, Firebase and Media3

6331d80 Create a new video-metatdata-creation-sample with Gemini, Firebase and Media3

688c239 Create a new video-metatdata-creation-sample with Gemini, Firebase and Media3

58941a1 Create a new video-metatdata-creation-sample with Gemini, Firebase and Media3

91ca1d4 Create a new video-metatdata-creation-sample with Gemini, Firebase and Media3

7995179 Merge remote-tracking branch 'origin/video-metadata-creation' into video-metadata-creation

gemini-code-assist bot reviewed Aug 11, 2025

MayuriKhinvasara and others added 3 commits August 12, 2025 02:13

64e58d2 Merge remote-tracking branch 'origin/video-metadata-creation' into video-metadata-creation

f7d8d84 Merge remote-tracking branch 'origin/video-metadata-creation' into video-metadata-creation

8aadc1c 🤖 Apply Spotless formatting

MayuriKhinvasara marked this pull request as ready for review August 11, 2025 20:46

MayuriKhinvasara requested review from lethargicpanda, calren and ksemenova as code owners August 11, 2025 20:46

MayuriKhinvasara requested a review from bentrengrove August 11, 2025 20:46

lethargicpanda reviewed Aug 11, 2025

MayuriKhinvasara requested a review from JolandaVerhoef August 12, 2025 08:24

JolandaVerhoef reviewed Aug 12, 2025

MayuriKhinvasara added 2 commits August 14, 2025 04:09

lethargicpanda reviewed Aug 13, 2025

JolandaVerhoef approved these changes Aug 14, 2025

MayuriKhinvasara and others added 3 commits August 15, 2025 03:00

5d74ebf Reset metadata on video change and update description prompt

b1ca497 Merge branch 'main' into video-metadata-creation

b1ec586 🤖 Apply Spotless formatting

MayuriKhinvasara changed the title ~~video-metadata-creation~~ Add Video metadata creation sample with Gemini and Media3 Aug 14, 2025

lethargicpanda approved these changes Aug 14, 2025

MayuriKhinvasara merged commit 9c476d1 into main Aug 14, 2025
1 check passed
