Add Moondream architecture skeleton by BharathC0 · Pull Request #2553 · keras-team/keras-hub

BharathC0 · 2026-01-30T13:08:15Z

Description

fixes #2549
This PR introduces the initial architecture skeleton for the Moondream model (a tiny vision-language model).

It implements the core components required to run the model, bridging the SigLIP vision encoder and the Phi-1.5 text decoder.

Added Components:

MoondreamBackbone: Handles the connection between the vision encoder and text decoder using a projection layer. It supports mixed input (images + text).
MoondreamCausalLM: The end-to-end task model for image-to-text generation.
MoondreamPreprocessor: Handles tokenization and image resizing/rescaling logic.
__init__.py: Registered the new classes.
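
As a rough illustration of the connection the backbone description refers to, the sketch below projects image features into the text embedding width and places them in front of the text embeddings. It uses plain NumPy rather than the PR's classes. The function name `project_and_merge`, every dimension, the random weights, and the choice to put image tokens first are assumptions made for illustration, not details taken from the PR.

```python
import numpy as np


def project_and_merge(image_features, text_embeddings, kernel, bias):
    """Project image features to the text width, then prepend them.

    image_features:  (batch, num_patches, vision_dim)
    text_embeddings: (batch, seq_len, text_dim)
    kernel:          (vision_dim, text_dim)
    bias:            (text_dim,)
    """
    # Linear projection: (batch, num_patches, text_dim).
    projected = image_features @ kernel + bias
    # Image tokens come first here; the real ordering is an assumption.
    return np.concatenate([projected, text_embeddings], axis=1)


rng = np.random.default_rng(0)
batch, num_patches, vision_dim = 2, 16, 32  # hypothetical sizes
seq_len, text_dim = 8, 64  # hypothetical sizes

image_features = rng.normal(size=(batch, num_patches, vision_dim))
text_embeddings = rng.normal(size=(batch, seq_len, text_dim))
kernel = rng.normal(size=(vision_dim, text_dim))
bias = np.zeros(text_dim)

merged = project_and_merge(image_features, text_embeddings, kernel, bias)
print(merged.shape)  # (2, 24, 64)
```

The merged sequence has `num_patches + seq_len` positions, which is the shape a text decoder would then consume under these assumptions.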

Verification

I have verified the architecture locally using mock encoders and decoders to ensure:

Correct connectivity between Vision and Text components.
Correct output shapes for forward passes.
Compatibility with Keras functional API.
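
A check of this kind might look like the sketch below, which wires mock components together with the Keras functional API and inspects the output shape. It is not the PR's verification script: the mock encoder (a single strided convolution), the mock decoder (an embedding plus one dense layer), the input names, and all sizes are hypothetical stand-ins.

```python
import numpy as np
import keras
from keras import layers

# Hypothetical sizes chosen only to keep the example small.
IMAGE_SIZE, PATCH, VISION_DIM = 32, 8, 16
TEXT_DIM, VOCAB, SEQ_LEN = 24, 100, 6
NUM_PATCHES = (IMAGE_SIZE // PATCH) ** 2

images = keras.Input(shape=(IMAGE_SIZE, IMAGE_SIZE, 3), name="images")
token_ids = keras.Input(shape=(SEQ_LEN,), dtype="int32", name="token_ids")

# Mock vision encoder: one strided convolution that emits patch features.
patches = layers.Conv2D(VISION_DIM, kernel_size=PATCH, strides=PATCH)(images)
patches = layers.Reshape((NUM_PATCHES, VISION_DIM))(patches)

# Projection into the text embedding width.
projected = layers.Dense(TEXT_DIM, name="vision_projection")(patches)

# Mock text decoder: an embedding followed by a single dense layer.
embedded = layers.Embedding(VOCAB, TEXT_DIM)(token_ids)
sequence = layers.Concatenate(axis=1)([projected, embedded])
hidden = layers.Dense(TEXT_DIM, activation="gelu")(sequence)

mock_model = keras.Model(
    inputs={"images": images, "token_ids": token_ids}, outputs=hidden
)

batch = {
    "images": np.random.uniform(
        size=(2, IMAGE_SIZE, IMAGE_SIZE, 3)
    ).astype("float32"),
    "token_ids": np.random.randint(0, VOCAB, size=(2, SEQ_LEN)).astype(
        "int32"
    ),
}
outputs = mock_model(batch)
print(tuple(outputs.shape))  # (2, 22, 24)
```

If the graph builds and the forward pass returns `(batch, NUM_PATCHES + SEQ_LEN, TEXT_DIM)`, the two branches are connected and the shapes line up, which is all a mock-based check of this sort can confirm.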

Note

This PR focuses on the architecture implementation. Pre-trained weights (presets) and extensive unit tests will be added in follow-up PRs.

gemini-code-assist · 2026-01-30T13:08:33Z

Summary of Changes

Hello @BharathC0, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request establishes the foundational architecture for the Moondream vision-language model within Keras Hub. It defines the core components necessary to integrate a vision encoder with a text decoder, enabling image-to-text generation capabilities. The implementation includes the backbone for feature projection, the causal language model wrapper, and a dedicated preprocessor for handling multimodal inputs.

Highlights

Moondream Architecture Skeleton: This PR introduces the initial architecture skeleton for the Moondream model, a tiny vision-language model, laying the groundwork for its full implementation.
Core Component Integration: It implements the core components required to bridge the SigLIP vision encoder and the Phi-1.5 text decoder, enabling multimodal processing.
MoondreamBackbone: The MoondreamBackbone class is added to handle the connection between the vision encoder and text decoder using a projection layer, supporting mixed input (images + text).
MoondreamCausalLM: The MoondreamCausalLM class is introduced as the end-to-end task model for image-to-text generation.
MoondreamPreprocessor: A MoondreamPreprocessor is included to manage tokenization and image resizing/rescaling logic for multimodal inputs.
Local Verification: The architecture has been verified locally using mock encoders and decoders to ensure correct connectivity, output shapes, and compatibility with the Keras functional API.
Future Work: This PR focuses solely on the architecture implementation; pre-trained weights (presets) and extensive unit tests will be added in follow-up pull requests.
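
The image resizing and rescaling step attributed to the preprocessor can be sketched as follows. This is a minimal NumPy illustration, not the PR's code: the nearest-neighbor interpolation, the 4x4 target size, the 1/255 scale factor, and the helper names `resize_nearest` and `preprocess_image` are all assumptions, since the page does not show the actual settings.

```python
import numpy as np


def resize_nearest(image, height, width):
    """Nearest-neighbor resize of an (H, W, C) array."""
    src_h, src_w = image.shape[:2]
    # Map each target row/column back to a source index.
    rows = np.arange(height) * src_h // height
    cols = np.arange(width) * src_w // width
    return image[rows[:, None], cols[None, :]]


def preprocess_image(image, size=(4, 4), scale=1.0 / 255.0):
    """Resize, then rescale uint8 pixels to floats in [0, 1]."""
    resized = resize_nearest(image, *size)
    return resized.astype("float32") * scale


# An 8x8 RGB image with values 0..191, so it fits in uint8.
image = np.arange(8 * 8 * 3, dtype=np.uint8).reshape(8, 8, 3)
pixels = preprocess_image(image)
print(pixels.shape, pixels.dtype)  # (4, 4, 3) float32
```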

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces the architecture skeleton for the Moondream model. The overall structure is well-defined, but there are several areas that need to be aligned with the repository's contribution guidelines. My review focuses on ensuring adherence to the style guide, particularly regarding the use of the Keras Functional API for model definitions, the inclusion of comprehensive docstrings, and correct implementation patterns for multimodal data handling. Addressing these points will improve the maintainability and consistency of the new model within the KerasHub ecosystem.

keras_hub/src/models/moondream/moondream_backbone.py

keras_hub/src/models/moondream/moondream_causal_lm.py

keras_hub/src/models/moondream/moondream_preprocessor.py

keras_hub/src/models/moondream/__init__.py

BharathC0 · 2026-01-30T15:29:43Z

@gemini-code-assist review

gemini-code-assist

Code Review

This pull request introduces the initial architecture skeleton for the Moondream model, including the MoondreamBackbone, MoondreamCausalLM, and MoondreamPreprocessor components. The architecture has been locally verified with mock encoders and decoders. The review focuses on identifying potential issues related to correctness and adherence to the repository's style guide, particularly naming conventions, file structure, and code implementation style.

keras_hub/src/models/moondream/moondream_backbone.py

keras_hub/src/models/moondream/moondream_causal_lm.py

keras_hub/src/models/moondream/moondream_preprocessor.py

BharathC0 · 2026-01-30T16:05:50Z

@gemini-code-assist review

gemini-code-assist · 2026-01-30T16:06:02Z

Hi @BharathC0, thanks for the ping! I've kicked off the review process as requested.

gemini-code-assist

Code Review

This pull request introduces the architectural skeleton for the Moondream model, including the MoondreamBackbone, MoondreamCausalLM, and MoondreamPreprocessor. The overall structure aligns well with the repository's conventions for new model contributions. My review focuses on a few areas to improve consistency with the style guide's implementation patterns and to address a missing piece of functionality in the preprocessor that is crucial for text generation with image inputs. These changes will help ensure the new model is robust and consistent with other models in KerasHub.

keras_hub/src/models/moondream/moondream_preprocessor.py

keras_hub/src/models/moondream/__init__.py

keras_hub/src/models/moondream/moondream_backbone.py

keras_hub/src/models/moondream/moondream_causal_lm.py

BharathC0 · 2026-01-30T16:21:27Z

@gemini-code-assist review

gemini-code-assist

Code Review

The pull request introduces the architecture skeleton for the Moondream model, including the backbone, causal language model, and preprocessor. The implementation generally follows the established patterns for KerasHub models, with appropriate docstrings and get_config methods. However, there are a couple of issues related to file management and completeness of exports that need to be addressed.
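
For context on the `get_config` pattern mentioned here, the sketch below shows the usual round trip in plain Python: `get_config` returns the constructor arguments as a dictionary and `from_config` rebuilds an equivalent object from it. `TinyBackbone` and its arguments are hypothetical and do not reproduce the PR's classes.

```python
class TinyBackbone:
    """Hypothetical stand-in for a configurable model class."""

    def __init__(self, hidden_dim, projection_dim, name="tiny_backbone"):
        self.hidden_dim = hidden_dim
        self.projection_dim = projection_dim
        self.name = name

    def get_config(self):
        # Return only plain, serializable constructor arguments.
        return {
            "hidden_dim": self.hidden_dim,
            "projection_dim": self.projection_dim,
            "name": self.name,
        }

    @classmethod
    def from_config(cls, config):
        # Rebuild the object from the dictionary produced by get_config.
        return cls(**config)


original = TinyBackbone(hidden_dim=64, projection_dim=32)
restored = TinyBackbone.from_config(original.get_config())
print(restored.get_config() == original.get_config())  # True
```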

keras_hub/api/models/__init__.py

keras_hub/src/models/moondream/__init__.py

BharathC0 · 2026-01-30T16:28:11Z

@gemini-code-assist review

gemini-code-assist

Code Review

The pull request introduces the architectural skeleton for the Moondream model, integrating a vision encoder and a text decoder. The implementation includes MoondreamBackbone, MoondreamCausalLM, and MoondreamPreprocessor, which are well-structured and follow the Keras functional API where appropriate. The docstrings are generally good, providing clear explanations and examples. However, a critical issue was identified regarding the modification of an autogenerated file, and a minor improvement can be made to a docstring for better adherence to the style guide.

keras_hub/api/models/__init__.py

keras_hub/src/models/moondream/moondream_preprocessor.py

BharathC0 · 2026-01-30T16:32:28Z

@gemini-code-assist review

gemini-code-assist

Code Review

This pull request introduces the architectural skeleton for the Moondream model, including the MoondreamBackbone, MoondreamCausalLM, and MoondreamPreprocessor. The overall structure adheres to the repository's modularity principles and naming conventions. Docstrings are well-formatted and include examples, which is excellent for maintainability. The use of keras.ops ensures backend agnosticism, aligning with the project's goals. I've identified a missing import in the __init__.py file and a potential runtime issue related to the text_decoder interface in the backbone. These are detailed in the specific review comments below.

keras_hub/src/models/moondream/__init__.py

keras_hub/src/models/moondream/moondream_backbone.py

BharathC0 · 2026-01-30T16:40:23Z

@sachinprasadhs please review

sachinprasadhs · 2026-02-20T19:34:06Z

@BharathC0, thanks for the implementation. Please add the complete implementation in a single PR; this is the standard process we follow here to make review easier. Thanks.

Add Moondream

3bba424

gemini-code-assist bot reviewed Jan 30, 2026

BharathC0 force-pushed the moondream-architecture branch 2 times, most recently from ce78cce to 3355ea5, on January 30, 2026 15:22

gemini-code-assist bot reviewed Jan 30, 2026

BharathC0 force-pushed the moondream-architecture branch 2 times, most recently from dda3541 to 7dba12e, on January 30, 2026 16:03

gemini-code-assist bot reviewed Jan 30, 2026

BharathC0 force-pushed the moondream-architecture branch from 6b1c3ec to 95e3804 on January 30, 2026 16:20

gemini-code-assist bot reviewed Jan 30, 2026

gemini-code-assist bot reviewed Jan 30, 2026

BharathC0 force-pushed the moondream-architecture branch from 9ce3307 to 9e66a2f on January 30, 2026 16:31

gemini-code-assist bot reviewed Jan 30, 2026

docs: Add docstrings and usage example to MoondreamPreprocessor

553839f

BharathC0 force-pushed the moondream-architecture branch from 16751d2 to 553839f on January 30, 2026 16:47

BharathC0 mentioned this pull request in "Add Moondream VLM Architecture" #2549 (open) on Feb 2, 2026

sachinprasadhs self-requested a review February 4, 2026 20:04

sachinprasadhs added the "new model" label (for PRs that contribute a new model to the Keras Hub registry) on Feb 9, 2026

sachinprasadhs added the "stat:awaiting response from contributor" label on Feb 20, 2026
