
Granite Docling Recipe #244

Open

AishaDarga wants to merge 1 commit into ibm-granite-community:main from AishaDarga:granite-docling-recipe

Conversation

@AishaDarga (Contributor)

PR Checklist

Model Interaction

  • Flexible LLM platform support: The platform should be easily switchable. Use LangChain or LlamaIndex.
  • Use the prompt guide corresponding to the model: for example, for Granite 3.x Language Models.

Data

  • Example data: Follow the example data guidance.

Notebook requirements

  • Notebook outputs cleared: Ensure all notebook outputs are cleared.
  • Pre-commit hooks run: Ensure the pre-commit hooks for notebooks have been run.
  • Automated testing: Add the recipe to the automated tests as described here
  • Test in Google Colab:
    • Test that it works in Google Colab (Python 3.10.12).
    • Colab has its own package set and Python version, so ensure compatibility.
  • Test locally:
    • Ensure the code works in a fresh Python virtual environment (venv).
  • Standard access to secrets and variables: Include %pip install git+https://github.com/ibm-granite-community/utils in the first code cell so that get_env_var is available for accessing secrets and variables in the recipe.
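
The first-cell pattern described in that checklist item can be sketched as a notebook cell. This is a hypothetical layout, not a prescribed one: the import path ibm_granite_community.notebook_utils and the WATSONX_APIKEY variable name are assumptions based on other recipes in the repository and should be verified against an existing notebook.

```python
# Hypothetical first notebook cell: install the community utils package,
# then use get_env_var to read secrets (variable name is an assumption).
%pip install git+https://github.com/ibm-granite-community/utils

from ibm_granite_community.notebook_utils import get_env_var

api_key = get_env_var("WATSONX_APIKEY")
```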

Incoming References

  • README.md updates:
    • Add a link to the recipe in the Table of Contents (ToC).
    • Include a Colab button after that link if the notebook can be run in Colab.

GitHub

  • Commits signed: All commits must be GPG or SSH signed.
  • DCO Compliance: Developer Certificate of Origin (DCO) applies to the code, documentation, and any example data provided. Ensure commits are signed off.

@bjhargrave (Member) left a comment:

Some comments.

Quoted notebook text:

Before extracting and classifying contracts, we need to initialize our two main engines:

- **Granite Docling** – **ibm-granite/granite-docling-258M-mlx**, a multimodal Image-Text-to-Text model designed for converting complex documents (PDFs, scanned images, etc.) into structured, machine-readable formats like Markdown, HTML, or JSON.
@bjhargrave (Member):
This looks like the recipe will only work on macOS (MLX). It should be able to run on any system (linux, windows). Generally we like to test the notebooks in the CI build which is a linux system.

Comment on lines +135 to +137
api_key = os.getenv("WATSON_API_KEY")
project_id = os.getenv("WATSON_PROJECT_ID")
watsonx_url = os.getenv("WATSON_URL", "https://us-south.ml.cloud.ibm.com")
@bjhargrave (Member):
We want to use the get_env method from ibm-granite-community.utils. See how other notebooks get these secrets.
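A hedged sketch of the suggested change, assuming get_env_var lives in ibm_granite_community.notebook_utils as in other recipes; the exact environment-variable names and the two-argument default form are assumptions to verify against the repo's conventions:

```python
from ibm_granite_community.notebook_utils import get_env_var

# get_env_var reads the value from the environment (or prompts for it),
# replacing the bare os.getenv calls quoted above.
api_key = get_env_var("WATSONX_APIKEY")
project_id = get_env_var("WATSONX_PROJECT_ID")
watsonx_url = get_env_var("WATSONX_URL", "https://us-south.ml.cloud.ibm.com")
```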

    logger.error("WATSON_API_KEY or WATSON_PROJECT_ID environment variables not set")
    raise ValueError("Missing required environment variables")

llm = WatsonxLLM(
@bjhargrave (Member):
Any reason why you are not using ChatWatsonx? In general, I have been trying to move the notebooks to use chat completion API.
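One way the switch might look, as a sketch rather than the recipe's actual code; the parameter names follow langchain-ibm's ChatWatsonx constructor, the model id is a placeholder, and the credential variables are assumed to be set earlier in the notebook:

```python
from langchain_ibm import ChatWatsonx

chat_llm = ChatWatsonx(
    model_id="ibm/granite-3-3-8b-instruct",  # placeholder model id
    url=watsonx_url,
    project_id=project_id,
    apikey=api_key,
)

# The chat completion API takes messages; the server applies the model's
# chat template, so no local prompt formatting is needed.
response = chat_llm.invoke([("user", prompt)])
print(response.content)
```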

def extract_contract_text(file_path, max_chars=32000):
@bjhargrave (Member):
Please use type hints on function arguments and return value.

    try:
        logger.info(f"Classifying: {contract_name}")
        response = llm(prompt).strip()
@bjhargrave (Member):
The prompt text is not properly formatted using the appropriate Granite chat template. This is a reason to use ChatWatsonx, since the proper formatting will be done server-side. If you want to use WatsonxLLM (the old completion API), then you will need to format the prompt locally. See the use of TokenizerChatPromptTemplate in other notebooks.

Using ChatWatsonx will also allow you to use structured responses, which will provide more consistent results for the JSON schema.
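A minimal sketch of the structured-response idea, using a TypedDict schema with LangChain's with_structured_output; the schema fields are assumptions for illustration, not the recipe's actual classification schema:

```python
from typing import TypedDict

class ContractClassification(TypedDict):
    """Assumed output schema for the contract classifier (illustrative only)."""
    category: str      # e.g. "NDA", "MSA", "SOW"
    confidence: float  # model confidence in [0, 1]

# With a chat model such as ChatWatsonx, the schema can be enforced by the
# model API instead of parsing raw text:
# structured_llm = chat_llm.with_structured_output(ContractClassification)
# result = structured_llm.invoke(prompt)  # dict matching the schema
```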

    langchain_ibm \
    langchain_community \
    transformers \
    mlx-vlm
@bjhargrave (Member):
My guess is that installing mlx on a non-mac will be an error.
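One hedged way to avoid that failure is to gate the MLX dependency with a PEP 508 environment marker, so that non-macOS installs simply skip it:

```shell
# mlx-vlm is only installed on macOS; elsewhere pip ignores the requirement.
pip install 'mlx-vlm; sys_platform == "darwin"'
```

The notebook would then also need a non-MLX code path for the Linux CI run, for example loading the base ibm-granite/granite-docling-258M checkpoint via transformers instead of the MLX variant.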

