@NokeYuan Thank you for using ContextGem and for your questions and feedback!

Regarding prompt templates and model adaptations: one of the key features of the framework is automated dynamic prompts. The internal prompts are Jinja2 templates that are dynamically rendered based on your extraction configuration. The templates are "dynamic" rather than "static" in the sense that they contain conditional logic and variable placeholders that get populated based on that configuration - i.e. it's not just variable substitution, but conditional rendering that can produce very different final prompts depending on whether you're extracting with justifications, sentence-level vs. paragraph-level references, etc.
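To give a flavor of what conditional rendering means in practice, here is a minimal hypothetical sketch - not the actual `extract_aspect_items.j2` template, and the variable names (`add_justifications`, `reference_depth`, `item_names`) are invented for illustration:

```python
from jinja2 import Template

# Hypothetical template: one source, structurally different prompts
# depending on the extraction configuration passed to render().
template = Template(
    """Extract the following items from the document.
{% if add_justifications %}
For each item, also provide a short justification citing the source.
{% endif %}
{% if reference_depth == "sentences" %}
Return the IDs of the supporting sentences for each item.
{% elif reference_depth == "paragraphs" %}
Return the IDs of the supporting paragraphs for each item.
{% endif %}
Items to extract: {{ item_names | join(", ") }}"""
)

print(template.render(
    add_justifications=True,
    reference_depth="sentences",
    item_names=["governing law", "termination clause"],
))
```

Flipping `add_justifications` off or switching `reference_depth` yields a structurally different final prompt from the same template, which is what "dynamic" refers to above.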
The prompt templates themselves are not modified based on the specific LLM being used: whether you use gpt-4o, deepseek-r1, or any other model, the same Jinja2 templates are used with the same structural content and instructions. The templates are model-agnostic - the framework does not automatically modify prompt content based on the capabilities or characteristics of different LLMs. All internal prompt templates are reviewed and tested with various extraction configurations, and they contain detailed instructions specifically designed to cover all of ContextGem's extraction functionality.

ContextGem does implement model-specific adaptations, but these occur at the LLM API call level (parameters like max_tokens vs. max_completion_tokens, handling of system messages, etc.), not in the prompt content itself.
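As a rough sketch of what adaptation at the API call level (rather than in the prompt) can look like - hypothetical code, not ContextGem's actual internals; the model-name check and the helper name are assumptions, though the parameter split mirrors the real OpenAI API difference between chat models and reasoning models:

```python
def build_completion_params(model: str, prompt: str, max_out: int) -> dict:
    """Adapt API call parameters to the target model; the prompt text
    itself stays identical regardless of which model is used."""
    params = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    # OpenAI reasoning models (e.g. o1) expect max_completion_tokens and
    # restrict system messages; older chat models use max_tokens.
    if model.startswith(("o1", "o3")):
        params["max_completion_tokens"] = max_out
    else:
        params["max_tokens"] = max_out
        params["messages"].insert(
            0, {"role": "system", "content": "You are an extraction assistant."}
        )
    return params
```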
Regarding reference extraction robustness: ContextGem's approach to reference extraction is indeed a key architectural decision that significantly reduces hallucination risk. The framework achieves this through several mechanisms, centered on ID-based reference mapping: with the ID-based approach, ContextGem offers much more reliable reference mapping, without the common issues of partial matches, typo corrections, whitespace inconsistencies, etc. that often arise in recitation-based approaches.
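The general technique looks roughly like this - a simplified sketch of ID-based reference mapping, not ContextGem's actual code (the ID format and prompt wording are invented):

```python
# The document is sent to the LLM with stable paragraph IDs, the model
# returns IDs instead of reciting text, and the IDs are mapped back to
# the exact original strings.

paragraphs = [
    "This Agreement is governed by the laws of England.",
    "Either party may terminate on 30 days' written notice.",
]
id_to_paragraph = {f"P{i}": p for i, p in enumerate(paragraphs)}

numbered_doc = "\n".join(f"[{pid}] {p}" for pid, p in id_to_paragraph.items())
prompt = (
    "Extract the governing law clause. Return only the IDs of the "
    f"supporting paragraphs.\n\n{numbered_doc}"
)

# Suppose the model answered with these IDs (response parsing omitted):
returned_ids = ["P0"]

# Validate IDs against the known set, then map back to verbatim source
# text. A hallucinated or malformed ID simply fails the lookup instead
# of producing a plausible-looking but fabricated quote.
references = [id_to_paragraph[pid] for pid in returned_ids if pid in id_to_paragraph]
print(references)  # ['This Agreement is governed by the laws of England.']
```

Because the mapping step only ever emits verbatim source strings, the model cannot introduce typos or paraphrases into the references; the worst failure mode is a wrong or invalid ID, which is easy to detect.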
-
Hi Sergii,
I'm using your package and just wondering about some architectural aspects for my own research.
From `contextgem/contextgem/internal/prompts/extract_aspect_items.j2` and `extract_concept_items.j2`, I understand the core of this project is setting up predefined templates and modifying them based on user input. Am I correct? Also, are these prompt templates static (other than user-defined parameters)? For example, if I use different models like GPT-4o vs. LLaMA, would the chat template be modified dynamically based on the model, or do they all use the same one? If it changes dynamically, which part of the code should I look at?

I also see that ContextGem is quite robust at fetching reference lines from long documents. Did you achieve this just through the prompt templates? Which part of the code (or secret sauce) enables you to robustly fetch reference lines without hallucinations?
Best,