Skip to content

Conversation

@stephherbers
Copy link
Contributor

@stephherbers stephherbers commented Jan 6, 2026

resolves #2580
adds message highlighting feature to DatasetMessagesTable:

Technical Description

rows are highlighted when the URL contains a ?message_id= parameter matching the message ID, and the page automatically scrolls to center the highlighted message on load

Demo

Loom Video

Screenshot 2026-01-08 at 4 28 39 PM Screenshot 2026-01-08 at 4 28 52 PM

Docs and Changelog

  • This PR requires docs/changelog update

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 6, 2026

📝 Walkthrough

Walkthrough

Adds support for per-row highlighting and linking between evaluation results and dataset rows. DatasetMessagesTable gains a constructor parameter highlight_message_id stored on the instance. DatasetMessagesTableView switches to a custom template (evaluations/dataset_messages_table.html), extracts an integer message_id from GET params, and passes it to the table and template context. New templates render rows with IDs (message-{id}), apply highlight classes when IDs match highlight_message_id, and auto-scroll the highlighted row into view. Models refactor table row construction to use OrderedDict for deterministic column ordering. Evaluation views add a per-row dataset edit URL/action and dataset_edit now forwards message_id in its dataset messages URL.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • snopoke
  • SmittieC
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed The PR fulfills issue #2580 objectives: enables direct linking to dataset rows (parameter-based), implements highlighting consistent with PR #2448, and adds links in the eval results table to open dataset rows.
Out of Scope Changes check ✅ Passed All changes are scoped to implementing the message highlighting feature; no unrelated modifications were introduced outside the issue requirements.
Description check ✅ Passed The PR description covers the technical implementation with a clear issue reference, demo materials, and flags the need for docs/changelog updates.
Title check ✅ Passed The title clearly and concisely describes the main changes: adding a new template and highlight logic for the dataset table, which aligns with the primary objectives of implementing message highlighting and new template files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI Agents
In @templates/evaluations/dataset_messages_table.html:
- Line 15: The selector using querySelector with `x-data*="selectedMessageRow
=== ${messageId}"` is using a contains match and can produce false positives for
IDs that are substrings; update the selection to use an exact attribute match
instead (e.g., target the full attribute value with `=` rather than `*=`) or,
better, add a stable attribute like `data-message-id` on the row (via
get_row_attrs in apps/evaluations/tables.py, e.g., `"data-message-id":
str(record.id)`) and then change the lookup in the code that defines selectedRow
to use `document.querySelector('tr[data-message-id="' + messageId + '"]')` so
rows are matched exactly by ID (refer to selectedRow, messageId, and the
querySelector call).
🧹 Nitpick comments (2)
apps/evaluations/tables.py (2)

342-348: Well-implemented Alpine.js integration with proper dark mode support.

The get_row_attrs method correctly:

  • Merges default row attributes from settings
  • Adds Alpine.js reactive state with a computed getter for row selection
  • Applies conditional Tailwind classes that support both light and dark modes (as per coding guidelines)

The use of record.id is safe as it's a Django model primary key (integer from the database).

Optional refactor suggestion: Consider adding a data-message-id attribute here to simplify the JavaScript selector logic in the template (see comment on template file).

Suggested addition for cleaner selector targeting
 def get_row_attrs(self, record):
     attrs = {
         **settings.DJANGO_TABLES2_ROW_ATTRS,
         "x-data": f'{{"get isSelected() {{ return selectedMessageRow === {record.id} }}}}',
         ":class": "{ 'bg-sky-100/40 dark:bg-sky-950/40': isSelected }",
+        "data-message-id": str(record.id),
     }
     return attrs

This would allow the template to use document.querySelector(\tr[data-message-id="${messageId}"]`)` for exact matching.


362-362: Minor: row_attrs in Meta is unused due to method override.

The row_attrs defined in the Meta class is not used because the get_row_attrs() method takes precedence in Django Tables2. While this doesn't cause any issues, it may cause confusion for future maintainers.

Consider removing the unused attribute
     class Meta:
         model = EvaluationMessage
         fields = (
             "source",
             "human_message_content",
             "ai_message_content",
             "context",
             "history",
             "participant_data",
             "session_state",
             "actions",
         )
-        row_attrs = settings.DJANGO_TABLES2_ROW_ATTRS
         orderable = False
         empty_text = "No messages in this dataset yet."

Alternatively, add a comment explaining that get_row_attrs() overrides this setting.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ecf4a45 and 909fb4a.

📒 Files selected for processing (3)
  • apps/evaluations/tables.py
  • apps/evaluations/views/dataset_views.py
  • templates/evaluations/dataset_messages_table.html
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{tsx,jsx,html}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{tsx,jsx,html}: Prefer DaisyUI components for UI implementation, with TailwindCSS as fallback for custom styling
All UI components must support both light and dark modes

Files:

  • templates/evaluations/dataset_messages_table.html
**/*.html

📄 CodeRabbit inference engine (AGENTS.md)

Use Django templates with HTMX for dynamic content and Alpine.js for client-side interactivity

Files:

  • templates/evaluations/dataset_messages_table.html
**/templates/{**/,}*.html

📄 CodeRabbit inference engine (AGENTS.md)

Template files should extend web/app/app_base.html and use {% block %} tags for customization

Files:

  • templates/evaluations/dataset_messages_table.html
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use Ruff for Python linting and formatting with line length of 120 characters

Files:

  • apps/evaluations/views/dataset_views.py
  • apps/evaluations/tables.py

⚙️ CodeRabbit configuration file

**/*.py: Do not review for Python compatibility below 3.13
Flag heavy AI/ML imports at module level (langchain_*, openai, anthropic, google, boto3, pandas, numpy). These should be lazy-loaded inside methods to keep Django startup fast.

Files:

  • apps/evaluations/views/dataset_views.py
  • apps/evaluations/tables.py
**/{apps,config}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/{apps,config}/**/*.py: Use lazy imports for heavy AI/ML libraries (langchain_google_vertexai, langchain_google_genai, etc.) to maintain fast Django startup time
Use TYPE_CHECKING imports for type hints when lazy-loading heavy libraries

Files:

  • apps/evaluations/views/dataset_views.py
  • apps/evaluations/tables.py
🧠 Learnings (1)
📚 Learning: 2025-12-19T12:29:01.450Z
Learnt from: CR
Repo: dimagi/open-chat-studio PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-19T12:29:01.450Z
Learning: Applies to **/*.html : Use Django templates with HTMX for dynamic content and Alpine.js for client-side interactivity

Applied to files:

  • templates/evaluations/dataset_messages_table.html
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cypress-tests
  • GitHub Check: python-tests
🔇 Additional comments (2)
templates/evaluations/dataset_messages_table.html (1)

1-25: Verify template structure: consider extending base template.

According to the coding guidelines, template files should extend web/app/app_base.html and use {% block %} tags for customization. This template appears to be a partial template used within Django Tables2's rendering context.

Please confirm whether this is:

  1. A partial template rendered by SingleTableView (current implementation is correct)
  2. A full-page template that should extend app_base.html

If this is a full-page template, it should be restructured to extend the base template. If it's a partial, the current structure is appropriate.

apps/evaluations/views/dataset_views.py (1)

284-284: LGTM - clean integration of custom template.

The template name change correctly connects the DatasetMessagesTableView to the new custom template that provides message highlighting and auto-scroll functionality.

@codecov-commenter
Copy link

codecov-commenter commented Jan 6, 2026

Codecov Report

❌ Patch coverage is 33.33333% with 40 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
apps/evaluations/views/dataset_views.py 20.83% 19 Missing ⚠️
apps/evaluations/views/evaluation_config_views.py 8.33% 11 Missing ⚠️
apps/evaluations/tables.py 18.18% 9 Missing ⚠️
apps/evaluations/models.py 92.30% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@stephherbers
Copy link
Contributor Author

@coderabbitai please review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 8, 2026

@stephherbers, I'll review the changes in this PR.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@dimagi dimagi deleted a comment from coderabbitai bot Jan 8, 2026
@dimagi dimagi deleted a comment from coderabbitai bot Jan 8, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/evaluations/models.py (1)

343-375: Data loss bug: Multiple evaluators per message results are overwritten instead of accumulated.

The refactoring changes from incremental dict updates to replacing the entire row for each result. When a message has multiple evaluators, each iteration creates a new row_data OrderedDict and assigns it to table_by_message[result.message.id], overwriting previous evaluator results for the same message.

The data model explicitly supports this scenario: EvaluationConfig.evaluators is a ManyToManyField, and EvaluationResult has separate ForeignKeys to both Evaluator and EvaluationMessage. The column naming convention "{key} ({evaluator_name})" also expects to display multiple evaluators' results in the same row.

To fix, accumulate evaluator results instead of replacing the entire row. Initialize row data on first encounter of a message, then update with each evaluator's results:

Proposed fix
def get_table_data(self, include_ids: bool = False):
    results = self.results.select_related("message", "evaluator", "session").order_by("created_at").all()
    table_by_message = {}
    for result in results:
        context_columns = {
            f"{key}": value
            for key, value in result.message_context.items()
            if key != "current_datetime"
        }
        
        # Initialize row data only on first result for this message
        if result.message.id not in table_by_message:
            row_data = OrderedDict()
            row_data["session"] = result.session.external_id if result.session_id else ""
            row_data["message_id"] = result.message.id
            row_data["Dataset Input"] = result.input_message
            row_data["Dataset Output"] = result.output_message
            row_data["Generated Response"] = result.output.get("generated_response", "")
            table_by_message[result.message.id] = row_data
        
        # Accumulate evaluator results
        table_by_message[result.message.id].update(
            {f"{key} ({result.evaluator.name})": value for key, value in result.output.get("result", {}).items()}
        )
        
        table_by_message[result.message.id].update(context_columns)
        
        if result.output.get("error"):
            table_by_message[result.message.id]["error"] = result.output.get("error")
        
        if include_ids is True:
            table_by_message[result.message.id]["id"] = result.message.id
    
    return [{"#": index, **row} for index, row in enumerate(table_by_message.values())]
🧹 Nitpick comments (2)
apps/evaluations/views/evaluation_config_views.py (1)

275-285: Consider using proper URL parameter encoding.

While the current implementation works for integer IDs, it's better practice to use Django's URL utilities for query parameter construction. Consider using urllib.parse.urlencode or passing parameters through the reverse function if the URL pattern supports it.

♻️ Proposed refactor using urlencode
+from urllib.parse import urlencode
+
 def dataset_url_factory(_, __, record, value):
     if not value:
         return "#"
     dataset_id = self.evaluation_run.config.dataset_id
     message_id = record.get("message_id")
 
     url = reverse("evaluations:dataset_edit", args=[self.kwargs["team_slug"], dataset_id])
-    return f"{url}?message_id={message_id}"
+    if message_id:
+        return f"{url}?{urlencode({'message_id': message_id})}"
+    return url
templates/evaluations/dataset_edit.html (1)

59-59: Apply URL encoding to query parameter.

While Django auto-escapes HTML content, query parameters in URLs should be properly encoded using the urlencode filter to prevent potential issues with special characters.

♻️ Proposed fix with urlencode filter
-         hx-get="{% url 'evaluations:dataset_messages_table' request.team.slug object.id %}{% if request.GET.message_id %}?message_id={{ request.GET.message_id }}{% endif %}"
+         hx-get="{% url 'evaluations:dataset_messages_table' request.team.slug object.id %}{% if request.GET.message_id %}?message_id={{ request.GET.message_id|urlencode }}{% endif %}"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 909fb4a and b519f35.

📒 Files selected for processing (7)
  • apps/evaluations/models.py
  • apps/evaluations/tables.py
  • apps/evaluations/views/dataset_views.py
  • apps/evaluations/views/evaluation_config_views.py
  • templates/evaluations/dataset_edit.html
  • templates/evaluations/dataset_messages_table.html
  • templates/evaluations/dataset_messages_table_render.html
🚧 Files skipped from review as they are similar to previous changes (2)
  • templates/evaluations/dataset_messages_table.html
  • apps/evaluations/tables.py
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{tsx,jsx,html}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{tsx,jsx,html}: Prefer DaisyUI components for UI implementation, with TailwindCSS as fallback for custom styling
All UI components must support both light and dark modes

Files:

  • templates/evaluations/dataset_edit.html
  • templates/evaluations/dataset_messages_table_render.html
**/*.html

📄 CodeRabbit inference engine (AGENTS.md)

Use Django templates with HTMX for dynamic content and Alpine.js for client-side interactivity

Files:

  • templates/evaluations/dataset_edit.html
  • templates/evaluations/dataset_messages_table_render.html
**/templates/{**/,}*.html

📄 CodeRabbit inference engine (AGENTS.md)

Template files should extend web/app/app_base.html and use {% block %} tags for customization

Files:

  • templates/evaluations/dataset_edit.html
  • templates/evaluations/dataset_messages_table_render.html
**/{apps,config}/**/models.py

📄 CodeRabbit inference engine (AGENTS.md)

**/{apps,config}/**/models.py: Inherit team-scoped models from BaseTeamModel which automatically includes team FK and audit timestamps
Use VersionsMixin and VersionsObjectManagerMixin for models requiring version tracking
Use @audit_fields decorator with AuditingManager for tracking field changes on important models
Use public_id = models.UUIDField(default=uuid.uuid4, unique=True) for external API identifiers
Use BaseModel for non-team-scoped models when team-scoping is not required
All data must be scoped to teams via BaseTeamModel with no cross-team data access without explicit permission

Files:

  • apps/evaluations/models.py
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use Ruff for Python linting and formatting with line length of 120 characters

Files:

  • apps/evaluations/models.py
  • apps/evaluations/views/evaluation_config_views.py
  • apps/evaluations/views/dataset_views.py

⚙️ CodeRabbit configuration file

**/*.py: Do not review for Python compatibility below 3.13
Flag heavy AI/ML imports at module level (langchain_*, openai, anthropic, google, boto3, pandas, numpy). These should be lazy-loaded inside methods to keep Django startup fast.

Files:

  • apps/evaluations/models.py
  • apps/evaluations/views/evaluation_config_views.py
  • apps/evaluations/views/dataset_views.py
**/{apps,config}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/{apps,config}/**/*.py: Use lazy imports for heavy AI/ML libraries (langchain_google_vertexai, langchain_google_genai, etc.) to maintain fast Django startup time
Use TYPE_CHECKING imports for type hints when lazy-loading heavy libraries

Files:

  • apps/evaluations/models.py
  • apps/evaluations/views/evaluation_config_views.py
  • apps/evaluations/views/dataset_views.py
🧠 Learnings (1)
📚 Learning: 2025-12-19T12:29:01.450Z
Learnt from: CR
Repo: dimagi/open-chat-studio PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-19T12:29:01.450Z
Learning: Applies to **/{apps,config}/**/views.py : Filter API queryset responses by team using `MyModel.objects.filter(team=self.request.team)`

Applied to files:

  • apps/evaluations/views/dataset_views.py
🧬 Code graph analysis (1)
apps/evaluations/views/evaluation_config_views.py (1)
apps/generics/actions.py (2)
  • chip_action (172-211)
  • url_factory (195-198)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: python-tests
🔇 Additional comments (5)
templates/evaluations/dataset_messages_table_render.html (1)

1-21: LGTM! Highlighting implementation looks solid.

The template correctly implements row highlighting with proper dark mode support. The record.record.id access pattern is correct for django-tables2's BoundRow structure, and the conditional classes and ID attributes are applied appropriately.

apps/evaluations/views/evaluation_config_views.py (2)

298-306: LGTM! Chip action integration looks correct.

The new dataset link action is properly configured with appropriate URL factory, enabled condition, and opens in a new tab as intended. The use of mark_safe for the static FontAwesome icon is appropriate.


307-309: Good consolidation of related actions.

Returning None for the message_id case is the right approach since the message ID is now accessible through the dataset link action in the session column, avoiding redundant column display.

apps/evaluations/views/dataset_views.py (2)

298-304: Good input validation for highlight_message_id.

The try-except block properly handles invalid or missing message_id parameters by setting highlight_message_id to None, preventing errors while gracefully degrading when the parameter is absent or malformed.


306-313: LGTM! Table kwargs properly configured.

The get_table_kwargs method correctly passes highlight_message_id to the table constructor with proper validation, enabling the highlighting feature in the table rendering layer.

@stephherbers stephherbers marked this pull request as ready for review January 9, 2026 19:04
Comment on lines 355 to 359
"session_state",
"actions",
)
row_attrs = settings.DJANGO_TABLES2_ROW_ATTRS
orderable = False
empty_text = "No messages in this dataset yet."

This comment was marked as outdated.

Comment on lines +307 to 317
context["highlight_message_id"] = self.get_highlight_message_id()
return context

def get_table_kwargs(self):
kwargs = super().get_table_kwargs()
kwargs["highlight_message_id"] = self.get_highlight_message_id()
return kwargs


@login_and_team_required
@require_POST

This comment was marked as outdated.

@stephherbers stephherbers requested a review from snopoke January 13, 2026 16:28
@stephherbers stephherbers changed the title add new template and highlight logic Dataset Table: add new template and highlight logic Jan 15, 2026
Copy link
Contributor

@SmittieC SmittieC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. One comment

Copy link
Contributor

@snopoke snopoke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there.

self.Meta.row_attrs = {
**settings.DJANGO_TABLES2_ROW_ATTRS,
"class": _row_class_factory,
"data-message-id": lambda record: record.id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to use the existing id attribute: id="record-{{record.id}}"

<h2 class="pg-subtitle">{% translate "Dataset Messages" %}</h2>
<div id="dataset-messages-table"
hx-get="{% url 'evaluations:dataset_messages_table' request.team.slug object.id %}"
hx-get="{% url 'evaluations:dataset_messages_table' request.team.slug object.id %}{% if request.GET.message_id %}?message_id={{ request.GET.message_id }}{% endif %}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use `{% querystring %} tag: https://docs.djangoproject.com/en/6.0/ref/templates/builtins/#querystring

Suggested change
hx-get="{% url 'evaluations:dataset_messages_table' request.team.slug object.id %}{% if request.GET.message_id %}?message_id={{ request.GET.message_id }}{% endif %}"
hx-get="{% url 'evaluations:dataset_messages_table' request.team.slug object.id %}{% if request.GET.message_id %}{% querystring %}"

**settings.DJANGO_TABLES2_ROW_ATTRS,
"class": _row_class_factory,
"data-message-id": lambda record: record.id,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code doesn't need to be in the init method. The _row_class_factor function can also accept a table arg (see https://django-tables2.readthedocs.io/en/latest/pages/column-attributes.html#id1).

def _row_class_factory(table, record):
    class_defaults = settings.DJANGO_TABLES2_ROW_ATTRS["class"]
    if record.id == table.highlight_message_id:
        return f"{class_defaults} bg-yellow-100 dark:bg-yellow-900/20"
    return class_defaults

@snopoke
Copy link
Contributor

snopoke commented Jan 22, 2026

@stephherbers bump

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Evals dataset and result linking

5 participants