feat: implement _repr_mimebundle_ for DataFrame#6385
feat: implement _repr_mimebundle_ for DataFrame#6385Abyss-lord wants to merge 3 commits intoEventual-Inc:mainfrom
Conversation
Greptile SummaryThis PR adds Key points from the review:
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant FE as Notebook Frontend
participant IPy as IPython Display
participant DF as DataFrame
FE->>IPy: display(df)
IPy->>DF: _repr_mimebundle_(include, exclude)
DF->>DF: __repr__() → text/plain
DF->>DF: _repr_html_() → text/html
DF->>DF: filter by include/exclude
DF-->>IPy: {text/plain: ..., text/html: ...}
IPy-->>FE: render best matching MIME type
Last reviewed commit: c191e8b |
doc/issue-2598-feasibility.md
Outdated
| # Issue #2598 可行性分析 | ||
|
|
||
| ## 1. issue summary | ||
| 为 `daft.DataFrame` 增加 `_repr_mimebundle_`,让不支持 `_repr_html_` 的 Jupyter 前端(例如 Zed)也能正确展示 DataFrame。 | ||
|
|
||
| ## 2. root cause | ||
| 当前 `DataFrame` 只提供 `__repr__` 和 `_repr_html_`,部分前端优先走 `_repr_mimebundle_` 协议,缺失该协议会导致展示降级或不可用。 | ||
|
|
||
| ## 3. expected modification modules | ||
| - `daft/dataframe/dataframe.py` | ||
| - `tests/dataframe/test_repr.py` | ||
|
|
||
| ## 4. implementation plan | ||
| 1. 在 `DataFrame` 上新增 `_repr_mimebundle_(include=None, exclude=None)`。 | ||
| 2. 默认返回 `text/plain` 与 `text/html` 两种 mime。 | ||
| 3. 支持 `include`/`exclude` 过滤,兼容 IPython display 协议。 | ||
| 4. 新增测试覆盖默认返回与 include/exclude 过滤行为。 | ||
|
|
||
| ## 复杂度评估 | ||
| - 预计修改文件数:2(<20) | ||
| - API 设计变更:无(仅新增兼容显示协议方法) | ||
| - 架构调整:无 | ||
| - 多模块重构:无 | ||
|
|
||
| 结论:可直接实现,风险低。 |
There was a problem hiding this comment.
Internal planning document should not be committed
This file (doc/issue-2598-feasibility.md) is a personal feasibility-analysis scratch note written in Chinese. It is not project documentation and should not be committed to the repository. The project's documentation lives in docs/, not a new doc/ directory.
Please remove this file from the PR before merging.
daft/dataframe/dataframe.py
Outdated
| mimebundle = { | ||
| "text/plain": self.__repr__(), | ||
| "text/html": self._repr_html_(), | ||
| } |
There was a problem hiding this comment.
Eager evaluation of both representations regardless of include filter
Both __repr__() and _repr_html_() are computed up-front before any include/exclude filtering is applied. This means that even when a caller specifies include={"text/plain"}, the potentially expensive _repr_html_() call (which may generate an interactive HTML widget) is still executed unconditionally.
Consider building the bundle lazily so that only the representations that will survive the filter are generated:
| mimebundle = { | |
| "text/plain": self.__repr__(), | |
| "text/html": self._repr_html_(), | |
| } | |
| include_set = set(include) if include is not None else None | |
| exclude_set = set(exclude) if exclude is not None else set() | |
| mimebundle: dict[str, str] = {} | |
| if include_set is None or "text/plain" in include_set: | |
| if "text/plain" not in exclude_set: | |
| mimebundle["text/plain"] = self.__repr__() | |
| if include_set is None or "text/html" in include_set: | |
| if "text/html" not in exclude_set: | |
| mimebundle["text/html"] = self._repr_html_() |
tests/dataframe/test_repr.py
Outdated
| def test_repr_mimebundle_contains_plain_and_html(make_df): | ||
| df = make_df({"A": [1, 2, 3], "B": ["x", "y", "z"]}) | ||
|
|
||
| bundle = df._repr_mimebundle_() | ||
|
|
||
| assert set(bundle.keys()) == {"text/plain", "text/html"} | ||
| assert bundle["text/plain"] == df.__repr__() | ||
| assert bundle["text/html"] == df._repr_html_() | ||
|
|
||
|
|
||
| def test_repr_mimebundle_include_exclude(make_df): | ||
| df = make_df({"A": [1]}) | ||
|
|
||
| assert set(df._repr_mimebundle_(include={"text/plain"}).keys()) == {"text/plain"} | ||
| assert set(df._repr_mimebundle_(exclude={"text/html"}).keys()) == {"text/plain"} | ||
| assert df._repr_mimebundle_(include={"text/plain"}, exclude={"text/plain"}) == {} |
There was a problem hiding this comment.
Prefer a single parametrized test over two separate functions
test_repr_mimebundle_contains_plain_and_html and test_repr_mimebundle_include_exclude both exercise _repr_mimebundle_. Following the project convention of consolidating related assertions, consider merging them — or at a minimum unifying the include/exclude cases into a single parametrized test:
@pytest.mark.parametrize(
"kwargs,expected_keys",
[
({}, {"text/plain", "text/html"}),
({"include": {"text/plain"}}, {"text/plain"}),
({"exclude": {"text/html"}}, {"text/plain"}),
({"include": {"text/plain"}, "exclude": {"text/plain"}}, set()),
],
)
def test_repr_mimebundle(make_df, kwargs, expected_keys):
df = make_df({"A": [1, 2, 3], "B": ["x", "y", "z"]})
bundle = df._repr_mimebundle_(**kwargs)
assert set(bundle.keys()) == expected_keysRule Used: Prefer single parametrized functions over multiple... (source)
Learnt From
Eventual-Inc/Daft#5207
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Unable to generate the performance reportThere was an internal error while processing the run's data. We're working on fixing the issue. Feel free to contact us on Discord or at support@codspeed.io if the issue persists. |
Problem
DataFramedoes not implement_repr_mimebundle_, so some notebook frontends cannot reliably render rich output.Root Cause
Only
__repr__and_repr_html_are implemented today; the IPython mimebundle display protocol is missing.Solution
_repr_mimebundle_(include, exclude)toDataFrame.text/plainandtext/htmlby default.Tests
tests/dataframe/test_repr.py.-k mimebundle: 8 passed.Impact
Improves DataFrame display compatibility across notebook frontends without affecting query behavior.