Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions daft/dataframe/dataframe.py
Original file line number Diff line number Diff line change
Expand Up @@ -685,6 +685,25 @@ def _repr_html_(self) -> str:
except ImportError:
return preview._repr_html_()

@DataframePublicAPI
def _repr_mimebundle_(
self, include: Iterable[str] | None = None, exclude: Iterable[str] | None = None
) -> dict[str, str]:
mimebundle = {
"text/plain": self.__repr__(),
"text/html": self._repr_html_(),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eager evaluation of both representations regardless of include filter

Both __repr__() and _repr_html_() are computed up-front before any include/exclude filtering is applied. This means that even when a caller specifies include={"text/plain"}, the potentially expensive _repr_html_() call (which may generate an interactive HTML widget) is still executed unconditionally.

Consider building the bundle lazily so that only the representations that will survive the filter are generated:

Suggested change
mimebundle = {
"text/plain": self.__repr__(),
"text/html": self._repr_html_(),
}
include_set = set(include) if include is not None else None
exclude_set = set(exclude) if exclude is not None else set()
mimebundle: dict[str, str] = {}
if include_set is None or "text/plain" in include_set:
if "text/plain" not in exclude_set:
mimebundle["text/plain"] = self.__repr__()
if include_set is None or "text/html" in include_set:
if "text/html" not in exclude_set:
mimebundle["text/html"] = self._repr_html_()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix


if include is not None:
include_set = set(include)
mimebundle = {k: v for k, v in mimebundle.items() if k in include_set}

if exclude is not None:
exclude_set = set(exclude)
mimebundle = {k: v for k, v in mimebundle.items() if k not in exclude_set}

return mimebundle

###
# Creation methods
###
Expand Down
25 changes: 25 additions & 0 deletions doc/issue-2598-feasibility.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
# Issue #2598 可行性分析

## 1. issue summary
为 `daft.DataFrame` 增加 `_repr_mimebundle_`,让不支持 `_repr_html_` 的 Jupyter 前端(例如 Zed)也能正确展示 DataFrame。

## 2. root cause
当前 `DataFrame` 只提供 `__repr__` 和 `_repr_html_`,部分前端优先走 `_repr_mimebundle_` 协议,缺失该协议会导致展示降级或不可用。

## 3. expected modification modules
- `daft/dataframe/dataframe.py`
- `tests/dataframe/test_repr.py`

## 4. implementation plan
1. 在 `DataFrame` 上新增 `_repr_mimebundle_(include=None, exclude=None)`。
2. 默认返回 `text/plain` 与 `text/html` 两种 mime。
3. 支持 `include`/`exclude` 过滤,兼容 IPython display 协议。
4. 新增测试覆盖默认返回与 include/exclude 过滤行为。

## 复杂度评估
- 预计修改文件数:2(<20)
- API 设计变更:无(仅新增兼容显示协议方法)
- 架构调整:无
- 多模块重构:无

结论:可直接实现,风险低。
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internal planning document should not be committed

This file (doc/issue-2598-feasibility.md) is a personal feasibility-analysis scratch note written in Chinese. It is not project documentation and should not be committed to the repository. The project's documentation lives in docs/, not a new doc/ directory.

Please remove this file from the PR before merging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix

18 changes: 18 additions & 0 deletions tests/dataframe/test_repr.py
Original file line number Diff line number Diff line change
Expand Up @@ -275,6 +275,24 @@ def test_repr_with_html_string():
)


def test_repr_mimebundle_contains_plain_and_html(make_df):
df = make_df({"A": [1, 2, 3], "B": ["x", "y", "z"]})

bundle = df._repr_mimebundle_()

assert set(bundle.keys()) == {"text/plain", "text/html"}
assert bundle["text/plain"] == df.__repr__()
assert bundle["text/html"] == df._repr_html_()


def test_repr_mimebundle_include_exclude(make_df):
df = make_df({"A": [1]})

assert set(df._repr_mimebundle_(include={"text/plain"}).keys()) == {"text/plain"}
assert set(df._repr_mimebundle_(exclude={"text/html"}).keys()) == {"text/plain"}
assert df._repr_mimebundle_(include={"text/plain"}, exclude={"text/plain"}) == {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer a single parametrized test over two separate functions

test_repr_mimebundle_contains_plain_and_html and test_repr_mimebundle_include_exclude both exercise _repr_mimebundle_. Following the project convention of consolidating related assertions, consider merging them — or at a minimum unifying the include/exclude cases into a single parametrized test:

@pytest.mark.parametrize(
    "kwargs,expected_keys",
    [
        ({}, {"text/plain", "text/html"}),
        ({"include": {"text/plain"}}, {"text/plain"}),
        ({"exclude": {"text/html"}}, {"text/plain"}),
        ({"include": {"text/plain"}, "exclude": {"text/plain"}}, set()),
    ],
)
def test_repr_mimebundle(make_df, kwargs, expected_keys):
    df = make_df({"A": [1, 2, 3], "B": ["x", "y", "z"]})
    bundle = df._repr_mimebundle_(**kwargs)
    assert set(bundle.keys()) == expected_keys

Rule Used: Prefer single parametrized functions over multiple... (source)

Learnt From
Eventual-Inc/Daft#5207

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix



class MyObj:
def __repr__(self) -> str:
return "myobj-custom-repr"
Expand Down
Loading