Skip to content

feat: Add string trim function#6372

Open
Lucas61000 wants to merge 3 commits intoEventual-Inc:mainfrom
Lucas61000:issue-3792
Open

feat: Add string trim function#6372
Lucas61000 wants to merge 3 commits intoEventual-Inc:mainfrom
Lucas61000:issue-3792

Conversation

@Lucas61000
Copy link
Contributor

Changes Made

Add trim() function to Daft's string functions.

Related Issues

#3792

@github-actions github-actions bot added the feat label Mar 9, 2026
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 9, 2026

Greptile Summary

This PR adds a trim() string function to Daft, which strips leading and trailing whitespace from UTF-8 strings. It follows the exact same layered pattern as the existing lstrip and rstrip functions: a Rust ScalarUDF (Trim) delegates to Rust's built-in str::trim(), exposed through a Python daft.functions.trim() top-level function, a SeriesStringNamespace.trim() method, and an Expression.trim() convenience method.

  • The implementation is correct and complete across all layers (Rust UDF, Python function, Series namespace, Expression method, __init__ export, and a basic test).
  • Two style-level concerns are flagged: (1) the trim() method in expressions.py uses an inline from daft.functions import trim import, consistent with the existing file pattern but in violation of the project's import-placement convention; (2) the PR adds a third standalone strip function (trim) alongside lstrip and rstrip, which diverges from the project preference for a single parametrized function covering all variants.

Confidence Score: 4/5

  • This PR is safe to merge; no logic bugs or security issues were found.
  • The implementation faithfully mirrors the established lstrip/rstrip pattern at every layer, uses correct Rust stdlib semantics, handles None values, and is covered by a test. The score is 4 rather than 5 due to two style-level convention violations (inline import and preference for parametrized strip functions) that are worth addressing before or after merge.
  • No files require special attention; style issues are in daft/expressions/expressions.py and daft/functions/str.py.

Important Files Changed

Filename Overview
daft/expressions/expressions.py Adds trim() method to the Expression class, following the same pattern as lstrip/rstrip. Uses an inline import which violates the project's import convention.
daft/functions/init.py Exports trim from the str module and adds it to __all__. The __all__ entry is correctly placed in alphabetical order.
daft/functions/str.py Adds a trim() function with docstring and doctest, consistent with lstrip/rstrip. The function is a third separate strip variant rather than using a parametrized approach.
daft/series.py Adds trim() to SeriesStringNamespace, consistent with existing lstrip/rstrip methods.
src/daft-functions-utf8/src/trim.rs New Rust Trim scalar UDF, closely mirroring lstrip.rs/rstrip.rs. Uses Rust's standard str::trim() which strips both leading and trailing Unicode whitespace. No clippy suppressions; no logic issues found.
src/daft-functions-utf8/src/lib.rs Registers the new trim module, re-exports it, and adds Trim to the function module. Changes are minimal and correct.
tests/recordbatch/utf8/test_trim.py Single test covering None passthrough and various whitespace characters on both sides. Matches the coverage level of test_lstrip.py and test_rstrip.py.

Sequence Diagram

sequenceDiagram
    participant User as Python User
    participant Expr as Expression.trim()
    participant Fn as daft.functions.trim()
    participant Series as SeriesStringNamespace.trim()
    participant Rust as Trim ScalarUDF (Rust)

    User->>Expr: col("x").trim()
    Expr->>Fn: trim(self)
    Fn->>Fn: _call_builtin_scalar_fn("trim", expr)
    Fn-->>User: Expression (lazy plan node)
    User->>Series: series.str.trim()
    Series->>Series: _eval_expressions("trim")
    Series->>Rust: Trim::call(inputs)
    Rust->>Rust: val.trim()
    Rust-->>Series: DaftResult[Series]
    Series-->>User: trimmed Series
Loading

Last reviewed commit: bc0b6ae

@Lucas61000
Copy link
Contributor Author

Hi @universalmind303 @srilman , could you please review this PR when you have time? I’d appreciate your feedback on how to best handle the trim and strip functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant