Skip to content

Provide filter overload to replace colGroups(predicate<ColumnGroup>) #1359

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions core/api/core.api
Original file line number Diff line number Diff line change
Expand Up @@ -2128,6 +2128,7 @@ public abstract interface class org/jetbrains/kotlinx/dataframe/api/ExprColumnsS

public abstract interface class org/jetbrains/kotlinx/dataframe/api/FilterColumnsSelectionDsl {
public fun filter (Lorg/jetbrains/kotlinx/dataframe/columns/ColumnSet;Lkotlin/jvm/functions/Function1;)Lorg/jetbrains/kotlinx/dataframe/columns/ColumnSet;
public fun filterColumnGroups (Lorg/jetbrains/kotlinx/dataframe/columns/ColumnSet;Lkotlin/jvm/functions/Function1;)Lorg/jetbrains/kotlinx/dataframe/columns/ColumnSet;
}

public abstract interface class org/jetbrains/kotlinx/dataframe/api/FilterColumnsSelectionDsl$Grammar {
Expand Down Expand Up @@ -5058,6 +5059,16 @@ public abstract interface class org/jetbrains/kotlinx/dataframe/columns/ColumnGr
public abstract fun rename (Ljava/lang/String;)Lorg/jetbrains/kotlinx/dataframe/columns/ColumnGroup;
}

public final class org/jetbrains/kotlinx/dataframe/columns/ColumnGroupWithPath {
public fun <init> (Lorg/jetbrains/kotlinx/dataframe/columns/ColumnGroup;Lorg/jetbrains/kotlinx/dataframe/columns/ColumnPath;)V
public final fun depth ()I
public final fun getData ()Lorg/jetbrains/kotlinx/dataframe/columns/ColumnGroup;
public final fun getDepth ()I
public final fun getName ()Ljava/lang/String;
public final fun getParentName ()Ljava/lang/String;
public final fun getPath ()Lorg/jetbrains/kotlinx/dataframe/columns/ColumnPath;
}

public class org/jetbrains/kotlinx/dataframe/columns/ColumnKind : java/lang/Enum {
public static final field Frame Lorg/jetbrains/kotlinx/dataframe/columns/ColumnKind;
public static final field Group Lorg/jetbrains/kotlinx/dataframe/columns/ColumnKind;
Expand Down
21 changes: 21 additions & 0 deletions core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/filter.kt
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
package org.jetbrains.kotlinx.dataframe.api

import org.jetbrains.kotlinx.dataframe.AnyRow
import org.jetbrains.kotlinx.dataframe.ColumnFilter
import org.jetbrains.kotlinx.dataframe.ColumnSelector
import org.jetbrains.kotlinx.dataframe.ColumnsSelector
Expand All @@ -8,6 +9,7 @@ import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.Predicate
import org.jetbrains.kotlinx.dataframe.RowFilter
import org.jetbrains.kotlinx.dataframe.annotations.AccessApiOverload
import org.jetbrains.kotlinx.dataframe.columns.ColumnGroupWithPath
import org.jetbrains.kotlinx.dataframe.columns.ColumnPath
import org.jetbrains.kotlinx.dataframe.columns.ColumnReference
import org.jetbrains.kotlinx.dataframe.columns.ColumnSet
Expand Down Expand Up @@ -159,6 +161,25 @@ public interface FilterColumnsSelectionDsl {
@Suppress("UNCHECKED_CAST")
public fun <C> ColumnSet<C>.filter(predicate: ColumnFilter<C>): ColumnSet<C> =
colsInternal(predicate as ColumnFilter<*>).cast()

/**
* ## Filter [ColumnSet]
*
* #### For example:
* ```kotlin
* df.convert {
* colsAtAnyDepth().colGroups()
* .filter { it.data.containsColumn("myCol")
* }.with { ... }
* ```
*/
@Suppress("INAPPLICABLE_JVM_NAME")
@JvmName("filterColumnGroups")
public fun ColumnSet<AnyRow>.filter(predicate: Predicate<ColumnGroupWithPath<*>>): ColumnSet<*> =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this intermediate type is not needed, right?
colGroups {} takes a Predicate<ColumnGroup<*>>; we can duplicate that here:

fun ColumnSet<AnyRow>.filter(predicate: Predicate<ColumnGroup<*>>): ColumnSet<*> =
    colsInternal { columnWithPath ->
        columnWithPath.isColumnGroup() && predicate(columnWithPath)
    }

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.isColumnGroup() does an instance check and auto-cast so is safe :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but it won't hurt to have path property, same as ColumnSet<*>.filter does

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now I understand what you're trying to achieve :)
Still, we'll have to take a slightly further look, because I see we have the ColumnGroupWithPathImpl class. Adding a completely separate ColumnGroupWithPath would be confusing at best

Copy link
Collaborator Author

@koperagen koperagen Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As alternative we could rename and expose ColumnGroupWithPathImpl, but

  1. ColumnWithPath is used extensively across whole column selection and ColumnGroupWithPathImpl is important implementation detail
  2. ColumnGroupWithPath is needed only for one filter overload, and is very simple itself - it's only a class with 2 properties.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we don't introduce a new class and just create:

@Suppress("UNCHECKED_CAST")
@get:JvmName("columnAsColumnGroup")
public val <T> ColumnWithPath<DataRow<T>>.column: ColumnGroup<T>
    get() = data as ColumnGroup<T>

@get:JvmName("columnAsFrameColumn")
public val <T> ColumnWithPath<DataFrame<T>>.column: FrameColumn<T>
    get() = data as FrameColumn<T>

public val <T> ColumnWithPath<T>.column: DataColumn<T>
    get() = data

I think the name "column" is easier to discover in the ColumnWithPath class than "data".

We could even try to find a way to hide data in ColumnWithPath, but that probably even won't be necessary.

(btw I tried calling them "data" with @kotlin.internal.HidesMembers, but unfortunately that doesn't seem to work :( )

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, i'll try

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In CS DSL scope for DataColumn<DataRow<*>> is very polluted though =( There are functions named column. IMO totally impossible to find anything just from looking at completion.
image
image

But if person knows what they're looking for, they can use asColumnGroup even without column.

For comparison completion for ColumnGroupWithPath looks clearer:
image

I tried to limit visibility of functions from outer scope with DslMarker, but i'm afraid everything is still visible despite Dsl Scope violation errors.

So i'd go with ColumnGroupWithPath for better discoverability

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. Just try to be as transparant as possible in KDoc if you decide to go this route. I want to prevent confusion between ColumnGroupWithPathImp and ColumnGroupWithPath, especially if they are separate things.

Also, how's the situation for FrameColumn?

colsInternal { columnWithPath ->
columnWithPath.isColumnGroup() &&
predicate(ColumnGroupWithPath(columnWithPath, columnWithPath.path))
}
}

// endregion
Original file line number Diff line number Diff line change
Expand Up @@ -63,3 +63,13 @@ public interface ColumnWithPath<out T> : DataColumn<T> {
public val <T> ColumnWithPath<T>.depth: Int get() = path.depth()

public fun ColumnWithPath(column: DataColumn<*>, path: ColumnPath): ColumnWithPath<*> = column.addPath(path)

public class ColumnGroupWithPath<T>(public val data: ColumnGroup<T>, public val path: ColumnPath) {
public val name: String get() = data.name()

public val parentName: String? get() = path.parentName

public fun depth(): Int = path.depth()

public val depth: Int get() = path.depth()
}
Original file line number Diff line number Diff line change
Expand Up @@ -19,4 +19,12 @@ class FilterTests : ColumnsSelectionDslTests() {
df.select { all().filter { true } } shouldBe df.select { all() }
df.select { all().filter { false } } shouldBe df.select { none() }
}

@Test
fun `filter column group`() {
listOf(
df.select { name },
df.select { colsAtAnyDepth().colGroups().filter { it.data.containsColumn("firstName") } },
)
}
}