Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,20 @@ def f(num1, num2):
var q = "source.select(['num1=f(num1, num2)','num1=(double)num1'])";
runner.test("UDF- 2 Doubles to Double No Hints", q, "num1", "num2");
}

@Test
void udf2DoublesToDoubleNoHintsSerial() {
var setup = """
def f(num1, num2):
return num1 + num2
from deephaven.table import Selectable
col1 = Selectable.parse('num1=f(num1, num2)').with_serial()
col2 = Selectable.parse('num1=(double)num1').with_serial()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the second cast for? Seems like it is going to do extra work beyond the UDF; making us measure not quite what we intend. I see we have it in the other test, so changing it would be inconsistent, but I would like to know the motivation so we can decide if it is really necessary going forward.

If we do need the cast, because there are no hints; why do we prefer it this way instead of just"(double)f(num1, num2)" as a single statement.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the javadoc heading I have the following...

Note: The "No Hints" tests have casts to make them equivalent to the hints tests, otherwise 
the return value would always be a PyObject and not really the same test. They use two 
formulas to achieve this, otherwise vectorization would not happen on "No Hints" benchmarks.

... Jianfeng pointed the vectorization issue out in a review.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that wse are testing exactly what we want though; because with a select like that we have to write down a PyObject. Then we have to read the PyObject from the first column and cast it.

    col1 = Selectable.parse('num1=(double)f(num1, num2)').with_serial()

Would take the PyObject inside of the formula, then turn it into a double, rather than writing it down and holding onto it for ever.

Fixing/changing the benchmark does have some negative effects though; because we basically lose the history.

""";
runner.addSetupQuery(setup);
var q = "source.select([col1, col2])";
runner.test("UDF- 2 Doubles to Double No Hints Serial", q, "num1", "num2");
}

@Test
void udf2DoublesToDoubleNoHintsNoVectorize() {
Expand All @@ -80,6 +94,20 @@ def f(num1: float, num2: float) -> float:
var q = "source.select(['num1=f(num1, num2)'])";
runner.test("UDF- 2 Doubles to Double Python Hints", q, "num1", "num2");
}

@Test
void udf2DoublesToDoublePythonHintsSerial() {
runner.setScaleFactors(3, 0);
var setup = """
def f(num1: float, num2: float) -> float:
return num1 + num2
from deephaven.table import Selectable
col = Selectable.parse('num1=f(num1, num2)').with_serial()
""";
runner.addSetupQuery(setup);
var q = "source.select([col])";
runner.test("UDF- 2 Doubles to Double Python Hints Serial", q, "num1", "num2");
}

@Test
void udf2DoublesToDoublePythonHintsNoVectorize() {
Expand Down Expand Up @@ -126,6 +154,19 @@ def f(num1: np.float64, num2: np.float64) -> np.float64:
var q = "source.select(['num1=f(num1, num2)'])";
runner.test("UDF- 2 Doubles to Double Numpy Hints", q, "num1", "num2");
}

@Test
void udf2DoublesToDoubleNumpyHintsSerial() {
var setup = """
def f(num1: np.float64, num2: np.float64) -> np.float64:
return num1 + num2
from deephaven.table import Selectable
col = Selectable.parse('num1=f(num1, num2)').with_serial()
""";
runner.addSetupQuery(setup);
var q = "source.select([col])";
runner.test("UDF- 2 Doubles to Double Numpy Hints Serial", q, "num1", "num2");
}

@Test
void udf2DoublesToDoubleNumpyHintsNoVectorize() {
Expand Down