from_protobuf (part 0): Add integration tests marked as XFAIL #14419
Open
thirtiseven wants to merge 18 commits into NVIDIA:main
Greptile Summary
This PR adds a comprehensive integration test suite for the
Key additions:
Findings:
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant User as Test Runner
participant Script as run_pyspark_from_build.sh
participant pytest
participant Fixture as from_protobuf_fn fixture
participant Helper as is_protobuf_runtime_available()
participant Test as protobuf_test.py tests
participant CPU as CPU Spark
participant GPU as GPU Spark (XFAIL)
User->>Script: PROTOBUF_JARS=... ./run_pyspark_from_build.sh -m protobuf_test
Script->>Script: Validate jars, check spark-protobuf_* present
Script->>Script: Set PROTOBUF_JARS_AVAILABLE=true
Script->>Script: Add jars to ALL_JARS (--jars / --driver-class-path)
Script->>pytest: Launch pytest with PROTOBUF_JARS_AVAILABLE env var
pytest->>Test: Collect module (pytestmark skipif checks env var)
pytest->>Fixture: Invoke from_protobuf_fn (scope=module)
Fixture->>Helper: is_protobuf_runtime_available()
Helper->>Helper: import pyspark.sql.protobuf.functions
Helper->>Helper: Class.forName("...functions$")
Helper-->>Fixture: True
Fixture->>Fixture: import from_protobuf, return fn
loop Each XFAIL test
pytest->>Test: Run test
Test->>Test: _setup_protobuf_desc → build descriptor bytes via JVM
Test->>Test: write descriptor to HDFS temp path
Test->>Test: encode test rows via ProtobufRowGen / encode_pb_message
Test->>CPU: assert_gpu_and_cpu_are_equal_collect (CPU path)
CPU-->>Test: CPU result
Test->>GPU: assert_gpu_and_cpu_are_equal_collect (GPU path)
GPU-->>Test: XFAIL (GPU plugin not merged yet)
end
Last reviewed commit: "style"
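The two gating steps in the diagram (the `PROTOBUF_JARS_AVAILABLE` check and the Python-side import probe) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function names `protobuf_jars_flagged` and `python_protobuf_api_importable` are hypothetical, and the JVM-side `Class.forName` probe is left out because it needs a live Spark session.

```python
import importlib
import os


def protobuf_jars_flagged(env=None):
    """Return True when the launcher exported PROTOBUF_JARS_AVAILABLE=true.

    Hypothetical helper: the env var name comes from the diagram, the
    parsing rule (case-insensitive "true") is an assumption.
    """
    env = os.environ if env is None else env
    return env.get("PROTOBUF_JARS_AVAILABLE", "false").lower() == "true"


def python_protobuf_api_importable(module_name="pyspark.sql.protobuf.functions"):
    """Return True when the given module can be imported.

    Covers only the Python half of the runtime check; the jar itself
    could still be missing from the JVM classpath.
    """
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False
```

A module-level skip marker and a module-scoped fixture could then call these two checks in order, so the suite is skipped cleanly when the jars were never supplied.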
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
Part 0 of #14354
Description
This PR adds integration tests for from_protobuf. When #14354 is merged in parts, each small PR can enable some tests.
At the framework level:
- from_protobuf needs an external jar, spark-protobuf, to run, so I edited run_pyspark_from_build.sh to download the jar when it is not found.
- Added ProtobufMessageGen to generate random protocol buffer data.

Everything is generated by Cursor.
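To illustrate what generating random protocol buffer data involves, here is a minimal sketch that encodes a two-field message by hand using the standard protobuf wire format (base-128 varints and length-delimited strings). It is not the generator from this PR: the message layout (field 1 as an integer, field 2 as a string) and all function names are assumptions made for illustration.

```python
import random


def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_tag(field_number, wire_type):
    """Encode a field key: (field_number << 3) | wire_type, as a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_message(int_value, text):
    """Encode a hypothetical message: field 1 = varint, field 2 = string."""
    payload = text.encode("utf-8")
    return (
        encode_tag(1, 0) + encode_varint(int_value)            # wire type 0: varint
        + encode_tag(2, 2) + encode_varint(len(payload)) + payload  # wire type 2: length-delimited
    )


def random_messages(count, seed=0):
    """Produce `count` encoded messages; a fixed seed keeps runs reproducible."""
    rng = random.Random(seed)
    rows = []
    for _ in range(count):
        number = rng.randrange(0, 2**31)
        text = "".join(rng.choice("abcxyz") for _ in range(rng.randrange(0, 8)))
        rows.append(encode_message(number, text))
    return rows
```

Seeding the generator matters for this kind of test: the CPU and GPU runs must decode the same binary rows for their results to be comparable.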
Checklists
(Please explain in the PR description how the new code paths are tested, such as names of the new/existing tests that cover them.)