testpackage for comparing hashes of outputs and saving them as reference data + vlsvrs testing #450

Open
lassejsc wants to merge 39 commits into fmihpc:dev from lassejsc:vlsvrsCI
@lassejsc lassejsc commented Mar 18, 2026

wip

Adds testpackage_hashes for generating and comparing hashes of function outputs. It is currently used for vlsvreader's read_variable and read_interpolated_variable.

This will also be used for testing vlsvrs from the backend later, once that is added to dev proper. The current GitLab runners cannot see the reference data on turso, so for now vlsvrs is tested more robustly on the analysator side. (The backend repo does have one vlsv file for a basic read test with vlsvrs.)

testpackage_hashes.py contains a class Tester that can dump, load, and check hashes. Basic usage:

ciTester = Tester()
ciTester.changeFile(filename)  # sets the target file
ciTester.loadobj()  # loads the target file with vlsvrs and vlsvreader; one or the other can be selected with backend="rust" or backend="python"
ciTester.setHashTarget("python")  # sets which reader is used for generating hashes
ciTester.hash("read_variable", ["CellID"])  # stores a hash of the output of read_variable("CellID") in hashes_dict_python
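As a rough picture of the dump, load, and check cycle mentioned above, here is a minimal sketch. It is not the Tester implementation: the helper names (dump_hashes, load_hashes, compare_hashes), the JSON file format, the file name, and the key layout are all assumptions made for illustration.

```python
import hashlib
import json
import os
import tempfile

def dump_hashes(hashes, path):
    """Hypothetical helper: write a {name: hexdigest} dict as JSON."""
    with open(path, "w") as fh:
        json.dump(hashes, fh, indent=2, sort_keys=True)

def load_hashes(path):
    """Hypothetical helper: read the dict back."""
    with open(path) as fh:
        return json.load(fh)

def compare_hashes(current, reference):
    """Return the keys whose hash is missing from or differs from the reference."""
    return sorted(k for k, v in current.items() if reference.get(k) != v)

# Assumed key layout; the real dictionary keys may look different.
reference = {"read_variable(CellID)": hashlib.sha256(b"1234").hexdigest()}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hashes_dict_python.json")  # assumed file name
    dump_hashes(reference, path)
    loaded = load_hashes(path)

unchanged = compare_hashes(reference, loaded)
changed = compare_hashes(
    {"read_variable(CellID)": hashlib.sha256(b"9999").hexdigest()}, loaded
)
```

Comparing hex digests keeps the reference data small, at the cost of only knowing that an output changed, not how.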

ciTester.hash accepts either a function object or the name of a function as a string. In the latter case the function has to be a method of the vlsvobj given to it (a VlsvReader instance, or a VlsvFile instance for vlsvrs).
Operations can also be applied to the return value before hashing. By default whatever is returned is "flattened", so the shape of the return value is not taken into account. (vlsvrs and vlsvreader differ in their calls, and in my opinion a changed array shape is not the biggest issue as long as the returned values are correct.)

"Flattening" determines whether the shape of the array is added to the byte data when hashing it: when the output is flattened, the shape is left out.

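To make the effect of flattening concrete, here is a minimal sketch, assuming the hash is built from the array's raw bytes. The helper hash_array, its include_shape flag, and the choice of SHA-256 are assumptions for illustration, not the code in testpackage_hashes.py.

```python
import hashlib
import numpy as np

def hash_array(arr, include_shape=False):
    """Hypothetical helper: hash an array's raw bytes, optionally with its shape."""
    arr = np.ascontiguousarray(arr)
    h = hashlib.sha256()
    if include_shape:
        # Shape-aware mode: equal values in a different shape give a different hash.
        h.update(repr(arr.shape).encode())
    h.update(arr.tobytes())
    return h.hexdigest()

a = np.arange(6, dtype=np.int64)
b = a.reshape(2, 3)  # same values, different shape

same_when_flattened = hash_array(a) == hash_array(b)
differs_with_shape = (
    hash_array(a, include_shape=True) != hash_array(b, include_shape=True)
)
```

With the shape left out, two readers that return the same values in different layouts still produce the same hash, which matches the default behaviour described above.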
A trivial example of applying operations to the return value to shape what gets hashed:

ciTester.hash("read_variable", {"variable": "CellID"}, op=["reshape", "astype", "numpy.sort"], opargs=[[-1], [int], []])

This will hash the return value of:

numpy.sort(read_variable(variable="CellID").reshape(-1).astype(int))
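One way such an operation chain could be evaluated is sketched below. This is a guess at the mechanism, not the actual implementation: apply_ops is a hypothetical helper, and it assumes that a bare name is a method of the current value, that a dotted name is a module-level function, and that each opargs entry is the positional-argument list for the matching op.

```python
import importlib
import numpy as np

def apply_ops(value, ops, opargs):
    """Hypothetical helper: apply each op to the running value, in order."""
    for name, args in zip(ops, opargs):
        if "." in name:
            # Dotted name: treat it as a module-level function, e.g. "numpy.sort".
            module_name, func_name = name.rsplit(".", 1)
            func = getattr(importlib.import_module(module_name), func_name)
            value = func(value, *args)
        else:
            # Bare name: treat it as a method of the current value.
            value = getattr(value, name)(*args)
    return value

data = np.array([[3.0, 1.0], [2.0, 0.0]])  # stand-in for a read_variable result
result = apply_ops(data, ["reshape", "astype", "numpy.sort"], [[-1], [int], []])
expected = np.sort(data.reshape(-1).astype(int))
```

Here result and expected hold the same values, so hashing either one would give the same digest.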

Also note that the arguments given to the function call can be either a list (positional parameters) or a dict (keyword arguments).
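The two conventions (function object or name, list or dict arguments) can be sketched as follows. call_target and FakeReader are hypothetical stand-ins for illustration; the real Tester and VlsvReader may resolve and call things differently.

```python
def call_target(obj, func, args):
    """Hypothetical helper: resolve func, then call it with list or dict arguments."""
    # A string is looked up as a method on the reader object.
    target = getattr(obj, func) if isinstance(func, str) else func
    if isinstance(args, dict):
        return target(**args)  # dict: keyword arguments
    return target(*args)       # list: positional arguments

class FakeReader:
    """Stand-in for a reader object; not the real VlsvReader."""
    def read_variable(self, variable):
        return f"data:{variable}"

reader = FakeReader()
by_list = call_target(reader, "read_variable", ["CellID"])
by_dict = call_target(reader, "read_variable", {"variable": "CellID"})
by_callable = call_target(reader, reader.read_variable, ["CellID"])
```

All three calls reach the same method, which is why the list and dict forms are interchangeable as long as the parameter names match.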

More documentation to come

Other fixes unrelated to this, made along the way:

  • set the turso import test concurrency to match that of the image compare, limiting the number of concurrent workflows so they do not clog up the cluster
  • fixed the path for generating reference data

@lassejsc lassejsc changed the title Adding the CI for comparing stuff between vlsvrs and testpackage for comparing hashes of outputs and saving them as reference data + vlsvrs testing Mar 20, 2026
lassejsc added 19 commits March 20, 2026 12:51
…ed the name of the variable those are saved as for interpolationtest2d()
…or some tests (might invert the default behavior later)
… for when loaded dictionary is empty in comparison case
@lassejsc lassejsc marked this pull request as ready for review March 30, 2026 14:36