Description
I'm using CellProfiler to create Nuclei, Cells and Cytoplasm profiles, but I created each of these three CSVs using different CellProfiler methods. For example, I have CSVs named Nuclei_Cellpose3, Cells_Cellpose3 and Cytoplasm_Cellpose3. I was able to run the cytotable convert() method on Nuclei.csv, Cells.csv and Cytoplasm.csv, and it produced the expected parquet output. I used this code snippet:
convert(
    source_path=source_path,
    source_datatype="csv",
    dest_path="cytotable",
    dest_datatype="parquet",
    concat=True,
    preset="cellprofiler_csv",
    no_sign_request=True,
)
To do the same for Nuclei_Cellpose3.csv, Cells_Cellpose3.csv and Cytoplasm_Cellpose3.csv, I wrote the following snippet:
convert(
    source_path=source_path,
    source_datatype="csv",
    dest_path="cytotable_cellpose",
    dest_datatype="parquet",
    compartments=["Nuclei_Cellpose3", "Cells_Cellpose3", "Cytoplasm_Cellpose3"],
    join=True,
    joins="ImageNumber,ObjectNumber",
    page_keys={
        'join': 'ImageNumber',
        'Cells_Cellpose3': 'ObjectNumber',
        'Nuclei_Cellpose3': 'ObjectNumber',
        'Cytoplasm_Cellpose3': 'ObjectNumber'
    },
    preset=None
)
but I get this error:
CytoTableException: No matching key found in page_keys for source_group_name: all_files.csv. Please include a pagination key based on a column name from the table.
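Since the error asks for a pagination key "based on a column name from the table", one quick sanity check is to list the header row of every CSV under the source path and see which candidate key columns each file actually contains. The sketch below uses only the Python standard library and does not call CytoTable at all. It assumes source_path is a local directory of CSV files (it would not work as-is for an S3 path), and the helper names csv_headers and missing_columns are hypothetical, made up for illustration.

```python
import csv
from pathlib import Path


def csv_headers(source_dir):
    """Map each CSV file name in source_dir to its list of column names."""
    headers = {}
    for path in sorted(Path(source_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            # Read only the first row; an empty file yields an empty list.
            headers[path.name] = next(csv.reader(handle), [])
    return headers


def missing_columns(headers, wanted):
    """Return {file name: [wanted columns that are absent from that file]}."""
    return {
        name: [col for col in wanted if col not in cols]
        for name, cols in headers.items()
    }


# Example usage (source_path is assumed to be a local directory):
# report = missing_columns(csv_headers(source_path), ["ImageNumber", "ObjectNumber"])
# for name, missing in report.items():
#     print(name, "is missing", missing or "nothing")
```

If a file such as all_files.csv shows up in the listing, that would at least explain where the source_group_name in the error message comes from, and the report shows whether a column like ObjectNumber exists in it.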
Then I tried adding a key for all_files.csv by modifying the code above, and it threw an SQL error, so I changed the join value as follows:
convert(
    source_path=source_path,
    source_datatype="csv",
    dest_path="cytotable_cellpose",
    dest_datatype="parquet",
    compartments=["Nuclei_Cellpose3", "Cells_Cellpose3", "Cytoplasm_Cellpose3"],
    join=True,
    joins="ImageNumber,ObjectNumber",
    page_keys={
        'image': 'ImageNumber',
        'Cells_Cellpose3': 'ObjectNumber',
        'Nuclei_Cellpose3': 'ObjectNumber',
        'Cytoplasm_Cellpose3': 'ObjectNumber',
        'join': 'Cytoplasm_Number_Object_Number',
        'all_files.csv': 'ImageNumber'
    },
    chunk_size=10000,
    sort_output=True,
    preset=None
)
This time the code runs without error, but it never finishes and does not create any file (I let it run for 1 hour, whereas the default run took 3 minutes).