Description
Due to a power surge/outage during the SH-SY5Y run, and because the computer was not properly plugged into the UPS (uninterruptible power supply), the SQLite file is incomplete.
Since CellProfiler cannot resume a run from where it left off, we can avoid spending more computational power and time rerunning the same images by splitting the LoadData CSV for this cell type into two parts:
a. Part A: the images that were already run
b. Part B: the images that still need to be processed
We know the image set that CellProfiler stopped at from the log file of the first run. Since the LoadData CSV has the same number of rows as there are image sets, we can split the data frame by row index as seen below:
import pathlib

import pandas as pd


def split_loaddata_csv_by_row(
    path_to_loadata: pathlib.Path,
    output_dir: pathlib.Path,
    row_index_val: int,
    first_csv_name: str,
    second_csv_name: str,
):
    """
    This function will split a LoadData CSV into two groups based on row index and save them as two different CSVs.
    This can be used when you have different cell types on the same plate, or when a run was interrupted partway through.

    Parameters
    ----------
    path_to_loadata : pathlib.Path
        path to the LoadData CSV with IC functions to be edited
    output_dir : pathlib.Path
        path to directory where new LoadData CSVs will be saved to
    row_index_val : int
        row index to split at (rows before this index go to the first CSV, rows from this index onward go to the second)
    first_csv_name : str
        name of the LoadData CSV for the first group of the plate (name should include loaddata and state
        that there are IC functions)
        Example: loaddata_PBMC_with_ic
    second_csv_name : str
        name of the LoadData CSV for the second group of the plate (see example above)
    """
    # load in LoadData CSV as pandas dataframe
    loaddata_df = pd.read_csv(path_to_loadata)
    # split dataframe by row index (first group excludes the row at row_index_val)
    df_1 = loaddata_df.iloc[:row_index_val, :]
    df_2 = loaddata_df.iloc[row_index_val:, :]
    # save new LoadData CSVs based on given name
    df_1.to_csv(output_dir / f"{first_csv_name}.csv", index=False)
    df_2.to_csv(output_dir / f"{second_csv_name}.csv", index=False)
    print(f"{path_to_loadata.name} has been split into {first_csv_name}.csv and {second_csv_name}.csv!")

This allows CellProfiler to start back where it left off.
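
To make the choice of split index concrete, here is a minimal sketch of the same `iloc` split on a small in-memory table. The column names, file names, and the `n_completed` value are hypothetical and only for illustration; the real LoadData CSV and the actual stopping point from the log file will differ. It assumes that image sets are numbered from 1 in the log and that `n_completed` image sets fully finished, so the first `n_completed` rows belong to Part A.

```python
import pandas as pd

# Hypothetical LoadData table: one row per image set (column names are made up)
loaddata_df = pd.DataFrame(
    {
        "Metadata_Well": ["A01", "A01", "A02", "A02", "A03"],
        "FileName_DAPI": [f"img_{i}.tif" for i in range(1, 6)],
    }
)

# Assumption: the log shows that 3 image sets finished before the outage
n_completed = 3

# iloc[:n] keeps rows 0..n-1 (the first n image sets); iloc[n:] keeps the rest
part_a = loaddata_df.iloc[:n_completed, :]
part_b = loaddata_df.iloc[n_completed:, :]

# Sanity checks: no row is lost or duplicated by the split
assert len(part_a) + len(part_b) == len(loaddata_df)
assert pd.concat([part_a, part_b]).equals(loaddata_df)

print(f"Part A: {len(part_a)} image sets, Part B: {len(part_b)} image sets")
```

If it is unclear whether the image set in progress during the outage finished writing to the SQLite file, it may be safer to use a smaller `n_completed` so that image set is reprocessed in Part B.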