Commit db47101

Additional improvements to characterize_data script
Changed the order in which the list of image files comprising a DICOM series is returned. Previously the files were returned in arbitrary order; they are now returned in the order that lets the ImageSeriesReader read the image directly from the dataframe contents, without additional computation.
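
As a rough illustration of the reordering described above, the sketch below stages files under index-based names in a temporary directory, asks a sorter for the staged names in order, and maps that listing back to the original paths. It uses only the standard library. `sort_staged_files` is a hypothetical stand-in for `sitk.ImageSeriesReader_GetGDCMSeriesFileNames` (it orders files by their text content, not by DICOM geometry), and `reorder_like_sorter`, `demo`, and the file names are made up for the example. Files are copied rather than symlinked so the sketch runs on any platform.

```python
import os
import shutil
import tempfile


def sort_staged_files(directory):
    """Hypothetical stand-in for the GDCM series sorter.

    Returns the full paths of the files in `directory`, ordered by the
    integer stored in each file (an assumption made for this illustration).
    """
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]

    def content_key(path):
        with open(path) as fp:
            return int(fp.read())

    return sorted(paths, key=content_key)


def reorder_like_sorter(file_names):
    """Return file_names in the order reported by the sorter."""
    with tempfile.TemporaryDirectory() as tmpdirname:
        staged_to_orig = {}
        for i, fname in enumerate(file_names):
            # Index-based names avoid clashes such as 1/Z0.dcm vs. 2/Z0.dcm.
            staged = os.path.join(tmpdirname, str(i))
            staged_to_orig[os.path.normpath(staged)] = fname
            shutil.copy(fname, staged)
        # normpath guards against separator differences in the returned paths.
        return [
            staged_to_orig[os.path.normpath(p)]
            for p in sort_staged_files(tmpdirname)
        ]


def demo():
    """Build three same-named files in separate directories and reorder them."""
    with tempfile.TemporaryDirectory() as src:
        unordered = []
        for sub, position in (("a", 2), ("b", 0), ("c", 1)):
            os.mkdir(os.path.join(src, sub))
            path = os.path.join(src, sub, "Z0.txt")
            with open(path, "w") as fp:
                fp.write(str(position))
            unordered.append(path)
        ordered = reorder_like_sorter(unordered)
        return [os.path.basename(os.path.dirname(p)) for p in ordered]


print(demo())  # ['b', 'c', 'a']
```

Keeping a dictionary from staged name to original name is what allows the sorted listing, which only knows about the temporary directory, to be translated back into paths that remain valid after the directory is deleted.
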
1 parent 1c32d25 commit db47101

1 file changed, +31 -17 lines changed

Python/scripts/characterize_data.py

Lines changed: 31 additions & 17 deletions
@@ -373,34 +373,48 @@ def inspect_single_series(series_data, meta_data_info={}, thumbnail_settings={})
         reader = sitk.ImageSeriesReader()
         reader.MetaDataDictionaryArrayUpdateOn()
         reader.LoadPrivateTagsOn()
-        # CHANGE to split list of tag values and sid is first entry
+        # split list of tag values, sid is first entry (see inspect_series)
         sid = series_data[0].split(":")[0]
         file_names = series_info["files"]
-        # As the files comprising a series with multiple files can reside in
-        # separate directories and SimpleITK expects them to be in a single directory
+        # As the files comprising a series can reside in separate directories
+        # and may have identical file names (e.g. 1/Z0.dcm, 2/Z0.dcm)
         # we use a tempdir and symbolic links to enable SimpleITK to read the series as
-        # a single image. Additionally, the files are renamed as they may have resided in
-        # separate directories with the same file name. Finally, on Windows
-        # we need to copy the files to the tempdir as the os.symlink documentation says that
+        # a single image (ImageSeriesReader_GetGDCMSeriesFileNames expects all files to
+        # be in a single directory).
+        # On Windows we need to copy the files to the tempdir as the os.symlink documentation says that
         # "On newer versions of Windows 10, unprivileged accounts can create symlinks
         # if Developer Mode is enabled. When Developer Mode is not available/enabled,
         # the SeCreateSymbolicLinkPrivilege privilege is required, or the process must be
         # run as an administrator."
         # To turn Developer Mode on in Windows 11:
         # Settings->System->For Developers and turn Developer Mode on.
-        # We could then comment out the Windows specific code below.
+        # We could then use the os.symlink function instead of the indirect usage of a
+        # copy_link_function below.
         with tempfile.TemporaryDirectory() as tmpdirname:
-            if platform.system() == "Windows":
-                for i, fname in enumerate(file_names):
-                    shutil.copy(
-                        os.path.abspath(fname), os.path.join(tmpdirname, str(i))
-                    )
-            else:
-                for i, fname in enumerate(file_names):
-                    os.symlink(os.path.abspath(fname), os.path.join(tmpdirname, str(i)))
-            reader.SetFileNames(
-                sitk.ImageSeriesReader_GetGDCMSeriesFileNames(tmpdirname, sid)
+            copy_link_function = (
+                shutil.copy if platform.system() == "Windows" else os.symlink
             )
+            new_orig_file_name_dict = {}
+            for i, fname in enumerate(file_names):
+                new_fname = os.path.join(tmpdirname, str(i))
+                new_orig_file_name_dict[new_fname] = fname
+                copy_link_function(fname, new_fname)
+            # For some reason on windows the returned full paths use double backslash
+            # for all directories except the last one which has a slash. This does not
+            # match the contents of the new_orig_file_name_dict which has a backslash
+            # for the last entry too. In the code below we call os.path.normpath to
+            # address this issue.
+            sorted_new_file_names = sitk.ImageSeriesReader_GetGDCMSeriesFileNames(
+                tmpdirname, sid
+            )
+            # store the file names in a sorted order so that they are saved in this
+            # manner. This is useful for reading from the saved csv file
+            # using the SeriesImageReader or ImageRead which expect ordered file names
+            series_info["files"] = [
+                new_orig_file_name_dict[os.path.normpath(new_fname)]
+                for new_fname in sorted_new_file_names
+            ]
+            reader.SetFileNames(sorted_new_file_names)
             img = reader.Execute()
             for k in meta_data_info.values():
                 if reader.HasMetaDataKey(0, k):
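
The separator mismatch described in the code comments above can be reproduced on any platform with `ntpath`, the standard-library module that implements Windows path rules (it is what `os.path` resolves to on Windows). The sketch below is an illustration only: the directory names are made up, and the mixed-separator string is modeled on the comment's description, not captured from actual SimpleITK output.

```python
import ntpath

# Dictionary key as it would be built with os.path.join on Windows:
# backslashes throughout, including before the file name.
dict_key = ntpath.join("C:\\Temp\\tmpabc123", "0")

# Hypothetical returned path: doubled backslashes between directories and
# a forward slash before the file name, as the comment describes.
returned = "C:\\\\Temp\\\\tmpabc123/0"

# The raw strings differ, so a direct dictionary lookup would fail.
print(returned == dict_key)  # False

# normpath collapses repeated separators and converts '/' to '\\',
# so the normalized path matches the key.
print(ntpath.normpath(returned) == dict_key)  # True
```
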
