Commit be2312a

Additional improvements to characterize_data script
Changed how the list of files comprising a DICOM series is returned. Previously the files were returned in random order; they are now returned in the order that lets the ImageSeriesReader read the image directly from the dataframe contents, without additional computations.
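
The reordering described above can be sketched without SimpleITK. In the minimal example below, `order_files_via_staging` and `sort_by_first_line` are hypothetical names. The toy sorter stands in for `sitk.ImageSeriesReader_GetGDCMSeriesFileNames`, which the commit uses to obtain the sorted names, and it orders files by an integer stored on their first line rather than by DICOM geometry. The sketch passes an absolute source path to the link call, because a symlink with a relative target is resolved relative to the directory that contains the link.

```python
import os
import platform
import shutil
import tempfile


def order_files_via_staging(file_names, sort_staged):
    """Return file_names reordered according to sort_staged.

    sort_staged is a hypothetical stand-in for
    sitk.ImageSeriesReader_GetGDCMSeriesFileNames: it receives the staging
    directory and returns the staged paths in reading order.
    """
    # Copy on Windows, symlink elsewhere, mirroring the commit.
    stage = shutil.copy if platform.system() == "Windows" else os.symlink
    with tempfile.TemporaryDirectory() as tmpdirname:
        staged_to_orig = {}
        for i, fname in enumerate(file_names):
            # Rename to the index so identical base names cannot collide.
            staged = os.path.join(tmpdirname, str(i))
            staged_to_orig[staged] = fname
            stage(os.path.abspath(fname), staged)
        # Map the sorted staged names back to the original file names.
        return [staged_to_orig[s] for s in sort_staged(tmpdirname)]


def sort_by_first_line(directory):
    """Toy sorter: order staged files by the integer on their first line."""
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]

    def position(path):
        with open(path) as f:
            return int(f.readline())

    return sorted(paths, key=position)
```

Calling `order_files_via_staging(paths, sort_by_first_line)` on files such as `1/Z0.dcm` and `2/Z0.dcm` returns the original paths in the sorter's order, which is the property the commit relies on when it writes the list back to `series_info["files"]`.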
1 parent 1c32d25 commit be2312a

1 file changed, +25 -17 lines changed

Python/scripts/characterize_data.py

Lines changed: 25 additions & 17 deletions
@@ -373,34 +373,42 @@ def inspect_single_series(series_data, meta_data_info={}, thumbnail_settings={})
         reader = sitk.ImageSeriesReader()
         reader.MetaDataDictionaryArrayUpdateOn()
         reader.LoadPrivateTagsOn()
-        # CHANGE to split list of tag values and sid is first entry
+        # split list of tag values, sid is first entry (see inspect_series)
         sid = series_data[0].split(":")[0]
         file_names = series_info["files"]
-        # As the files comprising a series with multiple files can reside in
-        # separate directories and SimpleITK expects them to be in a single directory
+        # As the files comprising a series can reside in separate directories
+        # and may have identical file names (e.g. 1/Z0.dcm, 2/Z0.dcm)
         # we use a tempdir and symbolic links to enable SimpleITK to read the series as
-        # a single image. Additionally, the files are renamed as they may have resided in
-        # separate directories with the same file name. Finally, on Windows
-        # we need to copy the files to the tempdir as the os.symlink documentation says that
+        # a single image (ImageSeriesReader_GetGDCMSeriesFileNames expects all files to
+        # be in a single directory).
+        # On Windows we need to copy the files to the tempdir as the os.symlink documentation says that
         # "On newer versions of Windows 10, unprivileged accounts can create symlinks
         # if Developer Mode is enabled. When Developer Mode is not available/enabled,
         # the SeCreateSymbolicLinkPrivilege privilege is required, or the process must be
         # run as an administrator."
         # To turn Developer Mode on in Windows 11:
         # Settings->System->For Developers and turn Developer Mode on.
-        # We could then comment out the Windows specific code below.
+        # We could then use the os.symlink function instead of the indirect usage of a
+        # copy_link_function below.
         with tempfile.TemporaryDirectory() as tmpdirname:
-            if platform.system() == "Windows":
-                for i, fname in enumerate(file_names):
-                    shutil.copy(
-                        os.path.abspath(fname), os.path.join(tmpdirname, str(i))
-                    )
-            else:
-                for i, fname in enumerate(file_names):
-                    os.symlink(os.path.abspath(fname), os.path.join(tmpdirname, str(i)))
-            reader.SetFileNames(
-                sitk.ImageSeriesReader_GetGDCMSeriesFileNames(tmpdirname, sid)
+            copy_link_function = (
+                shutil.copy if platform.system() == "Windows" else os.symlink
             )
+            new_orig_file_name_dict = {}
+            for i, fname in enumerate(file_names):
+                new_fname = os.path.join(tmpdirname, str(i))
+                new_orig_file_name_dict[new_fname] = fname
+                copy_link_function(fname, new_fname)
+            sorted_new_file_names = sitk.ImageSeriesReader_GetGDCMSeriesFileNames(
+                tmpdirname, sid
+            )
+            # store the file names in a sorted order so that they are saved in this
+            # manner. This is useful for reading from the saved csv file
+            # using the SeriesImageReader or ImageRead which expect ordered file names
+            series_info["files"] = [
+                new_orig_file_name_dict[new_fname] for new_fname in sorted_new_file_names
+            ]
+            reader.SetFileNames(sorted_new_file_names)
             img = reader.Execute()
             for k in meta_data_info.values():
                 if reader.HasMetaDataKey(0, k):
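
The platform check in the diff copies files on Windows because creating a symlink there may require Developer Mode or elevated privileges. One alternative, which is not what the commit does, is to attempt the symlink first and fall back to a copy when the operating system refuses. A minimal sketch, assuming a hypothetical helper `link_or_copy` and a destination path that does not yet exist:

```python
import os
import shutil


def link_or_copy(src, dst, link=os.symlink):
    """Try to symlink src at dst; copy instead if linking is refused.

    Hypothetical helper, not part of characterize_data.py.
    Returns "link" or "copy" to report which path was taken.
    Assumes dst does not already exist.
    """
    try:
        # Absolute target so the link resolves from inside dst's directory.
        link(os.path.abspath(src), dst)
        return "link"
    except OSError:
        # Raised, for example, when the account lacks the symlink privilege.
        shutil.copy(src, dst)
        return "copy"
```

With this approach the same code path would serve Windows machines with and without Developer Mode, at the cost of hiding which mechanism was used unless the return value is checked.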
