ImagingDataCommons
diff --git a/LICENSE b/LICENSE
Lines changed: 1 addition & 1 deletion
diff --git a/docs/development.rst b/docs/development.rst
Lines changed: 11 additions & 5 deletions
diff --git a/docs/installation.rst b/docs/installation.rst
Lines changed: 2 additions & 3 deletions
diff --git a/src/dicomweb_client/__init__.py b/src/dicomweb_client/__init__.py
Lines changed: 2 additions & 1 deletion
diff --git a/src/dicomweb_client/api.py b/src/dicomweb_client/api.py
Lines changed: 115 additions & 24 deletions
@@ -1,4 +1,4 @@
-Copyright 2018 MGH & BWH Center for Clinical Data Science
+Copyright 2020 MGH Computational Pathology
 
 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
 
 
@@ -7,7 +7,7 @@ Source code is available at Github and can be cloned via git:
 
 .. code-block:: none
 
-    git clone https://github.com/clindatsci/dicomweb-client ~/dicomweb-client
+    git clone https://github.com/mghcomputationalpathology/dicomweb-client ~/dicomweb-client
 
 The :mod:`dicomweb_client` package can be installed in *develop* mode for local development:
 
@@ -30,19 +30,25 @@ Before creating a pull request on Github, read the coding style guideline, run t
 Coding style
 ------------
 
-Code must comply with `PEP 8 <https://www.python.org/dev/peps/pep-0008/>`_. The `flake8 <http://flake8.pycqa.org/en/latest/>`_ package is used to enforce compliance.
+Code must comply with `PEP 8 <https://www.python.org/dev/peps/pep-0008/>`_.
+The `flake8 <http://flake8.pycqa.org/en/latest/>`_ package is used to enforce compliance.
 
-The project uses `numpydoc <https://github.com/numpy/numpydoc/>`_ for documenting code according to `PEP 257 <https://www.python.org/dev/peps/pep-0257/>`_ docstring conventions. Further information and examples for the NumPy style can be found at the `NumPy Github repository <https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt>`_ and the website of the `Napoleon sphinx extension <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html#example-numpy>`_.
+The project uses `numpydoc <https://github.com/numpy/numpydoc/>`_ for documenting code according to `PEP 257 <https://www.python.org/dev/peps/pep-0257/>`_ docstring conventions.
+Further information and examples for the NumPy style can be found at the `NumPy Github repository <https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt>`_ and the website of the `Napoleon sphinx extension <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html#example-numpy>`_.
 
-All API classes, functions and modules must be documented (including "private" functions and methods). Each docstring must describe input parameters and return values. Types must be specified using type hints as specified by `PEP 484 <https://www.python.org/dev/peps/pep-0484/>`_ (see `typing <https://docs.python.org/3/library/typing.html>`_ module).
+All API classes, functions and modules must be documented (including "private" functions and methods).
+Each docstring must describe input parameters and return values.
+Types must be specified using type hints as specified by `PEP 484 <https://www.python.org/dev/peps/pep-0484/>`_ (see `typing <https://docs.python.org/3/library/typing.html>`_ module) in both the function definition and the docstring.
 
 
 .. _running-tests:
 
 Running tests
 -------------
 
-The project uses `pytest <http://doc.pytest.org/en/latest/>`_ to write and runs unit tests. Tests should be placed in a separate ``tests`` folder within the package root folder. Files containing actual test code should follow the pattern ``test_*.py``.
+The project uses `pytest <http://doc.pytest.org/en/latest/>`_ to write and run unit tests.
+Tests should be placed in a separate ``tests`` folder within the package root folder.
+Files containing actual test code should follow the pattern ``test_*.py``.
 
 Install requirements:
 
 
@@ -8,7 +8,7 @@ Installation guide
 Requirements
 ------------
 
-* `Python <https://www.python.org/>`_ (version 2.7 or higher)
+* `Python <https://www.python.org/>`_ (version 3.5 or higher)
 * Python package manager `pip <https://pip.pypa.io/en/stable/>`_
 
 For support of image formats:
@@ -32,6 +32,5 @@ Source code available at Github:
 
 .. code-block:: none
 
-    git clone https://github.com/clindatsci/dicomweb-client ~/dicomweb-client
+    git clone https://github.com/mghcomputationalpathology/dicomweb-client ~/dicomweb-client
     pip install ~/dicomweb-client
-
@@ -1,3 +1,4 @@
-__version__ = '0.20.0'
+__version__ = '0.21.0rc'
+
 
 from dicomweb_client.api import DICOMwebClient
@@ -5,6 +5,7 @@
 import logging
 import email
 import six
+import xml.etree.ElementTree as ET
 from collections import OrderedDict
 from io import BytesIO
 from urllib.parse import quote_plus, urlparse
@@ -177,6 +178,41 @@ def load_json_dataset(dataset: Dict[str, dict]) -> pydicom.dataset.Dataset:
     return ds
 
 
+def _load_xml_dataset(dataset: ET.Element) -> pydicom.dataset.Dataset:
+    '''Loads DICOM Data Set in DICOM XML format.
+
+    Parameters
+    ----------
+    dataset: xml.etree.ElementTree.Element
+        root element of the parsed DICOM XML document
+
+    Returns
+    -------
+    pydicom.dataset.Dataset
+        data set
+
+    '''
+    ds = pydicom.Dataset()
+    for element in dataset:
+        keyword = element.attrib['keyword']
+        vr = element.attrib['vr']
+        if vr == 'SQ':
+            value = [
+                _load_xml_dataset(item)
+                for item in element
+            ]
+        else:
+            value = list(element)
+            if len(value) == 1:
+                value = value[0].text.strip()
+            elif len(value) > 1:
+                value = [v.text.strip() for v in value]
+            else:
+                value = None
+        setattr(ds, keyword, value)
+    return ds
+
+
 class DICOMwebClient(object):
 
     '''Class for connecting to and interacting with a DICOMweb RESTful service.
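The recursion in `_load_xml_dataset` can be illustrated without pydicom. The following is a minimal sketch that applies the same traversal to a plain `dict`; the sample payload, the `SAMPLE` constant, and the `xml_to_dict` name are assumptions made for illustration and are not part of this change.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload in the style of DICOM XML; not taken from a real server.
SAMPLE = '''
<NativeDicomModel>
  <DicomAttribute tag="00081190" vr="UR" keyword="RetrieveURL">
    <Value number="1">http://example.com/studies/1.2.3</Value>
  </DicomAttribute>
  <DicomAttribute tag="00081198" vr="SQ" keyword="FailedSOPSequence">
    <Item number="1">
      <DicomAttribute tag="00081155" vr="UI" keyword="ReferencedSOPInstanceUID">
        <Value number="1">1.2.3.4</Value>
      </DicomAttribute>
      <DicomAttribute tag="00081197" vr="US" keyword="FailureReason">
        <Value number="1">272</Value>
      </DicomAttribute>
    </Item>
  </DicomAttribute>
</NativeDicomModel>
'''


def xml_to_dict(node):
    """Convert attribute elements into a dict keyed by DICOM keyword."""
    result = {}
    for element in node:
        keyword = element.attrib['keyword']
        if element.attrib['vr'] == 'SQ':
            # Sequences hold items, each of which is a nested data set.
            value = [xml_to_dict(item) for item in element]
        else:
            texts = [v.text.strip() for v in element]
            if len(texts) == 1:
                value = texts[0]
            elif texts:
                value = texts
            else:
                value = None
        result[keyword] = value
    return result


parsed = xml_to_dict(ET.fromstring(SAMPLE.strip()))
print(parsed['FailedSOPSequence'][0]['ReferencedSOPInstanceUID'])
```

Values stay as strings in this sketch; converting them according to the value representation is left out on purpose.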
@@ -216,7 +252,8 @@ def __init__(
         headers: Optional[Dict[str, Union[str, Sequence[str]]]] = None,
         callback: Optional[Callable] = None,
         auth: Optional[requests.auth.AuthBase] = None,
-        gcp_service_account_key_file: Optional[str] = None
+        gcp_service_account_key_file: Optional[str] = None,
+        chunk_size: Optional[int] = None
     ) -> None:
         '''
         Parameters
@@ -256,6 +293,10 @@ def __init__(
             JSON format to be used for authentication with Google Cloud
             Healthcare services
             (see `Google Cloud Healthcare API authentication <https://cloud.google.com/healthcare/docs/how-tos/authentication>`)
+        chunk_size: int, optional
+            maximum number of bytes per data chunk using chunked transfer
+            encoding (helpful for storing and retrieving large objects or large
+            collections of objects such as studies or series)
 
         '''  # noqa
         logger.debug('initialize HTTP session')
@@ -341,6 +382,7 @@ def __init__(
                         'No password provided for user "{0}".'.format(username)
                     )
                 self._session.auth = (username, password)
+        self._chunk_size = chunk_size
 
     def _get_gcp_session(
             self,
@@ -648,7 +690,9 @@ def _http_get(
             params = {}
         url += self._build_query_string(params)
         logger.debug('GET: {} {}'.format(url, headers))
-        response = self._session.get(url=url, headers=headers)
+        # Setting stream allows for retrieval of data in chunks using
+        # the iter_content() method
+        response = self._session.get(url=url, headers=headers, stream=True)
         try:
             response.raise_for_status()
         except requests.exceptions.HTTPError as error:
@@ -710,10 +754,11 @@ def _decode_multipart_message(
             message parts
 
         '''
-        header = ''
-        for key, value in headers.items():
-            header += '{}: {}\n'.format(key, value)
-        message = email.message_from_bytes(header.encode() + body)
+        header = ''.join([
+            '{}: {}\n'.format(key, value)
+            for key, value in headers.items()
+        ]).encode()
+        message = email.message_from_bytes(header + body)
         elements = []
         for part in message.walk():
             if part.get_content_maintype() == 'multipart':
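The header-prepending approach in `_decode_multipart_message` can be tried in isolation with the standard library. This is a minimal sketch under stated assumptions: the boundary string and the two part bodies are made up for illustration, and the body starts with a blank line so that the parser sees a clean separation between headers and payload.

```python
import email

# Hypothetical response headers and body; a real server chooses its own boundary.
headers = {
    'Content-Type': (
        'multipart/related; type="application/dicom"; boundary=boundary123'
    ),
}
body = (
    b'\r\n--boundary123\r\n'
    b'Content-Type: application/dicom\r\n\r\n'
    b'first\r\n'
    b'--boundary123\r\n'
    b'Content-Type: application/dicom\r\n\r\n'
    b'second\r\n'
    b'--boundary123--'
)

# Serialize the HTTP headers and prepend them so that the email parser
# knows the media type and boundary of the multipart message.
header = ''.join([
    '{}: {}\n'.format(key, value)
    for key, value in headers.items()
]).encode()
message = email.message_from_bytes(header + body)

# Skip the multipart container itself and collect the raw part payloads.
parts = [
    part.get_payload(decode=True)
    for part in message.walk()
    if part.get_content_maintype() != 'multipart'
]
print(parts)
```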
@@ -997,8 +1042,17 @@ def _http_get_multipart_application_dicom(
             ),
         }
         response = self._http_get(url, params, headers)
+        with response as r:
+            if self._chunk_size is not None:
+                logger.info('retrieve data in chunks')
+                content = b''.join([
+                    chunk
+                    for chunk in r.iter_content(chunk_size=self._chunk_size)
+                ])
+            else:
+                content = r.content
         datasets = self._decode_multipart_message(
-            response.content,
+            content,
             response.headers
         )
         return [pydicom.dcmread(BytesIO(ds)) for ds in datasets]
@@ -1357,16 +1411,53 @@ def _http_post(
 
         '''
         logger.debug('POST: {} {}'.format(url, headers))
-        response = self._session.post(url=url, data=data, headers=headers)
+
+        def serve_data_chunks(data):
+            for offset in range(0, len(data), self._chunk_size):
+                end = offset + self._chunk_size
+                yield data[offset:end]
+
+        if self._chunk_size is not None and len(data) > self._chunk_size:
+            logger.info('store data in chunks using chunked transfer encoding')
+            chunked_headers = dict(headers)
+            chunked_headers['Transfer-Encoding'] = 'chunked'
+            chunked_headers['Cache-Control'] = 'no-cache'
+            chunked_headers['Connection'] = 'Keep-Alive'
+            data_chunks = serve_data_chunks(data)
+            response = self._session.post(
+                url=url,
+                data=data_chunks,
+                headers=chunked_headers
+            )
+        else:
+            response = self._session.post(url=url, data=data, headers=headers)
         logger.debug('request status code: {}'.format(response.status_code))
-        response.raise_for_status()
+        try:
+            response.raise_for_status()
+        except requests.exceptions.HTTPError as error:
+            raise HTTPError(error)
+        except requests.exceptions.ConnectionError as error:
+            raise HTTPError(error)
+        if not response.ok:
+            logger.warning('storage was not successful for all instances')
+            payload = response.content
+            tree = ET.fromstring(payload)
+            dataset = _load_xml_dataset(tree)
+            failed_sop_sequence = getattr(dataset, 'FailedSOPSequence', [])
+            for failed_sop_item in failed_sop_sequence:
+                logger.error(
+                    'storage of instance {} failed: "{}"'.format(
+                        failed_sop_item.ReferencedSOPInstanceUID,
+                        failed_sop_item.FailureReason
+                    )
+                )
         return response
 
     def _http_post_multipart_application_dicom(
             self,
             url: str,
             data: bytes
-        ) -> Dict[str, dict]:
+        ) -> Optional[pydicom.dataset.Dataset]:
         '''Performs a HTTP POST request with a multipart payload with
         "application/dicom" media type.
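The slicing done by `serve_data_chunks` in `_http_post` can be checked on its own. This is a minimal sketch, assuming the chunk size is passed as an argument instead of being read from `self._chunk_size`; the sample payload is made up for illustration.

```python
from typing import Iterator


def serve_data_chunks(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive slices of at most chunk_size bytes."""
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


payload = b'0123456789'
chunks = list(serve_data_chunks(payload, 4))
print(chunks)
```

Passing such a generator as the `data` argument is what lets `requests` send the body with chunked transfer encoding, since the total length is not known up front.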
 
@@ -1380,7 +1471,7 @@ def _http_post_multipart_application_dicom(
         Returns
         -------
         Dict[str, dict]
-            information about stored instances in DICOM JSON format
+            information about stored instances
 
         '''
         content_type = (
@@ -1389,20 +1480,20 @@ def _http_post_multipart_application_dicom(
             'boundary=0f3cf5c0-70e0-41ef-baef-c6f9f65ec3e1'
         )
         content = self._encode_multipart_message(data, content_type)
-        self._http_post(
+        response = self._http_post(
             url,
             content,
             headers={'Content-Type': content_type}
         )
-        # FIXME: return information
-        # http://dicom.nema.org/medical/dicom/current/output/chtml/part18/chapter_I.html
-        # response = self._http_post(
-        #     url,
-        #     content,
-        #     headers={'Content-Type': content_type}
-        # )
-        # response.content
-        return {}
+        if response.content:
+            if (response.headers['Content-Type'] == 'application/dicom+json' or
+                    response.headers['Content-Type'] == 'application/json'):
+                return load_json_dataset(response.json())
+            elif (response.headers['Content-Type'] == 'application/dicom+xml' or
+                    response.headers['Content-Type'] == 'application/xml'):
+                tree = ET.fromstring(response.content)
+                return _load_xml_dataset(tree)
+        return None
 
     def search_for_studies(
             self,
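The response handling in `_http_post_multipart_application_dicom` compares the `Content-Type` header for exact equality. Servers may append parameters such as a charset, so a variation is to normalize the header before dispatching. This is a sketch only; `classify_response` is a hypothetical helper and is not part of this change.

```python
from typing import Optional

JSON_TYPES = {'application/dicom+json', 'application/json'}
XML_TYPES = {'application/dicom+xml', 'application/xml'}


def classify_response(content_type: str) -> Optional[str]:
    """Map a Content-Type header value to 'json', 'xml', or None."""
    # Drop parameters such as "; charset=utf-8" and ignore letter case.
    media_type = content_type.split(';', 1)[0].strip().lower()
    if media_type in JSON_TYPES:
        return 'json'
    if media_type in XML_TYPES:
        return 'xml'
    return None


print(classify_response('application/dicom+json; charset=utf-8'))
```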
@@ -2039,16 +2130,16 @@ def store_instances(
         Returns
         -------
         Dict[str, dict]
-            information about status of stored instances in DICOM JSON format
+            information about status of stored instances
 
         '''
         url = self._get_studies_url('stow', study_instance_uid)
         encoded_datasets = list()
-        # TODO: can we do this more memory efficient? Concatenations?
         for ds in datasets:
             with BytesIO() as b:
                 pydicom.dcmwrite(b, ds)
-                encoded_datasets.append(b.getvalue())
+                encoded_ds = b.getvalue()
+            encoded_datasets.append(encoded_ds)
         return self._http_post_multipart_application_dicom(
             url,
             encoded_datasets