Merge branch 'master' into add_ffprobe_checks_and_env_variables
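
Note on the `dtype` change in `read_stems`: the default moves from `np.float_` to `np.float64`. `np.float_` was an alias of `np.float64` and was removed in NumPy 2.0, so the default output type should stay the same. The sketch below is not stempeg code. It only illustrates, under stated assumptions, how the `ffmpeg_format` names touched in this diff relate to little-endian NumPy dtypes and how the result is cast to the requested `dtype`. The helper name `decode_pcm`, the `"<i2"` entry for `s16le`, and the integer scaling are assumptions made for illustration.

```python
import numpy as np

# ffmpeg raw PCM format name -> little-endian NumPy dtype string.
# "<f8" and "<f4" appear in the diff; "<i2" for s16le is an assumption.
PCM_DTYPES = {
    "f64le": "<f8",  # PCM 64 bit float
    "f32le": "<f4",  # PCM 32 bit float
    "s16le": "<i2",  # PCM 16 bit signed int
}


def decode_pcm(raw, ffmpeg_format="f32le", channels=2, dtype=np.float64):
    """Hypothetical helper: interleaved raw PCM bytes -> (samples, channels)."""
    if ffmpeg_format not in PCM_DTYPES:
        raise ValueError(f"unsupported ffmpeg_format: {ffmpeg_format}")
    # Interpret the byte buffer using the dtype that matches the PCM format.
    audio = np.frombuffer(raw, dtype=PCM_DTYPES[ffmpeg_format])
    # Interleaved samples: one row per frame, one column per channel.
    audio = audio.reshape(-1, channels)
    if ffmpeg_format == "s16le":
        # Assumption: scale 16 bit integers to the [-1, 1) float range.
        audio = audio / 32768.0
    # Cast to the requested output type (np.float64 mirrors the new default).
    return audio.astype(dtype)


# Two stereo frames encoded as 32 bit float PCM.
raw = np.array([[0.5, -0.5], [0.25, 1.0]], dtype="<f4").tobytes()
audio = decode_pcm(raw)
```

Here `audio` is a two-dimensional `np.float64` array with one row per frame, which matches the intermediate format `f32le` being decoded and then cast to the caller's `dtype`.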

faroit · web-flow · commit b954a2c32925 · 2025-05-28T10:44:53.000+02:00
diff --git a/.github/workflows/test_unittests.yml b/.github/workflows/test_unittests.yml
@@ -10,7 +10,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        ffmpeg-version: ["3.2.4", "3.4.2", "4.4.2", "5.1.2", "6.1.1", "7.0.0"]
+        ffmpeg-version: ["4.3", "5.0", "6.0", "7.0"]
 
     # Timeout: https://stackoverflow.com/a/59076067/4521646
     timeout-minutes: 10
@@ -25,7 +25,7 @@ jobs:
           activate-environment: stempeg
           auto-update-conda: true
           auto-activate-base: false
-          python-version: 3.7
+          python-version: 3.11
       - name: Install dependencies FFMPEG ${{ matrix.ffmpeg-version }}
         env:
           FFMPEG_INSTALL: ${{ matrix.pytorch-version }}
diff --git a/README.md b/README.md
@@ -6,15 +6,15 @@
 [![Supported Python versions](https://img.shields.io/pypi/pyversions/stempeg.svg)](https://pypi.python.org/pypi/stempeg)
 
 Python package to read and write [STEM](https://www.native-instruments.com/en/specials/stems/) audio files.
-Technically, stems are audio containers that combine multiple audio streams and metadata in a single audio file. This makes it ideal to playback multitrack audio, where users can select the audio sub-stream during playback (e.g. supported by VLC). 
+Technically, stems are audio containers that combine multiple audio streams and metadata in a single audio file. This makes them ideal for playing back multitrack audio, where users can select the audio sub-stream during playback (e.g. supported by VLC).
 
 Under the hood, _stempeg_ uses [ffmpeg](https://www.ffmpeg.org/) for reading and writing multistream audio, optionally [MP4Box](https://github.com/gpac/gpac) is used to create STEM files that are compatible with Native Instruments hardware and software.
 
 #### Features
 
 - robust and fast interface for ffmpeg to read and write any supported format from/to numpy.
 - reading supports seeking and duration.
-- control container and codec as well as bitrate when compressed audio is written. 
+- control container and codec as well as bitrate when compressed audio is written.
 - store multi-track audio within audio formats by aggregate streams into channels (concatenation of pairs of
 stereo channels).
 - support for internal ffmpeg resampling furing read and write.
@@ -70,7 +70,7 @@ conda install -c conda-forge stempeg
 
 Stempeg can read multi-stream and single stream audio files, thus, it can replace your normal audio loaders for 1d or 2d (mono/stereo) arrays.
 
-By default [`read_stems`](https://faroit.com/stempeg/read.html#stempeg.read.read_stems), assumes that multiple substreams can exit (default `reader=stempeg.StreamsReader()`). 
+By default, [`read_stems`](https://faroit.com/stempeg/read.html#stempeg.read.read_stems) assumes that multiple substreams can exist (default `reader=stempeg.StreamsReader()`).
 To support multi-stream, even when the audio container doesn't support multiple streams
 (e.g. WAV), streams can be mapped to multiple pairs of channels. In that
 case, `reader=stempeg.ChannelsReader()`, can be passed. Also see:
@@ -121,7 +121,7 @@ Writing stem files from a numpy tensor can done with.
 stempeg.write_stems(path="output.stem.mp4", data=S, sample_rate=44100, writer=stempeg.StreamsWriter())
 ```
 
-As seen in the flow chart above, stempeg supports multiple ways to write multi-stream audio. 
+As seen in the flow chart above, stempeg supports multiple ways to write multi-stream audio.
 Each of the method has different number of parameters. To select a method one of the following setting and be passed:
 
 * `stempeg.FilesWriter`
@@ -136,8 +136,8 @@ Each of the method has different number of parameters. To select a method one of
     Stem will be saved into a single multistream audio.
     Additionally Native Instruments Stems compabible
     Metadata is added. This requires the installation of
-    `MP4Box`. 
-    
+    `MP4Box`.
+
 > :warning: __Warning__: Muxing stems using _ffmpeg_ leads to multi-stream files not compatible with Native Instrument Hardware or Software. Please use [MP4Box](https://github.com/gpac/gpac) if you use the `stempeg.NISTemsWriter()`
 
 For more information on writing stems, see  [`stempeg.write_stems`](https://faroit.com/stempeg/write.html#stempeg.write.write_stems).
diff --git a/docs/read.html b/docs/read.html
@@ -101,7 +101,7 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
         duration (float): duration in seconds
         dtype (numpy.dtype): Type of audio array to be casted into
         stem_idx (int): stream id
-        ffmpeg_format (str): ffmpeg intermediate format encoding. 
+        ffmpeg_format (str): ffmpeg intermediate format encoding.
             Choose &#34;f32le&#34; for best compatibility
 
     Returns:
@@ -123,10 +123,10 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
 
     # decode to raw pcm format
     if ffmpeg_format == &#34;f64le&#34;:
-        # PCM 64 bit float 
+        # PCM 64 bit float
         numpy_dtype = &#39;&lt;f8&#39;
     elif ffmpeg_format == &#34;f32le&#34;:
-        # PCM 32 bit float 
+        # PCM 32 bit float
         numpy_dtype = &#39;&lt;f4&#39;
     elif ffmpeg_format == &#34;s16le&#34;:
         # PCM 16 bit signed int
@@ -150,7 +150,7 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
     duration=None,
     stem_id=None,
     always_3d=False,
-    dtype=np.float_,
+    dtype=np.float64,
     ffmpeg_format=&#34;f32le&#34;,
     info=None,
     sample_rate=None,
@@ -181,28 +181,28 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
         duration (float): Duration to load in seconds.
         stem_id (int, optional): substream id,
             defauls to `None` (all substreams are loaded).
-        always_3d (bool, optional): By default, reading a 
+        always_3d (bool, optional): By default, reading a
             single-stream audio file will return a
             two-dimensional array.  With ``always_3d=True``, audio data is
             always returned as a three-dimensional array, even if the audio
             file has only one stream.
         dtype (np.dtype, optional): Numpy data type to use, default to `np.float32`.
-        info (Info, Optional): Pass ffmpeg `Info` object to reduce number 
+        info (Info, Optional): Pass ffmpeg `Info` object to reduce number
             of os calls on file.
             This can be used e.g. the sample rate and length of a track is
             already known in advance. Useful for ML training where the
             info objects can be pre-processed, thus audio loading can
             be speed up.
-        sample_rate (float, optional): Sample rate of returned audio. 
+        sample_rate (float, optional): Sample rate of returned audio.
             Defaults to `None` which results in
             the sample rate returned from the mixture.
-        reader (Reader): Holds parameters for the reading method. 
+        reader (Reader): Holds parameters for the reading method.
             One of the following:
                 `StreamsReader(...)`
                     Read from a single multistream audio (default).
                 `ChannelsReader(...)`
                     Read/demultiplexed from multiple channels.
-        multiprocess (bool): Applys multi-processing for reading 
+        multiprocess (bool): Applies multi-processing for reading
             substreams in parallel to speed up reading. Defaults to `True`
 
     Returns:
@@ -280,7 +280,7 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
         channels = min(_chans)
     else:
         raise RuntimeError(&#34;Stems do not have the same number of channels per substream&#34;)
-    
+
     # set channels to minimum channel per stream
     stems = []
 
@@ -511,7 +511,7 @@ <h2 id="shape">Shape</h2>
     duration=None,
     stem_id=None,
     always_3d=False,
-    dtype=np.float_,
+    dtype=np.float64,
     ffmpeg_format=&#34;f32le&#34;,
     info=None,
     sample_rate=None,
@@ -542,28 +542,28 @@ <h2 id="shape">Shape</h2>
         duration (float): Duration to load in seconds.
         stem_id (int, optional): substream id,
             defauls to `None` (all substreams are loaded).
-        always_3d (bool, optional): By default, reading a 
+        always_3d (bool, optional): By default, reading a
             single-stream audio file will return a
             two-dimensional array.  With ``always_3d=True``, audio data is
             always returned as a three-dimensional array, even if the audio
             file has only one stream.
         dtype (np.dtype, optional): Numpy data type to use, default to `np.float32`.
-        info (Info, Optional): Pass ffmpeg `Info` object to reduce number 
+        info (Info, Optional): Pass ffmpeg `Info` object to reduce number
             of os calls on file.
             This can be used e.g. the sample rate and length of a track is
             already known in advance. Useful for ML training where the
             info objects can be pre-processed, thus audio loading can
             be speed up.
-        sample_rate (float, optional): Sample rate of returned audio. 
+        sample_rate (float, optional): Sample rate of returned audio.
             Defaults to `None` which results in
             the sample rate returned from the mixture.
-        reader (Reader): Holds parameters for the reading method. 
+        reader (Reader): Holds parameters for the reading method.
             One of the following:
                 `StreamsReader(...)`
                     Read from a single multistream audio (default).
                 `ChannelsReader(...)`
                     Read/demultiplexed from multiple channels.
-        multiprocess (bool): Applys multi-processing for reading 
+        multiprocess (bool): Applies multi-processing for reading
             substreams in parallel to speed up reading. Defaults to `True`
 
     Returns:
@@ -641,7 +641,7 @@ <h2 id="shape">Shape</h2>
         channels = min(_chans)
     else:
         raise RuntimeError(&#34;Stems do not have the same number of channels per substream&#34;)
-    
+
     # set channels to minimum channel per stream
     stems = []
 
@@ -1130,4 +1130,4 @@ <h4><code><a title="stempeg.read.StreamsReader" href="#stempeg.read.StreamsReade
 <p>Generated by <a href="https://pdoc3.github.io/pdoc"><cite>pdoc</cite> 0.9.1</a>.</p>
 </footer>
 </body>
-</html>
+</html>
diff --git a/setup.py b/setup.py
@@ -14,45 +14,43 @@
 # Fields marked as "Optional" may be commented out.
 
 setup(
-    name='stempeg',
-    version='0.2.3',
-    description='Read and write stem/multistream audio files',
+    name="stempeg",
+    version="0.2.4",
+    description="Read and write stem/multistream audio files",
     long_description=long_description,
-    long_description_content_type='text/markdown',
-    url='http://github.com/faroit/stempeg',
-    author='Fabian-Robert Stoeter',
-    author_email='mail@faroit.com',
+    long_description_content_type="text/markdown",
+    url="http://github.com/faroit/stempeg",
+    author="Fabian-Robert Stoeter",
+    author_email="mail@faroit.com",
     classifiers=[
-            'Development Status :: 4 - Beta',
-            'Environment :: Console',
-            'Intended Audience :: Telecommunications Industry',
-            'Intended Audience :: Science/Research',
-            'Programming Language :: Python :: 3.5',
-            'Programming Language :: Python :: 3.6',
-            'Programming Language :: Python :: 3.7',
-            'Programming Language :: Python :: 3.8',
-            'Topic :: Multimedia :: Sound/Audio :: Analysis',
-            'Topic :: Multimedia :: Sound/Audio :: Sound Synthesis'
+        "Development Status :: 4 - Beta",
+        "Environment :: Console",
+        "Intended Audience :: Telecommunications Industry",
+        "Intended Audience :: Science/Research",
+        "Programming Language :: Python :: 3.9",
+        "Programming Language :: Python :: 3.10",
+        "Programming Language :: Python :: 3.11",
+        "Topic :: Multimedia :: Sound/Audio :: Analysis",
+        "Topic :: Multimedia :: Sound/Audio :: Sound Synthesis",
     ],
     zip_safe=True,
-    keywords='stems audio reader',
-    packages=find_packages(exclude=['tests']),
+    keywords="stems audio reader",
+    packages=find_packages(exclude=["tests"]),
     # Dependencies, this installs the entire Python scientific
     # computations stack
-    install_requires=[
-        'numpy>=1.6',
-        'ffmpeg-python>=0.2.0'
-    ],
+    install_requires=["numpy>=1.6", "ffmpeg-python>=0.2.0"],
     extras_require={
-        'tests': [
-            'pytest',
+        "tests": [
+            "pytest",
         ],
     },
-    entry_points={'console_scripts': [
-        'stem2files=stempeg.cli:cli',
-    ]},
+    entry_points={
+        "console_scripts": [
+            "stem2files=stempeg.cli:cli",
+        ]
+    },
     project_urls={  # Optional
-        'Bug Reports': 'https://github.com/faroit/stempeg/issues',
+        "Bug Reports": "https://github.com/faroit/stempeg/issues",
     },
-    include_package_data=True
+    include_package_data=True,
 )
diff --git a/stempeg/read.py b/stempeg/read.py
@@ -71,7 +71,7 @@ def _read_ffmpeg(
         duration (float): duration in seconds
         dtype (numpy.dtype): Type of audio array to be casted into
         stem_idx (int): stream id
-        ffmpeg_format (str): ffmpeg intermediate format encoding. 
+        ffmpeg_format (str): ffmpeg intermediate format encoding.
             Choose "f32le" for best compatibility
 
     Returns:
@@ -93,10 +93,10 @@ def _read_ffmpeg(
 
     # decode to raw pcm format
     if ffmpeg_format == "f64le":
-        # PCM 64 bit float 
+        # PCM 64 bit float
         numpy_dtype = '<f8'
     elif ffmpeg_format == "f32le":
-        # PCM 32 bit float 
+        # PCM 32 bit float
         numpy_dtype = '<f4'
     elif ffmpeg_format == "s16le":
         # PCM 16 bit signed int
@@ -120,7 +120,7 @@ def read_stems(
     duration=None,
     stem_id=None,
     always_3d=False,
-    dtype=np.float_,
+    dtype=np.float64,
     ffmpeg_format="f32le",
     info=None,
     sample_rate=None,
@@ -151,28 +151,28 @@ def read_stems(
         duration (float): Duration to load in seconds.
         stem_id (int, optional): substream id,
             defauls to `None` (all substreams are loaded).
-        always_3d (bool, optional): By default, reading a 
+        always_3d (bool, optional): By default, reading a
             single-stream audio file will return a
             two-dimensional array.  With ``always_3d=True``, audio data is
             always returned as a three-dimensional array, even if the audio
             file has only one stream.
         dtype (np.dtype, optional): Numpy data type to use, default to `np.float32`.
-        info (Info, Optional): Pass ffmpeg `Info` object to reduce number 
+        info (Info, Optional): Pass ffmpeg `Info` object to reduce number
             of os calls on file.
             This can be used e.g. the sample rate and length of a track is
             already known in advance. Useful for ML training where the
             info objects can be pre-processed, thus audio loading can
             be speed up.
-        sample_rate (float, optional): Sample rate of returned audio. 
+        sample_rate (float, optional): Sample rate of returned audio.
             Defaults to `None` which results in
             the sample rate returned from the mixture.
-        reader (Reader): Holds parameters for the reading method. 
+        reader (Reader): Holds parameters for the reading method.
             One of the following:
                 `StreamsReader(...)`
                     Read from a single multistream audio (default).
                 `ChannelsReader(...)`
                     Read/demultiplexed from multiple channels.
-        multiprocess (bool): Applys multi-processing for reading 
+        multiprocess (bool): Applies multi-processing for reading
             substreams in parallel to speed up reading. Defaults to `True`
 
     Returns:
@@ -250,7 +250,7 @@ def read_stems(
         channels = min(_chans)
     else:
         raise RuntimeError("Stems do not have the same number of channels per substream")
-    
+
     # set channels to minimum channel per stream
     stems = []
 
diff --git a/tests/test_random.py b/tests/test_random.py
@@ -3,7 +3,7 @@
 import pytest
 
 
-@pytest.fixture(params=[1024, 2048, 12313, 100000])
+@pytest.fixture(params=[1024, 2048, 100000])
 def nb_samples(request):
     return request.param
 
diff --git a/tests/test_read.py b/tests/test_read.py
@@ -9,12 +9,12 @@ def dtype(request):
     return request.param
 
 
-@pytest.fixture(params=[None, 0, 0.0000001, 1, 100])
+@pytest.fixture(params=[None, 0, 0.0000001, 1])
 def start(request):
     return request.param
 
 
-@pytest.fixture(params=[None, 0.00000001, 0.5, 1, 2.00000000000001])
+@pytest.fixture(params=[None, 0.00000001, 0.5, 2.00000000000001])
 def duration(request):
     return request.param
 
diff --git a/tests/test_write.py b/tests/test_write.py
@@ -160,7 +160,7 @@ def ordered(obj):
     else:
         return obj
 
-
+@pytest.mark.optional
 def test_nistems():
     stems, rate = stempeg.read_stems(stempeg.example_stem_path())
     with tmp.NamedTemporaryFile(