
Conversation

@NeoLegends commented Sep 19, 2025

Closes #1701

Currently fails due to #1763.

@NeoLegends changed the title to "Add multiprocessing support to PostprocessingDataset" Sep 19, 2025
@albertz commented Sep 19, 2025

Let's discuss the implementation first before you implement sth. See my comment in #1701. I think we can simply share/reuse the MultiProcDataset code?

@albertz commented Sep 19, 2025

We should also fix #1766 first.

@NeoLegends commented Sep 19, 2025

I understand it's not yet clear if it's safe to always copy the raw tensor in Tensor __getstate__.

So, for the purposes of this PR, where I want to send TensorDicts across a pickled pipe, do you think it's reasonable to introduce such a scope/contextmanager (as already mentioned in #1541) that configures whether the raw_tensor is included in __getstate__ or not?

With the contextmanager we do not need to decide now whether it's always safe to copy. If it eventually turns out it's always safe to copy the raw_tensor, we can remove the contextmanager implementation again from RETURNN and default to always copying.
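
For illustration, a minimal sketch of such a contextmanager, assuming a module-level flag that Tensor.__getstate__ would consult; all names here are hypothetical, not existing RETURNN API:

import contextlib

_serialize_raw_tensor = False  # hypothetical flag checked by Tensor.__getstate__

@contextlib.contextmanager
def serialize_raw_tensor():
    # Within this scope, Tensor.__getstate__ would include the raw_tensor.
    global _serialize_raw_tensor
    prev = _serialize_raw_tensor
    _serialize_raw_tensor = True
    try:
        yield
    finally:
        _serialize_raw_tensor = prev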

@albertz commented Sep 19, 2025

I think we should just fix #1541. I.e. first thing is to understand whether it is maybe safe to just always copy it. Or understand when it is not safe. I would avoid introducing some code complexity (like such a context scope) just because we want to avoid understanding the code.

assert self._dataset is not None
self._dataset.init_seq_order(epoch=epoch, seq_list=seq_list, seq_order=seq_order)
self._data_iter = enumerate(self._build_mapping_iter())
if self._num_workers > 0:
@NeoLegends (review comment on the snippet above)
I wonder if it makes sense to use a dedicated subclass for the multi-process variant, to keep the base implementation simpler and avoid mixing in concerns of process management etc.

WDYT @albertz ?

@NeoLegends

I like this, the result has come out really cleanly separated.

@NeoLegends commented Sep 22, 2025

@albertz see here for why I made a subclass instead. I did not like how the multiprocessing details bled into the existing, straightforward implementation and made it less straightforward. There were class fields that were only used in one of the two cases, the data iterator would dynamically be either one or the other variant, etc.
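
Roughly the shape of the split, purely schematic (the subclass name and method bodies are illustrative, not the actual PR code):

class PostprocessingDataset(CachedDataset2):
    # Single-process variant: plain mapping pipeline, no process management.

    def init_seq_order(self, epoch=None, seq_list=None, seq_order=None):
        super().init_seq_order(epoch=epoch, seq_list=seq_list, seq_order=seq_order)
        assert self._dataset is not None
        self._dataset.init_seq_order(epoch=epoch, seq_list=seq_list, seq_order=seq_order)
        self._data_iter = enumerate(self._build_mapping_iter())
        return True


class MultiProcPostprocessingDataset(PostprocessingDataset):
    # Hypothetical subclass: owns the worker processes and overrides only what
    # differs, so process management never leaks into the base class.

    def init_seq_order(self, epoch=None, seq_list=None, seq_order=None):
        # spawn/reuse workers, distribute the seq order to them, etc.
        ...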

@NeoLegends commented Sep 29, 2025

It seems like manually calling gc.collect() after every 1000 segments (just an example number I picked; it must not be too low, or the frequent GC passes slow down the processor too much) keeps the resident size at ~2GB. So it is a GC issue in a way.

I guess from this finding it's sensible to add a gc.collect() to the other subprocess datasets as well.
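
As a sketch, the kind of periodic collection meant here (the loop and interval are illustrative, not the actual dataset code):

import gc

GC_INTERVAL = 1000  # example value; too low, and the extra GC passes slow the worker down

def worker_loop(seq_iter, out_pipe):
    # Hypothetical worker loop: forward seqs, and collect cyclic garbage
    # periodically so the resident size stays bounded.
    for seq_idx, seq in enumerate(seq_iter):
        out_pipe.send(seq)
        if (seq_idx + 1) % GC_INTERVAL == 0:
            gc.collect()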

Flamegraph WIP

@NeoLegends commented Sep 29, 2025

Flamegraph w/ GC every 2000 seqs: [flamegraph image]

Memory usage is half of what it was before. Is the Python GC even aware of memory pressure? If not, then I think we should add something like this to every dataset.

mem-2767946.html

@albertz commented Sep 29, 2025

Even 2GB is already much more than I would expect. A Python interpreter with Numpy imported takes rss=36.6MB pss=36.1MB uss=36.0MB shared=596.0KB for me.
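
(Numbers like these can be read off with psutil, assuming Linux, where uss/pss are derived from /proc/<pid>/smaps:)

import numpy  # noqa: F401  # imported so that numpy's footprint is included
import psutil

proc = psutil.Process()
mi = proc.memory_info()        # rss, shared, ... (Linux)
mfi = proc.memory_full_info()  # additionally uss, pss, swap
print(
    f"rss={mi.rss / 2**20:.1f}MB pss={mfi.pss / 2**20:.1f}MB "
    f"uss={mfi.uss / 2**20:.1f}MB shared={mi.shared / 2**10:.1f}KB"
)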

@NeoLegends commented Sep 29, 2025

Ah, I've been running with buf_size=200 so far, so some memory usage is expected. With buf_size=1 and frequent GC the memory usage is at about 400MB.

@albertz commented Sep 29, 2025

Ah, I've been running with buf_size=200 so far, so some memory usage is expected. With buf_size=1 and frequent GC the memory usage is at about 400MB.

Buffer size 200 seems way too much. I would expect sth like 1-10 is enough here?

@albertz commented Sep 29, 2025

With buf_size=1 and frequent GC the memory usage is at about 400MB.

This seems still a bit too much to me. Where exactly is it consumed? As said, I would expect more like 40MB.

@NeoLegends

Buffer size 200 seems way too much. I would expect sth like 1-10 is enough here?

Yeah, maybe. I tend to use larger buffers to make sure the data pipeline never stalls.

This seems still a bit too much to me. Where exactly is it consumed? As said, I would expect more like 40MB.

Hm, maybe. But then I also don't think it's that problematic anymore; maybe digging even deeper into where this memory is spent is "off topic" for this PR? The 400MB is the resident size, which, as seen, is about 2x the heap size here.

Here is the flamegraph:
[flamegraph image]

mem-2769354.html

@NeoLegends

I'm now looking into the allocation counts; they seem suspiciously high to me. In the 10 minutes of profiling time the process made about 930M allocations, many of which are made by the unpickler. Maybe we have an issue w/ unpickling?

@albertz commented Sep 29, 2025

I'm not sure this is off topic for the PR. For that, we first need to understand where it comes from.

From your flamegraph, I don't really get where the mem is allocated: to what objects, where they were created, etc.

@NeoLegends commented Sep 29, 2025

❯ memray stats -n 20 mem-2769354.profile
📏 Total allocations:
        466772172

📦 Total memory allocated:
        120.993GB

📈 Peak memory usage:
        175.886MB

📊 Histogram of allocation size:
        min: 1.000B
        ------------------------------------------------
        < 5.000B   :    759459 ▇
        < 26.000B  :  67449044 ▇▇▇▇▇
        < 136.000B : 395472412 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
        < 705.000B :   1863909 ▇
        < 3.636kB  :    379359 ▇
        < 18.739kB :    376007 ▇
        < 96.580kB :    247423 ▇
        < 497.753kB:    179849 ▇
        < 2.565MB  :     44196 ▇
        <=13.221MB :       514 ▇
        ------------------------------------------------
        max: 13.221MB

📂 Allocator type distribution:
         PYMALLOC_MALLOC: 371663783
         PYMALLOC_REALLOC: 94496982
         PYMALLOC_CALLOC: 501135
         MALLOC: 109923
         CALLOC: 349

🥇 Top 20 largest allocating locations (by size):
        - recv:/usr/lib/python3.10/multiprocessing/connection.py:251 -> 41.959GB
        - dumps:/usr/lib/python3.10/multiprocessing/reduction.py:51 -> 28.005GB
        - resample:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/resampy/core.py:140 -> 10.172GB
        - _recv:/usr/lib/python3.10/multiprocessing/connection.py:379 -> 8.352GB
        - _recv:/usr/lib/python3.10/multiprocessing/connection.py:386 -> 7.101GB
        - get_audio_features:/home/mgunz/setups/2025-03-07--ws-asr/recipe/returnn/returnn/datasets/util/feature_extraction.py:190 -> 3.458GB
        - _extract_features:/home/mgunz/setups/2025-03-07--ws-asr/recipe/apptek_asr/lib/data_postprocessing.py:374 -> 3.458GB
        - get_audio_features:/home/mgunz/setups/2025-03-07--ws-asr/recipe/returnn/returnn/datasets/util/feature_extraction.py:168 -> 3.457GB
        - _apply_speed_perturbation:/home/mgunz/setups/2025-03-07--ws-asr/recipe/apptek_asr/lib/data_postprocessing.py:278 -> 2.762GB
        - _apply_speed_perturbation:/home/mgunz/setups/2025-03-07--ws-asr/recipe/apptek_asr/lib/data_postprocessing.py:283 -> 2.416GB
        - zeros_like:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/numpy/core/numeric.py:129 -> 2.414GB
        - _pad_simple:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/numpy/lib/arraypad.py:114 -> 1.721GB
        - __call__:/home/mgunz/setups/2025-03-07--ws-asr/recipe/apptek_asr/lib/data_postprocessing.py:177 -> 1.401GB
        - diff:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/numpy/lib/function_base.py:1452 -> 772.242MB
        - diff:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/numpy/lib/function_base.py:1441 -> 771.573MB
        - valid_audio:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/librosa/util/utils.py:313 -> 691.183MB
        - resample:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/resampy/core.py:134 -> 576.544MB
        - _tokenize:/usr/lib/python3.10/tokenize.py:527 -> 373.258MB
        - _wrapreduction:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/numpy/core/fromnumeric.py:88 -> 330.813MB
        - __init__:/usr/lib/python3.10/multiprocessing/reduction.py:39 -> 84.045MB

🥇 Top 20 largest allocating locations (by number of allocations):
        - recv:/usr/lib/python3.10/multiprocessing/connection.py:251 -> 386929311
        - dumps:/usr/lib/python3.10/multiprocessing/reduction.py:51 -> 68922730
        - _tokenize:/usr/lib/python3.10/tokenize.py:527 -> 751792
        - _apply_speed_perturbation:/home/mgunz/setups/2025-03-07--ws-asr/recipe/apptek_asr/lib/data_postprocessing.py:271 -> 383424
        - __init__:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/numba/core/types/common.py:50 -> 375888
        - <lambda>:<string>:1 -> 289113
        - __call__:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/numba/core/types/abstract.py:67 -> 234946
        - ufunc_can_cast:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/numba/np/numpy_support.py:354 -> 219268
        - __reduce_ex__:/home/mgunz/setups/2025-03-07--ws-asr/recipe/returnn/returnn/tensor/_dim_extra.py:332 -> 200626
        - _tokenize:/usr/lib/python3.10/tokenize.py:591 -> 187944
        - _tokenize:/usr/lib/python3.10/tokenize.py:529 -> 187944
        - __init__:/usr/lib/python3.10/multiprocessing/reduction.py:39 -> 186956
        - _name_get:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/numpy/core/_dtype.py:363 -> 183308
        - issubclass_:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/numpy/core/numerictypes.py:320 -> 154470
        - _tokenize:/usr/lib/python3.10/tokenize.py:600 -> 148789
        - __setstate__:/home/mgunz/setups/2025-03-07--ws-asr/recipe/returnn/returnn/tensor/_tensor_extra.py:600 -> 147466
        - shape:/home/mgunz/setups/2025-03-07--ws-asr/recipe/returnn/returnn/tensor/_tensor_extra.py:1689 -> 143280
        - parent:<frozen importlib._bootstrap>:408 -> 140966
        - __init__:/home/mgunz/src/venv/pythonenv/lib/python3.10/site-packages/numba/core/types/npytypes.py:460 -> 140966
        - __init__:<frozen importlib._bootstrap>:72 -> 140952

Why would the unpickler need to make 380 million allocations over these 10min if all we're sending is TensorDicts? One subepoch takes longer than 10min and uses about 1.5k steps. That is at most 300k seqs, if every step used the full number of seqs (200 in my case). But it doesn't, since I use laplace ordering, so the real number of seqs will be much lower. Still, that would be ~1200 allocations per seq, which is way more than I would expect. And the statistics show that the majority of allocations fall between 26 and 136 bytes, so they are also really small? There are still about 45k allocations between 500KB and 2.5MB, so maybe the numpy arrays for the audio data etc. fall into that range. Still, this high allocation count is very strange to me.

Unfortunately I don't get more details on the precise objects that pickle is allocating, that's all I see for now.

@albertz commented Sep 29, 2025

Ok, 1200 allocs per seq doesn't seem like too much to me. Every single object counts here. Even a simple attribute access will allocate a new temporary method wrapper object.

@NeoLegends commented Sep 29, 2025

https://stackoverflow.com/a/25334520/2000501

Oh, wow. One class instance (without slots) apparently costs at least ~250 bytes. I expected Python to use some memory, but did not expect it to be that wasteful. But ok.

@NeoLegends commented Sep 29, 2025

I guess we're done here then. No further digging required.

Maybe we can be more efficient at this in general by adding __slots__ to the RETURNN tensor classes (Dim, Tensor) if they don't have it yet. But probably you optimized this already, or have a good reason why we should not do this.
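
As a quick illustration of that per-instance overhead (CPython; the exact numbers vary by version):

import sys

class Plain:
    def __init__(self):
        self.a, self.b = 1, 2

class Slotted:
    __slots__ = ("a", "b")
    def __init__(self):
        self.a, self.b = 1, 2

p, s = Plain(), Slotted()
# A plain instance pays for the object header plus a per-instance __dict__.
print(sys.getsizeof(p) + sys.getsizeof(p.__dict__))  # e.g. 48 + 104 bytes on CPython 3.10
print(sys.getsizeof(s))  # e.g. 48 bytes; attributes live in fixed slots, no __dict__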

@NeoLegends

Wrt. the gc.collect interval parameter, I am inclined to also add a lower bound of e.g. 100 seqs to the value. The reason is that with buf_size=1 you would otherwise end up in the pathological case where you gc.collect() after every seq, which is going to be a massive slowdown. I want to make this reasonably easy for the user, so that they don't have to think much about this config knob at all.
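
A sketch of the clamping I mean, assuming the collection interval defaults to buf_size (names are illustrative):

MIN_GC_INTERVAL = 100  # lower bound, so buf_size=1 does not trigger a collect after every seq

def effective_gc_interval(buf_size: int) -> int:
    # Hypothetical helper: derive the gc.collect() interval from buf_size,
    # but never collect more often than every MIN_GC_INTERVAL seqs.
    return max(buf_size, MIN_GC_INTERVAL)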

@albertz commented Sep 29, 2025

Well, I still don't understand where it consumes 400MB. ~1200 allocations per seq doesn't really explain that, and most of these objects are temporary and immediately deallocated again (no need for GC). Also, for Tensor and Dim, we already use __slots__. But that is not really relevant for the discussion of mem consumption: it makes a few bytes difference (per seq), but does not explain 400MB.

Your log output still does not really show what I want to know. I don't really want to know the accumulated amount of allocations. I want to know, at some/any particular point, what the allocated memory is: what objects, where they were allocated, etc. E.g. in the flamegraph, when you see it has 2GB (or 400MB) allocated, what is this 400MB?

@NeoLegends

Yes, this is true. I haven't been able to dig out this information from the profiles yet. But it's probably there. Will find that out.

@NeoLegends commented Oct 10, 2025

Setup:

DistributeFilesDataset(
  PostprocessingDataset(
    MetaDataset(HDFDataset(Audio), HDFDataset(Transcription)),
    buf_size=1,
    num_workers=2,
  ),
  buf_size=800,
)

No explicit GC calls/tuning.

By count it's mostly strs that are allocated. This is from pympler, but I think I need to use an even more advanced profiler to find out where these strings come from.
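
(For reference, summaries and the new/gone diff below can be produced with standard pympler calls along these lines; where exactly to snapshot is of course setup-specific:)

from pympler import muppy, summary

# Snapshot all live objects, grouped per type.
before = summary.summarize(muppy.get_objects())
# ... let the worker load another 100 seqs ...
after = summary.summarize(muppy.get_objects())
summary.print_(after)                            # per-type table like the ones below
summary.print_(summary.get_diff(before, after))  # the "new" objects since `before`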

worker 0: memory usage: 54.6 MiB, after 10 loaded seqs
                                        types |   # objects |   total size
============================================= | =========== | ============
                                          str |      107202 |     17.45 MB
                                         dict |       44352 |     12.31 MB
                                         code |       27693 |      4.72 MB
                                         type |        3770 |      3.50 MB
                                  abc.ABCMeta |        2021 |      2.06 MB
                                        tuple |       34968 |      1.99 MB
                                         list |        5245 |    965.52 KB
                        weakref.ReferenceType |       13006 |    914.48 KB
                                          set |        1479 |    869.48 KB
                                numpy.ndarray |         146 |    602.02 KB
                                    frozenset |        2344 |    548.69 KB
                   builtin_function_or_method |        7160 |    503.44 KB
                                          int |        9057 |    273.95 KB
                          function (__init__) |        1774 |    249.47 KB
                                         cell |        5893 |    230.20 KB
                            getset_descriptor |        3401 |    212.56 KB
                           wrapper_descriptor |        2872 |    201.94 KB
                                 staticmethod |        3463 |    162.33 KB
                            method_descriptor |        2187 |    153.77 KB
                       function (on_disposal) |        1044 |    146.81 KB
     numba.core.types.abstract._TypeMetaclass |         138 |    144.97 KB
                               _abc._abc_data |        2191 |    136.94 KB
                                       method |        1870 |    116.88 KB
     _cython_3_0_10.cython_function_or_method |         571 |    115.98 KB
                      collections.OrderedDict |         228 |    113.77 KB
                                  numpy.ufunc |         465 |    112.62 KB
         _cython_3_0_10.fused_cython_function |         497 |    108.72 KB
                  numba.core.utils.UniqueDict |          19 |     82.39 KB
                          function (__repr__) |         571 |     80.30 KB
                                     property |        1011 |     78.98 KB
          numba.core.types.functions.Function |        1658 |     77.72 KB
                      collections.defaultdict |          47 |     68.93 KB
                                      StgDict |         120 |     68.56 KB
                                   re.Pattern |         137 |     66.29 KB
         numba.core.cpu_options.InlineOptions |        1358 |     63.66 KB
                 _frozen_importlib.ModuleSpec |        1337 |     62.67 KB
                                enum.EnumMeta |          53 |     55.16 KB
  _frozen_importlib_external.SourceFileLoader |        1143 |     53.58 KB
                            member_descriptor |         810 |     50.62 KB
       ctypes.CDLL.__init__.<locals>._FuncPtr |         255 |     47.81 KB
                                        float |        1999 |     46.85 KB
                            inspect.Parameter |         731 |     45.69 KB
                           function (ov_wrap) |         319 |     44.86 KB
                          function (<lambda>) |         292 |     41.06 KB
                       _ctypes.PyCPointerType |          39 |     40.18 KB
                        _ctypes.PyCStructType |          37 |     38.09 KB
                           function (__new__) |         265 |     37.27 KB
                          function (__call__) |         246 |     34.59 KB
                    _collections._tuplegetter |         730 |     34.22 KB
                       function (method_impl) |         243 |     34.17 KB
after 110 loaded seqs:
                                        types |   # objects |   total size
============================================= | =========== | ============
                                          str |      137800 |     20.27 MB
                                         dict |       44398 |     13.16 MB
                                         code |       27693 |      4.73 MB
                                         list |        5256 |      3.99 MB
                                         type |        3783 |      3.51 MB
                                  abc.ABCMeta |        2021 |      2.06 MB
                                        tuple |       34977 |      1.99 MB
                                numpy.ndarray |         167 |      1.19 MB
                        weakref.ReferenceType |       13019 |    915.40 KB
                                          set |        1482 |    870.11 KB
                                    frozenset |        2344 |    548.69 KB
                   builtin_function_or_method |        7160 |    503.44 KB
                                          int |        9079 |    274.60 KB
                          function (__init__) |        1774 |    249.47 KB
                                         cell |        5893 |    230.20 KB
                            getset_descriptor |        3401 |    212.56 KB
                           wrapper_descriptor |        2898 |    203.77 KB
                                 staticmethod |        3463 |    162.33 KB
                            method_descriptor |        2187 |    153.77 KB
                       function (on_disposal) |        1044 |    146.81 KB
     numba.core.types.abstract._TypeMetaclass |         138 |    144.97 KB
                               _abc._abc_data |        2191 |    136.94 KB
                                       method |        1870 |    116.88 KB
     _cython_3_0_10.cython_function_or_method |         571 |    115.98 KB
                      collections.OrderedDict |         228 |    113.77 KB
                                  numpy.ufunc |         465 |    112.62 KB
         _cython_3_0_10.fused_cython_function |         497 |    108.72 KB
                  numba.core.utils.UniqueDict |          19 |     82.39 KB
                          function (__repr__) |         571 |     80.30 KB
                                     property |        1011 |     78.98 KB
          numba.core.types.functions.Function |        1658 |     77.72 KB
                      collections.defaultdict |          47 |     68.93 KB
                                      StgDict |         120 |     68.56 KB
                                   re.Pattern |         137 |     66.29 KB
         numba.core.cpu_options.InlineOptions |        1358 |     63.66 KB
                 _frozen_importlib.ModuleSpec |        1337 |     62.67 KB
                                enum.EnumMeta |          53 |     55.16 KB
  _frozen_importlib_external.SourceFileLoader |        1143 |     53.58 KB
                            member_descriptor |         810 |     50.62 KB
       ctypes.CDLL.__init__.<locals>._FuncPtr |         255 |     47.81 KB
                                        float |        1999 |     46.85 KB
                            inspect.Parameter |         731 |     45.69 KB
                           function (ov_wrap) |         319 |     44.86 KB
                          function (<lambda>) |         292 |     41.06 KB
                       _ctypes.PyCPointerType |          39 |     40.18 KB
                        _ctypes.PyCStructType |          37 |     38.09 KB
                           function (__new__) |         265 |     37.27 KB
                          function (__call__) |         246 |     34.59 KB
                    _collections._tuplegetter |         730 |     34.22 KB
                       function (method_impl) |         243 |     34.17 KB

new
                                        types |   # objects |   total size
============================================= | =========== | ============
                                         list |          11 |      3.05 MB
                                          str |       30598 |      2.63 MB
                                numpy.ndarray |          21 |      2.05 MB
                                         dict |          46 |    873.06 KB
                                         type |          13 |      5.18 KB
                           wrapper_descriptor |          26 |      1.83 KB
                 returnn.tensor.tensor.Tensor |          15 |      1.64 KB
                       returnn.tensor.dim.Dim |          12 |      1.12 KB
                        weakref.ReferenceType |          13 |    936     B
                                          int |          22 |    668     B
                                          set |           3 |    648     B
                                        tuple |           9 |    480     B
        returnn.tensor.tensor_dict.TensorDict |           3 |    144     B
          returnn.tensor._dim_extra._DimExtra |           3 |    144     B
  returnn.datasets.util.vocabulary.Vocabulary |           3 |    144     B
                     functools._lru_list_elem |           1 |     56     B
gone
  types |   # objects |   total size
======= | =========== | ============

@albertz commented Oct 10, 2025

By count it's mostly strs that are allocated. This is from pympler, but I think I need to use an even more advanced profiler to find out where these strings come from.

But if you sum those allocations together, this is much less than 2GB? (Also less than 400MB?) Does it really count everything? Or is this now a different setup? Is this then still a relevant setup?

@albertz commented Oct 10, 2025

I am assuming the wrapped HDF/MetaDataset for now, but it would be good to have proof.

What memory are you measuring? I thought you measure only the subproc of PostprocessingDataset? There should be no HDF/MetaDataset in there?

@NeoLegends commented Oct 10, 2025

But if you sum those allocations together, this is much less than 2GB? (Also less than 400MB?) Does it really count everything? Or is this now a different setup? Is this then still a relevant setup?

The difference is just in the timing of the measurements. I use buf_size=1 in the worker, which eventually results in 400MB peak RSS usage, but only after leaving it running for a while (5min or so). Here I am taking two measurements that are just 100 seqs apart from each other, so just a few seconds. I am doing this because if I measure across more segments, the diff process between the two measurements takes too long (more than 10min) because there are so many new objects. I think what's important here is that there are 30k new str objects that are not yet collected, even across just 100 seqs.

What memory are you measuring? I thought you measure only the subproc of PostprocessingDataset? There should be no HDF/MetaDataset in there?

Ah, of course, yes. I am measuring only the postprocessor worker there. I mixed things up in my head when I wrote this. ...still feeling a bit under the weather today.
