-
I must admit I haven't been able to test this, since there is no way in either C# or MSIL to create an array without initializing it to 0. We do a lot of number crunching, and while processing data we need to allocate a lot of memory, mostly 512x512 images that are linearized into 1D arrays. When I allocate an array in C#, it is always initialized to 0, even though I know that immediately after creating the array I will set its contents to a defined value other than 0. In a regular scenario this might be practical, but in our environment I assume that zeroing the array is not negligible in terms of performance when a lot of data is involved. I would be very interested in the performance without zeroing the array. And if possible - and here comes the proposal -
I understand this might be a little hard in C#, but it would be nice to have an extension in MSIL for newarr, either as something like newarr.nz or as an optional argument.
Replies: 15 comments
-
Dupe of/related to #868.
-
@Joe4evr: Yes, it's related to #868, but #868 only covers stackalloc (MSIL: localloc), whereas I need it for MSIL newarr. Do you think I should close this one and refer to #868?
-
While that issue is raised about |
-
@msedi consider using the System.Buffers package for this.
-
@scalablecory Do you mean the |
-
My questions:
|
-
You are talking about performance at a level where you think zero-initialization is a problem? You had better move to C++, or write in C for the GPU and take advantage of parallelism. C# does not serve you very well.
-
@svick no, I meant the package I linked to. The workload @msedi describes is exactly what the It will reduce GC pressure and has an option to not clear the arrays when you're done with them. |
-
@svick: Thanks for the suggestions. I'll try to answer as best I can:
In fact I'm not sure, because there was no way to try. newarr (MSIL) and new (C#) always do the initialization, and I couldn't get around this. If someone has the option to try, that would be great.
No, indeed not. The pattern is completely random. To make it a little clearer, I'm talking about radiological images that have a size of 512x512 or much higher, up to 4096x4096. Since there is also random noise on them, even if the baseline pattern doesn't change, the noise still does.
Depends on question 1. If zeroing the array has a tremendous impact and calling the function doesn't, that would of course be OK.
As you already said, Span is created via stackalloc, but the memory sizes we are dealing with are beyond the stack size. I was also thinking about the unmanaged Marshal.AllocHGlobal approach, but all our mathematical routines are based on conventional C# arrays, so I would have to rewrite all array methods to work with pointers as well. That brings up another issue that comes up fairly often: there is currently no mechanism in C# to restrict a type parameter to numeric types. That's one reason why all my math/vector algebra code is very long. At least for some basic methods (like Add), I wrote my own implementation, which is currently not type-safe because I'm not able to restrict T to a numeric type. I will have a look at Span and Memory.
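To illustrate why Span might avoid a pointer-based rewrite, here is a minimal sketch, assuming a runtime or package that provides System.Span<T> (for example .NET Core 2.1 or later, or the System.Memory package) and a compiler with top-level statements. The Add helper and its parameter names are hypothetical and not taken from the code discussed in this thread.

```csharp
using System;

// Hypothetical element-wise add, written once against spans.
// The same method accepts managed arrays, slices of arrays,
// stackalloc buffers and spans wrapped around unmanaged memory.
static void Add(ReadOnlySpan<float> left, ReadOnlySpan<float> right, Span<float> result)
{
    if (left.Length != right.Length || left.Length != result.Length)
        throw new ArgumentException("All spans must have the same length.");

    for (int i = 0; i < result.Length; i++)
        result[i] = left[i] + right[i];
}

float[] a = { 1f, 2f, 3f };
float[] b = { 10f, 20f, 30f };
float[] sum = new float[3];

// Arrays convert implicitly to Span<float> / ReadOnlySpan<float>.
Add(a, b, sum);
Console.WriteLine(string.Join(", ", sum));
```

Because arrays convert implicitly to spans, existing call sites that pass float[] would not need to change under this assumption; the numeric-constraint limitation mentioned above is not addressed by this sketch.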
-
@MkazemAkhgary: We are already using CUDA and C++/MKL for some high-performance things. But in my opinion some things can still be optimized in C#, and I'm not very happy about moving to other languages only because of a few small restrictions. Using CUDA is also causing a lot of pain, since the CUDA compilers are tied to specific C++ compilers and platforms. With every new CUDA version we have to exchange GPUs at the customer's site, only because the compiler toolkits are not backward compatible. With C#/.NET I can make sure that it works for a longer time.... ;-)
-
@scalablecory: Thanks. I will have a look at the ArrayPool and will return here if I have further details.
-
@svick: Do you know who is responsible for ArrayPool, or where I can ask questions or make suggestions about it? BTW: the ArrayPool always returns an array whose length is a multiple of some size, not the size I requested. From the internals I understand the reason, but for the "external" user it would be helpful to get something like an ArraySegment back, or better, a derivation of it. The only problem is that ArraySegment is slow. So maybe Span or Memory might help?
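A minimal sketch of how the oversized rented array could be hidden behind a Span, assuming the System.Buffers ArrayPool<T> API and a compiler with top-level statements. The image size comes from this thread; the variable names and the fill value are illustrative only.

```csharp
using System;
using System.Buffers;

// Image size from the thread: 512x512 pixels, linearized to 1D.
const int pixelCount = 512 * 512;

// Rent may return an array longer than requested, and its contents
// are whatever the previous user left behind (not guaranteed zeroed).
float[] rented = ArrayPool<float>.Shared.Rent(pixelCount);
try
{
    // View only the requested number of elements; the extra tail
    // of the rented array is never exposed to the math routines.
    Span<float> image = rented.AsSpan(0, pixelCount);

    // Overwrite every element with a defined value before reading.
    image.Fill(1.5f);

    Console.WriteLine($"requested {pixelCount}, rented {rented.Length}, view {image.Length}");
}
finally
{
    // clearArray: false skips zeroing the array when it goes back to the pool.
    ArrayPool<float>.Shared.Return(rented, clearArray: false);
}
```

The slice costs no allocation, which is the main difference from copying into an exactly sized array; the rented array must not be used after Return.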
-
Have you tried profiling your code? I think a profiler should be able to tell you how much time is spent in GC, and compare it with how much time is spent allocating the arrays (which includes the time it takes to zero-initialize them).
No,
I meant to avoid directly using pointers and instead rewrite your methods using
That would be the corefx repo, since that's where the code is.
-
@svick: Thanks for the info. The only problem is that Span and Memory seem to require C# 7.2 or C# 8.0, which I cannot use in production code yet, right?
-
It seems that the problem is now solved with .NET 5 and GC.AllocateUninitializedArray<T>.
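A minimal sketch of that API in use, assuming .NET 5 or later and a compiler with top-level statements. The image size comes from this thread; the fill value is illustrative only.

```csharp
using System;

// Image size from the thread: 512x512 pixels, linearized to 1D.
const int pixelCount = 512 * 512;

// The runtime is allowed to skip zeroing here, so the initial
// contents of the array are unspecified.
float[] image = GC.AllocateUninitializedArray<float>(pixelCount);

// Every element must be written before it is read.
Array.Fill(image, 1.5f);

Console.WriteLine(image.Length);
```

Whether this is measurably faster than new float[pixelCount] depends on the array size and the runtime, so it is worth profiling both variants on the real workload.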