expand on the feedback by Tamar

markos · markos · commit 01f04718a1a6 · 2024-01-15T12:21:31.000+02:00
diff --git a/content/learning-paths/cross-platform/loop-reflowing/autovectorization-and-restrict.md b/content/learning-paths/cross-platform/loop-reflowing/autovectorization-and-restrict.md
@@ -80,6 +80,8 @@ addvec:
 
 As you can see, the compiler has enabled autovectorization for this algorithm and the output is identical to the hand-written function! Strictly speaking, you don't even need `restrict` in such a trivial loop as it will be autovectorized anyway when certain optimization levels are added to the compilation flags (`-O2` for clang, `-O3` for gcc). However, the use of restrict simplifies the code and generates SIMD code similar to the hand written version in `addvec_neon.c`.
 
-This is just a trivial example though and not all loops can be autovectorized that easily by the compiler. 
+The reason lies in how each compiler decides whether to autovectorize a loop. For each candidate loop, the compiler weighs the estimated performance gain against a cost model, which depends on many parameters, including the optimization level set in the compilation flags. The cost model estimates how much the vectorized code grows in size and whether the performance gain is enough to outweigh that increase. Based on this estimate, the compiler either uses the vectorized code or falls back to a 'safer' scalar implementation. This decision is not set in stone, however, and it is continually re-evaluated during compiler development.
+
+A detailed analysis is beyond the scope of this Learning Path. This was just one trivial example to demonstrate how autovectorization can be triggered by a flag.
 
 You will see some more advanced examples in the next sections.