Commit 06dcafd
Integrate Max Stream Size Chunking in Velox Writer (facebookincubator#249)
Summary:
This is the last feature of the new chunking policy described in this [doc](https://fburl.com/gdoc/gkdwwju1). It breaks large streams into multiple chunks of at most `maxStreamChunkRawSize` each, which protects the reader from having to materialize huge chunks. The previous diff added StreamData support for this; this diff integrates it with the VeloxWriter. With this change, when memory pressure is detected, we:
1. Chunk large streams above `maxStreamChunkRawSize`, retaining stream data below the limit.
2. If there is still memory pressure after the first step, chunk streams with size above `minStreamChunkRawSize`.
During stripe flush, we chunk all remaining data, breaking down streams above `maxStreamChunkRawSize` into smaller chunks.
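As a rough illustration of how a stream might be cut into chunks of at most `maxStreamChunkRawSize`, here is a minimal sketch. `ChunkPlan`, `planChunks`, and the `flushAll` flag are hypothetical names, not the Nimble API. It assumes that under memory pressure only full-size chunks are emitted and the tail stays buffered, while a stripe flush emits everything. The second pass based on `minStreamChunkRawSize` is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types and names for illustration only; not the Nimble API.
struct ChunkPlan {
  std::vector<uint64_t> chunkSizes; // raw bytes in each emitted chunk
  uint64_t retained{0};             // raw bytes left buffered in the stream
};

// Plans chunks of at most maxChunkRawSize raw bytes for a single stream.
// flushAll == false models the memory pressure pass (assumption: only streams
// above the limit are touched, and their tail below the limit is retained).
// flushAll == true models a stripe flush, where all remaining data is chunked.
ChunkPlan planChunks(uint64_t rawSize, uint64_t maxChunkRawSize, bool flushAll) {
  assert(maxChunkRawSize > 0);
  ChunkPlan plan;
  uint64_t remaining = rawSize;
  if (flushAll || rawSize > maxChunkRawSize) {
    // Emit as many full-size chunks as the stream can supply.
    while (remaining >= maxChunkRawSize) {
      plan.chunkSizes.push_back(maxChunkRawSize);
      remaining -= maxChunkRawSize;
    }
  }
  if (flushAll && remaining > 0) {
    // On stripe flush the tail is emitted as a final, smaller chunk.
    plan.chunkSizes.push_back(remaining);
    remaining = 0;
  }
  plan.retained = remaining;
  return plan;
}
```

With a limit of 100 bytes, a 250-byte stream would yield two 100-byte chunks and keep 50 bytes buffered under memory pressure; on stripe flush the same stream would yield three chunks and keep nothing.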
---
The general chunking policy has two phases:
## **Phase 1 - Memory Pressure Management (shouldChunk)**
The policy monitors total in-memory data size:
- When memory usage exceeds the maximum threshold, it initiates chunking to reduce memory footprint while continuing data ingestion.
- When previous chunking attempts succeeded and memory remains above the minimum threshold, it continues chunking to further reduce memory usage.
- When chunking fails to reduce memory usage effectively and memory stays above the minimum threshold, it forces a full stripe flush to guarantee memory relief.
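The three rules above can be read as a small decision function. The sketch below is one possible interpretation, not the actual `shouldChunk` implementation: the enum names, the `decide` function, and the way the outcome of the previous chunking attempt is tracked are all assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical names for illustration only; not the Nimble API.
enum class PreviousAttempt { None, Succeeded, Failed };
enum class Decision { Continue, Chunk, FlushStripe };

// Decides what to do given the current in-memory size and both thresholds.
// Assumption: the caller records whether the last chunking attempt reduced
// memory, and resets it to None once the pressure episode is over.
Decision decide(
    uint64_t memoryBytes,
    uint64_t minThresholdBytes,
    uint64_t maxThresholdBytes,
    PreviousAttempt previous) {
  if (previous == PreviousAttempt::None) {
    // No chunking in progress: start only when the maximum is exceeded.
    return memoryBytes > maxThresholdBytes ? Decision::Chunk
                                           : Decision::Continue;
  }
  if (memoryBytes <= minThresholdBytes) {
    // Memory is back under the minimum threshold; nothing more to do.
    return Decision::Continue;
  }
  // Still above the minimum: keep chunking while it helps, else flush.
  return previous == PreviousAttempt::Succeeded ? Decision::Chunk
                                                : Decision::FlushStripe;
}
```

Checking the previous attempt before the maximum threshold means a failed attempt leads to a stripe flush even when memory is above the maximum, which matches the intent of guaranteeing memory relief.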
## **Phase 2 - Storage Size Optimization (shouldFlush)**
The policy implements compression-aware stripe size prediction:
- Calculates the anticipated final compressed stripe size by applying the estimated compression ratio to unencoded data.
- Triggers stripe flush when the predicted compressed size reaches the target stripe size threshold.
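The prediction can be sketched as a single comparison. This is an illustrative sketch, not the actual `shouldFlush` code: the function and parameter names are hypothetical, the compression ratio is assumed to mean compressed size divided by raw size, and counting already-encoded bytes at their actual size is an added assumption.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only; not the Nimble API.
// encodedBytes: size of data already encoded in this stripe (assumption).
// unencodedRawBytes: raw size of data still buffered and not yet encoded.
// estimatedCompressionRatio: compressed size / raw size (assumed definition).
bool shouldFlushStripe(
    uint64_t encodedBytes,
    uint64_t unencodedRawBytes,
    double estimatedCompressionRatio,
    uint64_t targetStripeSizeBytes) {
  // Predict the final compressed stripe size from the unencoded data.
  const double predictedBytes = static_cast<double>(encodedBytes) +
      static_cast<double>(unencodedRawBytes) * estimatedCompressionRatio;
  // Flush once the prediction reaches the target stripe size.
  return predictedBytes >= static_cast<double>(targetStripeSizeBytes);
}
```

For example, with a target of 128 units and an estimated ratio of 0.25, 400 units of unencoded data predict a 100-unit stripe and do not trigger a flush, while 600 units predict 150 and do.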
Differential Revision: D82175496
1 parent 9a7c755 commit 06dcafd
4 files changed (+105, -68 lines) in:
- dwio/nimble/velox
- tests