Commit 922f21f
Refactor base_os_version Logic for Task Scheduling Performance (#5031)
## Motivation

The previous implementation for determining the `base_os_version` for
new tasks introduced a significant performance bottleneck. The
`add_task` function (and its underlying `bulk_add_tasks` wrapper)
queried the Datastore for the `Job` and `OssFuzzProject` entities *for
each individual task* being created.

In high-throughput scenarios, such as the `schedule_fuzz.py` cron job,
which can schedule upwards of 300,000 tasks at once, this behavior
results in an equivalent number of Datastore queries. This "N+1" query
problem leads to extreme slowness, high operational costs, and a
significant risk of timeouts and failed task creation.

An alternative approach using a single batch query with an `IN` clause
was considered. However, this does not scale to a very large number of
entities either: it could hit Datastore limits or result in an
unacceptably slow query.

This PR refactors the logic to be far more efficient and scalable.
## Solution
The core idea of this change is to move the responsibility of
determining the `base_os_version` to the point where the necessary
information is already available, thus eliminating redundant Datastore
lookups.
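
As a rough illustration of why this matters, the sketch below counts reads against a hypothetical in-memory `FakeDatastore`. None of these names come from ClusterFuzz; they only model the per-task lookup pattern versus reusing entities that were loaded once.

```python
class FakeDatastore:
  """Hypothetical stand-in for Datastore that counts every read."""

  def __init__(self, jobs):
    # Maps job name -> base_os_version (or None).
    self._jobs = dict(jobs)
    self.reads = 0

  def get_job(self, name):
    self.reads += 1  # one point lookup
    return self._jobs.get(name)

  def get_all_jobs(self):
    self.reads += 1  # one query that returns every entity
    return dict(self._jobs)


def schedule_per_task(store, job_names):
  """Old pattern: one lookup for each task being created."""
  return [(name, store.get_job(name)) for name in job_names]


def schedule_preloaded(store, job_names):
  """New pattern: load the entities once, then resolve from memory."""
  jobs = store.get_all_jobs()
  return [(name, jobs.get(name)) for name in job_names]
```

In this toy model the preloaded variant still issues one read. In the scheduler described here that query is already made for other reasons, which is why the extra cost attributed to `base_os_version` drops to zero.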
1. **Logic moved to `schedule_fuzz.py`:**
The `schedule_fuzz.py` cron job already queries for all `Job` and
`OssFuzzProject` entities to perform its scheduling calculations. We now
leverage these in-memory entities to determine the correct
`base_os_version` *before* the `Task` object is created.
2. **`base_os_version` Precedence:**
The logic for selecting the OS version is now explicitly handled within
the schedulers (`OssfuzzFuzzTaskScheduler` and
`ChromeFuzzTaskScheduler`) with the following precedence:
   - Use `OssFuzzProject.base_os_version` if it exists.
   - Otherwise, use `Job.base_os_version` if it exists.
   - Otherwise, the value is `None`.
3. **Simplified Task Creation:**
The determined `base_os_version` is passed directly into the `Task`
constructor via the `extra_info` dictionary. This makes the `add_task`
and `bulk_add_tasks` functions in `tasks/__init__.py` "dumb" in this
regard; they no longer perform any Datastore queries for this purpose
and simply publish the tasks they are given.
4. **Reverting to `add_task`:**
The logic has been consolidated back into the `add_task` function,
removing the `bulk_add_tasks` implementation to simplify the call chain.
The `add_task` function now correctly handles the `base_os_version`
logic and uses `job.is_external()` for dispatching to `external_tasks`.
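
The precedence in item 2 and the hand-off through `extra_info` in item 3 can be sketched as follows. This is a minimal illustration, not the code from this commit: the dataclasses are simplified stand-ins for the real entities, `resolve_base_os_version` and `make_fuzz_task` are hypothetical helper names, the `'base_os_version'` key name is assumed, and "if it exists" is assumed to mean "is set to a non-empty value".

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Job:
  """Simplified stand-in for the Job entity (assumption)."""
  name: str
  base_os_version: Optional[str] = None


@dataclass
class OssFuzzProject:
  """Simplified stand-in for the OssFuzzProject entity (assumption)."""
  name: str
  base_os_version: Optional[str] = None


@dataclass
class Task:
  """Simplified stand-in for Task; only extra_info matters here."""
  command: str
  argument: str
  job: str
  extra_info: dict = field(default_factory=dict)


def resolve_base_os_version(job, project):
  """Project value first, then job value, else None."""
  if project is not None and project.base_os_version:
    return project.base_os_version
  if job is not None and job.base_os_version:
    return job.base_os_version
  return None


def make_fuzz_task(fuzzer, job, project=None):
  """Builds a task from entities that are already in memory."""
  base_os_version = resolve_base_os_version(job, project)
  # Assumed key name; the value may be None when neither entity sets it.
  extra_info = {'base_os_version': base_os_version}
  return Task('fuzz', fuzzer, job.name, extra_info=extra_info)
```

Because both entities are passed in, nothing in this path needs to touch the Datastore; the function that publishes the task only has to forward `extra_info` unchanged.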
## Benefits
* **Drastic Performance Improvement:** Reduces the number of Datastore
queries during the scheduling of fuzz tasks from potentially hundreds of
thousands to zero.
* **Enhanced Scalability:** The system can now schedule extremely large
batches of tasks efficiently without overwhelming the Datastore or
risking timeouts.
* **Improved Code Cohesion:** The logic for determining task properties
now resides within the scheduler, where the necessary context is already
present. This makes the `add_task` function simpler and more focused on
its core responsibility of enqueuing a task.

1 parent d9b2898, commit 922f21f
File tree

4 files changed (+242, -67 lines):

- src/clusterfuzz/_internal
  - base/tasks
  - cron
  - tests
    - appengine/handlers/cron
    - core/base/tasks
Lines changed: 157 additions & 0 deletions