Commit 5eafc20
committed
Fix undefined behavior in SmallString.withUTF8
withUTF8 currently vends a typed UInt8 pointer to the underlying
SmallString. That pointer type differs from SmallString's
representation. It should simply vend a raw pointer, which would be
both type safe and convenient for UTF8 data. However, since this
method is already @inlinable, I added calls to bindMemory to prevent
the optimizer from reasoning about access to the typed pointer that we
vend.
rdar://67983613 (Undefinied behavior in SmallString.withUTF8 is miscompiled)
Additional commentary:
SmallString creates a situation where there are two types, the
in-memory type, (UInt64, UInt64), vs. the element type,
UInt8. `UnsafePointer<T>` specifies the in-memory type of the pointee,
because that's how C works. If you want to specify an element type,
not the in-memory type, then you need to use something other than
UnsafePointer to view the memory. A trivial `BufferView<UInt8>` would
be fine, although, frankly, I think UnsafeRawPointer is a perfectly
good type on its own for UTF8 bytes.
Unfortunately, a lot of the UTF8 helper code is ABI-exposed, so to
work around this, we need to insert calls to bindMemory at strategic
points to avoid undefined behavior. This is high-risk and can
negatively affect performance. So far, I was able to resolve the
regressions in our microbenchmarks just by tweaking the inliner.1 parent 2544806 commit 5eafc20
File tree
3 files changed
+31
-15
lines changed- stdlib/public/core
3 files changed
+31
-15
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
197 | 197 | | |
198 | 198 | | |
199 | 199 | | |
| 200 | + | |
200 | 201 | | |
201 | | - | |
202 | | - | |
203 | | - | |
204 | | - | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
205 | 212 | | |
206 | 213 | | |
207 | 214 | | |
208 | 215 | | |
209 | 216 | | |
210 | 217 | | |
211 | 218 | | |
212 | | - | |
| 219 | + | |
213 | 220 | | |
214 | 221 | | |
215 | 222 | | |
216 | | - | |
217 | | - | |
218 | | - | |
219 | | - | |
| 223 | + | |
220 | 224 | | |
221 | 225 | | |
222 | 226 | | |
| |||
273 | 277 | | |
274 | 278 | | |
275 | 279 | | |
276 | | - | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
277 | 291 | | |
278 | 292 | | |
279 | 293 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
169 | 169 | | |
170 | 170 | | |
171 | 171 | | |
172 | | - | |
| 172 | + | |
173 | 173 | | |
174 | 174 | | |
175 | 175 | | |
| |||
193 | 193 | | |
194 | 194 | | |
195 | 195 | | |
196 | | - | |
| 196 | + | |
197 | 197 | | |
198 | 198 | | |
199 | 199 | | |
| |||
206 | 206 | | |
207 | 207 | | |
208 | 208 | | |
209 | | - | |
| 209 | + | |
210 | 210 | | |
211 | 211 | | |
212 | 212 | | |
| |||
340 | 340 | | |
341 | 341 | | |
342 | 342 | | |
343 | | - | |
| 343 | + | |
344 | 344 | | |
345 | 345 | | |
346 | 346 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
250 | 250 | | |
251 | 251 | | |
252 | 252 | | |
253 | | - | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
254 | 256 | | |
255 | 257 | | |
256 | 258 | | |
| |||
0 commit comments