Description
I get errors about data being left on the device when running a Func on the GPU:
    main :: IO ()
    main = case gpuTarget of
      Nothing -> putStrLn "no GPU target found"
      Just gpu -> do
        runOnGpu <- compileForTarget gpu $ \(buffer "original" -> original) -> do
          y <- mkVar "y"
          x <- mkVar "x"
          c <- mkVar "c"
          brightened <- define "brightened" (c, x, y) $ original ! (c, x, y) + 50
          tileGpu brightened x y 16 16
          return brightened
        imgIn :: MutableImage RealWorld PixelRGB8 <- newMutableImage 256 256
        imgOut :: MutableImage RealWorld PixelRGB8 <- newMutableImage 256 256
        withHalideBuffer @3 @Word8 imgIn $ \imgInPtr -> do
          withHalideBuffer @3 @Word8 imgOut $ \imgOutPtr -> do
            runOnGpu imgInPtr imgOutPtr
        freezeImage imgOut >>= writePng "brightened.png"

    tileGpu f x y w h = do
      xo <- mkVar "xo"
      yo <- mkVar "yo"
      xi <- mkVar "xi"
      yi <- mkVar "yi"
      split TailAuto x (xo, xi) w f
      split TailAuto y (yo, yi) h f
      reorder [xi, yi, xo, yo] f
      gpuBlocks DeviceDefaultGPU (xo, yo) f
      gpuThreads DeviceDefaultGPU (xi, yi) f
      return (xo, yo, xi, yi)

*** Exception: the Buffer still references data on the device; did you forget to call copyToHost?

Call copyToHost:
    tileGpu brightened x y 16 16
    copyToHost brightened
    return brightened

*** Exception: CppStdException e "Error: Func brightened$13 is scheduled as copy_to_host/device, but has value: ((uint8)original_im$13(c, x, y) + (uint8)50)\nExpected a single call to another Func with matching dimensionality and argument order.\n"(Just "std::runtime_error")

Replace with a wrapper Func:
    tileGpu brightened x y 16 16
    wrapper <- define "wrapper" (c, x, y) $ brightened ! (c, x, y)
    return wrapper

*** Exception: CppStdException e "Error: Cannot parallelize dimension x.xi of function brightened$20 because the function is scheduled inline.\n"(Just "std::runtime_error")

Add computeRoot brightened, and call copyToHost on the wrapper:
    tileGpu brightened x y 16 16
    computeRoot brightened
    wrapper <- define "wrapper" (c, x, y) $ brightened ! (c, x, y)
    copyToHost wrapper
    return wrapper

*** Exception: the Buffer still references data on the device; did you forget to call copyToHost?

Same error as before. Let's try withCopiedToHost which operates on the buffer:
    withHalideBuffer @3 @Word8 imgIn $ \imgInPtr -> do
      withHalideBuffer @3 @Word8 imgOut $ \imgOutPtr -> do
        runOnGpu imgInPtr imgOutPtr
        withCopiedToHost imgOutPtr $ return ()

*** Exception: the Buffer still references data on the device; did you forget to call copyToHost?

Same error as before.
Using realizeOnTarget does not throw an error, but the only way to get the data back seems to be via peekToList (slow):
    imgIn :: MutableImage RealWorld PixelRGB8 <- newMutableImage 256 256
    imgOut :: MutableImage RealWorld PixelRGB8 <- newMutableImage 256 256
    asBufferParam imgIn \original -> do
      y <- mkVar "y"
      x <- mkVar "x"
      c <- mkVar "c"
      brightened <- define "brightened" (c, x, y) $ original ! (c, x, y) + 100
      tileGpu brightened x y 16 16
      realizeOnTarget gpu brightened [3, 256, 256] $ \brightenedPtr -> do
        withCopiedToHost brightenedPtr do
          print =<< peekToList brightenedPtr
          -- could write list back to imgOut, but would be slow

Here's the definition of realizeOnTarget, which calls allocaBuffer internally:
halide-haskell/src/Language/Halide/Func.hs, lines 922 to 932 in 7e88a54:

    realizeOnTarget target func shape action =
      withFunc func $ \func' ->
        withCxxTarget target $ \target' ->
          allocaBuffer target shape $ \buf -> do
            let raw = castPtr buf
            [C.throwBlock| void {
              handle_halide_exceptions([=](){
                $(Halide::Func* func')->realize($(halide_buffer_t* raw), *$(const Halide::Target* target'));
              });
            } |]
            action buf

Perhaps there could be another function like realizeOnTargetGivenBuffer which takes a Ptr (HalideBuffer n a) to use instead of allocating one itself. I think that would enable writing directly to imgOut.
Am I going about this in the right way, or am I missing something fundamental?
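
To make the API shape I'm asking about concrete, here is a minimal sketch that does not use Halide at all. It only contrasts the two calling conventions using plain `Foreign` pointers from base. All names in it (`brighten`, `realizeAllocating`, `realizeInto`) are hypothetical stand-ins, not halide-haskell functions, and the "pipeline" is a CPU loop standing in for the GPU kernel.

```haskell
module Main (main) where

import Data.Word (Word8)
import Foreign.Marshal.Array (allocaArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff)

-- Hypothetical stand-in for the pipeline: dst[i] = src[i] + 100 (wrapping).
brighten :: Int -> Ptr Word8 -> Ptr Word8 -> IO ()
brighten n src dst =
  mapM_ (\i -> peekElemOff src i >>= pokeElemOff dst i . (+ 100)) [0 .. n - 1]

-- Shape of the existing call: the output buffer is allocated internally and
-- is only valid inside the callback, so results must be copied out there.
realizeAllocating :: Int -> Ptr Word8 -> (Ptr Word8 -> IO a) -> IO a
realizeAllocating n src action =
  allocaArray n $ \dst -> do
    brighten n src dst
    action dst

-- Shape of the proposed call: the caller supplies the destination buffer,
-- so the result lands directly in memory the caller already owns.
realizeInto :: Int -> Ptr Word8 -> Ptr Word8 -> IO ()
realizeInto = brighten

main :: IO ()
main =
  allocaArray 4 $ \src ->
    allocaArray 4 $ \dst -> do
      pokeArray src [0, 1, 2, 3 :: Word8]
      -- Existing shape: copy the result out as a list inside the callback.
      viaList <- realizeAllocating 4 src (peekArray 4)
      -- Proposed shape: write straight into the caller's buffer.
      realizeInto 4 src dst
      direct <- peekArray 4 dst
      print (viaList == direct)
```

In the real library the second form would presumably take the `Ptr (HalideBuffer n a)` from `withHalideBuffer` in place of `dst`, which is what would let the result go straight into imgOut without going through a list.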