max pool #110
Lol, the assertslow is there for a reason, since scalar indexing into a gpuarray is very slow and should not be done!
Ouch! The benchmarks don't include padding. I should have mentioned that. Sorry, my bad. I'll create a separate benchmark with padding.
src/pool.jl (Outdated)
@@ -0,0 +1,44 @@
import CUDAnative
That's not needed, right?
👍
src/pool.jl (Outdated)
pool = UInt32(pool)
stride = UInt32(stride)
out = similar(b)
out = out[1:(div(Asize[1] - pool, stride) + 1), 1:(div(Asize[2] - pool, stride) + 1), :, :]
You could just do similar(b, outsize), no?
Thanks, I was unaware of this. It should be similar(b, outSize...) perhaps. Also, outSize needs to be determined before similar is called.
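A minimal sketch of the suggestion above, using a plain CPU Array rather than a GPU array so it runs without any packages. The helper name pooled_size and the 8×8×3×2 input are hypothetical and chosen only for illustration; the size formula is the one from the diff (no padding).

```julia
# Hypothetical helper: output size of an unpadded pooling window,
# using the same div(...) + 1 formula as the diff above.
pooled_size(A, pool, stride) = (
    div(size(A, 1) - pool, stride) + 1,
    div(size(A, 2) - pool, stride) + 1,
    size(A, 3),
    size(A, 4),
)

A = rand(Float32, 8, 8, 3, 2)   # width × height × channels × batch (assumed layout)
outSize = pooled_size(A, 2, 2)  # must be computed before allocating
out = similar(A, outSize...)    # uninitialized array with A's element type
```

This allocates the output at its final size directly, instead of allocating a full-size copy and then slicing it. For Base arrays, similar accepts the dimensions either as a tuple or splatted.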
Updated. Thank you @SimonDanisch for PR #111 and commit 1e1104e.
39e7783 to fef2421
An implementation of maxpool. Here's a sample benchmark (CPU vs. GPU): https://gist.github.com/americast/95358d972647adf5c7ebcde7c58db51f

Tests were failing due to a "getindex is disabled" error. I have made a small change in src/indexing.jl as a workaround.

Thanks.
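For reference, the operation itself can be sketched on the CPU as below. This is not the GPU kernel from this PR; the function name maxpool_cpu is hypothetical, and the sketch assumes a square window, no padding, and a width × height × channels × batch layout.

```julia
# CPU reference sketch of max pooling (square window, no padding).
# Not the PR's GPU implementation; names here are illustrative only.
function maxpool_cpu(A::AbstractArray{T,4}, pool::Integer, stride::Integer) where {T}
    w = div(size(A, 1) - pool, stride) + 1
    h = div(size(A, 2) - pool, stride) + 1
    out = similar(A, w, h, size(A, 3), size(A, 4))
    for n in 1:size(A, 4), c in 1:size(A, 3), j in 1:h, i in 1:w
        x0 = (i - 1) * stride + 1   # first row of the window
        y0 = (j - 1) * stride + 1   # first column of the window
        out[i, j, c, n] = maximum(view(A, x0:x0+pool-1, y0:y0+pool-1, c, n))
    end
    return out
end

A = reshape(Float32.(1:16), 4, 4, 1, 1)
pooled = maxpool_cpu(A, 2, 2)   # 2×2×1×1 array of window maxima
```

A reference like this is useful mainly for checking GPU results against a known-correct output on small inputs.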