@felixcremer
Collaborator

This currently does not use the GDALBufferBand, just plain cat. I would like to test it with actual large data to see how it performs, but unfortunately I don't currently have access to the data at EODC.

We could also improve performance by loading the data with ArchGDAL directly instead of through Cube, but at the moment I don't think this is a bottleneck, because loading takes only 1 second out of a few minutes. We still need a proper benchmark of the IO times with this approach to see what needs further optimization.
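
A minimal sketch of what such an IO benchmark could look like, using only the Julia standard library. `time_reads` is a hypothetical helper and not part of RQADeforestation, YAXArrays, or ArchGDAL, and the temporary file is only a stand-in for the real scenes. The actual loader (via Cube or ArchGDAL) would be passed in as the zero-argument function.

```julia
using Statistics  # stdlib, provides `median`

# Hypothetical helper: call the zero-argument `read_fn` repeatedly
# and summarise the wall-clock time of each call in seconds.
function time_reads(read_fn; nrep::Int = 5)
    read_fn()  # warm-up call so that compilation time is not measured
    times = [@elapsed(read_fn()) for _ in 1:nrep]
    return (min = minimum(times), median = median(times), max = maximum(times))
end

# Stand-in for the real IO: write a temporary file and read it back.
path, io = mktemp()
write(io, rand(Float32, 1_000, 1_000))
close(io)

stats = time_reads(() -> read(path); nrep = 5)
println("read times in seconds: ", stats)

rm(path)  # clean up the temporary file
```

Comparing `stats` for the two loading paths against the total run time would show whether loading is worth optimizing. Note that repeated reads of the same file are likely served from the OS page cache, so the first read of cold data can be much slower than these numbers suggest.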

@felixcremer
Collaborator Author

felixcremer commented Feb 15, 2025

Unfortunately, the processing of the data doesn't finish when I aggregate the different scenes with DAE.
The computation runs to 99% and then blocks. The error below appeared after I interrupted the process manually.
While running the processing with the test data, threaded and distributed on my laptop, I saw a lock conflict in the time report.

┌ Warning: There are still cache misses                                                                                                                                                       
└ @ YAXArrays.DAT /mnt/felix1/worldmap/dev/YAXArrays/src/DAT/DAT.jl:1099                                                                                                                      
Progress:  99%|██████████████████████████████████████████████████████▌|  ETA: 0:00:10^C^CERROR: InterruptException:                                                                           
Stacktrace:                                                                                                                                                                                   
  [1] try_yieldto(undo::typeof(Base.ensure_rescheduled))                                                                                                                                      
    @ Base ./task.jl:958                                                                                                                                                                      
  [2] wait()                                                                                                                                                                                  
    @ Base ./task.jl:1022                                                                                                                                                                     
  [3] wait(c::Base.GenericCondition{Base.Threads.SpinLock}; first::Bool)                                                                                                                      
    @ Base ./condition.jl:130                                                                                                                                                                 
  [4] wait                                                                                                                                                                                    
    @ ./condition.jl:125 [inlined]                                                                                                                                                            
  [5] _wait(t::Task)                                                                                                                                                                          
    @ Base ./task.jl:328                                                                                                                                                                      
  [6] wait(t::Task)                                                                                                                                                                           
    @ Base ./task.jl:368                                                                                                                                                                      
  [7] fetch                                                                                                                                                                                   
    @ ./task.jl:390 [inlined]                                                                                                                                                                 
  [8] (::Base.var"#1170#1172")(x::Task)                                                                                                                                                       
    @ Base ./asyncmap.jl:171                                                                                                                                                                  
  [9] foreach(f::Base.var"#1170#1172", itr::Vector{Any})                                                                                                                                      
    @ Base ./abstractarray.jl:3187                                                                                                                                                            
 [10] maptwice(wrapped_f::Function, chnl::Channel{Any}, worker_tasks::Vector{Any}, c::DiskArrays.GridChunks{2, Tuple{…}})                                                                     
    @ Base ./asyncmap.jl:171                                                                                                                                                                  
 [11] wrap_n_exec_twice                                                                                                                                                                       
    @ ./asyncmap.jl:147 [inlined]                                                                                                                                                             
 [12] #async_usemap#1155                                                                                                                                                                      
    @ ./asyncmap.jl:97 [inlined]                                                                                                                                                              
 [13] async_usemap                                                                                                                                                                            
    @ ./asyncmap.jl:78 [inlined]
 [14] #asyncmap#1154                                                                                                                                                                   
    @ ./asyncmap.jl:75 [inlined]
 [15] asyncmap                                 
    @ ./asyncmap.jl:74 [inlined]
 [16] pmap(f::Function, p::Distributed.CachingPool, c::DiskArrays.GridChunks{…}; distributed::Bool, batch_size::Int64, on_error::Nothing, retry_delays::Vector{…}, retry_check::Nothing)
    @ Distributed /mnt/felix1/.julia/juliaup/julia-1.11.3+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/pmap.jl:126
 [17] pmap(f::Function, p::Distributed.CachingPool, c::DiskArrays.GridChunks{2, Tuple{…}})     
    @ Distributed /mnt/felix1/.julia/juliaup/julia-1.11.3+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/pmap.jl:99
 [18] macro expansion                          
    @ ~/.julia/packages/ProgressMeter/kVZZH/src/ProgressMeter.jl:1049 [inlined]                
 [19] macro expansion                          
    @ ./task.jl:498 [inlined]
 [20] macro expansion                          
    @ ~/.julia/packages/ProgressMeter/kVZZH/src/ProgressMeter.jl:1048 [inlined]                
 [21] macro expansion                          
    @ ./task.jl:498 [inlined]
 [22] progress_map(::Function, ::Vararg{…}; mapfun::typeof(Distributed.pmap), progress::ProgressMeter.Progress, channel_bufflen::Int64, kwargs::@Kwargs{})
    @ ProgressMeter ~/.julia/packages/ProgressMeter/kVZZH/src/ProgressMeter.jl:1041            
 [23] progress_pmap(::Function, ::Vararg{Any}; kwargs::@Kwargs{})                              
    @ ProgressMeter ~/.julia/packages/ProgressMeter/kVZZH/src/ProgressMeter.jl:1066            
 [24] runLoop(dc::YAXArrays.DAT.DATConfig{1, 1}, showprog::Bool)                               
    @ YAXArrays.DAT /mnt/felix1/worldmap/dev/YAXArrays/src/DAT/DAT.jl:711                      
 [25] mapCube(fu::typeof(rqatrend), cdata::Tuple{…}, addargs::Float64; max_cache::Float64, indims::InDims, outdims::OutDims, inplace::Bool, ispar::Bool, debug::Bool, include_loopvars::Bool, showprog::Bool, irregular_loopranges::Bool, nthreads::Dict{…}, loopchunksize::Dict{…}, do_gc::Bool, kwargs::@Kwargs{})
    @ YAXArrays.DAT /mnt/felix1/worldmap/dev/YAXArrays/src/DAT/DAT.jl:498                      
 [26] mapCube                                  
    @ /mnt/felix1/worldmap/dev/YAXArrays/src/DAT/DAT.jl:311 [inlined]                          
 [27] #rqatrend#29                             
    @ /mnt/felix1/worldmap/dev/RQADeforestation/src/rqatrend.jl:12 [inlined]                   
 [28] macro expansion                          
    @ ./timing.jl:581 [inlined]
 [29] main(; tiles::Vector{…}, continent::String, indir::String, outdir::String, years::Vector{…}, polarisation::String, orbit::String, threshold::Float64, folders::Vector{…})
    @ RQADeforestation /mnt/felix1/worldmap/dev/RQADeforestation/src/main.jl:110               
 [30] top-level scope                          
    @ REPL[14]:1                               
Some type information was truncated. Use `show(err)` to see complete types.   

@felixcremer felixcremer merged commit 2b1ad4f into main Feb 24, 2025
3 checks passed
@felixcremer felixcremer deleted the fc/stackgroups branch February 24, 2025 13:47