Closed
Labels: enhancement, help wanted
Description
Comment:
This package currently requires more than 16 builds to be built manually, because they cannot be relied on to complete in time on the CIs.
Step 1: No more git clone
rgommers identified that one portion of the build process that takes time is cloning the repository. In my experience, cloning the 1.5GB repo can take up to 10 min on my powerful local machine, and I suspect it can take much longer on the CIs.
To avoid cloning, we will have to list out all the submodules manually, or make the dependencies installable from conda-forge.
I mostly got this working using a recursive script, which should help us keep the list maintained: #109
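As an illustration of what listing the submodules involves, here is a minimal sketch that reads the text of a `.gitmodules` file and returns each submodule's name, path and URL. It is not the script from #109: the function name `parse_gitmodules` is hypothetical, it handles a single file only, and it does not recurse into nested submodules or resolve the pinned commits.

```python
import re

# Matches section headers such as: [submodule "third_party/pybind11"]
SECTION_RE = re.compile(r'^\[submodule "(?P<name>[^"]+)"\]$')


def parse_gitmodules(text):
    """Return a list of dicts (name, path, url, ...) from .gitmodules text.

    Hypothetical helper for illustration; it only understands the simple
    `key = value` layout that git writes by default.
    """
    submodules = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        match = SECTION_RE.match(line)
        if match:
            current = {"name": match.group("name")}
            submodules.append(current)
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return submodules


if __name__ == "__main__":
    # Made-up sample input in the usual .gitmodules layout.
    sample = (
        '[submodule "third_party/example"]\n'
        "\tpath = third_party/example\n"
        "\turl = https://example.invalid/example.git\n"
    )
    for sub in parse_gitmodules(sample):
        print(sub["path"], sub["url"])
```

Each submodule can itself contain a `.gitmodules` file, so a complete listing would need to repeat this step for every entry, which is presumably why the script in #109 is recursive.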
Option 1: Split off Dependencies:
- clog seems to be a pretty low-level library that is configured through compile-time flags. I think it is best if we don't package it as a library; doing so would require some serious consideration in terms of performance. The packages that use it typically include its full source in their repository. The only problematic thing is that each package attempts to install the static library into the library path.
- QNNPACK has a build option that makes special provision for Caffe2's implementation of pthreadpool.
  - It seems to be problematic with pthreadpool on OSX.
- QNNPACK likely has two different implementations: the one vendored in ATen, and the one vendored in third_party.
- NNPACK has two different backends: one that seems to be generated by Python, but for some reason fp16.py cannot be found, and the other using psimd.
Option 2 - step 1: Build a libpytorch package or something
By setting BUILD_PYTHON=OFF in #112 we then end up with the following libraries in lib and include:
| Dependency | linux | mac | win | GPU Aware | PR |
|---|---|---|---|---|---|
| libasmjit | yes | yes | | | conda-forge/staged-recipes#19103 |
| libc10 | yes | yes | | | conda-forge/staged-recipes#19103 |
| libfbgemm | yes | yes | yes | | conda-forge/staged-recipes#19103 |
| libgloo | yes | yes | yes | | |
| libkineto | yes | yes | | | conda-forge/staged-recipes#19103 |
| libnnpack | yes | ??? | | | conda-forge/staged-recipes#19103 |
| libpytorch_qnnpack | yes | yes | | | conda-forge/staged-recipes#19103 |
| libqnnpack | yes | yes | | | conda-forge/staged-recipes#19103 |
| libtensorpipe | yes | | | | |
| libtorch | | | | | |
| libtorch_cpu | | | | | |
| libtorch_global_deps | | | | | |
| Header only | | | | | |
| ATen | | | | | |
| c10d | | | | | |
| caffe2 | | | | | |
| libnop | yes | yes | | | conda-forge/staged-recipes#19103 |
Option 2 - step 2: Depend on new ATen/libpytorch package
Compilation time progress
| platform | python | cuda | main | tar gh-109 | system deps |
|---|---|---|---|---|---|
| linux 64 | 3.7 | no | 1h57m | 1h54m | |
| linux 64 | 3.8 | no | 2h0m | 1h51m | |
| linux 64 | 3.9 | no | 2h31m | 2h2m | |
| linux 64 | 3.10 | no | 2h26m | 2h7m | |
| linux 64 | 3.7 | 11.2 | 6h+ (3933/4242, 309 remaining) | 6h+ | |
| linux 64 | 3.8 | 11.2 | 6h+ (3897/4242, 345 remaining) | 6h+ | |
| linux 64 | 3.9 | 11.2 | 6h+ (3924/4242, 318 remaining) | 6h+ | 6h+ (1656/1969, 313 remaining) |
| linux 64 | 3.10 | 11.2 | 6h+ (3962/4242, 280 remaining) | 6h+ | |
| osx-64 | 3.7 | | 2h42m | 2h39m | |
| osx-64 | 3.8 | | 3h28m | 2h52m | |
| osx-64 | 3.9 | | 2h40m | 2h42m | |
| osx-64 | 3.10 | | 3h2m | 2h42m | |
| osx-arm-64 | 3.8 | | 1h51m | 1h37m | |
| osx-arm-64 | 3.9 | | 2h20m | 2h10m | |
| osx-arm-64 | 3.10 | | 4h25m | 2h1m | |
There are approximately:
- 3600 files for CMake to compile in the CPU builds with the standard build process
- 1600-1800 files to compile when using system dependencies: WIP: Use more system libs #111
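The progress counters in the CUDA rows can be turned into a rough projection of the total build time. The sketch below assumes every build step costs about the same, which is unlikely to hold for CUDA kernels, and it treats the 6h timeout as the elapsed time, so the result is only a crude lower bound. The function names are hypothetical.

```python
import re

# Matches a "done/total" counter such as "3933/4242".
PROGRESS_RE = re.compile(r"(\d+)\s*/\s*(\d+)")


def parse_progress(text):
    """Return (done, total) from a string containing 'done/total'."""
    match = PROGRESS_RE.search(text)
    if match is None:
        raise ValueError(f"no progress counter found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def projected_hours(elapsed_hours, done, total):
    """Linearly extrapolate total build time from the steps finished so far.

    Assumes a uniform cost per step, which is a simplification.
    """
    if done <= 0:
        raise ValueError("at least one finished step is needed to extrapolate")
    return elapsed_hours * total / done


if __name__ == "__main__":
    done, total = parse_progress("3933/4242")
    print(total - done)  # 309 steps remaining
    print(round(projected_hours(6.0, done, total), 2))
```

Under these assumptions, the `main` CUDA build for Python 3.7 would need roughly six and a half hours, and the system-deps build (1656/1969) a little over seven, so both remain above a 6h limit.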