Splitting this package into manageable chunks

### Comment:

This package currently requires more than 16 builds to be built manually to ensure that it completes in time on the CIs.

# Step 1: No more git clone
rgommers identified cloning the repository as one time-consuming portion of the build process. In my experience, cloning the 1.5GB repo can take up to 10 min on my powerful local machine, and I suspect it can take much longer on the CIs.

To avoid cloning, we will have to list all the submodules manually, or make them conda-forge installable dependencies.

I mostly got this working using a recursive script which should help us keep it maintained: https://github.com/conda-forge/pytorch-cpu-feedstock/pull/109
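
As a rough illustration of the kind of recursive listing such a script needs, the sketch below parses the text of a `.gitmodules` file and walks nested submodules. It is not the script from the PR above: `parse_gitmodules`, `collect`, and the `fetch` callable are hypothetical names. Note that `.gitmodules` only records paths and URLs, so a real script would also need the pinned commit of each submodule.

```python
def parse_gitmodules(text):
    """Return (path, url) pairs found in the text of a .gitmodules file."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[submodule"):
            # A new section starts: collect its keys in a fresh dict.
            current = {}
            entries.append(current)
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return [(e["path"], e["url"]) for e in entries if "path" in e and "url" in e]


def collect(url, fetch, prefix=""):
    """Recursively list submodules.

    `fetch` is a hypothetical callable that returns the .gitmodules text
    for a repository URL, or an empty string when the repo has none.
    """
    found = []
    for path, sub_url in parse_gitmodules(fetch(url)):
        full_path = prefix + path
        found.append((full_path, sub_url))
        # Submodules can have submodules of their own.
        found.extend(collect(sub_url, fetch, full_path + "/"))
    return found
```

Passing `fetch` in as an argument keeps the walk independent of where the `.gitmodules` files come from, for example a local checkout or downloaded files.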

# Option 1: Split off Dependencies:

| Dependency     | linux  | mac    | win | GPU Aware | PR                                                       | system deps                                                                                    |
|----------------|--------|--------|-----|-----------|----------------------------------------------------------|------------------------------------------------------------------------------------------------|
| pybind11       |        |        |     | no        | https://github.com/conda-forge/pybind11-feedstock        | USE_SYSTEM_PYBIND11                                                                            |
| cub            |        |        |     | no        | https://github.com/conda-forge/cub-feedstock             |                                                                                                |
| eigen          |        |        |     | no        | https://github.com/conda-forge/eigen-feedstock           | [USE_SYSTEM_EIGEN_INSTALL](https://github.com/pytorch/pytorch/blob/master/CMakeLists.txt#L265) |
| googletest     |        |        |     | no        | will not package                                         |                                                                                                |
| benchmark      |        |        |     | no        | https://github.com/conda-forge/benchmark-feedstock       |                                                                                                |
| protobuf       |        |        |     | no        | https://github.com/conda-forge/libprotobuf-feedstock     |                                                                                                |
| ios-cmake      |        |        |     |           | not needed since we don't target ios                     |                                                                                                |
| NNPACK         | yes    | yes    |     | no        | https://github.com/conda-forge/staged-recipes/pull/19103 |                                                                                                |
| gloo           | yes    | yes    |     | yes       | https://github.com/conda-forge/staged-recipes/pull/19103 | USE_SYSTEM_GLOO                                                                                |
| pthreadpool    | yes    | yes    |     | no        | https://github.com/conda-forge/staged-recipes/pull/19103 | USE_SYSTEM_PTHREADPOOL                                                                         |
| FXdiv          | yes    | yes    |     | header    | https://github.com/conda-forge/staged-recipes/pull/19103 | USE_SYSTEM_FXDIV                                                                               |
| FP16           | yes    | yes    |     | header    | https://github.com/conda-forge/staged-recipes/pull/19103 | USE_SYSTEM_FP16                                                                                |
| psimd          | yes    | yes    |     | header    | https://github.com/conda-forge/staged-recipes/pull/19103 | USE_SYSTEM_PSIMD                                                                               |
| zstd           | yes    | yes    | yes | no        | https://github.com/conda-forge/zstd-feedstock            |                                                                                                |
| cpuinfo        | yes    | yes    | no  | no        | https://github.com/conda-forge/staged-recipes/pull/19103 | USE_SYSTEM_CPUINFO                                                                             |
| python-enum    |        |        |     | no        | https://github.com/conda-forge/enum34-feedstock          |                                                                                                |
| python-peachpy | yes    | yes    | yes | no        | https://github.com/conda-forge/staged-recipes/pull/19103 |                                                                                                |
| python-six     | yes    | yes    | yes | no        | https://github.com/conda-forge/six-feedstock             |                                                                                                |
| onnx           |        |        |     | no        | https://github.com/conda-forge/onnx-feedstock            | USE_SYSTEM_ONNX                                                                                |
| onnx-tensorrt  |        |        |     | only      |                                                          |                                                                                                |
| sleef          |        |        |     | no        | https://github.com/conda-forge/sleef-feedstock           | USE_SYSTEM_SLEEF                                                                               |
| ideep          |        |        |     |           |                                                          |                                                                                                |
| oneapisrc      |        |        |     |           |                                                          |                                                                                                |
| nccl           |        |        |     |           | https://github.com/conda-forge/nccl-feedstock            |                                                                                                |
| gemmlowp       |        |        |     |           |                                                          |                                                                                                |
| QNNPACK        | yes    | yes    |     |           | https://github.com/conda-forge/staged-recipes/pull/19103 |                                                                                                |
| neon2sse       |        |        |     |           |                                                          |                                                                                                |
| fbgemm         |        |        |     | yes       |                                                          |                                                                                                |
| foxi           |        |        |     |           |                                                          |                                                                                                |
| tbb            |        |        |     |           | https://github.com/conda-forge/tbb-feedstock             | USE_SYSTEM_TBB (deprecated)                                                                    |
| fbjni          |        |        |     |           |                                                          |                                                                                                |
| XNNPACK        | yes    | yes    |     |           | https://github.com/conda-forge/staged-recipes/pull/19103 | USE_SYSTEM_XNNPACK                                                                             |
| fmt            |        |        |     |           | https://github.com/conda-forge/fmt-feedstock             |                                                                                                |
| tensorpipe     |        |        |     | yes       |                                                          |                                                                                                |
| cudnn_frontend |        |        |     |           |                                                          |                                                                                                |
| kineto         |        |        |     |           |                                                          |                                                                                                |
| pocketfft      |        |        |     |           |                                                          |                                                                                                |
| breakpad       |        |        |     |           |                                                          |                                                                                                |
| flatbuffers    | yes    | yes    | yes | no        | https://github.com/conda-forge/flatbuffers-feedstock     |                                                                                                |
| clog           | static | static |     |           | https://github.com/conda-forge/staged-recipes/pull/19103 |                                                                                                |

- clog seems to be a pretty low-level library that relies on compile-time flags. I think it is best if we don't package it as a library; doing so seems to require serious consideration in terms of performance. They typically vendor the full source in the repository. The only problem is that each package attempts to install the static library into the library path.
- QNNPACK has a build option that makes a special provision for Caffe2's implementation of `pthreadpool`.
    - It seems to be problematic with `pthreadpool` on OSX.
- QNNPACK likely has two different implementations: the one vendored in ATen and the one vendored in `third_party`.
- NNPACK has two different backends: one that seems to be generated by Python, for which `fp16.py` cannot be found for some reason, and one that uses `psimd`.
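
To make the "system deps" column concrete, here is a minimal sketch that maps dependency names to the flags listed in the table and turns them into CMake definitions. It assumes the flags can be passed as `-D...=ON` definitions, which should be checked against the PyTorch version being built. `SYSTEM_DEP_FLAGS` and `cmake_definitions` are hypothetical names, and the deprecated `USE_SYSTEM_TBB` is left out.

```python
# Flags copied from the "system deps" column of the table above.
SYSTEM_DEP_FLAGS = {
    "pybind11": "USE_SYSTEM_PYBIND11",
    "eigen": "USE_SYSTEM_EIGEN_INSTALL",
    "gloo": "USE_SYSTEM_GLOO",
    "pthreadpool": "USE_SYSTEM_PTHREADPOOL",
    "FXdiv": "USE_SYSTEM_FXDIV",
    "FP16": "USE_SYSTEM_FP16",
    "psimd": "USE_SYSTEM_PSIMD",
    "cpuinfo": "USE_SYSTEM_CPUINFO",
    "onnx": "USE_SYSTEM_ONNX",
    "sleef": "USE_SYSTEM_SLEEF",
    "XNNPACK": "USE_SYSTEM_XNNPACK",
}


def cmake_definitions(packaged):
    """Return -D arguments for the packaged dependencies that have a flag.

    Dependencies without a known flag (e.g. zstd) are skipped silently.
    """
    return [
        f"-D{SYSTEM_DEP_FLAGS[name]}=ON"
        for name in packaged
        if name in SYSTEM_DEP_FLAGS
    ]


if __name__ == "__main__":
    print(" ".join(cmake_definitions(["pybind11", "zstd", "sleef"])))
```

Keeping the mapping in one place makes it easier to see which packaged dependencies still lack a switch in the upstream build.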

# Option 2 - step 1: Build a libpytorch package or something

By setting `BUILD_PYTHON=OFF` in https://github.com/conda-forge/pytorch-cpu-feedstock/pull/112/ we then end up with the following libraries in `lib` and `include`:

| Dependency           | linux | mac | win | GPU Aware | PR                                                       |
|----------------------|-------|-----|-----|-----------|----------------------------------------------------------|
| libasmjit            | yes   | yes |     |           | https://github.com/conda-forge/staged-recipes/pull/19103 |
| libc10               | yes   | yes |     |           | https://github.com/conda-forge/staged-recipes/pull/19103 |
| libfbgemm            | yes   | yes |     | yes       | https://github.com/conda-forge/staged-recipes/pull/19103 |
| libgloo              | yes   | yes |     | yes       |                                                          |
| libkineto            | yes   |     |     | yes       | https://github.com/conda-forge/staged-recipes/pull/19103 |
| libnnpack            | yes   |     |     | ???       | https://github.com/conda-forge/staged-recipes/pull/19103 |
| libpytorch_qnnpack   | yes   | yes |     |           | https://github.com/conda-forge/staged-recipes/pull/19103 |
| libqnnpack           | yes   | yes |     |           | https://github.com/conda-forge/staged-recipes/pull/19103 |
| libtensorpipe        |       |     |     | yes       |                                                          |
| libtorch             |       |     |     |           |                                                          |
| libtorch_cpu         |       |     |     |           |                                                          |
| libtorch_global_deps |       |     |     |           |                                                          |
| Header only          |       |     |     |           |                                                          |
| ATen                 |       |     |     |           |                                                          |
| c10d                 |       |     |     |           |                                                          |
| caffe2               |       |     |     |           |                                                          |
| libnop               | yes   | yes |     |           | https://github.com/conda-forge/staged-recipes/pull/19103 |

# Option 2 - step 2: Depend on new ATen/libpytorch package



## Compilation time progress

| platform   | python | cuda | main                             | tar gh-109 | system deps                      |
|------------|--------|------|----------------------------------|------------|----------------------------------|
| linux 64   | 3.7    | no   | 1h57m                            | 1h54m      |                                  |
| linux 64   | 3.8    | no   | 2h0m                             | 1h51m      |                                  |
| linux 64   | 3.9    | no   | 2h31m                            | 2h2m       |                                  |
| linux 64   | 3.10   | no   | 2h26m                            | 2h7m       |                                  |
| linux 64   | 3.7    | 11.2 | 6h+ (`3933/4242` 309 remaining)  | 6h+        |                                  |
| linux 64   | 3.8    | 11.2 | 6h+ (`3897/4242` 345 remaining)  | 6h+        |                                  |
| linux 64   | 3.9    | 11.2 | 6h+ (`3924/4242` 318 remaining)  | 6h+        | 6h+ (`1656/1969` 313 remaining)  |
| linux 64   | 3.10   | 11.2 | 6h+ (`3962/4242` 280 remaining)  | 6h+        |                                  |
| osx-64     | 3.7    |      | 2h42m                            | 2h39m      |                                  |
| osx-64     | 3.8    |      | 3h28m                            | 2h52m      |                                  |
| osx-64     | 3.9    |      | 2h40m                            | 2h42m      |                                  |
| osx-64     | 3.10   |      | 3h2m                             | 2h42m      |                                  |
| osx-arm-64 | 3.8    |      | 1h51m                            | 1h37m      |                                  |
| osx-arm-64 | 3.9    |      | 2h20m                            | 2h10m      |                                  |
| osx-arm-64 | 3.10   |      | 4h25m                            | 2h1m       |                                  |

There are approximately:
* 3600 files for CMake to compile in the CPU builds with the standard build process
* 1600-1800 files to compile when using system dependencies: https://github.com/conda-forge/pytorch-cpu-feedstock/pull/111
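
The `done/total` counters recorded for the timed-out CUDA builds can be turned into a very rough estimate of how much time was still missing. The sketch below uses plain linear extrapolation, which assumes every file costs the same to compile; that is unlikely to hold for CUDA kernels, so treat the result as an order of magnitude at best. `remaining_estimate` is a hypothetical helper name.

```python
def remaining_estimate(progress, elapsed_hours):
    """Return (files_left, estimated_hours_left) from a "done/total" counter.

    Linear extrapolation: assumes the remaining files compile at the
    same average rate as the files already built.
    """
    done, total = (int(part) for part in progress.split("/"))
    files_left = total - done
    return files_left, elapsed_hours * files_left / done


if __name__ == "__main__":
    # linux 64, python 3.7, cuda 11.2 on main: stopped at 3933/4242 after 6h.
    files_left, hours_left = remaining_estimate("3933/4242", 6.0)
    print(f"{files_left} files left, about {hours_left:.2f} h at the average rate")
```

For the `3933/4242` row this gives 309 files left and roughly 0.47 hours at the average rate, under the uniform-cost assumption stated above.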
