Commit b5bfedf

committed: adjust docker file and cutlass instructions

1 parent 6ff7a0d commit b5bfedf

File tree: 8 files changed (+232, -37 lines)

.dev-scripts/.gitignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -0,0 +1 @@
+test_*.sh
```
extract_install_cmd.py

Lines changed: 99 additions & 0 deletions

````python
# take caution: everything is quite hardcoded here
# any changes to the readme could break this code
# run it from root directory: python extract_install_cmd.py path/to/custom/torch-xxx.whl

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("custom_pytorch_path", help="Path to custom PyTorch wheel")
args = parser.parse_args()

BLOCK_HEADER_START = "### Conda on Linux"

with open("README.md") as infile:
    content = infile.readlines()

local_install_instructions = []
global_install_instructions = []

in_code_block = False
reading_instructions = False
insert_block_pause = False
instruction_type = ""

FILE_INTRO = """#!/usr/bin/env bash

function check_error() {
    # shows and then runs a command. if the exit code is not zero, aborts the script
    # usage: check_error mv foo bar

    echo + $@
    "$@"
    local exit_code=$?
    if [ "${exit_code}" -ne 0 ]; then
        echo "! > An error occurred, aborting."
        exit 1
    fi
}
"""
EXTRA_CONDA_INSTRUCTION = """# extra step for bash script (not required in a proper command line):
eval "$(conda shell.bash hook)"
"""


for line in content:
    if line.startswith("```"):
        in_code_block = not in_code_block
        continue
    if line.startswith(BLOCK_HEADER_START):
        reading_instructions = True
        instruction_type = "global"
        continue
    if line.startswith("<details><summary>"):
        instruction_type = "local"
        continue
    if line.startswith("</details>"):
        instruction_type = "both"
        continue
    if line.startswith(BLOCK_HEADER_START.split()[0]):
        reading_instructions = False
        continue
    if not reading_instructions:
        continue
    if not in_code_block:
        insert_block_pause = True
        continue

    # deal with comments
    if line.startswith("# export CC="):
        line = line[2:]
    if line.startswith("#"):
        continue

    # replace some line contents and add some lines
    if "conda activate" in line:
        line = EXTRA_CONDA_INSTRUCTION + "check_error " + line
    if "export BITORCH_WORKSPACE" in line:
        line = line.replace("${HOME}", "$(pwd)")
    if line.startswith("pip install torch-"):
        line = "pip install {}\n".format(args.custom_pytorch_path)

    # decide how to write line
    line_format = "check_error {line}"
    if line.startswith("#"):
        line_format = "{line}"
    if insert_block_pause:
        insert_block_pause = False
        line_format = "\n" + line_format

    # write result line(s)
    if instruction_type == "global" or instruction_type == "both":
        global_install_instructions.append(line_format.format(line=line))
    if instruction_type == "local" or instruction_type == "both":
        local_install_instructions.append(line_format.format(line=line))

with open(".dev-scripts/test_local_conda_install.sh", "w") as outfile:
    outfile.write(FILE_INTRO)
    outfile.writelines(local_install_instructions)
with open(".dev-scripts/test_global_conda_install.sh", "w") as outfile:
    outfile.write(FILE_INTRO)
    outfile.writelines(global_install_instructions)
````
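The core of the script above is a small state machine that toggles a flag at every markdown code fence and only collects lines while inside a fenced block. A minimal, self-contained sketch of that idea (the helper `extract_code_lines` is illustrative, not part of the repository):

```python
FENCE = "`" * 3  # the literal markdown code-fence marker


def extract_code_lines(markdown_text):
    """Collect only lines that sit inside fenced code blocks."""
    in_code_block = False
    collected = []
    for line in markdown_text.splitlines():
        if line.startswith(FENCE):
            # every fence marker flips between "inside" and "outside"
            in_code_block = not in_code_block
            continue
        if in_code_block:
            collected.append(line)
    return collected


sample = "\n".join([
    "# Install",
    "Run this:",
    FENCE + "bash",
    "pip install example",
    FENCE,
    "Done.",
])
print(extract_code_lines(sample))  # → ['pip install example']
```

The same toggle is what lets the extraction script skip prose and headings while turning only the README's shell commands into test scripts.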

CHANGELOG.md

Lines changed: 11 additions & 0 deletions

```diff
@@ -5,6 +5,17 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/)
 and this project adheres to [Semantic Versioning](http://semver.org/).
 
 
+## [0.2.2] - 2024/04/29
+
+### Updated
+
+- Building instructions (adding a section for cutlass)
+- Checksums for custom torch builds (within docker)
+
+### Fixed
+
+- An error in `pack_fp_weight`
+
 ## [0.2.1] - 2024/04/27
 
 ### Fixed
```

README.md

Lines changed: 44 additions & 7 deletions

````diff
@@ -40,15 +40,20 @@ The requirements are:
 - A compiler that fully supports C++17, such as clang or gcc (gcc 9.4.0 or newer is required, but gcc 12.x is not supported yet)
 - Python 3.9 or later
 - PyTorch 1.8 or later
-- CUDA Toolkit 11.8 or 12.1 (optional, for CUDA accelerated layers)
 
-For more detailed information, you can check the [requirements of PyTorch](https://github.com/pytorch/pytorch?tab=readme-ov-file#prerequisites).
+Please check your operating system's options for the C++ compiler.
+For more detailed information, you can check the [requirements to build PyTorch from source](https://github.com/pytorch/pytorch?tab=readme-ov-file#prerequisites).
+In addition, for layers to speed up on specific hardware (such as CUDA devices, or MacOS M1/2/3 chips), we recommend installing:
+
+- CUDA Toolkit 11.8 or 12.1 for CUDA accelerated layers
+- **[MLX](https://github.com/ml-explore/mlx)** for mlx-based layers on MacOS
+- **[CUTLASS](https://github.com/NVIDIA/cutlass)** for cutlass-based layers
 
 Currently, the engine **needs to be built from source**.
-We provide instructions how to install Python/PyTorch (and CUDA/MLX) for:
+We provide instructions for the following options:
 
-- Conda + Linux (with CUDA)
-- Docker (with CUDA)
+- Conda + Linux (with CUDA and cutlass)
+- Docker (with CUDA and cutlass)
 - Conda + MacOS (with MLX)
 
 We recommend managing your BITorch Engine installation in a conda environment (otherwise you should adapt/remove certain variables, e.g. `CUDA_HOME`).
@@ -57,6 +62,8 @@ You may wish to adapt the CUDA version to 12.1 where applicable.
 
 ### Conda on Linux (with CUDA)
 
+To use these instructions, you need to have [conda](https://conda.io/projects/conda/en/latest/user-guide/getting-started.html) and a suitable C++ compiler installed.
+
 1. Create Environment for Python 3.9 and activate it:
 ```bash
 conda create -y --name bitorch-engine python=3.9
@@ -72,8 +79,22 @@ pip install torch-2.1.0-cp39-cp39-linux_x86_64.whl
 # optional: install corresponding torchvision (check https://github.com/pytorch/vision?tab=readme-ov-file#installation in the future)
 pip install "torchvision==0.16.0" --index-url https://download.pytorch.org/whl/cu118
 ```
+4. To use cutlass layers, you should also install CUTLASS 2.8.0 (from source) and adjust `CUTLASS_HOME` (this is where we clone and install cutlass)
+(if you have older or newer GPUs you may need to add your [CUDA compute capability](https://developer.nvidia.com/cuda-gpus) in `CUTLASS_NVCC_ARCHS`):
+```bash
+export CUTLASS_HOME="/some/path"
+mkdir -p "${CUTLASS_HOME}"
+git clone --depth 1 --branch "v2.8.0" "https://github.com/NVIDIA/cutlass.git" --recursive ${CUTLASS_HOME}/source
+mkdir -p "${CUTLASS_HOME}/build" && mkdir -p "${CUTLASS_HOME}/install"
+cd "${CUTLASS_HOME}/build"
+cmake ../source -DCMAKE_INSTALL_PREFIX="${CUTLASS_HOME}/install" -DCUTLASS_ENABLE_TESTS=OFF -DCUTLASS_ENABLE_EXAMPLES=OFF -DCUTLASS_NVCC_ARCHS='75;80;86'
+make -j 4
+cmake --install .
+```
+If you have difficulties installing cutlass, you can check the [official documentation](https://github.com/NVIDIA/cutlass/tree/v2.8.0),
+use the other layers without installing it, or try the docker installation.
 
-Alternatively, you can also save the environment and clone the repository within the same directory.
+As an alternative to the instructions above, you can also store the environment and clone all repositories within one "root" directory.
 
 <details><summary>Click here to expand the instructions for this.</summary>
 
@@ -99,17 +120,33 @@ pip install torch-2.1.0-cp39-cp39-linux_x86_64.whl
 # optional: install corresponding torchvision (check https://github.com/pytorch/vision?tab=readme-ov-file#installation in the future)
 pip install "torchvision==0.16.0" --index-url https://download.pytorch.org/whl/cu118
 ```
+4. To use cutlass layers, you should also install CUTLASS 2.8.0
+(if you have older or newer GPUs you may need to add your [CUDA compute capability](https://developer.nvidia.com/cuda-gpus) in `CUTLASS_NVCC_ARCHS`):
+```bash
+export CUTLASS_HOME="${BITORCH_WORKSPACE}/cutlass"
+mkdir -p "${CUTLASS_HOME}"
+git clone --depth 1 --branch "v2.8.0" "https://github.com/NVIDIA/cutlass.git" --recursive ${CUTLASS_HOME}/source
+mkdir -p "${CUTLASS_HOME}/build" && mkdir -p "${CUTLASS_HOME}/install"
+cd "${CUTLASS_HOME}/build"
+cmake ../source -DCMAKE_INSTALL_PREFIX="${CUTLASS_HOME}/install" -DCUTLASS_ENABLE_TESTS=OFF -DCUTLASS_ENABLE_EXAMPLES=OFF -DCUTLASS_NVCC_ARCHS='75;80;86'
+make -j 4
+cmake --install .
+cd "${BITORCH_WORKSPACE}"
+```
+If you have difficulties installing cutlass, you can check the [official documentation](https://github.com/NVIDIA/cutlass/tree/v2.8.0),
+use the other layers without installing it, or try the docker installation.
 </details>
 
 After setting up the environment, clone the code and build with pip (to hide the build output remove `-v`):
 
 ```bash
+# make sure you are in a suitable directory, e.g. your bitorch workspace
 git clone --recursive https://github.com/GreenBitAI/bitorch-engine
 cd bitorch-engine
 # only gcc versions 9.x, 10.x, 11.x are supported
 # to select the correct gcc, use:
 # export CC=gcc-11 CPP=g++-11 CXX=g++-11
-CUDA_HOME="${CONDA_PREFIX}" pip install -e . -v
+CPATH="${CUTLASS_HOME}/install/include" CUDA_HOME="${CONDA_PREFIX}" pip install -e . -v
 ```
 
 ### Docker (with CUDA)
````
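The `'75;80;86'` passed to `CUTLASS_NVCC_ARCHS` in the new instructions is a semicolon-separated list of CUDA compute capabilities (7.5, 8.0, 8.6). A small sketch of how such a string could be assembled from `(major, minor)` capability tuples; the helper `nvcc_archs` is hypothetical, not part of the repository (on a CUDA machine with PyTorch, `torch.cuda.get_device_capability()` returns such a tuple):

```python
def nvcc_archs(capabilities):
    """Build a CUTLASS_NVCC_ARCHS-style string, e.g. (7, 5) -> "75".

    Duplicates are dropped, order of first appearance is kept.
    """
    seen = []
    for major, minor in capabilities:
        arch = "{}{}".format(major, minor)
        if arch not in seen:
            seen.append(arch)
    return ";".join(seen)


print(nvcc_archs([(7, 5), (8, 0), (8, 6)]))  # → 75;80;86
```

A GPU older or newer than these three targets would need its own capability appended to the list, which is exactly the adjustment the instructions above ask for.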

bitorch_engine/layers/qlinear/nbit/cuda/utils.py

Lines changed: 3 additions & 2 deletions

```diff
@@ -107,8 +107,9 @@ def pack_fp_weight(weight: torch.Tensor, qweight: MPQWeightParameter) -> torch.T
         # Adjust scales and zeros for symmetric quantization without group index
         scales = scales.unsqueeze(1).repeat(1, weight.size(0) // scales.size(0), 1).view(-1, scales.size(-1))
         zeros = zeros.unsqueeze(1).repeat(1, weight.size(0) // zeros.size(0), 1).view(-1, zeros.size(-1))
-        q_perm = qweight.q_perm.unsqueeze(1).repeat(1, weight.size(1)).long()
-        weight = torch.gather(weight, dim=0, index=q_perm)
+        if hasattr(qweight, "q_perm") and qweight.q_perm is not None:
+            q_perm = qweight.q_perm.unsqueeze(1).repeat(1, weight.size(1)).long()
+            weight = torch.gather(weight, dim=0, index=q_perm)
 
         intweight = torch.round((weight + zeros) / scales).to(torch.int32).clamp(0, 2 ** w_bit - 1)
     else:
```
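The fix above wraps the permutation in a guard so that weight parameters without a `q_perm` no longer raise an `AttributeError` (and ones where it is explicitly `None` are skipped too). A torch-free sketch of that guard pattern, using a hypothetical stand-in class (the real `MPQWeightParameter` lives in bitorch-engine):

```python
class FakeQWeight:
    """Stand-in for a quantized weight parameter; attributes are optional."""

    def __init__(self, **attrs):
        self.__dict__.update(attrs)


def needs_permute(qweight):
    # mirrors the guard added in pack_fp_weight:
    # the attribute must exist AND be set to something
    return hasattr(qweight, "q_perm") and qweight.q_perm is not None


print(needs_permute(FakeQWeight()))                  # → False (attribute missing)
print(needs_permute(FakeQWeight(q_perm=None)))       # → False (attribute unset)
print(needs_permute(FakeQWeight(q_perm=[2, 0, 1])))  # → True
```

Checking both conditions matters because `hasattr` alone would still pass for a parameter whose `q_perm` was initialized to `None`.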

docker/Dockerfile

Lines changed: 3 additions & 3 deletions

```diff
@@ -7,10 +7,10 @@ RUN apt-get update && \
     apt-get install -y git && \
     apt-get clean && \
     rm -rf /var/lib/apt/lists/* && \
-    git clone --depth 1 --branch "v${CUTLASS_VERSION}" "https://github.com/NVIDIA/cutlass.git" --recursive /cutlass && \
+    git clone --depth 1 --branch "v${CUTLASS_VERSION}" "https://github.com/NVIDIA/cutlass.git" --recursive /cutlass/source && \
     mkdir /cutlass/build && \
     cd /cutlass/build && \
-    cmake .. -DCMAKE_INSTALL_PREFIX:PATH=/usr/local -DBUILD_TESTING=OFF -DCUTLASS_NVCC_ARCHS='75;80;86' && \
+    cmake ../source -DCMAKE_INSTALL_PREFIX=/cutlass/install -DCUTLASS_ENABLE_TESTS=OFF -DCUTLASS_ENABLE_EXAMPLES=OFF -DCUTLASS_NVCC_ARCHS='75;80;86' && \
     make -j $(nproc) && \
     cmake --install .
@@ -32,7 +32,7 @@ RUN git clone \
     "${GIT_URL}" \
     /bitorch-engine && \
     cd /bitorch-engine && \
-    BIE_FORCE_CUDA="true" pip install -e ${BUILD_TARGET} -v && \
+    BIE_FORCE_CUDA="true" CPATH="/cutlass/install/include" pip install -e ${BUILD_TARGET} -v && \
     rm -rf build/ bitorch_engine.egg-info/
 
 FROM no-examples as example-ready
```
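Both the Dockerfile and the README now install cutlass into a dedicated prefix and point `CPATH` at its `include` directory, so the compiler can locate the headers. A toy sketch of that directory-list lookup; `find_header` and the `cutlass.h` file below are purely illustrative (the real search is done by the compiler):

```python
import os
import tempfile


def find_header(name, search_dirs):
    """Return the first path where `name` exists, like a CPATH lookup."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


# simulate /cutlass/install/include containing a header
tmp = tempfile.mkdtemp()
include_dir = os.path.join(tmp, "install", "include")
os.makedirs(include_dir)
open(os.path.join(include_dir, "cutlass.h"), "w").close()

found = find_header("cutlass.h", ["/nonexistent", include_dir])
print(found is not None)  # → True
```

If `CPATH` were left unset, the headers under the custom install prefix would simply never be found, which is why the build command sets it explicitly.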

docker/build_scripts/install_modified_pytorch.sh

Lines changed: 4 additions & 14 deletions

```diff
@@ -17,24 +17,14 @@ file="custom_torch.whl"
 ## adding them here individually is tedious, but we need to build them manually and ensure compatibility anyway
 
 if [ "${from_image}" == "pytorch/pytorch:2.2.0-cuda11.8-cudnn8-devel" ]; then
-    gdrive_id="1sS3LS_8wEm2CJ-oCHZAWYeuXjJHXPINP"
+    gdrive_id="1PoVor85-RF3s0KpOP19mFV5hNUnHERa1"
     file="torch-2.2.2-cp310-cp310-linux_x86_64.whl"
-    checksum="1a7e8f1c315d3aefcc65b0a6676857b9cde4877737a134cf1423a048d8938985"
+    checksum="6646519e5e7b4af8f99b79eb9be3e6460b0d05c4695bbf86de02568f37ff3fea"
 fi
 if [ "${from_image}" == "pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel" ]; then
-    gdrive_id="18DP0P9MJ4U211HR5-1ss6NogFPcIOJDR"
+    gdrive_id="1LjFNImboq8QeFSompMS2gPjBRYtP2Dsz"
     file="torch-2.2.2-cp310-cp310-linux_x86_64.whl"
-    checksum="5f89163d910e1e1ee6010e4ea5d478756c021abab1e248be9716d3bee729b9e7"
-fi
-if [ "${from_image}" == "pytorch/pytorch:2.1.0-cuda11.8-cudnn8-devel" ]; then
-    gdrive_id="1QK_QqlPubFNgitiOkSABZ3AZyg7M0ezc"
-    file="torch-2.1.0-cp39-cp39-linux_x86_64.whl"
-    checksum="6600c130395b66bd047ca01b077f702703924eb3eaab2d3d04d9eb51154d9080"
-fi
-if [ "${from_image}" == "pytorch/pytorch:2.1.0-cuda12.1-cudnn8-devel" ]; then
-    gdrive_id="1fguT0jRJwRE1126rPpEvL9G6F246CLar"
-    file="torch-2.1.0-cp39-cp39-linux_x86_64.whl"
-    checksum="10b95aaca45558f3b80ee331677ddd925f3891ef542ab419ae68dd57641b9a12"
+    checksum="2a5953dab7be6c1640112e38ae7519ad88180d9fa79faab6c86dbee6b1cc210e"
 fi
 #if [ "${from_image}" == "pytorch/pytorch:X.X.X-cudaXX.X-cudnn8-devel" ]; then
 #    gdrive_id="xxx"
```