Commit 4f95471

trsharm25, aws-rhsoln, rickpeng-aws, akhil-aws, and aws-srsawant authored
Neuron SDK Release 2.20.0 (#29)
---------

Co-authored-by: Rahul Solanki <[email protected]>
Co-authored-by: Ricky Peng <[email protected]>
Co-authored-by: Akhil Raj Azhikodan <[email protected]>
Co-authored-by: Saurabh Arjun Sawant <[email protected]>
Co-authored-by: Can Karakus <[email protected]>
Co-authored-by: Shubham Chandak <[email protected]>
Co-authored-by: Guangtai Huang <[email protected]>
Co-authored-by: Dima Fayyad <[email protected]>
Co-authored-by: Joey Zheng <[email protected]>
Co-authored-by: Alexander Jipa <[email protected]>
Co-authored-by: Kavish Gandhi <[email protected]>
Co-authored-by: Aayush Sheth <[email protected]>
Co-authored-by: Charan Shettyhalli Guruswamy <[email protected]>
Co-authored-by: Ryan King <[email protected]>
Co-authored-by: Jeffrey Huynh <[email protected]>
Co-authored-by: Jiyoung An <[email protected]>
Co-authored-by: zhuangw-at-533267172582 <[email protected]>
Co-authored-by: Zhenyu Song <[email protected]>
Co-authored-by: Jimmy Huang <[email protected]>
Co-authored-by: rickpeng-aws <[email protected]>
Co-authored-by: Goutham Ramakrishnan <[email protected]>
Co-authored-by: Piyush Dugar <[email protected]>
Co-authored-by: Ashraf Mahgoub <[email protected]>
Co-authored-by: Yu Liu <[email protected]>
Co-authored-by: Shruti Dubey <[email protected]>
Co-authored-by: Shruti Dubey <[email protected]>
Co-authored-by: Sertan Alkan <[email protected]>
Co-authored-by: Rishabh Rajesh <[email protected]>
Co-authored-by: Xiufeng Zhao <[email protected]>
Co-authored-by: Mario Michael Krell <[email protected]>
Co-authored-by: Jack Zhao <[email protected]>
Co-authored-by: Yi-Hsiang (Sean) Lai <[email protected]>
Co-authored-by: Octavian Soldea <[email protected]>
Co-authored-by: Zhenkun Cai <[email protected]>
Co-authored-by: Finn Thompson <[email protected]>
Co-authored-by: Uros Lipovsek <[email protected]>
Co-authored-by: Mario Michael Krell <[email protected]>
Co-authored-by: Mugilan Ganesan <[email protected]>
Co-authored-by: Prem Kumar A K <[email protected]>
1 parent 1100bf0 commit 4f95471

209 files changed, +16427 -7684 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -134,6 +134,7 @@ dmypy.json
 
 build
 .vscode/
+*.iml
 .attach_pid*
 src/neuronx_distributed.egg-info/
 *.whl

.pre-commit-config.yaml

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+default_language_version:
+  # force all unspecified python hooks to run python3
+  python: python3
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v2.3.0
+    hooks:
+      - id: end-of-file-fixer
+      - id: trailing-whitespace
+      - id: detect-aws-credentials
+  - repo: https://github.com/pocc/pre-commit-hooks
+    rev: v1.1.1
+    hooks:
+      - id: clang-format
+        args: [--style=file, -i]
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.5.0
+    hooks:
+      - id: ruff
+        name: ruff
+        entry: ruff
+        args: [check, --fix, "--line-length=120", "--ignore=F401,E203"]
+        types: [python]
+        language: system
+        exclude: cases_update
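
For readers unfamiliar with the two rule codes ignored by the ruff hook above, the following snippet is a small illustration and is not part of the commit. F401 is the pyflakes "imported but unused" check and E203 is the pycodestyle "whitespace before punctuation" check; the variable names here are made up for the example.

```python
# Illustrative only; not part of the commit.
import json  # F401: imported but never used below

values = list(range(10))
offset = 1

# E203: whitespace before ':' in a slice. Formatters such as Black emit this
# style for complex slice expressions, a common reason to ignore the rule.
window = values[offset + 0 : offset + 3]
print(window)
```

With `--ignore=F401,E203`, neither line would be reported, which matches the ignore list previously passed to flake8 in `build-tools/bin/custom-build` minus W503/W504.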

README.md

Lines changed: 3 additions & 3 deletions
@@ -9,10 +9,10 @@ To install the library, please follow the instructions mentioned here: https://a
 To build from source, run the following command:
 
 ```
-python3 setup.py bdist_wheel
+bash ./build.sh
 ```
-
-It should place the wheel at `dist/`
+
+It should place the wheel at `build/`
 
 ## API Reference Guide
 
build-tools/bin/custom-build

Lines changed: 5 additions & 36 deletions
@@ -8,43 +8,12 @@ LICENSE_TXT_PATH=${BUILD_PATH}/private/LICENSE.txt
 BUILD_PATH_NEURONX_DISTRIBUTED=${BUILD_PATH}/public/NeuronxDistributed
 mkdir -p ${BUILD_PATH_NEURONX_DISTRIBUTED}
 
-# check against flake8 linter
-# Options used:
-# --max-line-length=120 is used since a lot of docstrings
-# contain lines longer than 120 that wouldn't make sense
-# to split (ex. code snippets)
-#
-# Warnings that are ignored
-# F401: unused import
-# - Reason to ignore: Side effects might occur on import.
-# Also, neuronx-cc check would trip this.
-# W503/504: newline before/after binary operator.
-# - Reason to Ignore: conditionals are often split into
-# multiple lines for readability).
-#
-# More info in the following links:
-# 1) https://flake8.pycqa.org/en/latest/user/error-codes.html
-# 2) https://pycodestyle.pycqa.org/en/latest/intro.html#error-codes
-
-FLAKE8_MSG=$(flake8 --max-line-length=120 --ignore=F401,W503,W504,E203 ${SRC_PATH}/src/neuronx_distributed || true)
-
-python3.8 -m pip install flake8==3.7
-if [[ ! -z $FLAKE8_MSG ]]
-then
-    echo "FLAKE8 LINTING HAS DETECTED FORMATTING AND POTENTIALLY SOME SYNTAX ERRORS, PLEASE CHECK ABOVE OUTPUT!"
-    exit 1
-fi
-
-if [[ "$1" == "flake8" ]]
-then
-    exit 0
-fi
-
-# # Copy Python source files
+# Copy Python source files
 cp setup.py ${BUILD_PATH_NEURONX_DISTRIBUTED}/
 cp -r src ${BUILD_PATH_NEURONX_DISTRIBUTED}/
 cp $LICENSE_TXT_PATH ${BUILD_PATH_NEURONX_DISTRIBUTED}/
 
-## Build wheel
-DIST_DIR=${BUILD_PATH}/pip/public/neuronx-distributed
-python3.8 setup.py bdist_wheel --dist-dir ${DIST_DIR}
+
+export DIST_DIR=${BUILD_PATH}/pip/public/neuronx-distributed
+
+bash build.sh

build.sh

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+#! /bin/bash
+set -e
+
+: ${DIST_DIR:=build}
+
+python3.8 -m pip install ruff
+# removing cache fails in ToD
+python3.8 -m ruff check --no-cache --line-length=120 --ignore=F401,E203
+# exit when asked to run `ruff` only
+if [[ "$1" == "ruff" ]]
+then
+    exit 0
+fi
+
+# Run static code analysis
+python3.8 -m pip install mypy
+# Install type bindings
+python3.8 -m pip install types-requests boto3-stubs[s3]
+# removing cache fails in ToD
+python3.8 -m mypy --no-incremental || true
+# exit when asked to run `mypy` only
+if [[ "$1" == "mypy" ]]
+then
+    exit 0
+fi
+
+
+
+# Build wheel
+python3.8 setup.py bdist_wheel --dist-dir ${DIST_DIR}
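
The control flow of `build.sh` can be summarized with a small Python model. This is only a sketch for reading the script, not code from the commit: the function names `resolve_dist_dir` and `planned_stages` are hypothetical. It captures two behaviors visible in the script: `: ${DIST_DIR:=build}` falls back to `build` when the variable is unset or empty, and the first positional argument (`ruff` or `mypy`) stops the script after that stage. Not modeled here: a ruff failure aborts the script because of `set -e`, while a mypy failure is tolerated through `|| true`.

```python
from typing import List, Mapping, Optional


def resolve_dist_dir(env: Mapping[str, str]) -> str:
    """Model of `: ${DIST_DIR:=build}`: default applies when unset OR empty."""
    return env.get("DIST_DIR") or "build"


def planned_stages(arg: Optional[str]) -> List[str]:
    """Return the stages build.sh would reach for a given first argument."""
    stages = ["ruff"]              # lint always runs first
    if arg == "ruff":
        return stages              # `exit 0` after linting
    stages.append("mypy")          # type check, non-blocking in the script
    if arg == "mypy":
        return stages              # `exit 0` after type checking
    stages.append("bdist_wheel")   # default: build the wheel into DIST_DIR
    return stages


if __name__ == "__main__":
    print(resolve_dist_dir({}), planned_stages(None))
```

This also shows why `build-tools/bin/custom-build` exports `DIST_DIR` before calling `bash build.sh`: the child script only applies its `build` default when no value is inherited from the environment.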
Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+import torch
+from dbrx.neuron_modeling_dbrx import (
+    NeuronDbrxConfig,
+    NeuronDbrxForCausalLM,
+    NeuronDbrxModel,
+)
+from runner import InferenceRunner
+from transformers import AutoTokenizer
+
+from neuronx_distributed.parallel_layers.checkpointing import _invoke_preshard_hook
+
+
+class DbrxRunner(InferenceRunner):
+    def load_hf_model(self):
+        config = NeuronDbrxConfig.from_pretrained(self.model_path)
+        return NeuronDbrxForCausalLM.load_hf_model(self.model_path, config)
+
+    def load_neuron_model_on_cpu(self, max_prompt_length, sequence_length, batch_size, **kwargs):
+        # On CPU we can only run tensor parallelism with degree 1
+        config = self.get_config_for_nxd(
+            batch_size,
+            1,
+            max_prompt_length=max_prompt_length,
+            sequence_length=sequence_length,
+            enable_bucketing=False,
+            **kwargs)
+        config.torch_dtype = torch.float32
+
+        self.init_ditributed_env()
+        neuron_model = NeuronDbrxModel(config)
+
+        state_dict = NeuronDbrxForCausalLM.get_state_dict(self.model_path, config)
+
+        _invoke_preshard_hook(neuron_model, state_dict)
+
+        neuron_model.load_state_dict(state_dict, strict=False)
+
+        if config.torch_dtype == torch.bfloat16:
+            neuron_model.bfloat16()
+
+        model = NeuronDbrxForCausalLM(None, config)
+        model.context_encoding_model.model = neuron_model
+        model.token_generation_model.model = neuron_model
+        return model
+
+    def load_neuron_model(self, traced_model_path):
+        config = NeuronDbrxConfig.from_pretrained(traced_model_path)
+        model = NeuronDbrxForCausalLM.from_pretrained("", config)
+
+        model.load(traced_model_path)
+        if config.torch_dtype == torch.bfloat16:
+            model.bfloat16()
+
+        return model
+
+    def load_tokenizer(self, padding_side=None):
+        tokenizer = AutoTokenizer.from_pretrained(self.tokenizer_path)
+        tokenizer.pad_token = tokenizer.unk_token
+        tokenizer.padding_side = padding_side if padding_side else self.get_padding_side()
+        return tokenizer
+
+    def get_config_cls(self):
+        return NeuronDbrxConfig
+
+    def get_model_cls(self):
+        return NeuronDbrxForCausalLM
+
+    def get_padding_side(self):
+        return "right"
+
+    def get_default_hf_generation_config_kwargs(self):
+        config = super().get_default_hf_generation_config_kwargs()
+        config['pad_token_id'] = 0
+
+        return config
+
+
+if __name__ == "__main__":
+    DbrxRunner.cmd_execute()
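
The runner above follows a hook pattern: the subclass supplies model-specific defaults such as `get_padding_side()`, and a caller-supplied argument takes precedence when given. The sketch below isolates that fallback from `load_tokenizer` so it can be read without the Neuron or `transformers` dependencies. `BaseRunner`, `RightPaddedRunner`, and `resolve_padding_side` are hypothetical stand-ins, not classes or methods from the repository.

```python
from typing import Optional


class BaseRunner:
    """Hypothetical stand-in for InferenceRunner; models only the padding hook."""

    def get_padding_side(self) -> str:
        raise NotImplementedError

    def resolve_padding_side(self, padding_side: Optional[str] = None) -> str:
        # Same expression as in load_tokenizer: an explicit, non-empty
        # argument wins; otherwise use the subclass default.
        return padding_side if padding_side else self.get_padding_side()


class RightPaddedRunner(BaseRunner):
    """Mirrors DbrxRunner's choice of right padding."""

    def get_padding_side(self) -> str:
        return "right"


if __name__ == "__main__":
    runner = RightPaddedRunner()
    print(runner.resolve_padding_side())
    print(runner.resolve_padding_side("left"))
```

Because the check is a truthiness test, an empty string is treated the same as `None` and falls back to the subclass default.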
