 # SHARK Turbine
 
-
+This repo is Nod-AI's integration repository for various model bringup
+activities and CI. In 2023 and early 2024, it played a different role,
+serving as the place where the FX/Dynamo-based torch-mlir and IREE
+toolsets were developed, including:
 
-Turbine is the set of development tools that the [SHARK Team](https://github.com/nod-ai/SHARK)
-is building for deploying all of our models to the cloud and devices. We are
-building it as we transition from our TorchScript-era one-off export and
-compilation flows to a unified approach based on PyTorch 2 and Dynamo. While we
-use it heavily ourselves, it is intended to be a general-purpose model
-compilation and execution tool.
+* [Torch-MLIR FxImporter](https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/extras/fx_importer.py)
+* [Torch-MLIR ONNX Importer](https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/extras/onnx_importer.py)
+* [Torch-MLIR's ONNX C Importer](https://github.com/llvm/torch-mlir/tree/main/projects/onnx_c_importer)
+* [IREE Turbine](https://github.com/iree-org/iree-turbine)
+* [Sharktank and Shortfin](https://github.com/nod-ai/sharktank)
 
-Turbine provides a collection of tools:
+As these have all found upstream homes, this repo is a bit bare. We will
+continue to use it as a staging ground for things that don't have a
+more defined spot and as a way to drive certain kinds of upstreaming
+activities.
 
-* *AOT Export*: For compiling one or more `nn.Module`s into deployment-ready
-  artifacts. This operates via both a simple one-shot export API (already
-  upstreamed to [torch-mlir](https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/extras/fx_importer.py))
-  for simple models and an underlying [advanced API](https://github.com/nod-ai/SHARK-Turbine/blob/main/core/shark_turbine/aot/compiled_module.py)
-  for complicated models and for accessing the full features of the runtime.
-* *Eager Execution*: A `torch.compile` backend is provided, and a Turbine Tensor/Device
-  is available for more native, interactive use within a PyTorch session.
-* *Turbine Kernels*: (coming soon) A union of the [Triton](https://github.com/openai/triton)
-  approach and [Pallas](https://jax.readthedocs.io/en/latest/pallas/index.html), but
-  based on native PyTorch constructs and tracing. It is intended to complement the
-  above for simple cases where direct emission to the underlying, cross-platform
-  vector programming model is desirable.
-* *Turbine-LLM*: A repository of layers, model recipes, and conversion tools for
-  importing from popular Large Language Model (LLM) quantization tooling.
 
-Under the covers, Turbine is based heavily on [IREE](https://github.com/openxla/iree) and
-[torch-mlir](https://github.com/llvm/torch-mlir), and we use it to drive the
-evolution of both, upstreaming infrastructure as it becomes timely to do so.
+## Current Projects
 
-See [the roadmap](docs/roadmap.md) for upcoming work and places to contribute.
+### turbine-models
 
-## Contact Us
+The `turbine-models` project (under `models/`) contains ports and adaptations
+of various (mostly Hugging Face) models that we use in a variety of ways.
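+
+As a rough illustration of the workflow these ports follow (a minimal sketch,
+not code from this repo: `SmallModel` is a hypothetical stand-in for a real HF
+port, and the one-shot `aot.export` path shown is Turbine's simple export API):
+
+```python
+# Minimal sketch: exporting a toy module the way turbine-models drives real
+# HF ports. SmallModel is a hypothetical stand-in, not an actual model here.
+import torch
+import shark_turbine.aot as aot
+
+
+class SmallModel(torch.nn.Module):
+    def __init__(self):
+        super().__init__()
+        self.linear = torch.nn.Linear(64, 10)
+
+    def forward(self, x):
+        return self.linear(x)
+
+
+# One-shot export yields Torch-dialect MLIR that IREE can compile.
+exported = aot.export(SmallModel(), torch.empty(4, 64))
+exported.save_mlir("small_model.mlir")
+```
+
+The resulting MLIR can then be fed to IREE for compilation to a deployable artifact.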
 
-Turbine is under active development. If you would like to participate as it comes online,
-please reach out to us on the `#turbine` channel of the
-[nod-ai Discord server](https://discord.gg/QMmR6f8rGb).
+### CI
 
-## Quick Start for Users
+Integration CI for a variety of projects is rooted in this repo.
 
-1. Install from source:
-
-```
-pip install shark-turbine
-# Or for editable: see instructions under developers
-```
-
-The above installs some unnecessary CUDA/cuDNN packages for CPU use. To avoid
-this, you can specify pytorch-cpu and install via:
-```
-pip install -r core/pytorch-cpu-requirements.txt
-pip install shark-turbine
-```
-
-(or follow the "Developers" instructions below for installing from head/nightly)
-
-2. Try one of the samples:
-
-Generally, we use Turbine to produce valid, dynamically shaped Torch IR (from the
-[torch-mlir `torch` dialect](https://github.com/llvm/torch-mlir/tree/main/include/torch-mlir/Dialect/Torch/IR),
-with various approaches to handling globals). Depending on the use case and the
-status of the compiler, these should be compilable via IREE with
-`--iree-input-type=torch` for end-to-end execution. Dynamic shape support in
-torch-mlir is a work in progress, and not everything works at head with release
-binaries at present.
-
- * [AOT MLP With Static Shapes](https://github.com/nod-ai/SHARK-Turbine/blob/main/core/examples/aot_mlp/mlp_export_simple.py)
- * [AOT MLP with a dynamic batch size](https://github.com/nod-ai/SHARK-Turbine/blob/main/core/examples/aot_mlp/mlp_export_dynamic.py)
- * [AOT llama2](https://github.com/nod-ai/SHARK-Turbine/blob/main/core/examples/llama2_inference/llama2.ipynb):
-   Dynamic sequence length custom compiled module with state management internal to the model.
- * [Eager MNIST with `torch.compile`](https://github.com/nod-ai/SHARK-Turbine/blob/main/core/examples/eager_mlp/mlp_eager_simple.py)
-
-## Developers
-
-### Getting Up and Running
-
-If you are only looking to develop against this project, then you just need to
-install Python deps for the following:
-
-* PyTorch
-* iree-compiler (with Torch input support)
-* iree-runtime
-
-The pinned deps at HEAD require pre-release versions of all of the above and
-therefore need additional pip flags to install. To make development setup easy,
-we provide a `requirements.txt` file that pins precise versions and includes
-all of the needed flags. It can be installed prior to the package:
-
-Installing into a venv is highly recommended.
-
-```
-pip install -r core/pytorch-cpu-requirements.txt
-pip install --upgrade -r core/requirements.txt
-pip install --upgrade -e "core[torch-cpu-nightly,testing]"
-```
-
-Run tests:
-
-```
-pytest core/
-```
-
-### Using a development compiler
-
-If you are doing native development of the compiler, it can be useful to switch
-to source builds of iree-compiler and iree-runtime.
-
-In order to do this, check out [IREE](https://github.com/openxla/iree) and
-follow the instructions to [build from source](https://iree.dev/building-from-source/getting-started/),
-making sure to specify [additional options for the Python bindings](https://iree.dev/building-from-source/getting-started/#building-with-cmake):
-
-```
--DIREE_BUILD_PYTHON_BINDINGS=ON -DPython3_EXECUTABLE="$(which python)"
-```
-
-#### Configuring Python
-
-Uninstall existing packages:
-
-```
-pip uninstall iree-compiler
-pip uninstall iree-runtime
-```
-
-Copy the `.env` file from `iree/` to this source directory to get IDE support,
-and source it to set your Python path for use from your shell:
-
-```
-source .env && export PYTHONPATH
-```