
Commit 8b8bb8d

wjayesh and strickvl authored

Merge vLLM deployer project to llm-finetuning (#163)

* use the official huggingface integration
* add option to deploy to vllm
* add vllm deployer step
* fix readme and add deployment target option
* test config
* test
* fix vllm integration
* rm llm-vllm project
* uncomment config file
* Update dependencies and GCP library versions in requirements.txt
* stop overwriting the config for dataset generation
* Configure Hugging Face cache directories for dataset preparation
* Update dataset name from htahir1 to zenml namespace in configuration files
* Update README with ZenML namespace dataset and repository links
* change hf repo to zenml
* use uv
* fix syntax
* update deprecated log metadata command

Co-authored-by: Alex Strick van Linschoten <[email protected]>
Co-authored-by: Alex Strick van Linschoten <[email protected]>

1 parent: dd8c545

31 files changed: +86 −1056 lines

databricks-production-qa-demo/steps/deployment/deployment_deploy.py

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@ def deployment_deploy() -> (
     model_deployer = zenml_client.active_stack.model_deployer
     databricks_deployment_config = DatabricksDeploymentConfig(
         model_name=model.name,
-        model_version=model.run_metadata["model_registry_version"].value,
+        model_version=model.run_metadata["model_registry_version"],
         workload_size="Small",
         workload_type="CPU",
         scale_to_zero_enabled=True,
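The one-line fix above reflects a change in how `run_metadata` entries are read: the value is now returned directly rather than through a `.value` attribute on a wrapper object. A minimal illustration of the two access patterns, using hypothetical stand-in classes rather than the actual ZenML types:

```python
# Hypothetical stand-ins sketching the access change; these are not
# ZenML classes, just an illustration of wrapped vs. plain metadata.
class MetadataValue:
    """Old style: the raw value sits behind a .value attribute."""
    def __init__(self, value):
        self.value = value

old_run_metadata = {"model_registry_version": MetadataValue("3")}
new_run_metadata = {"model_registry_version": "3"}

# Old access pattern needed the extra attribute lookup:
assert old_run_metadata["model_registry_version"].value == "3"
# New access pattern returns the value directly:
assert new_run_metadata["model_registry_version"] == "3"
```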

llm-finetuning/README.md

Lines changed: 6 additions & 6 deletions
@@ -69,16 +69,16 @@ The three pipelines can be run using the CLI:
 
 ```shell
 # Data generation
-python run.py --feature-engineering --config <NAME_OF_CONFIG_IN_CONFIGS_FOLDER>
-python run.py --feature-engineering --config generate_code_dataset.yaml
+python run.py --feature-pipeline --config <NAME_OF_CONFIG_IN_CONFIGS_FOLDER>
+python run.py --feature-pipeline --config generate_code_dataset.yaml
 
 # Training
 python run.py --training-pipeline --config <NAME_OF_CONFIG_IN_CONFIGS_FOLDER>
 python run.py --training-pipeline --config finetune_gcp.yaml
 
 # Deployment
-python run.py --deployment-pipeline --config <NAME_OF_CONFIG_IN_CONFIGS_FOLDER>
-python run.py --deployment-pipeline --config deployment_a100.yaml
+python run.py --deploy-pipeline --config <NAME_OF_CONFIG_IN_CONFIGS_FOLDER>
+python run.py --deploy-pipeline --config deployment_a100.yaml
 ```
 
 The `feature_engineering` and `deployment` pipeline can be run simply with the `default` stack, but the training pipelines [stack](https://docs.zenml.io/user-guide/production-guide/understand-stacks) will depend on the config.
@@ -127,7 +127,7 @@ python run.py --deployment-pipeline --config deployment_a100.yaml
 
 A working prototype has been trained and deployed as of Jan 19 2024. The model is using minimal data and finetuned using QLoRA and PEFT. The model was trained using 1 A100 GPU on the cloud:
 
-- Training dataset [Link](https://huggingface.co/datasets/htahir1/zenml-codegen-v1)
+- Training dataset [Link](https://huggingface.co/datasets/zenml/zenml-codegen-v1)
 - PEFT Model [Link](https://huggingface.co/htahir1/peft-lora-zencoder15B-personal-copilot/)
 - Fully merged model (Ready to deploy on HuggingFace Inference Endpoints) [Link](https://huggingface.co/htahir1/peft-lora-zencoder15B-personal-copilot-merged)
 
@@ -147,7 +147,7 @@ The [ZenML Pro](https://zenml.io/pro) was used to manage the pipelines, models,
 
 This project recently did a [call of volunteers](https://www.linkedin.com/feed/update/urn:li:activity:7150388250178662400/). This TODO list can serve as a source of collaboration. If you want to work on any of the following, please [create an issue on this repository](https://github.com/zenml-io/zenml-projects/issues) and assign it to yourself!
 
-- [x] Create a functioning data generation pipeline (initial dataset with the core [ZenML repo](https://github.com/zenml-io/zenml) scraped and pushed [here](https://huggingface.co/datasets/htahir1/zenml-codegen-v1))
+- [x] Create a functioning data generation pipeline (initial dataset with the core [ZenML repo](https://github.com/zenml-io/zenml) scraped and pushed [here](https://huggingface.co/datasets/zenml/zenml-codegen-v1))
 - [x] Deploy the model on a HuggingFace inference endpoint and use it in the [VS Code Extension](https://github.com/huggingface/llm-vscode#installation) using a deployment pipeline.
 - [x] Create a functioning training pipeline.
 - [ ] Curate a set of 5-10 repositories that are using the ZenML latest syntax and use data generation pipeline to push dataset to HuggingFace.

llm-finetuning/configs/deployment_a10.yaml

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 settings:
   docker:
     requirements: requirements.txt
+    python_package_installer: "uv"
 
 model:
   name: peft-lora-zencoder15B-personal-copilot
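The `python_package_installer: "uv"` line added across these configs selects uv as the package installer when ZenML builds the pipeline's Docker image. A configuration sketch of the in-code equivalent, assuming ZenML's `DockerSettings` exposes the same field names as the YAML keys above (not verified here):

```python
# Configuration sketch only: mirrors the YAML docker settings above.
# Assumes ZenML's DockerSettings accepts these keyword arguments.
from zenml.config import DockerSettings

docker_settings = DockerSettings(
    requirements="requirements.txt",
    python_package_installer="uv",  # build images with uv instead of pip
)
```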

llm-finetuning/configs/deployment_a100.yaml

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 settings:
   docker:
     requirements: requirements.txt
+    python_package_installer: "uv"
 
 model:
   name: "peft-lora-zencoder15B-personal-copilot"

llm-finetuning/configs/deployment_t4.yaml

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 # environment configuration
 settings:
   docker:
+    python_package_installer: "uv"
     requirements: requirements.txt
 
 model:

llm-finetuning/configs/finetune_aws.yaml

Lines changed: 5 additions & 4 deletions
@@ -2,11 +2,12 @@
 settings:
   docker:
     requirements: requirements.txt
+    python_package_installer: "uv"
 
 model:
   name: "peft-lora-zencoder15B-personal-copilot"
   description: "Fine-tuned `starcoder15B-personal-copilot-A100-40GB-colab` for ZenML pipelines."
-  audience: "Data Scientists / ML Engineers"
+  audience: "Data Scientists / ML Engineers"
   use_cases: "Code Generation for ZenML MLOps pipelines."
   limitations: "There is no guarantee that this model will work for your use case. Please test it thoroughly before using it in production."
   trade_offs: "This model is optimized for ZenML pipelines. It is not optimized for other libraries."
@@ -23,13 +24,13 @@ steps:
     step_operator: sagemaker-eu
     settings:
       step_operator.sagemaker:
-        estimator_args:
+        estimator_args:
           instance_type: "ml.p4d.24xlarge"
 
     parameters:
       args:
         model_path: "bigcode/starcoder"
-        dataset_name: "htahir1/zenml-codegen-v1"
+        dataset_name: "zenml/zenml-codegen-v1"
         subset: "data"
         data_column: "content"
         split: "train"
@@ -58,4 +59,4 @@ steps:
         use_4bit_qunatization: true
         use_nested_quant: true
         bnb_4bit_compute_dtype: "bfloat16"
-        output_peft_repo_id: "htahir1/peft-lora-zencoder15B-personal-copilot"
+        output_peft_repo_id: "zenml/peft-lora-zencoder15B-personal-copilot"

llm-finetuning/configs/finetune_gcp.yaml

Lines changed: 4 additions & 3 deletions
@@ -2,11 +2,12 @@
 settings:
   docker:
     requirements: requirements.txt
+    python_package_installer: "uv"
 
 model:
   name: "peft-lora-zencoder15B-personal-copilot"
   description: "Fine-tuned `starcoder15B-personal-copilot-A100-40GB-colab` for ZenML pipelines."
-  audience: "Data Scientists / ML Engineers"
+  audience: "Data Scientists / ML Engineers"
   use_cases: "Code Generation for ZenML MLOps pipelines."
   limitations: "There is no guarantee that this model will work for your use case. Please test it thoroughly before using it in production."
   trade_offs: "This model is optimized for ZenML pipelines. It is not optimized for other libraries."
@@ -29,7 +30,7 @@ steps:
     parameters:
       args:
         model_path: "bigcode/starcoder"
-        dataset_name: "htahir1/zenml-codegen-v1"
+        dataset_name: "zenml/zenml-codegen-v1"
         subset: "data"
         data_column: "content"
         split: "train"
@@ -58,4 +59,4 @@ steps:
         use_4bit_qunatization: true
         use_nested_quant: true
         bnb_4bit_compute_dtype: "bfloat16"
-        output_peft_repo_id: "htahir1/peft-lora-zencoder15B-personal-copilot"
+        output_peft_repo_id: "zenml/peft-lora-zencoder15B-personal-copilot"

llm-finetuning/configs/finetune_local.yaml

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 settings:
   docker:
     requirements: requirements.txt
+    python_package_installer: "uv"
 
 model:
   name: "peft-lora-zencoder15B-personal-copilot"
Lines changed: 8 additions & 2 deletions
@@ -1,14 +1,20 @@
 # environment configuration
 settings:
   docker:
+    python_package_installer: "uv"
     requirements: requirements.txt
+    apt_packages:
+      - git
+    environment:
+      HF_HOME: "/tmp/huggingface"
+      HF_HUB_CACHE: "/tmp/huggingface"
 
 # pipeline configuration
 parameters:
-  dataset_id: htahir1/zenml-codegen-v1
+  dataset_id: zenml/zenml-codegen-v1
 
 steps:
   mirror_repositories:
     parameters:
       repositories:
-        - zenml
+        - zenml
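The `HF_HOME` / `HF_HUB_CACHE` entries above redirect the Hugging Face cache during dataset preparation. A minimal sketch of the same idea in Python; the variable names come straight from the config, and the note that they must be set before the libraries import is an assumption about typical Hugging Face cache resolution:

```python
import os

# Point the Hugging Face cache directories at a writable temp location,
# mirroring the `environment:` block in the config above. These should
# be set before transformers/datasets are imported so they are picked up.
os.environ["HF_HOME"] = "/tmp/huggingface"
os.environ["HF_HUB_CACHE"] = "/tmp/huggingface"

assert os.environ["HF_HOME"] == "/tmp/huggingface"
```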

llm-finetuning/huggingface/__init__.py

Whitespace-only changes.
