| 1 | +# Train and Deploy ML Project |
| 2 | + |
| 3 | +This README provides step-by-step instructions for running the training and deployment pipeline using ZenML. |
| 4 | + |
| 5 | +## Prerequisites |
| 6 | + |
| 7 | +- Git installed |
| 8 | +- Python environment set up |
| 9 | +- ZenML installed |
| 10 | +- Access to the ZenML project repository |
| 11 | + |
| 12 | +## Project Setup |
| 13 | + |
| 14 | +1. Clone the repository with the feature branch checked out, then enter it: |
| 15 | +```bash |
| 16 | +git clone --branch feature/update-train-deploy git@github.com:zenml-io/zenml-projects.git |
| 17 | +cd zenml-projects |
| 18 | +``` |
| 19 | + |
| 20 | +2. Navigate to the project directory: |
| 21 | +```bash |
| 22 | +cd train_and_deploy |
| 23 | +``` |
| 24 | + |
| 25 | +3. Initialize ZenML in the project: |
| 26 | +```bash |
| 27 | +zenml init |
| 28 | +``` |
| 29 | + |
| 30 | +## Running the Pipeline |
| 31 | + |
| 32 | +### Training |
| 33 | + |
| 34 | +You have two options for running the training pipeline: |
| 35 | + |
| 36 | +#### Option 1: Automatic via CI |
| 37 | +Push any code change to the repository. The push triggers the CI pipeline, which launches training on SkyPilot. |
| 38 | + |
| 39 | +#### Option 2: Manual Execution |
| 40 | +1. First, set up your stack. You can choose between: |
| 41 | + - Local stack (uses local orchestrator): |
| 42 | + ```bash |
| 43 | + zenml stack set LocalGitGuardian |
| 44 | + ``` |
| 45 | + - Remote stack (uses SkyPilot orchestrator): |
| 46 | + ```bash |
| 47 | + zenml stack set RemoteGitGuardian |
| 48 | + ``` |
| 49 | + |
| 50 | +2. Run the training pipeline: |
| 51 | +```bash |
| 52 | +python run.py --training |
| 53 | +``` |
| 54 | + |
| 55 | +### Model Deployment |
| 56 | + |
| 57 | +1. After training completes, deploy the model: |
| 58 | +```bash |
| 59 | +python run.py --deployment |
| 60 | +``` |
| 61 | + |
| 62 | +Note: This step deploys the model version marked "staging" (the stage configured in `target_env`) and serves it locally using BentoML. |
| 63 | + |
| 64 | +2. Test the deployed model: |
| 65 | +```bash |
| 66 | +python run.py --inference |
| 67 | +``` |
| 68 | + |
| 69 | +### Production Deployment |
| 70 | + |
| 71 | +If the staging model performs well and you want to proceed with production deployment: |
| 72 | + |
| 73 | +1. Deploy to Kubernetes: |
| 74 | +```bash |
| 75 | +python run.py --production |
| 76 | +``` |
| 77 | +This pipeline will: |
| 78 | +- Build a Docker image from the BentoML service |
| 79 | +- Deploy it to Kubernetes |
| 80 | + |
| 81 | +## Additional Resources |
| 82 | + |
| 83 | +- [ZenML Projects Tenant Dashboard](https://cloud.zenml.io/organizations/fc992c14-d960-4db7-812e-8f070c99c6f0/tenants/12ec0fd2-ed02-4479-8ff9-ecbfbaae3285) |
| 84 | +- [Example GitHub Actions Pipeline](https://github.com/zenml-io/zenml-projects/actions/runs/12075854945/job/33676323427) |
| 85 | + |
| 86 | +## Pipeline Flow Overview |
| 87 | + |
| 88 | +1. Training → Creates and trains the model |
| 89 | +2. Deployment → Deploys model to staging environment (local BentoML) |
| 90 | +3. Inference → Tests the deployed model |
| 91 | +4. Production → Deploys to production Kubernetes environment |
| 92 | + |
| 93 | +## Notes |
| 94 | + |
| 95 | +- The deployment configurations are controlled by the `target_env` setting in the configs |
| 96 | +- Make sure you have the necessary permissions and access rights before running the pipelines |
| 97 | +- Monitor the CI/CD pipeline in GitHub Actions when using automatic deployment |
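
The stages listed under Pipeline Flow Overview can also be chained from a small helper script. The following is a minimal sketch, not part of the project: `build_commands` and `run_all` are hypothetical names, the entry point is assumed to be a file called `run.py` in the project directory, and the stage flags and stack name are the ones used in this README. By default it stops after inference, so the production step stays a deliberate choice, and it only prints the commands unless `dry_run=False` is passed.

```python
import subprocess
import sys

# Stage flags, in the order given in the pipeline flow overview.
STAGES = ["training", "deployment", "inference", "production"]


def build_commands(stack=None, last_stage="inference", script="run.py"):
    """Return the commands to run, in order, up to and including last_stage.

    `script` is an assumption: adjust it if the entry point has another name.
    """
    if last_stage not in STAGES:
        raise ValueError(f"unknown stage: {last_stage!r}")
    commands = []
    if stack is not None:
        # Select the ZenML stack first, e.g. LocalGitGuardian.
        commands.append(["zenml", "stack", "set", stack])
    for stage in STAGES[: STAGES.index(last_stage) + 1]:
        commands.append([sys.executable, script, f"--{stage}"])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; also execute it unless dry_run is True."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the chain at the first failing stage.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(build_commands(stack="LocalGitGuardian"), dry_run=True)
```

To include the Kubernetes deployment once the staging model looks good, pass `last_stage="production"` to `build_commands`.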