Commit b30f4b6

Merge pull request #50 from replicatedhq/diamonwiggins/improve-mlflow-docs
2 parents 7e9345b + 6056fd1 commit b30f4b6

5 files changed, +185 -89 lines changed

.github/workflows/mlflow-ci.yml

Lines changed: 10 additions & 2 deletions
@@ -61,6 +61,14 @@ jobs:
           task lint
           task template

+      - name: Upload rendered templates
+        if: failure()
+        uses: actions/upload-artifact@v4
+        with:
+          name: mlflow-rendered-templates
+          path: applications/mlflow/charts/.rendered-templates/
+          retention-days: 7
+
       - name: Check Version Consistency
         working-directory: applications/mlflow
         run: |
@@ -234,7 +242,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v4
         with:
-          python-version: 3.13
+          python-version: 3.12

       - name: Install Task
         uses: arduino/setup-task@v1
@@ -392,7 +400,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v4
         with:
-          python-version: 3.13
+          python-version: 3.12

       - name: Install Task
         uses: arduino/setup-task@v1

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -31,6 +31,10 @@ Thumbs.db
 *.pyd
 __pycache__/

+# Mlflow specific
+applications/mlflow/tests/.venv/
+**/charts/.rendered-templates/
+
 # wg-easy specific
 *.kubeconfig
 applications/wg-easy/release/

applications/mlflow/DEVELOPMENT.md

Lines changed: 1 addition & 38 deletions
@@ -18,7 +18,6 @@ Follow this workflow for development:

 1. Add required Helm repositories and update dependencies:
    ```bash
-   task add:repos:helm
    task update:deps:helm
    ```

@@ -56,42 +55,6 @@ Follow this workflow for development:

 This workflow allows rapid iteration without needing to publish to the Replicated registry.

-### Task Reference
-
-Tasks follow a `verb:resource[:subresource]` naming convention for clarity:
-
-```bash
-# Validation and verification
-task lint # Lint Helm charts
-task template # Render templates to stdout (SDK disabled)
-task check:versions # Verify Chart.yaml and KOTS manifest versions match
-
-# Repository and dependency management
-task add:repos:helm # Add required Helm repositories
-task update:deps:helm # Update Helm chart dependencies
-
-# Packaging and versioning
-task update:versions:chart # Update chart version refs in KOTS manifests
-task package:charts # Package Helm charts for distribution
-task extract:version:chart # Extract current MLflow chart version
-
-# Installation
-task install:helm:local # Install charts for local development (SDK disabled)
-
-# Testing
-task test:install:helm # Test with charts from Replicated registry
-task test:install:kots # Test KOTS installation
-task run:tests:app # Run application tests against running MLflow
-task run:tests:all # Run all tests (Helm install + app tests)
-
-# Release management
-task create:release # Create a Replicated release
-
-# Cleanup
-task clean:files:charts # Clean packaged chart files
-task clean:all # Clean all generated files
-```
-
 ## Releasing

 ### Updating Documentation
@@ -212,4 +175,4 @@ The pipeline is triggered on:
 - Pull requests affecting the MLflow application
 - Pushes to the main branch

-For more details, see the workflow definition in [.github/workflows/mlflow-ci.yml](../../.github/workflows/mlflow-ci.yml).
+For more details, see the workflow definition in [.github/workflows/mlflow-ci.yml](../../.github/workflows/mlflow-ci.yml).

applications/mlflow/README.md

Lines changed: 0 additions & 30 deletions
@@ -28,36 +28,6 @@ helm registry login registry.replicated.com --username=<license-id>
 helm install mlflow oci://registry.replicated.com/mlflow/stable
 ```

-### Embedded Cluster
-
-For customers without an existing Kubernetes cluster, the embedded option provides:
-- Integrated Kubernetes cluster managed by Replicated
-- Simple installation on VMs or bare metal
-- No Kubernetes expertise required
-- Optimized resource usage
-
-```bash
-# Download installer from the provided license URL
-# Run the installer script
-bash ./install.sh
-```
-
-### KOTS Existing Cluster
-
-For customers with existing Kubernetes clusters, the KOTS installation method provides:
-- Admin console for application management
-- Version updates with rollback capability
-- Configuration validation
-- Pre-flight checks to verify environment requirements
-
-```bash
-# Install KOTS CLI
-curl https://kots.io/install | bash
-
-# Install MLflow with KOTS
-kubectl kots install mlflow/stable
-```
-
 ## Documentation

 - [MLflow Helm Chart Documentation](./charts/mlflow/README.md) - Installation and configuration details

applications/mlflow/Taskfile.yml

Lines changed: 170 additions & 19 deletions
@@ -148,19 +148,36 @@ tasks:

   # Template rendering
   template:
-    desc: Template Helm charts with Replicated SDK disabled and output to stdout
+    desc: Template Helm charts with standard configuration and output to a directory
     deps: [add:repos:helm, update:deps:helm]
     cmds:
-      - echo "Templating Helm charts with Replicated SDK disabled..."
+      - echo "Templating Helm charts..."
+      - |
+        # Create templates directory if it doesn't exist
+        TEMPLATES_DIR="{{.CHART_DIR}}/.rendered-templates"
+        echo "Creating templates directory: $TEMPLATES_DIR"
+        mkdir -p "$TEMPLATES_DIR"
+
+        # Clean up any previous templates
+        echo "Cleaning up previous templates..."
+        rm -rf "$TEMPLATES_DIR"/*
       - for: { var: CHARTS }
         cmd: |
           echo "=== Rendering templates for {{.ITEM}} chart ==="
           echo "==============================================="
-          helm template {{.CHART_DIR}}/{{.ITEM}} --debug
-          echo ""
+
+          # Create directory for this chart
+          CHART_TEMPLATES_DIR="{{.CHART_DIR}}/.rendered-templates/{{.ITEM}}"
+          mkdir -p "$CHART_TEMPLATES_DIR"
+
+          # Render templates to file with default values
+          helm template {{.CHART_DIR}}/{{.ITEM}} --output-dir "$CHART_TEMPLATES_DIR" --debug
+
+          # Also output to stdout for visibility
+          echo "Templates written to: $CHART_TEMPLATES_DIR"
           echo "=== End of templates for {{.ITEM}} chart ==="
           echo ""
-      - echo "All chart templates have been output to stdout."
+      - echo "All chart templates have been output to {{.CHART_DIR}}/.rendered-templates"

   # Version update for packaged charts
   update:versions:chart:
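
Review note: the clean-then-render flow that this hunk adds to the `template` task can be tried outside of Task with a small shell function. This is a minimal sketch, not the Taskfile's implementation. The function name `render_charts` and the `RENDER_CMD` override are assumptions introduced here so the loop can run without Helm installed; only `helm template <chart> --output-dir <dir>` comes from the diff.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the clean-then-render flow of the `template` task.
# RENDER_CMD is a hypothetical override (not in the Taskfile); it defaults
# to the real `helm` binary.
render_charts() {
  chart_dir="$1"
  shift
  out_root="$chart_dir/.rendered-templates"

  # Create the output root, then clear what a previous run left behind.
  # ${out_root:?} guards against ever expanding to a bare "/*".
  mkdir -p "$out_root"
  rm -rf "${out_root:?}"/*

  for chart in "$@"; do
    out="$out_root/$chart"
    mkdir -p "$out"
    # --output-dir makes `helm template` write manifests to files
    # instead of stdout.
    "${RENDER_CMD:-helm}" template "$chart_dir/$chart" --output-dir "$out" || return 1
    echo "Templates written to: $out"
  done
}
```

With Helm available, a call such as `render_charts applications/mlflow/charts mlflow` would write manifests under `applications/mlflow/charts/.rendered-templates/mlflow`, assuming a chart directory named `mlflow`. Note that `rm -rf "$dir"/*` leaves dotfiles in place, which is harmless here because the directory is only written by the render step.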
@@ -386,10 +403,16 @@ tasks:
       - rm -f {{.KOTS_DIR}}/*.tgz
       - echo "Chart packages cleaned from {{.KOTS_DIR}}"

+  clean:files:templates:
+    desc: Clean rendered templates directory
+    cmds:
+      - rm -rf {{.CHART_DIR}}/.rendered-templates
+      - echo "Rendered templates cleaned from {{.CHART_DIR}}/.rendered-templates"
+
   # Main clean task
   clean:all:
     desc: Clean all generated files
-    deps: [clean:files:charts]
+    deps: [clean:files:charts, clean:files:templates]
     cmds:
       - echo "All generated files cleaned successfully"

@@ -894,24 +917,152 @@ tasks:
   run:tests:app:
     desc: Run application tests against the running MLflow service
     cmds:
-      - echo "Running application tests against MLflow on localhost:{{.PORT}}..."
+      - echo "Running MLflow application tests against localhost:{{.PORT}}..."
       - |
-        # Check if running inside a virtual environment already
-        if [ -z "$VIRTUAL_ENV" ]; then
+        # Detect if we're running in a CI environment
+        if [ "{{.CI}}" = "true" ]; then
+          echo "📦 Running in CI environment - using direct package installation..."
+
+          # In CI, we just install packages directly without using a virtual environment
           echo "Installing Python dependencies directly..."
-          # Try to use binary wheels whenever possible
           python -m pip install --upgrade pip wheel setuptools
-          # Install the required packages directly
-          python -m pip install mlflow numpy pandas scikit-learn pytest requests
+
+          # Install required packages directly
+          echo "Installing MLflow and test dependencies..."
+          python -m pip install "mlflow>=2.8.0,<3.0.0" "numpy>=1.24.0" "pandas>=2.0.0" "scikit-learn>=1.2.0" pytest requests
+
+          # Run the tests directly
+          echo "🧪 Running MLflow application tests..."
+          if python {{.TESTS_DIR}}/mlflow_test.py localhost:{{.PORT}} --protocol http --connection-timeout 180 --debug; then
+            echo "✅ All tests passed successfully!"
+          else
+            TEST_EXIT_CODE=$?
+            echo "❌ Tests failed with exit code: $TEST_EXIT_CODE"
+            exit $TEST_EXIT_CODE
+          fi
         else
-          echo "Running in virtual environment $VIRTUAL_ENV, skipping dependency installation"
+          # For local development, use a persistent virtual environment for better isolation and speed
+          echo "🔧 Setting up Python test environment..."
+          TEST_ENV_DIR="{{.TESTS_DIR}}/.venv"
+
+          # Create virtual environment if it doesn't exist
+          if [ ! -d "$TEST_ENV_DIR" ]; then
+            echo " Creating new Python environment (first-time setup)..."
+            python3 -m venv "$TEST_ENV_DIR" || {
+              echo "❌ Failed to create Python virtual environment."
+              echo " Please ensure python3 and python3-venv are installed."
+              echo " On Ubuntu/Debian: sudo apt-get install python3-venv"
+              echo " On macOS: brew install python3"
+              exit 1
+            }
+            FRESH_ENV=true
+          else
+            echo " Using existing Python environment from $TEST_ENV_DIR"
+            FRESH_ENV=false
+          fi
+
+          # Determine the correct activation script based on shell
+          if [ -f "$TEST_ENV_DIR/bin/activate" ]; then
+            ACTIVATE_SCRIPT="$TEST_ENV_DIR/bin/activate"
+          elif [ -f "$TEST_ENV_DIR/Scripts/activate" ]; then
+            ACTIVATE_SCRIPT="$TEST_ENV_DIR/Scripts/activate"
+          else
+            echo "❌ Unable to find activation script for virtual environment"
+            exit 1
+          fi
+
+          # Activate the virtual environment
+          echo " Activating test environment..."
+          source "$ACTIVATE_SCRIPT" || {
+            echo "❌ Failed to activate virtual environment."
+            echo " Trying alternative approach..."
+
+            # Alternative approach using python -m venv approach
+            echo " Using python directly from the venv bin directory..."
+            VENV_PYTHON="$TEST_ENV_DIR/bin/python"
+            if [ ! -f "$VENV_PYTHON" ]; then
+              if [ -f "$TEST_ENV_DIR/Scripts/python.exe" ]; then
+                VENV_PYTHON="$TEST_ENV_DIR/Scripts/python.exe"
+              else
+                echo "❌ Cannot find python in the virtual environment."
+                echo " Falling back to system Python..."
+                VENV_PYTHON="python"
+              fi
+            fi
+
+            # Install using the venv python directly
+            echo " Installing dependencies using $VENV_PYTHON..."
+            "$VENV_PYTHON" -m pip install --upgrade pip wheel setuptools
+            "$VENV_PYTHON" -m pip install "mlflow>=2.8.0,<3.0.0" "numpy>=1.24.0" "pandas>=2.0.0" "scikit-learn>=1.2.0" pytest requests
+
+            # Run the tests using venv python
+            echo "🧪 Running MLflow application tests..."
+            if "$VENV_PYTHON" {{.TESTS_DIR}}/mlflow_test.py localhost:{{.PORT}} --protocol http --connection-timeout 180 --debug; then
+              echo "✅ All tests passed successfully!"
+            else
+              TEST_EXIT_CODE=$?
+              echo "❌ Tests failed with exit code: $TEST_EXIT_CODE"
+              exit $TEST_EXIT_CODE
+            fi
+
+            echo "💡 Environment is persistent for faster future runs."
+            echo " To force dependency updates: FORCE_DEPS_UPDATE=yes task run:tests:app"
+            echo " To clean up environment: task clean:venv"
+
+            # Exit early since we've already run the tests
+            exit 0
+          }
+
+          # Only install/upgrade packages if it's a fresh environment or forced
+          if [ "$FRESH_ENV" = true ] || [ "${FORCE_DEPS_UPDATE:-no}" = "yes" ]; then
+            # Install dependencies with detailed progress
+            echo "🔄 Installing required dependencies..."
+            echo " Upgrading package tools..."
+            python -m pip install --upgrade pip wheel setuptools &> "$TEST_ENV_DIR/pip-upgrade.log" || {
+              echo "❌ Failed to upgrade pip/wheel/setuptools."
+              echo " See error log at: $TEST_ENV_DIR/pip-upgrade.log"
+              cat "$TEST_ENV_DIR/pip-upgrade.log"
+              exit 1
+            }
+
+            echo " Installing MLflow and test dependencies (this may take a minute)..."
+            # Install all dependencies with a single command to resolve dependency conflicts properly
+            python -m pip install "mlflow>=2.8.0,<3.0.0" "numpy>=1.24.0" "pandas>=2.0.0" "scikit-learn>=1.2.0" pytest requests &> "$TEST_ENV_DIR/pip-install.log" || {
+              echo "❌ Failed to install dependencies."
+              echo " See error log at: $TEST_ENV_DIR/pip-install.log"
+              echo " Common issues:"
+              echo " - Python version compatibility"
+              echo " - Network connectivity problems"
+              echo " - System package dependencies missing"
+              echo ""
+              echo "Error details:"
+              tail -n 20 "$TEST_ENV_DIR/pip-install.log"
+              exit 1
+            }
+
+            # Show the installed versions
+            echo "✅ Successfully installed dependencies:"
+            python -m pip list | grep -E "mlflow|numpy|pandas|scikit-learn|pytest|requests"
+          else
+            echo "🔍 Using existing dependencies (use FORCE_DEPS_UPDATE=yes to update)"
+          fi
+
+          # Run the tests with proper error handling
+          echo "🧪 Running MLflow application tests..."
+          if python {{.TESTS_DIR}}/mlflow_test.py localhost:{{.PORT}} --protocol http --connection-timeout 180 --debug; then
+            echo "✅ All tests passed successfully!"
+          else
+            TEST_EXIT_CODE=$?
+            echo "❌ Tests failed with exit code: $TEST_EXIT_CODE"
+            echo " Check the test output above for details."
+            exit $TEST_EXIT_CODE
+          fi
+
+          # Note about cleaning up
+          echo "💡 Environment is persistent for faster future runs."
+          echo " To force dependency updates: FORCE_DEPS_UPDATE=yes task run:tests:app"
+          echo " To clean up environment: task clean:venv"
         fi
-
-        echo "Running MLflow application tests"
-        python {{.TESTS_DIR}}/mlflow_test.py localhost:{{.PORT}} \
-          --protocol http \
-          --connection-timeout 180 \
-          --debug

   # All tests task
   run:tests:all:
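
Review note: the new `run:tests:app` script repeats three small decisions (whether to reinstall dependencies, where the venv interpreter lives, and how to propagate the test exit code). The sketch below isolates them as standalone functions so each can be checked on its own. The helper names `needs_install`, `venv_python`, and `run_and_report` are hypothetical and do not appear in the Taskfile; the conditions they wrap are copied from the diff.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of the Taskfile.

# Succeeds when dependencies should be (re)installed: either the venv was
# just created ($1 = true) or the caller set FORCE_DEPS_UPDATE=yes.
needs_install() {
  [ "$1" = true ] || [ "${FORCE_DEPS_UPDATE:-no}" = "yes" ]
}

# Prints the interpreter path inside a venv, checking the POSIX layout
# (bin/python) before the Windows layout (Scripts/python.exe).
venv_python() {
  if [ -f "$1/bin/python" ]; then
    echo "$1/bin/python"
  elif [ -f "$1/Scripts/python.exe" ]; then
    echo "$1/Scripts/python.exe"
  else
    return 1
  fi
}

# Runs a command, reports the outcome, and returns the command's status.
# Inside the `else` branch, $? still holds the status of the `if` condition,
# so it must be captured before any other command runs.
run_and_report() {
  if "$@"; then
    echo "All tests passed"
  else
    code=$?
    echo "Tests failed with exit code: $code"
    return "$code"
  fi
}
```

For example, `FORCE_DEPS_UPDATE=yes needs_install false` succeeds, which matches the forced-update path in the task, and `run_and_report sh -c 'exit 3'` returns 3.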
