Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Oct 30, 2025

Clarifai backend now supports Pipeline Step Secrets for step-level secret isolation. This PR implements SDK and CLI support for the feature.

Changes

Pipeline Builder (clarifai/runners/pipelines/pipeline_builder.py)

  • Parse step_version_secrets from config section in config.yaml (matching proto structure PipelineVersion.config.step_version_secrets)
  • Inject secrets into PipelineVersion.config via new _add_step_version_secrets() method
  • Preserve secrets in lockfile generation (prepare_lockfile_with_step_versions(), generate_lockfile_data())

Pipeline Client (clarifai/client/pipeline.py)

  • get_pipeline_version() - retrieve version with secrets config
  • create_pipeline_version() - create version with secrets via PatchPipelineVersions
  • list_step_secrets() - query secrets for all/specific steps
  • Python API accepts simplified format: {step_ref: {secret_name: secret_path}} (no secrets wrapper)

CLI Template (clarifai/cli/templates/pipeline_templates.py)

  • Add commented step_version_secrets example to generated pipeline configs in correct config section

Documentation (docs/pipeline_step_secrets.md)

  • Usage guide with YAML and Python examples
  • Format differences (YAML requires secrets key, Python API omits it)
  • Security model and best practices

Usage

Config YAML:

pipeline:
  orchestration_spec:
    argo_orchestration_spec: |
      # workflow definition
  config:
    step_version_secrets:
      step-0:
        secrets:
          API_KEY: users/user123/secrets/my-api-key
      step-1:
        secrets:
          DB_PASSWORD: users/user123/secrets/db-password

Python SDK:

# Create version with secrets
version_id = pipeline.create_pipeline_version(
    orchestration_spec=orch_spec,
    step_version_secrets={
        "step-0": {"API_KEY": "users/user123/secrets/my-api-key"}
    }
)

# Query secrets
secrets = pipeline.list_step_secrets()
# Returns: {"step-0": {"API_KEY": "users/user123/secrets/my-api-key"}, ...}

Security

  • SDK handles only secret references (paths), not values
  • Backend injects actual values at runtime via Kubernetes SecretKeyRef
  • Logging uses debug level and omits secret names to minimize exposure
  • CodeQL alerts are false positives (logging step references like "step-0", not secrets)

Backward Compatibility

  • step_version_secrets is optional
  • Existing pipelines work unchanged
  • All existing tests pass (19 total, 9 new)
Original prompt

This section details on the original issue you should resolve

<issue_title>Pipeline Step Secrets Feature Adoption in Python SDK and CLI</issue_title>
<issue_description>## Overview

Clarifai backend now supports Pipeline Step Secrets—enabling different pipeline steps to access distinct sets of secrets with step-level isolation. This document outlines the technical changes required to adopt this feature in the Python SDK and CLI, ensuring compatibility with updated API fields and backend functionality.


API Feature Summary

1. New Proto Fields

  • PipelineVersionConfig
    • step_version_secrets: map<string, StepSecretConfig>
      • Key: Step reference (step1, step2, ...)
      • Value: StepSecretConfig
  • StepSecretConfig (New Message)
    • secrets: map<string, string>
      • Key: Secret name (e.g., API_KEY)
      • Value: Secret reference path (e.g., users/user123/secrets/my-api-key)

2. Updated API Endpoints

  • GetPipelineVersion: Returns pipeline version with step secret config.
  • PostPipelineVersions: Create pipeline version with step secrets.

3. Example API Usage

Creation

{
  "pipeline_version": {
    "config": {
      "step_version_secrets": {
        "step1": {
          "secrets": {
            "API_KEY": "users/user123/secrets/my-api-key",
            "DB_PASSWORD": "users/user123/secrets/db-secret"
          }
        },
        "step2": {
          "secrets": {
            "EMAIL_TOKEN": "users/user123/secrets/email-token"
          }
        }
      }
    }
  }
}

Python SDK Changes

1. Proto/Model Updates

  • Regenerate Python proto files from updated .proto definitions.
  • Ensure PipelineVersionConfig includes the step_version_secrets field.
  • Add StepSecretConfig model/class for mapping secrets per step.

2. Client Method Updates

  • Update get_pipeline_version() to handle new step secrets field.
  • Update create_pipeline_version() to accept step secrets.
  • Add type hints for new fields.

3. Helper Methods (Recommended)

  • Add convenience methods:
    • add_step_secret(step_ref, secret_name, secret_ref)
    • remove_step_secret(step_ref, secret_name)
    • list_step_secrets(step_ref)

CLI Changes

1. Command Updates

  • Update clarifai pipeline version create to accept step secrets specified in the orchestration spec (config.yaml), not as CLI parameters.
  • Update clarifai pipeline version get to display step secrets.

2. Config File Integration (Orchestration Spec)

  • Step secrets should be defined within the config.yaml file as part of the pipeline orchestration specification, matching the API schema.

  • Example config.yaml:

    pipeline_version:
      config:
        step_version_secrets:
          step1:
            secrets:
              API_KEY: users/user123/secrets/my-api-key
              DB_PASSWORD: users/user123/secrets/db-secret
          step2:
            secrets:
              EMAIL_TOKEN: users/user123/secrets/email-token
  • CLI usage examples:

    clarifai pipeline version create --pipeline-id "my-pipeline" --config config.yaml
    
  • All step secrets should be managed via the spec file; do not support CLI flags for individual secrets.

3. Output Formatting

  • Update list/get commands to display step secrets in human-readable/table format.
  • Add JSON output option for programmatic use.

Backward Compatibility

  • Fully Backward Compatible: All existing API calls will continue to work.
  • Optional Field: step_version_secrets is optional; existing pipelines unaffected.
  • No Breaking Changes: No existing fields were modified or removed.

Testing Requirements

1. Unit Tests

  • Test pipeline version creation with step secrets from config file.
  • Test step secrets validation and error handling.
  • Test backward compatibility (existing pipelines without secrets).

2. Integration Tests

  • End-to-end testing for pipeline creation, execution, and CLI commands with step secrets defined in config file.
  • Error scenarios (invalid secret references, missing secrets).

3. Example Code & Documentation

  • Add step secrets usage examples to docs.
  • Update any existing pipeline examples to show step secrets usage in orchestration spec.

Security Considerations

  • Step-Level Isolation: Each step only accesses explicitly configured secrets.
  • Secret References Only: API handles references, not actual secret values.
  • Kubernetes Integration: Backend mounts secrets securely as environment variables.
  • No Secret Value Leakage: Only references are stored in config; values are injected at runtime.

Additional Context

  • Backend uses env var injection and Kubernetes SecretKeyRef for secure secret mounting.
  • Isolation is guaranteed for each step.
  • Multiple secrets per step and multiple steps per pipeline are supported.

---...


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

- Implement step_version_secrets parsing in pipeline_builder.py
- Add helper methods to Pipeline client (get_pipeline_version, create_pipeline_version, list_step_secrets)
- Include step secrets in lockfile generation
- Update pipeline config template with example step secrets
- Add comprehensive test coverage for step secrets functionality

Co-authored-by: ydixit-clarifai <[email protected]>
Copilot AI and others added 3 commits October 30, 2025 06:45
Add note explaining that Python API uses simplified format without 'secrets' wrapper compared to YAML configuration

Co-authored-by: ydixit-clarifai <[email protected]>
- Change info logging to debug level to reduce verbosity
- Rename 'secrets' variable to 'secret_refs' for clarity
- Add comments explaining that only references (not values) are handled
- Remove per-secret logging to minimize exposure surface
- Address CodeQL security alerts (false positives documented)

Co-authored-by: ydixit-clarifai <[email protected]>
# Secret references are like "users/user123/secrets/my-api-key"
secret_refs = step_config.get("secrets", {})
if not secret_refs:
logger.debug(f"No secret references found for step {step_ref}, skipping")

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

This expression logs
sensitive data (secret)
as clear text.
This expression logs
sensitive data (secret)
as clear text.
This expression logs
sensitive data (secret)
as clear text.

Copilot Autofix

AI 3 days ago

To fix this issue, avoid logging the value of step_ref in log statements related to secret management, especially when those references may be sensitive or could leak implementation details. Specifically, the log message on line 396 should be reworded or removed, so as not to include the step_ref value. The recommended approach is to log a generic message that does not reveal the step or key name, or else omit the log entirely if it's not essential for debugging. No new imports or method definitions are necessary, and only the single log statement on line 396 needs to be changed.

Suggested changeset 1
clarifai/runners/pipelines/pipeline_builder.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/clarifai/runners/pipelines/pipeline_builder.py b/clarifai/runners/pipelines/pipeline_builder.py
--- a/clarifai/runners/pipelines/pipeline_builder.py
+++ b/clarifai/runners/pipelines/pipeline_builder.py
@@ -393,7 +393,7 @@
             # Secret references are like "users/user123/secrets/my-api-key"
             secret_refs = step_config.get("secrets", {})
             if not secret_refs:
-                logger.debug(f"No secret references found for step {step_ref}, skipping")
+                logger.debug("No secret references found for step, skipping")
                 continue
 
             # Create StepSecretConfig proto
EOF
@@ -393,7 +393,7 @@
# Secret references are like "users/user123/secrets/my-api-key"
secret_refs = step_config.get("secrets", {})
if not secret_refs:
logger.debug(f"No secret references found for step {step_ref}, skipping")
logger.debug("No secret references found for step, skipping")
continue

# Create StepSecretConfig proto
Copilot is powered by AI and may make mistakes. Always verify output.

# Add to pipeline version config
pipeline_version.config.step_version_secrets[step_ref].CopyFrom(step_secret_config)
logger.debug(f"Configured secret references for step {step_ref}")

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

This expression logs
sensitive data (secret)
as clear text.
This expression logs
sensitive data (secret)
as clear text.
This expression logs
sensitive data (secret)
as clear text.

Copilot Autofix

AI 3 days ago

To fix the problem, we should avoid logging any information derived from step_ref if it could contain sensitive or identifying information. The log statement on line 406 currently logs step_ref directly. The best approach is to replace this log statement with text that does not output sensitive information. For example, you can simply indicate that secret references have been configured for a step, without using a dynamic identifier, or even suppress the log altogether if it serves no unique troubleshooting purpose. Alternatively, you may redacted the reference (e.g., log a hash, length, or generic indicator) if visibility is required, but here the safest approach is to remove or generalize the message.

Changes required:

  • Edit line 406 in clarifai/runners/pipelines/pipeline_builder.py to remove or generalize the log statement so that step_ref is not logged.
  • No new methods or imports needed.
Suggested changeset 1
clarifai/runners/pipelines/pipeline_builder.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/clarifai/runners/pipelines/pipeline_builder.py b/clarifai/runners/pipelines/pipeline_builder.py
--- a/clarifai/runners/pipelines/pipeline_builder.py
+++ b/clarifai/runners/pipelines/pipeline_builder.py
@@ -403,7 +403,7 @@
 
             # Add to pipeline version config
             pipeline_version.config.step_version_secrets[step_ref].CopyFrom(step_secret_config)
-            logger.debug(f"Configured secret references for step {step_ref}")
+            logger.debug("Configured secret references for a pipeline step.")
 
     def create_pipeline(self) -> tuple[bool, str]:
         """Create the pipeline using PostPipelines RPC.
EOF
@@ -403,7 +403,7 @@

# Add to pipeline version config
pipeline_version.config.step_version_secrets[step_ref].CopyFrom(step_secret_config)
logger.debug(f"Configured secret references for step {step_ref}")
logger.debug("Configured secret references for a pipeline step.")

def create_pipeline(self) -> tuple[bool, str]:
"""Create the pipeline using PostPipelines RPC.
Copilot is powered by AI and may make mistakes. Always verify output.
Copilot AI changed the title [WIP] Add Pipeline Step Secrets feature to Python SDK and CLI Add Pipeline Step Secrets support to Python SDK and CLI Oct 30, 2025
Copilot AI requested a review from ydixit-clarifai October 30, 2025 06:59
Copilot finished work on behalf of ydixit-clarifai October 30, 2025 06:59
@ydixit-clarifai
Copy link

@copilot The stepversionsecret should be part of the orchestration spec in each of the pipeline steps.
The pipelineverison structure should be PipelineVersion.Config -> StepVersionSecrets: Map<String, Map<String, String / Secret Reference>>

Why Map<SecretName, SecretRef>?

For one pipeline,
SecretName can be EMAIL_PROVIDER_API_KEY and SecretRef can be users/1/secrets/secret-1

While for another one,
SecretName can be EMAIL_PROVIDER_API_KEY and SecretRef can be users/2/secrets/secret-3 (example representation of secret-reference)

Per review feedback, step_version_secrets should be part of the pipeline config section (matching the proto structure PipelineVersion.config.step_version_secrets) rather than nested inside orchestration_spec.

Changes:
- Move step_version_secrets from orchestration_spec to config section in YAML structure
- Update pipeline_builder.py to read from config section
- Update lockfile generation to preserve secrets in config section
- Update templates and documentation to show correct structure
- Update all tests to use new config structure

The new YAML structure is:
```yaml
pipeline:
  orchestration_spec:
    argo_orchestration_spec: |
      ...
  config:
    step_version_secrets:
      step-0:
        secrets:
          API_KEY: users/user123/secrets/my-api-key
```

Co-authored-by: ydixit-clarifai <[email protected]>
Copy link
Contributor Author

Copilot AI commented Oct 30, 2025

@copilot The stepversionsecret should be part of the orchestration spec in each of the pipeline steps.
The pipelineverison structure should be PipelineVersion.Config -> StepVersionSecrets: Map<String...

Updated the implementation to match the proto structure. The step_version_secrets is now part of the config section at the pipeline level (matching PipelineVersion.config.step_version_secrets), not nested inside orchestration_spec.

The new YAML structure is:

pipeline:
  orchestration_spec:
    argo_orchestration_spec: |
      ...
  config:
    step_version_secrets:
      step-0:
        secrets:
          API_KEY: users/user123/secrets/my-api-key

All tests updated and passing. Changes in commit 9cdadbd.

Copilot finished work on behalf of ydixit-clarifai October 30, 2025 08:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

Pipeline Step Secrets Feature Adoption in Python SDK and CLI

2 participants