Add routine webhook trigger endpoint and setup validation #689

Open
tsubasakong wants to merge 3 commits into nearai:main from tsubasakong:fix/routine-webhook-validation-651

Conversation

@tsubasakong
Summary
- validate channel setup credentials via substituted validation endpoint with SSRF guard
- add public /api/webhooks/{path} routine trigger with secret validation
- add routine engine webhook fire helper + coverage test

Testing
- not run (cargo not available)

Closes #651

@github-actions github-actions bot added scope: agent Agent core (agent loop, router, scheduler) scope: channel/web Web gateway channel scope: tool/builtin Built-in tools scope: setup Onboarding / setup scope: dependencies Dependency updates size: XL 500+ changed lines risk: high Safety, secrets, auth, or critical infrastructure contributor: new First-time contributor labels Mar 7, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new public webhook endpoint for triggering routines, enhancing the system's integration capabilities. It also significantly improves the security and reliability of channel setup by implementing credential validation with SSRF protection. Furthermore, the core time utility tool has been upgraded to offer comprehensive timezone handling and flexible timestamp manipulation, providing more powerful time-related operations.

Highlights

  • New Webhook Trigger Endpoint: A public API endpoint /api/webhooks/{path} has been added to allow external systems to trigger routines via POST requests, including secret validation for security.
  • Secure Channel Setup Validation: Channel setup now includes a validation step for credentials against a specified endpoint. This validation features secret substitution and a robust Server-Side Request Forgery (SSRF) guard to prevent malicious requests to internal or private networks.
  • Enhanced Time Tool Functionality: The built-in time tool has been significantly expanded to support parsing, formatting, and converting timestamps across different IANA timezones, handling naive timestamps, and providing more detailed output for time operations.
  • Routine Engine Webhook Helper: A new fire_webhook helper function was introduced in the routine engine to programmatically trigger routines, enforcing ownership, enabled status, and concurrent run limits.
Changelog
  • Cargo.toml
    • Added chrono-tz dependency for advanced timezone handling.
  • src/agent/routine_engine.rs
    • Implemented fire_webhook method to allow external triggering of routines with validation checks.
    • Ensured webhook-triggered runs create appropriate RoutineRun records.
  • src/channels/web/server.rs
    • Added /api/webhooks/{*path} POST route for triggering routines.
    • Implemented webhook_trigger_handler to process webhook requests, validate secrets, and fire routines.
    • Introduced normalize_webhook_path and webhook_path_for_routine helper functions for path management.
    • Incorporated subtle::ConstantTimeEq for secure secret comparison.
  • src/setup/channels.rs
    • Added HashMap import for secret management during validation.
    • Modified setup_wasm_channel to collect and substitute secrets into validation endpoints.
    • Implemented validate_channel_credentials to perform HTTP validation with SSRF protection.
    • Added helper functions substitute_validation_placeholders, find_unresolved_placeholders, ensure_public_http_url, is_private_ip, truncate_for_display, and redact_secrets for secure and robust validation.
  • src/tools/builtin/time.rs
    • Updated TimeTool description to reflect new parsing and formatting capabilities.
    • Expanded parameters_schema to include convert operation and new parameters like input, timezone, from_timezone, to_timezone, and format_string.
    • Refactored execute method to delegate to specialized functions for each time operation.
    • Implemented execute_now, execute_parse, execute_convert, execute_format, and execute_diff with enhanced timezone awareness.
    • Added comprehensive timezone resolution and naive timestamp parsing logic.
    • Introduced chrono-tz for accurate timezone conversions.
  • tests/e2e_routine_heartbeat.rs
    • Added webhook_trigger_fires end-to-end test to verify webhook routine triggering.
    • Updated test numbering to accommodate the new test.
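
The changelog notes that the handler uses subtle::ConstantTimeEq for the secret comparison. As an illustration of what that buys, here is a minimal hand-rolled sketch of a constant-time byte comparison. The function name constant_time_eq is hypothetical and is not taken from the PR; an optimizing compiler may still undermine a hand-written loop like this, so a vetted crate such as subtle remains the better choice in real code.

```rust
/// Compare two byte strings without returning early on the first mismatch.
/// Illustrative only; the name and shape are assumptions, not the PR's code.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // The length is treated as non-secret here, so an early return is acceptable.
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of every byte pair so all bytes are always inspected.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```

A plain `==` on the secret strings may stop at the first differing byte, which is the timing signal this pattern is meant to avoid.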
Activity
  • The pull request was created by tsubasakong.
  • The pull request introduces new features for webhook integration and improved time handling.
  • No human review comments or activity have been recorded yet.

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a webhook trigger for routines, adds credential validation for channel setups, and significantly enhances the time tool with timezone and parsing capabilities. However, the public webhook endpoint is vulnerable to Denial of Service (DoS) attacks due to inefficient routine lookup and a lack of rate limiting, and it also leaks internal error details. Furthermore, a potential TOCTOU vulnerability exists in the new SSRF protection mechanism, and the SSRF guard needs to be updated to handle IPv4-mapped IPv6 addresses. There are also opportunities for refactoring to improve maintainability and performance.

Comment on lines +2144 to +2155
    let routines = store
        .list_all_routines()
        .await
        .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;

    let mut matches: Vec<crate::agent::routine::Routine> = routines
        .into_iter()
        .filter(|routine| routine.user_id == state.user_id)
        .filter(|routine| {
            webhook_path_for_routine(routine).as_deref() == Some(requested_path.as_str())
        })
        .collect();

security-high high

The webhook_trigger_handler is a public endpoint that performs an expensive operation by fetching all routines from the database and filtering them in-memory for every request. This occurs before any authentication or rate limiting, making it vulnerable to Denial of Service (DoS) attacks. An attacker could exploit this by flooding the endpoint, leading to high CPU and memory usage. For better scalability and to mitigate this DoS risk, consider adding a more specific query to the database layer to fetch a routine by its webhook path directly, rather than listing all routines and filtering in memory. This would involve adding a method like async fn get_routine_by_webhook_path(&self, user_id: &str, path: &str) -> Result<Option<Routine>, DatabaseError>; to the RoutineStore trait and implementing it in database backends, potentially with an index on the trigger_config column.
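
A rough sketch of the lookup-by-path idea follows, using a synchronous in-memory store so it stays self-contained. The Routine struct, its fields, and InMemoryRoutineStore are simplified stand-ins invented for illustration; the real RoutineStore trait is async and backed by a database, where the HashMap key here would correspond to an index on the webhook path.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the project's `Routine` type (hypothetical fields).
#[derive(Clone, Debug, PartialEq)]
struct Routine {
    id: u32,
    user_id: String,
    webhook_path: Option<String>,
}

/// In-memory store with an index keyed by (user_id, webhook_path),
/// playing the role a database index would play in a real backend.
#[derive(Default)]
struct InMemoryRoutineStore {
    by_webhook_path: HashMap<(String, String), Routine>,
}

impl InMemoryRoutineStore {
    /// Index the routine if it has a webhook path; this sketch simply
    /// skips routines without one.
    fn insert(&mut self, routine: Routine) {
        if let Some(path) = routine.webhook_path.clone() {
            self.by_webhook_path
                .insert((routine.user_id.clone(), path), routine);
        }
    }

    /// Single keyed lookup instead of listing every routine and filtering.
    fn get_routine_by_webhook_path(&self, user_id: &str, path: &str) -> Option<&Routine> {
        self.by_webhook_path
            .get(&(user_id.to_string(), path.to_string()))
    }
}
```

With a lookup of this shape, the per-request cost no longer grows with the total number of routines, which is the scalability concern raised above. Rate limiting would still need to be handled separately.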

Comment on lines +841 to +875
async fn validate_channel_credentials(
    validation_endpoint: &str,
    secrets: &HashMap<String, String>,
) -> Result<(), String> {
    let resolved = substitute_validation_placeholders(validation_endpoint, secrets);
    let missing = find_unresolved_placeholders(&resolved);
    if !missing.is_empty() {
        return Err(format!(
            "missing secrets for placeholders: {}",
            missing.join(", ")
        ));
    }

    ensure_public_http_url(&resolved)?;

    let client = reqwest::Client::builder()
        .timeout(std::time::Duration::from_secs(5))
        .build()
        .map_err(|e| format!("Failed to build HTTP client: {e}"))?;

    let response = client
        .get(&resolved)
        .send()
        .await
        .map_err(|e| redact_secrets(&format!("Validation request failed: {e}"), secrets))?;

    if response.status().is_success() {
        return Ok(());
    }

    let status = response.status();
    let body = response.text().await.unwrap_or_default();
    let truncated = truncate_for_display(&redact_secrets(&body, secrets), 200);
    Err(format!("Validation failed: HTTP {status} ({truncated})"))
}

high

This function has a Time-of-check-to-time-of-use (TOCTOU) vulnerability that could lead to a Server-Side Request Forgery (SSRF).

The ensure_public_http_url function correctly resolves a hostname and validates that the resulting IP addresses are not in private ranges. However, the reqwest client in validate_channel_credentials is then called with the original URL, causing it to perform its own DNS resolution. An attacker could change the DNS record between the check and the use, making the client connect to a private IP address and bypassing the validation.

To fix this, you should pin the validated IP addresses for the reqwest client to use.

  1. Modify ensure_public_http_url to return the validated public IP addresses along with the host. For example: Result<(String, Vec<std::net::SocketAddr>), String>.
  2. In validate_channel_credentials, use reqwest::ClientBuilder::resolve_to_addrs to pin these addresses to the host for the HTTP client.

Here's an example of how you could adjust validate_channel_credentials:

async fn validate_channel_credentials(
    validation_endpoint: &str,
    secrets: &HashMap<String, String>,
) -> Result<(), String> {
    let resolved = substitute_validation_placeholders(validation_endpoint, secrets);
    // ... check for missing placeholders ...

    // 1. Get validated host and addresses
    let (host, addrs) = ensure_public_http_url(&resolved)?;

    // 2. Build client with pinned addresses
    let client = reqwest::Client::builder()
        .timeout(std::time::Duration::from_secs(5))
        .resolve_to_addrs(&host, &addrs) // Pin the resolved addresses
        .build()
        .map_err(|e| format!("Failed to build HTTP client: {e}"))?;

    // 3. Make the request
    let response = client
        .get(&resolved)
        .send()
        .await
        .map_err(|e| redact_secrets(&format!("Validation request failed: {e}"), secrets))?;

    // ... rest of the function ...
}
References
  1. To prevent DNS-based SSRF vulnerabilities, resolve hostnames to IP addresses, validate all resolved IPs against restricted ranges, and pin the validated addresses for the connection to prevent TOCTOU race conditions.
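
To make the check half of the suggestion concrete, here is a small sketch of a helper that accepts a set of already-resolved addresses only when all of them are public, and returns that exact list so the caller can pin it on the HTTP client. The name validated_addrs is hypothetical, and the is_private_ip shown here is a deliberately reduced stand-in for the PR's function, included only so the example compiles on its own.

```rust
use std::net::{IpAddr, SocketAddr};

/// Reduced private-range check for this sketch only; the PR's own
/// `is_private_ip` is the authoritative version.
fn is_private_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => v6.is_loopback() || v6.is_unspecified(),
    }
}

/// Accept the resolved addresses only if every one of them is public,
/// and hand back the same list so the connection can be pinned to it.
fn validated_addrs(host: &str, addrs: &[SocketAddr]) -> Result<Vec<SocketAddr>, String> {
    if addrs.is_empty() {
        return Err(format!("{host} did not resolve to any address"));
    }
    // Reject the whole set if any single address is non-public.
    if let Some(bad) = addrs.iter().find(|a| is_private_ip(a.ip())) {
        return Err(format!("{host} resolves to non-public address {}", bad.ip()));
    }
    Ok(addrs.to_vec())
}
```

Rejecting the whole set when any address is private is a conservative choice: it avoids relying on which address the client happens to try first.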

    let routines = store
        .list_all_routines()
        .await
        .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;

security-medium medium

The webhook_trigger_handler returns internal database error messages directly to the client when list_all_routines() fails. Since this is a public endpoint, this could leak sensitive information about the database structure or configuration to an unauthenticated attacker.
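
One possible way to address this is to log the detailed error on the server and return a fixed, generic body to the caller. The sketch below uses a hypothetical helper named public_internal_error, a bare u16 status code, and eprintln! as a stand-in for whatever logging facility the project actually uses; none of these come from the PR.

```rust
use std::fmt::Display;

/// Map an internal error to a response that is safe for unauthenticated callers.
/// Returns (status code, body); the detail is written to the server log only.
fn public_internal_error(context: &str, err: &dyn Display) -> (u16, String) {
    // Stand-in for real logging; the project's logger is not shown in this PR excerpt.
    eprintln!("{context}: {err}");
    // The caller only ever sees this fixed message.
    (500, "Internal server error".to_string())
}
```

In the handler, the closure passed to map_err would call a helper like this instead of forwarding e.to_string() to the client.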

Comment on lines +962 to +968
        std::net::IpAddr::V6(v6) => {
            v6.is_loopback()
                || v6.is_unspecified()
                || (v6.segments()[0] & 0xFE00) == 0xFC00
                || (v6.segments()[0] & 0xFFC0) == 0xFE80
        }
    }
}

security-medium medium

The is_private_ip function used in the SSRF guard does not account for IPv4-mapped IPv6 addresses (e.g., ::ffff:127.0.0.1). An attacker could potentially bypass the SSRF protection by using these addresses to hit internal services on the server. While this is in a setup tool, it's best practice to ensure the guard is comprehensive.

        std::net::IpAddr::V6(v6) => {
            v6.is_loopback()
                || v6.is_unspecified()
                || (v6.segments()[0] & 0xFE00) == 0xFC00
                || (v6.segments()[0] & 0xFFC0) == 0xFE80
                || v6.to_ipv4_mapped().map(|v4| is_private_ip(std::net::IpAddr::V4(v4))).unwrap_or(false)
        }
References
  1. To prevent DNS-based SSRF vulnerabilities, resolve hostnames to IP addresses, validate all resolved IPs against restricted ranges, and pin the validated addresses for the connection to prevent TOCTOU race conditions.
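
For reference, a self-contained version of the guard with the IPv4-mapped case handled might look like the sketch below. The IPv6 checks mirror the lines under review; the IPv4 arm is an assumption, since the PR's IPv4 branch is not shown in this excerpt.

```rust
use std::net::IpAddr;

/// Returns true for addresses that should not be reachable from the SSRF guard.
fn is_private_ip(ip: IpAddr) -> bool {
    match ip {
        // Assumed IPv4 checks; the PR's actual IPv4 arm is not visible here.
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => {
            // Unwrap IPv4-mapped addresses such as ::ffff:127.0.0.1 and
            // apply the IPv4 rules to the embedded address.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_private_ip(IpAddr::V4(v4));
            }
            v6.is_loopback()
                || v6.is_unspecified()
                || (v6.segments()[0] & 0xFE00) == 0xFC00 // unique local, fc00::/7
                || (v6.segments()[0] & 0xFFC0) == 0xFE80 // link-local, fe80::/10
        }
    }
}
```

Ipv6Addr::to_ipv4_mapped only matches the ::ffff:a.b.c.d form, so other embeddings of IPv4 inside IPv6 would need their own checks if they matter for the deployment.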

Comment on lines +258 to +330
    /// Fire a routine via webhook (from the web gateway).
    ///
    /// Enforces enabled check and concurrent run limit.
    pub async fn fire_webhook(
        &self,
        routine_id: Uuid,
        user_id: Option<&str>,
        trigger_detail: Option<String>,
    ) -> Result<Uuid, RoutineError> {
        let routine = self
            .store
            .get_routine(routine_id)
            .await
            .map_err(|e| RoutineError::Database {
                reason: e.to_string(),
            })?
            .ok_or(RoutineError::NotFound { id: routine_id })?;

        // Enforce ownership when a user_id is provided (gateway calls).
        if let Some(uid) = user_id
            && routine.user_id != uid
        {
            return Err(RoutineError::NotAuthorized { id: routine_id });
        }

        if !routine.enabled {
            return Err(RoutineError::Disabled {
                name: routine.name.clone(),
            });
        }

        if !self.check_concurrent(&routine).await {
            return Err(RoutineError::MaxConcurrent {
                name: routine.name.clone(),
            });
        }

        let run_id = Uuid::new_v4();
        let run = RoutineRun {
            id: run_id,
            routine_id: routine.id,
            trigger_type: "webhook".to_string(),
            trigger_detail,
            started_at: Utc::now(),
            completed_at: None,
            status: RunStatus::Running,
            result_summary: None,
            tokens_used: None,
            job_id: None,
            created_at: Utc::now(),
        };

        if let Err(e) = self.store.create_routine_run(&run).await {
            return Err(RoutineError::Database {
                reason: format!("failed to create run record: {e}"),
            });
        }

        let engine = EngineContext {
            store: self.store.clone(),
            llm: self.llm.clone(),
            workspace: self.workspace.clone(),
            notify_tx: self.notify_tx.clone(),
            running_count: self.running_count.clone(),
            scheduler: self.scheduler.clone(),
        };

        tokio::spawn(async move {
            execute_routine(engine, routine, run).await;
        });

        Ok(run_id)
    }

medium

There is significant code duplication between this new fire_webhook function and the existing fire_manual function. Both functions perform similar steps: fetching the routine, checking for authorization, enabled status, and concurrent runs, creating a RoutineRun record, and spawning the execution task.

To improve maintainability and reduce redundancy, consider refactoring the common logic into a private helper function. This helper could handle the core firing logic, while fire_webhook and fire_manual would be responsible for their specific trigger details and initial checks.

For example, you could have a private function like:

async fn fire_routine(
    &self,
    routine: Routine,
    trigger_type: &str,
    trigger_detail: Option<String>,
) -> Result<Uuid, RoutineError> {
    // ... common logic to create run, spawn task, etc.
}
References
  1. Consolidate related sequences of operations, such as creating, persisting, and scheduling a job, into a single reusable method to improve code consistency and maintainability.
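
The shape of that refactor can be sketched with simplified, synchronous stand-in types. Everything below is hypothetical: the structs carry only the fields needed for the example, the concurrency check is reduced to a counter comparison, and the "manual" trigger string is an assumption about what fire_manual records.

```rust
#[derive(Debug, Clone, PartialEq)]
enum RoutineError {
    Disabled { name: String },
    MaxConcurrent { name: String },
}

/// Simplified stand-in for the project's routine type.
#[derive(Debug, Clone)]
struct Routine {
    name: String,
    enabled: bool,
    running: usize,
    max_concurrent: usize,
}

/// Simplified stand-in for the run record.
#[derive(Debug, PartialEq)]
struct RoutineRun {
    routine_name: String,
    trigger_type: String,
    trigger_detail: Option<String>,
}

/// Shared core: the checks common to every trigger, then the run record.
fn fire_routine(
    routine: &Routine,
    trigger_type: &str,
    trigger_detail: Option<String>,
) -> Result<RoutineRun, RoutineError> {
    if !routine.enabled {
        return Err(RoutineError::Disabled { name: routine.name.clone() });
    }
    if routine.running >= routine.max_concurrent {
        return Err(RoutineError::MaxConcurrent { name: routine.name.clone() });
    }
    Ok(RoutineRun {
        routine_name: routine.name.clone(),
        trigger_type: trigger_type.to_string(),
        trigger_detail,
    })
}

/// Thin wrappers that differ only in the trigger metadata they pass along.
fn fire_webhook(routine: &Routine, detail: Option<String>) -> Result<RoutineRun, RoutineError> {
    fire_routine(routine, "webhook", detail)
}

fn fire_manual(routine: &Routine) -> Result<RoutineRun, RoutineError> {
    fire_routine(routine, "manual", None)
}
```

In the real engine the helper would also persist the run and spawn the execution task, so that both entry points share those steps as well.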
