Add routine webhook trigger endpoint and setup validation#689
Add routine webhook trigger endpoint and setup validation#689tsubasakong wants to merge 3 commits intonearai:mainfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new public webhook endpoint for triggering routines, enhancing the system's integration capabilities. It also significantly improves the security and reliability of channel setup by implementing credential validation with SSRF protection. Furthermore, the core time utility tool has been upgraded to offer comprehensive timezone handling and flexible timestamp manipulation, providing more powerful time-related operations. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a webhook trigger for routines, adds credential validation for channel setups, and significantly enhances the time tool with timezone and parsing capabilities. However, the public webhook endpoint is vulnerable to Denial of Service (DoS) attacks due to inefficient routine lookup and a lack of rate limiting, and it also leaks internal error details. Furthermore, a potential TOCTOU vulnerability exists in the new SSRF protection mechanism, and the SSRF guard needs to be updated to handle IPv4-mapped IPv6 addresses. There are also opportunities for refactoring to improve maintainability and performance.
| let routines = store | ||
| .list_all_routines() | ||
| .await | ||
| .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?; | ||
|
|
||
| let mut matches: Vec<crate::agent::routine::Routine> = routines | ||
| .into_iter() | ||
| .filter(|routine| routine.user_id == state.user_id) | ||
| .filter(|routine| { | ||
| webhook_path_for_routine(routine).as_deref() == Some(requested_path.as_str()) | ||
| }) | ||
| .collect(); |
There was a problem hiding this comment.
The webhook_trigger_handler is a public endpoint that performs an expensive operation by fetching all routines from the database and filtering them in-memory for every request. This occurs before any authentication or rate limiting, making it vulnerable to Denial of Service (DoS) attacks. An attacker could exploit this by flooding the endpoint, leading to high CPU and memory usage. For better scalability and to mitigate this DoS risk, consider adding a more specific query to the database layer to fetch a routine by its webhook path directly, rather than listing all routines and filtering in memory. This would involve adding a method like async fn get_routine_by_webhook_path(&self, user_id: &str, path: &str) -> Result<Option<Routine>, DatabaseError>; to the RoutineStore trait and implementing it in database backends, potentially with an index on the trigger_config column.
| async fn validate_channel_credentials( | ||
| validation_endpoint: &str, | ||
| secrets: &HashMap<String, String>, | ||
| ) -> Result<(), String> { | ||
| let resolved = substitute_validation_placeholders(validation_endpoint, secrets); | ||
| let missing = find_unresolved_placeholders(&resolved); | ||
| if !missing.is_empty() { | ||
| return Err(format!( | ||
| "missing secrets for placeholders: {}", | ||
| missing.join(", ") | ||
| )); | ||
| } | ||
|
|
||
| ensure_public_http_url(&resolved)?; | ||
|
|
||
| let client = reqwest::Client::builder() | ||
| .timeout(std::time::Duration::from_secs(5)) | ||
| .build() | ||
| .map_err(|e| format!("Failed to build HTTP client: {e}"))?; | ||
|
|
||
| let response = client | ||
| .get(&resolved) | ||
| .send() | ||
| .await | ||
| .map_err(|e| redact_secrets(&format!("Validation request failed: {e}"), secrets))?; | ||
|
|
||
| if response.status().is_success() { | ||
| return Ok(()); | ||
| } | ||
|
|
||
| let status = response.status(); | ||
| let body = response.text().await.unwrap_or_default(); | ||
| let truncated = truncate_for_display(&redact_secrets(&body, secrets), 200); | ||
| Err(format!("Validation failed: HTTP {status} ({truncated})")) | ||
| } |
There was a problem hiding this comment.
This function has a Time-of-check-to-time-of-use (TOCTOU) vulnerability that could lead to a Server-Side Request Forgery (SSRF).
The ensure_public_http_url function correctly resolves a hostname and validates that the resulting IP addresses are not in private ranges. However, the reqwest client in validate_channel_credentials is then called with the original URL, causing it to perform its own DNS resolution. An attacker could change the DNS record between the check and the use, making the client connect to a private IP address and bypassing the validation.
To fix this, you should pin the validated IP addresses for the reqwest client to use.
- Modify
ensure_public_http_urlto return the validated public IP addresses along with the host. For example:Result<(String, Vec<std::net::SocketAddr>), String>. - In
validate_channel_credentials, usereqwest::ClientBuilder::resolve_to_addrsto pin these addresses to the host for the HTTP client.
Here's an example of how you could adjust validate_channel_credentials:
async fn validate_channel_credentials(
validation_endpoint: &str,
secrets: &HashMap<String, String>,
) -> Result<(), String> {
let resolved = substitute_validation_placeholders(validation_endpoint, secrets);
// ... check for missing placeholders ...
// 1. Get validated host and addresses
let (host, addrs) = ensure_public_http_url(&resolved)?;
// 2. Build client with pinned addresses
let client = reqwest::Client::builder()
.timeout(std::time::Duration::from_secs(5))
.resolve_to_addrs(&host, &addrs) // Pin the resolved addresses
.build()
.map_err(|e| format!("Failed to build HTTP client: {e}"))?;
// 3. Make the request
let response = client
.get(&resolved)
.send()
.await
.map_err(|e| redact_secrets(&format!("Validation request failed: {e}"), secrets))?;
// ... rest of the function ...
}References
- To prevent DNS-based SSRF vulnerabilities, resolve hostnames to IP addresses, validate all resolved IPs against restricted ranges, and pin the validated addresses for the connection to prevent TOCTOU race conditions.
| let routines = store | ||
| .list_all_routines() | ||
| .await | ||
| .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?; |
There was a problem hiding this comment.
| std::net::IpAddr::V6(v6) => { | ||
| v6.is_loopback() | ||
| || v6.is_unspecified() | ||
| || (v6.segments()[0] & 0xFE00) == 0xFC00 | ||
| || (v6.segments()[0] & 0xFFC0) == 0xFE80 | ||
| } | ||
| } |
There was a problem hiding this comment.
The is_private_ip function used in the SSRF guard does not account for IPv4-mapped IPv6 addresses (e.g., ::ffff:127.0.0.1). An attacker could potentially bypass the SSRF protection by using these addresses to hit internal services on the server. While this is in a setup tool, it's best practice to ensure the guard is comprehensive.
std::net::IpAddr::V6(v6) => {
v6.is_loopback()
|| v6.is_unspecified()
|| (v6.segments()[0] & 0xFE00) == 0xFC00
|| (v6.segments()[0] & 0xFFC0) == 0xFE80
|| v6.to_ipv4_mapped().map(|v4| is_private_ip(std::net::IpAddr::V4(v4))).unwrap_or(false)
}References
- To prevent DNS-based SSRF vulnerabilities, resolve hostnames to IP addresses, validate all resolved IPs against restricted ranges, and pin the validated addresses for the connection to prevent TOCTOU race conditions.
| /// Fire a routine via webhook (from the web gateway). | ||
| /// | ||
| /// Enforces enabled check and concurrent run limit. | ||
| pub async fn fire_webhook( | ||
| &self, | ||
| routine_id: Uuid, | ||
| user_id: Option<&str>, | ||
| trigger_detail: Option<String>, | ||
| ) -> Result<Uuid, RoutineError> { | ||
| let routine = self | ||
| .store | ||
| .get_routine(routine_id) | ||
| .await | ||
| .map_err(|e| RoutineError::Database { | ||
| reason: e.to_string(), | ||
| })? | ||
| .ok_or(RoutineError::NotFound { id: routine_id })?; | ||
|
|
||
| // Enforce ownership when a user_id is provided (gateway calls). | ||
| if let Some(uid) = user_id | ||
| && routine.user_id != uid | ||
| { | ||
| return Err(RoutineError::NotAuthorized { id: routine_id }); | ||
| } | ||
|
|
||
| if !routine.enabled { | ||
| return Err(RoutineError::Disabled { | ||
| name: routine.name.clone(), | ||
| }); | ||
| } | ||
|
|
||
| if !self.check_concurrent(&routine).await { | ||
| return Err(RoutineError::MaxConcurrent { | ||
| name: routine.name.clone(), | ||
| }); | ||
| } | ||
|
|
||
| let run_id = Uuid::new_v4(); | ||
| let run = RoutineRun { | ||
| id: run_id, | ||
| routine_id: routine.id, | ||
| trigger_type: "webhook".to_string(), | ||
| trigger_detail, | ||
| started_at: Utc::now(), | ||
| completed_at: None, | ||
| status: RunStatus::Running, | ||
| result_summary: None, | ||
| tokens_used: None, | ||
| job_id: None, | ||
| created_at: Utc::now(), | ||
| }; | ||
|
|
||
| if let Err(e) = self.store.create_routine_run(&run).await { | ||
| return Err(RoutineError::Database { | ||
| reason: format!("failed to create run record: {e}"), | ||
| }); | ||
| } | ||
|
|
||
| let engine = EngineContext { | ||
| store: self.store.clone(), | ||
| llm: self.llm.clone(), | ||
| workspace: self.workspace.clone(), | ||
| notify_tx: self.notify_tx.clone(), | ||
| running_count: self.running_count.clone(), | ||
| scheduler: self.scheduler.clone(), | ||
| }; | ||
|
|
||
| tokio::spawn(async move { | ||
| execute_routine(engine, routine, run).await; | ||
| }); | ||
|
|
||
| Ok(run_id) | ||
| } |
There was a problem hiding this comment.
There is significant code duplication between this new fire_webhook function and the existing fire_manual function. Both functions perform similar steps: fetching the routine, checking for authorization, enabled status, and concurrent runs, creating a RoutineRun record, and spawning the execution task.
To improve maintainability and reduce redundancy, consider refactoring the common logic into a private helper function. This helper could handle the core firing logic, while fire_webhook and fire_manual would be responsible for their specific trigger details and initial checks.
For example, you could have a private function like:
async fn fire_routine(
&self,
routine: Routine,
trigger_type: &str,
trigger_detail: Option<String>,
) -> Result<Uuid, RoutineError> {
// ... common logic to create run, spawn task, etc.
}References
- Consolidate related sequences of operations, such as creating, persisting, and scheduling a job, into a single reusable method to improve code consistency and maintainability.
Summary\n- validate channel setup credentials via substituted validation endpoint with SSRF guard\n- add public /api/webhooks/{path} routine trigger with secret validation\n- add routine engine webhook fire helper + coverage test\n\n## Testing\n- not run (cargo not available)\n\nCloses #651