feat(stt): support custom Azure Speech transcription endpoint URL #6333
fengfeng-zi wants to merge 1 commit into FlowiseAI:main
Code Review
This pull request introduces the ability to configure a custom base URL for Azure Speech-to-Text services, including a new UI field and a helper function to construct the final endpoint URL. The review feedback highlights a security concern regarding potential SSRF when using user-provided URLs with the standard axios client and suggests refactoring the URL builder to use the native URL API for more reliable query parameter handling.
formData.append('definition', JSON.stringify(definition))

- const response = await axios.post(`${baseUrl}?api-version=${apiVersion}`, formData, {
+ const response = await axios.post(azureSpeechToTextUrl, formData, {
Since the baseUrl is now user-configurable via the UI, using the raw axios client poses a security risk as it bypasses Server-Side Request Forgery (SSRF) protections. It is highly recommended to use the secureAxiosRequest wrapper from ./httpSecurity, which validates the target URL against the repository's deny list (e.g., internal network ranges). Note that you will need to import secureAxiosRequest from ./httpSecurity and adjust the call to use the configuration object syntax.
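To make the concern concrete, here is a minimal sketch of the kind of check an SSRF guard performs before a request is sent. It is not the implementation of secureAxiosRequest, whose signature and deny list are not shown in this review; `isDeniedTarget`, `DENIED_CIDRS`, and the listed ranges are hypothetical names and values chosen for illustration only.

```typescript
// Hypothetical illustration only: NOT Flowise's secureAxiosRequest.
// Rejects URLs whose host is localhost or a literal IPv4 address in a
// loopback, private, or link-local range.

// Convert dotted-quad IPv4 text to a number, or null if the host is not IPv4.
const ipv4ToInt = (host: string): number | null => {
    const parts = host.split('.')
    if (parts.length !== 4) return null
    let value = 0
    for (const part of parts) {
        if (!/^\d{1,3}$/.test(part)) return null
        const octet = Number(part)
        if (octet > 255) return null
        value = value * 256 + octet
    }
    return value
}

// Assumed deny list as [network, prefix length] pairs.
const DENIED_CIDRS: Array<[string, number]> = [
    ['127.0.0.0', 8],
    ['10.0.0.0', 8],
    ['172.16.0.0', 12],
    ['192.168.0.0', 16],
    ['169.254.0.0', 16]
]

// True when ip falls inside network/prefix.
const inCidr = (ip: number, network: string, prefix: number): boolean => {
    const base = ipv4ToInt(network)
    if (base === null) return false
    const size = 2 ** (32 - prefix)
    return Math.floor(ip / size) === Math.floor(base / size)
}

// True when the URL should be refused before any request is made.
const isDeniedTarget = (rawUrl: string): boolean => {
    let url: URL
    try {
        url = new URL(rawUrl)
    } catch {
        return true // unparseable input is refused
    }
    if (url.protocol !== 'https:' && url.protocol !== 'http:') return true
    if (url.hostname === 'localhost') return true
    const ip = ipv4ToInt(url.hostname)
    return ip !== null && DENIED_CIDRS.some(([network, prefix]) => inCidr(ip, network, prefix))
}
```

A check like this only covers literal addresses. It does not cover IPv6 or hostnames that resolve to internal addresses, which is one reason to prefer the repository's existing wrapper over an ad hoc guard.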
export const buildAzureSpeechToTextUrl = (serviceRegion: string, apiVersion: string, baseUrl?: string) => {
    const trimmedBaseUrl = baseUrl?.trim()
    const base = trimmedBaseUrl
        ? trimmedBaseUrl.replace(/\/+$/, '')
        : `https://${serviceRegion}.cognitiveservices.azure.com/speechtotext/transcriptions:transcribe`

    if (/[?&]api-version=/.test(base)) {
        return base
    }

    return `${base}${base.includes('?') ? '&' : '?'}api-version=${encodeURIComponent(apiVersion)}`
}
While the current string-based URL construction works for standard cases, it is susceptible to issues with trailing separators (e.g., ?&) or fragments (e.g., #frag?api-version=...). Using the native URL API would make this helper more robust and maintainable by automatically handling query parameter merging and encoding.
export const buildAzureSpeechToTextUrl = (serviceRegion: string, apiVersion: string, baseUrl?: string) => {
    const trimmedBaseUrl = baseUrl?.trim()
    const base = trimmedBaseUrl
        ? trimmedBaseUrl.replace(/\/+$/, "")
        : "https://" + serviceRegion + ".cognitiveservices.azure.com/speechtotext/transcriptions:transcribe"
    try {
        const url = new URL(base)
        if (!url.searchParams.has("api-version")) {
            url.searchParams.set("api-version", apiVersion)
        }
        return url.toString()
    } catch {
        if (/[?&]api-version=/.test(base)) {
            return base
        }
        return base + (base.includes("?") ? "&" : "?") + "api-version=" + encodeURIComponent(apiVersion)
    }
}
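The fragment case mentioned in the review can be reproduced with a small side-by-side comparison. The sketch below isolates the two strategies under hypothetical names (`withApiVersionByString`, `withApiVersion`); the example.com endpoint and the api-version value are placeholders, not values taken from the PR.

```typescript
// String-based strategy, mirroring the logic in the PR diff.
const withApiVersionByString = (base: string, apiVersion: string): string => {
    if (/[?&]api-version=/.test(base)) return base
    return `${base}${base.includes('?') ? '&' : '?'}api-version=${encodeURIComponent(apiVersion)}`
}

// URL-API strategy, mirroring the try branch of the suggested change.
const withApiVersion = (endpoint: string, apiVersion: string): string => {
    const url = new URL(endpoint)
    if (!url.searchParams.has('api-version')) {
        url.searchParams.set('api-version', apiVersion)
    }
    return url.toString()
}

const fragmentInput = 'https://example.com/stt#frag'

// The string version appends after the fragment, so the server never sees the parameter.
console.log(withApiVersionByString(fragmentInput, '2024-11-15'))
// https://example.com/stt#frag?api-version=2024-11-15

// The URL API places the parameter in the query, ahead of the fragment.
console.log(withApiVersion(fragmentInput, '2024-11-15'))
// https://example.com/stt?api-version=2024-11-15#frag

// Existing query parameters are preserved and merged.
console.log(withApiVersion('https://example.com/stt?foo=1', '2024-11-15'))
// https://example.com/stt?foo=1&api-version=2024-11-15
```

Both strategies leave a URL untouched when it already carries an api-version parameter, so the default behavior described in the PR is unaffected by the choice.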
## Summary
This addresses #3767 by enabling Azure STT users to provide a custom endpoint URL in Speech-to-Text configuration.

Flowise's current Azure STT implementation uses the Azure REST transcription endpoint (not the browser SpeechSDK path that sets endpointId directly). To align with current architecture while enabling custom deployments, this PR adds a configurable baseUrl override for Azure STT requests.

## Changes
- Add buildAzureSpeechToTextUrl() helper in packages/components/src/speechToText.ts to safely construct Azure transcription URL.
- Support optional speechToTextConfig.baseUrl for Azure STT; default behavior remains unchanged when omitted.
- Add Azure STT Base URL input in UI (packages/ui/src/ui-component/extended/SpeechToText.jsx).
- Add focused URL-construction regression tests in packages/components/src/utils.test.ts.

## Notes
- No behavior change for existing users unless they set Base URL.
- Local test execution was not possible in this environment because dependencies are not installed (pnpm unavailable and network-constrained for install).