Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions src/core/mentions/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,9 @@ function getUrlErrorMessage(error: unknown): string {
if (errorMessage.includes("net::ERR_INTERNET_DISCONNECTED")) {
return t("common:errors.no_internet")
}
if (errorMessage.includes("net::ERR_ABORTED")) {
return t("common:errors.url_request_aborted")
}
if (errorMessage.includes("403") || errorMessage.includes("Forbidden")) {
return t("common:errors.url_forbidden")
}
Expand Down
1 change: 1 addition & 0 deletions src/i18n/locales/en/common.json
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,7 @@
"no_internet": "No internet connection. Please check your network connection and try again.",
"url_forbidden": "Access to this website is forbidden. The site may block automated access or require authentication.",
"url_page_not_found": "The page was not found. Please check if the URL is correct.",
"url_request_aborted": "The request to fetch the URL was aborted. This may happen if the site blocks automated access, requires authentication, or if there's a network issue. Please try again or check if the URL is accessible in a regular browser.",
"url_fetch_failed": "Failed to fetch URL content: {{error}}",
"url_fetch_error_with_url": "Error fetching content for {{url}}: {{error}}",
"command_timeout": "Command execution timed out after {{seconds}} seconds",
Expand Down
58 changes: 39 additions & 19 deletions src/services/browser/UrlContentFetcher.ts
Original file line number Diff line number Diff line change
Expand Up @@ -100,26 +100,46 @@ export class UrlContentFetcher {
const errorMessage = serializedError.message || String(error)
const errorName = serializedError.name

// Only retry for timeout or network-related errors
const shouldRetry =
errorMessage.includes("timeout") ||
errorMessage.includes("net::") ||
errorMessage.includes("NetworkError") ||
errorMessage.includes("ERR_") ||
errorName === "TimeoutError"

if (shouldRetry) {
// If networkidle2 fails due to timeout/network issues, try with just domcontentloaded as fallback
console.warn(
`Failed to load ${url} with networkidle2, retrying with domcontentloaded only: ${errorMessage}`,
)
await this.page.goto(url, {
timeout: URL_FETCH_FALLBACK_TIMEOUT,
waitUntil: ["domcontentloaded"],
})
// Special handling for ERR_ABORTED
if (errorMessage.includes("net::ERR_ABORTED")) {
console.error(`Navigation to ${url} was aborted: ${errorMessage}`)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Using console.error here while line 133 uses console.warn for other retries. Should we use consistent logging levels for similar retry operations?

// For ERR_ABORTED, we'll try a more aggressive retry with just domcontentloaded
// and a shorter timeout to quickly determine if the page is accessible
try {
await this.page.goto(url, {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice the retry logic for ERR_ABORTED is similar to the existing retry logic but with different parameters. Could we extract this into a reusable helper method to reduce code duplication? Something like retryNavigation(url, timeout, waitUntil) perhaps?

timeout: 10000, // 10 seconds for quick check
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider extracting this timeout value to a constant like URL_FETCH_ABORTED_RETRY_TIMEOUT for better maintainability.

waitUntil: ["domcontentloaded"],
})
} catch (retryError) {
// If retry also fails, throw a more descriptive error
const retrySerializedError = serializeError(retryError)
const retryErrorMessage = retrySerializedError.message || String(retryError)
throw new Error(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling here catches all errors in the retry block and converts them to a generic message. Would it be helpful to preserve more specific error types (like ERR_CONNECTION_REFUSED vs ERR_TIMEOUT) in the final error message? This could help users understand why the retry failed.

`Failed to fetch URL content: ${retryErrorMessage}. The request was aborted, which may indicate the URL is inaccessible or blocked.`,
)
}
} else {
// For other errors, throw them as-is
throw error
// Only retry for timeout or network-related errors
const shouldRetry =
errorMessage.includes("timeout") ||
errorMessage.includes("net::") ||
errorMessage.includes("NetworkError") ||
errorMessage.includes("ERR_") ||
errorName === "TimeoutError"

if (shouldRetry) {
// If networkidle2 fails due to timeout/network issues, try with just domcontentloaded as fallback
console.warn(
`Failed to load ${url} with networkidle2, retrying with domcontentloaded only: ${errorMessage}`,
)
await this.page.goto(url, {
timeout: URL_FETCH_FALLBACK_TIMEOUT,
waitUntil: ["domcontentloaded"],
})
} else {
// For other errors, throw them as-is
throw error
}
}
}

Expand Down
30 changes: 30 additions & 0 deletions src/services/browser/__tests__/UrlContentFetcher.spec.ts
Original file line number Diff line number Diff line change
Expand Up @@ -273,6 +273,36 @@ describe("UrlContentFetcher", () => {
await expect(urlContentFetcher.urlToMarkdown("https://example.com")).rejects.toThrow("Simple string error")
expect(mockPage.goto).toHaveBeenCalledTimes(1)
})

it("should handle net::ERR_ABORTED with special retry logic", async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great test coverage! One edge case to consider: what happens when the initial request fails with ERR_ABORTED but the retry succeeds after a delay? This could simulate a temporarily blocked request that becomes available.

const abortedError = new Error("net::ERR_ABORTED at https://example.com")
mockPage.goto.mockRejectedValueOnce(abortedError).mockResolvedValueOnce(undefined)

const result = await urlContentFetcher.urlToMarkdown("https://example.com")

expect(mockPage.goto).toHaveBeenCalledTimes(2)
expect(mockPage.goto).toHaveBeenNthCalledWith(1, "https://example.com", {
timeout: 30000,
waitUntil: ["domcontentloaded", "networkidle2"],
})
expect(mockPage.goto).toHaveBeenNthCalledWith(2, "https://example.com", {
timeout: 10000,
waitUntil: ["domcontentloaded"],
})
expect(result).toBe("# Test content")
})

it("should throw descriptive error when ERR_ABORTED retry also fails", async () => {
const abortedError = new Error("net::ERR_ABORTED at https://example.com")
const retryError = new Error("net::ERR_CONNECTION_REFUSED")
mockPage.goto.mockRejectedValueOnce(abortedError).mockRejectedValueOnce(retryError)

await expect(urlContentFetcher.urlToMarkdown("https://example.com")).rejects.toThrow(
"Failed to fetch URL content: net::ERR_CONNECTION_REFUSED. The request was aborted, which may indicate the URL is inaccessible or blocked.",
)

expect(mockPage.goto).toHaveBeenCalledTimes(2)
})
})

describe("closeBrowser", () => {
Expand Down
Loading