9 changes: 3 additions & 6 deletions .github/workflows/build.yaml

```diff
@@ -17,14 +17,11 @@ jobs:
       - uses: actions/setup-node@v4
         with:
           node-version-file: '.nvmrc'
-      - name: Tailscale
-        uses: tailscale/github-action@v3
-        with:
-          authkey: ${{ secrets.TS_AUTHKEY }}
       - run: npm ci
+      - uses: wenoa/[email protected]
+        with:
+          WG_CONFIG: ${{ secrets.WG_CONFIG }}
       - run: npm run build
-        env:
-          PROXY_URL: ${{ secrets.SOLVERR_PROXY_URL }}
       - run: npm test
       - run: git checkout -- package-lock.json # prevent package-lock.json-only feat changes
       - uses: stefanzweifel/git-auto-commit-action@v6
```
9 changes: 3 additions & 6 deletions .github/workflows/static.yaml

```diff
@@ -40,18 +40,15 @@ jobs:
       - uses: actions/setup-node@v4
         with:
           node-version-file: '.nvmrc'
-      - name: Tailscale
-        uses: tailscale/github-action@v3
-        with:
-          authkey: ${{ secrets.TS_AUTHKEY }}
       - uses: actions/cache/restore@v4
         id: restore-cache
         with:
           path: node_modules/
           key: ${{ runner.os }}-${{ github.run_id }}${{ github.run_number }}
+      - uses: wenoa/[email protected]
+        with:
+          WG_CONFIG: ${{ secrets.WG_CONFIG }}
       - run: npm run build
-        env:
-          PROXY_URL: ${{ secrets.SOLVERR_PROXY_URL }}
       - uses: actions/cache/save@v4
         with:
           path: |
```
40 changes: 3 additions & 37 deletions build/scraper.js

```diff
@@ -7,15 +7,6 @@ import sleep from './sleep.js';
 import title from './title.js';
 
 const baseUrl = 'https://forums.warframe.com/forum/3-pc-update-notes/';
-const proxyUrl = process.env.PROXY_URL;
-const isCI = process.env.CI === 'true';
-const ciTimeout = process.env.CI_TIMEOUT ? parseInt(process.env.CI_TIMEOUT, 10) : 60000;
-const localTimeout = process.env.LOCAL_TIMEOUT ? parseInt(process.env.LOCAL_TIMEOUT, 10) : 12000000;
-
-if (!proxyUrl) {
-  console.error('PROXY_URL environment variable is not set.');
-  process.exit(1);
-}
 
 /**
  * Scraper to get patch logs from forums.
@@ -47,38 +38,13 @@ class Scraper {
     process.exit(1);
   }
 
-  async #fetch(url = baseUrl, session = 'fetch-warframe') {
-    try {
-      const res = await fetch(`${proxyUrl}/v1`, {
-        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({
-          cmd: 'request.get',
-          url,
-          session,
-          maxTimeout: isCI ? ciTimeout : localTimeout,
-          returnOnlyCookies: false,
-          returnPageContent: true,
-        }),
-      });
-      const { solution } = await res.json();
-      if (!solution?.response) {
-        throw solution;
-      }
-      return solution.response;
-    } catch (error) {
-      console.error(`Failed to fetch from proxy ${url}:`, error);
-      throw error;
-    }
-  }
-
   /**
    * Retrieve number of post pages to look through. This value should be set to
    * 1 through the constructor if we only need the most recent changes.
    * @returns {Promise<number>} set the total number of pages
    */
   async getPageNumbers() {
-    const html = await this.#fetch(undefined, 'get-page-numbers');
+    const html = await fetch(baseUrl).then((r) => r.text());
     const $ = load(html);
```
Comment on lines +47 to 48

⚠️ Potential issue | 🟠 Major

Add error handling and response validation to the fetch call.

The fetch call lacks error handling for network failures and doesn't validate the response status. Network errors or HTTP error responses (4xx, 5xx) will cause unhandled promise rejections, or error pages may be parsed as valid HTML.

Apply this diff to add proper error handling and response validation:

```diff
-    const html = await fetch(baseUrl).then((r) => r.text());
+    const res = await fetch(baseUrl);
+    if (!res.ok) {
+      throw new Error(`HTTP error! status: ${res.status}`);
+    }
+    const html = await res.text();
```

Additionally, consider adding timeout handling, since the proxy-based timeout configuration was removed:

```diff
-    const html = await fetch(baseUrl).then((r) => r.text());
+    const controller = new AbortController();
+    const timeoutId = setTimeout(() => controller.abort(), 30000); // 30s timeout
+    let html;
+    try {
+      const res = await fetch(baseUrl, { signal: controller.signal });
+      if (!res.ok) {
+        throw new Error(`HTTP error! status: ${res.status}`);
+      }
+      html = await res.text();
+    } finally {
+      clearTimeout(timeoutId);
+    }
```
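Since the review suggests the identical pattern at all three call sites, one way to apply it once is a small shared helper. The sketch below is illustrative rather than part of the PR: the `fetchHtml` name, the module path, and the 30-second default are assumptions, and it relies only on the global `fetch` and `AbortController` available in Node 18+.

```js
// build/fetch-html.js (hypothetical module, not part of this PR).
// Wraps the global fetch (Node 18+) with HTTP status validation and a
// timeout, consolidating the fix suggested at each of the three call sites.

/**
 * Fetch a URL and return its body as text, failing fast on HTTP errors.
 * @param {string} url page to retrieve
 * @param {number} [timeoutMs] abort the request after this many milliseconds
 * @returns {Promise<string>} response body
 */
export default async function fetchHtml(url, timeoutMs = 30000) {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(url, { signal: controller.signal });
    if (!res.ok) {
      throw new Error(`HTTP ${res.status} fetching ${url}`);
    }
    return await res.text();
  } finally {
    // Runs on success, HTTP error, network error, and abort alike.
    clearTimeout(timeoutId);
  }
}
```

Each call site would then shrink to `const html = await fetchHtml(url);`.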

```diff
     const text = $('a[id^="elPagination"]').text().trim().split(' ');
@@ -96,7 +62,7 @@ class Scraper {
    * @returns {void}
    */
   async scrape(url) {
-    const html = await this.#fetch(url);
+    const html = await fetch(url).then((r) => r.text());
     const $ = load(html);
```
Comment on lines +65 to 66

⚠️ Potential issue | 🟠 Major

Add error handling and response validation to the fetch call.

Same issue as in getPageNumbers: the fetch call lacks error handling and response status validation, which can lead to crashes or to error pages being parsed as valid HTML.

Apply this diff to add proper error handling and response validation:

```diff
-    const html = await fetch(url).then((r) => r.text());
+    const res = await fetch(url);
+    if (!res.ok) {
+      throw new Error(`HTTP error! status: ${res.status}`);
+    }
+    const html = await res.text();
```

For consistency, the same AbortController timeout handling suggested for getPageNumbers applies here as well.
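On newer runtimes the same guard needs no manual timer bookkeeping. A hedged variant of the suggestion using `AbortSignal.timeout()`, available in Node 17.3+ (the 30-second figure mirrors the reviewer's suggestion, not a value taken from this PR):

```js
// Variant of the suggested fix using AbortSignal.timeout (Node 17.3+):
// fetch rejects with a TimeoutError if the page takes longer than 30 s.
const url = 'https://forums.warframe.com/forum/3-pc-update-notes/';
const res = await fetch(url, { signal: AbortSignal.timeout(30000) });
if (!res.ok) {
  throw new Error(`HTTP error! status: ${res.status}`);
}
const html = await res.text();
console.log(`fetched ${html.length} bytes`);
```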

```diff
     const selector = $('ol[id^="elTable"] .ipsDataItem');
     const page /** @type {PatchData[]} */ = [];
@@ -191,7 +157,7 @@ class Scraper {
    * @returns {void}
    */
   async #scrapePost(url, data) {
-    const html = await this.#fetch(url);
+    const html = await fetch(url).then((r) => r.text());
     const $ = load(html);
```
Comment on lines +160 to 161

⚠️ Potential issue | 🟠 Major

Add error handling and response validation to the fetch call.

Same issue as in the other methods: the fetch call lacks error handling and response status validation.

Apply this diff to add proper error handling and response validation:

```diff
-    const html = await fetch(url).then((r) => r.text());
+    const res = await fetch(url);
+    if (!res.ok) {
+      throw new Error(`HTTP error! status: ${res.status}`);
+    }
+    const html = await res.text();
```

The same timeout handling applies here for consistency.
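As a quick end-to-end check of the consolidated approach, the hypothetical helper sketched earlier could be exercised against the forum index (again illustrative; `./fetch-html.js` is the assumed module path from above):

```js
// smoke-test.js (illustrative): exercise the hypothetical fetchHtml helper.
import fetchHtml from './fetch-html.js';

const baseUrl = 'https://forums.warframe.com/forum/3-pc-update-notes/';
const html = await fetchHtml(baseUrl);
console.log(`fetched ${html.length} bytes from ${baseUrl}`);
```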

```diff
     const article = $('article').first();
     const post = article.find('div[data-role="commentContent"]');
```