Commit 9bd6183

Fetch multiple source documents sequentially to prevent bot detection (#1176)
2 parents e069c8f + efb3b50 commit 9bd6183

2 files changed: 11 additions, 3 deletions

CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -2,6 +2,14 @@
 
 All changes that impact users of this module are documented in this file, in the [Common Changelog](https://common-changelog.org) format with some additional specifications defined in the CONTRIBUTING file. This codebase adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## Unreleased [minor]
+
+> Development of this release was supported by the [Lab Platform Governance, Media and Technology](https://platform-governance.org) (PGMT), Centre for Media, Communication and Information Research (ZeMKI), University of Bremen as part of the project [Governance: Private ordering of ComAI through corporate communication and policies](https://comai.space/en/projects/p4-governance-private-ordering-of-comai-through-corporate-communication-and-policies/) in the research unit [Communicative AI](https://comai.space/en/), funded by the German Research Foundation (DFG) ([Grant No. 516511468)](https://gepris.dfg.de/gepris/projekt/544643936?language=en).
+
+### Changed
+
+- Fetch multiple source documents sequentially to prevent bot detection and improve tracking success rate
+
 ## 6.0.1 - 2025-07-07
 
 _Full changeset and discussions: [#1171](https://github.com/OpenTermsArchive/engine/pull/1171)._

src/archivist/index.js

Lines changed: 3 additions & 3 deletions
@@ -183,11 +183,11 @@ export default class Archivist extends events.EventEmitter {
 
     const fetchDocumentErrors = [];
 
-    await Promise.all(terms.sourceDocuments.map(async sourceDocument => {
+    for (const sourceDocument of terms.sourceDocuments) {
       const { location: url, executeClientScripts, cssSelectors } = sourceDocument;
 
       try {
-        const { mimeType, content, fetcher } = await this.fetch({ url, executeClientScripts, cssSelectors });
+        const { mimeType, content, fetcher } = await this.fetch({ url, executeClientScripts, cssSelectors }); // eslint-disable-line no-await-in-loop
 
         sourceDocument.content = content;
         sourceDocument.mimeType = mimeType;
@@ -199,7 +199,7 @@ export default class Archivist extends events.EventEmitter {
 
         fetchDocumentErrors.push(error);
       }
-    }));
+    }
 
     if (fetchDocumentErrors.length) {
       throw new InaccessibleContentError(fetchDocumentErrors);
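
The behavioral difference introduced by this change can be illustrated outside the engine. The following is a minimal sketch, not the project's code: `createFakeFetch`, `fetchSequentially`, and `fetchConcurrently` are hypothetical helpers, and the fake fetcher only simulates latency with a timer. It shows that the `for...of` form starts each request only after the previous one has settled, while the `Promise.all` form starts all requests at once; both collect errors instead of stopping at the first failure.

```javascript
// Hypothetical stand-in for a real fetcher: logs when each request starts and ends.
function createFakeFetch(log) {
  return async function fakeFetch(url) {
    log.push(`start ${url}`);
    await new Promise(resolve => setTimeout(resolve, 10)); // simulated network latency

    if (url.includes('broken')) {
      throw new Error(`Cannot fetch ${url}`);
    }

    log.push(`end ${url}`);

    return { url, content: `content of ${url}` };
  };
}

// Sequential variant: one request in flight at a time, as in the new loop.
async function fetchSequentially(urls, fetch) {
  const results = [];
  const errors = [];

  for (const url of urls) {
    try {
      results.push(await fetch(url)); // eslint-disable-line no-await-in-loop
    } catch (error) {
      errors.push(error); // keep going so that all failures are reported together
    }
  }

  return { results, errors };
}

// Concurrent variant: every request starts immediately, as in the previous code.
async function fetchConcurrently(urls, fetch) {
  const results = [];
  const errors = [];

  await Promise.all(urls.map(async url => {
    try {
      results.push(await fetch(url));
    } catch (error) {
      errors.push(error);
    }
  }));

  return { results, errors };
}
```

With the sequential variant, total duration grows with the number of source documents, which is presumably the accepted trade-off for looking less like automated traffic to the target server.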
