Open
Labels: bug (Something isn't working)
Description:
bioRxiv and medRxiv ingestion is failing on the es-journals cluster deployed on Hetzner.

bioRxiv: indexing fails with a document_parsing_exception on the funding field, which is mapped as text, when a record supplies a structured value instead of a plain string.

medRxiv: no Elasticsearch parsing error is recorded. The failure occurs during the download step, where the R script intermittently halts with a progress-bar error and interrupts ingestion.

Last known successful updates: 2025-07-29 for bioRxiv (the bioRxiv run hit the parsing error) and 2025-08-05 for medRxiv (42 records were indexed before subsequent download attempts failed).

Relevant log excerpts:
2025-07-29 02:30:17.213 | ERROR | Failed to index data for biorxiv: BadRequestError(400, 'document_parsing_exception', "... failed to parse field [funding] of type [text] ... Preview: '{award=CRSII5_170930;, name=Swiss National Science Foundation, id=https://ror.org/00yjd3n13, id-type=ROR}'")
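One possible mitigation for the bioRxiv failure is to flatten structured funding values into a string before the document is sent to Elasticsearch. The sketch below is only an illustration under assumptions: the function names `funding_to_text` and `normalize_record` are hypothetical, and it assumes each record is a Python dict whose `funding` value may be a string, a dict, or a list of either. The actual shape of the source data is not confirmed by the log.

```python
from typing import Any, Optional


def funding_to_text(value: Any) -> Optional[str]:
    """Collapse a funding value into a plain string suitable for a `text` field.

    Hypothetical helper: the real pipeline may represent funding differently.
    """
    if value is None:
        return None
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        # Render "key: value" pairs in their original order, skipping empty values.
        return ", ".join(
            f"{key}: {val}" for key, val in value.items() if val not in (None, "")
        )
    if isinstance(value, (list, tuple)):
        # Flatten each entry, then join the non-empty results.
        parts = [funding_to_text(item) for item in value]
        return " | ".join(part for part in parts if part)
    return str(value)


def normalize_record(record: dict) -> dict:
    """Return a shallow copy of the record with `funding` coerced to a string."""
    cleaned = dict(record)
    if "funding" in cleaned:
        cleaned["funding"] = funding_to_text(cleaned["funding"])
    return cleaned
```

Calling `normalize_record` on each record before indexing would keep the existing text mapping unchanged. An alternative would be to map funding as a structured field instead, but changing the type of an existing field requires creating a new index and reindexing.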
2025-08-05 11:30:08.246 | INFO | Starting the indexing process for medrxiv...
Estimated total number of records as per API metadata: 100
Error in pb_tick(self, private, len, tokens) : !self$finished is not TRUE
[ERROR]: Download process failed for medrxiv.
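Because the medRxiv download fails only intermittently, a retry with backoff around the download step may serve as a stopgap while the root cause is investigated. The sketch below is a generic illustration, not the pipeline's actual code: `run_with_retries` and its parameters are hypothetical, and `step` stands in for whatever callable launches the R download script. The `pb_tick` message suggests a progress bar being advanced after it has already finished, possibly because more records arrive than the estimated total; that reading is an assumption the log does not confirm, and retrying would not fix it if it is deterministic for a given date range.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retries(
    step: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `step`, retrying on any exception with exponential backoff.

    Hypothetical helper: `step` is assumed to raise when the download fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[BaseException] = None
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                # Wait base_delay, then 2x, 4x, ... before the next attempt.
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"step failed after {attempts} attempts") from last_error
```

The `sleep` parameter is injectable so the backoff can be tested without real delays.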