error(R-script): bioRxiv `funding` parse error and medRxiv download interruptions #20

### **Description:**
Ingestion of bioRxiv and medRxiv is failing on the `es-journals` cluster deployed on Hetzner.

For bioRxiv, indexing fails with a `document_parsing_exception`: the `funding` field is mapped as `text`, but an incoming record carries a structured value (an object with `award`, `name`, `id`, and `id-type` keys).

For medRxiv, no Elasticsearch parsing error is recorded. The failure occurs earlier, during the download step, where the R script intermittently halts with a progress-bar error (`!self$finished is not TRUE`) and interrupts ingestion.

The last known successful updates were on 2025-07-29 for bioRxiv (the run on that date hit the parsing error) and on 2025-08-05 for medRxiv (42 records were indexed before subsequent download attempts failed).

```
2025-07-29 02:30:17.213 | ERROR | Failed to index data for biorxiv: BadRequestError(400, 'document_parsing_exception', "... failed to parse field [funding] of type [text] ... Preview: '{award=CRSII5_170930;, name=Swiss National Science Foundation, id=https://ror.org/00yjd3n13, id-type=ROR}'")
2025-08-05 11:30:08.246 | INFO  | Starting the indexing process for medrxiv...
Estimated total number of records as per API metadata: 100
Error in pb_tick(self, private, len, tokens) : !self$finished is not TRUE
[ERROR]: Download process failed for medrxiv.
```
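
One possible workaround for the bioRxiv error, sketched below, is to flatten a structured `funding` value into a plain string before the document is sent to Elasticsearch, so it still fits the existing `text` mapping. This is only a sketch under assumptions: the helper names `funding_to_text` and `normalize_record` are hypothetical, the record shape is inferred from the error preview above, and the DOI in the example is a placeholder. It is not taken from the actual ingest code.

```python
from typing import Any, Optional


def funding_to_text(value: Any) -> Optional[str]:
    """Flatten a funding value into a plain string for a `text` field.

    Hypothetical helper: strings pass through unchanged, dicts become
    "key=value" pairs, and lists are flattened item by item.
    """
    if value is None:
        return None
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        # Keep the keys in their original order, e.g. award, name, id, id-type.
        return ", ".join(f"{key}={val}" for key, val in value.items())
    if isinstance(value, (list, tuple)):
        # Flatten each entry, drop empty ones, and separate funders with " | ".
        parts = [funding_to_text(item) for item in value]
        return " | ".join(part for part in parts if part)
    # Any other scalar (number, bool) is stringified as a last resort.
    return str(value)


def normalize_record(record: dict) -> dict:
    """Return a shallow copy of the record with `funding` coerced to text."""
    cleaned = dict(record)
    if "funding" in cleaned:
        cleaned["funding"] = funding_to_text(cleaned["funding"])
    return cleaned


# Example record; the funding object mirrors the preview in the error log,
# the DOI is a placeholder.
example = {
    "doi": "10.1101/placeholder",
    "funding": [
        {
            "award": "CRSII5_170930;",
            "name": "Swiss National Science Foundation",
            "id": "https://ror.org/00yjd3n13",
            "id-type": "ROR",
        }
    ],
}
print(normalize_record(example)["funding"])
```

The alternative would be to remap `funding` as an `object` or `flattened` field, which preserves the structure but requires a new index and a reindex, since the type of an existing field cannot be changed in place.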

