Transfer our bibliography data to zbmath, then replace our bibliography with a link to swmath #343

@fingolfin

Description

The link: https://zbmath.org/software/320

This list provides basically everything we have at https://www.gap-system.org/Doc/Bib/bib.html and even has some nice additional features. And unlike MathSciNet, it is free to use.

While it has more publications overall than we do, it does miss some. In some cases papers might not be indexed by them at all, but so far every case I found where a paper is in our list but not in theirs was a matter of missing metadata on their part, i.e., the tag "sw:gap" is missing on some papers for whatever reason.

I have contacted them and in principle I can send them lists of papers that are missing this tag and they'll add it (presumably after some validation, of course).

That leaves the problem as to how we get that list. Of course we can manually check things but there are thousands. So better to automate it. Here is how one could do that:

  1. get our data -- easy, just download https://www.gap-system.org/Doc/Bib/gap-publishednicer.bib
  2. get their data
    • I wrote a script to do so with their help and some manual tweaking, and have that .bib file (it is 1.8 MB so I am not attaching it but instead I'll add the crude script below)
  3. write a tool which parses the bib files (e.g. in Python and using https://bibtexparser.readthedocs.io/en/main/), then lists papers we have but they don't
    • this is easy for papers with a DOI and if both sides have the DOI, so let's drop those first
    • next compare using title, year, author(s)?
    • keep refining but at some point it will be more efficient to just let humans consider the lists...
  4. for the remaining papers, try to get their zbmath ID ... this could use their website, but it seems they have an API for that, with some Python bindings here: https://github.com/zbMATHOpen/zbRestApiClient
    • actually it may make sense to combine 3 with 4: if we can identify one of "our" papers using the zbmath API then it is easy to determine if it is in their list of "papers using GAP" or not...
  5. the final result would be two lists of papers
    • one with papers we have but they don't and which we successfully identified (we probably just need the list of ids here and can then send it to them)
    • papers they don't seem to have in the database at all
      • this will certainly include many theses!
      • how we deal with these we'll have to decide once we have that list.
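
Steps 3 and 5 could be sketched roughly as follows. This is only a minimal sketch under some assumptions: each bib entry has already been parsed into a plain dict with lowercase field names (`doi`, `title`, `year`), e.g. with bibtexparser; the helper names `norm_doi`, `norm_title` and `missing_from` are made up for illustration; and the title normalization is deliberately crude (it does not handle LaTeX accent macros properly, for instance).

```python
import re
import unicodedata


def norm_doi(entry):
    """Return the entry's DOI in lowercase without a resolver prefix, or None."""
    doi = (entry.get("doi") or "").strip().lower()
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)
    return doi or None


def norm_title(entry):
    """Return a crude (title, year) key: ASCII-folded, lowercase, alphanumerics only."""
    title = unicodedata.normalize("NFKD", entry.get("title") or "")
    title = title.encode("ascii", "ignore").decode("ascii").lower()
    title = re.sub(r"[^a-z0-9]+", " ", title).strip()
    return (title, str(entry.get("year") or "").strip())


def missing_from(ours, theirs):
    """Return the entries of `ours` that match nothing in `theirs`.

    First pass: match on DOI. Second pass: match on (title, year).
    Whatever is left over is a candidate for the zbmath ID lookup or
    for manual review.
    """
    their_dois = {d for d in map(norm_doi, theirs) if d}
    their_titles = {k for k in map(norm_title, theirs) if k[0]}
    missing = []
    for entry in ours:
        doi = norm_doi(entry)
        if doi and doi in their_dois:
            continue  # both sides have the same DOI
        key = norm_title(entry)
        if key[0] and key in their_titles:
            continue  # same normalized title and year
        missing.append(entry)
    return missing
```

Comparing authors as well would cut down false matches between papers with generic titles, at the cost of having to normalize name formats; that refinement is left out here.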

Script for getting zbmath data

#!/bin/sh
# Download all zbmath entries for GAP (software id 320) as BibTeX,
# 200 records per request, for offsets 0 through 3600.
echo > zbmath.bib
start=0
while [ "$start" -le 3600 ]; do
    curl "https://zbmath.org/bibtexoutput/?q=si%3A320&start=${start}&count=200" >> zbmath.bib
    start=$((start + 200))
done
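
The same paging could also be done in Python, which would fit with the Python tooling proposed in step 3. A minimal sketch, assuming the `bibtexoutput` endpoint and its `q`, `start` and `count` parameters behave as in the shell script above; `page_urls` and `fetch_all` are hypothetical helper names, and the default total of 3800 just mirrors the script's last offset (3600) rather than the real number of results.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://zbmath.org/bibtexoutput/"


def page_urls(query="si:320", total=3800, count=200):
    """Yield one URL per page of `count` records, covering offsets 0..total-1."""
    for start in range(0, total, count):
        params = {"q": query, "start": start, "count": count}
        yield BASE + "?" + urlencode(params)


def fetch_all(path="zbmath.bib"):
    """Download every page and concatenate the raw responses into `path`."""
    with open(path, "wb") as out:
        for url in page_urls():
            with urlopen(url) as resp:
                out.write(resp.read())
```

Calling `fetch_all()` performs the actual downloads; `page_urls()` on its own only builds the URLs, so it can be checked without touching the network.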
