BioGPT is a domain-specific generative Transformer language model pretrained on large-scale biomedical literature, developed by Microsoft Research. Unlike BERT-based models common in biomedical NLP, BioGPT brings generative capabilities to the biomedical domain while achieving state-of-the-art performance on discriminative tasks.
- Base Architecture: GPT-2 style transformer
- Domain: Biomedical and scientific text
- Training Data: Large-scale biomedical literature (PubMed abstracts and full-text articles)
- Model Variants: BioGPT (base), BioGPT-Large
BC5CDR (Chemical-Disease Relations):
- F1 Score: 44.98%
- New state-of-the-art record
KD-DTI (Drug-Target Interactions):
- F1 Score: 38.42%
- New state-of-the-art record
DDI (Drug-Drug Interactions):
- F1 Score: 40.76%
- New state-of-the-art record
PubMedQA:
- Accuracy: 78.2%
- Achieves human-expert-level performance
- New state-of-the-art for the benchmark
- Superior biomedical text generation compared to general-purpose GPT-2
- Maintains domain-specific terminology and context
- Coherent and accurate scientific writing
- Biomedical text generation
- Scientific abstract writing
- Medical report generation
- Literature summarization
- Hypothesis generation
- Named entity recognition
- Relation extraction (chemicals, diseases, drugs, proteins)
- Document classification
- Question answering on medical literature
- Information extraction from scientific papers
While BioBERT, PubMedBERT, and similar models excel at discriminative tasks, they lack generation ability which constrains their application scope. BioGPT provides:
- Full generative capabilities for content creation
- Competitive or superior performance on discriminative tasks
- Unified model for both understanding and generation
- Literature review assistance
- Hypothesis generation from existing research
- Automated scientific writing support
- Knowledge discovery from biomedical texts
- Medical report generation
- Clinical note assistance
- Patient information summarization
- Drug interaction checking
- Medical education tools
- Scientific writing tutorials
- Biomedical concept explanation
Pretrained on:
- PubMed abstracts (millions of scientific papers)
- PMC full-text articles
- Biomedical and clinical text corpora
- Domain-specific terminology and concepts
- Hugging Face Hub: https://huggingface.co/microsoft/biogpt
- Microsoft Research: Official model checkpoints
- Pre-trained Checkpoints: Base models
- Fine-tuned Checkpoints: Task-specific models
- microsoft/biogpt (base model)
- microsoft/BioGPT-Large (larger variant)
- Fine-tuned variants for specific tasks
Compatible with:
- Hugging Face Transformers library
- PyTorch workflows
- Standard NLP pipelines
- Custom fine-tuning frameworks
BioGPT can be fine-tuned for:
- Specific medical specialties
- Custom relation extraction tasks
- Domain-specific question answering
- Institutional terminology and formats
- Human-expert-level accuracy on biomedical QA
- State-of-the-art relation extraction
- Coherent domain-specific generation
- Better than general-purpose models on biomedical tasks
- Maintains scientific accuracy in generated text
Developed by Microsoft Research:
- Renqian Luo
- Liai Sun
- Yingce Xia
- Tao Qin
- Sheng Zhang
- Hoifung Poon
- Tie-Yan Liu
MIT License - applies to:
- Pre-trained model checkpoints
- Fine-tuned model checkpoints
- Training and inference code
- Commercial use permitted
Free and open source under MIT License.