awesome-open-source-llms/details/biogpt.md at master · ever-works/awesome-open-source-llms

Overview

BioGPT is a domain-specific generative Transformer language model pretrained on large-scale biomedical literature, developed by Microsoft Research. Unlike BERT-based models common in biomedical NLP, BioGPT brings generative capabilities to the biomedical domain while achieving state-of-the-art performance on discriminative tasks.

Architecture

Base Architecture: GPT-2 style transformer
Domain: Biomedical and scientific text
Training Data: Large-scale biomedical literature (PubMed abstracts and full-text articles)
Model Variants: BioGPT (base), BioGPT-Large

Benchmark Performance

Relation Extraction Tasks

BC5CDR (Chemical-Disease Relations):

F1 Score: 44.98%
New state-of-the-art record

KD-DTI (Drug-Target Interactions):

F1 Score: 38.42%
New state-of-the-art record

DDI (Drug-Drug Interactions):

F1 Score: 40.76%
New state-of-the-art record

Question Answering

PubMedQA:

Accuracy: 78.2%
Achieves human-expert-level performance
New state-of-the-art for the benchmark

Text Generation

Superior biomedical text generation compared to general-purpose GPT-2
Maintains domain-specific terminology and context
Coherent and accurate scientific writing

Key Capabilities

Generative Tasks

Biomedical text generation
Scientific abstract writing
Medical report generation
Literature summarization
Hypothesis generation

Discriminative Tasks

Named entity recognition
Relation extraction (chemicals, diseases, drugs, proteins)
Document classification
Question answering on medical literature
Information extraction from scientific papers

Advantages Over BERT Variants

While BioBERT, PubMedBERT, and similar models excel at discriminative tasks, they lack generation ability which constrains their application scope. BioGPT provides:

Full generative capabilities for content creation
Competitive or superior performance on discriminative tasks
Unified model for both understanding and generation

Use Cases

Research Applications

Literature review assistance
Hypothesis generation from existing research
Automated scientific writing support
Knowledge discovery from biomedical texts

Clinical Applications

Medical report generation
Clinical note assistance
Patient information summarization
Drug interaction checking

Educational Applications

Medical education tools
Scientific writing tutorials
Biomedical concept explanation

Training Data

Pretrained on:

PubMed abstracts (millions of scientific papers)
PMC full-text articles
Biomedical and clinical text corpora
Domain-specific terminology and concepts

Deployment Options

Available Platforms

Hugging Face Hub: https://huggingface.co/microsoft/biogpt
Microsoft Research: Official model checkpoints
Pre-trained Checkpoints: Base models
Fine-tuned Checkpoints: Task-specific models

Model Sizes

microsoft/biogpt (base model)
microsoft/BioGPT-Large (larger variant)
Fine-tuned variants for specific tasks

Integration

Compatible with:

Hugging Face Transformers library
PyTorch workflows
Standard NLP pipelines
Custom fine-tuning frameworks

Fine-tuning

BioGPT can be fine-tuned for:

Specific medical specialties
Custom relation extraction tasks
Domain-specific question answering
Institutional terminology and formats

Performance Characteristics

Human-expert-level accuracy on biomedical QA
State-of-the-art relation extraction
Coherent domain-specific generation
Better than general-purpose models on biomedical tasks
Maintains scientific accuracy in generated text

Research Team

Developed by Microsoft Research:

Renqian Luo
Liai Sun
Yingce Xia
Tao Qin
Sheng Zhang
Hoifung Poon
Tie-Yan Liu

Licensing

MIT License - applies to:

Pre-trained model checkpoints
Fine-tuned model checkpoints
Training and inference code
Commercial use permitted

Pricing

Free and open source under MIT License.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Overview

Architecture

Benchmark Performance

Relation Extraction Tasks

Question Answering

Text Generation

Key Capabilities

Generative Tasks

Discriminative Tasks

Advantages Over BERT Variants

Use Cases

Research Applications

Clinical Applications

Educational Applications

Training Data

Deployment Options

Available Platforms

Model Sizes

Integration

Fine-tuning

Performance Characteristics

Research Team

Licensing

Pricing

FilesExpand file tree

biogpt.md

Latest commit

History

biogpt.md

File metadata and controls

Overview

Architecture

Benchmark Performance

Relation Extraction Tasks

Question Answering

Text Generation

Key Capabilities

Generative Tasks

Discriminative Tasks

Advantages Over BERT Variants

Use Cases

Research Applications

Clinical Applications

Educational Applications

Training Data

Deployment Options

Available Platforms

Model Sizes

Integration

Fine-tuning

Performance Characteristics

Research Team

Licensing

Pricing