Merged
Quick links (staging server): Login
chart-diff: ✅
data-diff: ❌ Found differences
= Dataset garden/artificial_intelligence/2025-03-12/epoch
~ Table epoch (changed metadata)
- - date_accessed: '2026-01-26'
+ + date_accessed: '2026-02-27'
~ Dim days_since_1949
+ + New values: 9 / 981 (0.92%)
model days_since_1949
GPT-3.5 Turbo 27191
GPT-3.5 Turbo Instruct 27298
o1-pro 27836
GPT-5.2 Codex 28110
MiniMax-M2.1 28115
- - Removed values: 4 / 981 (0.41%)
model days_since_1949
GPT-3.5 (davinci-002)\n 26994
GPT-3.5 Turbo 26996
GPT-4o (May 2024) 27526
Qwen3-235B-A22B-Instruct (Jul 2025) 27964
~ Dim model
+ + New values: 9 / 981 (0.92%)
days_since_1949 model
27191 GPT-3.5 Turbo
27298 GPT-3.5 Turbo Instruct
27836 o1-pro
28110 GPT-5.2 Codex
28115 MiniMax-M2.1
- - Removed values: 4 / 981 (0.41%)
days_since_1949 model
26994 GPT-3.5 (davinci-002)\n
26996 GPT-3.5 Turbo
27526 GPT-4o (May 2024)
27964 Qwen3-235B-A22B-Instruct (Jul 2025)
~ Column domain (new data, changed data)
+ + New values: 9 / 981 (0.92%)
days_since_1949 model domain
27191 GPT-3.5 Turbo Language
27298 GPT-3.5 Turbo Instruct Language
27836 o1-pro Multiple domains
28110 GPT-5.2 Codex Language
28115 MiniMax-M2.1 Language
- - Removed values: 4 / 981 (0.41%)
days_since_1949 model domain
26994 GPT-3.5 (davinci-002)\n Language
26996 GPT-3.5 Turbo Language
27526 GPT-4o (May 2024) Multiple domains
27964 Qwen3-235B-A22B-Instruct (Jul 2025) Language
~ Column organization_categorization (new data, changed data)
+ + New values: 9 / 981 (0.92%)
days_since_1949 model organization_categorization
27191 GPT-3.5 Turbo Industry
27298 GPT-3.5 Turbo Instruct Industry
27836 o1-pro Industry
28110 GPT-5.2 Codex Industry
28115 MiniMax-M2.1 Industry
- - Removed values: 4 / 981 (0.41%)
days_since_1949 model organization_categorization
26994 GPT-3.5 (davinci-002)\n Industry
26996 GPT-3.5 Turbo Industry
27526 GPT-4o (May 2024) Industry
27964 Qwen3-235B-A22B-Instruct (Jul 2025) Industry
~ Column parameters (new data, changed data)
+ + New values: 9 / 981 (0.92%)
days_since_1949 model parameters
27191 GPT-3.5 Turbo 20000000000
27298 GPT-3.5 Turbo Instruct 20000000000
27836 o1-pro <NA>
28110 GPT-5.2 Codex <NA>
28115 MiniMax-M2.1 229000000000
- - Removed values: 4 / 981 (0.41%)
days_since_1949 model parameters
26994 GPT-3.5 (davinci-002)\n <NA>
26996 GPT-3.5 Turbo 20000000000
27526 GPT-4o (May 2024) <NA>
27964 Qwen3-235B-A22B-Instruct (Jul 2025) 235000000000
~ Column publication_date (new data, changed data)
+ + New values: 9 / 981 (0.92%)
days_since_1949 model publication_date
27191 GPT-3.5 Turbo 2023-06-13
27298 GPT-3.5 Turbo Instruct 2023-09-28
27836 o1-pro 2025-03-19
28110 GPT-5.2 Codex 2025-12-18
28115 MiniMax-M2.1 2025-12-23
- - Removed values: 4 / 981 (0.41%)
days_since_1949 model publication_date
26994 GPT-3.5 (davinci-002)\n 2022-11-28
26996 GPT-3.5 Turbo 2022-11-30
27526 GPT-4o (May 2024) 2024-05-13
27964 Qwen3-235B-A22B-Instruct (Jul 2025) 2025-07-25
~ Column training_computation_petaflop (new data, changed data)
+ + New values: 9 / 981 (0.92%)
days_since_1949 model training_computation_petaflop
27191 GPT-3.5 Turbo <NA>
27298 GPT-3.5 Turbo Instruct <NA>
27836 o1-pro <NA>
28110 GPT-5.2 Codex <NA>
28115 MiniMax-M2.1 <NA>
- - Removed values: 4 / 981 (0.41%)
days_since_1949 model training_computation_petaflop
26994 GPT-3.5 (davinci-002)\n 2577999872.0
26996 GPT-3.5 Turbo <NA>
27526 GPT-4o (May 2024) <NA>
27964 Qwen3-235B-A22B-Instruct (Jul 2025) 4752000000.0
~ Changed values: 3 / 981 (0.31%)
days_since_1949 model training_computation_petaflop - training_computation_petaflop +
27751 DeepSeek-V3 3407800064.0 3300000000.0
27828 Hunyuan-TurboS <NA> 5400000000.0
28082 Olmo 3 <NA> 1100000000.0
~ Column training_dataset_size__gradients (changed metadata, new data, changed data)
+ + {}
- - title: Training dataset size
- - description_short: |-
- - The number of unique data points used to train the model. Each domain has a specific data point unit; for example, for vision it is images, for language it is words, and for games it is timesteps. This means systems can only be compared directly within the same domain.
- - description_key:
- - - |-
- - Training data size measures the volume of unique examples used to train an AI model during its learning phase. It represents the total number of distinct data points the model learns from, counted only once regardless of how many times they're seen during training.
- - - |-
- - To understand this concept, imagine teaching someone to identify different bird species. Each unique bird photo you show them is one piece of training data. If you show 100 different photos, your training data size is 100, even if you review those same photos multiple times.
- - - |-
- - Since datasets vary by domain, there's no universal unit for measuring size. Text models might count tokens, image models count pictures, and video models count clips. Epoch AI typically uses the smallest unit that triggers a model update during training. For language models that predict the next word, this would be individual tokens.
- - - |-
- - Training data size directly impacts model performance. Larger datasets enable deeper learning and more nuanced pattern recognition, allowing models to identify subtle distinctions and handle diverse real-world scenarios more effectively.
- - unit: unique datapoints
- - display:
- - numDecimalPlaces: 0
- - zeroDay: '1949-01-01'
- - yearIsDay: true
- - processing_level: major
- - presentation:
- - topic_tags:
- - - Artificial Intelligence
+ + New values: 9 / 981 (0.92%)
days_since_1949 model training_dataset_size__gradients
27191 GPT-3.5 Turbo NaN
27298 GPT-3.5 Turbo Instruct NaN
27836 o1-pro NaN
28110 GPT-5.2 Codex NaN
28115 MiniMax-M2.1 NaN
- - Removed values: 4 / 981 (0.41%)
days_since_1949 model training_dataset_size__gradients
26994 GPT-3.5 (davinci-002)\n <NA>
26996 GPT-3.5 Turbo <NA>
27526 GPT-4o (May 2024) <NA>
27964 Qwen3-235B-A22B-Instruct (Jul 2025) 36000000000000
~ Changed values: 656 / 981 (66.87%)
days_since_1949 model training_dataset_size__gradients - training_dataset_size__gradients +
12661 ASE+ACE 500000 NaN
21735 Denoising Autoencoders 7840000 NaN
25017 NoisyNet-Dueling 320000000 NaN
26792 UL2 1000000000000 NaN
27877 Qwen3-235B-A22B 36000000000000 NaN
~ Column training_dataset_size__total (changed metadata, new data, changed data)
- - {}
+ + title: Training dataset size
+ + description_short: |-
+ + The number of unique data points used to train the model. Each domain has a specific data point unit; for example, for vision it is images, for language it is words, and for games it is timesteps. This means systems can only be compared directly within the same domain.
+ + description_key:
+ + - |-
+ + Training data size measures the volume of unique examples used to train an AI model during its learning phase. It represents the total number of distinct data points the model learns from, counted only once regardless of how many times they're seen during training.
+ + - |-
+ + To understand this concept, imagine teaching someone to identify different bird species. Each unique bird photo you show them is one piece of training data. If you show 100 different photos, your training data size is 100, even if you review those same photos multiple times.
+ + - |-
+ + Since datasets vary by domain, there's no universal unit for measuring size. Text models might count tokens, image models count pictures, and video models count clips. Epoch AI typically uses the smallest unit that triggers a model update during training. For language models that predict the next word, this would be individual tokens.
+ + - |-
+ + Training data size directly impacts model performance. Larger datasets enable deeper learning and more nuanced pattern recognition, allowing models to identify subtle distinctions and handle diverse real-world scenarios more effectively.
+ + unit: unique datapoints
+ + display:
+ + numDecimalPlaces: 0
+ + zeroDay: '1949-01-01'
+ + yearIsDay: true
+ + processing_level: major
+ + presentation:
+ + topic_tags:
+ + - Artificial Intelligence
+ + New values: 9 / 981 (0.92%)
days_since_1949 model training_dataset_size__total
27191 GPT-3.5 Turbo <NA>
27298 GPT-3.5 Turbo Instruct <NA>
27836 o1-pro <NA>
28110 GPT-5.2 Codex <NA>
28115 MiniMax-M2.1 <NA>
- - Removed values: 4 / 981 (0.41%)
days_since_1949 model training_dataset_size__total
26994 GPT-3.5 (davinci-002)\n NaN
26996 GPT-3.5 Turbo NaN
27526 GPT-4o (May 2024) NaN
27964 Qwen3-235B-A22B-Instruct (Jul 2025) NaN
~ Changed values: 656 / 981 (66.87%)
days_since_1949 model training_dataset_size__total - training_dataset_size__total +
12661 ASE+ACE NaN 500000
21735 Denoising Autoencoders NaN 7840000
25017 NoisyNet-Dueling NaN 320000000
26792 UL2 NaN 1000000000000
27877 Qwen3-235B-A22B NaN 36000000000000
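The `zeroDay: '1949-01-01'` and `yearIsDay: true` display settings in the metadata above mean the `days_since_1949` dimension is a plain day offset from 1949-01-01. A minimal stdlib-only sketch of the decoding (the helper name is illustrative, not part of the ETL codebase):

```python
from datetime import date, timedelta

ZERO_DAY = date(1949, 1, 1)  # matches the column's zeroDay display setting

def days_to_date(days_since_1949: int) -> date:
    """Decode a days_since_1949 offset into a calendar date."""
    return ZERO_DAY + timedelta(days=days_since_1949)

# Offset 27191 (GPT-3.5 Turbo in the diff above) decodes to 2023-06-13,
# which matches its publication_date column.
print(days_to_date(27191))  # 2023-06-13
```

The same check works for any row in the diff, e.g. offset 28110 decodes to 2025-12-18, the publication date listed for GPT-5.2 Codex.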
= Dataset garden/artificial_intelligence/2025-03-12/epoch_aggregates_affiliation
~ Table epoch_aggregates_affiliation (changed metadata)
- - date_accessed: '2026-01-26'
+ + date_accessed: '2026-02-27'
~ Column cumulative_count (changed metadata, changed data)
- - Describes the sector where the authors of a notable AI system have their primary affiliations. The 2026 data is incomplete and was last updated 26 January 2026.
+ + Describes the sector where the authors of a notable AI system have their primary affiliations. The 2026 data is incomplete and was last updated 27 February 2026.
~ Changed values: 4 / 305 (1.31%)
year organization_categorization cumulative_count - cumulative_count +
2022 Industry 284 283
2023 Industry 351 352
2024 Industry 432 433
2025 Industry 515 520
~ Column yearly_count (changed metadata, changed data)
- - Describes the sector where the authors of a notable AI system have their primary affiliations. The 2026 data is incomplete and was last updated 26 January 2026.
+ + Describes the sector where the authors of a notable AI system have their primary affiliations. The 2026 data is incomplete and was last updated 27 February 2026.
~ Changed values: 3 / 305 (0.98%)
year organization_categorization yearly_count - yearly_count +
2022 Industry 51 50
2023 Industry 67 69
2025 Industry 83 87
= Dataset garden/artificial_intelligence/2025-03-12/epoch_aggregates_domain
~ Table epoch_aggregates_domain (changed metadata)
- - date_accessed: '2026-01-26'
+ + date_accessed: '2026-02-27'
~ Column cumulative_count (changed metadata, changed data)
- - Describes the specific area, application, or field in which an AI system is designed to operate. An AI system can operate in more than one domain, thus contributing to the count for multiple domains. The 2026 data is incomplete and was last updated 26 January 2026.
+ + Describes the specific area, application, or field in which an AI system is designed to operate. An AI system can operate in more than one domain, thus contributing to the count for multiple domains. The 2026 data is incomplete and was last updated 27 February 2026.
~ Changed values: 6 / 793 (0.76%)
year domain cumulative_count - cumulative_count +
2022 Language 255 254
2023 Language 329 330
2024 Language 406 407
2025 Language 484 489
2025 Multimodal 103 104
~ Column yearly_count (changed metadata, changed data)
- - Describes the specific area, application, or field in which an AI system is designed to operate. An AI system can operate in more than one domain, thus contributing to the count for multiple domains. The 2026 data is incomplete and was last updated 26 January 2026.
+ + Describes the specific area, application, or field in which an AI system is designed to operate. An AI system can operate in more than one domain, thus contributing to the count for multiple domains. The 2026 data is incomplete and was last updated 27 February 2026.
~ Changed values: 5 / 793 (0.63%)
year domain yearly_count - yearly_count +
2022 Language 48 47
2023 Language 74 76
2025 Language 78 82
2025 Mathematics 3 4
2025 Multimodal 34 35
= Dataset garden/artificial_intelligence/2025-03-12/epoch_compute_intensive
~ Table epoch_compute_intensive (changed metadata)
- - date_accessed: '2026-01-26'
+ + date_accessed: '2026-02-27'
~ Dim days_since_1949
+ + New values: 9 / 485 (1.86%)
model days_since_1949
GPT-3.5 Turbo 27191
GPT-3.5 Turbo Instruct 27298
o1-pro 27836
Qwen3-235B-A22B (Jul 2025) 27964
Olmo 3 28082
- - Removed values: 8 / 485 (1.65%)
model days_since_1949
GPT-3.5 Turbo 26996
GPT-4o (May 2024) 27526
Gemini 2.5 Flash (Apr 2025) 27865
Qwen3-235B-A22B-Instruct (Jul 2025) 27964
Gemini 2.5 Flash (Sep 2025) 28026
~ Dim model
+ + New values: 9 / 485 (1.86%)
days_since_1949 model
27191 GPT-3.5 Turbo
27298 GPT-3.5 Turbo Instruct
27836 o1-pro
27964 Qwen3-235B-A22B (Jul 2025)
28082 Olmo 3
- - Removed values: 8 / 485 (1.65%)
days_since_1949 model
26996 GPT-3.5 Turbo
27526 GPT-4o (May 2024)
27865 Gemini 2.5 Flash (Apr 2025)
27964 Qwen3-235B-A22B-Instruct (Jul 2025)
28026 Gemini 2.5 Flash (Sep 2025)
~ Column domain (new data, changed data)
+ + New values: 9 / 485 (1.86%)
days_since_1949 model domain
27191 GPT-3.5 Turbo Language
27298 GPT-3.5 Turbo Instruct Language
27836 o1-pro Language,Mathematics,Multimodal
27964 Qwen3-235B-A22B (Jul 2025) Language
28082 Olmo 3 Language
- - Removed values: 8 / 485 (1.65%)
days_since_1949 model domain
26996 GPT-3.5 Turbo Language
27526 GPT-4o (May 2024) Multimodal,Language,Audio,Speech,Vision
27865 Gemini 2.5 Flash (Apr 2025) Language,Multimodal,Vision,Speech,Video
27964 Qwen3-235B-A22B-Instruct (Jul 2025) Language
28026 Gemini 2.5 Flash (Sep 2025) Language,Multimodal,Vision,Speech,Video
~ Column parameters (new data, changed data)
+ + New values: 9 / 485 (1.86%)
days_since_1949 model parameters
27191 GPT-3.5 Turbo 20000000000
27298 GPT-3.5 Turbo Instruct 20000000000
27836 o1-pro <NA>
27964 Qwen3-235B-A22B (Jul 2025) 235000000000
28082 Olmo 3 32000000000
- - Removed values: 8 / 485 (1.65%)
days_since_1949 model parameters
26996 GPT-3.5 Turbo 20000000000
27526 GPT-4o (May 2024) <NA>
27865 Gemini 2.5 Flash (Apr 2025) <NA>
27964 Qwen3-235B-A22B-Instruct (Jul 2025) 235000000000
28026 Gemini 2.5 Flash (Sep 2025) <NA>
~ Column publication_date (new data, changed data)
+ + New values: 9 / 485 (1.86%)
days_since_1949 model publication_date
27191 GPT-3.5 Turbo 2023-06-13
27298 GPT-3.5 Turbo Instruct 2023-09-28
27836 o1-pro 2025-03-19
27964 Qwen3-235B-A22B (Jul 2025) 2025-07-25
28082 Olmo 3 2025-11-20
- - Removed values: 8 / 485 (1.65%)
days_since_1949 model publication_date
26996 GPT-3.5 Turbo 2022-11-30
27526 GPT-4o (May 2024) 2024-05-13
27865 Gemini 2.5 Flash (Apr 2025) 2025-04-17
27964 Qwen3-235B-A22B-Instruct (Jul 2025) 2025-07-25
28026 Gemini 2.5 Flash (Sep 2025) 2025-09-25
~ Column training_computation_petaflop (new data, changed data)
+ + New values: 9 / 485 (1.86%)
days_since_1949 model training_computation_petaflop
27191 GPT-3.5 Turbo <NA>
27298 GPT-3.5 Turbo Instruct <NA>
27836 o1-pro <NA>
27964 Qwen3-235B-A22B (Jul 2025) 4752000000.0
28082 Olmo 3 1100000000.0
- - Removed values: 8 / 485 (1.65%)
days_since_1949 model training_computation_petaflop
26996 GPT-3.5 Turbo <NA>
27526 GPT-4o (May 2024) <NA>
27865 Gemini 2.5 Flash (Apr 2025) <NA>
27964 Qwen3-235B-A22B-Instruct (Jul 2025) 4752000000.0
28026 Gemini 2.5 Flash (Sep 2025) <NA>
~ Changed values: 2 / 485 (0.41%)
days_since_1949 model training_computation_petaflop - training_computation_petaflop +
27751 DeepSeek-V3 3407800064.0 3300000000.0
27828 Hunyuan-TurboS <NA> 5400000000.0
= Dataset garden/artificial_intelligence/2025-03-12/epoch_compute_intensive_countries
~ Table epoch_compute_intensive_countries (changed metadata)
- - date_accessed: '2026-01-26'
+ + date_accessed: '2026-02-27'
~ Column cumulative_count (changed metadata, changed data)
- - Refers to the location of the primary organization with which the authors of a large-scale AI systems are affiliated. The 2026 data is incomplete and was last updated 26 January 2026.
+ + Refers to the location of the primary organization with which the authors of a large-scale AI systems are affiliated. The 2026 data is incomplete and was last updated 27 February 2026.
~ Changed values: 12 / 154 (7.79%)
year country cumulative_count - cumulative_count +
2023 All large-scale AI systems 163 164
2023 United States 73 74
2024 United Kingdom 22 8
2025 United Kingdom 32 8
2025 United States 214 213
~ Column yearly_count (changed metadata, changed data)
- - Refers to the location of the primary organization with which the authors of a large-scale AI systems are affiliated. The 2026 data is incomplete and was last updated 26 January 2026.
+ + Refers to the location of the primary organization with which the authors of a large-scale AI systems are affiliated. The 2026 data is incomplete and was last updated 27 February 2026.
~ Changed values: 9 / 154 (5.84%)
year country yearly_count - yearly_count +
2022 United States 23 22
2023 All large-scale AI systems 119 121
2023 United States 46 48
2025 United Kingdom 10 0
2025 United States 60 58
= Dataset garden/artificial_intelligence/2025-03-12/epoch_compute_intensive_domain
~ Table epoch_compute_intensive_domain (changed metadata)
- - date_accessed: '2026-01-26'
+ + date_accessed: '2026-02-27'
~ Column cumulative_count (changed metadata, changed data)
- - Describes the specific area, application, or field in which a large-scale AI model is designed to operate. The 2026 data is incomplete and was last updated 26 January 2026.
+ + Describes the specific area, application, or field in which a large-scale AI model is designed to operate. The 2026 data is incomplete and was last updated 27 February 2026.
~ Changed values: 13 / 91 (14.29%)
year domain cumulative_count - cumulative_count +
2023 All large-scale AI systems 163 164
2024 All large-scale AI systems 331 332
2025 All large-scale AI systems 484 485
2025 Speech 30 27
2025 Video 64 61
~ Column yearly_count (changed metadata, changed data)
- - Describes the specific area, application, or field in which a large-scale AI model is designed to operate. The 2026 data is incomplete and was last updated 26 January 2026.
+ + Describes the specific area, application, or field in which a large-scale AI model is designed to operate. The 2026 data is incomplete and was last updated 27 February 2026.
~ Changed values: 9 / 91 (9.89%)
year domain yearly_count - yearly_count +
2022 Language 24 23
2023 All large-scale AI systems 119 121
2025 Mathematics 1 2
2025 Video 35 32
2025 Vision 51 48
= Dataset garden/artificial_intelligence/2025-03-12/epoch_regressions
~ Table epoch_regressions (changed metadata)
- - date_accessed: '2026-01-26'
+ + date_accessed: '2026-02-27'
~ Dim days_since_1949
+ + New values: 10 / 993 (1.01%)
model days_since_1949
GPT-3.5 Turbo 27191
GPT-3.5 Turbo Instruct 27298
4.2x/year between 2010–2025 27828
GPT-5.2 Codex 28110
MiniMax-M2.1 28115
- - Removed values: 5 / 993 (0.50%)
model days_since_1949
GPT-3.5 (davinci-002)\n 26994
GPT-3.5 Turbo 26996
GPT-4o (May 2024) 27526
4.2x/year between 2010–2025 27823
Qwen3-235B-A22B-Instruct (Jul 2025) 27964
~ Dim model
+ + New values: 10 / 993 (1.01%)
days_since_1949 model
27191 GPT-3.5 Turbo
27298 GPT-3.5 Turbo Instruct
27828 4.2x/year between 2010–2025
28110 GPT-5.2 Codex
28115 MiniMax-M2.1
- - Removed values: 5 / 993 (0.50%)
days_since_1949 model
26994 GPT-3.5 (davinci-002)\n
26996 GPT-3.5 Turbo
27526 GPT-4o (May 2024)
27823 4.2x/year between 2010–2025
27964 Qwen3-235B-A22B-Instruct (Jul 2025)
~ Column domain (new data, changed data)
+ + New values: 10 / 993 (1.01%)
days_since_1949 model domain
27191 GPT-3.5 Turbo Language
27298 GPT-3.5 Turbo Instruct Language
27828 4.2x/year between 2010–2025 NaN
28110 GPT-5.2 Codex Language
28115 MiniMax-M2.1 Language
- - Removed values: 5 / 993 (0.50%)
days_since_1949 model domain
26994 GPT-3.5 (davinci-002)\n Language
26996 GPT-3.5 Turbo Language
27526 GPT-4o (May 2024) Multiple domains
27823 4.2x/year between 2010–2025 NaN
27964 Qwen3-235B-A22B-Instruct (Jul 2025) Language
~ Column organization_categorization (new data, changed data)
+ + New values: 10 / 993 (1.01%)
days_since_1949 model organization_categorization
27191 GPT-3.5 Turbo Industry
27298 GPT-3.5 Turbo Instruct Industry
27828 4.2x/year between 2010–2025 NaN
28110 GPT-5.2 Codex Industry
28115 MiniMax-M2.1 Industry
- - Removed values: 5 / 993 (0.50%)
days_since_1949 model organization_categorization
26994 GPT-3.5 (davinci-002)\n Industry
26996 GPT-3.5 Turbo Industry
27526 GPT-4o (May 2024) Industry
27823 4.2x/year between 2010–2025 NaN
27964 Qwen3-235B-A22B-Instruct (Jul 2025) Industry
~ Column parameters (new data, changed data)
+ + New values: 10 / 993 (1.01%)
days_since_1949 model parameters
27191 GPT-3.5 Turbo 20000000000.0
27298 GPT-3.5 Turbo Instruct 20000000000.0
27828 4.2x/year between 2010–2025 <NA>
28110 GPT-5.2 Codex <NA>
28115 MiniMax-M2.1 229000003584.0
- - Removed values: 5 / 993 (0.50%)
days_since_1949 model parameters
26994 GPT-3.5 (davinci-002)\n <NA>
26996 GPT-3.5 Turbo 20000000000.0
27526 GPT-4o (May 2024) <NA>
27823 4.2x/year between 2010–2025 <NA>
27964 Qwen3-235B-A22B-Instruct (Jul 2025) 235000004608.0
~ Changed values: 2 / 993 (0.20%)
days_since_1949 model parameters - parameters +
22280 2.0x/year between 2010–2025 372285.1875 370840.96875
27828 2.0x/year between 2010–2025 19731761152.0 19801747456.0
~ Column publication_date (new data, changed data)
+ + New values: 10 / 993 (1.01%)
days_since_1949 model publication_date
27191 GPT-3.5 Turbo 2023-06-13
27298 GPT-3.5 Turbo Instruct 2023-09-28
27828 4.2x/year between 2010–2025 NaT
28110 GPT-5.2 Codex 2025-12-18
28115 MiniMax-M2.1 2025-12-23
- - Removed values: 5 / 993 (0.50%)
days_since_1949 model publication_date
26994 GPT-3.5 (davinci-002)\n 2022-11-28
26996 GPT-3.5 Turbo 2022-11-30
27526 GPT-4o (May 2024) 2024-05-13
27823 4.2x/year between 2010–2025 NaT
27964 Qwen3-235B-A22B-Instruct (Jul 2025) 2025-07-25
~ Column training_computation_petaflop (new data, changed data)
+ + New values: 10 / 993 (1.01%)
days_since_1949 model training_computation_petaflop
27191 GPT-3.5 Turbo <NA>
27298 GPT-3.5 Turbo Instruct <NA>
27828 4.2x/year between 2010–2025 606301504.0
28110 GPT-5.2 Codex <NA>
28115 MiniMax-M2.1 <NA>
- - Removed values: 5 / 993 (0.50%)
days_since_1949 model training_computation_petaflop
26994 GPT-3.5 (davinci-002)\n 2577999872.0
26996 GPT-3.5 Turbo <NA>
27526 GPT-4o (May 2024) <NA>
27823 4.2x/year between 2010–2025 597218112.0
27964 Qwen3-235B-A22B-Instruct (Jul 2025) 4752000000.0
~ Changed values: 4 / 993 (0.40%)
days_since_1949 model training_computation_petaflop - training_computation_petaflop +
22412 4.2x/year between 2010–2025 0.196284 0.194301
27751 DeepSeek-V3 3407800064.0 3300000000.0
27828 Hunyuan-TurboS <NA> 5400000000.0
28082 Olmo 3 <NA> 1100000000.0
~ Column training_dataset_size__gradients (changed metadata, new data, changed data)
+ + {}
- - title: Training dataset size
- - description_short: |-
- - The number of unique data points used to train the model. Each domain has a specific data point unit; for example, for vision it is images, for language it is words, and for games it is timesteps. This means systems can only be compared directly within the same domain.
- - description_key:
- - - |-
- - Training data size measures the volume of unique examples used to train an AI model during its learning phase. It represents the total number of distinct data points the model learns from, counted only once regardless of how many times they're seen during training.
- - - |-
- - To understand this concept, imagine teaching someone to identify different bird species. Each unique bird photo you show them is one piece of training data. If you show 100 different photos, your training data size is 100, even if you review those same photos multiple times.
- - - |-
- - Since datasets vary by domain, there's no universal unit for measuring size. Text models might count tokens, image models count pictures, and video models count clips. Epoch AI typically uses the smallest unit that triggers a model update during training. For language models that predict the next word, this would be individual tokens.
- - - |-
- - Training data size directly impacts model performance. Larger datasets enable deeper learning and more nuanced pattern recognition, allowing models to identify subtle distinctions and handle diverse real-world scenarios more effectively.
- - unit: unique datapoints
- - display:
- - numDecimalPlaces: 0
- - zeroDay: '1949-01-01'
- - yearIsDay: true
- - processing_level: major
- - presentation:
- - topic_tags:
- - - Artificial Intelligence
+ + New values: 10 / 993 (1.01%)
days_since_1949 model training_dataset_size__gradients
27191 GPT-3.5 Turbo NaN
27298 GPT-3.5 Turbo Instruct NaN
27828 4.2x/year between 2010–2025 NaN
28110 GPT-5.2 Codex NaN
28115 MiniMax-M2.1 NaN
- - Removed values: 5 / 993 (0.50%)
days_since_1949 model training_dataset_size__gradients
26994 GPT-3.5 (davinci-002)\n <NA>
26996 GPT-3.5 Turbo <NA>
27526 GPT-4o (May 2024) <NA>
27823 4.2x/year between 2010–2025 <NA>
27964 Qwen3-235B-A22B-Instruct (Jul 2025) 36000000638976.0
~ Changed values: 660 / 993 (66.47%)
days_since_1949 model training_dataset_size__gradients - training_dataset_size__gradients +
11893 Kohonen network 4000.0 NaN
22537 Pooling CNN (Caltech 101) 3060.0 NaN
25443 (ensemble): AWD-LSTM-DOC (fin) × 5 (WT2) 2000000.0 NaN
26980 EVA-01 7577600000.0 NaN
27057 BLIP-2 (Q-Former) 2321999872.0 NaN
~ Column training_dataset_size__total (changed metadata, new data, changed data)
- - {}
+ + title: Training dataset size
+ + description_short: |-
+ + The number of unique data points used to train the model. Each domain has a specific data point unit; for example, for vision it is images, for language it is words, and for games it is timesteps. This means systems can only be compared directly within the same domain.
+ + description_key:
+ + - |-
+ + Training data size measures the volume of unique examples used to train an AI model during its learning phase. It represents the total number of distinct data points the model learns from, counted only once regardless of how many times they're seen during training.
+ + - |-
+ + To understand this concept, imagine teaching someone to identify different bird species. Each unique bird photo you show them is one piece of training data. If you show 100 different photos, your training data size is 100, even if you review those same photos multiple times.
+ + - |-
+ + Since datasets vary by domain, there's no universal unit for measuring size. Text models might count tokens, image models count pictures, and video models count clips. Epoch AI typically uses the smallest unit that triggers a model update during training. For language models that predict the next word, this would be individual tokens.
+ + - |-
+ + Training data size directly impacts model performance. Larger datasets enable deeper learning and more nuanced pattern recognition, allowing models to identify subtle distinctions and handle diverse real-world scenarios more effectively.
+ + unit: unique datapoints
+ + display:
+ + numDecimalPlaces: 0
+ + zeroDay: '1949-01-01'
+ + yearIsDay: true
+ + processing_level: major
+ + presentation:
+ + topic_tags:
+ + - Artificial Intelligence
+ + New values: 10 / 993 (1.01%)
days_since_1949 model training_dataset_size__total
27191 GPT-3.5 Turbo <NA>
27298 GPT-3.5 Turbo Instruct <NA>
27828 4.2x/year between 2010–2025 <NA>
28110 GPT-5.2 Codex <NA>
28115 MiniMax-M2.1 <NA>
- - Removed values: 5 / 993 (0.50%)
days_since_1949 model training_dataset_size__total
26994 GPT-3.5 (davinci-002)\n NaN
26996 GPT-3.5 Turbo NaN
27526 GPT-4o (May 2024) NaN
27823 4.2x/year between 2010–2025 NaN
27964 Qwen3-235B-A22B-Instruct (Jul 2025) NaN
~ Changed values: 660 / 993 (66.47%)
days_since_1949 model training_dataset_size__total - training_dataset_size__total +
11893 Kohonen network NaN 4000.0
22537 Pooling CNN (Caltech 101) NaN 3060.0
25443 (ensemble): AWD-LSTM-DOC (fin) × 5 (WT2) NaN 2000000.0
26980 EVA-01 NaN 7577600000.0
27057 BLIP-2 (Q-Former) NaN 2321999872.0
= Dataset garden/artificial_intelligence/2025-10-10/epoch_gpus
~ Table epoch_gpus (changed metadata)
- - date_accessed: '2025-10-10'
+ + date_accessed: '2026-02-27'
~ Dim days_since_2000
+ + New values: 7 / 116 (6.03%)
hardware_name days_since_2000
NVIDIA GeForce GTX Titan Black 5162
NVIDIA GB200 9177
AMD Instinct MI350X 9294
AMD Instinct MI355X 9294
Amazon Trainium3 9467
- - Removed values: 4 / 116 (3.45%)
hardware_name days_since_2000
NVIDIA GTX Titan Black 5162
NVIDIA GB200 NVL2 (per GPU) 9177
NVIDIA B300 9358
NVIDIA Blackwell Ultra 9365
~ Dim hardware_name
+ + New values: 7 / 116 (6.03%)
days_since_2000 hardware_name
5162 NVIDIA GeForce GTX Titan Black
9177 NVIDIA GB200
9294 AMD Instinct MI350X
9294 AMD Instinct MI355X
9467 Amazon Trainium3
- - Removed values: 4 / 116 (3.45%)
days_since_2000 hardware_name
5162 NVIDIA GTX Titan Black
9177 NVIDIA GB200 NVL2 (per GPU)
9358 NVIDIA B300
9365 NVIDIA Blackwell Ultra
~ Column comp_performance_per_dollar (new data, changed data)
+ + New values: 7 / 116 (6.03%)
days_since_2000 hardware_name comp_performance_per_dollar
5162 NVIDIA GeForce GTX Titan Black 4263595166
9177 NVIDIA GB200 <NA>
9294 AMD Instinct MI350X <NA>
9294 AMD Instinct MI355X <NA>
9467 Amazon Trainium3 <NA>
- - Removed values: 4 / 116 (3.45%)
days_since_2000 hardware_name comp_performance_per_dollar
5162 NVIDIA GTX Titan Black 4263595166
9177 NVIDIA GB200 NVL2 (per GPU) <NA>
9358 NVIDIA B300 <NA>
9365 NVIDIA Blackwell Ultra <NA>
~ Changed values: 1 / 116 (0.86%)
days_since_2000 hardware_name comp_performance_per_dollar - comp_performance_per_dollar +
8298 NVIDIA H100 SXM5 80GB 1403078342 1857837012
~ Column manufacturer (new data, changed data)
+ + New values: 7 / 116 (6.03%)
days_since_2000 hardware_name manufacturer
5162 NVIDIA GeForce GTX Titan Black NVIDIA
9177 NVIDIA GB200 NVIDIA
9294 AMD Instinct MI350X AMD
9294 AMD Instinct MI355X AMD
9467 Amazon Trainium3 Amazon AWS
- - Removed values: 4 / 116 (3.45%)
days_since_2000 hardware_name manufacturer
5162 NVIDIA GTX Titan Black NVIDIA
9177 NVIDIA GB200 NVL2 (per GPU) NVIDIA
9358 NVIDIA B300 NVIDIA
9365 NVIDIA Blackwell Ultra NVIDIA
= Dataset garden/artificial_intelligence/2026-01-30/frontiermath
~ Table epoch_benchmark_data (changed metadata)
- - date_accessed: '2026-01-30'
+ + date_accessed: '2026-02-27'
~ Dim release_date
+ + New values: 9 / 83 (10.84%)
model_version release_date
Kimi K2P5 2026-01-27
Claude Opus 4 2026-02-05
Claude Opus 4, 64K 2026-02-05
Claude Sonnet 4, 16K 2026-02-17
Gemini 3.1 Pro preview 2026-02-19
~ Dim model_version
+ + New values: 9 / 83 (10.84%)
release_date model_version
2026-01-27 Kimi K2P5
2026-02-05 Claude Opus 4
2026-02-05 Claude Opus 4, 64K
2026-02-17 Claude Sonnet 4, 16K
2026-02-19 Gemini 3.1 Pro preview
~ Column mean_score (new data)
+ + New values: 9 / 83 (10.84%)
release_date model_version mean_score
2026-01-27 Kimi K2P5 27.900002
2026-02-05 Claude Opus 4 38.275864
2026-02-05 Claude Opus 4, 64K 39.655174
2026-02-17 Claude Sonnet 4, 16K 32.400002
2026-02-19 Gemini 3.1 Pro preview 36.899998
Legend: +New ~Modified -Removed =Identical
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet
Automatically updated datasets matching excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included.
Edited: 2026-02-27 14:40:29 UTC
Force-pushed e15626d to ffe1618
Updated all references to 'training_dataset_size__gradients' to 'training_dataset_size__total' across meadow, garden, and grapher steps to match the new column name in the Epoch snapshot.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
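A rename like the one described in this commit can be applied mechanically across step files; a hedged sketch (the `etl/steps/` path is illustrative and may not match the repo's actual layout):

```shell
old="training_dataset_size__gradients"
new="training_dataset_size__total"
# List every step file that still references the old column name,
# then rewrite those references in place.
grep -rl "$old" etl/steps/ | xargs sed -i "s/$old/$new/g"
# Verify nothing was missed (prints nothing on success).
grep -rn "$old" etl/steps/ || true
```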
- epoch_gpus: Updated column name from 'Memory size per board (Byte)' to 'Memory (bytes)'
- frontiermath: Fixed duplicate index issue by recognizing 'max' as a valid context size suffix

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
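The frontiermath fix suggests the loader splits a context-size suffix off the model label (e.g. 'Claude Opus 4, 64K'), and that an unrecognized suffix like 'max' previously left two distinct variants sharing one index entry. A minimal sketch of such a splitter (the function name and matching rules are assumptions, not the actual ETL code):

```python
def split_context_suffix(model_version: str):
    """Split a label like 'Claude Opus 4, 64K' into (name, context suffix).

    Treats numeric-K suffixes and the literal 'max' as context sizes;
    anything else stays part of the model name.
    """
    if "," in model_version:
        name, _, tail = model_version.rpartition(",")
        tail = tail.strip()
        if tail.lower() == "max" or (tail[:-1].isdigit() and tail.endswith("K")):
            return name.strip(), tail
    return model_version.strip(), None

print(split_context_suffix("Claude Opus 4, 64K"))  # ('Claude Opus 4', '64K')
```

Without 'max' in the accepted suffixes, a 'max'-context variant would keep its full label while the bare variant resolves to the same name, so deduplicating on (release_date, model_version) can collide.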
Force-pushed ffe1618 to a5dd7f4
veronikasamborska1994 approved these changes on Feb 27, 2026.