Merged
Changes from 1 commit
178 changes: 75 additions & 103 deletions database/schema.md
@@ -50,33 +50,22 @@ For runtime benchmarks the schema very similar, but there are different table na

A description of a rustc compiler artifact being benchmarked.

This description includes:
* name: usually a commit sha or a tag like "1.51.0" but is free-form text so can be anything.
* date: the date associated with this compiler artifact (usually only when the name is a commit)
* type: currently one of "master" (i.e., we're testing a merge commit), "try" (someone is testing a PR), and "release" (usually a release candidate - though local compilers also get labeled like this).
Columns:

```
sqlite> select * from artifact limit 1;
id name date type
---------- ---------- ---------- -------
1 LOCAL_TEST release
```
* **name** (`text`): Usually a commit sha (for example "fefce3cecd69cebf2d7c9aa3dd90a84379f4201a") or a tag like "1.51.0", but it is free-form text so it could be anything.
* **date** (`timestamptz`): The date associated with this compiler artifact (usually only set when the name is a commit)
* **type** (`text`): Currently one of "master" (i.e., we're testing a merge commit), "try" (someone is testing a PR), and "release" (usually a release candidate - though local compilers also get labeled like this).

### artifact size

Records the size of individual components (like `librustc_driver.so` or `libLLVM.so`) of a single
artifact.

This description includes:
* component: normalized name of the component (hashes are removed)
* size: size of the component in bytes
Columns:

```
sqlite> select * from artifact_size limit 1;
aid component size
---------- ---------- ----------
1 libLLVM.so 177892352
```
* **aid** (`integer`): Artifact id, references the id in the artifact table
* **component** (`text`): Normalized name of the component (hashes are removed)
* **size** (`integer`): Size of the component in bytes
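
The relationship between `artifact` and `artifact_size` can be sketched with an in-memory SQLite database. This is a minimal illustration, not the project's actual DDL: the `CREATE TABLE` statements are assumptions reconstructed from the columns listed above, the `date` column is left `NULL`, and the helper `component_sizes` is a hypothetical name. The sample rows reuse the `LOCAL_TEST` and `libLLVM.so` values from the removed example output.

```python
import sqlite3

# Assumed, simplified DDL: only the documented columns, no extra constraints.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artifact (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        date INTEGER,
        type TEXT NOT NULL
    );
    CREATE TABLE artifact_size (
        aid       INTEGER NOT NULL REFERENCES artifact(id),
        component TEXT NOT NULL,
        size      INTEGER NOT NULL
    );
    INSERT INTO artifact (id, name, date, type)
        VALUES (1, 'LOCAL_TEST', NULL, 'release');
    INSERT INTO artifact_size (aid, component, size)
        VALUES (1, 'libLLVM.so', 177892352);
    """
)


def component_sizes(conn, artifact_name):
    """Return {component: size in bytes} for the artifact with the given name."""
    rows = conn.execute(
        """
        SELECT s.component, s.size
        FROM artifact_size AS s
        JOIN artifact AS a ON a.id = s.aid
        WHERE a.name = ?
        ORDER BY s.component
        """,
        (artifact_name,),
    ).fetchall()
    return dict(rows)


sizes = component_sizes(conn, "LOCAL_TEST")
print(sizes)
```

Joining on `aid` is the same pattern used by every other table in this document that carries an artifact id.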

### collection

@@ -88,34 +77,31 @@ This is a way to collect statistics together signifying that they belong to the

Currently, the collection also marks the git sha of the currently running collector binary.

```
sqlite> select * from collection limit 1;
id perf_commit
---------- ----------------------------------------
1 d9fd96f409a15429757030f225b082744a72516c
```
Columns:

* **id** (`integer`): Unique identifier
* **perf_commit** (`text`): Commit sha / tag

### collector_progress

Keeps track of the collector's start and finish time as well as which step it's currently on.

```
sqlite> select * from collector_progress limit 1;
aid step start end
---------- ---------- ---------- ----------
1 helloworld 1625829961 1625829965
```
Columns:

* **aid** (`integer`): Artifact id, references the id in the artifact table
* **step** (`text`): The step the collector is currently benchmarking
* **start** (`timestamptz`): When the collector started
* **end** (`timestamptz`): When the collector finished

### artifact_collection_duration

Records how long benchmarking takes in seconds.

```
sqlite> select * from artifact_collection_duration limit 1;
aid date_recorded duration
---------- ------------- ----------
1 1625829965 4
```
Columns:

* **aid** (`integer`): Artifact id, references the id in the artifact table
* **date_recorded** (`timestamptz`): When this was recorded
* **duration** (`integer`): How long the benchmarking took in seconds

### benchmark

@@ -127,35 +113,28 @@ and its category. The benchmark name is used as a foreign key in many of the oth
Category is either `primary` (real-world benchmark) or `secondary` (stress test).
Stable benchmarks have `category` set to `primary` and `stabilized` set to `1`.

```
sqlite> select * from benchmark limit 1;
name stabilized category
---------- ---------- ----------
helloworld 0 primary
```
Columns:

* **name** (`text`): Name of the benchmark
* **stabilized** (`boolean`): Whether the benchmark supports stable
* **category** (`category`): `primary` if this is a 'real-world' benchmark or `secondary` if a 'stress test'
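
The rule that stable benchmarks have `category` set to `primary` and `stabilized` set to `1` can be expressed as a simple filter. The sketch below uses an in-memory SQLite table whose DDL is an assumption based on the columns above. The `helloworld` row comes from the removed example output; the rows named `example-stable` and `example-stress` are hypothetical, as is the helper `stable_benchmarks`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE benchmark (
        name       TEXT PRIMARY KEY,
        stabilized INTEGER NOT NULL,  -- 0 or 1; SQLite has no native boolean
        category   TEXT NOT NULL CHECK (category IN ('primary', 'secondary'))
    )
    """
)
conn.executemany(
    "INSERT INTO benchmark (name, stabilized, category) VALUES (?, ?, ?)",
    [
        ("helloworld", 0, "primary"),
        ("example-stable", 1, "primary"),    # hypothetical row
        ("example-stress", 0, "secondary"),  # hypothetical row
    ],
)


def stable_benchmarks(conn):
    """Return the names of benchmarks that are both primary and stabilized."""
    rows = conn.execute(
        "SELECT name FROM benchmark "
        "WHERE category = 'primary' AND stabilized = 1 "
        "ORDER BY name"
    )
    return [name for (name,) in rows]


stable = stable_benchmarks(conn)
print(stable)
```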

### pstat_series

Describes the parametrization of a compile-time benchmark. Contains a unique combination
of a crate, profile, scenario and the metric being collected.

* crate (aka `benchmark`): the benchmarked crate which might be a crate from crates.io or a crate made specifically to stress some part of the compiler.
* profile: what type of compilation is happening - check build, optimized build (a.k.a. release build), debug build, or doc build.
* scenario: describes how much of the incremental cache is full. An empty incremental cache means that the compiler must do a full build.
* backend: codegen backend used for compilation.
Columns:

* **crate** (`text`) (aka `benchmark`): The benchmarked crate which might be a crate from crates.io or a crate made specifically to stress some part of the compiler.
* **profile** (`text`): What type of compilation is happening - check build, optimized build (a.k.a. release build), debug build, or doc build.
* **scenario** (`text`): Describes how much of the incremental cache is full. An empty incremental cache means that the compiler must do a full build.
* **backend** (`text`): Codegen backend used for compilation, for example 'llvm'
* metric: the type of metric being collected.

This corresponds to a [`statistic description`](../docs/glossary.md).

There is a separate table for this collection to avoid duplicating crates, profiles, scenarios etc.
many times in the `pstat` table.

```
sqlite> select * from pstat_series limit 1;
id crate profile scenario backend target metric
---------- ---------- ---------- ---------- ------- ------------ ------------
1 helloworld check full llvm x86_64-linux-unknown-gnu task-clock:u
```
There is a separate table for this collection to avoid duplicating crates, profiles, scenarios etc. many times in the `pstat` table.

### pstat

@@ -164,12 +143,12 @@ A measured value of a compile-time metric that is unique to a `pstat_series`, `a
Each measured combination of a collection, rustc artifact, benchmarked crate, profile, scenario and a metric
has its own unique entry in this table.

```
sqlite> select * from pstat limit 1;
series aid cid value
---------- ---------- ---------- ----------
1 1 1 24.93
```
Columns:

* **series** (`integer`): References `pstat_series` id
* **aid** (`integer`): Artifact id, references the id in the artifact table
* **cid** (`integer`): Collection id, references the id in the collection table
* **value** (`double precision`): The value of the metric that has been measured, for example time
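
Because the benchmark parameters live in `pstat_series` and the measured values live in `pstat`, reading a measurement means joining the two on the series id. The sketch below shows this with an in-memory SQLite database. The DDL, including the `UNIQUE` and `PRIMARY KEY` constraints, is an assumption derived from the descriptions above rather than the real schema, and `measurements` is a hypothetical helper. The sample rows reuse the values from the removed example output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumed DDL: one row per unique parameter combination.
    CREATE TABLE pstat_series (
        id       INTEGER PRIMARY KEY,
        crate    TEXT NOT NULL,
        profile  TEXT NOT NULL,
        scenario TEXT NOT NULL,
        backend  TEXT NOT NULL,
        target   TEXT NOT NULL,
        metric   TEXT NOT NULL,
        UNIQUE (crate, profile, scenario, backend, target, metric)
    );
    -- Assumed DDL: one value per (series, artifact, collection).
    CREATE TABLE pstat (
        series INTEGER NOT NULL REFERENCES pstat_series(id),
        aid    INTEGER NOT NULL,
        cid    INTEGER NOT NULL,
        value  REAL NOT NULL,
        PRIMARY KEY (series, aid, cid)
    );
    INSERT INTO pstat_series VALUES
        (1, 'helloworld', 'check', 'full', 'llvm',
         'x86_64-linux-unknown-gnu', 'task-clock:u');
    INSERT INTO pstat VALUES (1, 1, 1, 24.93);
    """
)


def measurements(conn, aid, metric):
    """Return (crate, profile, scenario, value) rows for one artifact and metric."""
    rows = conn.execute(
        """
        SELECT s.crate, s.profile, s.scenario, p.value
        FROM pstat AS p
        JOIN pstat_series AS s ON s.id = p.series
        WHERE p.aid = ? AND s.metric = ?
        ORDER BY s.crate, s.profile, s.scenario
        """,
        (aid, metric),
    )
    return rows.fetchall()


result = measurements(conn, 1, "task-clock:u")
print(result)
```

Keeping the text columns in `pstat_series` means each `pstat` row stores only three integers and one floating-point value, which is the deduplication the separate table is meant to provide.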

### runtime_pstat_series

@@ -178,12 +157,11 @@ of a benchmark and the metric being collected.

This table exists to avoid duplicating crates, profiles, scenarios etc. many times in the `runtime_pstat` table.

```
sqlite> select * from runtime_pstat_series limit 1;
id benchmark metric
---------- --------- --------------
1 nbody-10k instructions:u
```
Columns:

* **id** (`integer`): Unique identifier
* **benchmark** (`text`): The name of the benchmark
* **metric** (`text`): The metric that was measured

### runtime_pstat

@@ -192,36 +170,37 @@ A measured value of a runtime metric that is unique to a `runtime_pstat_series`,
Each measured combination of a collection, rustc artifact, benchmark and a metric
has its own unique entry in this table.

```
sqlite> select * from runtime_pstat limit 1;
series aid cid value
---------- ---------- ---------- ----------
1 1 1 24.93
```
Columns:

* **series** (`integer`): References `runtime_pstat_series` id
* **aid** (`integer`): Artifact id, references the id in the artifact table
* **cid** (`integer`): Collection id, references the id in the collection table
* **value** (`double precision`): The value of the metric that has been measured, for example time

### rustc_compilation

Records the duration of compiling a `rustc` crate for a given artifact and collection.

```
sqlite> select * from rustc_compilation limit 1;
aid cid crate duration
--- --- ---------- --------
1 42 rustc_mir_transform 28.096
```
Columns:

* **aid** (`integer`): Artifact id, references the id in the artifact table
* **cid** (`integer`): Collection id, references the id in the collection table
* **crate** (`text`): The name of the rustc crate
* **duration** (`bigint`): How long compiling the rustc crate took

### raw_self_profile

Records that a given combination of artifact, collection, benchmark, profile and scenario
has a self profile archive available. This profile is then downloaded through an endpoint -
it is not stored in the database directly.

```
sqlite> select * from raw_self_profile limit 1;
aid cid crate profile cache
--- --- ----- ------- -----
1 42 hello-world debug full
```
Columns:

* **aid** (`integer`): Artifact id, references the id in the artifact table
* **cid** (`integer`): Collection id, references the id in the collection table
* **crate** (`text`): The name of the crate
* **profile** (`text`): What type of compilation is happening - check build, optimized build (a.k.a. release build), debug build, or doc build.
* **cache** (`text`): The scenario, i.e. the state of the incremental cache, for example `full`

### pull_request_build

@@ -230,23 +209,16 @@ Records a pull request commit that is waiting in a queue to be benchmarked.
First a merge commit is queued, then its artifacts are built by bors, and once the commit
is attached to the entry in this table, it can be benchmarked.

* bors_sha: SHA of the commit that should be benchmarked
* pr: number of the PR
* parent_sha: SHA of the parent commit, to which will the PR be compared
* complete: bool specifying whether this commit has been already benchmarked or not
* requested: when was the commit queued
* include: which benchmarks should be included (corresponds to the `--include` benchmark parameter)
* exclude: which benchmarks should be excluded (corresponds to the `--exclude` benchmark parameter)
* runs: how many iterations should be used by default for the benchmark run
* commit_date: when was the commit created
* backends: the codegen backends to use for the benchmarks (corresponds to the `--backends` benchmark parameter)

```
sqlite> select * from pull_request_build limit 1;
bors_sha pr parent_sha complete requested include exclude runs commit_date backends
---------- -- ---------- -------- --------- ------- ------- ---- ----------- --------
1w0p83... 42 fq24xq... true <timestamp> 3 <timestamp>
```
* **bors_sha** (`text`): SHA of the commit that should be benchmarked
* **pr** (`integer`): Number of the PR
* **parent_sha** (`text`): SHA of the parent commit, to which the PR will be compared
* **complete** (`boolean`): Specifies whether this commit has already been benchmarked
* **requested** (`timestamptz`): When the commit was queued
* **include** (`text`): Which benchmarks should be included (corresponds to the `--include` benchmark parameter), as a comma-separated string
* **exclude** (`text`): Which benchmarks should be excluded (corresponds to the `--exclude` benchmark parameter), as a comma-separated string
* **runs** (`integer`): How many iterations should be used by default for the benchmark run
* **commit_date** (`timestamptz`): When the commit was created
* **backends** (`text`): The codegen backends to use for the benchmarks (corresponds to the `--backends` benchmark parameter)
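
Since `include` and `exclude` hold comma-separated text, a consumer has to split them before use. The following is a minimal sketch of such parsing. The function name `parse_benchmark_filter` is hypothetical, and treating an empty or missing value as "no filter" is an assumption, not something the schema states.

```python
from typing import List, Optional


def parse_benchmark_filter(value: Optional[str]) -> List[str]:
    """Split a comma-separated column value into benchmark names.

    Assumption: NULL or an empty string means no filter was given,
    which is represented here as an empty list.
    """
    if not value:
        return []
    # Drop surrounding whitespace and skip empty segments such as "a,,b".
    return [item.strip() for item in value.split(",") if item.strip()]


print(parse_benchmark_filter("helloworld, regex"))
print(parse_benchmark_filter(None))
```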

### error
