Fonts 2025 queries #4175

IvanUkhov · 2025-07-24T10:57:11Z

Makes progress on #4073

Fonts

Resources

Structure

The queries are split by the section where they are used:

design/ is about foundries and families,
development/ is about tools and technologies, and
performance/ is about hosting and serving.

Each file name starts with one of the following prefixes indicating the primary subject of the corresponding analysis:

fonts_ is about font files,
pages_ is about HTML pages,
scripts_ is about JavaScript scripts, and
styles_ is about CSS style sheets.

The prefix is followed by the property studied, given in the singular and potentially extended by one or several suffixes narrowing down the scope, as in fonts_size_by_table.sql and pages_link_relation.sql.
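
The naming convention can be illustrated with a small sketch. It is not taken from execute.py; the function name describe and the returned keys are hypothetical, and the only facts used are the four prefixes and the section directories listed above.

```python
from pathlib import PurePosixPath

# Prefixes listed in the readme and the subject each one denotes.
SUBJECTS = {
    "fonts": "font files",
    "pages": "HTML pages",
    "scripts": "JavaScript scripts",
    "styles": "CSS style sheets",
}


def describe(path: str) -> dict:
    """Split a query path into its section, subject prefix, and property.

    Hypothetical helper: the "property" key holds the property together
    with any suffixes, since the two cannot be told apart by name alone.
    """
    parts = PurePosixPath(path)
    prefix, _, remainder = parts.stem.partition("_")
    if prefix not in SUBJECTS or not remainder:
        raise ValueError(f"unexpected query name: {path}")
    return {
        "section": parts.parent.name,
        "prefix": prefix,
        "subject": SUBJECTS[prefix],
        "property": remainder,
    }
```

For instance, describe("performance/fonts_size_by_table.sql") would report the performance section, the fonts prefix, and size_by_table as the property with its suffix.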

Content

Each query starts with a preamble indicating the section, question, and normalization type, as illustrated below:

-- Section: Performance
-- Question: What is the distribution of the file size broken down by table?
-- Normalization: Pages
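
As a rough illustration of how such a preamble could be read programmatically, here is a minimal sketch. The function parse_preamble and the lowercase keys are assumptions made for this example; the actual script may extract the fields differently.

```python
import re

# One preamble line: "-- Key: value" with one of the three documented keys.
PREAMBLE = re.compile(r"^--\s*(Section|Question|Normalization):\s*(.+?)\s*$")


def parse_preamble(sql: str) -> dict:
    """Collect the leading preamble lines of a query into a dictionary."""
    fields = {}
    for line in sql.splitlines():
        match = PREAMBLE.match(line)
        if match is None:
            break  # The preamble ends at the first line that does not match.
        fields[match.group(1).lower()] = match.group(2)
    return fields
```

The value under "question" is what the Execution section describes as the name of the sheet created for the query.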

Many queries rely on temporary functions for convenience and clarity. The functions that appear in several queries are extracted into a common file called common.sql. Whenever any of the functions defined in common.sql is used by a query, the query has the following pseudo-directive at the top:

-- INCLUDE https://github.com/HTTPArchive/almanac.httparchive.org/blob/main/sql/{year}/fonts/common.sql

The pseudo-directive has to be replaced with the content of common.sql prior to executing the query in question.
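
The replacement step might look like the following sketch. It assumes that the file named at the end of the URL is available in a local directory (root, a hypothetical parameter); how execute.py actually resolves the directive is not shown here.

```python
import re
from pathlib import Path

# Matches a whole "-- INCLUDE <url>" line and captures the trailing file name.
INCLUDE = re.compile(r"^-- INCLUDE \S*/(?P<name>[^/\s]+)[ \t]*$", re.MULTILINE)


def expand_includes(sql: str, root: Path) -> str:
    """Replace each INCLUDE pseudo-directive with the content of the named file.

    Assumption: the file (for example, common.sql) is looked up in root
    by name only; the rest of the URL is ignored.
    """
    def substitute(match: re.Match) -> str:
        return (root / match.group("name")).read_text()

    return INCLUDE.sub(substitute, sql)
```

A query without the pseudo-directive passes through unchanged, so the function can be applied to every query uniformly.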

In addition, queries generally have parameters, as in @date, so as to be able to run them for different configurations. The values for the parameters will have to be supplied upon execution.
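
Before supplying values, it can help to check which parameters a query expects. The sketch below is a naive text scan written for illustration; missing_parameters is a hypothetical helper, and it does not skip string literals or comments.

```python
import re

# A named parameter is "@" followed by an identifier. A doubled "@@" is
# skipped on the assumption that it denotes a system variable instead.
PARAMETER = re.compile(r"(?<!@)@([A-Za-z_]\w*)")


def missing_parameters(sql: str, values: dict) -> set:
    """Return the parameter names used in sql that have no supplied value."""
    return set(PARAMETER.findall(sql)) - set(values)
```

An empty result means that every parameter found in the query text, such as @date, has a value.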

All of the above is taken care of automatically if the queries are executed using execute.py, which we discuss next.

Execution

The queries can be executed using the execute.py script. The results are first saved in local CSV files sitting next to the SQL files and then uploaded to the spreadsheet. In the spreadsheet, for each query, a separate sheet is created and named after the question the query answers, which is given in its preamble. If the CSV file already exists, the corresponding query is not executed. If cell A1 is already populated, the corresponding sheet is not updated.
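
The two skip rules can be summarized in a short sketch. It assumes that the CSV file shares the name of the SQL file apart from the extension, which the text implies but does not state; both function names are hypothetical.

```python
from pathlib import Path


def should_execute(sql_path: Path) -> bool:
    """A query runs only if its CSV file does not exist yet.

    Assumption: the CSV file sits next to the SQL file under the same
    name with a .csv extension.
    """
    return not sql_path.with_suffix(".csv").exists()


def should_upload(cell_a1) -> bool:
    """A sheet is updated only if its cell A1 is still empty."""
    return cell_a1 in (None, "")
```

Under these rules, deleting a CSV file forces the query to run again, and clearing cell A1 forces the sheet to be refreshed.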

First, ensure that the Application Default Credentials authorization strategy is configured, and that the HTTP Archive project is used as the quota project:

gcloud auth application-default login \
  --scopes https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/spreadsheets
gcloud auth application-default set-quota-project httparchive

Second, install the Python prerequisites for the script:

pip install -r requirements.txt

The script can be run for all or a subset of the queries as illustrated below:

python execute.py
python execute.py design/*.sql
python execute.py development/fonts_*.sql

By default, it operates in a dry-run mode: it does not run the queries but prints an estimate of the amount of data that would be processed by each query. To actually run the queries, pass the --no-dry-run option as follows:

python execute.py --no-dry-run
python execute.py --no-dry-run design/*.sql
python execute.py --no-dry-run development/fonts_*.sql

sql/2025/fonts/execute.py

IvanUkhov · 2025-08-04T14:39:05Z

@tunetheweb, I think you were the one reviewing the queries last year. If you do not mind, I would like to invite you to review this year, too, but please feel free to assign someone else. This year, we did not change anything. We just migrated the queries to crawl and added a Python script for execution. The instructions are in the readme.

sql/2025/fonts/design/styles_family.sql

tunetheweb

LGTM with some non-blocking comments.

sql/2025/fonts/common.sql

sql/2025/fonts/design/fonts_designer.sql

sql/2025/fonts/design/fonts_family_by_script.sql

sql/2025/fonts/design/styles_family.sql

sql/2025/fonts/performance/fonts_format_file.sql

sql/2025/fonts/execute.py

sql/2025/fonts/common.sql

IvanUkhov · 2025-08-22T04:54:22Z

(The linter is failing due to the code elsewhere.)

tunetheweb · 2025-08-22T07:04:17Z

> (The linter is failing due to the code elsewhere.)

Fixing in #4196

tunetheweb · 2025-08-22T11:25:52Z

That's fixed in main now, if you can resync this branch, @IvanUkhov.

After that, are you good to merge this?

IvanUkhov · 2025-08-22T11:36:24Z

Thank you. Rebased.

Well, I have not received any feedback from the lead. I would merge, if you are OK with potential follow-up PRs.

tunetheweb · 2025-08-22T11:37:11Z

> Thank you. Rebased.
>
> Well, I have not received any feedback from the lead. I would merge, if you are OK with potential follow-up PRs.

Yeah, let's do that.

IvanUkhov force-pushed the fonts branch 2 times, most recently from 01d64b2 to 1d97236, on July 31, 2025 08:03

IvanUkhov force-pushed the fonts branch 2 times, most recently from 68e71e2 to e12330c, on August 2, 2025 03:37

IvanUkhov changed the title from "Fonts 2025" to "Fonts 2025 queries" on Aug 2, 2025

IvanUkhov force-pushed the fonts branch from 1935cd6 to db38b0f on August 3, 2025 03:41

github-advanced-security bot found potential problems Aug 4, 2025

sql/2025/fonts/execute.py (fixed)

IvanUkhov force-pushed the fonts branch from 74016ae to 7d3f0bd on August 4, 2025 07:11

github-advanced-security bot found potential problems Aug 4, 2025

sql/2025/fonts/execute.py (fixed)

IvanUkhov force-pushed the fonts branch 2 times, most recently from 38e6979 to ac584bd, on August 4, 2025 09:35

IvanUkhov marked this pull request as ready for review August 4, 2025 09:36

tunetheweb added the analysis (Querying the dataset) label Aug 18, 2025

max-ostapenko mentioned this pull request Aug 19, 2025

Change parsed_css.css from STRING to JSON HTTPArchive/httparchive.org#1095

Closed

max-ostapenko reviewed Aug 19, 2025

sql/2025/fonts/design/styles_family.sql (outdated, resolved)

tunetheweb reviewed Aug 20, 2025

max-ostapenko reviewed Aug 21, 2025

sql/2025/fonts/common.sql (outdated, resolved)

IvanUkhov added 8 commits August 22, 2025 13:29

Copy the queries from 2024 to 2025

3e101ad

Update the readme

100a649

Update the INCLUDE pseudo-directives

bd194f2

Replace all with crawl

c1bcf92

Update the dates to 2025-07-01

d44598b

Update the usage of custom_metrics

89ec842

Update the usage of summary

51975f2

Update the usage of payload in the common functions

18e5776

IvanUkhov added 25 commits August 22, 2025 13:30

Add a few comments

29a3ce8

Name sheets by the question

a7e01f3

Populate the spreadsheet

013368a

Make a cosmetic adjustment

8c3bf11

Nullify NaNs

cac0adc

Address a lint

a9bf33a

Exclude non-SQL files

fb5bcad

Add a parameter for controlling the number of workers

ffe8e6f

Use SAFE.INT64 for respBodySize

8e3286f

Take the first line of the error

f96c378

Cast file sizes to integers

a5e48f8

Downsample in design/fonts_family_by_script.sql

8874d28

Fix a typo

5fe0e2a

Add rounding in design/fonts_metric.sql

27e2ba0

Fix a typo

9d1efd6

Fix the reporting of failures

e396c87

Update the readme

ee65748

Update the usage of the Chrome UX report

3787dd6

Update the usage of parsed_css

ae31d3a

Use JSON instead of STRING in custom JavaScript functions

ee3a5f0

Make a cosmetic adjustment

dcb8044

Remove JSON_QUERY in favor of direct indexing

14fd419

Simplify SCRIPTS

1e036ff

Simplify HAS_EMOJI

e8f0482

Do no use subsampling

038a4ec

IvanUkhov force-pushed the fonts branch from fafc9f6 to 038a4ec on August 22, 2025 11:30

tunetheweb approved these changes Aug 22, 2025

tunetheweb merged commit b6bcddb into HTTPArchive:main Aug 22, 2025
4 checks passed
