@burakguneli commented Jul 21, 2025

📝 Description

Working on queries for #4088.

✅ Migrated

The following queries have been migrated from 2024 to 2025 using the new crawl dataset (see the migration guide: https://har.fyi/guides/migrating-to-crawl-dataset/):

  • script_count.sql
  • unused_js_bytes.sql
  • green_third_party_requests.sql
  • cdn_adoption.sql
  • video_autoplay_values.sql
  • video_preload_values.sql
  • text_compression.sql
  • responsive_images.sql
  • favicons.sql
  • cache_header_usage.sql
  • stylesheet_count.sql
  • page_byte_pre_type.sql
  • use_of_prefers_dark_mode_usage.sql
  • green_web_hosting.sql
  • ecommerce_bytes_per_type.sql
  • ssg_bytes_per_type.sql
  • cms_bytes_per_type.sql
  • global_emissions_per_page.sql
  • query_run_size.sql

@burakguneli self-assigned this Jul 21, 2025
@burakguneli marked this pull request as draft July 21, 2025 07:54
@tunetheweb changed the title from "[WIP] Migrate 2024 sustainability queries to 2025 crawl dataset" to "[WIP] Sustainability 2025 queries" Aug 18, 2025
@tunetheweb added the "analysis" (Querying the dataset) label Aug 18, 2025
@tunetheweb changed the title from "[WIP] Sustainability 2025 queries" to "Sustainability 2025 queries" Aug 18, 2025
Comment on lines 13 to 15
FROM UNNEST(JSON_EXTRACT_ARRAY(css, '$.stylesheet.rules')) AS rule
WHERE JSON_EXTRACT_SCALAR(rule, '$.type') = 'media' AND
JSON_EXTRACT_SCALAR(rule, '$.media') = '(prefers-color-scheme:dark)'

Suggested change
- FROM UNNEST(JSON_EXTRACT_ARRAY(css, '$.stylesheet.rules')) AS rule
- WHERE JSON_EXTRACT_SCALAR(rule, '$.type') = 'media' AND
- JSON_EXTRACT_SCALAR(rule, '$.media') = '(prefers-color-scheme:dark)'
+ FROM UNNEST(JSON_EXTRACT_ARRAY(css.stylesheet.rules)) AS rule
+ WHERE STRING(rule.type) = 'media' AND
+ STRING(rule.media) = '(prefers-color-scheme:dark)'

This works because we've updated the css column to the JSON type.
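A minimal, self-contained sketch of the JSON-typed access pattern suggested above. It runs against an inline sample value rather than the real dataset; the `sample` CTE and its contents are made up for illustration and only mirror the `stylesheet.rules` shape referenced in the query.

```sql
-- Hypothetical sample row: stands in for a JSON-typed css column.
WITH sample AS (
  SELECT JSON '{"stylesheet": {"rules": [{"type": "media", "media": "(prefers-color-scheme:dark)"}, {"type": "rule"}]}}' AS css
)

SELECT
  COUNT(0) AS dark_mode_rules
FROM
  sample,
  -- Dot notation replaces the '$.stylesheet.rules' JSON path string.
  UNNEST(JSON_EXTRACT_ARRAY(css.stylesheet.rules)) AS rule
WHERE
  -- STRING() converts a JSON string value to a SQL STRING;
  -- a missing field yields NULL, so such rows are filtered out.
  STRING(rule.type) = 'media' AND
  STRING(rule.media) = '(prefers-color-scheme:dark)';
```

Only the first sample rule has both fields matching, so the second rule should be excluded by the WHERE clause.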

Comment on lines +2 to +12
CREATE TEMPORARY FUNCTION HASCONTENTVISIBILITY(css STRING)
RETURNS ARRAY<STRUCT<property STRING, freq INT64>>
LANGUAGE js
OPTIONS (library = "gs://httparchive/lib/css-utils.js")
AS '''
try {
var ast = JSON.parse(css);

let ret = {};

walkDeclarations(ast, ({property}) => {

Suggested change
- CREATE TEMPORARY FUNCTION HASCONTENTVISIBILITY(css STRING)
- RETURNS ARRAY<STRUCT<property STRING, freq INT64>>
- LANGUAGE js
- OPTIONS (library = "gs://httparchive/lib/css-utils.js")
- AS '''
- try {
- var ast = JSON.parse(css);
- let ret = {};
- walkDeclarations(ast, ({property}) => {
+ CREATE TEMPORARY FUNCTION HASCONTENTVISIBILITY(css JSON)
+ RETURNS ARRAY<STRUCT<property STRING, freq INT64>>
+ LANGUAGE js
+ OPTIONS (library = "gs://httparchive/lib/css-utils.js")
+ AS '''
+ try {
+ let ret = {};
+ walkDeclarations(css, ({property}) => {

client,
page,
tech.technology AS ecommerce,
CAST(JSON_VALUE(summary, '$.bytesTotal') AS INT64) / 1024 AS total_kb,

Suggested change
- CAST(JSON_VALUE(summary, '$.bytesTotal') AS INT64) / 1024 AS total_kb,
+ INT64(summary.bytesTotal) / 1024 AS total_kb,

Please update all similar JSON value extractions to use this approach, for readability.
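As a rough illustration of the two forms side by side, here is a self-contained sketch using an inline value instead of the real table. The `sample` CTE and the number in it are hypothetical; only the `summary.bytesTotal` field comes from the query under review.

```sql
-- Hypothetical sample row: stands in for a JSON-typed summary column.
WITH sample AS (
  SELECT JSON '{"bytesTotal": 2048}' AS summary
)

SELECT
  -- Old form: JSON path string, scalar extraction, then a cast.
  CAST(JSON_VALUE(summary, '$.bytesTotal') AS INT64) / 1024 AS total_kb_old,
  -- New form: dot notation plus the INT64() JSON conversion function.
  INT64(summary.bytesTotal) / 1024 AS total_kb_new
FROM
  sample;
```

Both columns should agree for integer-valued JSON numbers. One caveat worth checking when converting other fields: INT64() is stricter than the cast and is documented to raise an error when the JSON value is not a number, so fields stored as JSON strings may need different handling.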
