@lunelson lunelson commented Dec 16, 2025

🌟 What is the purpose of this PR?

Introduces a technical design for decomposing complex R&D goals into structured, executable plans using a planning agent.

A core concept here is treating LLM planning as a "compiler front-end" that produces an Intermediate Representation (IR), the PlanSpec, which can be validated, scored, and eventually compiled into executable workflows.

This PR establishes the foundational patterns for plan generation and quality evaluation. Artifacts for this phase are mostly tests and scorers.

Console output from four test runs is included in the Demo section below.

🔗 Related links

  • agent/docs/PLAN-task-decomposition.md — Full design document and implementation plan
  • agent/docs/E2E-test-results-2024-12-17.md — Latest E2E test outputs

🚫 Blocked by

None

🔍 What does this change?

Core Schema & Types

  • schemas/plan-spec.ts — Full Zod schema for PlanSpec with 4 step types:

    • research — Parallelizable information gathering
    • synthesize — Combining findings (integrative) or evaluating results (evaluative)
    • experiment — Testing hypotheses (exploratory or confirmatory with preregistration)
    • develop — Building/implementing artifacts
  • schemas/planning-fixture.ts — Types for test fixtures (PlanningFixture, ExpectedPlanCharacteristics)

  • constants.ts — 12 agent capability profiles with canHandle mappings for executor assignment
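
As a rough illustration of the four step types listed above, a PlanSpec-like shape might look like the following. This is a dependency-free sketch using plain TypeScript types rather than the Zod schema in schemas/plan-spec.ts. Every field name except preregisteredCommitments (which appears in the validator output) is an assumption made for illustration, as is the countStepTypes helper.

```typescript
// Hypothetical sketch of a PlanSpec-like IR. Field names are assumptions,
// except preregisteredCommitments, which the validator output mentions.
type StepBase = {
  id: string;
  description: string;
  dependsOn: string[]; // ids of steps that must complete first
  executor: string; // name of an agent capability profile (assumed)
};

type ResearchStep = StepBase & { type: "research" };
type SynthesizeStep = StepBase & {
  type: "synthesize";
  mode: "integrative" | "evaluative";
};
type ExperimentStep = StepBase & {
  type: "experiment";
  mode: "exploratory" | "confirmatory";
  preregisteredCommitments?: string[];
};
type DevelopStep = StepBase & { type: "develop" };

type PlanStep = ResearchStep | SynthesizeStep | ExperimentStep | DevelopStep;

type PlanSpec = {
  id: string;
  goalSummary: string;
  requirements: { id: string; description: string }[];
  hypotheses: { id: string; statement: string }[];
  steps: PlanStep[];
};

// Counts steps per type, similar in spirit to the "Step types" line in the demo output.
function countStepTypes(plan: PlanSpec): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const step of plan.steps) {
    counts[step.type] = (counts[step.type] ?? 0) + 1;
  }
  return counts;
}

// Made-up example plan: two research steps feeding one synthesis step.
const examplePlan: PlanSpec = {
  id: "example-plan",
  goalSummary: "Summarize papers and compare them",
  requirements: [{ id: "R1", description: "Produce a comparison table" }],
  hypotheses: [],
  steps: [
    { id: "S1", type: "research", description: "Find papers", dependsOn: [], executor: "researcher" },
    { id: "S2", type: "research", description: "Read papers", dependsOn: ["S1"], executor: "researcher" },
    { id: "S3", type: "synthesize", mode: "integrative", description: "Compare", dependsOn: ["S2"], executor: "writer" },
  ],
};

console.log(JSON.stringify(countStepTypes(examplePlan)));
```

Modelling steps as a discriminated union on type lets the compiler narrow each step to its own fields, which is roughly what a Zod discriminated union would also provide at runtime.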

Validation & Analysis

  • tools/plan-validator.ts — 12 structural validation checks:

    • DAG validity (no cycles, valid references)
    • Executor compatibility
    • Preregistration requirements for confirmatory experiments
    • Input/output consistency
  • tools/topology-analyzer.ts — DAG analysis utilities:

    • Entry/exit point detection
    • Critical path calculation
    • Parallel group identification
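
The DAG checks and topology utilities above could be sketched along these lines. This is not the code in tools/plan-validator.ts or tools/topology-analyzer.ts. It is a minimal illustration that uses Kahn's algorithm for cycle detection and a longest-chain depth for the critical path. The StepNode shape, the analyzeDag name, and the error messages are all assumptions.

```typescript
// Hypothetical minimal step shape: only ids and dependencies matter here.
type StepNode = { id: string; dependsOn: string[] };

type Topology = {
  valid: boolean;
  errors: string[];
  entryPoints: string[]; // steps with no dependencies
  exitPoints: string[]; // steps nothing else depends on
  criticalPathLength: number; // steps on the longest dependency chain
};

function analyzeDag(steps: StepNode[]): Topology {
  const ids = new Set(steps.map((s) => s.id));
  const errors: string[] = [];

  // Reference validity: every dependency must name an existing step.
  for (const s of steps) {
    for (const dep of s.dependsOn) {
      if (!ids.has(dep)) {
        errors.push(`Step "${s.id}" references unknown step "${dep}"`);
      }
    }
  }

  // Build in-degree counts and reverse edges, ignoring unknown references.
  const remaining = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const s of steps) {
    const known = s.dependsOn.filter((d) => ids.has(d));
    remaining.set(s.id, known.length);
    for (const d of known) {
      dependents.set(d, [...(dependents.get(d) ?? []), s.id]);
    }
  }

  // Kahn's algorithm: repeatedly remove steps whose dependencies are all done.
  const queue = steps.filter((s) => remaining.get(s.id) === 0).map((s) => s.id);
  const entryPoints = [...queue];
  const depth = new Map<string, number>();
  for (const id of queue) depth.set(id, 1);

  let visited = 0;
  while (queue.length > 0) {
    const id = queue.shift()!;
    visited += 1;
    for (const next of dependents.get(id) ?? []) {
      depth.set(next, Math.max(depth.get(next) ?? 0, depth.get(id)! + 1));
      remaining.set(next, remaining.get(next)! - 1);
      if (remaining.get(next) === 0) queue.push(next);
    }
  }

  // Any step never removed must sit on, or behind, a cycle.
  if (visited < steps.length) errors.push("Plan contains a dependency cycle");

  const exitPoints = steps.filter((s) => !dependents.has(s.id)).map((s) => s.id);
  const criticalPathLength = Math.max(0, ...Array.from(depth.values()));
  return { valid: errors.length === 0, errors, entryPoints, exitPoints, criticalPathLength };
}

// A three-step linear plan: one entry point, one exit point, critical path of 3.
const linearPlan = analyzeDag([
  { id: "S1", dependsOn: [] },
  { id: "S2", dependsOn: ["S1"] },
  { id: "S3", dependsOn: ["S2"] },
]);
console.log(linearPlan);
```

Computing depth during the same traversal keeps the analysis to a single pass over the graph. Parallel group identification is left out of this sketch.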

Scoring System

  • scorers/plan-scorers.ts — 4 deterministic scorers (no LLM, fast):

    • scorePlanStructure — DAG validity, parallelism, step type diversity
    • scorePlanCoverage — Requirement/hypothesis coverage
    • scoreExperimentRigor — Preregistration, success criteria
    • scoreUnknownsCoverage — Epistemic completeness
  • scorers/plan-llm-scorers.ts — 3 LLM-based judges:

    • goalAlignmentScorer — Does plan address the goal?
    • planGranularityScorer — Are steps appropriately sized?
    • hypothesisTestabilityScorer — Are hypotheses testable?
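
To make the idea of a deterministic scorer concrete, here is a minimal sketch of a requirement-coverage score: the fraction of requirements referenced by at least one step. It is not the implementation of scorePlanCoverage. The requirementIds field, the function name, and the choice to score an empty requirement list as 1 are assumptions for illustration.

```typescript
// Hypothetical shapes: a requirement id, and a step listing the requirements it addresses.
type RequirementRef = { id: string };
type CoveringStep = { id: string; requirementIds: string[] };

function scoreRequirementCoverage(
  requirements: RequirementRef[],
  steps: CoveringStep[],
): { score: number; uncovered: string[] } {
  // Collect every requirement id that some step claims to address.
  const covered = new Set<string>();
  for (const step of steps) {
    for (const requirementId of step.requirementIds) {
      covered.add(requirementId);
    }
  }

  const uncovered = requirements
    .filter((r) => !covered.has(r.id))
    .map((r) => r.id);

  // Assumption: a plan with no requirements is treated as fully covered.
  const score =
    requirements.length === 0
      ? 1
      : (requirements.length - uncovered.length) / requirements.length;

  return { score, uncovered };
}

const coverage = scoreRequirementCoverage(
  [{ id: "R1" }, { id: "R2" }, { id: "R3" }, { id: "R4" }],
  [
    { id: "S1", requirementIds: ["R1", "R2"] },
    { id: "S2", requirementIds: ["R2", "R4"] },
  ],
);
// Three of four requirements are covered, so score is 0.75 and uncovered is ["R3"].
console.log(coverage);
```

Because the score depends only on the plan's contents, repeated runs on the same plan give the same result, which is what makes such scorers usable as fast checks without an LLM call.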

Planning Agent

  • agents/planner-agent.ts: a generatePlan(goal, context) function that uses structured output to produce valid PlanSpec instances

Test Fixtures

4 fixtures of increasing complexity in fixtures/decomposition-prompts/:

| Fixture | Complexity | Step Types |
| --- | --- | --- |
| summarize-papers | Simple linear | research → synthesize |
| explore-and-recommend | Parallel research | research (parallel) → synthesize (evaluative) |
| hypothesis-validation | With experiments | research → experiment → synthesize |
| ct-database-goal | Full R&D cycle | All 4 types, hypotheses, experiments |

E2E Test Suite

  • workflows/planning-workflow.test.ts — Comprehensive E2E tests:
    • Runs all 4 fixtures through the full pipeline
    • Validates generated plans
    • Runs deterministic scorers
    • Optional LLM scorers via RUN_LLM_SCORERS=true
    • Generates summary report with score table

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?

This PR:

  • does not modify any publishable blocks or libraries, or modifications do not need publishing

📜 Does this require a change to the docs?

The changes in this PR:

  • are internal and do not require a docs change

🕸️ Does this require a change to the Turbo Graph?

The changes in this PR:

  • do not affect the execution graph

⚠️ Known issues

  1. ct-database-goal fixture fails validation — The LLM occasionally generates confirmatory experiments without preregisteredCommitments. This is a known prompt engineering issue that will be addressed in the revision workflow.

  2. explore-and-recommend generates unexpected content — The LLM adds hypotheses and experiments not specified in the fixture expectations. This is valid behavior (more thorough than minimum), but indicates fixture expectations may need adjustment.

🐾 Next steps

Per PLAN-task-decomposition.md Section 18:

  1. Revision workflow loop — Implement dountil loop: generate → validate → feedback → regenerate (max 3 attempts)
  2. Supervisor agent — LLM approval gate before plan finalization
  3. Prompt improvements — Strengthen preregisteredCommitments requirement
  4. Stub execution — Low priority, deferred
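
The planned revision loop (generate → validate → feedback → regenerate, max 3 attempts) could take roughly the following shape. This sketch uses a plain loop with injected functions instead of a workflow primitive. The generateWithRevision name, its signature, and the fake generator and validator are assumptions made so the example runs without an LLM.

```typescript
type ValidationResult = { valid: boolean; errors: string[] };

// Generic over the plan type so the loop does not depend on the real PlanSpec.
async function generateWithRevision<P>(
  generate: (feedback: string[]) => Promise<P>,
  validate: (plan: P) => ValidationResult,
  maxAttempts = 3,
): Promise<{ plan: P; attempts: number; result: ValidationResult }> {
  // First attempt has no feedback to work from.
  let plan = await generate([]);
  let result = validate(plan);
  let attempts = 1;

  // Regenerate with the validator's errors as feedback until valid or out of attempts.
  while (!result.valid && attempts < maxAttempts) {
    plan = await generate(result.errors);
    result = validate(plan);
    attempts += 1;
  }
  return { plan, attempts, result };
}

// Stand-in for the planner: it only adds commitments once it has received feedback.
type FakePlan = { hasCommitments: boolean };

const fakeGenerate = async (feedback: string[]): Promise<FakePlan> => ({
  hasCommitments: feedback.length > 0,
});

const fakeValidate = (plan: FakePlan): ValidationResult =>
  plan.hasCommitments
    ? { valid: true, errors: [] }
    : { valid: false, errors: ["Confirmatory experiment must have preregistered commitments"] };

generateWithRevision(fakeGenerate, fakeValidate).then((outcome) => {
  // With these stand-ins the second attempt succeeds.
  console.log(outcome.attempts, outcome.result.valid);
});
```

Returning the final validation result alongside the plan lets the caller decide what to do when all attempts are exhausted, for example failing the run or passing the plan to a supervisor.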

🛡 What tests cover this?

  • plan-validator.test.ts — 25 negative fixture tests for validation
  • plan-scorers.test.ts — 23 unit tests for deterministic scorers
  • plan-llm-scorers.test.ts — 6 tests for LLM judges
  • fixtures.test.ts — 4 fixture validation tests
  • planning-workflow.test.ts — E2E pipeline tests (3/4 passing)

❓ How to test this?

  1. Checkout the branch
  2. cd apps/hash-ai-agent
  3. Run unit tests: npx vitest run src/mastra/scorers/plan-scorers.test.ts
  4. Run E2E tests: npx vitest run src/mastra/workflows/planning-workflow.test.ts
  5. (Optional) Run with LLM scorers: RUN_LLM_SCORERS=true npx vitest run src/mastra/workflows/planning-workflow.test.ts

📹 Demo

Individual Fixture Tests

summarize-papers (4.2s) — PASS
============================================================
  FIXTURE: summarize-papers
============================================================
Goal: Summarize 3 recent papers on retrieval-augmented generation (RAG) 
           and produce a comparis...

--- Generating Plan ---
  ID: rag-paper-summary-comparison-plan
  Goal Summary: Summarize 3 recent RAG papers and create a comparison table....
  Steps: 3
  Requirements: 3
  Hypotheses: 0
  Step types: {"research":2,"synthesize":1}

--- Validation ---
  Valid: true
  Errors: 0

--- Topology Analysis ---
  Entry points: [S1]
  Exit points: [S3]
  Critical path: 3 steps
  Parallel groups: 3

--- Deterministic Scores ---
  Overall: 92.8%
  Structure: 76.7%
  Coverage: 100.0%
  Experiment Rigor: 100.0%
  Unknowns Coverage: 93.3%

--- Expected Characteristics Check ---
  All expected characteristics met

  (LLM scorers skipped — set RUN_LLM_SCORERS=true to enable)

  Duration: 4.2s
explore-and-recommend (13.9s) — PASS (with notes)
============================================================
  FIXTURE: explore-and-recommend
============================================================
Goal: Research approaches to vector database indexing and recommend 
           the best approach for our ...

--- Generating Plan ---
  ID: vector-db-indexing-research-plan
  Goal Summary: Research vector database indexing approaches and recommend the best for 10M docu...
  Steps: 11
  Requirements: 7
  Hypotheses: 2
  Step types: {"research":4,"synthesize":5,"experiment":2}

--- Validation ---
  Valid: true
  Errors: 0

--- Topology Analysis ---
  Entry points: [S1]
  Exit points: [S11]
  Critical path: 8 steps
  Parallel groups: 8

--- Deterministic Scores ---
  Overall: 92.5%
  Structure: 85.9%
  Coverage: 92.9%
  Experiment Rigor: 92.5%
  Unknowns Coverage: 100.0%

--- Expected Characteristics Check ---
  Issues:
    - Unexpected hypotheses: 2
    - Unexpected experiment steps: 2

  (LLM scorers skipped — set RUN_LLM_SCORERS=true to enable)

  Duration: 13.9s

Note: The LLM generated hypotheses and experiments that the fixture didn't expect. This is not a validation failure — the plan is valid, just more thorough than the minimum expected.

hypothesis-validation (15.4s) — PASS
============================================================
  FIXTURE: hypothesis-validation
============================================================
Goal: Test whether fine-tuning a small LLM (e.g., Llama 3 8B) on 
           domain-specific data outperfo...

--- Generating Plan ---
  ID: entity-extraction-llm-comparison-plan
  Goal Summary: Compare fine-tuned small LLM vs. few-shot large LLM for entity extraction....
  Steps: 12
  Requirements: 4
  Hypotheses: 2
  Step types: {"research":3,"synthesize":3,"experiment":3,"develop":3}

--- Validation ---
  Valid: true
  Errors: 0

--- Topology Analysis ---
  Entry points: [S1, S2, S3]
  Exit points: [S12]
  Critical path: 8 steps
  Parallel groups: 8

--- Deterministic Scores ---
  Overall: 95.3%
  Structure: 86.0%
  Coverage: 100.0%
  Experiment Rigor: 95.0%
  Unknowns Coverage: 100.0%

--- Expected Characteristics Check ---
  All expected characteristics met

  (LLM scorers skipped — set RUN_LLM_SCORERS=true to enable)

  Duration: 15.4s
ct-database-goal (15.8s) — FAIL
============================================================
  FIXTURE: ct-database-goal
============================================================
Goal: Create a backend language and database that is natively aligned 
           with category-theoretica...

--- Generating Plan ---
  ID: ct-db-backend-plan
  Goal Summary: Create a backend language and database natively aligned with category theory, su...
  Steps: 17
  Requirements: 8
  Hypotheses: 4
  Step types: {"research":4,"synthesize":8,"experiment":4,"develop":1}

--- Validation ---
  Valid: false
  Errors: 1
    [MISSING_PREREGISTERED_COMMITMENTS] Confirmatory experiment "S14" must have preregistered commitments

  Duration: 15.8s

Failure Reason: The LLM generated a confirmatory experiment (S14) without including preregisteredCommitments. This is a known issue — the prompt needs to more strongly emphasize this requirement, or a revision loop needs to catch and fix it.
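
For reference, the check that produces this failure could look something like the sketch below. The error code and message format are taken from the output above. The step shape, the checkPreregistration name, and the decision to treat an empty commitments array as missing are assumptions, not the code in tools/plan-validator.ts.

```typescript
// Hypothetical step shape, reduced to the fields this check needs.
type ExperimentLike = {
  id: string;
  type: string;
  mode?: "exploratory" | "confirmatory";
  preregisteredCommitments?: string[];
};

type ValidationError = { code: string; message: string };

function checkPreregistration(steps: ExperimentLike[]): ValidationError[] {
  return steps
    .filter(
      (s) =>
        s.type === "experiment" &&
        s.mode === "confirmatory" &&
        // Assumption: an empty array counts as missing commitments.
        (s.preregisteredCommitments ?? []).length === 0,
    )
    .map((s) => ({
      code: "MISSING_PREREGISTERED_COMMITMENTS",
      message: `Confirmatory experiment "${s.id}" must have preregistered commitments`,
    }));
}

const preregistrationErrors = checkPreregistration([
  { id: "S13", type: "experiment", mode: "exploratory" },
  { id: "S14", type: "experiment", mode: "confirmatory" },
  {
    id: "S15",
    type: "experiment",
    mode: "confirmatory",
    preregisteredCommitments: ["Primary metric fixed before running"],
  },
]);
// Only S14 is reported: it is confirmatory and has no commitments.
console.log(preregistrationErrors);
```

The error message is specific enough to be fed back to the planner as revision feedback, since it names the offending step.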


Summary Report Test

Summary Report (49.0s) — runs all fixtures sequentially
============================================================
  SUMMARY REPORT
============================================================

Total: 4 fixtures
Successful: 3
Failed: 1

Failures:
  - ct-database-goal: Validation failed: Confirmatory experiment "S14" must have preregistered commitments

Deterministic Scores:
  Fixture                     | Overall | Structure | Coverage | Rigor | Unknowns
  -------------------------------------------------------------------------------------
  summarize-papers             |     93% |       77% |     100% |  100% |      93%
  explore-and-recommend        |     92% |       86% |      93% |   93% |     100%
  hypothesis-validation        |     95% |       86% |     100% |   95% |     100%

Total duration: 49.0s

@github-actions github-actions bot added area/deps Relates to third-party dependencies (area) area/apps > hash* Affects HASH (a `hash-*` app) area/infra Relates to version control, CI, CD or IaC (area) area/libs Relates to first-party libraries/crates/packages (area) area/apps labels Dec 16, 2025

codecov bot commented Dec 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.90%. Comparing base (06cc531) to head (a597c38).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8188   +/-   ##
=======================================
  Coverage   58.90%   58.90%           
=======================================
  Files        1193     1193           
  Lines      112723   112723           
  Branches     5013     5013           
=======================================
+ Hits        66394    66396    +2     
+ Misses      45571    45569    -2     
  Partials      758      758           
| Flag | Coverage Δ |
| --- | --- |
| rust.harpc-codec | 84.70% <ø> (ø) |
| rust.hash-graph-validation | 83.45% <ø> (ø) |
| rust.hashql-hir | 89.10% <ø> (ø) |
| rust.hashql-syntax-jexpr | 94.05% <ø> (ø) |

@lunelson lunelson changed the base branch from main to ln/h-5746-sync-research-and-plans December 16, 2025 17:38
@lunelson lunelson force-pushed the ln/h-5746-sync-research-and-plans branch from fdb65e6 to 426a377 Compare December 16, 2025 17:40
@lunelson lunelson force-pushed the ln/h-5847-dynamic-workflows branch from 7bfffe8 to f1255fd Compare December 16, 2025 17:40
@vercel vercel bot temporarily deployed to Preview – petrinaut December 16, 2025 17:40 Inactive
@github-actions github-actions bot removed area/deps Relates to third-party dependencies (area) area/libs Relates to first-party libraries/crates/packages (area) labels Dec 16, 2025
@lunelson lunelson force-pushed the ln/h-5847-dynamic-workflows branch from f1255fd to cf06b67 Compare December 17, 2025 14:30
@lunelson lunelson force-pushed the ln/h-5847-dynamic-workflows branch from cf06b67 to 0e31161 Compare December 17, 2025 15:12
@lunelson lunelson force-pushed the ln/h-5847-dynamic-workflows branch from 0e31161 to 43952c1 Compare December 17, 2025 15:15
@lunelson lunelson force-pushed the ln/h-5847-dynamic-workflows branch from 43952c1 to f7aecae Compare December 17, 2025 15:27
@lunelson lunelson marked this pull request as ready for review December 17, 2025 15:28

cursor bot commented Dec 17, 2025

PR Summary

Introduces an LLM-driven R&D planning framework (PlanSpec schema, planner agent), adds deterministic/LLM scorers, validation/topology tools, fixtures with E2E tests, and minor config/script updates.

  • Planning Framework (apps/hash-ai-agent)
    • Schema: Add schemas/plan-spec.ts (PlanSpec IR with 4 step types) and schemas/planning-fixture.ts.
    • Agent: Add agents/planner-agent.ts (generatePlan() with structured output); register in src/mastra/index.ts.
    • Constants: Add executor capability profiles in src/mastra/constants.ts.
    • Scorers:
      • Deterministic: scorers/plan-scorers.ts (+ unit tests).
      • LLM-based: scorers/plan-llm-scorers.ts (+ opt-in tests via RUN_LLM_SCORERS).
    • Validation/Analysis: Integrate existing plan-validator and add topology usage in tests.
    • Fixtures & Tests: Add fixtures in fixtures/decomposition-prompts/* and suite fixtures.test.ts; includes complex CT database case.
  • Docs & Wiki
    • Add planning design and results: agent/plans/PLAN-task-decomposition.md, E2E-test-results-2024-12-17.md, prompt templates and wiki notes.
  • Tooling
    • Update package.json test/eval scripts and add baseline-browser-mapping.
    • Update markdownlint ignores and extend AGENTS.md with contextual rules.

Written by Cursor Bugbot for commit a597c38. This will update automatically on new commits. Configure here.

@vercel vercel bot temporarily deployed to Preview – petrinaut December 22, 2025 11:22 Inactive
@vercel vercel bot temporarily deployed to Preview – petrinaut December 22, 2025 14:02 Inactive
@vercel vercel bot temporarily deployed to Preview – petrinaut December 22, 2025 17:14 Inactive
@github-actions

Benchmark results

@rust/hash-graph-benches – Integrations

policy_resolution_large

Function Value Mean Flame graphs
resolve_policies_for_actor user: empty, selectivity: high, policies: 2002 $$26.4 \mathrm{ms} \pm 209 \mathrm{μs}\left({\color{gray}0.843 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: low, policies: 1 $$3.23 \mathrm{ms} \pm 12.9 \mathrm{μs}\left({\color{gray}0.562 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: medium, policies: 1001 $$11.9 \mathrm{ms} \pm 76.5 \mathrm{μs}\left({\color{gray}1.39 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: high, policies: 3314 $$41.8 \mathrm{ms} \pm 289 \mathrm{μs}\left({\color{gray}-0.652 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: low, policies: 1 $$13.8 \mathrm{ms} \pm 82.8 \mathrm{μs}\left({\color{gray}-0.002 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: medium, policies: 1526 $$23.0 \mathrm{ms} \pm 140 \mathrm{μs}\left({\color{gray}-0.438 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: high, policies: 2078 $$30.1 \mathrm{ms} \pm 195 \mathrm{μs}\left({\color{lightgreen}-28.786 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: low, policies: 1 $$3.57 \mathrm{ms} \pm 15.4 \mathrm{μs}\left({\color{lightgreen}-82.224 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: medium, policies: 1033 $$13.6 \mathrm{ms} \pm 97.9 \mathrm{μs}\left({\color{lightgreen}-51.137 \mathrm{\%}}\right) $$ Flame Graph

policy_resolution_medium

Function Value Mean Flame graphs
resolve_policies_for_actor user: empty, selectivity: high, policies: 102 $$3.61 \mathrm{ms} \pm 18.0 \mathrm{μs}\left({\color{gray}0.006 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: low, policies: 1 $$2.84 \mathrm{ms} \pm 12.8 \mathrm{μs}\left({\color{gray}0.808 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: medium, policies: 51 $$3.18 \mathrm{ms} \pm 14.8 \mathrm{μs}\left({\color{gray}-0.340 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: high, policies: 269 $$4.97 \mathrm{ms} \pm 26.4 \mathrm{μs}\left({\color{gray}0.146 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: low, policies: 1 $$3.37 \mathrm{ms} \pm 12.3 \mathrm{μs}\left({\color{gray}-0.002 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: medium, policies: 107 $$3.93 \mathrm{ms} \pm 13.9 \mathrm{μs}\left({\color{gray}0.576 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: high, policies: 133 $$4.22 \mathrm{ms} \pm 23.8 \mathrm{μs}\left({\color{gray}4.03 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: low, policies: 1 $$3.26 \mathrm{ms} \pm 15.7 \mathrm{μs}\left({\color{gray}0.812 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: medium, policies: 63 $$3.85 \mathrm{ms} \pm 22.1 \mathrm{μs}\left({\color{gray}0.661 \mathrm{\%}}\right) $$ Flame Graph

policy_resolution_none

Function Value Mean Flame graphs
resolve_policies_for_actor user: empty, selectivity: high, policies: 2 $$2.53 \mathrm{ms} \pm 12.8 \mathrm{μs}\left({\color{red}6.44 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: low, policies: 1 $$2.42 \mathrm{ms} \pm 9.62 \mathrm{μs}\left({\color{gray}4.41 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: medium, policies: 1 $$2.48 \mathrm{ms} \pm 10.0 \mathrm{μs}\left({\color{gray}3.04 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: high, policies: 8 $$2.79 \mathrm{ms} \pm 11.3 \mathrm{μs}\left({\color{red}5.74 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: low, policies: 1 $$2.62 \mathrm{ms} \pm 14.1 \mathrm{μs}\left({\color{red}5.04 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: medium, policies: 3 $$2.77 \mathrm{ms} \pm 12.4 \mathrm{μs}\left({\color{gray}3.16 \mathrm{\%}}\right) $$ Flame Graph

policy_resolution_small

Function Value Mean Flame graphs
resolve_policies_for_actor user: empty, selectivity: high, policies: 52 $$2.91 \mathrm{ms} \pm 11.9 \mathrm{μs}\left({\color{red}5.03 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: low, policies: 1 $$2.62 \mathrm{ms} \pm 13.7 \mathrm{μs}\left({\color{red}8.20 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: medium, policies: 25 $$2.79 \mathrm{ms} \pm 12.5 \mathrm{μs}\left({\color{red}7.65 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: high, policies: 94 $$3.23 \mathrm{ms} \pm 12.7 \mathrm{μs}\left({\color{gray}3.33 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: low, policies: 1 $$2.84 \mathrm{ms} \pm 11.8 \mathrm{μs}\left({\color{red}6.33 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: medium, policies: 26 $$3.05 \mathrm{ms} \pm 15.4 \mathrm{μs}\left({\color{red}5.25 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: high, policies: 66 $$3.17 \mathrm{ms} \pm 14.6 \mathrm{μs}\left({\color{red}5.69 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: low, policies: 1 $$2.79 \mathrm{ms} \pm 10.6 \mathrm{μs}\left({\color{red}6.02 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: medium, policies: 29 $$3.06 \mathrm{ms} \pm 15.7 \mathrm{μs}\left({\color{red}6.12 \mathrm{\%}}\right) $$ Flame Graph

read_scaling_complete

Function Value Mean Flame graphs
entity_by_id;one_depth 1 entities $$38.9 \mathrm{ms} \pm 130 \mathrm{μs}\left({\color{gray}2.23 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;one_depth 10 entities $$76.4 \mathrm{ms} \pm 358 \mathrm{μs}\left({\color{gray}1.87 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;one_depth 25 entities $$43.3 \mathrm{ms} \pm 149 \mathrm{μs}\left({\color{gray}-0.125 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;one_depth 5 entities $$45.7 \mathrm{ms} \pm 217 \mathrm{μs}\left({\color{gray}1.51 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;one_depth 50 entities $$54.0 \mathrm{ms} \pm 285 \mathrm{μs}\left({\color{gray}2.68 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;two_depth 1 entities $$40.4 \mathrm{ms} \pm 144 \mathrm{μs}\left({\color{gray}0.440 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;two_depth 10 entities $$418 \mathrm{ms} \pm 763 \mathrm{μs}\left({\color{gray}1.29 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;two_depth 25 entities $$94.0 \mathrm{ms} \pm 398 \mathrm{μs}\left({\color{gray}1.09 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;two_depth 5 entities $$84.0 \mathrm{ms} \pm 298 \mathrm{μs}\left({\color{gray}1.08 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;two_depth 50 entities $$278 \mathrm{ms} \pm 678 \mathrm{μs}\left({\color{gray}0.272 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;zero_depth 1 entities $$14.5 \mathrm{ms} \pm 54.1 \mathrm{μs}\left({\color{gray}-1.691 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;zero_depth 10 entities $$14.9 \mathrm{ms} \pm 82.1 \mathrm{μs}\left({\color{gray}1.38 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;zero_depth 25 entities $$15.2 \mathrm{ms} \pm 82.7 \mathrm{μs}\left({\color{gray}2.45 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;zero_depth 5 entities $$14.9 \mathrm{ms} \pm 52.9 \mathrm{μs}\left({\color{gray}2.80 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;zero_depth 50 entities $$18.0 \mathrm{ms} \pm 105 \mathrm{μs}\left({\color{gray}1.06 \mathrm{\%}}\right) $$ Flame Graph

read_scaling_linkless

Function Value Mean Flame graphs
entity_by_id 1 entities $$14.7 \mathrm{ms} \pm 66.9 \mathrm{μs}\left({\color{gray}2.53 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id 10 entities $$14.8 \mathrm{ms} \pm 80.8 \mathrm{μs}\left({\color{gray}2.26 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id 100 entities $$14.7 \mathrm{ms} \pm 80.1 \mathrm{μs}\left({\color{gray}1.96 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id 1000 entities $$15.0 \mathrm{ms} \pm 86.3 \mathrm{μs}\left({\color{gray}-1.371 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id 10000 entities $$22.2 \mathrm{ms} \pm 153 \mathrm{μs}\left({\color{gray}0.126 \mathrm{\%}}\right) $$ Flame Graph

representative_read_entity

Function Value Mean Flame graphs
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/block/v/1 $$29.0 \mathrm{ms} \pm 238 \mathrm{μs}\left({\color{gray}-1.358 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/book/v/1 $$30.6 \mathrm{ms} \pm 313 \mathrm{μs}\left({\color{gray}1.63 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/building/v/1 $$29.3 \mathrm{ms} \pm 294 \mathrm{μs}\left({\color{gray}-0.161 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/organization/v/1 $$29.4 \mathrm{ms} \pm 230 \mathrm{μs}\left({\color{gray}1.25 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/page/v/2 $$29.9 \mathrm{ms} \pm 303 \mathrm{μs}\left({\color{gray}2.60 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/person/v/1 $$29.2 \mathrm{ms} \pm 297 \mathrm{μs}\left({\color{gray}-1.022 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/playlist/v/1 $$28.7 \mathrm{ms} \pm 286 \mathrm{μs}\left({\color{gray}-3.507 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/song/v/1 $$29.5 \mathrm{ms} \pm 287 \mathrm{μs}\left({\color{gray}2.43 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/uk-address/v/1 $$29.9 \mathrm{ms} \pm 241 \mathrm{μs}\left({\color{gray}1.86 \mathrm{\%}}\right) $$ Flame Graph

representative_read_entity_type

Function Value Mean Flame graphs
get_entity_type_by_id Account ID: bf5a9ef5-dc3b-43cf-a291-6210c0321eba $$8.06 \mathrm{ms} \pm 33.1 \mathrm{μs}\left({\color{gray}0.162 \mathrm{\%}}\right) $$ Flame Graph

representative_read_multiple_entities

Function Value Mean Flame graphs
entity_by_property traversal_paths=0 0 $$45.9 \mathrm{ms} \pm 206 \mathrm{μs}\left({\color{gray}0.455 \mathrm{\%}}\right) $$
entity_by_property traversal_paths=255 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true $$93.9 \mathrm{ms} \pm 427 \mathrm{μs}\left({\color{gray}0.586 \mathrm{\%}}\right) $$
entity_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false $$51.4 \mathrm{ms} \pm 253 \mathrm{μs}\left({\color{gray}0.042 \mathrm{\%}}\right) $$
entity_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true $$59.5 \mathrm{ms} \pm 291 \mathrm{μs}\left({\color{gray}-0.240 \mathrm{\%}}\right) $$
entity_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true $$67.9 \mathrm{ms} \pm 318 \mathrm{μs}\left({\color{gray}0.312 \mathrm{\%}}\right) $$
entity_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true $$74.1 \mathrm{ms} \pm 304 \mathrm{μs}\left({\color{gray}0.352 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=0 0 $$49.5 \mathrm{ms} \pm 242 \mathrm{μs}\left({\color{gray}0.043 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=255 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true $$76.5 \mathrm{ms} \pm 365 \mathrm{μs}\left({\color{gray}0.657 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false $$56.0 \mathrm{ms} \pm 276 \mathrm{μs}\left({\color{gray}0.056 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true $$63.9 \mathrm{ms} \pm 367 \mathrm{μs}\left({\color{gray}0.452 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true $$66.2 \mathrm{ms} \pm 317 \mathrm{μs}\left({\color{gray}0.946 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true $$66.0 \mathrm{ms} \pm 331 \mathrm{μs}\left({\color{gray}0.961 \mathrm{\%}}\right) $$

scenarios

Function Value Mean Flame graphs
full_test query-limited $$136 \mathrm{ms} \pm 494 \mathrm{μs}\left({\color{gray}4.38 \mathrm{\%}}\right) $$ Flame Graph
full_test query-unlimited $$132 \mathrm{ms} \pm 470 \mathrm{μs}\left({\color{gray}-0.484 \mathrm{\%}}\right) $$ Flame Graph
linked_queries query-limited $$38.8 \mathrm{ms} \pm 161 \mathrm{μs}\left({\color{lightgreen}-62.011 \mathrm{\%}}\right) $$ Flame Graph
linked_queries query-unlimited $$582 \mathrm{ms} \pm 1.04 \mathrm{ms}\left({\color{gray}-0.191 \mathrm{\%}}\right) $$ Flame Graph

@lunelson lunelson added this pull request to the merge queue Dec 23, 2025
Merged via the queue into main with commit b6338db Dec 23, 2025
169 checks passed
@lunelson lunelson deleted the ln/h-5847-dynamic-workflows branch December 23, 2025 09:12