Releases: hydro-project/hydro
copy_span v0.1.0
New Features
- introduce `sliced!` syntax for processing with anonymous ticks
Bug Fixes
- Add CHANGELOG.md
Commit Statistics
- 2 commits contributed to the release over the course of 17 calendar days.
- 2 commits were understood as conventional.
- 1 unique issue was worked on: #2256
hydro_std v0.14.0
New Features
- Aggregate client throughput/latency
Co-authored with @shadaj
- upgrade Stageleft to eliminate `__staged` compilation during development
Before Stageleft 0.9, we always compiled the `__staged` module in stage
0, which resulted in significant compilation penalties and Rust Analyzer
thrashing since any file change triggered a re-run of the `build.rs`.
With Stageleft 0.9, we can defer compiling this module to the trybuild
stage 1.
Stageleft 0.9 also cleans up how paths are rewritten to use the
`__staged` module, so we can simplify our logic as well. The only
significant rewrite remaining is when running unit tests, where we have
to regenerate `__staged` to access test-only modules, and therefore have
to rewrite all paths to use that module.
Finally, in the spirit of improving compilation efficiency, we disable
incremental builds for trybuild stage 1. We generate files with hashes
based on their contents, so we were never benefiting from incremental
compilation anyway. This reduces the disk space used significantly.
Refactor
- use `async-ssh2-russh` (instead of `libssh2` bindings), fix #1463
New Features (BREAKING)
- add stream markers for tracking non-deterministic retries
This introduces an additional type parameter to `Stream` called
`Retries`, which tracks the presence (or lack) of non-deterministic
retries in the stream. `ExactlyOnce` means that each element has
deterministic order, while `AtLeastOnce` means that there may be
non-deterministic duplicates.
A `TotalOrder, AtLeastOnce` stream describes elements with consecutive
duplication, but deterministic order if we ignore those immediate
duplicates. A `NoOrder, AtLeastOnce` stream has set semantics.
Also fixes a bug in the return type for `*_keyed_*`, where the output
type was previously `TotalOrder` but now is `NoOrder`. We stream the
results of a keyed aggregation out of a `HashMap`, so the order will
indeed be non-deterministic.
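The marker idea described above can be illustrated with a small, self-contained sketch. This is not `hydro_lang`'s actual definition: the toy `Stream` type, its `Vec` storage, and the `count` / `into_set` methods are assumptions made for illustration, and only the marker names come from the release note. The point is that an operation can be made available only for the guarantee combinations under which it is deterministic.

```rust
use std::collections::BTreeSet;
use std::marker::PhantomData;

// Hypothetical marker types; the names mirror the release note only.
pub struct TotalOrder;
pub struct NoOrder;
pub struct ExactlyOnce;
pub struct AtLeastOnce;

/// A toy stream that carries its ordering and retry guarantees as type parameters.
pub struct Stream<T, Order, Retries> {
    items: Vec<T>,
    _markers: PhantomData<(Order, Retries)>,
}

impl<T, Order, Retries> Stream<T, Order, Retries> {
    pub fn new(items: Vec<T>) -> Self {
        Stream { items, _markers: PhantomData }
    }
}

// Counting is only offered when there are no non-deterministic duplicates.
impl<T, Order> Stream<T, Order, ExactlyOnce> {
    pub fn count(&self) -> usize {
        self.items.len()
    }
}

// With no order and possible duplicates, the only stable view is a set.
impl<T: Ord> Stream<T, NoOrder, AtLeastOnce> {
    pub fn into_set(self) -> BTreeSet<T> {
        self.items.into_iter().collect()
    }
}
```

In this sketch, calling `count` on a `Stream<_, NoOrder, AtLeastOnce>` is a compile error, which is the kind of mistake the extra type parameter is meant to surface early.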
Commit Statistics
- 5 commits contributed to the release over the course of 91 calendar days.
- 4 commits were understood as conventional.
- 4 unique issues were worked on: #1803, #1900, #1907, #1910
Commit Details
- #1803
- #1900
- Aggregate client throughput/latency (17f4a83)
- #1907
- Upgrade Stageleft to eliminate `__staged` compilation during development (b333b45)
- #1910
- Add stream markers for tracking non-deterministic retries (45bd6e9)
- Uncategorized
- Release dfir_lang v0.14.0, dfir_macro v0.14.0, hydro_deploy_integration v0.14.0, lattices_macro v0.5.10, variadics_macro v0.6.1, dfir_rs v0.14.0, hydro_deploy v0.14.0, hydro_lang v0.14.0, hydro_optimize v0.13.0, hydro_std v0.14.0, safety bump 6 crates (0683595)
hydro_optimize v0.13.0
Chore
- clean up dependencies
Using `hydro_optimize` as a regular dependency in `hydro_test` results
in leaking many dependencies including `hydro_deploy`, so this moves it
to a dev-dependency
New Features
- improve logging for profiling
- Use partitioning analysis results to partition
Test/insta changes stem from changed implementation of
`broadcast_bincode`, will change again once #1949 is implemented.
Also added missing cases for Persists hidden behind CrossProduct,
Difference, AntiJoin, Join, and Scan for decoupler.
- Remove commercial ilp
- Partitioning analysis
Integrating with the partitioner next
- capture stack traces for each IR node
Because Hydro is staged, the stack traces capture the structure of the
program, which is helpful for profiling / visualization.
- add `scan` operator
- Decoupling analysis
A Gurobi license is required to run code that uses `hydro_optimize` (for ILP over decoupling decisions)
Bug Fixes
- don't snapshot-test backtraces
Backtraces aren't stable across Unix / Windows. Just have a separate
test for them.
Also defers resolution of backtraces until we actually need them to
improve performance.
Refactor
- separate externals from other location kinds, clean up network operators
First, we remove externals from `LocationId`, to ensure that a
`LocationId` is only used for locations where we can concretely place
compiled logic. This simplifies a lot of pattern matching where we wanted
to disallow externals.
Keys on network inputs / outputs (`from_key` and `to_key`) are only
relevant to external networking. We extract the logic to instantiate
external network collections, so that the core logic does not need to
deal with keys.
Refactor (BREAKING)
- invert external sources and clean up locations in IR
First, instead of creating external sources by invoking
`external.source_bincode_external(&p)`, we switch the API to
`p.source_bincode_external(&external)` for symmetry with `source_iter`
and `source_stream`.
The other, much larger change is to clean up how the IR handles external
inputs and outputs and keeps track of locations. First, we introduce
`HydroNode::ExternalInput` and `HydroLeaf::SendExternal` as specialized
nodes for these, so that we no longer create dummy sources / sinks.
Then, we eliminate places where we have multiple sources of truth for
where the output of an IR node is located, by instead referring to the
metadata. Because it is easy in optimizer rewrites to corrupt this
metadata, we also add a flag to `transform_bottom_up` that lets
developers enable a metadata validity check. We disable it in most
transformations for performance, but enable it in the decoupling
rewrites since it manipulates locations in complex ways.
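To make the opt-in validity check concrete, here is a minimal sketch of a bottom-up rewrite with a check flag. Everything in it is hypothetical: the two-variant `Node` type, the `u32` locations, the `check_metadata` parameter name, and the invariant itself (a `Map` must sit at its input's location) are stand-ins, not the real Hydro IR or the real signature of `transform_bottom_up`.

```rust
// Hypothetical toy IR; the real Hydro IR has many more node kinds and richer metadata.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Source { location: u32 },
    Map { input: Box<Node>, location: u32 },
}

impl Node {
    fn location(&self) -> u32 {
        match self {
            Node::Source { location } | Node::Map { location, .. } => *location,
        }
    }

    /// Toy invariant: a `Map` must be recorded at the same location as its input.
    fn metadata_is_valid(&self) -> bool {
        match self {
            Node::Source { .. } => true,
            Node::Map { input, location } => {
                input.location() == *location && input.metadata_is_valid()
            }
        }
    }

    /// Rewrites children first, then the node itself. The check is opt-in
    /// because it re-walks the subtree after every rewrite.
    fn transform_bottom_up(&mut self, f: &mut dyn FnMut(&mut Node), check_metadata: bool) {
        if let Node::Map { input, .. } = self {
            input.transform_bottom_up(f, check_metadata);
        }
        f(self);
        if check_metadata {
            assert!(self.metadata_is_valid(), "rewrite corrupted location metadata");
        }
    }
}
```

A rewrite that moves every node to the same new location passes the check, while one that relocates only the `Map` node would trip the assertion as soon as that node is rewritten.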
Commit Statistics
- 12 commits contributed to the release over the course of 12 calendar days.
- 11 commits were understood as conventional.
- 11 unique issues were worked on: #1859, #1930, #1934, #1935, #1937, #1940, #1947, #1952, #1955, #1958, #1962
Commit Details
- #1859
- Decoupling analysis (99a8f1d)
- #1930
- Add `scan` operator (c4b9590)
- #1934
- Clean up dependencies (edab6c2)
- #1935
- Partitioning analysis (3b013ac)
- #1937
- Capture stack traces for each IR node (d44b225)
- #1940
- Remove commercial ilp (dfaf517)
- #1947
- Don't snapshot-test backtraces (739b622)
- #1952
- Use partitioning analysis results to partition (173d9c0)
- #1955
- Improve logging for profiling (0e6403c)
- #1958
- Invert external sources and clean up locations in IR (fb016b0)
- #1962
- Separate externals from other location kinds, clean up network operators (22a7d0d)
- Uncategorized
- Release dfir_lang v0.14.0, dfir_macro v0.14.0, hydro_deploy_integration v0.14.0, lattices_macro v0.5.10, variadics_macro v0.6.1, dfir_rs v0.14.0, hydro_deploy v0.14.0, hydro_lang v0.14.0, hydro_optimize v0.13.0, hydro_std v0.14.0, safety bump 6 crates (0683595)
hydro_lang v0.14.0
Chore
- enable `viz` feature only in dev-dependencies
Reduces compilation burden when running `hydro_test` examples.
- update `proc-macro-crate`
- update pinned nightly to 2025-04-27, update span API usage
New Features
- Use partitioning analysis results to partition
Test/insta changes stem from changed implementation of
`broadcast_bincode`, will change again once #1949 is implemented.
Also added missing cases for Persists hidden behind CrossProduct,
Difference, AntiJoin, Join, and Scan for decoupler.
- make it easier to open trybuild-generated files with Rust Analyzer
To fully compile the generated sources without error, the Stageleft
environment variable needs to be passed to build scripts, so
pre-configure that in workspace settings.
- graph viz for Hydro lang
- allow running generated binaries with single-threaded Tokio runtime
Before, we had a janky architecture for establishing network connections
which relied on blocking on futures outside an async context, which
required a multi-threaded runtime. Now, we establish all connections
before launching the DFIR code, so that no blocking is required there.
- capture stack traces for each IR node
Because Hydro is staged, the stack traces capture the structure of the
program, which is helpful for profiling / visualization.
- add `scan` operator
- Decoupling analysis
A Gurobi license is required to run code that uses `hydro_optimize` (for ILP over decoupling decisions)
- add `*_idempotent` variations for (keyed) fold / reduce
We were missing variants where the stream is totally ordered with
consecutive retries, which show up when sampling singletons. This adds
those in.
- upgrade Stageleft to eliminate `__staged` compilation during development
Before Stageleft 0.9, we always compiled the `__staged` module in stage
0, which resulted in significant compilation penalties and Rust Analyzer
thrashing since any file change triggered a re-run of the `build.rs`.
With Stageleft 0.9, we can defer compiling this module to the trybuild
stage 1.
Stageleft 0.9 also cleans up how paths are rewritten to use the
`__staged` module, so we can simplify our logic as well. The only
significant rewrite remaining is when running unit tests, where we have
to regenerate `__staged` to access test-only modules, and therefore have
to rewrite all paths to use that module.
Finally, in the spirit of improving compilation efficiency, we disable
incremental builds for trybuild stage 1. We generate files with hashes
based on their contents, so we were never benefiting from incremental
compilation anyway. This reduces the disk space used significantly.
- Assign cycles IDs that are globally unique (across clusters/processes)
Co-authored with @shadaj
Bug Fixes
- emit appropriate `Persist`/`Unpersist` nodes for `scan`
- ensure that singletons and optionals always have cardinality 1 inside a tick
Otherwise, cycling a singleton can result in a memory leak, as we found
in the PBFT implementation. Now, we strictly require that inside a tick,
singletons/optionals are represented by a stream with 1 or fewer
elements.
- don't snapshot-test backtraces
Backtraces aren't stable across Unix / Windows. Just have a separate
test for them.
Also defers resolution of backtraces until we actually need them to
improve performance.
- correctly enable staged-trybuild mode when cross-compiling
`RUSTFLAGS` are not passed to build scripts, use a regular environment
variable instead. Should also dramatically improve cache hit rate for
sccache since the rustflags for non-stageleft crates are untouched.
Refactor
- separate externals from other location kinds, clean up network operators
First, we remove externals from `LocationId`, to ensure that a
`LocationId` is only used for locations where we can concretely place
compiled logic. This simplifies a lot of pattern matching where we wanted
to disallow externals.
Keys on network inputs / outputs (`from_key` and `to_key`) are only
relevant to external networking. We extract the logic to instantiate
external network collections, so that the core logic does not need to
deal with keys.
- rename `ExternalProcess` to `External`
In preparation of supporting multi-connection source / sink pairs. There
is no need to distinguish a single external from multiple at the
location level, instead we will have separate APIs for declaring a
single vs multi connection input/output.
- minimize Tokio feature flags
Now that `hydro_lang` no longer needs multi-threaded runtime, we can
eliminate it from the features used in `trybuild` compilation. Minimizes
Tokio features elsewhere too.
- migrate tests from hydro_test_local
- use `async-ssh2-russh` (instead of `libssh2` bindings), fix #1463
New Features (BREAKING)
- add stream markers for tracking non-deterministic retries
This introduces an additional type parameter to `Stream` called
`Retries`, which tracks the presence (or lack) of non-deterministic
retries in the stream. `ExactlyOnce` means that each element has
deterministic order, while `AtLeastOnce` means that there may be
non-deterministic duplicates.
A `TotalOrder, AtLeastOnce` stream describes elements with consecutive
duplication, but deterministic order if we ignore those immediate
duplicates. A `NoOrder, AtLeastOnce` stream has set semantics.
Also fixes a bug in the return type for `*_keyed_*`, where the output
type was previously `TotalOrder` but now is `NoOrder`. We stream the
results of a keyed aggregation out of a `HashMap`, so the order will
indeed be non-deterministic.
Refactor (BREAKING)
- invert external sources and clean up locations in IR
First, instead of creating external sources by invoking
`external.source_bincode_external(&p)`, we switch the API to
`p.source_bincode_external(&external)` for symmetry with `source_iter`
and `source_stream`.
The other, much larger change is to clean up how the IR handles external
inputs and outputs and keeps track of locations. First, we introduce
`HydroNode::ExternalInput` and `HydroLeaf::SendExternal` as specialized
nodes for these, so that we no longer create dummy sources / sinks.
Then, we eliminate places where we have multiple sources of truth for
where the output of an IR node is located, by instead referring to the
metadata. Because it is easy in optimizer rewrites to corrupt this
metadata, we also add a flag to `transform_bottom_up` that lets
developers enable a metadata validity check. We disable it in most
transformations for performance, but enable it in the decoupling
rewrites since it manipulates locations in complex ways.
- remove support for macro entrypoints
Commit Statistics
- 26 commits contributed to the release over the course of 92 calendar days.
- 25 commits were understood as conventional.
- 25 unique issues were worked on: #1803, #1843, #1859, #1902, #1905, #1907, #1909, #1910, #1916, #1930, #1936, #1937, #1938, #1939, #1941, #1944, #1945, #1947, #1948, #1951, #1952, #1956, #1958, #1959, #1962
Commit Details
- #1803
- #1843
- Update pinned nightly to 2025-04-27, update span API usage (98baec7)
- #1859
- Decoupling analysis (99a8f1d)
- #1902
- Assign cycles IDs that are globally unique (across clusters/processes) (863cb9e)
- #1905
- Migrate tests from hydro_test_local (eaac1f4)
- #1907
- Upgrade Stageleft to eliminate `__staged` compilation during development (b333b45)
- #1909
- Remove support for macro entrypoints (6e29285)
- #1910
- Add stream markers for tracking non-deterministic retries (45bd6e9)
- #1916
- Add `*_idempotent` variations for (keyed) fold / reduce (6b0483d)
- #1930
- Add `scan` operator (c4b9590)
- #1936
- Graph viz for Hydro lang (4d15ff1)
- #1937
- Capture stack traces for each IR node (d44b225)
- #1938
- Allow running generated binaries with single-threaded Tokio runtime (bd1afdf)
- #1939
- Minimize Tokio feature flags (59041df)
- #1941
- Correctly enable staged-trybuild mode when cross-compiling (6699197)
- #1944
- Update `proc-macro-crate` (3d40d1a)
- #1945
- Make it easier to open trybuild-generated files with Rust Analyzer (4035cae)
- #1947
- Don't snapshot-test backtraces (739b622)
- #1948
- Ensure that singletons and optionals always have cardinality 1 inside a tick (8858abd)
- #1951
- Enable `viz` feature only in dev-dependencies (a3280d9)
- #1952
- Use partitioning analysis results to partition (173d9c0)
- #1956
- Rename `ExternalProcess` to `External` (49c1918)
- #1958
- Invert external sources and clean up locations in IR (fb016b0)
- #1959
- Emit appropriate `Persist`/`Unpersist` nodes for `scan` (1fc9f0d)
- #1962
- Separate externals from other location kinds, clean up network operators (22a7d0d)
- Uncategorized
- Release dfir_lang v0.14.0, dfir_macro v0.14.0, hydro_deploy_integration v0.14.0, lattices_macro v0.5.10, variadi...
hydro_deploy v0.14.0
Documentation
- add basic `hydro_deploy`, `tracing` docs, fix #1205
Also removes extensions from links, for simplicity.
New Features
- improve logging for profiling
- allow running generated binaries with single-threaded Tokio runtime
Before, we had a janky architecture for establishing network connections
which relied on blocking on futures outside an async context, which
required a multi-threaded runtime. Now, we establish all connections
before launching the DFIR code, so that no blocking is required there.
- Decoupling analysis
A Gurobi license is required to run code that uses `hydro_optimize` (for ILP over decoupling decisions)
- upgrade Stageleft to eliminate `__staged` compilation during development
Before Stageleft 0.9, we always compiled the `__staged` module in stage
0, which resulted in significant compilation penalties and Rust Analyzer
thrashing since any file change triggered a re-run of the `build.rs`.
With Stageleft 0.9, we can defer compiling this module to the trybuild
stage 1.
Stageleft 0.9 also cleans up how paths are rewritten to use the
`__staged` module, so we can simplify our logic as well. The only
significant rewrite remaining is when running unit tests, where we have
to regenerate `__staged` to access test-only modules, and therefore have
to rewrite all paths to use that module.
Finally, in the spirit of improving compilation efficiency, we disable
incremental builds for trybuild stage 1. We generate files with hashes
based on their contents, so we were never benefiting from incremental
compilation anyway. This reduces the disk space used significantly.
- Allow VM names to be customized to ease debugging
Co-authored with @shadaj
- update how progress is displayed, fix #1415
Bug Fixes
- VM names violating GCP's regex
- use `--target-dir` instead of environment variable to improve caching
sccache includes all environment variables starting with `CARGO_` in the
cache key, so this would cause misses for all trybuild compilation.
Along with mozilla/sccache#2424, this improves
compilation caching.
- correctly enable staged-trybuild mode when cross-compiling
`RUSTFLAGS` are not passed to build scripts, use a regular environment
variable instead. Should also dramatically improve cache hit rate for
sccache since the rustflags for non-stageleft crates are untouched.
Other
- remove hydro_cli to fix build on AL2
Refactor
- Encapsulate stdout/stderr handling in new `PriorityBroadcast` type, fix #1357
- use `blake3` hash instead of random for build `unique_id`, fix #1337
- use `async-ssh2-russh` (instead of `libssh2` bindings), fix #1463
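The `unique_id` refactor above swaps a random identifier for one derived from a hash. A minimal sketch of the general idea follows, under loud assumptions: the `BuildParams` fields are hypothetical (the release note does not say which inputs are hashed), and the standard library's `DefaultHasher` is used as a dependency-free stand-in for `blake3`. `DefaultHasher` is not a cryptographic hash and its output is not guaranteed stable across Rust releases, so it would be a poor choice for a real persistent cache key.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical build inputs; the real set of hashed inputs is not stated in the notes.
#[derive(Hash)]
struct BuildParams<'a> {
    src: &'a str,
    features: &'a [&'a str],
    target: &'a str,
}

/// Derives an identifier from the build inputs instead of a random value,
/// so identical inputs map to the same id and can share cached artifacts.
fn unique_id(params: &BuildParams<'_>) -> String {
    // Stand-in for blake3: DefaultHasher::new() starts from fixed keys,
    // so the result is repeatable within one compiled program.
    let mut hasher = DefaultHasher::new();
    params.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}
```

The design trade-off is the usual one for content addressing: a random id never collides with a stale artifact but also never hits a cache, while a content-derived id makes rebuilds of unchanged inputs reusable.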
New Features (BREAKING)
- re-add loop lifetimes for anti_join_multiset, tests, remove MonotonicMap, fix #1830, fix #1823
Redo of #1835
Also updates path of trybuild errors to allow them to be clicked in the
IDE
Previous commit:
Also implements loop lifetimes for `difference_multiset` which uses the
`anti_join_multiset` codegen.
Updates tests for `difference`, `difference_multiset`, `anti_join`, and
`anti_join_multiset`
Refactor (BREAKING)
- use direct `&dyn Any` upcasting for Rust 1.86, update pyo3, fix #1821
Commit Statistics
- 17 commits contributed to the release over the course of 91 calendar days.
- 16 commits were understood as conventional.
- 16 unique issues were worked on: #1803, #1825, #1844, #1845, #1849, #1856, #1859, #1901, #1907, #1911, #1918, #1938, #1941, #1943, #1955, #1961
Commit Details
- #1803
- #1825
- #1844
- #1845
- #1849
- #1856
- #1859
- Decoupling analysis (99a8f1d)
- #1901
- Allow VM names to be customized to ease debugging (8705f97)
- #1907
- Upgrade Stageleft to eliminate `__staged` compilation during development (b333b45)
- #1911
- #1918
- Remove hydro_cli to fix build on AL2 (555b83e)
- #1938
- Allow running generated binaries with single-threaded Tokio runtime (bd1afdf)
- #1941
- Correctly enable staged-trybuild mode when cross-compiling (6699197)
- #1943
- Use `--target-dir` instead of environment variable to improve caching (bea805d)
- #1955
- Improve logging for profiling (0e6403c)
- #1961
- VM names violating GCP's regex (3a3ce7f)
- Uncategorized
- Release dfir_lang v0.14.0, dfir_macro v0.14.0, hydro_deploy_integration v0.14.0, lattices_macro v0.5.10, variadics_macro v0.6.1, dfir_rs v0.14.0, hydro_deploy v0.14.0, hydro_lang v0.14.0, hydro_optimize v0.13.0, hydro_std v0.14.0, safety bump 6 crates (0683595)
example_test v0.0.0
Refactor
Test
- test some hydro examples on localhost, fix #1374
Commit Statistics
- 2 commits contributed to the release over the course of 69 calendar days.
- 2 commits were understood as conventional.
- 2 unique issues were worked on: #1847, #1848
dfir_rs v0.14.0
Documentation
- add note to install nodejs
New Features
- allow running generated binaries with single-threaded Tokio runtime
Before, we had a janky architecture for establishing network connections
which relied on blocking on futures outside an async context, which
required a multi-threaded runtime. Now, we establish all connections
before launching the DFIR code, so that no blocking is required there.
- add `scan` operator
- Decoupling analysis
A Gurobi license is required to run code that uses `hydro_optimize` (for ILP over decoupling decisions)
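For readers unfamiliar with the scan pattern named in the list above, the sketch below shows it using the Rust standard library's `Iterator::scan`, not DFIR syntax. It assumes the new operator follows the usual meaning of scan (thread an accumulator through the input and emit one output per element); the release notes do not spell out its exact signature, and `running_totals` is a hypothetical helper.

```rust
/// Emits the running sum after each input element.
/// Unlike a fold, which yields only the final accumulator,
/// a scan yields every intermediate accumulator value.
fn running_totals(input: &[i64]) -> Vec<i64> {
    input
        .iter()
        .scan(0i64, |acc, &x| {
            *acc += x; // update the carried state
            Some(*acc) // emit the state for this element
        })
        .collect()
}
```

For example, `running_totals(&[1, 2, 3])` produces `[1, 3, 6]`, whereas folding the same input with addition would produce only `6`.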
Bug Fixes
- Revert anti join allocation
Added unit test for Paxos compilation and non-negative throughput
- add type arguments to `anti_join_multiset`, `difference_multiset` to mitigate #1857
- workaround to publish `example_test`
Refactor
- minimize Tokio feature flags
Now that `hydro_lang` no longer needs multi-threaded runtime, we can
eliminate it from the features used in `trybuild` compilation. Minimizes
Tokio features elsewhere too.
- move example testing code into separate crate
To prep for testing of hydro_deploy #1374 #1810
Test
- test some hydro examples on localhost, fix #1374
Chore (BREAKING)
- move datalog from repo, remove datalog playground from web, #1809
#1809 moved to https://github.com/hydro-project/dfir-datalog
tests moved in hydro-project/dfir-datalog#1
Removes dedalus examples in `hydro_cli_examples`
Changes pinned nightly Rust version from 2024-04-05 to 2024-04-04 as the
former did not have Intel Mac support.
New Features (BREAKING)
- re-add loop lifetimes for anti_join_multiset, tests, remove MonotonicMap, fix #1830, fix #1823
Redo of #1835
Also updates path of trybuild errors to allow them to be clicked in the
IDE
Previous commit:
Also implements loop lifetimes for `difference_multiset` which uses the
`anti_join_multiset` codegen.
Updates tests for `difference`, `difference_multiset`, `anti_join`, and
`anti_join_multiset`
- display loops in graph visualizations, refactor, fix #1699
Adds loops to display, new `GraphWrite.no_loops` option.
Refactors how the hierarchy of `GraphWrite` items is handled to be
simpler.
Refactor (BREAKING)
- use direct `&dyn Any` upcasting for Rust 1.86, update pyo3, fix #1821
Commit Statistics
- 15 commits contributed to the release over the course of 93 calendar days.
- 14 commits were understood as conventional.
- 13 unique issues were worked on: #1825, #1837, #1847, #1848, #1851, #1858, #1859, #1860, #1911, #1912, #1929, #1938, #1939
Commit Details
- #1825
- #1837
- #1847
- Move example testing code into separate crate (cb54ace)
- #1848
- #1851
- #1858
- #1859
- Decoupling analysis (99a8f1d)
- #1860
- Revert anti join allocation (5b5bbe5)
- #1911
- #1912
- Add note to install nodejs (ec1d8a0)
- #1929
- Add `scan` operator (b58dfc8)
- #1938
- Allow running generated binaries with single-threaded Tokio runtime (bd1afdf)
- #1939
- Minimize Tokio feature flags (59041df)
- Uncategorized
- Workaround to publish `example_test` (96ec97a)
- Release dfir_lang v0.14.0, dfir_macro v0.14.0, hydro_deploy_integration v0.14.0, lattices_macro v0.5.10, variadics_macro v0.6.1, dfir_rs v0.14.0, hydro_deploy v0.14.0, hydro_lang v0.14.0, hydro_optimize v0.13.0, hydro_std v0.14.0, safety bump 6 crates (0683595)
variadics_macro v0.6.1
Chore
- update `proc-macro-crate`
Commit Statistics
- 1 commit contributed to the release over the course of 5 calendar days.
- 1 commit was understood as conventional.
- 1 unique issue was worked on: #1944
lattices_macro v0.5.10
Chore
- update `proc-macro-crate`
Commit Statistics
- 1 commit contributed to the release over the course of 5 calendar days.
- 1 commit was understood as conventional.
- 1 unique issue was worked on: #1944
hydro_deploy_integration v0.14.0
New Features
- allow running generated binaries with single-threaded Tokio runtime
Before, we had a janky architecture for establishing network connections
which relied on blocking on futures outside an async context, which
required a multi-threaded runtime. Now, we establish all connections
before launching the DFIR code, so that no blocking is required there.
Bug Fixes
- leftover logging when setting up Unix sockets
Oops!
Refactor
- minimize Tokio feature flags
Now that `hydro_lang` no longer needs multi-threaded runtime, we can
eliminate it from the features used in `trybuild` compilation. Minimizes
Tokio features elsewhere too.
- eliminate `pin-project` proc macro dependency
This was the only use of the proc-macro version along the Hydro
dependencies, we can just use the declarative macro version.
Commit Statistics
- 4 commits contributed to the release over the course of 8 calendar days.
- 4 commits were understood as conventional.
- 4 unique issues were worked on: #1933, #1938, #1939, #1963