GitHub - GIS-PuppetMaster/DistJoin: DistJoin: A Decoupled Join Cardinality Estimator based on Adaptive Neural Predicate Modulation

This is the source code for the paper "DistJoin: A Decoupled Join Cardinality Estimator based on Adaptive Neural Predicate Modulation".

Env setup

Install Python 3.12
Install the required packages listed in requirements.txt: pip install -r requirements.txt
Install our sampler package, which is responsible for generating training data dynamically during training: python ./MySampler/setup.py install
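
Before installing the packages, it can help to confirm that the active interpreter matches the version named in the steps above. The following is a minimal sketch using only the standard library. The helper name is_supported_python is hypothetical and not part of this repository, and accepting any 3.12.x patch release is an assumption.

```python
import sys


def is_supported_python(version_info=sys.version_info, required=(3, 12)):
    """Return True when the interpreter's major.minor equals `required`."""
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    found = "%d.%d" % tuple(sys.version_info[:2])
    if is_supported_python():
        print("Python " + found + " matches the version in the setup steps.")
    else:
        print("Python " + found + " found; the setup steps call for 3.12.")
```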

Prepare Dataset

Put the JOB datasets into ./datasets/job. All CSV tables must have headers
To update the true cardinalities (true cards), first remove all {workload}.pkl files in ./queries, then run ./queries/ConvertMSCNTestWorkload.py, which automatically calculates the true cards and converts the test workloads to MSCN's format
Use ./queries/GetJoinWithoutPredicatesCard.py to pre-calculate the cardinality of the queries' join schemas if needed
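
Because every CSV table is expected to carry a header row, a quick check before training can catch files whose first row is actually data. The sketch below is not part of this repository: the function names are hypothetical, UTF-8 encoding is assumed, and the "no purely numeric cell" test is only a heuristic that may misjudge unusual tables.

```python
import csv
from pathlib import Path


def read_headers(dataset_dir):
    """Map each CSV file name in `dataset_dir` to its first row."""
    headers = {}
    for csv_path in sorted(Path(dataset_dir).glob("*.csv")):
        # errors="replace" keeps the check running on files with odd bytes.
        with open(csv_path, newline="", encoding="utf-8", errors="replace") as handle:
            first_row = next(csv.reader(handle), [])
        headers[csv_path.name] = first_row
    return headers


def looks_like_header(row):
    """Heuristic: a header row is non-empty and has no purely numeric cell."""
    if not row:
        return False
    return not any(
        cell.strip().lstrip("-").replace(".", "", 1).isdigit() for cell in row
    )


if __name__ == "__main__":
    # ./datasets/job is the dataset location named in the steps above.
    for name, row in read_headers("./datasets/job").items():
        status = "ok" if looks_like_header(row) else "CHECK: first row may be data"
        print(f"{name}: {status} {row}")
```

Files flagged with CHECK are worth opening by hand, since the heuristic only looks at the first row.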

Setup experiments

Edit ./Configs/IMDB/IMDB.yaml to configure the experiments, or keep the default file to reproduce the experiments in the paper

Train DistJoin

Run python train.py
Copy the exp mark (a timestamp) from the output for later testing

Test DistJoin

Run python eval-IMDB-all.py --config=IMDB --no_wandb and enter the exp mark to evaluate the workloads configured in the IMDB.yaml file. The run covers all five join conditions on that workload
Check the results in the output and in ./results/DistJoin
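
The evaluation step can also be scripted so that the exp mark does not have to be typed by hand. This is a hedged sketch, not part of the repository: build_eval_command and run_eval are hypothetical helper names, and it assumes eval-IMDB-all.py reads the exp mark from standard input, which the step above suggests but does not confirm.

```python
import subprocess
import sys


def build_eval_command(config="IMDB", use_wandb=False):
    """Build the evaluation command from the step above as an argument list."""
    command = [sys.executable, "eval-IMDB-all.py", f"--config={config}"]
    if not use_wandb:
        command.append("--no_wandb")
    return command


def run_eval(exp_mark, config="IMDB"):
    """Run the evaluation and feed the exp mark to the script.

    Assumption: the script reads the exp mark from standard input.
    """
    return subprocess.run(
        build_eval_command(config),
        input=exp_mark + "\n",  # the exp mark copied from the training output
        text=True,
        check=True,  # raise CalledProcessError if the evaluation fails
    )


# Example (run from the repository root, with your own exp mark):
#     run_eval("<exp mark from train.py output>")
```

Passing the command as a list rather than a single string avoids shell quoting issues, and check=True surfaces a failed evaluation as an exception instead of a silent non-zero exit code.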

Folders and files

Configs/IMDB
MySampler
datasets
model
queries
utils
GetPGCardEst.py
GetTableAndWorkloadSummary.py
collect_result.py
common.py
data_str2num.py
datasets.py
distributions.py
estimator.py
eval-IMDB-all.py
eval-IMDB-epochs.py
eval-IMDB.py
eval.py
eval_IMDB_PilotScope.py
experiments.py
factorized_sampler.py
gather_train_data.py
join_utils.py
masking.py
readme.md
requirements.txt
run_eval.py
run_train.py
train.py
