Commit d448a08

[docs] Update execution instructions
1 parent b38c2a2 commit d448a08

1 file changed

+69
-2
lines changed

docs/setup.md

Lines changed: 69 additions & 2 deletions
@@ -16,10 +16,77 @@ project:
1616

1717
python -m pip install -r requirements.txt
1818

19-
2019
## Execution
2120

2221
The tokenomics decentralization analysis tool is a CLI tool.
23-
The following process describes the most typical workflow.
22+
To run the tool, execute:
23+
24+
python run.py
25+
26+
The execution is controlled and parameterized by the configuration file
27+
`config.yml` as follows.
28+
29+
`metrics` defines the metrics that should be computed in the analysis. By
30+
default all supported metrics are included here (to add support for a new metric
31+
see the [contributions
32+
page](https://blockchain-technology-lab.github.io/tokenomics-decentralization/contribute/)).
33+
34+
`ledgers` defines the ledgers that should be analyzed. By default, all supported
35+
ledgers are included here (to add support for a new ledger see the [contributions
36+
page](https://blockchain-technology-lab.github.io/tokenomics-decentralization/contribute/)).
37+
38+
`execution_flags` defines various flags that control the data handling:
39+
40+
* `force_map_addresses`: the address helper data from the directory
41+
`mapping_information` is re-computed; you should set this flag to true if the
42+
data has been updated since the last execution for the given ledger
43+
* `force_map_balances`: the balance data of the ledger's addresses is
44+
recomputed; you should set this flag to true if the data has been updated
45+
since the last execution for the given ledger
46+
* `force_analyze`: the metrics are recomputed; you should set
47+
this flag to true if any type of data has been updated since the last
48+
execution for the given ledger
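
The three `force_*` flags all follow the same idea: stored results are reused unless the flag asks for recomputation. The following is a minimal sketch of that pattern, not the tool's actual code; the helper `get_or_compute`, the in-memory `cache` dictionary and the key layout are hypothetical names chosen for illustration.

```python
def get_or_compute(cache, key, compute, force=False):
    """Return the stored value for `key`, recomputing it when `force` is
    true or when nothing has been stored yet."""
    if force or key not in cache:
        cache[key] = compute()
    return cache[key]


# Hypothetical usage: count how often the expensive computation runs.
calls = []


def compute_metric():
    calls.append(1)
    return 0.9  # placeholder value, not a real result


cache = {}
key = ("bitcoin", "2023-01-01", "gini")  # illustrative key layout
get_or_compute(cache, key, compute_metric)              # computed
get_or_compute(cache, key, compute_metric)              # reused
get_or_compute(cache, key, compute_metric, force=True)  # recomputed
print(len(calls))
```

In this sketch a forced run overwrites the stored value, which is why a flag should be set to true after the underlying data has changed.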
49+
50+
`analyze_flags` defines various analysis-related flags:
51+
52+
* `no_clustering`: a boolean that disables clustering of addresses (under the
53+
same entity, as defined in the mapping information)
54+
* `top_limit_type`: a string with one of two values (`absolute` or `percentage`) that
55+
enables applying a threshold on the addresses that will be considered
56+
* `top_limit_value`: the value of the top limit that should be applied; if 0,
57+
then no limit is used (regardless of the value of `top_limit_type`); if the
58+
type is `absolute`, then the `top_limit_value` should be an integer (e.g., if
59+
set to 100, then only the 100 wealthiest entities/addresses will be considered
60+
in the analysis); if the type is `percentage`, then the `top_limit_value` should
61+
be a decimal fraction (e.g., if set to 0.50, then only the top 50% of wealthiest
62+
entities/addresses will be considered)
63+
* `exclude_contract_addresses`: a boolean value that enables the exclusion of
64+
contract addresses from the analysis
65+
* `exclude_below_usd_cent`: a boolean value that enables the exclusion of
66+
addresses, the balance of which at the analyzed point in time was less than
67+
$0.01 (based on the historical price information in the directory
68+
`price_data`)
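
To make the interaction between `top_limit_type` and `top_limit_value` concrete, here is a minimal sketch of how such a threshold could be applied to a list of balances. It is an illustration under stated assumptions, not the tool's implementation: the function name `apply_top_limit` is hypothetical, and rounding the percentage cut-off down is an assumption.

```python
def apply_top_limit(balances, top_limit_type, top_limit_value):
    """Keep only the wealthiest entries according to the configured limit."""
    ranked = sorted(balances, reverse=True)
    if top_limit_value == 0:
        # A value of 0 disables the limit, whatever the type is.
        return ranked
    if top_limit_type == "absolute":
        # Keep the N wealthiest entries.
        return ranked[:int(top_limit_value)]
    if top_limit_type == "percentage":
        # Keep the top fraction of entries (assumption: round down).
        return ranked[:int(len(ranked) * top_limit_value)]
    raise ValueError(f"unknown top_limit_type: {top_limit_type!r}")


balances = [5, 100, 20, 1, 50, 10]
top_two = apply_top_limit(balances, "absolute", 2)
top_half = apply_top_limit(balances, "percentage", 0.50)
unlimited = apply_top_limit(balances, "percentage", 0)
print(top_two, top_half, unlimited)
```

With six balances, an absolute limit of 2 keeps the two largest, a percentage limit of 0.50 keeps the three largest, and a value of 0 keeps all six.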
69+
70+
`snapshot_dates` and `granularity` control the snapshots for which an analysis
71+
will be performed. `granularity` is a string that can be empty or one of `day`, `week`,
72+
`month`, `year`. If granularity is empty, then `snapshot_dates` define the exact
73+
time points for which an analysis will be conducted, in the form YYYY-MM-DD.
74+
Otherwise, if granularity is set, then the two farthest entries in
75+
`snapshot_dates` define the timeframe over which the analysis will be conducted,
76+
at the set granular rate. For example, if the farthest points are `2010` and
77+
`2023` and the granularity is set to `month`, then (the first day of) every
78+
month in the years 2010-2023 (inclusive) will be analyzed.
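
The expansion described above can be sketched as follows. This is a hedged illustration, not the tool's code: the names `parse_bound` and `expand_snapshots` are hypothetical, partial dates such as `2010` are assumed to cover the whole year, and `day`/`week` snapshots are assumed to be counted from the start of the timeframe.

```python
from datetime import date, timedelta


def parse_bound(text, end=False):
    """Parse YYYY, YYYY-MM or YYYY-MM-DD. Missing parts default to the
    first possible month/day, or to the last one when `end` is true."""
    parts = [int(p) for p in text.split("-")]
    year = parts[0]
    month = parts[1] if len(parts) > 1 else (12 if end else 1)
    if len(parts) > 2:
        day = parts[2]
    elif end:
        # Last day of the month: first day of the next month minus one day.
        next_month = date(year + month // 12, month % 12 + 1, 1)
        day = (next_month - timedelta(days=1)).day
    else:
        day = 1
    return date(year, month, day)


def expand_snapshots(snapshot_dates, granularity):
    """Return the list of dates to analyze for the given configuration."""
    if not granularity:
        # Empty granularity: the entries are the exact snapshot dates.
        return sorted(parse_bound(d) for d in snapshot_dates)
    start = min(parse_bound(d) for d in snapshot_dates)
    end = max(parse_bound(d, end=True) for d in snapshot_dates)
    snapshots = []
    if granularity in ("day", "week"):
        step = timedelta(days=1 if granularity == "day" else 7)
        current = start
        while current <= end:
            snapshots.append(current)
            current += step
    elif granularity == "month":
        year, month = start.year, start.month
        while date(year, month, 1) <= end:
            if date(year, month, 1) >= start:
                snapshots.append(date(year, month, 1))
            year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    elif granularity == "year":
        for year in range(start.year, end.year + 1):
            if start <= date(year, 1, 1) <= end:
                snapshots.append(date(year, 1, 1))
    else:
        raise ValueError(f"unknown granularity: {granularity!r}")
    return snapshots


monthly = expand_snapshots(["2010", "2023"], "month")
print(len(monthly), monthly[0], monthly[-1])
```

For the example in the text, the sketch yields the first day of each of the 168 months from January 2010 to December 2023.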
79+
80+
`input_directories` and `output_directories` are both lists of directories that
81+
define where data is read from and written to. `input_directories` defines the directories that
82+
contain raw address balance information, as obtained from BigQuery or a full
83+
node (for more information about this see the [data collection
84+
page](https://blockchain-technology-lab.github.io/tokenomics-decentralization/data/)).
85+
`output_directories` defines the directories to store the databases which
86+
contain the mapping information and analyzed data. The first entry in the output
87+
directories is also used to store the output files of the analysis and the
88+
plots.
2489

90+
Finally, `plot_parameters` contains various parameters that control the type and
91+
data of the plots that will be produced.
2592
...
