
Commit 0529e2e

Deployed e60e87a with MkDocs version: 1.6.1
1 parent f65b87c commit 0529e2e

File tree

7 files changed: +43 -44 lines changed


aggregator/index.html

Lines changed: 4 additions & 4 deletions

@@ -97,11 +97,11 @@
 <div class="section" itemprop="articleBody">
 
 <h1 id="aggregator">Aggregator</h1>
-<p>The aggregator obtains the mapped data of a ledger (from <code>output/&lt;project_name&gt;/mapped_data.json</code>) and aggregates it
-over units of time that are determined based on the given <code>timeframe</code> and <code>aggregate_by</code> parameters.
+<p>The aggregator obtains the mapped data of a ledger (from <code>processed_data/&lt;project_name&gt;/mapped_data_&lt;(non_)clustered&gt;.json</code>)
+and aggregates it over units of time that are determined based on the given <code>timeframe</code> and <code>aggregate_by</code> parameters.
 It then outputs a <code>csv</code> file with the distribution of blocks to entities for each time unit under consideration.
-This file is saved in the directory <code>output/&lt;project name&gt;/blocks_per_entity/</code> and is named based on the <code>timeframe</code>
-and <code>aggregate_by</code> parameters.
+This file is saved in the directory <code>processed_data/&lt;project name&gt;/blocks_per_entity/</code> and is named based on the
+<code>timeframe</code> and <code>aggregate_by</code> parameters.
 For example, if the specified timeframe is from June 2023 to September 2023 and the aggregation is by month, then
 the output file would be named <code>monthly_from_2023-06-01_to_2023-09-30.csv</code> and would be structured as follows:</p>
 <pre><code>Entity \ Time period,Jun-2023,Jul-2023,Aug-2023,Sep-2023
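For illustration, a minimal sketch of reading a blocks_per_entity CSV laid out as described above; the project name (bitcoin) and the assumption that each data row holds integer block counts per entity are illustrative, not taken from the diff.

import csv
from pathlib import Path

# Hypothetical example path, following the directory layout and naming scheme described above.
csv_path = Path("processed_data/bitcoin/blocks_per_entity/monthly_from_2023-06-01_to_2023-09-30.csv")

with csv_path.open(newline="") as f:
    reader = csv.reader(f)
    header = next(reader)              # "Entity \ Time period", then one column per time unit
    time_units = header[1:]
    for row in reader:
        entity = row[0]
        counts = [int(c) for c in row[1:]]   # assumed: integer block counts per time unit
        print(entity, dict(zip(time_units, counts)))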

index.html

Lines changed: 1 addition & 1 deletion

@@ -202,5 +202,5 @@ <h2 id="contributing">Contributing</h2>
 
 <!--
 MkDocs version : 1.6.1
-Build Date UTC : 2024-10-06 20:51:07.725364+00:00
+Build Date UTC : 2025-05-06 10:54:49.211765+00:00
 -->

mappings/index.html

Lines changed: 2 additions & 2 deletions

@@ -116,8 +116,8 @@ <h1 id="mappings">Mappings</h1>
 <p>A mapping is responsible for linking blocks to the entities that created them. While the parsed data contains
 information about the addresses that received rewards for producing some block or identifiers that are related to them,
 it does not contain information about the entities that control these addresses, which is where the mapping comes in.</p>
-<p>The mapping takes as input the parsed data and outputs a file (<code>output/&lt;project_name&gt;/mapped_data.json</code>), which is
-structured as follows:</p>
+<p>The mapping takes as input the parsed data and outputs a file (<code>processed_data/&lt;project_name&gt;/mapped_data.json</code>),
+which is structured as follows:</p>
 <pre><code>[
 {
 &quot;number&quot;: &quot;&lt;block's number&gt;&quot;,
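For illustration, a minimal sketch of loading the mapped data from its new location; the project name is hypothetical, and the only field assumed is the block number shown in the structure above.

import json
from pathlib import Path

# Hypothetical project name; the path follows the updated documentation.
mapped_path = Path("processed_data/bitcoin/mapped_data.json")

with mapped_path.open() as f:
    blocks = json.load(f)              # expected: a list with one record per block

# Each record includes at least the block's number, per the structure shown above.
print(len(blocks), "mapped blocks; first block number:", blocks[0]["number"])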

metrics/index.html

Lines changed: 3 additions & 2 deletions

@@ -130,8 +130,9 @@ <h1 id="metrics">Metrics</h1>
 or the redundancy, in a population. In practice, it is calculated as the maximum possible entropy minus the observed
 entropy. The output is a real number. Values close to 0 indicate equality and values towards infinity indicate
 inequality. Therefore, a high Theil Index suggests a population that is highly centralized.</li>
-<li><strong>Max power ratio</strong>: The max power ratio represents the share of blocks that are produced by the most "powerful"
-entity, i.e. the entity that produces the most blocks. The output of the metric is a decimal number in [0,1].</li>
+<li><strong>Concentration ratio</strong>: The n-concentration ratio represents the share of blocks that are produced by the n most
+"powerful" entities, i.e. the entities that produce the most blocks. The output of the metric is a decimal
+number in [0,1]. Values typically used are the 1-concentration ratio and the 3-concentration ratio.</li>
 <li><strong>Tau-decentralization index</strong>: The tau-decentralization index is a generalization of the Nakamoto coefficient.
 It is defined as the minimum number of entities that collectively produce more than a given threshold of the total
 blocks within a given timeframe. The threshold parameter is a decimal in [0, 1] (0.66 by default) and the output of
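For illustration, a minimal sketch of the two metrics defined in this hunk, the n-concentration ratio and the tau-decentralization index; this is not the tool's own implementation, and the entity names and block counts are made up.

# blocks_per_entity maps each block producer to the number of blocks it produced in one time unit.
blocks_per_entity = {"pool_a": 50, "pool_b": 30, "pool_c": 15, "pool_d": 5}  # made-up data

def concentration_ratio(blocks_per_entity, n):
    """Share of blocks produced by the n most powerful entities (decimal in [0, 1])."""
    counts = sorted(blocks_per_entity.values(), reverse=True)
    return sum(counts[:n]) / sum(counts)

def tau_index(blocks_per_entity, threshold=0.66):
    """Minimum number of entities that collectively produce more than `threshold` of all blocks."""
    counts = sorted(blocks_per_entity.values(), reverse=True)
    total, running = sum(counts), 0
    for i, count in enumerate(counts, start=1):
        running += count
        if running > threshold * total:
            return i
    return len(counts)

print(concentration_ratio(blocks_per_entity, 1))   # 0.5  (1-concentration ratio)
print(concentration_ratio(blocks_per_entity, 3))   # 0.95 (3-concentration ratio)
print(tau_index(blocks_per_entity))                # 2    (50 + 30 = 80 > 66% of 100 blocks)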

search/search_index.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

setup/index.html

Lines changed: 32 additions & 34 deletions

@@ -121,44 +121,42 @@ <h2 id="execution">Execution</h2>
 in the <code>raw_block_data/</code> directory, each file named as <code>&lt;project_name&gt;_raw_data.json</code> (e.g., <code>bitcoin_raw_data.json</code>).
 By default,
 there is a (very small) sample input file for some supported projects; to use it, remove the prefix <code>sample_</code>.</p>
-<p>Run <code>python run.py --ledgers &lt;ledger_1&gt; &lt;ledger_n&gt; --timeframe &lt;timeframe&gt; --estimation-window &lt;days to aggregate
-blocks by&gt; --frequency &lt;days between two data points&gt;</code> to analyze the n specified ledgers for the given timeframe,
-aggregated using the given estimation window and frequency.
-All arguments are optional, so it's possible to omit any of them; in this case, the default values
-will be used. Specifically:</p>
+<p>Run <code>python run.py</code> to run the analysis with the parameters specified in the <code>config.yaml</code> file.</p>
+<p>The parameters that can be specified in the <code>config.yaml</code> file are:</p>
 <ul>
-<li><code>ledgers</code> accepts any number of the supported ledgers (case-insensitive). For example, <code>--ledgers bitcoin</code>
-would run the analysis for Bitcoin, while <code>--ledgers Bitcoin Ethereum Cardano</code> would run the analysis for Bitcoin,
-Ethereum and Cardano. If the <code>ledgers</code> argument is omitted, then the analysis is performed for the ledgers
-specified in the <code>config.yaml</code> file, which are typically all supported ledgers.</li>
-<li>The <code>timeframe</code> argument accepts one or two values of the form <code>YYYY-MM-DD</code> (month and day can be
-omitted), which indicate the beginning and end of the time period that will be analyzed. For example,
-<code>--timeframe 2022</code> would run the analysis for the year 2022 (so from January 1st 2022 to
-December 31st 2022), while we could also get the same result using <code>--timeframe 2022-01 2022-12</code> or
-<code>--timeframe 2022-01-01 2022-12-31</code>. Similarly, <code>--timeframe 2022-02</code> or <code>--timeframe 2022-02-01 2022-02-28</code> would
-do it for the month of February 2022 (February 1st 2022 to February 28th 2022), while <code>--timeframe 2022-02-03</code>
-would do it for a single day (February 3rd 2022). Last, <code>--timeframe 2018 2022</code> would run the analysis for the
-entire time period between January 1st 2018 and December 31st 2022. If the <code>timeframe</code> argument is omitted, then
-the start and end dates of the time frame are sourced from the <code>config.yaml</code> file.</li>
-<li><code>estimation_window</code> corresponds to the number of days that will be used to aggregate the data. For example,
-<code>--estimation_window 7</code> means that every data point will use 7 days of blocks to calculate the distribution of
-blocks to entities. If left empty, then the entire time frame will be used (only valid when combined with empty frequency).</li>
-<li><code>frequency</code> determines how frequently to sample the data, in days. If left empty, then only one data point will be
-analyzed (snapshot instead of longitudinal analysis), but this is only valid when combined with an empty estimation_window.</li>
-</ul>
-<p>Additionally, there are three flags that can be used to customize an execution:</p>
-<ul>
-<li><code>--force-map</code> forces the parsing, mapping and aggregation to be performed on all data, even if the relevant output
-files already exist. This can be useful for when mapping info is updated for some blockchain. By default, this flag is
-set to False and the tool only performs the mapping and aggregation when the relevant output files do not exist.</li>
-<li><code>--plot</code> enables the generation of graphs at the end of the execution. Specifically, the output of each
+<li><code>metrics</code>: a list of the metrics that will be calculated. By default, it includes all implemented metrics.</li>
+<li><code>ledgers</code>: a list of the ledgers that will be analyzed. By default, it includes all supported ledgers.</li>
+<li><code>force-map</code>: a flag that forces the parsing, mapping and aggregation to be performed on all data, even if the
+relevant output files already exist. This can be useful when the mapping information is updated for some blockchain. By
+default, this flag is set to False and the tool only performs the mapping and aggregation when the relevant output
+files do not exist.</li>
+<li><code>clustering</code>: a flag that specifies whether block producers will be clustered based on the available mapping
+information. By default, this flag is set to True.</li>
+<li><code>start_date</code>: a value of the form <code>YYYY-MM-DD</code> (month and day can be omitted), which indicates the beginning of the
+time period that will be analyzed.</li>
+<li><code>end_date</code>: a value of the form <code>YYYY-MM-DD</code> (month and day can be omitted), which indicates the end of the time
+period that will be analyzed.</li>
+<li><code>estimation_window</code>: the number of days that will be used to aggregate the data. For example,
+<code>estimation_window: 7</code> means that every data point will use 7 days of blocks to calculate the distribution of
+blocks to entities. If left empty, then the entire time frame will be used (only valid when combined with an empty
+frequency).</li>
+<li><code>frequency</code>: the number of days that determines how frequently the data is sampled. If left empty, then only one
+data point will be analyzed (a snapshot instead of a longitudinal analysis), but this is only valid when combined with
+an empty estimation_window.</li>
+<li><code>population_windows</code>: the number of windows to look back and forward when calculating the
+population of block producers. For example, <code>population_windows: 3</code>, combined with <code>estimation_window: 7</code>, means that
+the population of block producers will be calculated using the blocks produced in the 3 weeks before and after each
+week under consideration. If <code>all</code> is specified, then the entire time frame will be used to determine the population.</li>
+<li><code>plot</code>: a flag that enables the generation of graphs at the end of the execution. Specifically, the output of each
 implemented metric is plotted for the specified ledgers and timeframe, as well as the block production dynamics for each
 specified ledger. By default, this flag is set to False and no plots are generated.</li>
-<li><code>--animated</code> enables the generation of (additional) animated graphs at the end of the execution. By default, this flag
-is set to False and no animated plots are generated. Note that this flag is ignored if <code>--plot</code> is set to False.</li>
+<li><code>animated</code>: a flag that enables the generation of (additional) animated graphs at the end of the execution. By
+default, this flag is set to False and no animated plots are generated. Note that this flag is ignored if <code>plot</code> is
+set to False.</li>
 </ul>
-<p>All output files can then be found under the <code>output/</code> directory, which is automatically created the first time the tool
-is run.</p>
+<p>All output files can then be found under the <code>results/</code> directory, which is automatically created the first time the
+tool is run. Interim files that are produced by some modules and are used by others can be found under the
+<code>processed_data/</code> directory.</p>
 
 </div>
 </div><footer>
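For illustration, a config.yaml might look roughly as follows given the parameters listed above; the exact key names, nesting and metric identifiers used by the repository may differ, and the ledgers, dates and values shown here are purely illustrative.

# Illustrative sketch only; the repository's actual config.yaml may be structured differently.
ledgers:
  - bitcoin
  - ethereum
metrics:                      # metric identifiers are assumed, not taken from the repository
  - theil_index
  - concentration_ratio
  - tau_decentralization_index
start_date: 2023-06-01
end_date: 2023-09-30
estimation_window: 7          # days of blocks behind each data point
frequency: 7                  # days between consecutive data points
population_windows: 3         # windows to look back and forward for the producer population
clustering: true
force-map: false
plot: true
animated: false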

sitemap.xml.gz

0 Bytes
Binary file not shown.
