This repository was archived by the owner on May 4, 2021. It is now read-only.
docs/source/implementation.rst: 89 additions & 1 deletion
@@ -95,4 +95,92 @@ git or python versions or we find a way to make `setuptools_scm` to detect
the same version at buildtime and runtime.

See `<https://github.com/MartinThoma/MartinThoma.github.io/blob/1235fcdecda4d71b42fc07bfe7db327a27e7bcde/content/2018-11-13-python-package-versions.md>`_
for a comparison of other Python versioning packages.


Changing Bandwidth file monitoring KeyValues
--------------------------------------------

In version 1.1.0 we added the KeyValues called ``recent_X_count`` and
``relay_X_count``, which required modifying several parts of the code.

We only stored numbers for simplicity, but the values of these numbers
accumulate over time and there is no way to know by how much to decrease
them, since some of the main objects are not recreated at runtime and do not
have attributes recording when they were created or updated.
The relations between the objects do not follow the usual one-to-many or
many-to-many relationships either, so the numbers cannot be derived from the
related objects.

The only way we could think of to solve this is to store lists of timestamps,
instead of just numbers, as an attribute in the objects that need to keep
some count.
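
As a minimal sketch of this idea (not the actual sbws code), an object can keep
a list of timestamps and derive the count from the entries that are still
inside the recent window. The class name ``RecentTimestamps`` and its methods
are hypothetical; the 5-day window is the period mentioned in this section.

```python
from datetime import datetime, timedelta


class RecentTimestamps:
    """Hypothetical helper: keeps timestamps instead of a bare counter."""

    def __init__(self, max_age=timedelta(days=5)):
        self.max_age = max_age
        self.timestamps = []

    def add(self, when):
        """Record one event, e.g. one measurement attempt."""
        self.timestamps.append(when)

    def count(self, now):
        """Drop entries older than ``max_age`` and return how many remain."""
        oldest = now - self.max_age
        self.timestamps = [t for t in self.timestamps if t >= oldest]
        return len(self.timestamps)


now = datetime(2020, 3, 10)
attempts = RecentTimestamps()
attempts.add(now - timedelta(days=6))  # too old, will be dropped
attempts.add(now - timedelta(days=1))
attempts.add(now)
recent = attempts.count(now)
```

Because each entry carries its own time, the count decreases by itself as
entries expire, which a plain integer cannot do.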

Where do the values of the keys come from?
``````````````````````````````````````````

In the file system, there are only two types of files where these values can
be stored:

- the results files in ``datadir``
- the ``state.dat`` file

Because of the structure of the content in the results files, they can store
KeyValues for the relays, but not for the headers, which need to be stored in
the ``state.dat`` file.
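
The following sketch illustrates how header timestamps could be kept in a JSON
state file. It assumes the state file is a flat JSON object; the helper names,
the timestamp format and the key name ``recent_consensus`` are assumptions
made for illustration and are not taken from the sbws code.

```python
import json
import os
import tempfile
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S"  # assumed timestamp format


def save_timestamps(path, key, timestamps):
    """Store a list of datetimes under ``key``, keeping other keys intact."""
    state = {}
    if os.path.exists(path):
        with open(path) as fd:
            state = json.load(fd)
    # JSON has no datetime type, so the values are written as strings.
    state[key] = [t.strftime(FMT) for t in timestamps]
    with open(path, "w") as fd:
        json.dump(state, fd)


def load_timestamps(path, key):
    """Read the list stored under ``key`` back into datetime objects."""
    with open(path) as fd:
        state = json.load(fd)
    return [datetime.strptime(s, FMT) for s in state.get(key, [])]


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "state.dat")
    stamps = [datetime(2020, 3, 1, 12, 0, 0), datetime(2020, 3, 2, 12, 0, 0)]
    save_timestamps(path, "recent_consensus", stamps)
    loaded = load_timestamps(path, "recent_consensus")
```

The explicit conversion to and from strings is the kind of data type handling
that JSON files force on the code.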

The classes that manage these KeyValues are:

``RelayList``:

- recent_consensus_count
- recent_measurement_attempt_count

``RelayPrioritizer``:

- recent_priority_list_count
- recent_priority_relay_count

``Relay`` and ``Result``:

- relay_in_recent_consensus_count
- relay_recent_measurement_attempt_count
- relay_recent_priority_list_count

Transition from numbers to datetimes
````````````````````````````````````

The KeyValues named ``_count`` in the results and the state will be ignored
when sbws is restarted with this change, since they will be written without
``count`` in their names in the JSON of these files.

We could add code to compute these counts during the transition to this
version, but the numbers are wrong anyway and we don't think it's worth the
effort, since they will be correct after 5 days and they have been wrong for
a long time.
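
A small sketch of why the old values end up ignored, assuming the new code
only looks up the key without the ``_count`` suffix. The function
``load_recent`` and the exact key names are hypothetical.

```python
def load_recent(state, name):
    """Return the list of timestamps stored under the new-style key.

    Only ``name`` is looked up, so an old numeric ``<name>_count`` entry is
    never read and the list simply starts empty.
    """
    return list(state.get(name, []))


# State written by the previous version: only the numeric key exists.
old_state = {"recent_consensus_count": 120}
recent_consensus = load_recent(old_state, "recent_consensus")
```

After a restart the list starts empty and fills up again as new events are
recorded.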

Additionally, ``recent_measurement_failure_count`` will be negative, since
it's calculated as ``recent_measurement_attempt_count`` minus all the results.
While the total number of results in the last 5 days is correct, the number of
attempts won't be until 5 days have passed.
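
To make the arithmetic concrete, here is a toy calculation with made-up
numbers: the results of the last 5 days are already on disk, while the
attempts are only counted from the restart onwards.

```python
# Hypothetical figures, one day after restarting with the new version.
results_last_5_days = 1000    # read from the existing results files
attempts_since_restart = 250  # timestamps recorded only since the restart

recent_measurement_failure_count = (
    attempts_since_restart - results_last_5_days
)
```

With these numbers the failure count is -750. It stops being negative once
the attempt timestamps cover the same 5 days as the results.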

Disadvantages
`````````````

``sbws generate``, with 27795 measurement attempts, takes 1 minute instead of
a few seconds.
The same happens with ``RelayPrioritizer.best_priority``, though so far
that seems OK since it's a Python generator in a thread and the measurements
start before it has calculated all the priorities.
The same happens with ``ResultDump``, which reads and writes the data in a
thread.
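
The effect of using a generator can be illustrated with a simplified stand-in
(this is not the real ``RelayPrioritizer.best_priority`` and it leaves the
thread out): the consumer handles the first item before the generator has
produced the rest. The function ``lazy_priorities`` and the relay names are
hypothetical.

```python
events = []


def lazy_priorities(relays):
    """Yield relays one at a time, logging when each one is produced."""
    for relay in relays:
        events.append("prioritized " + relay)
        yield relay


for relay in lazy_priorities(["relayA", "relayB"]):
    # The "measurement" of relayA happens before relayB is prioritized.
    events.append("measured " + relay)
```

The log in ``events`` alternates between producing and consuming, so a slow
producer delays each item but does not block the start of the measurements.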

Conclusion
``````````

All these changes required a lot of effort and are not optimal. It was the
way we could correct and maintain the 1.1.0 version.
If a 2.0 version happens, we highly recommend redesigning the data structures
to use a database through a well-maintained ORM library, which would avoid the
limitations of JSON files and errors in data type conversions, and which is
optimized for the type of counting and statistics we aim for.

.. note:: Documentation about a possible version 2.0 and the steps to change