@@ -17,14 +17,14 @@ replication, fragmentation) while suppressing the balancing policies using Lua.
Mantle is based on [1] but the current implementation does *NOT* have the
following features from that paper:

- 1. Balancing API: in the paper, the user fills in when, where, how much, and
+ #. Balancing API: in the paper, the user fills in when, where, how much, and
 load calculation policies. Currently, Mantle requires only that Lua policies
 return a table of target loads (for example, how much load to send to each
 MDS)
- 2. The "how much" hook: in the paper, there was a hook that allowed the user to
+ #. The "how much" hook: in the paper, there was a hook that allowed the user to
 control the "fragment selector policy". Currently, Mantle does not have this
 hook.
- 3. "Instantaneous CPU utilization" as a metric.
+ #. "Instantaneous CPU utilization" as a metric.

[1] Supercomputing '15 Paper:
http://sc15.supercomputing.org/schedule/event_detail-evid=pap168.html
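Reviewer note: the "table of target loads" contract in the first list item can be hard to picture. The sketch below is Python, not Lua, and is not the shipped ``greedyspill.lua``. The function name, the ``idle_threshold`` value, and the "send half to the next rank" rule are all assumptions made for illustration. It only shows the shape of the return value: one entry per MDS rank, holding the load to send there.

```python
def greedy_spill_targets(loads, whoami, idle_threshold=0.01):
    """Return {rank: load_to_send} for the MDS with rank `whoami`.

    `loads` is a list of per-rank metadata loads. All names and the
    threshold are hypothetical; this is not Mantle's API.
    """
    # Start with "send nothing to anyone".
    targets = {rank: 0.0 for rank in range(len(loads))}

    neighbor = whoami + 1
    if neighbor >= len(loads):
        # The last rank has no neighbor, so there is nothing to spill to.
        return targets

    # Spill half of our load only if we are busy and the neighbor is idle.
    if loads[whoami] > idle_threshold and loads[neighbor] < idle_threshold:
        targets[neighbor] = loads[whoami] / 2

    return targets
```

The explicit check for a missing neighbor is the kind of guard a policy needs on the last rank, where looking up the next MDS would otherwise fail.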
@@ -58,9 +58,9 @@ metadata load:
Mantle with `vstart.sh`
~~~~~~~~~~~~~~~~~~~~~~~

- 1. Start Ceph and tune the logging so we can see migrations happen:
+ #. Start Ceph and tune the logging so we can see migrations happen:

- ::
+ ::

 cd build
 ../src/vstart.sh -n -l
@@ -71,37 +71,37 @@ Mantle with `vstart.sh`
 done


- 2. Put the balancer into RADOS:
+ #. Put the balancer into RADOS:

- ::
+ ::

 bin/rados put --pool=cephfs_metadata_a greedyspill.lua ../src/mds/balancers/greedyspill.lua


- 3. Activate Mantle:
+ #. Activate Mantle:

- ::
+ ::

 bin/ceph fs set cephfs max_mds 5
 bin/ceph fs set cephfs_a balancer greedyspill.lua


- 4. Mount CephFS in another window:
+ #. Mount CephFS in another window:

- ::
+ ::

- bin/ceph-fuse /cephfs -o allow_other &
- tail -f out/mds.a.log
+ bin/ceph-fuse /cephfs -o allow_other &
+ tail -f out/mds.a.log


 Note that if you look at the last MDS (which could be a, b, or c -- it's
 random), you will see an attempt to index a nil value. This is because the
 last MDS tries to check the load of its neighbor, which does not exist.

- 5. Run a simple benchmark. In our case, we use the Docker mdtest image to
+ #. Run a simple benchmark. In our case, we use the Docker mdtest image to
 create load:

- ::
+ ::

 for i in 0 1 2; do
 docker run -d \
@@ -112,9 +112,9 @@ Mantle with `vstart.sh`
 done


- 6. When you are done, you can kill all the clients with:
+ #. When you are done, you can kill all the clients with:

- ::
+ ::

 for i in 0 1 2 3; do docker rm -f client$i; done

@@ -166,7 +166,7 @@ Implementation Details
Most of the implementation is in MDBalancer. Metrics are passed to the balancer
policies via the Lua stack and a list of loads is returned back to MDBalancer.
It sits alongside the current balancer implementation and it's enabled with a
- Ceph CLI command ("ceph fs set cephfs balancer mybalancer.lua"). If the Lua policy
+ Ceph CLI command (``ceph fs set cephfs balancer mybalancer.lua``). If the Lua policy
fails (for whatever reason), we fall back to the original metadata load
balancer. The balancer is stored in the RADOS metadata pool and a string in the
MDSMap tells the MDSs which balancer to use.
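Reviewer note: the fallback behavior described in this paragraph ("if the Lua policy fails, we fall back to the original metadata load balancer") can be sketched as a protected call around the user policy. This is a Python illustration under stated assumptions, not the MDBalancer code: ``run_balancer`` and ``original_balancer`` are hypothetical names, and the stand-in fallback simply sends no load anywhere.

```python
def original_balancer(loads):
    """Hypothetical stand-in for the built-in balancer: send nothing."""
    return {rank: 0.0 for rank in range(len(loads))}


def run_balancer(policy, loads, fallback=original_balancer):
    """Run a user policy in a protected call; fall back on any failure.

    Returns a (targets, source) pair, where source is "custom" if the
    policy succeeded and "fallback" otherwise.
    """
    try:
        targets = policy(loads)
        # A policy must hand back a table of target loads.
        if not isinstance(targets, dict):
            raise TypeError("policy must return a table of target loads")
        return targets, "custom"
    except Exception:
        # The error stops here instead of propagating up the call chain.
        return fallback(loads), "fallback"
```

Catching the error at this boundary plays the role that a protected call plays in the real implementation: a broken policy costs one balancing round on the custom logic, not the whole MDS.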
@@ -193,19 +193,19 @@ at the Ceph source code to see which metrics are exposed. We figure that the
Mantle developer will be in touch with MDS internals anyways.

The metrics exposed to the Lua policy are the same ones that are already stored
- in mds_load_t: auth.meta_load(), all.meta_load(), req_rate, queue_length,
- cpu_load_avg.
+ in ``mds_load_t``: ``auth.meta_load()``, ``all.meta_load()``, ``req_rate``,
+ ``queue_length``, ``cpu_load_avg``.

Compile/Execute the Balancer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Here we use `lua_pcall` instead of `lua_call` because we want to handle errors
+ Here we use ``lua_pcall`` instead of ``lua_call`` because we want to handle errors
in the MDBalancer. We do not want the error propagating up the call chain. The
cls_lua class wants to handle the error itself because it must fail gracefully.
For Mantle, we don't care if a Lua error crashes our balancer -- in that case,
we will fall back to the original balancer.

- The performance improvement of using `lua_call` over `lua_pcall` would not be
+ The performance improvement of using ``lua_call`` over ``lua_pcall`` would not be
leveraged here because the balancer is invoked every 10 seconds by default.

Returning Policy Decision to C++
@@ -220,7 +220,7 @@ Lua side.
Iterating through tables returned by Lua is done through the stack. In Lua
jargon: a dummy value is pushed onto the stack and the next iterator replaces
the top of the stack with a (k, v) pair. After reading each value, pop that
- value but keep the key for the next call to `lua_next`.
+ value but keep the key for the next call to ``lua_next``.

Reading from RADOS
~~~~~~~~~~~~~~~~~~
@@ -254,16 +254,16 @@ the cls logging interface:
 BAL_LOG(0, "this is a log message")


- It is implemented by passing a function that wraps the `dout` logging framework
- (`dout_wrapper`) to Lua with the `lua_register()` primitive. The Lua code is
- actually calling the `dout` function in C++.
+ It is implemented by passing a function that wraps the ``dout`` logging framework
+ (``dout_wrapper``) to Lua with the ``lua_register()`` primitive. The Lua code is
+ actually calling the ``dout`` function in C++.

Warning and Info messages are centralized using the clog/Beacon. Successful
messages are only sent on version changes by the first MDS to avoid spamming
- the `ceph -w` utility. These messages are used for the integration tests.
+ the ``ceph -w`` utility. These messages are used for the integration tests.

Testing
~~~~~~~

- Testing is done with the ceph-qa-suite (tasks.cephfs.test_mantle). We do not
+ Testing is done with the ``ceph-qa-suite`` (``tasks.cephfs.test_mantle``). We do not
test invalid balancer logging and loading the actual Lua VM.