From 43b73a299f66ce1566b8531a3620956f5d1da541 Mon Sep 17 00:00:00 2001
From: Fawzi Mohamed
Date: Tue, 24 Jun 2025 14:10:22 +0200
Subject: [PATCH 1/8] Add lustre tuning guide

---
 docs/guides/lustre-tuning.md   |  25 +++++++++++++++++++++++++
 docs/images/storage/lustre.png | Bin 0 -> 27329 bytes
 2 files changed, 25 insertions(+)
 create mode 100644 docs/guides/lustre-tuning.md
 create mode 100644 docs/images/storage/lustre.png

diff --git a/docs/guides/lustre-tuning.md b/docs/guides/lustre-tuning.md
new file mode 100644
index 00000000..79aa18c7
--- /dev/null
+++ b/docs/guides/lustre-tuning.md
@@ -0,0 +1,25 @@
+# Lustre Tuning
+`/capstor/` and `/iopsstor` are both [lustre](https://lustre.org) filesystem.
+Lustre is an open-source, parallel file system used in HPC systems.
+As shown in ![Lustre architecture](/images/storage/lustre.png) uses *metadata* servers to store and query metadata which is basically what is shown by `ls`: directory structure, file permission, modification dates,..
+This data is globally synchronized, which means that handling many small files is not especially suited for lustre, and the perfomrance of that part is similar on both capstor and iopsstor.
+With many small files, a local filesystems like `/dev/shmem/$USER` or "/tmp", if enough memory can be spared for it, can be *much* faster, and offset the packing/unpacking work. Alternatively using a squashed filesystems can be a good option.
+
+The data itself is subdivided in blocks of size `` and is stored by Object Storage Servers (OSS) in one or more Object Storage Targets (OST).
+The blocksize and number of OSTs to use is defined by the striping settings. A new file or directory ihnerits them from its parent directory. The `lfs getstripe ` command can be used to get information on the actual stripe settings. For directories and empty files `lfs setstripe --stripe-count --stripe-size ` can be used to set the layout. The simplest way to have the correct layout is to copy to a directory with the correct layout
+
+A blocksize of 4MB gives good throughput, without being overly big, so it is a good choice when reading a file sequentially or in large chuncks, but if one reads shorter chuncks in random order it might be better to reduce the size, the performance will be smaller, but the performance of your application might actually increase.
+https://doc.lustre.org/lustre_manual.xhtml#managingstripingfreespace
+
+!!! example "Good large files settings"
+    ```console
+    lfs setstripe --stripe-count -1 --stripe-size 4M `
+    ```
+
+Lustre also supports composite layouts, switching from one layout to another at a given size `--component-end` (`-E`).
+With it it is possible to create a Progressive file layout switching `--stripe-count` (`-c`), `--stripe-size` (`-S`), so that fewer locks are required for smaller files, but load is distributed for larger files.
+
+!!! example "Good default settings"
+    ```console
+    lfs setstripe -E 4M -c 1 -E 64M -c 4 -E -1 -c -1 -S 4M
+    ```
diff --git a/docs/images/storage/lustre.png b/docs/images/storage/lustre.png
new file mode 100644
index 0000000000000000000000000000000000000000..8eb617a959dd288dfa0e502bf6060d05d0084c55
Binary files /dev/null and b/docs/images/storage/lustre.png differ

From 1e0814c2b0bc3c8f6723467a35589838a2447dd7 Mon Sep 17 00:00:00 2001
From: Fawzi Mohamed
Date: Tue, 24 Jun 2025 14:43:11 +0200
Subject: [PATCH 2/8] Added iopsstor vs capstor notes

---
 docs/guides/lustre-tuning.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/docs/guides/lustre-tuning.md b/docs/guides/lustre-tuning.md
index 79aa18c7..87bcc942 100644
--- a/docs/guides/lustre-tuning.md
+++ b/docs/guides/lustre-tuning.md
@@ -23,3 +23,10 @@ With it it is possible to create a Progressive file layout switching `--stripe-c
     ```console
     lfs setstripe -E 4M -c 1 -E 64M -c 4 -E -1 -c -1 -S 4M
     ```
+
+## Iopsstor vs Capstor
+
+`iopsstor` uses SSD as OST, thus random access is quick, and the performance of the single OST is high. `capstor` on another hand uses harddisks, it has a larger capacity, and it also have many more OSS, thus the total bandwidth is larger.
+
+!!! Note ML usage
+    model training normally has better performance if reading from iopsstor (random access), checkpoint can be done to capstor (very good for contiguous access)

From 1af02bc17723df2e9daef84158f7705adbc57602 Mon Sep 17 00:00:00 2001
From: Fawzi Mohamed
Date: Tue, 24 Jun 2025 16:25:15 +0200
Subject: [PATCH 3/8] added Lustre tuning guide to index

---
 docs/guides/lustre-tuning.md | 4 ++--
 mkdocs.yml                   | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/guides/lustre-tuning.md b/docs/guides/lustre-tuning.md
index 87bcc942..884e50c4 100644
--- a/docs/guides/lustre-tuning.md
+++ b/docs/guides/lustre-tuning.md
@@ -28,5 +28,5 @@ With it it is possible to create a Progressive file layout switching `--stripe-c
 
 `iopsstor` uses SSD as OST, thus random access is quick, and the performance of the single OST is high. `capstor` on another hand uses harddisks, it has a larger capacity, and it also have many more OSS, thus the total bandwidth is larger.
 
-!!! Note ML usage
-    model training normally has better performance if reading from iopsstor (random access), checkpoint can be done to capstor (very good for contiguous access)
+!!! Note
+    ML model training normally has better performance if reading from iopsstor (random access, difficult to predict access pattern). Checkpoint can be done to capstor (very good for contiguous access).
diff --git a/mkdocs.yml b/mkdocs.yml
index 3247e1ff..90082402 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -112,6 +112,7 @@ nav:
     - guides/index.md
     - 'Internet Access on Alps': guides/internet-access.md
     - 'Storage': guides/storage.md
+    - 'Lustre tuning': guides/lustre-tuning.md
     - 'Using the terminal': guides/terminal.md
     - 'MLP Tutorials':
       - guides/mlp_tutorials/index.md

From 8951a59206d90338a262e94219a84e04247777a0 Mon Sep 17 00:00:00 2001
From: Fawzi Mohamed
Date: Wed, 25 Jun 2025 11:25:34 +0200
Subject: [PATCH 4/8] Added refs, integrated with storage guide

---
 docs/guides/lustre-tuning.md | 32 -----------------------------
 docs/guides/storage.md       | 40 ++++++++++++++++++++++++++++++++++++
 docs/platforms/mlp/index.md  |  1 +
 docs/storage/filesystems.md  |  2 ++
 mkdocs.yml                   |  1 -
 5 files changed, 43 insertions(+), 33 deletions(-)
 delete mode 100644 docs/guides/lustre-tuning.md

diff --git a/docs/guides/lustre-tuning.md b/docs/guides/lustre-tuning.md
deleted file mode 100644
index 884e50c4..00000000
--- a/docs/guides/lustre-tuning.md
+++ /dev/null
@@ -1,32 +0,0 @@
-# Lustre Tuning
-`/capstor/` and `/iopsstor` are both [lustre](https://lustre.org) filesystem.
-Lustre is an open-source, parallel file system used in HPC systems.
-As shown in ![Lustre architecture](/images/storage/lustre.png) uses *metadata* servers to store and query metadata which is basically what is shown by `ls`: directory structure, file permission, modification dates,..
-This data is globally synchronized, which means that handling many small files is not especially suited for lustre, and the perfomrance of that part is similar on both capstor and iopsstor.
-With many small files, a local filesystems like `/dev/shmem/$USER` or "/tmp", if enough memory can be spared for it, can be *much* faster, and offset the packing/unpacking work. Alternatively using a squashed filesystems can be a good option.
-
-The data itself is subdivided in blocks of size `` and is stored by Object Storage Servers (OSS) in one or more Object Storage Targets (OST).
-The blocksize and number of OSTs to use is defined by the striping settings. A new file or directory ihnerits them from its parent directory. The `lfs getstripe ` command can be used to get information on the actual stripe settings. For directories and empty files `lfs setstripe --stripe-count --stripe-size ` can be used to set the layout. The simplest way to have the correct layout is to copy to a directory with the correct layout
-
-A blocksize of 4MB gives good throughput, without being overly big, so it is a good choice when reading a file sequentially or in large chuncks, but if one reads shorter chuncks in random order it might be better to reduce the size, the performance will be smaller, but the performance of your application might actually increase.
-https://doc.lustre.org/lustre_manual.xhtml#managingstripingfreespace
-
-!!! example "Good large files settings"
-    ```console
-    lfs setstripe --stripe-count -1 --stripe-size 4M `
-    ```
-
-Lustre also supports composite layouts, switching from one layout to another at a given size `--component-end` (`-E`).
-With it it is possible to create a Progressive file layout switching `--stripe-count` (`-c`), `--stripe-size` (`-S`), so that fewer locks are required for smaller files, but load is distributed for larger files.
-
-!!! example "Good default settings"
-    ```console
-    lfs setstripe -E 4M -c 1 -E 64M -c 4 -E -1 -c -1 -S 4M
-    ```
-
-## Iopsstor vs Capstor
-
-`iopsstor` uses SSD as OST, thus random access is quick, and the performance of the single OST is high. `capstor` on another hand uses harddisks, it has a larger capacity, and it also have many more OSS, thus the total bandwidth is larger.
-
-!!! Note
-    ML model training normally has better performance if reading from iopsstor (random access, difficult to predict access pattern). Checkpoint can be done to capstor (very good for contiguous access).
diff --git a/docs/guides/storage.md b/docs/guides/storage.md
index 80da59c2..95c6d1cd 100644
--- a/docs/guides/storage.md
+++ b/docs/guides/storage.md
@@ -113,10 +113,50 @@ To set up a default so all newly created folders and dirs inside or your desired
 !!! info
     For more information read the setfacl man page: `man setfacl`.
 
+[](){#ref-guides-storage-lustre}
+## Lustre Tuning
+[Capstor][ref-alps-capstor] and [Iopsstor][ref-alps-iopsstor] are both [lustre](https://lustre.org) filesystem.
+Lustre is an open-source, parallel file system used in HPC systems.
+As shown in the schema below
+
+![Lustre architecture](/images/storage/lustre.png)
+
+Lustre uses *metadata* servers to store and query metadata which is basically what is shown by `ls`: directory structure, file permission, modification dates,..
+This data is globally synchronized, which means that handling many small files is not especially suited for lustre, and the perfomrance of that part is similar on both Capstor and Iopsstor. The section below discusses [how to handle many small files][ref-guides-storage-small-files]
+
+The data itself is subdivided in blocks of size `` and is stored by Object Storage Servers (OSS) in one or more Object Storage Targets (OST).
+The blocksize and number of OSTs to use is defined by the striping settings. A new file or directory ihnerits them from its parent directory. The `lfs getstripe ` command can be used to get information on the actual stripe settings. For directories and empty files `lfs setstripe --stripe-count --stripe-size ` can be used to set the layout. The simplest way to have the correct layout is to copy to a directory with the correct layout
+
+A blocksize of 4MB gives good throughput, without being overly big, so it is a good choice when reading a file sequentially or in large chuncks, but if one reads shorter chuncks in random order it might be better to reduce the size, the performance will be smaller, but the performance of your application might actually increase.
+https://doc.lustre.org/lustre_manual.xhtml#managingstripingfreespace
+
+!!! example "Settings for large files"
+    ```console
+    lfs setstripe --stripe-count -1 --stripe-size 4M `
+    ```
+
+Lustre also supports composite layouts, switching from one layout to another at a given size `--component-end` (`-E`).
+With it it is possible to create a Progressive file layout switching `--stripe-count` (`-c`), `--stripe-size` (`-S`), so that fewer locks are required for smaller files, but load is distributed for larger files.
+
+!!! example "Good default settings"
+    ```console
+    lfs setstripe -E 4M -c 1 -E 64M -c 4 -E -1 -c -1 -S 4M
+    ```
+
+### Iopsstor vs Capstor
+
+[Iopsstor][ref-alps-iopsstor] uses SSD as OST, thus random access is quick, and the performance of the single OST is high. [Capstor][ref-alps-capstor] on another hand uses harddisks, it has a larger capacity, and it also have many more OSS, thus the total bandwidth is larger.
+
+!!! Note
+    ML model training normally has better performance if reading from iopsstor (random access, difficult to predict access pattern). Checkpoint can be done to capstor (very good for contiguous access).
+
+[](){#ref-guides-storage-small-files}
 ## Many small files vs. HPC File Systems
 
 Workloads that read or create many small files are not well-suited to parallel file systems, which are designed for parallel and distributed I/O.
 
+In some cases, and if enough memory is available it might be worth to unpack/repack the small files to local in memory filesystems like `/dev/shmem/$USER` or `/tmp`, which are *much* faster, or to use a squashfs filesystem that is stored as a single large file on lustre.
+
 Workloads that do not play nicely with Lustre include:
 
 * Configuration and compiling applications.
diff --git a/docs/platforms/mlp/index.md b/docs/platforms/mlp/index.md
index e5ab58eb..1d8a7f46 100644
--- a/docs/platforms/mlp/index.md
+++ b/docs/platforms/mlp/index.md
@@ -63,6 +63,7 @@ Scratch is per user - each user gets separate scratch path and quota.
     The Capstor scratch filesystem is based on HDDs and is optimized for large, sequential read and write operations.
     We recommend using Capstor for storing **checkpoint files** and other **large, contiguous outputs** generated by your training runs.
     In contrast, Iopstor uses high-performance NVMe drives, which excel at handling **IOPS-intensive workloads** involving frequent, random access. This makes it a better choice for storing **training datasets**, especially when accessed randomly during machine learning training.
+    See the [Lustre guide][ref-guides-storage-lustre] for some hints on how to get the best performance out of the filesystem.
 
 ### Scratch Usage Recommendations
 
diff --git a/docs/storage/filesystems.md b/docs/storage/filesystems.md
index 6f1735f1..7fc46c01 100644
--- a/docs/storage/filesystems.md
+++ b/docs/storage/filesystems.md
@@ -84,6 +84,7 @@ Daily [snapshots][ref-storage-snapshots] for the last seven days are provided in
 ## Scratch
 
 The Scratch file system is a fast workspace tuned for use by parallel jobs, with an emphasis on performance over reliability, hosted on the [Capstor][ref-alps-capstor] Lustre filesystem.
+See the [Lustre guide][ref-guides-storage-lustre] for some hints on how to get the best performance out of the filesystem. All users on Alps get their own Scratch path, `/capstor/scratch/cscs/$USER`, which is pointed to by the variable `$SCRATCH` on the [HPC Platform][ref-platform-hpcp] and [Climate and Weather Platform][ref-platform-cwp] clusters Eiger, Daint and Santis. @@ -123,6 +124,7 @@ Please ensure that you move important data to a file system with backups, for ex ## Store Store is a large, medium-performance, storage on the [Capstor][ref-alps-capstor] Lustre file system for sharing data within a project, and for medium term data storage. +See the [Lustre guide][ref-guides-storage-lustre] for some hints on how to get the best performance out of the filesystem. Space on Store is allocated per-project, with a path created for each project. To accomodate the different customers and projects on Alps, the project paths are organised as follows: diff --git a/mkdocs.yml b/mkdocs.yml index 90082402..3247e1ff 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -112,7 +112,6 @@ nav: - guides/index.md - 'Internet Access on Alps': guides/internet-access.md - 'Storage': guides/storage.md - - 'Lustre tuning': guides/lustre-tuning.md - 'Using the terminal': guides/terminal.md - 'MLP Tutorials': - guides/mlp_tutorials/index.md From ffa006ad5ed899512c86d1de535e54ccefa8cbb3 Mon Sep 17 00:00:00 2001 From: Fawzi Mohamed Date: Wed, 25 Jun 2025 14:41:33 +0200 Subject: [PATCH 5/8] Integrated suggestions by @msimberg, cleanups --- docs/alps/storage.md | 2 ++ docs/guides/storage.md | 23 ++++++++++++----------- docs/platforms/mlp/index.md | 3 ++- 3 files changed, 16 insertions(+), 12 deletions(-) diff --git a/docs/alps/storage.md b/docs/alps/storage.md index ef7df25b..bd18b6b6 100644 --- a/docs/alps/storage.md +++ b/docs/alps/storage.md @@ -19,6 +19,8 @@ HPC storage is provided by independent clusters, composed of servers and physica Capstor and Iopsstor are on the same Slingshot network as
Alps, while VAST is on the CSCS Ethernet network. +See the [Lustre guide][ref-guides-storage-lustre] for some hints on how to get the best performance out of the filesystem. + The mounts, and how they are used for Scratch, Store, and Home file systems that are mounted on clusters are documented in the [file system docs][ref-storage-fs]. [](){#ref-alps-capstor} diff --git a/docs/guides/storage.md b/docs/guides/storage.md index 95c6d1cd..8aef55ae 100644 --- a/docs/guides/storage.md +++ b/docs/guides/storage.md @@ -116,25 +116,28 @@ To set up a default so all newly created folders and dirs inside or your desired [](){#ref-guides-storage-lustre} ## Lustre Tuning [Capstor][ref-alps-capstor] and [Iopsstor][ref-alps-iopsstor] are both [lustre](https://lustre.org) filesystem. -Lustre is an open-source, parallel file system used in HPC systems. + As shown in the schema below ![Lustre architecture](/images/storage/lustre.png) -Lustre uses *metadata* servers to store and query metadata which is basically what is shown by `ls`: directory structure, file permission, modification dates,.. -This data is globally synchronized, which means that handling many small files is not especially suited for lustre, and the perfomrance of that part is similar on both Capstor and Iopsstor. The section below discusses [how to handle many small files][ref-guides-storage-small-files] +Lustre uses *metadata* servers to store and query metadata which is basically what is shown by `ls`: directory structure, file permission, modification dates,... +Its performance is roughly the same on [Capstor][ref-alps-capstor] and [Iopsstor][ref-alps-iopsstor]. +This data is globally synchronized, which means that handling many small files is not especially suited for Lustre, see the discussion on [how to handle many small files][ref-guides-storage-small-files]. The data itself is subdivided in blocks of size `` and is stored by Object Storage Servers (OSS) in one or more Object Storage Targets (OST). 
-The blocksize and number of OSTs to use is defined by the striping settings. A new file or directory ihnerits them from its parent directory. The `lfs getstripe ` command can be used to get information on the actual stripe settings. For directories and empty files `lfs setstripe --stripe-count --stripe-size ` can be used to set the layout. The simplest way to have the correct layout is to copy to a directory with the correct layout +The blocksize and number of OSTs to use is defined by the striping settings. +A new file or directory ihnerits them from its parent directory. +The `lfs getstripe ` command can be used to get information on the actual stripe settings. +For directories and empty files `lfs setstripe --stripe-count --stripe-size ` can be used to set the layout. The simplest way to have the correct layout is to copy to a directory with the correct layout -A blocksize of 4MB gives good throughput, without being overly big, so it is a good choice when reading a file sequentially or in large chuncks, but if one reads shorter chuncks in random order it might be better to reduce the size, the performance will be smaller, but the performance of your application might actually increase. +A blocksize of 4MB gives good throughput, without being overly big, so it is a good choice when reading a file sequentially or in large chunks, but if one reads shorter chunks in random order it might be better to reduce the size, the performance will be smaller, but the performance of your application might actually increase. https://doc.lustre.org/lustre_manual.xhtml#managingstripingfreespace !!! example "Settings for large files" ```console lfs setstripe --stripe-count -1 --stripe-size 4M ` ``` - Lustre also supports composite layouts, switching from one layout to another at a given size `--component-end` (`-E`). 
With it it is possible to create a Progressive file layout switching `--stripe-count` (`-c`), `--stripe-size` (`-S`), so that fewer locks are required for smaller files, but load is distributed for larger files. @@ -145,17 +148,15 @@ With it it is possible to create a Progressive file layout switching `--stripe-c ### Iopsstor vs Capstor -[Iopsstor][ref-alps-iopsstor] uses SSD as OST, thus random access is quick, and the performance of the single OST is high. [Capstor][ref-alps-capstor] on another hand uses harddisks, it has a larger capacity, and it also have many more OSS, thus the total bandwidth is larger. - -!!! Note - ML model training normally has better performance if reading from iopsstor (random access, difficult to predict access pattern). Checkpoint can be done to capstor (very good for contiguous access). +[Iopsstor][ref-alps-iopsstor] uses SSD as OST, thus random access is quick, and the performance of the single OST is high. +[Capstor][ref-alps-capstor] on another hand uses harddisks, it has a larger capacity, and it also have many more OSS, thus the total bandwidth is larger. See for example the [ML filesystem suitability][ref-mlp-storage-suitability]. [](){#ref-guides-storage-small-files} ## Many small files vs. HPC File Systems Workloads that read or create many small files are not well-suited to parallel file systems, which are designed for parallel and distributed I/O. -In some cases, and if enough memory is available it might be worth to unpack/repack the small files to local in memory filesystems like `/dev/shmem/$USER` or `/tmp`, which are *much* faster, or to use a squashfs filesystem that is stored as a single large file on lustre. +In some cases, if enough memory is available, it might be worth unpacking the small files to in-memory filesystems like `/dev/shm/$USER` or `/tmp`, which are *much* faster, and repacking them afterwards, or using a squashfs filesystem that is stored as a single large file on Lustre.
Workloads that do not play nicely with Lustre include: diff --git a/docs/platforms/mlp/index.md b/docs/platforms/mlp/index.md index 1d8a7f46..3b2387ca 100644 --- a/docs/platforms/mlp/index.md +++ b/docs/platforms/mlp/index.md @@ -52,13 +52,14 @@ Use scratch to store datasets that will be accessed by jobs, and for job output. Scratch is per user - each user gets separate scratch path and quota. * The environment variable `SCRATCH=/iopsstor/scratch/cscs/$USER` is set automatically when you log into the system, and can be used as a shortcut to access scratch. -* There is an additional scratch path mounted on [Capstor][ref-alps-capstor] at `/capstor/scratch/cscs/$USER`. +* There is an additional scratch path mounted on [Capstor][ref-alps-capstor] at `/capstor/scratch/cscs/$USER`. !!! warning "scratch cleanup policy" Files that have not been accessed in 30 days are automatically deleted. **Scratch is not intended for permanent storage**: transfer files back to the capstor project storage after job runs. +[](){#ref-mlp-storage-suitability} !!! note "file system suitability" The Capstor scratch filesystem is based on HDDs and is optimized for large, sequential read and write operations. We recommend using Capstor for storing **checkpoint files** and other **large, contiguous outputs** generated by your training runs. 
From 899bb7e7a18e89fd286af701d7c6682b3198df18 Mon Sep 17 00:00:00 2001 From: Fawzi Mohamed Date: Wed, 25 Jun 2025 15:01:00 +0200 Subject: [PATCH 6/8] fix image in preview --- docs/guides/storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/guides/storage.md b/docs/guides/storage.md index 8aef55ae..2ac553ea 100644 --- a/docs/guides/storage.md +++ b/docs/guides/storage.md @@ -119,7 +119,7 @@ To set up a default so all newly created folders and dirs inside or your desired As shown in the schema below -![Lustre architecture](/images/storage/lustre.png) +![Lustre architecture](../../images/storage/lustre.png) Lustre uses *metadata* servers to store and query metadata which is basically what is shown by `ls`: directory structure, file permission, modification dates,... Its performance is roughly the same on [Capstor][ref-alps-capstor] and [Iopsstor][ref-alps-iopsstor]. From eeb39f4e940e4b79cb0efb840def3e07847772b9 Mon Sep 17 00:00:00 2001 From: Fawzi Mohamed Date: Wed, 25 Jun 2025 15:05:57 +0200 Subject: [PATCH 7/8] really fix image in preview --- docs/guides/storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/guides/storage.md b/docs/guides/storage.md index 2ac553ea..9d43f12b 100644 --- a/docs/guides/storage.md +++ b/docs/guides/storage.md @@ -119,7 +119,7 @@ To set up a default so all newly created folders and dirs inside or your desired As shown in the schema below -![Lustre architecture](../../images/storage/lustre.png) +![Lustre architecture](../images/storage/lustre.png) Lustre uses *metadata* servers to store and query metadata which is basically what is shown by `ls`: directory structure, file permission, modification dates,... Its performance is roughly the same on [Capstor][ref-alps-capstor] and [Iopsstor][ref-alps-iopsstor]. 
From 31df88f5ebcc3ab6a50261bf89a48f2e5a334ba3 Mon Sep 17 00:00:00 2001 From: bcumming Date: Wed, 25 Jun 2025 18:03:01 +0200 Subject: [PATCH 8/8] polish the lustre guide --- docs/guides/storage.md | 27 ++++++++++++++------------- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/docs/guides/storage.md b/docs/guides/storage.md index 9d43f12b..17048328 100644 --- a/docs/guides/storage.md +++ b/docs/guides/storage.md @@ -114,25 +114,25 @@ To set up a default so all newly created folders and dirs inside or your desired For more information read the setfacl man page: `man setfacl`. [](){#ref-guides-storage-lustre} -## Lustre Tuning -[Capstor][ref-alps-capstor] and [Iopsstor][ref-alps-iopsstor] are both [lustre](https://lustre.org) filesystem. - -As shown in the schema below +## Lustre tuning +[Capstor][ref-alps-capstor] and [Iopsstor][ref-alps-iopsstor] are both [Lustre](https://lustre.org) filesystems. ![Lustre architecture](../images/storage/lustre.png) -Lustre uses *metadata* servers to store and query metadata which is basically what is shown by `ls`: directory structure, file permission, modification dates,... +As shown in the diagram above, Lustre uses *metadata* servers to store and query metadata, which is essentially what `ls` shows: directory structure, file permissions, and modification dates. Its performance is roughly the same on [Capstor][ref-alps-capstor] and [Iopsstor][ref-alps-iopsstor]. -This data is globally synchronized, which means that handling many small files is not especially suited for Lustre, see the discussion on [how to handle many small files][ref-guides-storage-small-files]. +This data is globally synchronized, which means Lustre is not well suited to handling many small files; see the discussion on [how to handle many small files][ref-guides-storage-small-files]. The data itself is subdivided in blocks of size `` and is stored by Object Storage Servers (OSS) in one or more Object Storage Targets (OST).
-The blocksize and number of OSTs to use is defined by the striping settings. -A new file or directory ihnerits them from its parent directory. -The `lfs getstripe ` command can be used to get information on the actual stripe settings. -For directories and empty files `lfs setstripe --stripe-count --stripe-size ` can be used to set the layout. The simplest way to have the correct layout is to copy to a directory with the correct layout +The blocksize and number of OSTs to use are defined by the striping settings, which are applied to a path, with new files and directories inheriting them from their parent directory. +The `lfs getstripe ` command can be used to get information on the stripe settings of a path. +For directories and empty files `lfs setstripe --stripe-count --stripe-size ` can be used to set the layout. +The simplest way to give a file the correct layout is to copy it to a directory with the correct layout. + +!!! tip "A blocksize of 4MB gives good throughput, without being overly big..." + ... so it is a good choice when reading a file sequentially or in large chunks. If one reads shorter chunks in random order it might be better to reduce the size: the peak throughput will be lower, but the performance of your application might actually increase. + See the [Lustre documentation](https://doc.lustre.org/lustre_manual.xhtml#managingstripingfreespace) for more information. -A blocksize of 4MB gives good throughput, without being overly big, so it is a good choice when reading a file sequentially or in large chunks, but if one reads shorter chunks in random order it might be better to reduce the size, the performance will be smaller, but the performance of your application might actually increase. -https://doc.lustre.org/lustre_manual.xhtml#managingstripingfreespace !!!
example "Settings for large files" ```console @@ -149,7 +149,8 @@ With it it is possible to create a Progressive file layout switching `--stripe-c ### Iopsstor vs Capstor [Iopsstor][ref-alps-iopsstor] uses SSD as OST, thus random access is quick, and the performance of the single OST is high. -[Capstor][ref-alps-capstor] on another hand uses harddisks, it has a larger capacity, and it also have many more OSS, thus the total bandwidth is larger. See for example the [ML filesystem suitability][ref-mlp-storage-suitability]. +[Capstor][ref-alps-capstor], on the other hand, uses hard disks: it has a larger capacity and many more OSSs, so its total bandwidth is larger. +See for example the [ML filesystem guide][ref-mlp-storage-suitability]. [](){#ref-guides-storage-small-files} ## Many small files vs. HPC File Systems