Commit 693ba2a

Updated some URLs. Added real-world workload docs to setup
1 parent 017031f commit 693ba2a

3 files changed: +31 -5 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -233,7 +233,7 @@ Hops is released under an [Apache 2.0 license](LICENSE.txt).

 # Associated Publications

-This software was the subject of the paper, *λFS: A Scalable and Elastic Distributed File System Metadata Service using Serverless Functions*. This paper can be found [here](https://arxiv.org/abs/2306.11877) and is set to appear in the proceedings of ASPLOS'23. The software found [here](https://github.com/ds2-lab/LambdaFS-Benchmark-Utility) was used to evaluate λFS and HopsFS for the paper.
+This software was the subject of the paper, *λFS: A Scalable and Elastic Distributed File System Metadata Service using Serverless Functions*. This paper can be found [here](https://arxiv.org/abs/2306.11877) and is set to appear in the proceedings of ASPLOS'23. The software found [here](https://github.com/ds2-lab/LambdaFS-Benchmarking) was used to evaluate λFS and HopsFS for the paper.

 **BibTeX Citation (for arXiv preprint)**:
 ```

aws-setup/documentation/setup.md

Lines changed: 29 additions & 3 deletions
@@ -411,7 +411,7 @@ If everything works, then you'll also see a message prefixed with `[SUCCESS]` be

 # λFS Benchmarking Utility

-**NOTE:** This information is covered in greater detail in the LambdaFS-Benchmarking-Utility GitHub repository available [here](https://github.com/ds2-lab/LambdaFS-Benchmark-Utility).
+**NOTE:** This information is covered in greater detail in the LambdaFS-Benchmarking-Utility GitHub repository available [here](https://github.com/ds2-lab/LambdaFS-Benchmarking).

 To run the same software we used when evaluating λFS and HopsFS, you can navigate to the `/home/ubuntu/repos/LambdaFS-BenchmarkingUtility` directory. The branch compatible with λFS is the `origin/generic` branch, while the branch compatible with HopsFS is the `origin/vanilla-distributed` branch.

@@ -455,7 +455,7 @@ There are several configuration parameters to set:

 For each "follower" (i.e., other machine on which you'd like to run the benchmarking software), you must add an entry to the `followers` list using the format shown above. If deployed on AWS EC2 within a VPC, then the `ip` is the private IPv4 of the EC2 VM. For `user`, specify the OS username that should be used when SSH-ing to the machine. If using our provided EC2 AMIs, then this will be `ubuntu`.

-## Full List of Available Command-Line Arguments
+## **Full List of Available Command-Line Arguments**

 The following is the full list of available command-line arguments for the λFS Benchmarking Utility.
 ```
@@ -482,4 +482,30 @@ The following is the full list of available command-line arguments for the λFS
 The command should SCP the hdfs-site.xml config file to each follower.

 -m --manually_launch_followers [no value] [default: false]
-```
+```
+
+## Real-World Workloads
+
+This software also drives simulations of the HDFS Spotify workload described in the paper. This option can be selected from the interactive menu along with all of the other experiments. The real-world workload expects there to be a `workload.yaml` file in the root of the repository on the primary client (i.e., experiment driver). The following is a description of the available parameters.
+
+### **General Config Parameters for the Real-World Spotify Workload**
+- `num.worker.threads` (`int`): The total number of clients that each individual worker node should use. If this is set to `128` and there are 8 worker nodes used in the experiment, then there will be a total of 1,024 clients.
+- `files.to.create.in.warmup.phase` (`int`): The number of files that each individual client should create at the very beginning of the experiment. These files are used to perform `move`, `delete`, and `rename` operations.
+- `warmup.phase.wait.time` (`int`): How long to wait at the beginning for all "warm-up files" to be created before moving onto the actual experiment.
+- `interleaved.bm.duration` (`int`): How long the real-world experiment should last (in *milliseconds*).
+- `interleaved.bm.iat.unit` (`int`) (**recommended:** `15`): How long, in seconds, the current randomly-generated throughput value should last before a new value is generated.
+- `interleaved.bm.iat.skipunit` (`int`) (**recommended:** `0`): Skips rate-limiting for this number of ticks. Recommended to leave this at 0.
+- `interleaved.bm.iat.distribution` (`string`) (**recommended:** `PARETO`): Defines the distribution to use when randomly generating file system operations. Options include `"UNIFORM"`, `"PARETO"` (default/recommended), `"POISSON"`, and `"ZIPF"`.
+- `interleaved.bm.iat.pareto.alpha`(`int`): (**recommended:** `2`): Shape parameter of the `Pareto` distribution.
+- `interleaved.bm.iat.pareto.location` (`int`): (**recommended:** `10000`): Used as a parameter to the `Pareto` distribution.
+
+### **File System Operation Distribution Parameters**
+- `interleaved.create.files.percentage`(**recommended:** `1.09`): Percentage of `CREATE-FILE` operations.
+- `interleaved.rename.files.percentage`(**recommended:** `0.55`): Percentage of `RENAME-FILE` operations.
+- `interleaved.delete.files.percentage`(**recommended:** `0.34`): Percentage of `DELETE-FILE` operations.
+- `interleaved.mkdir.percentage`(**recommended:** `0.02`): Percentage of `MKDIR` operations.
+- `interleaved.read.files.percentage`(**recommended:** `71.84`): Percentage of `READ-FILE` operations.
+- `interleaved.ls.dirs.percentage`(**recommended:** `8.17`): Percentage of `LIST-DIRECTORY` operations.
+- `interleaved.ls.files.percentage`(**recommended:** `0.68`): Percentage of `LIST-FILE` operations.
+- `interleaved.file.getInfo.percentage`(**recommended:** `13.54`): Percentage of `STAT-FILE` operations.
+- `interleaved.dir.getInfo.percentage`(**recommended:** `3.77`): Percentage of `STAT-DIRECTORY` operations.
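
The parameters documented in this hunk can be sanity-checked with a short script. The following is a minimal Python sketch, not part of the benchmarking utility itself: the flat dictionary layout and the helper names (`total_clients`, `percentages_sum_to_100`, `sample_operations`) are assumptions made for illustration, and the actual `workload.yaml` loader may structure things differently. It shows the client-count arithmetic from `num.worker.threads`, checks that the recommended operation percentages add up to 100, and draws a few operations weighted by those percentages.

```python
import math
import random

# Recommended operation mix, copied from the documentation in this commit.
# The flat dict layout is an assumption for illustration only; it is not
# the benchmarking utility's own config loader.
OPERATION_PERCENTAGES = {
    "interleaved.create.files.percentage": 1.09,
    "interleaved.rename.files.percentage": 0.55,
    "interleaved.delete.files.percentage": 0.34,
    "interleaved.mkdir.percentage": 0.02,
    "interleaved.read.files.percentage": 71.84,
    "interleaved.ls.dirs.percentage": 8.17,
    "interleaved.ls.files.percentage": 0.68,
    "interleaved.file.getInfo.percentage": 13.54,
    "interleaved.dir.getInfo.percentage": 3.77,
}


def total_clients(num_worker_threads: int, num_worker_nodes: int) -> int:
    """Total clients is the per-node thread count times the node count."""
    return num_worker_threads * num_worker_nodes


def percentages_sum_to_100(percentages: dict, tolerance: float = 1e-6) -> bool:
    """Return True if the percentages add up to 100 within a small tolerance."""
    return math.isclose(sum(percentages.values()), 100.0, abs_tol=tolerance)


def sample_operations(percentages: dict, count: int, seed: int = 0) -> list:
    """Draw `count` parameter names, weighted by their percentages (hypothetical helper)."""
    rng = random.Random(seed)
    names = list(percentages)
    weights = [percentages[name] for name in names]
    return rng.choices(names, weights=weights, k=count)


if __name__ == "__main__":
    print(total_clients(128, 8))                          # 1024
    print(percentages_sum_to_100(OPERATION_PERCENTAGES))  # True
    print(sample_operations(OPERATION_PERCENTAGES, 5))
```

If the percentages are edited by hand, a check like `percentages_sum_to_100` is a cheap way to catch a mix that no longer totals 100 before starting a long experiment.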

aws-setup/documentation/setup_tldr.md

Lines changed: 1 addition & 1 deletion
@@ -228,4 +228,4 @@ com.gmail.benrcarver.distributed.InteractiveTest --leader_ip <PRIVATE IPv4 OF VM

 Note that we're setting the JVM heap size to 2GB in the above command via the flags `-Xmx2g -Xms2g`. If you're using a VM with less than 2GB of RAM, then you should adjust this value accordingly. We recommend at least 256-512MB of RAM for basic testing with single file system operations and 1-2GB for benchmarks, especially if running in `distributed` mode.

-For more detailed instructions on the `distributed` mode of the benchmarking utility, please refer to the `setup.md` file (in `aws-setup/documentation/setup.md`) or the LambdaFS-Benchmarking-Utility GitHub repository available [here](https://github.com/ds2-lab/LambdaFS-Benchmark-Utility).
+For more detailed instructions on the `distributed` mode of the benchmarking utility, please refer to the `setup.md` file (in `aws-setup/documentation/setup.md`) or the LambdaFS-Benchmarking-Utility GitHub repository available [here](https://github.com/ds2-lab/LambdaFS-Benchmarking).
