
optimal_zfs_recordsize

A shell script for Linux that analyzes the file-size distribution of a directory tree and recommends ZFS recordsize settings for different workload types.

./optimal_zfs_recordsize.sh /path/to/dataset

How it works

  • Pipes find directly into gawk and builds a space-weighted cumulative distribution function (CDF) of file sizes
  • Finds the bins where the CDF exceeds 50%, 70%, and 90%
  • Maps the write-heavy sequential case to the bin at the 50th percentile (P50), the mixed case to P70, and the read-heavy case to P90
  • If 60% of the files are smaller than 64 KiB AND 80% of the total space is in files larger than 1 MiB, it concludes the CDF is heavily skewed, forces the write-heavy sequential suggestion to 128K and the mixed suggestion to 256K as a compromise, and shows an alert
  • If all three cases map to the same value, only one suggestion is printed
  • In any case, the write-heavy random I/O case always gets the same suggestion: match the application's block size, not the file size. (For now this case outputs a static suggestion; file-type detection for databases may be added in the future.)
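The core idea above (space-weighted CDF over power-of-two size bins, then percentile lookup) can be sketched roughly as follows. This is not the actual script, just a minimal illustration of the technique; the function name and bin layout are assumptions:

```shell
# Hypothetical sketch: bucket file sizes into power-of-two bins (4K..1M),
# weight each bin by total bytes, then report the bin where the cumulative
# byte share first crosses each percentile (P50/P70/P90).
suggest_recordsize() {
  local dir="$1"
  find "$dir" -type f -printf '%s\n' | gawk '
  {
      # bin 0 => <=4K, bin 1 => <=8K, ... bin 8 => 1M (capped)
      size = $1; bin = 0; b = 4096
      while (size > b && b < 1048576) { b *= 2; bin++ }
      bytes[bin] += size; total += size   # space-weighted, not file-count-weighted
  }
  END {
      if (total == 0) { print "no files found"; exit 1 }
      split("50 70 90", pct, " ")
      p = 1; cum = 0
      for (i = 0; i <= 8 && p <= 3; i++) {
          cum += bytes[i]
          # a single bin may satisfy several percentiles at once
          while (p <= 3 && cum / total >= pct[p] / 100) {
              printf "P%s recordsize suggestion: %dK\n", pct[p], 4 * 2^i
              p++
          }
      }
  }'
}
```

For example, `suggest_recordsize /path/to/dataset` would print one line per percentile, collapsing toward the largest bins when a few big files dominate the total space.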

Requirements

  • Bash
  • GNU Awk (gawk)
  • GNU find with -printf support
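A quick way to verify these prerequisites before running the script (a sketch; package names vary by distro, typically gawk and findutils on Linux):

```shell
# Warn about missing tools instead of failing hard.
for tool in bash gawk find; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done
# GNU find is required for -printf; BSD find (e.g. on macOS) lacks it.
find /dev/null -printf '' 2>/dev/null || echo "warning: find lacks -printf (need GNU findutils)"
```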

Example output

(screenshot of example output)