Add instructions for dataset downsampling

Maybe we should add a blog post on downsampling a dataset (notably a retrieval dataset). I could imagine this is a common use case.
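
To make the idea concrete, here is a rough sketch of what such a post might walk through for a retrieval dataset. It is only an illustration under assumptions: the dict-based `corpus` / `queries` / `qrels` layout and the `downsample_retrieval` helper are hypothetical and not part of the mteb API. The point it shows is that sampling documents at random is not enough on its own, since the relevant documents for the sampled queries have to be kept and only the distractors sampled.

```python
import random


def downsample_retrieval(corpus, queries, qrels, n_queries, n_docs, seed=42):
    """Downsample a retrieval dataset without losing relevant documents.

    Hypothetical layout (an assumption, not the mteb API):
      corpus:  dict doc_id -> text
      queries: dict query_id -> text
      qrels:   dict query_id -> dict doc_id -> relevance score
    """
    rng = random.Random(seed)

    # Sample queries; sort first so the result does not depend on dict order.
    kept_qids = rng.sample(sorted(queries), min(n_queries, len(queries)))

    # Every document judged relevant (score > 0) for a kept query must stay.
    required = {
        doc_id
        for qid in kept_qids
        for doc_id, score in qrels.get(qid, {}).items()
        if score > 0 and doc_id in corpus
    }

    # Fill the remaining budget with randomly sampled distractor documents.
    # If the relevant documents alone exceed n_docs, all of them are kept
    # and no distractors are added.
    candidates = sorted(set(corpus) - required)
    n_extra = max(0, min(n_docs - len(required), len(candidates)))
    kept_docs = required | set(rng.sample(candidates, n_extra))

    small_corpus = {d: corpus[d] for d in sorted(kept_docs)}
    small_queries = {q: queries[q] for q in kept_qids}
    # Drop judgments that point at documents no longer in the corpus.
    small_qrels = {
        q: {d: s for d, s in qrels.get(q, {}).items() if d in kept_docs}
        for q in kept_qids
    }
    return small_corpus, small_queries, small_qrels
```

The fixed `seed` keeps the subset reproducible, which matters if scores on the downsampled dataset are meant to be compared across models.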

_Originally posted by @KennethEnevoldsen in https://github.com/embeddings-benchmark/mteb/pull/3810#pullrequestreview-3617900338_