Usage of LM Eval Harness?

The latest version of the paper on arXiv mentions evaluation using LM Eval Harness. However, I could not find where it is used in this repository.

Does the team plan to release the evaluation code that uses LM Eval Harness?