Improve the benchmarks

The benchmarks should be improved to better represent real-world usage. Several dimensions could be varied independently to see how each one affects performance, such as:

* The number of predicates;
* The number of shared sub-expressions;
* The complexity of the predicates;
* The size of the events;
* The depth of the A-Tree.
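
One way to explore these criteria is to treat each as an axis of a parameter grid and run one benchmark per combination. The sketch below only enumerates the cases and builds a synthetic event. It does not call any A-Tree API. All names (`BenchmarkCase`, `parameter_grid`, `make_event`, the field names) and all axis values are hypothetical placeholders, not measured or recommended settings.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkCase:
    """One combination of the criteria listed above (hypothetical field names)."""
    predicate_count: int        # number of predicates
    shared_ratio: float         # fraction of sub-expressions that are shared
    clauses_per_predicate: int  # rough proxy for predicate complexity (assumption)
    event_size: int             # number of attributes in each event
    tree_depth: int             # target depth of the A-Tree


def parameter_grid(axes):
    """Yield one BenchmarkCase per combination of axis values."""
    names = list(axes)
    for values in itertools.product(*(axes[name] for name in names)):
        yield BenchmarkCase(**dict(zip(names, values)))


def make_event(case, rng):
    """Build a synthetic event with `case.event_size` integer attributes."""
    return {f"attr_{i}": rng.randint(0, 100) for i in range(case.event_size)}


# Placeholder values chosen only to illustrate the grid.
AXES = {
    "predicate_count": [100, 10_000],
    "shared_ratio": [0.0, 0.5],
    "clauses_per_predicate": [2, 8],
    "event_size": [10, 100],
    "tree_depth": [1, 4],
}

cases = list(parameter_grid(AXES))
rng = random.Random(0)  # fixed seed so runs are reproducible
sample_event = make_event(cases[0], rng)
```

A full grid grows multiplicatively with each axis, so in practice it may be preferable to vary one axis at a time around a fixed baseline.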