Open
Labels
enhancement (New feature or request)
Description
When generating queries in the dataset_generator module, we set a parameter called max_query_terms that controls the length (in words) of the generated queries. This value is then used to build the prompt that is sent to the LLM. Because the same length setting is applied to every query, the queries in the evaluation dataset end up less diverse than real user queries.
Idea
Add a "mixture of queries" feature: instead of fixing the number of words for every query, randomly sample the limit for each query from {1, ..., n, None}. Here None means we place no restriction on the number of words, which yields long, detailed natural-language queries.
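A minimal sketch of what the per-query sampling could look like. The function names `sample_max_query_terms` and `build_length_instruction`, the uniform sampling, and the prompt wording are all assumptions made for illustration; they are not taken from the current dataset_generator code.

```python
import random
from typing import Optional


def sample_max_query_terms(n: int, rng: random.Random) -> Optional[int]:
    """Draw a length limit uniformly from {1, ..., n, None}.

    None means "no restriction on the number of words".
    Uniform sampling is an assumption; weights could be added later.
    """
    choices: list[Optional[int]] = list(range(1, n + 1)) + [None]
    return rng.choice(choices)


def build_length_instruction(max_query_terms: Optional[int]) -> str:
    """Return the prompt fragment describing the length constraint.

    Hypothetical wording: the real prompt template may differ.
    """
    if max_query_terms is None:
        return "Write a long, detailed natural language query."
    return f"Write a query of at most {max_query_terms} words."


if __name__ == "__main__":
    rng = random.Random(42)  # seeded for reproducible datasets
    for _ in range(5):
        limit = sample_max_query_terms(4, rng)
        print(limit, "->", build_length_instruction(limit))
```

Passing a seeded `random.Random` instance keeps the generated dataset reproducible across runs, which matters if the evaluation set is regenerated.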