Feature and its Use Cases
Feature Description
Introduce a Question Quality Scoring & Ranking layer in the existing question generation pipeline to improve the overall quality of generated questions.
Currently, EduAid generates questions (MCQ, Boolean, Short Answer) and returns them after basic processing such as chunking, deduplication, and slicing based on max_questions. However, there is no mechanism to evaluate or rank the quality of generated questions. As a result, the system may return questions that are valid but not necessarily the most meaningful, diverse, or useful.
This feature aims to enhance the pipeline by scoring each generated question and ranking them, ensuring that the system returns the best-quality questions instead of simply the first N generated ones.
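The difference between slicing the first N generated questions and ranking by score can be sketched in a few lines of Python (the question labels and scores here are purely illustrative, not EduAid's API):

```python
def select_top_n(questions, scores, n):
    """Keep the n highest-scoring questions instead of the first n generated."""
    ranked = sorted(zip(questions, scores), key=lambda pair: pair[1], reverse=True)
    return [q for q, _ in ranked[:n]]

# Illustrative data: a first-N slice would return ["Q1", "Q2"],
# while ranking surfaces the two strongest questions instead.
questions = ["Q1", "Q2", "Q3", "Q4"]
scores = [0.42, 0.91, 0.10, 0.77]
print(select_top_n(questions, scores, 2))  # ['Q2', 'Q4']
```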
Use Case
- Users may receive questions that are too simple, repetitive in idea, or less informative
- Generated output quality may vary depending on input content
- In educational scenarios, selecting better-quality questions improves learning outcomes
- When multiple valid questions exist, the system should prioritize the most relevant and well-structured ones
Implementation / Proposed Approach & Benefits
- Leverage the existing QAEvaluator (already present in the codebase) to score question-answer pairs
- Rank generated questions by their quality scores and return the top-N highest-scoring questions instead of the first N
- Add an optional request parameter: { "use_scoring": true }
- Apply scoring only when enabled, to avoid unnecessary performance overhead
- Skip scoring when the number of questions is too small, and fall back to the original results if scoring fails
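The steps above can be sketched as a single selection function. The get_score method name, the threshold value, and the fallback behaviour are assumptions for illustration; the real QAEvaluator interface in the repository may differ:

```python
MIN_QUESTIONS_FOR_SCORING = 3  # assumed threshold: tiny batches skip scoring

def select_questions(qa_pairs, max_questions, evaluator=None, use_scoring=False):
    """Return up to max_questions (question, answer) pairs, ranked when enabled."""
    if not use_scoring or evaluator is None or len(qa_pairs) <= MIN_QUESTIONS_FOR_SCORING:
        return qa_pairs[:max_questions]  # original first-N behaviour
    try:
        # Score each pair, then keep the highest-scoring ones.
        scored = [(evaluator.get_score(q, a), (q, a)) for q, a in qa_pairs]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [pair for _, pair in scored[:max_questions]]
    except Exception:
        # Fallback: if scoring fails, return the unranked slice as before.
        return qa_pairs[:max_questions]
```

Because the scoring path is opt-in and wrapped in a try/except, the existing endpoints behave exactly as they do today unless the caller sets the flag.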
This will result in:
- Improved relevance, clarity, and usefulness of generated questions
- More consistent, higher-quality outputs across different inputs
- Better alignment with real-world educational use cases
- Enhanced system intelligence without modifying core generation logic
Additional Context
The repository already includes a QAEvaluator class capable of scoring and ranking question-answer pairs using a BERT-based model, but it is not currently integrated into the primary endpoints (/get_mcq, /get_boolq, /get_shortq). This feature focuses on activating and integrating that existing evaluation component into the main pipeline.
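At the endpoint level, the integration could look roughly like the handler below. The handle_get_mcq name, the generator callable, and the evaluator's score method are hypothetical stand-ins for illustration and do not reflect EduAid's actual function names:

```python
def handle_get_mcq(payload, generate, evaluator):
    """Simulated endpoint logic; payload mirrors the request JSON body."""
    use_scoring = bool(payload.get("use_scoring", False))  # proposed optional flag
    max_questions = int(payload.get("max_questions", 5))
    qa_pairs = generate(payload.get("input_text", ""))     # existing generation step
    if use_scoring:
        # Rank by evaluator score so the highest-quality pairs come first.
        qa_pairs = sorted(qa_pairs, key=lambda qa: evaluator.score(*qa), reverse=True)
    return {"questions": qa_pairs[:max_questions]}
```

The same wiring would apply unchanged to /get_boolq and /get_shortq, since the scoring layer sits after generation and before the final slice.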
Code of Conduct
- I have joined the Discord server and will post updates there
- I have searched existing issues to avoid duplicates