Feature and its Use Cases
Feature Description
Introduce a Question Quality Scoring & Ranking layer in the existing question generation pipeline to improve the overall quality of generated questions.
Currently, EduAid generates questions (MCQ, Boolean, Short Answer) and returns them after basic processing such as chunking, deduplication, and slicing based on max_questions. However, there is no mechanism to evaluate or rank the quality of generated questions. As a result, the system may return questions that are valid but not necessarily the most meaningful, diverse, or useful.
This feature aims to enhance the pipeline by scoring each generated question and ranking them, ensuring that the system returns the best-quality questions instead of simply the first N generated ones.
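The difference between slicing the first N generated questions and ranking by score can be sketched in a few lines of Python (the question labels and scores here are purely illustrative, not EduAid's API):

```python
def select_top_n(questions, scores, n):
    """Keep the n highest-scoring questions instead of the first n generated."""
    ranked = sorted(zip(questions, scores), key=lambda pair: pair[1], reverse=True)
    return [q for q, _ in ranked[:n]]

# Illustrative data: a first-N slice would return ["Q1", "Q2"],
# while ranking surfaces the two strongest questions instead.
questions = ["Q1", "Q2", "Q3", "Q4"]
scores = [0.42, 0.91, 0.10, 0.77]
print(select_top_n(questions, scores, 2))  # ['Q2', 'Q4']
```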
Use Case
- Users may receive questions that are too simple, repetitive in idea, or less informative
- Generated output quality may vary depending on input content
- In educational scenarios, selecting better-quality questions improves learning outcomes
- When multiple valid questions exist, the system should prioritize the most relevant and well-structured ones
Implementation / Proposed Approach & Benefits
- Leverage the existing QAEvaluator (already present in the codebase) to score question-answer pairs
- Rank generated questions by their quality scores and return the top-N highest-scoring questions instead of the first N
- Add an optional request parameter: { "use_scoring": true }
- Apply scoring only when enabled, to avoid unnecessary performance overhead
- Skip scoring when the number of questions is too small, and fall back to the original results if scoring fails
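The steps above can be sketched as a single selection function. The get_score method name, the threshold value, and the fallback behaviour are assumptions for illustration; the real QAEvaluator interface in the repository may differ:

```python
MIN_QUESTIONS_FOR_SCORING = 3  # assumed threshold: tiny batches skip scoring

def select_questions(qa_pairs, max_questions, evaluator=None, use_scoring=False):
    """Return up to max_questions (question, answer) pairs, ranked when enabled."""
    if not use_scoring or evaluator is None or len(qa_pairs) <= MIN_QUESTIONS_FOR_SCORING:
        return qa_pairs[:max_questions]  # original first-N behaviour
    try:
        # Score each pair, then keep the highest-scoring ones.
        scored = [(evaluator.get_score(q, a), (q, a)) for q, a in qa_pairs]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [pair for _, pair in scored[:max_questions]]
    except Exception:
        # Fallback: if scoring fails, return the unranked slice as before.
        return qa_pairs[:max_questions]
```

Because the scoring path is opt-in and wrapped in a try/except, the existing endpoints behave exactly as they do today unless the caller sets the flag.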
This will result in:
- Improved relevance, clarity, and usefulness of generated questions
- More consistent, higher-quality outputs across different inputs
- Better alignment with real-world educational use cases
- Enhanced system intelligence without modifying core generation logic
Additional Context
The repository already includes a QAEvaluator class capable of scoring and ranking question-answer pairs using a BERT-based model, but it is not currently integrated into the primary endpoints (/get_mcq, /get_boolq, /get_shortq). This feature focuses on activating and integrating that existing evaluation component into the main pipeline.
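At the endpoint level, the integration could look roughly like the handler below. The handle_get_mcq name, the generator callable, and the evaluator's score method are hypothetical stand-ins for illustration and do not reflect EduAid's actual function names:

```python
def handle_get_mcq(payload, generate, evaluator):
    """Simulated endpoint logic; payload mirrors the request JSON body."""
    use_scoring = bool(payload.get("use_scoring", False))  # proposed optional flag
    max_questions = int(payload.get("max_questions", 5))
    qa_pairs = generate(payload.get("input_text", ""))     # existing generation step
    if use_scoring:
        # Rank by evaluator score so the highest-quality pairs come first.
        qa_pairs = sorted(qa_pairs, key=lambda qa: evaluator.score(*qa), reverse=True)
    return {"questions": qa_pairs[:max_questions]}
```

The same wiring would apply unchanged to /get_boolq and /get_shortq, since the scoring layer sits after generation and before the final slice.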
Code of Conduct
- I have joined the Discord server and will post updates there
- I have searched existing issues to avoid duplicates