Currently we do two steps:
- Retrieve the documents most relevant to the query
- Rerank and keep the top 3
Instead:
- Retrieve the documents most relevant to the query
- Rerank and keep the top 3
- Summarize with the embedded LLM into a single structured, compressed document
Or even:
- Retrieve the documents most relevant to the query
- Summarize with the embedded LLM into a single structured, compressed document (more accurate)
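Both variants can be sketched as one pipeline with an optional rerank stage. This is a minimal, self-contained sketch: `retrieve()`, `rerank()`, and `summarize()` are hypothetical toy stand-ins (keyword overlap and sentence dedup), not the real embedding search, reranker, or embedded LLM.

```python
def retrieve(query: str, corpus: list[str], k: int = 10) -> list[str]:
    # Toy retrieval: rank documents by keyword overlap with the query.
    q = set(query.lower().split())
    ranked = sorted(corpus, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

def rerank(query: str, docs: list[str], top: int = 3) -> list[str]:
    # Toy stand-in for a cross-encoder reranker: overlap ratio, keep top-n.
    q = set(query.lower().split())
    score = lambda d: len(q & set(d.lower().split())) / (len(d.split()) or 1)
    return sorted(docs, key=score, reverse=True)[:top]

def summarize(query: str, docs: list[str]) -> str:
    # Stand-in for the embedded LLM: merge the docs into one structured,
    # compressed document by deduplicating sentences.
    seen: set[str] = set()
    lines = [f"# Summary for: {query}"]
    for doc in docs:
        for sent in doc.split(". "):
            key = sent.strip().lower()
            if key and key not in seen:
                seen.add(key)
                lines.append(f"- {sent.strip()}")
    return "\n".join(lines)

def pipeline(query: str, corpus: list[str], use_rerank: bool = True) -> str:
    # use_rerank=True is the first variant (retrieve -> rerank -> summarize);
    # use_rerank=False is the second (retrieve -> summarize directly).
    docs = retrieve(query, corpus)
    if use_rerank:
        docs = rerank(query, docs)
    return summarize(query, docs)
```

The flag makes the two variants easy to A/B against each other, since everything downstream consumes the same single summarized document.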
By the way, this raises the following critical issue with agents (and indeed with self-chat as well):
