Describe the Feature
This feature request proposes integrating LangChain's PydanticOutputParser and pydantic models into ragas prompting.
Why is the feature important for you?
The current prompting of the evaluation metrics uses a somewhat vague definition of the answer format, giving only examples of the expected output. The LLM is then expected to deduce the format and generate an answer matching a schema that is only implicitly specified in each metric's implementation. Especially when using LLMs other than the default GPT-3.5, this leads to parsing errors, and the metrics cannot be calculated correctly.
For example:
- Claude and other models tend to return the binary verdict as a JSON number instead of the expected string: #752 (that issue is about testset generation, but I saw very similar failures during context precision/recall calculation) and #715; see the sketch after this list.
- Sometimes the "verdict" envelope is omitted from the response: #733
- Sometimes the response is embedded in a superfluous envelope: #668
- It seems different models have issues with the "Attributed" keys: #619
- ...and numerous other bugs might be related to the weak JSON parsing.
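As a minimal sketch of how a pydantic schema could absorb the number-vs-string verdict mismatch from the first bullet (the class and field names here are hypothetical illustrations, not the actual ragas schema):

```python
from pydantic import BaseModel, Field, field_validator


class StatementVerdict(BaseModel):
    """Hypothetical verdict schema, not the actual ragas prompt output."""
    statement: str = Field(description="the statement being judged")
    verdict: str = Field(description='"1" if attributable, "0" otherwise')

    @field_validator("verdict", mode="before")
    @classmethod
    def coerce_verdict(cls, value):
        # Accept the JSON-number form that Claude and other models return
        return str(value)


# Both response shapes now parse into the same object instead of failing:
StatementVerdict.model_validate({"statement": "...", "verdict": "1"})
StatementVerdict.model_validate({"statement": "...", "verdict": 1})
```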
Additional context
LangChain has a robust implementation of instructing the models to return a JSON response conforming to a specific schema. The output format can be specified with pydantic data classes, the expected JSON schema is injected into the prompt, and the response is automatically parsed by pydantic. Additionally, a retry mechanism can be included for even more robust JSON parsing.
Considering that Ragas already uses LangChain, implementing this feature would not add any new dependencies to the project.
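A rough sketch of what this could look like for a single verdict follows; the schema and prompt wording are illustrative assumptions rather than the actual ragas prompts, and depending on the installed LangChain version the model class may need to come from langchain_core.pydantic_v1 instead of pydantic:

```python
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import PromptTemplate
from pydantic import BaseModel, Field


class ContextPrecisionVerdict(BaseModel):
    """Hypothetical output schema for a context precision judgement."""
    reason: str = Field(description="reason for the verdict")
    verdict: str = Field(description='"1" if the context was useful, "0" otherwise')


parser = PydanticOutputParser(pydantic_object=ContextPrecisionVerdict)

# The JSON schema is rendered into the prompt via format_instructions,
# so the model no longer has to infer the format from examples alone.
prompt = PromptTemplate(
    template=(
        "Verify if the context was useful in arriving at the given answer.\n"
        "{format_instructions}\n"
        "question: {question}\nanswer: {answer}\ncontext: {context}\n"
    ),
    input_variables=["question", "answer", "context"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

# chain = prompt | llm | parser  # the parser validates the response against the schema
# For extra robustness, LangChain's RetryOutputParser can wrap the parser
# and re-ask the LLM when validation fails.
```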