Langium AI #2008
-
I hope you’re doing well. I’m currently experimenting with the Langium AI tool, and my goal is to evaluate which large language model produces responses that correctly follow the Langium grammar I have defined. However, I’ve encountered a challenge: the example workspace available in the Langium AI repository focuses on grammar generation rather than on validating or parsing LLM-generated output against an existing grammar.

Could you please let me know if there are any existing examples or recommended approaches for this use case? Any pointers, code snippets, or references to relevant resources would be incredibly helpful. Thank you so much for your time and support!
-
Hi @RuanVanRooyenDSA, thanks for the question! It sounds like you're talking about the eval-langium example, which does focus on evaluating generated grammars. However, the same approach works for evaluating program output that corresponds to your own DSL as well; it just happens that in that example the program being parsed is a generated Langium grammar, since Langium parses itself.

For any other Langium DSL, you can extend the base evaluator class with one that invokes the services for your DSL instead, building a document and running validation on it (there's a rough sketch of what that could look like at the end of this reply). The base evaluator is very thin by design, so you can decide how you want to do that for your language. For reference, you can see how the langium-evaluator extends it to invoke the services for parsing and validating Langium grammars, and then retrieves those diagnostics for checking later. You can take a very similar approach for your own language using your own service set, which can be retrieved like so (as an example):

```typescript
const myDSLServices = createMyDSLServices(NodeFileSystem);
const myDSLEvaluator = new MyDSLEvaluator(myDSLServices.mydsl);
// invoke your custom evaluator like in the evaluator-langium example
```

This works nicely if your project is a mono-repo, but you can also link in your DSL project to get access to the services.

Generally, once you have this set up, we tend to see users consume that information as part of a larger Python workflow (a fine-tuning run, for example). This can be done by piping the output through a CLI, or by running a small server (like express) to make it available to other local systems (also sketched below); it's somewhat dependent on your preference and workflow constraints.

Since the evaluator is pretty lightweight, you can also compute additional metrics if needed and modify the evaluator result type to suit your requirements. This comes up fairly often, since there are additional metrics that others want to add which are sourced by some other means (a quality assessment, a code complexity heuristic, or some other grading, for example).

Hope that helps!
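To make the "extend the evaluator" step a bit more concrete, here's a rough, untested sketch written directly against plain Langium services. The class name, `createMyDSLServices`, the `mydsl` service key, and the `.mydsl` file extension are placeholders for whatever your generated DSL project actually provides, and in practice you'd extend the base evaluator class from Langium AI and match its real interface (check the repo for the exact signatures). `LangiumCoreServices` is the Langium 3.x type name; older versions call it `LangiumServices`.

```typescript
import { URI, type LangiumCoreServices } from 'langium';

// Sketch only: shows the "build a document & validate it" part that a custom
// evaluator for your DSL would perform on each LLM response.
export class MyDSLEvaluator {
    private docCounter = 0;

    // Pass in the language services for your DSL,
    // e.g. createMyDSLServices(NodeFileSystem).mydsl as in the snippet above.
    constructor(private readonly services: LangiumCoreServices) {}

    async evaluate(llmResponse: string) {
        // Wrap the raw LLM output in an in-memory document ('.mydsl' is a placeholder extension).
        const uri = URI.parse(`memory:///llm-response-${this.docCounter++}.mydsl`);
        const document = await this.services.shared.workspace.LangiumDocumentFactory.fromString(llmResponse, uri);
        this.services.shared.workspace.LangiumDocuments.addDocument(document);

        // Build with validation enabled so lexer/parser errors, linking errors and
        // custom validations all end up in document.diagnostics.
        await this.services.shared.workspace.DocumentBuilder.build([document], { validation: true });

        return { diagnostics: document.diagnostics ?? [] };
    }
}
```

Calling `evaluate` on each LLM response then gives you the diagnostics to score against, e.g. treating a response as grammar-conformant only if it produces no error-severity (severity 1) diagnostics.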
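And if the small-server route fits your workflow better, a minimal express wiring around that evaluator could look roughly like this (again just a sketch; the import paths for your generated services and the evaluator above are placeholders):

```typescript
import express from 'express';
import { NodeFileSystem } from 'langium/node';
// Placeholders: the services module generated by the Langium CLI for your DSL,
// plus the evaluator class sketched above.
import { createMyDSLServices } from './my-dsl-module.js';
import { MyDSLEvaluator } from './my-dsl-evaluator.js';

const myDSLEvaluator = new MyDSLEvaluator(createMyDSLServices(NodeFileSystem).mydsl);

const app = express();
// Accept the raw LLM output as plain text in the request body.
app.use(express.text({ type: '*/*' }));

app.post('/evaluate', async (req, res) => {
    const { diagnostics } = await myDSLEvaluator.evaluate(req.body);
    res.json({
        // DiagnosticSeverity.Error === 1
        errorCount: diagnostics.filter(d => d.severity === 1).length,
        diagnostics
    });
});

app.listen(3000, () => console.log('DSL evaluator listening on http://localhost:3000'));
```

A Python benchmarking or fine-tuning script can then POST each candidate model output to `/evaluate` and compare error counts across models.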