GSoC 2026 Projects
AiiDA is a Python framework for managing computational science workflows, with roots in computational materials science. It helps researchers manage large numbers of simulations (10k, 100k, 1M, ...) and complex workflows involving many steps and multiple executables. At the same time, it records the provenance of the entire simulation pipeline with the aim of making it fully reproducible.
AiiDA is used in research projects at universities, research institutes and companies (see SciPy 2020 talk, SciPy 2022 talk, publications, and testimonials).
- Help accelerate the transition to open (computational) science
- Help fix the reproducibility crisis. Computational science is a good place to start.
- Work with a team of computational scientists (mostly physics backgrounds) who are passionate about both science and coding.
- We have an active Discourse community & biweekly developer meetings.
A background in materials science is not strictly required. However, communication will be easier if applicants have some domain knowledge and are familiar with the domain language.
Say hi on our GSoC 2026 topic on Discourse.
Complexity: Advanced
Duration: 350 hours
This project aims to explore the use of multi-agent AI systems to assist users in creating, running, and analyzing AiiDA workflows through natural language interaction. Rather than relying on large flagship models, the project aims to develop an architecture with multiple smaller, specialized agents that can run on consumer hardware without requiring expensive API quotas. These agents would collaborate to handle complex tasks, with each focusing on specific domains such as workflow execution, simulation configuration, or results analysis.
The key architectural principles focus on all or a subset of the following:
- Multiple specialized agents rather than one general-purpose model - each agent focuses on specific domains or tasks
- Integration with custom-developed tools exposed via the Model Context Protocol (MCP; or similar protocols), to provide agents with access to AiiDA functionalities in a secure way
- Retrieval Augmented Generation (RAG) to retrieve context from different specialized knowledge bases
- Agent collaboration through communication protocols to handle complex multi-step tasks
- Local models where possible, making the system more accessible and cost-effective
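As a rough illustration of the first principle above, the sketch below routes a user prompt to one of several specialized agents. Everything in it is an assumption made for illustration: the `Agent` class, the agent names, the keyword lists, and the stub handlers (which stand in for calls to small local models) are hypothetical and not part of AiiDA or of any agent framework.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Agent:
    """A hypothetical specialized agent: a name, trigger keywords, and a handler."""
    name: str
    keywords: Sequence[str]
    handle: Callable[[str], str]


def make_stub(name: str) -> Callable[[str], str]:
    # Placeholder for a call to a small local model plus its tools.
    def handle(prompt: str) -> str:
        return f"[{name}] would handle: {prompt}"
    return handle


# Illustrative agents and keywords only; a real system would likely
# use a classifier or a small model for routing instead.
AGENTS = [
    Agent("execution", ("run", "submit", "launch"), make_stub("execution")),
    Agent("configuration", ("set up", "configure", "input"), make_stub("configuration")),
    Agent("analysis", ("why", "fail", "find", "analy"), make_stub("analysis")),
]


def route(prompt: str) -> Agent:
    """Pick the agent with the most keyword matches in the prompt."""
    text = prompt.lower()
    scores = [(sum(kw in text for kw in agent.keywords), agent) for agent in AGENTS]
    best_score, best_agent = max(scores, key=lambda pair: pair[0])
    if best_score == 0:
        raise ValueError(f"no agent can handle: {prompt!r}")
    return best_agent


if __name__ == "__main__":
    agent = route("why did my calculation fail?")
    print(agent.handle("why did my calculation fail?"))
```

Keyword matching is only a stand-in for the routing step; the point is that each agent stays small and focused, and the router is the single place that decides who handles a request.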
The multi-agent system should address various aspects of the AiiDA workflow lifecycle.
For workflow execution, agents could help users set up and run simulations by assisting with configuring input parameters for density functional theory (DFT) simulations, customizing or constructing AiiDA workflows, and preparing inputs for codes like Quantum ESPRESSO or VASP. Users might interact through prompts like "run a geometry optimization of this structure" or "set up a bandstructure calculation."
For results analysis and diagnostics, specialized agents could explore simulation results, identify patterns, and diagnose failures. A Quantum ESPRESSO diagnostic agent could help troubleshoot convergence issues, while an analysis agent could query provenance and extract insights from large datasets. Users could ask questions like "why did my calculation fail?" or "find all structures with a bandgap > 2 eV."
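To make the analysis example more concrete, here is a minimal sketch of how a prompt such as "find all structures with a bandgap > 2 eV" might be turned into a structured filter. The output follows the nested-dictionary style that AiiDA's `QueryBuilder` accepts for filters, but the attribute key `attributes.band_gap` is a hypothetical placeholder (where a band gap is actually stored depends on the plugin and workflow), and the regular expression is a toy stand-in for what a language model would extract.

```python
import re
from typing import Optional

# Hypothetical mapping from phrases in the prompt to node attribute keys.
PROPERTY_KEYS = {
    "bandgap": "attributes.band_gap",
    "band gap": "attributes.band_gap",
}

# Matches e.g. "bandgap > 2 eV" or "band gap <= 0.5 eV".
PATTERN = re.compile(
    r"(bandgap|band gap)\s*(>=|<=|>|<|=)\s*([0-9]*\.?[0-9]+)\s*eV",
    re.IGNORECASE,
)


def prompt_to_filter(prompt: str) -> Optional[dict]:
    """Return a filter dictionary for the prompt, or None if nothing matches."""
    match = PATTERN.search(prompt)
    if match is None:
        return None
    phrase, operator, value = match.groups()
    # Use '==' for equality, as in common query-filter conventions.
    op = "==" if operator == "=" else operator
    return {PROPERTY_KEYS[phrase.lower()]: {op: float(value)}}


if __name__ == "__main__":
    print(prompt_to_filter("find all structures with a bandgap > 2 eV"))
```

Having the agent emit a small, validated filter like this, rather than free-form code, is one possible way to keep tool access constrained; whether that fits the project is a design decision for the student and mentors.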
The project scope and emphasis on different aspects will be determined in discussion with mentors based on the student's background and interests.
By the end of this project, the student is expected to deliver:
- A functional multi-agent system architecture that demonstrates feasibility of the approach
- Implementation of specialized agents with clearly defined responsibilities and domains
- Integration of the Model Context Protocol (MCP; or similar protocols) to provide agents with access to AiiDA's Python API, tools, and system state
- RAG-based knowledge retrieval system connecting agents to documentation and specialized knowledge bases
- Agent-to-agent communication protocol enabling collaboration on multi-step tasks
- Natural language interface for user interaction (command-line interface as minimum requirement)
- Technical documentation including architecture overview, design patterns, and extension guidelines
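For orientation, the retrieval part of the RAG deliverable listed above can be prototyped in a few lines. The sketch below is an assumption-laden toy: it ranks documents by bag-of-words cosine similarity instead of learned embeddings and a vector database, and the entries in `DOCS` are placeholder snippets, not excerpts from the AiiDA documentation.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Return the top_k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(q, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, documents: List[str]) -> str:
    """Prepend the retrieved context to the question before calling a model."""
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Placeholder knowledge base for illustration only.
DOCS = [
    "Convergence issues in SCF cycles can often be addressed by adjusting mixing parameters.",
    "The QueryBuilder lets you query the provenance graph for nodes and their attributes.",
    "Workflows are composed of calculation steps that run on remote computers.",
]

if __name__ == "__main__":
    print(build_prompt("how do I query provenance?", DOCS))
```

In a fuller prototype, `cosine` over token counts would presumably be replaced by an embedding model and a vector store, with one knowledge base per specialized agent.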
Essential:
- Strong proficiency in Python and object-oriented programming
- Strong software engineering skills including testing, documentation, and version control (Git)
- Practical knowledge of working with language models (e.g., Llama, Mistral, GPT4All)
- Ability to work independently and communicate effectively with mentors
Desirable:
- Experience with workflow management systems and scientific computing
- Knowledge of the MCP (Model Context Protocol) or similar tool-use frameworks
- Familiarity with RAG (Retrieval Augmented Generation) techniques and vector databases
- Understanding of agent architectures and multi-agent systems design
- Background in or strong interest in computational materials science and, ideally, DFT (Density Functional Theory) calculations
- Julian Geiger @GeigerJ2
- Edan Bainglass @edan-bainglass
- Giovanni Pizzi @giovannipizzi
Interested students should submit:
- CV highlighting relevant experience with Python development, AI/ML, and any materials science background (if applicable)
- Motivational letter (max. 1 page) explaining your interest in this project and relevant background
To help us assess suitability, we would also appreciate a preliminary architectural proposal (max. 1 page) outlining your proposed approach: system architecture, choice of technologies (models, frameworks, databases, etc.), and focus area with justification.
This is an exploratory project at the exciting frontier of AI and computational science. While the AiiDA team does not have extensive experience with multi-agent systems and MCP, we're eager to collaborate with students who bring expertise in these areas. We'll provide deep knowledge of the use of Python in computational science, (AiiDA) workflows, computational materials science, DFT, and the real challenges users face in practice. The goal is to explore what's possible and identify promising approaches, rather than deliver a production-ready system.
- The timeline
If you are already considering contributing to AiiDA: we welcome external contributions, but given our limited reviewing resources, we focus on impactful changes that address actual user needs. Before submitting PRs, please consider: Was this change triggered by an actual issue with AiiDA? We welcome bug fixes, security improvements, and solutions to real problems. For significant new features or improvements, please discuss with the team first to ensure alignment with project priorities.
We recognize the potential of modern AI tooling for exploratory purposes, prototyping, and learning. We also acknowledge its widespread adoption in modern software development.
However, we discourage copy-pasting LLM-generated code and submitting low-effort automated contributions. You should always carefully review code and ensure that technical requirements are met (e.g., code style/quality is consistent, CI/CD pipelines pass, tests pass) before requesting review from core maintainers. Use of AI tooling to generate or edit code should always be openly acknowledged.
We recommend taking Google's Guidance for GSoC Contributors using AI tooling in GSoC 2026 to heart.