
Responsible AI Transparency Documentation - Agent Lightning

OVERVIEW

Agent Lightning is a flexible and extensible framework that enables seamless agent optimization for any existing agent framework. Agent optimization encompasses data-driven techniques for customizing an agent to improve its performance, including but not limited to model fine-tuning, prompt tuning, and model selection. "Agent frameworks" here refers to popular, easy-to-use agent development frameworks such as the OpenAI Agents SDK, Microsoft AutoGen, and LangChain.

WHAT CAN AGENT LIGHTNING DO

Agent Lightning was developed to bridge the gap between agent workflow development and agent optimization, empowering developers to go beyond static, pre-trained models and unlock the full potential of adaptive, learning-based agents. As a training framework, Agent Lightning can be used with any LLM.

INTENDED USES

Agent Lightning is best suited for agent researchers and developers, who can use it to easily fine-tune models within their existing agent frameworks and improve model performance on targeted scenarios.

OUT-OF-SCOPE USES

Agent Lightning is not well-suited for users who are not familiar with agent development and machine learning concepts.

We do not recommend using Agent Lightning in commercial or real-world applications without further testing and development. It is being released for research purposes.

Agent Lightning was not designed or evaluated for all possible downstream purposes. Developers should consider its inherent limitations as they select use cases, and evaluate and mitigate for accuracy, safety, and fairness concerns specific to each intended downstream use.

Agent Lightning should not be used in highly regulated domains where inaccurate outputs could suggest actions that lead to injury or negatively impact an individual's legal, financial, or life opportunities.

We do not recommend using Agent Lightning in the context of high-risk decision making (e.g. in law enforcement, legal, finance, or healthcare).

HOW TO GET STARTED

To begin using Agent Lightning, follow these steps:

  1. Install the dependencies, including Python, uv, PyTorch, FlashAttention, vLLM, and verl.
  2. Clone and install Agent Lightning.
  3. Convert your dataset into a Parquet file with three columns: a data ID, an input, and an expected output. Each row holds one example.
  4. Run your agent, which is developed by you.
  5. Launch the training process via "bash train.sh".

EVALUATION

Agent Lightning was evaluated on its ability to correctly complete three example tasks:

  1. Math. The model answers math questions and may invoke a calculator tool while answering.
  2. Text2SQL. Given a question about a database, the model must generate a SQL query that retrieves the information needed to answer the question.
  3. Retrieval-Augmented Generation (RAG). Given a question that requires information from Wikipedia, the model must generate search queries to retrieve related documents and then answer the question based on them.
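Data-driven agent optimization of this kind needs a per-example reward signal. As a minimal sketch of what such a signal might look like for the Math task, here is a hypothetical exact-match reward; the function name, signature, and normalization rules are illustrative assumptions, not Agent Lightning's actual reward interface:

```python
def math_reward(predicted: str, expected: str) -> float:
    """Exact-match reward: 1.0 if the model's final answer matches the
    expected answer after light normalization, else 0.0. Illustrative
    sketch only -- not Agent Lightning's real reward API."""
    def normalize(s: str) -> str:
        # Trim whitespace, lowercase, and drop a trailing period so that
        # " 408. " and "408" compare equal.
        return s.strip().lower().rstrip(".")
    return 1.0 if normalize(predicted) == normalize(expected) else 0.0

print(math_reward("408", "408"))     # 1.0
print(math_reward(" 408. ", "408"))  # 1.0 after normalization
print(math_reward("407", "408"))     # 0.0
```

Tasks like Text2SQL or RAG typically need richer rewards (e.g., execution accuracy against the database, or answer F1 against a reference), so a simple exact match is only a starting point.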

EVALUATION METHODS AND RESULTS

For detailed evaluation methods and results, please refer to the latest version of our technical report.

LIMITATIONS

Agent Lightning was developed for research and experimental purposes. Further testing and validation are needed before considering its application in commercial or real-world scenarios.

Agent Lightning was designed and tested using the English language. Performance in other languages may vary and should be assessed by someone who is both an expert in the expected outputs and a native speaker of that language.

Outputs generated by AI may include factual errors, fabrication, or speculation. Users are responsible for assessing the accuracy of generated content. All decisions leveraging outputs of the system should be made with human oversight and not be based solely on system outputs. Agent Lightning inherits any biases, errors, or omissions produced by its base model. Developers are advised to choose an appropriate base LLM/MLLM carefully, depending on the intended use case. We use demo cases to show the effectiveness of our training framework; refer to those demos to understand the capabilities and limitations of the resulting models.

BEST PRACTICES

Better performance can be achieved by following the instructions in the How to Get Started section above.

We strongly encourage users to use LLMs/MLLMs that support robust Responsible AI mitigations, such as Azure OpenAI (AOAI) services. Such services continually update their safety and RAI mitigations with the latest industry standards for responsible use; refer to AOAI's documentation for best practices when employing foundation models in scripts and applications.

Users are responsible for sourcing their datasets legally and ethically. This could include securing appropriate rights, ensuring consent for use of audio/images, and/or the anonymization of data prior to use in research.

Users are reminded to be mindful of data privacy concerns and are encouraged to review the privacy policies associated with any models and data storage solutions interfacing with Agent Lightning.

It is the user’s responsibility to ensure that the use of Agent Lightning complies with relevant data protection regulations and organizational guidelines.

LICENSE

Agent Lightning is released under the MIT license.

CONTACT

We welcome feedback and collaboration from our audience. If you have suggestions, questions, or observe unexpected/offensive behavior in our technology, please contact us at agent-lightning@microsoft.com.

If the team receives reports of undesired behavior or identifies issues independently, we will update this repository with appropriate mitigations.


Last updated: September 6, 2025. Document version: 1.0.