Follow these steps to set up and run the project locally:
git clone https://github.com/Sakiruto/llm-diff-tracker.git
cd llm-diff-tracker
Make sure you have all the required dependencies. It's recommended to create a virtual environment first.
# Optional: Create and activate a virtual environment
python3 -m venv venv
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
The .env.example file in the root directory shows the expected structure for the API keys.
python -m prd_to_issues.main
Make sure the .env file is properly configured before running the module.
- User intent is clearly structured using structured outputs and keyword-based detection with Pydantic.
- User feedback is kept consistent by structuring leveled prompts.
- Instead of asking the LLM to produce a .md file, the JSON response is obtained and structured into a .md file as needed.
- ID-based segregation of tasks, user stories, and epics.
- Token-efficient prompting with minimal hallucination.
- Several additional improvements in prd_sample and main, which you'll discover as you review the code and compare the changes.
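To illustrate the JSON-to-.md and ID-based points in the list above, here is a minimal sketch using only the standard library. The schema (`epics`, `user_stories`, `tasks`, `id`, `title`) and the function name `render_markdown` are assumptions made for this example; the actual structure used in prd_to_issues may differ.

```python
import json


def render_markdown(prd: dict) -> str:
    """Render a nested epic / user story / task dict as markdown text.

    The key names used here are hypothetical, chosen for illustration.
    """
    lines = []
    for epic in prd.get("epics", []):
        lines.append(f"# {epic['id']}: {epic['title']}")
        for story in epic.get("user_stories", []):
            lines.append(f"## {story['id']}: {story['title']}")
            for task in story.get("tasks", []):
                lines.append(f"- [{task['id']}] {task['title']}")
    return "\n".join(lines)


# Stand-in for a structured JSON response from the LLM (made-up data).
sample = json.loads("""
{
  "epics": [
    {
      "id": "E1",
      "title": "Checkout",
      "user_stories": [
        {
          "id": "US1",
          "title": "Pay by card",
          "tasks": [
            {"id": "T1", "title": "Add payment form"},
            {"id": "T2", "title": "Validate card number"}
          ]
        }
      ]
    }
  ]
}
""")

print(render_markdown(sample))
```

Because the markdown is produced by ordinary code rather than by the model, the same JSON always renders the same way, and every heading or bullet carries the ID it came from.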
Some major issues are:
- Sometimes, it returns the entire original structure with minor updates (full response).
- Sometimes, it returns only the updated part (partial response), e.g., just a task or a user story.
If we try to use a code diff algorithm like Myers' algorithm (along with Bentley-McIlroy and Patience), the major problem arises when merging the partial responses.
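The problem can be seen with a small experiment. The sketch below uses Python's standard `difflib`, which has its own matching heuristic and is not an implementation of Myers, Bentley-McIlroy, or Patience; it stands in here only to show how a line-based diff behaves. The sample data and key names are made up for illustration.

```python
import difflib
import json

# Hypothetical first response (full structure).
full = {
    "epics": [
        {
            "id": "E1",
            "title": "Checkout",
            "user_stories": [
                {
                    "id": "US1",
                    "title": "Pay by card",
                    "tasks": [
                        {"id": "T1", "title": "Add payment form"},
                        {"id": "T2", "title": "Validate card number"},
                    ],
                }
            ],
        }
    ]
}

# Hypothetical partial response after user feedback: only one task comes back.
partial = {"id": "T2", "title": "Add retry logic"}

full_lines = json.dumps(full, indent=2).splitlines()
partial_lines = json.dumps(partial, indent=2).splitlines()

# n=0 drops context lines so only changed lines appear in the diff.
diff = list(difflib.unified_diff(full_lines, partial_lines, lineterm="", n=0))

# Skip the "---" / "+++" file header lines when counting.
removed = [line for line in diff if line.startswith("-") and not line.startswith("---")]
added = [line for line in diff if line.startswith("+") and not line.startswith("+++")]

print(f"lines in full response: {len(full_lines)}")
print(f"lines marked removed:   {len(removed)}")
print(f"lines marked added:     {len(added)}")
```

The diff treats almost the whole original as deleted, because a text diff cannot know that the partial response is meant to update one item inside the larger structure rather than replace it.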
Code diff algorithms are not very fruitful for partial responses, and locating the updated item in the initial (first) response after user feedback, in order to track what changed, is somewhat complicated. I found useful libraries like DeepDiff that can help search JSON responses, but it is still unclear how far that search can go.
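One possible direction, given that tasks, user stories, and epics already carry IDs, is to locate the item by ID and merge the partial response into a copy of the first response. The following is a rough sketch under that assumption, not the project's implementation; `index_by_id`, `merge_partial`, and the sample schema are hypothetical names introduced for this example.

```python
import copy


def index_by_id(node, index=None):
    """Walk nested dicts/lists and map each 'id' value to the dict holding it."""
    if index is None:
        index = {}
    if isinstance(node, dict):
        if "id" in node:
            index[node["id"]] = node
        for value in node.values():
            index_by_id(value, index)
    elif isinstance(node, list):
        for item in node:
            index_by_id(item, index)
    return index


def merge_partial(full, partial):
    """Merge a partial response into a copy of the full one, matching by ID.

    Only scalar fields are overwritten; nested children are matched through
    their own IDs. IDs not present in `full` are returned as unmatched.
    """
    merged = copy.deepcopy(full)
    targets = index_by_id(merged)
    unmatched = []
    for item_id, update in index_by_id(partial).items():
        target = targets.get(item_id)
        if target is None:
            unmatched.append(item_id)
            continue
        for key, value in update.items():
            if not isinstance(value, (dict, list)):
                target[key] = value
    return merged, unmatched


# Made-up data for illustration.
full = {
    "epics": [
        {
            "id": "E1",
            "title": "Checkout",
            "user_stories": [
                {
                    "id": "US1",
                    "title": "Pay by card",
                    "tasks": [
                        {"id": "T1", "title": "Add payment form"},
                        {"id": "T2", "title": "Validate card number"},
                    ],
                }
            ],
        }
    ]
}
partial = {"id": "T2", "title": "Add retry logic"}

merged, unmatched = merge_partial(full, partial)
print(merged["epics"][0]["user_stories"][0]["tasks"][1])
print("unmatched IDs:", unmatched)
```

This only works when the model keeps IDs stable between responses, and it does not handle newly created or deleted items beyond reporting unknown IDs, so it covers just part of the problem described above.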
I would encourage the person working on this problem statement to thoroughly go through:
- OpenAI's core concepts documentation (important topics: structured response, text generation and prompting, function calling).
- Code diff algorithms. I have some good resources to get started with:
For a detailed understanding of the problem statement, you can visit the Stack Overflow question I posted (still waiting for an answer, unfortunately!): Link