diff --git a/CommandRetrieval/README.md b/CommandRetrieval/README.md index 41ea498..9629f61 100644 --- a/CommandRetrieval/README.md +++ b/CommandRetrieval/README.md @@ -1,7 +1,6 @@ - # Git Commands Retrieval System -This appraoch implements a retrieval system for commonly used Git commands using vector embeddings and similarity search. It consists of two main components: a vector store creation script and a query interface. +This approach implements a retrieval system for commonly used Git commands using vector embeddings and similarity search. It consists of two main components: a vector store creation script and a query interface. ## Components @@ -12,8 +11,7 @@ This script creates a vector store from a JSON file containing Git commands. #### Features: - Loads Git commands from a JSON file - Uses HuggingFace embeddings (all-MiniLM-L6-v2 model) - Sentence Transformers -- Creates a persistent a Chroma vector store - +- Creates a persistent Chroma vector store ### 2. Git Commands Search @@ -31,15 +29,11 @@ This script provides an interactive interface to query Git commands. ``` pip install langchain langchain_community langchain_huggingface chromadb sentence_transformers ``` - 2. Prepare your Git commands data in a JSON file named `GitCommands.json` in the `CommandRetrieval` directory. - 3. Run `VectorStore.py` to create the vector store: - ``` python VectorStore.py ``` - 4. Use `GitCommands.py` to query Git commands: ``` python GitCommands.py @@ -64,7 +58,7 @@ The `GitCommands.json` file should contain an array of objects with the followin "param_description": "The ID or reference of the second commit." } ] - } + }, ... ] ``` @@ -72,11 +66,8 @@ The `GitCommands.json` file should contain an array of objects with the followin ## How It Works 1. `VectorStore.py` loads the Git commands from the JSON file, creates embeddings for each command, and stores them in a Chroma vector store. - 2. `GitCommands.py` loads the pre-created vector store and waits for user input. - 3. 
When a user enters a query, the script performs a similarity search to find the most relevant Git commands. - 4. The retrieved commands are displayed with their descriptions and parameters. ## Customization @@ -84,3 +75,49 @@ The `GitCommands.json` file should contain an array of objects with the followin - To use a different embedding model, modify the `model_name` parameter in the `HuggingFaceEmbeddings` initialization. - Adjust the number of results returned by modifying the `k` parameter in the similarity search. +## Contributing + +We welcome contributions to improve and expand the Git Commands Retrieval System. Here are some ways you can contribute: + +### Enhancing the JSON Data + +1. Open the `GitCommands.json` file in the `CommandRetrieval` directory. +2. Add new Git commands or improve existing ones following the established format: + ```json + { + "command": "git command", + "description": "Detailed description of the command.", + "parameters": [ + { + "param": "parameter_name", + "param_description": "Description of the parameter." + } + ] + } + ``` +3. Ensure the JSON is valid after your changes. +4. Submit a pull request with your updates. + +### Improving the Code + +1. Fork the repository and create a new branch for your changes. +2. Make your improvements to either `VectorStore.py` or `GitCommands.py`. +3. Test your changes thoroughly. +4. Submit a pull request with a clear description of your improvements. + +### Adding New Features + +1. Open an issue to discuss the new feature you'd like to add. +2. Once approved, implement the feature in a new branch. +3. Update the README to reflect the new functionality. +4. Submit a pull request with your changes. + +### Reporting Issues + +If you encounter any bugs or have suggestions for improvements: + +1. Check if the issue already exists in the project's issue tracker. +2. If not, create a new issue with a clear title and description. +3. 
Include steps to reproduce the problem and any relevant error messages. + +By contributing, you help make this Git Commands Retrieval System more comprehensive and useful for everyone. Thank you for your support! diff --git a/GitAIAgent/GitAIAgent.py b/GitAIAgent/GitAIAgent.py index f21fa6d..88b87ea 100644 --- a/GitAIAgent/GitAIAgent.py +++ b/GitAIAgent/GitAIAgent.py @@ -11,6 +11,9 @@ # "YashJain/GitAI-Qwen2-0.5B-Instruct-MLX-v1" # ) + +##### + llm=ChatOpenAI(temperature=0,model="gpt-3.5-turbo") shell_tool=ShellTool() diff --git a/GitAIAgent/README.md b/GitAIAgent/README.md index aac5147..a2f1b3c 100644 --- a/GitAIAgent/README.md +++ b/GitAIAgent/README.md @@ -11,13 +11,13 @@ GitAI Agent is an AI-powered assistant that can execute Git commands through nat - OpenAI API key - Git installed and configured on your system -## SetUp -Install Dependencies +## Setup +1. Install Dependencies ``` pip install langchain openai ``` -Set up your OpenAI API key: +2. Set up your OpenAI API key: ``` export OPENAI_API_KEY=your_api_key_here ``` @@ -27,7 +27,25 @@ Set up your OpenAI API key: Run the GitAI agent script: ``` -python GitAI.py +python GitAIAgent.py ``` The agent will prompt you for a Git-related task. Enter your request in natural language, and the agent will interpret and execute the appropriate Git commands. + +## Contributing + +We welcome contributions to improve and expand the GitAI Agent. Here are some ways you can contribute: + +### Enhancing Security with Guardrails + +One of the key areas for contribution is implementing and improving guardrails for the shell tool. This involves adding safety checks and input validation to ensure that only safe Git commands are executed. Contributors are encouraged to think about potential security risks and propose solutions. 
+ +### Other Contribution Areas + +- Enhancing the agent's capabilities to handle more complex Git scenarios +- Improving error handling and user feedback +- Expanding the documentation with more examples and use cases +- Adding comprehensive test coverage +- Reporting issues and suggesting improvements + +When contributing, please ensure that your changes align with the project's goals of maintaining a balance between functionality and security. diff --git a/GitAI_LLM/GitAI_fn_calling/GitAI.py b/GitAI_LLM/GitAI_fn_calling/GitAI.py deleted file mode 100644 index 10b9cae..0000000 --- a/GitAI_LLM/GitAI_fn_calling/GitAI.py +++ /dev/null @@ -1,396 +0,0 @@ -import os - - -class GitAI: - def install_git_with_brew(): - """Install Git with Homebrew on Mac OS""" - os.system("brew install git") - - def selfupdate_macports(): - """Install Git with MacPorts on Mac OS""" - os.system("sudo port selfupdate") - - def install_git_linux(): - """Install Git on Linux""" - os.system("sudo apt-get install git") - - def show_git_version(): - """Shows the current version of Git""" - os.system("git --version") - - def init_git_repo(): - """Initializes a new Git repository in the current directory""" - os.system("git init") - - def init_git_repo_in_directory(directory): - """Initializes a new Git repository in the specified directory - directory: The directory where the repository will be initialized - """ - os.system(f"git init {directory}") - - def clone_repository(repository_url): - """Clones a repository from a remote server to your local machine - repository_url: The URL of the remote repository - """ - os.system(f"git clone {repository_url}") - - def clone_specific_branch(branch_name, repository_url): - """Clones a specific branch from a repository - branch_name: The name of the branch to clone - repository_url: The URL of the remote repository - """ - os.system(f"git clone --branch {branch_name} {repository_url}") - - def add_file_to_staging(file): - """Adds a specific file to the 
staging area - file: The file to add to the staging area - """ - os.system(f"git add {file}") - - def add_all_files_to_staging(): - """Adds all modified and new files to the staging area""" - os.system("git add .") - - def show_git_status(): - """Shows the current state of the repository""" - os.system("git status") - - def show_ignored_files_status(): - """Displays ignored files in addition to the regular status output""" - os.system("git status --ignored") - - def show_git_diff(): - """Shows the changes between the working directory and the staging area (index)""" - os.system("git diff") - - def show_commit_diff(commit1, commit2): - """Displays the differences between two commits - commit1: The first commit - commit2: The second commit - """ - os.system(f"git diff {commit1} {commit2}") - - def show_staged_diff(): - """Displays the changes between the staging area and the last commit""" - os.system("git diff --staged") - - def show_head_diff(): - """Displays the difference between the current directory and the last commit""" - os.system("git diff HEAD") - - def commit_changes(): - """Creates a new commit with the changes in the staging area and opens the default text editor for adding a commit message""" - os.system("git commit") - - def commit_with_message(message): - """Creates a new commit with the changes in the staging area and specifies the commit message inline - message: The commit message - """ - os.system(f'git commit -m "{message}"') - - def commit_all_changes(): - """Commits all modified and deleted files in the repository without explicitly using git add to stage the changes""" - os.system("git commit -a") - - def add_git_note(): - """Creates a new note and associates it with an object (commit, tag, etc.)""" - os.system("git notes add") - - def restore_file(file): - """Restores the file in the working directory to its state in the last commit - file: The file to restore - """ - os.system(f"git restore {file}") - - def reset_to_commit(commit): - 
"""Moves the branch pointer to a specified commit, resetting the staging area and the working directory to match the specified commit - commit: The commit to reset to - """ - os.system(f"git reset {commit}") - - def soft_reset_to_commit(commit): - """Moves the branch pointer to a specified commit, preserving the changes in the staging area and the working directory - commit: The commit to reset to - """ - os.system(f"git reset --soft {commit}") - - def hard_reset_to_commit(commit): - """Moves the branch pointer to a specified commit, discarding all changes in the staging area and the working directory - commit: The commit to reset to - """ - os.system(f"git reset --hard {commit}") - - def remove_file(file): - """Removes a file from both the working directory and the repository, staging the deletion - file: The file to remove - """ - os.system(f"git rm {file}") - - def move_or_rename_file(source, destination): - """Moves or renames a file or directory in the repository - source: The current location of the file or directory - destination: The new location of the file or directory - """ - os.system(f"git mv {source} {destination}") - - def commit_new_feature(message): - """Creates a new commit with a specific message indicating a new feature - message: The commit message - """ - os.system(f'git commit -m "feat: {message}"') - - def commit_bug_fix(message): - """Creates a new commit with a specific message indicating a bug fix - message: The commit message - """ - os.system(f'git commit -m "fix: {message}"') - - def commit_chore(message): - """Creates a new commit with a specific message indicating routine tasks or maintenance - message: The commit message - """ - os.system(f'git commit -m "chore: {message}"') - - def commit_refactor(message): - """Creates a new commit with a specific message indicating code refactoring - message: The commit message - """ - os.system(f'git commit -m "refactor: {message}"') - - def commit_docs_change(message): - """Creates a new commit 
with a specific message indicating documentation changes - message: The commit message - """ - os.system(f'git commit -m "docs: {message}"') - - def commit_style_change(message): - """Creates a new commit with a specific message indicating styling and formatting changes - message: The commit message - """ - os.system(f'git commit -m "style: {message}"') - - def commit_test_change(message): - """Creates a new commit with a specific message indicating test-related changes - message: The commit message - """ - os.system(f'git commit -m "test: {message}"') - - def commit_performance_change(message): - """Creates a new commit with a specific message indicating performance-related changes - message: The commit message - """ - os.system(f'git commit -m "perf: {message}"') - - def commit_ci_change(message): - """Creates a new commit with a specific message indicating CI system-related changes - message: The commit message - """ - os.system(f'git commit -m "ci: {message}"') - - def commit_build_change(message): - """Creates a new commit with a specific message indicating build process-related changes - message: The commit message - """ - os.system(f'git commit -m "build: {message}"') - - def commit_revert(message): - """Creates a new commit with a specific message indicating a revert of a previous commit - message: The commit message - """ - os.system(f'git commit -m "revert: {message}"') - - def list_branches(): - """Lists all branches in the repository""" - os.system("git branch") - - def create_branch(branch_name): - """Creates a new branch with the specified name - branch_name: The name of the new branch - """ - os.system(f"git branch {branch_name}") - - def delete_branch(branch_name): - """Deletes the specified branch - branch_name: The name of the branch to delete - """ - os.system(f"git branch -d {branch_name}") - - def list_all_branches(): - """Lists all local and remote branches""" - os.system("git branch -a") - - def list_remote_branches(): - """Lists all remote 
branches""" - os.system("git branch -r") - - def switch_branch(branch_name): - """Switches to the specified branch - branch_name: The name of the branch to switch to - """ - os.system(f"git checkout {branch_name}") - - def create_and_switch_branch(branch_name): - """Creates a new branch and switches to it - branch_name: The name of the new branch - """ - os.system(f"git checkout -b {branch_name}") - - def discard_changes(file): - """Discards changes made to the specified file and reverts it to the version in the last commit - file: The file to discard changes from - """ - os.system(f"git checkout -- {file}") - - def merge_branch(branch_name): - """Merges the specified branch into the current branch - branch_name: The name of the branch to merge - """ - os.system(f"git merge {branch_name}") - - def show_commit_history(): - """Displays the commit history of the current branch""" - os.system("git log") - - def show_branch_commit_history(branch_name): - """Displays the commit history of the specified branch - branch_name: The name of the branch - """ - os.system(f"git log {branch_name}") - - def show_file_commit_history(file): - """Displays the commit history of a file, including its renames - file: The file to show the commit history of - """ - os.system(f"git log --follow {file}") - - def show_all_commit_history(): - """Displays the commit history of all branches""" - os.system("git log --all") - - def stash_changes(): - """Stashes the changes in the working directory""" - os.system("git stash") - - def list_stashes(): - """Lists all stashes in the repository""" - os.system("git stash list") - - def apply_and_remove_stash(): - """Applies and removes the most recent stash""" - os.system("git stash pop") - - def drop_stash(): - """Removes the most recent stash""" - os.system("git stash drop") - - def list_tags(): - """Lists all tags in the repository""" - os.system("git tag") - - def create_tag(tag_name): - """Creates a lightweight tag at the current commit - tag_name: 
The name of the tag - """ - os.system(f"git tag {tag_name}") - - def create_tag_at_commit(tag_name, commit): - """Creates a lightweight tag at the specified commit - tag_name: The name of the tag - commit: The commit to tag - """ - os.system(f"git tag {tag_name} {commit}") - - def create_annotated_tag(tag_name, message): - """Creates an annotated tag at the current commit with a custom message - tag_name: The name of the tag - message: The custom message - """ - os.system(f'git tag -a {tag_name} -m "{message}"') - - def fetch_changes(): - """Retrieves changes from a remote repository""" - os.system("git fetch") - - def fetch_changes_from_remote(remote): - """Retrieves changes from the specified remote repository - remote: The name of the remote repository - """ - os.system(f"git fetch {remote}") - - def prune_fetch(): - """Removes any remote-tracking branches that no longer exist on the remote repository""" - os.system("git fetch --prune") - - def pull_changes(): - """Fetches changes from the remote repository and merges them into the current branch""" - os.system("git pull") - - def pull_changes_from_remote(remote): - """Fetches changes from the specified remote repository and merges them into the current branch - remote: The name of the remote repository - """ - os.system(f"git pull {remote}") - - def rebase_pull(): - """Fetches changes from the remote repository and rebases the current branch onto the updated branch""" - os.system("git pull --rebase") - - def push_changes(): - """Pushes local commits to the remote repository""" - os.system("git push") - - def push_changes_to_remote(remote): - """Pushes local commits to the specified remote repository - remote: The name of the remote repository - """ - os.system(f"git push {remote}") - - def push_changes_to_branch(remote, branch): - """Pushes local commits to the specified branch of the remote repository - remote: The name of the remote repository - branch: The name of the branch - """ - os.system(f"git push 
{remote} {branch}") - - def push_all_branches(): - """Pushes all branches to the remote repository""" - os.system("git push --all") - - def list_remotes(): - """Lists all remote repositories""" - os.system("git remote") - - def add_remote(name, url): - """Adds a new remote repository with the specified name and URL - name: The name of the remote repository - url: The URL of the remote repository - """ - os.system(f"git remote add {name} {url}") - - def show_commit_details(): - """Shows the details of a specific commit""" - os.system("git show") - - def show_specific_commit_details(commit): - """Shows the details of the specified commit - commit: The commit to show details of - """ - os.system(f"git show {commit}") - - def revert_commit(commit): - """Creates a new commit that undoes the changes introduced by the specified commit - commit: The commit to revert - """ - os.system(f"git revert {commit}") - - def revert_no_commit(commit): - """Undoes the changes introduced by the specified commit, but does not create a new commit - commit: The commit to revert - """ - os.system(f"git revert --no-commit {commit}") - - def rebase_branch(branch): - """Reapplies commits on the current branch onto the tip of the specified branch - branch: The branch to rebase onto - """ - os.system(f"git rebase {branch}") diff --git a/GitAI_LLM/GitAI_fn_calling/GitAI_functions.json b/GitAI_LLM/GitAI_fn_calling/GitAI_functions.json deleted file mode 100644 index 929d2ae..0000000 --- a/GitAI_LLM/GitAI_fn_calling/GitAI_functions.json +++ /dev/null @@ -1,729 +0,0 @@ -[ - { - "function_name": "install_git_with_homebrew", - "description": "Installs Git with Homebrew on Mac OS.", - "parameters": [], - "command": "brew install git" - }, - { - "function_name": "install_git_with_macports", - "description": "Installs Git with MacPorts on Mac OS.", - "parameters": [], - "command": "sudo port selfupdate" - }, - { - "function_name": "install_git_on_linux", - "description": "Installs Git on Linux.", - 
"parameters": [], - "command": "sudo apt-get install git" - }, - { - "function_name": "show_git_version", - "description": "Shows the current version of Git installed.", - "parameters": [], - "command": "git --version" - }, - { - "function_name": "initialize_git_repository", - "description": "Initializes a new Git repository in the current directory.", - "parameters": [], - "command": "git init" - }, - { - "function_name": "initialize_git_repository_in_directory", - "description": "Creates a new Git repository in the specified directory.", - "parameters": [ - { - "param": "directory", - "param_type": "str", - "param_description": "The directory path where the new Git repository will be initialized." - } - ], - "command": "git init " - }, - { - "function_name": "clone_repository", - "description": "Clones a repository from a remote server to your local machine.", - "parameters": [ - { - "param": "repository_url", - "param_type": "str", - "param_description": "The URL of the repository to clone." - } - ], - "command": "git clone " - }, - { - "function_name": "clone_specific_branch", - "description": "Clones a specific branch from a repository.", - "parameters": [ - { - "param": "branch_name", - "param_type": "str", - "param_description": "The name of the branch to clone." - }, - { - "param": "repository_url", - "param_type": "str", - "param_description": "The URL of the repository to clone from." - } - ], - "command": "git clone --branch " - }, - { - "function_name": "add_file_to_staging_area", - "description": "Adds a specific file to the staging area.", - "parameters": [ - { - "param": "file", - "param_type": "str", - "param_description": "The path of the file to add to the staging area." - } - ], - "command": "git add " - }, - { - "function_name": "add_all_files_to_staging_area", - "description": "Adds all modified and new files to the staging area.", - "parameters": [], - "command": "git add ." 
- }, - { - "function_name": "check_repository_status", - "description": "Shows the current state of your repository, including tracked and untracked files, modified files, and branch information.", - "parameters": [], - "command": "git status" - }, - { - "function_name": "check_ignored_files", - "description": "Displays ignored files in addition to the regular status output.", - "parameters": [], - "command": "git status --ignored" - }, - { - "function_name": "show_changes_in_working_directory", - "description": "Shows the changes between the working directory and the staging area (index).", - "parameters": [], - "command": "git diff" - }, - { - "function_name": "show_changes_between_commits", - "description": "Displays the differences between two commits.", - "parameters": [ - { - "param": "commit1", - "param_type": "str", - "param_description": "The ID or reference of the first commit." - }, - { - "param": "commit2", - "param_type": "str", - "param_description": "The ID or reference of the second commit." 
- } - ], - "command": "git diff " - }, - { - "function_name": "show_changes_between_staging_and_last_commit", - "description": "Displays the changes between the staging area (index) and the last commit.", - "parameters": [], - "command": "git diff --staged" - }, - { - "function_name": "show_changes_between_working_directory_and_last_commit", - "description": "Displays the difference between the current directory and the last commit.", - "parameters": [], - "command": "git diff HEAD" - }, - { - "function_name": "create_commit", - "description": "Creates a new commit with the changes in the staging area and opens the default text editor for adding a commit message.", - "parameters": [], - "command": "git commit" - }, - { - "function_name": "create_commit_with_message", - "description": "Creates a new commit with the changes in the staging area and specifies the commit message inline.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The commit message describing the changes." - } - ], - "command": "git commit -m \"\"" - }, - { - "function_name": "commit_all_changes", - "description": "Commits all modified and deleted files in the repository without explicitly using git add to stage the changes.", - "parameters": [], - "command": "git commit -a" - }, - { - "function_name": "add_note_to_object", - "description": "Creates a new note and associates it with an object (commit, tag, etc.).", - "parameters": [], - "command": "git notes add" - }, - { - "function_name": "restore_file_to_last_commit_state", - "description": "Restores the file in the working directory to its state in the last commit.", - "parameters": [ - { - "param": "file", - "param_type": "str", - "param_description": "The file to restore to the last commit state." 
- } - ], - "command": "git restore " - }, - { - "function_name": "reset_branch_to_commit", - "description": "Moves the branch pointer to a specified commit, resetting the staging area and the working directory to match the specified commit.", - "parameters": [ - { - "param": "commit", - "param_type": "str", - "param_description": "The commit to reset the branch to." - } - ], - "command": "git reset " - }, - { - "function_name": "reset_branch_soft", - "description": "Moves the branch pointer to a specified commit, preserving the changes in the staging area and the working directory.", - "parameters": [ - { - "param": "commit", - "param_type": "str", - "param_description": "The commit to reset the branch soft to." - } - ], - "command": "git reset --soft " - }, - { - "function_name": "reset_branch_hard", - "description": "Moves the branch pointer to a specified commit, discarding all changes in the staging area and the working directory.", - "parameters": [ - { - "param": "commit", - "param_type": "str", - "param_description": "The commit to reset the branch hard to." - } - ], - "command": "git reset --hard " - }, - { - "function_name": "remove_file_from_repository", - "description": "Removes a file from both the working directory and the repository, staging the deletion.", - "parameters": [ - { - "param": "file", - "param_type": "str", - "param_description": "The file to remove from the repository." - } - ], - "command": "git rm " - }, - { - "function_name": "move_or_rename_file", - "description": "Moves or renames a file or directory in your Git repository.", - "parameters": [], - "command": "git mv" - }, - { - "function_name": "commit_with_feature_message", - "description": "Creates a new commit in a Git repository with a specific message to indicate a new feature commit.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the new feature." 
- } - ], - "command": "git commit -m \"feat: \"" - }, - { - "function_name": "commit_with_fix_message", - "description": "Creates a new commit in a Git repository with a specific message to fix bugs in the codebase.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the bug fix." - } - ], - "command": "git commit -m \"fix: \"" - }, - { - "function_name": "commit_with_chore_message", - "description": "Creates a new commit in a Git repository with a specific message to show routine tasks or maintenance.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the chore." - } - ], - "command": "git commit -m \"chore: \"" - }, - { - "function_name": "commit_with_refactor_message", - "description": "Creates a new commit in a Git repository with a specific message to change the codebase and improve the structure.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the refactor." - } - ], - "command": "git commit -m \"refactor: \"" - }, - { - "function_name": "commit_with_docs_message", - "description": "Creates a new commit in a Git repository with a specific message to change the documentation.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the documentation change." - } - ], - "command": "git commit -m \"docs: \"" - }, - { - "function_name": "commit_with_style_message", - "description": "Creates a new commit in a Git repository with a specific message to change the styling and formatting of the codebase.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the style change." 
- } - ], - "command": "git commit -m \"style: \"" - }, - { - "function_name": "commit_with_test_message", - "description": "Creates a new commit in a Git repository with a specific message to indicate test-related changes.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the test change." - } - ], - "command": "git commit -m \"test: \"" - }, - { - "function_name": "commit_with_performance_message", - "description": "Creates a new commit in a Git repository with a specific message to indicate performance-related changes.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the performance change." - } - ], - "command": "git commit -m \"perf: \"" - }, - { - "function_name": "commit_with_ci_message", - "description": "Creates a new commit in a Git repository with a specific message to indicate continuous integration (CI) system-related changes.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the CI change." - } - ], - "command": "git commit -m \"ci: \"" - }, - { - "function_name": "commit_with_build_message", - "description": "Creates a new commit in a Git repository with a specific message to indicate changes related to the build process.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the build change." - } - ], - "command": "git commit -m \"build: \"" - }, - { - "function_name": "commit_with_revert_message", - "description": "Creates a new commit in a Git repository with a specific message to indicate changes related to reverting a previous commit.", - "parameters": [ - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the revert change." 
- } - ], - "command": "git commit -m \"revert: \"" - }, - { - "function_name": "list_branches", - "description": "Lists all branches in the repository.", - "parameters": [], - "command": "git branch" - }, - { - "function_name": "create_branch", - "description": "Creates a new branch with the specified name.", - "parameters": [ - { - "param": "branch_name", - "param_type": "str", - "param_description": "The name of the new branch to create." - } - ], - "command": "git branch " - }, - { - "function_name": "delete_branch", - "description": "Deletes the specified branch.", - "parameters": [ - { - "param": "branch_name", - "param_type": "str", - "param_description": "The name of the branch to delete." - } - ], - "command": "git branch -d " - }, - { - "function_name": "list_all_branches", - "description": "Lists all local and remote branches in the repository.", - "parameters": [], - "command": "git branch -a" - }, - { - "function_name": "list_remote_branches", - "description": "Lists all remote branches in the repository.", - "parameters": [], - "command": "git branch -r" - }, - { - "function_name": "switch_to_branch", - "description": "Switches to the specified branch.", - "parameters": [ - { - "param": "branch_name", - "param_type": "str", - "param_description": "The name of the branch to switch to." - } - ], - "command": "git checkout " - }, - { - "function_name": "create_and_switch_to_branch", - "description": "Creates a new branch with the specified name and switches to it.", - "parameters": [ - { - "param": "new_branch_name", - "param_type": "str", - "param_description": "The name of the new branch to create and switch to." - } - ], - "command": "git checkout -b " - }, - { - "function_name": "discard_changes_in_file", - "description": "Discards changes made to the specified file and reverts it to the version in the last commit.", - "parameters": [ - { - "param": "file", - "param_type": "str", - "param_description": "The file to discard changes for." 
- } - ], - "command": "git checkout -- " - }, - { - "function_name": "merge_branch_into_current", - "description": "Merges the specified branch into the current branch.", - "parameters": [ - { - "param": "branch_name", - "param_type": "str", - "param_description": "The name of the branch to merge into the current branch." - } - ], - "command": "git merge " - }, - { - "function_name": "show_commit_history", - "description": "Displays the commit history of the current branch.", - "parameters": [], - "command": "git log" - }, - { - "function_name": "show_branch_commit_history", - "description": "Displays the commit history of the specified branch.", - "parameters": [ - { - "param": "branch_name", - "param_type": "str", - "param_description": "The name of the branch to show the commit history for." - } - ], - "command": "git log " - }, - { - "function_name": "show_file_commit_history", - "description": "Displays the commit history of a file, including its renames.", - "parameters": [ - { - "param": "file", - "param_type": "str", - "param_description": "The file to show the commit history for." 
- } - ], - "command": "git log --follow " - }, - { - "function_name": "show_commit_history_all_branches", - "description": "Displays the commit history of all branches in the repository.", - "parameters": [], - "command": "git log --all" - }, - { - "function_name": "stash_changes", - "description": "Stashes the changes in the working directory, allowing you to switch to a different branch or commit without committing the changes.", - "parameters": [], - "command": "git stash" - }, - { - "function_name": "list_stashes", - "description": "Lists all stashes in the repository.", - "parameters": [], - "command": "git stash list" - }, - { - "function_name": "apply_latest_stash", - "description": "Applies and removes the most recent stash from the stash list.", - "parameters": [], - "command": "git stash pop" - }, - { - "function_name": "drop_latest_stash", - "description": "Removes the most recent stash from the stash list.", - "parameters": [], - "command": "git stash drop" - }, - { - "function_name": "list_tags", - "description": "Lists all tags in the repository.", - "parameters": [], - "command": "git tag" - }, - { - "function_name": "create_lightweight_tag", - "description": "Creates a lightweight tag at the current commit.", - "parameters": [ - { - "param": "tag_name", - "param_type": "str", - "param_description": "The name of the tag to create." - } - ], - "command": "git tag " - }, - { - "function_name": "create_lightweight_tag_at_commit", - "description": "Creates a lightweight tag at the specified commit.", - "parameters": [ - { - "param": "tag_name", - "param_type": "str", - "param_description": "The name of the tag to create." - }, - { - "param": "commit", - "param_type": "str", - "param_description": "The commit to tag." 
- } - ], - "command": "git tag " - }, - { - "function_name": "create_annotated_tag", - "description": "Creates an annotated tag at the current commit with a custom message.", - "parameters": [ - { - "param": "tag_name", - "param_type": "str", - "param_description": "The name of the tag to create." - }, - { - "param": "message", - "param_type": "str", - "param_description": "The message describing the tag." - } - ], - "command": "git tag -a -m \"\"" - }, - { - "function_name": "fetch_changes_from_remote", - "description": "Retrieves changes from a remote repository, including new branches and commits.", - "parameters": [], - "command": "git fetch" - }, - { - "function_name": "fetch_changes_from_specific_remote", - "description": "Retrieves changes from the specified remote repository.", - "parameters": [ - { - "param": "remote", - "param_type": "str", - "param_description": "The name of the remote repository to fetch changes from." - } - ], - "command": "git fetch " - }, - { - "function_name": "fetch_and_prune", - "description": "Retrieves changes from a remote repository and removes any remote-tracking branches that no longer exist on the remote repository.", - "parameters": [], - "command": "git fetch --prune" - }, - { - "function_name": "pull_changes_from_remote", - "description": "Fetches changes from the remote repository and merges them into the current branch.", - "parameters": [], - "command": "git pull" - }, - { - "function_name": "pull_changes_from_specific_remote", - "description": "Fetches changes from the specified remote repository and merges them into the current branch.", - "parameters": [ - { - "param": "remote", - "param_type": "str", - "param_description": "The name of the remote repository to pull changes from." 
- } - ], - "command": "git pull " - }, - { - "function_name": "pull_changes_with_rebase", - "description": "Fetches changes from the remote repository and rebases the current branch onto the updated branch.", - "parameters": [], - "command": "git pull --rebase" - }, - { - "function_name": "push_commits_to_remote", - "description": "Pushes local commits to the remote repository.", - "parameters": [], - "command": "git push" - }, - { - "function_name": "push_commits_to_specific_remote", - "description": "Pushes local commits to the specified remote repository.", - "parameters": [ - { - "param": "remote", - "param_type": "str", - "param_description": "The name of the remote repository to push commits to." - } - ], - "command": "git push " - }, - { - "function_name": "push_commits_to_branch_of_remote", - "description": "Pushes local commits to the specified branch of the remote repository.", - "parameters": [ - { - "param": "remote", - "param_type": "str", - "param_description": "The name of the remote repository." - }, - { - "param": "branch", - "param_type": "str", - "param_description": "The name of the branch to push commits to." - } - ], - "command": "git push " - }, - { - "function_name": "push_all_branches_to_remote", - "description": "Pushes all branches to the remote repository.", - "parameters": [], - "command": "git push --all" - }, - { - "function_name": "list_remote_repositories", - "description": "Lists all remote repositories.", - "parameters": [], - "command": "git remote" - }, - { - "function_name": "add_remote_repository", - "description": "Adds a new remote repository with the specified name and URL.", - "parameters": [ - { - "param": "name", - "param_type": "str", - "param_description": "The name of the new remote repository." - }, - { - "param": "url", - "param_type": "str", - "param_description": "The URL of the new remote repository." 
- } - ], - "command": "git remote add " - }, - { - "function_name": "show_commit_details", - "description": "Shows the details of a specific commit, including its changes.", - "parameters": [ - { - "param": "commit", - "param_type": "str", - "param_description": "The ID or reference of the commit to show details for." - } - ], - "command": "git show " - }, - { - "function_name": "revert_commit", - "description": "Creates a new commit that undoes the changes introduced by the specified commit.", - "parameters": [ - { - "param": "commit", - "param_type": "str", - "param_description": "The commit to revert." - } - ], - "command": "git revert " - }, - { - "function_name": "revert_commit_no_commit", - "description": "Undoes the changes introduced by the specified commit, but does not create a new commit.", - "parameters": [ - { - "param": "commit", - "param_type": "str", - "param_description": "The commit to revert." - } - ], - "command": "git revert --no-commit " - }, - { - "function_name": "rebase_branch", - "description": "Reapplies commits on the current branch onto the tip of the specified branch.", - "parameters": [ - { - "param": "branch", - "param_type": "str", - "param_description": "The branch to rebase onto." 
- } - ], - "command": "git rebase " - } - ] - \ No newline at end of file diff --git a/GitAI_LLM/GitAI_fn_calling/VectorStore.py b/GitAI_LLM/GitAI_fn_calling/VectorStore.py deleted file mode 100644 index c1672d2..0000000 --- a/GitAI_LLM/GitAI_fn_calling/VectorStore.py +++ /dev/null @@ -1,23 +0,0 @@ -from langchain_community.document_loaders import JSONLoader -from langchain_huggingface import HuggingFaceEmbeddings -from langchain_community.vectorstores import Chroma - - -loader = JSONLoader( - file_path='GitAI_LLM/GitAI_fn_calling/GitAI_functions.json', - jq_schema='.[]', - # content_key='description', - text_content=False, - # metadata_func=metadata_func - ) - -git = loader.load() - - -#checking format -# print(git[7]) - -embedding_function = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2") - -# # save to disk -vectorstore = Chroma.from_documents(git, embedding_function, persist_directory="./GitAI_LLM/GitAI_fn_calling/chroma_db") \ No newline at end of file diff --git a/GitAI_LLM/GitAI_fn_calling/agent.py b/GitAI_LLM/GitAI_fn_calling/agent.py deleted file mode 100644 index b64d11b..0000000 --- a/GitAI_LLM/GitAI_fn_calling/agent.py +++ /dev/null @@ -1,33 +0,0 @@ -from langchain_community.vectorstores import Chroma -from langchain_huggingface import HuggingFaceEmbeddings -import json - -# Initialize the embedding function -embedding_function = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2") - -# Initialize Chroma vector store -vectorstore = Chroma(persist_directory="./GitAI_LLM/GitAI_fn_calling/chroma_db", embedding_function=embedding_function) - -query = input( -''' -GitAI: How Can I help you? 
-Human: ''' -) - -# Perform similarity search -docs = vectorstore.similarity_search(query) - -# retriever = vectorstore.as_retriever(search_kwargs={"k": 3}) -# docs = retriever.get_relevant_documents("How to clone a github repo") - -print("\n") -# Iterate through each document and print structured information -for doc in docs: - data = json.loads(doc.page_content) - print(f"Command: {data['command']}") - print(f"Description: {data['description']}") - print("Parameters:") - for param in data['parameters']: - print(f"- {param['param']}: {param['param_description']})") - - print('='*50) diff --git a/GitAI_LLM/GitAI_fn_calling/chroma_db/b70615c6-55b6-424d-8e49-bfe241cdbba7/data_level0.bin b/GitAI_LLM/GitAI_fn_calling/chroma_db/b70615c6-55b6-424d-8e49-bfe241cdbba7/data_level0.bin deleted file mode 100644 index ae2f009..0000000 Binary files a/GitAI_LLM/GitAI_fn_calling/chroma_db/b70615c6-55b6-424d-8e49-bfe241cdbba7/data_level0.bin and /dev/null differ diff --git a/GitAI_LLM/GitAI_fn_calling/chroma_db/b70615c6-55b6-424d-8e49-bfe241cdbba7/header.bin b/GitAI_LLM/GitAI_fn_calling/chroma_db/b70615c6-55b6-424d-8e49-bfe241cdbba7/header.bin deleted file mode 100644 index d2383e3..0000000 Binary files a/GitAI_LLM/GitAI_fn_calling/chroma_db/b70615c6-55b6-424d-8e49-bfe241cdbba7/header.bin and /dev/null differ diff --git a/GitAI_LLM/GitAI_fn_calling/chroma_db/b70615c6-55b6-424d-8e49-bfe241cdbba7/length.bin b/GitAI_LLM/GitAI_fn_calling/chroma_db/b70615c6-55b6-424d-8e49-bfe241cdbba7/length.bin deleted file mode 100644 index 1dc89f8..0000000 Binary files a/GitAI_LLM/GitAI_fn_calling/chroma_db/b70615c6-55b6-424d-8e49-bfe241cdbba7/length.bin and /dev/null differ diff --git a/GitAI_LLM/GitAI_fn_calling/chroma_db/b70615c6-55b6-424d-8e49-bfe241cdbba7/link_lists.bin b/GitAI_LLM/GitAI_fn_calling/chroma_db/b70615c6-55b6-424d-8e49-bfe241cdbba7/link_lists.bin deleted file mode 100644 index e69de29..0000000 diff --git a/GitAI_LLM/GitAI_fn_calling/chroma_db/chroma.sqlite3 
b/GitAI_LLM/GitAI_fn_calling/chroma_db/chroma.sqlite3 deleted file mode 100644 index 72b74cd..0000000 Binary files a/GitAI_LLM/GitAI_fn_calling/chroma_db/chroma.sqlite3 and /dev/null differ diff --git a/GitAI_LLM/GitAI_fn_calling/requirements.txt b/GitAI_LLM/GitAI_fn_calling/requirements.txt deleted file mode 100644 index c4a127d..0000000 --- a/GitAI_LLM/GitAI_fn_calling/requirements.txt +++ /dev/null @@ -1,5 +0,0 @@ -langchain_community -langchain_huggingface -sentence_transformers -json -chromadb \ No newline at end of file diff --git a/LLMFineTuning/GitAI_MLX.py b/LLMFineTuning/GitAI_MLX.py new file mode 100644 index 0000000..0e47ad2 --- /dev/null +++ b/LLMFineTuning/GitAI_MLX.py @@ -0,0 +1,7 @@ +from mlx_lm import load, generate + +model, tokenizer = load("YashJain/GitAI-gemma-2b") + +prompt = "How git commit hash are generated" + +response = generate(model, tokenizer, prompt=prompt, verbose=True, max_tokens=100) diff --git a/LLMFineTuning/MLX Implementation/README.md b/LLMFineTuning/MLX Implementation/README.md deleted file mode 100644 index ad036bb..0000000 --- a/LLMFineTuning/MLX Implementation/README.md +++ /dev/null @@ -1,152 +0,0 @@ -# Fine-Tuning with LoRA or QLoRA - -We'll use `mlx-lm` package to fine-tune an LLM with low rank -adaptation (LoRA) for a target task. The example also supports quantized -LoRA fine-tuning works with the following model families: - -- Mistral -- Llama -- Phi2 -- Mixtral -- Qwen2 -- Gemma -- OLMo -- MiniCPM -- InternLM2 - -**Access our Fine-Tuned GitAI Model at [Hugging Face](https://huggingface.co/collections/YashJain/gitai-66716f5414a2d8e2b6d93bd9)** - -### Data - -For fine-tuning (`--train`), the data loader expects a `train.jsonl` and a -`valid.jsonl` to be in the data directory. For evaluation (`--test`), the data -loader expects a `test.jsonl` in the data directory. - -Currently, `*.jsonl` files support three data formats: `chat`, -`completions`, and `text`. 
Here are three examples of these formats: - -`chat`: - -```jsonl -{ - "messages": [ - { - "role": "system", - "content": "You are a helpful assistant." - }, - { - "role": "user", - "content": "Hello." - }, - { - "role": "assistant", - "content": "How can I assistant you today." - } - ] -} -``` - -`completions`: - -```jsonl -{ - "prompt": "What is the capital of France?", - "completion": "Paris." -} -``` - -`text`: - -```jsonl -{ - "text": "This is an example for the model." -} -``` - -Note, the format is automatically determined by the dataset. Note also, keys in each line not expected by the loader will be ignored. - -For the `chat` and `completions` formats, Hugging Face [chat -templates](https://huggingface.co/blog/chat-templates) are used. This applies the model's chat template by default. If the model does not have a chat template, then Hugging Face will use a default. For example, the final text in the `chat` example above with Hugging Face's default template becomes: - -```text -<|im_start|>system -You are a helpful assistant.<|im_end|> -<|im_start|>user -Hello.<|im_end|> -<|im_start|>assistant -How can I assistant you today.<|im_end|> -``` - -If you are unsure of the format to use, the `chat` or `completions` are good to start with. For custom requirements on the format of the dataset, use the `text` format to assemble the content yourself. - - - -### Fine-tune - -To fine-tune a model use: - -```shell -mlx_lm.lora \ - --model \ - --train \ - --data \ - --iters 600 -``` - -The `--data` argument must specify a path to a `train.jsonl`, `valid.jsonl` when using `--train` and a path to a `test.jsonl` when using `--test`. (Deafult folder name 'data') - -For example, to fine-tune a Qwen2 0.5B Instruct you can use `--model -Qwen/Qwen2-0.5B-Instruct-MLX`. - -Try using a smaller batch size with `--batch-size`. `2` or `1` will reduce memory consumption. This may slow things down a little, but will also reduce the memory use. 
- -Reduce the number of layers to fine-tune with `--lora-layers` to `8` or `4`. This reduces the amount of memory needed for back propagation. It may also reduce the quality of the fine-tuned model if you are fine-tuning with a lot of data. - -By default, the adapter config and weights are saved in `adapters/`. You can specify the output location with `--adapter-path`. - -You can resume fine-tuning with an existing adapter with -`--resume-adapter-file `. - -### Evaluate - -To compute test set perplexity use: - -```shell -mlx_lm.lora \ - --model \ - --adapter-path \ - --data \ - --test -``` - -### Generate - -For generation use `mlx_lm.generate`: - -```shell -mlx_lm.generate \ - --model \ - --adapter-path \ - --prompt "" -``` - -## Fuse -To generate the fused model run: - -```shell -mlx_lm.fuse --model -``` - -This will by default load the adapters from `adapters/`, and save the fused model in the path `lora_fused_model/`. All of these are configurable. - -To upload a fused model, supply the `--upload-repo` and `--hf-path` arguments to `mlx_lm.fuse`. The latter is the repo name of the original model, which is useful for the sake of attribution and model versioning. - -For example, to fuse and upload a model derived from Qwen2-0.5B-Instruct-MLX, run: - -```shell -mlx_lm.fuse \ - --model Qwen/Qwen2-0.5B-Instruct-MLX \ - --upload-repo YashJain/GitAI-Qwen2-0.5B-Instruct-MLX-v1 \ - --hf-path Qwen/Qwen2-0.5B-Instruct-MLX -``` - diff --git a/LLMFineTuning/README.md b/LLMFineTuning/README.md new file mode 100644 index 0000000..faeb4e2 --- /dev/null +++ b/LLMFineTuning/README.md @@ -0,0 +1,168 @@ +# Fine-Tuning with LoRA or QLoRA + +We'll use the `mlx-lm` package to fine-tune an LLM with low rank adaptation (LoRA) for a target task. 
The example also supports quantized LoRA fine-tuning and works with the following model families: + +- Mistral +- Llama +- Phi2 +- Mixtral +- Qwen2 +- Gemma +- OLMo +- MiniCPM +- InternLM2 + +**Access our Fine-Tuned GitAI Model at [Hugging Face](https://huggingface.co/collections/YashJain/gitai-66716f5414a2d8e2b6d93bd9)** + +## Data + +For fine-tuning (`--train`), the data loader expects a `train.jsonl` and a `valid.jsonl` to be in the data directory. For evaluation (`--test`), the data loader expects a `test.jsonl` in the data directory. + +Currently, `*.jsonl` files support three data formats: `chat`, `completions`, and `text`. Here are three examples of these formats: + +`chat`: + +```jsonl +{ + "messages": [ + { + "role": "system", + "content": "You are a helpful assistant." + }, + { + "role": "user", + "content": "Hello." + }, + { + "role": "assistant", + "content": "How can I assist you today." + } + ] +} +``` + +`completions`: + +```jsonl +{ + "prompt": "What is the capital of France?", + "completion": "Paris." +} +``` + +`text`: + +```jsonl +{ + "text": "This is an example for the model." +} +``` + +Note, the format is automatically determined by the dataset. Keys in each line not expected by the loader will be ignored. + +For the `chat` and `completions` formats, Hugging Face [chat templates](https://huggingface.co/blog/chat-templates) are used. This applies the model's chat template by default. If the model does not have a chat template, then Hugging Face will use a default. + +If you are unsure of the format to use, the `chat` or `completions` are good to start with. For custom requirements on the format of the dataset, use the `text` format to assemble the content yourself. + +## Fine-tune + +To fine-tune a model use: + +```shell +mlx_lm.lora \ + --model \ + --train \ + --data \ + --iters 600 +``` + +The `--data` argument must specify a path to a `train.jsonl`, `valid.jsonl` when using `--train` and a path to a `test.jsonl` when using `--test`. 
(Default folder name 'data') + +For example, to fine-tune a Qwen2 0.5B Instruct you can use `--model Qwen/Qwen2-0.5B-Instruct-MLX`. + +Try using a smaller batch size with `--batch-size`. `2` or `1` will reduce memory consumption. This may slow things down a little, but will also reduce the memory use. + +Reduce the number of layers to fine-tune with `--lora-layers` to `8` or `4`. This reduces the amount of memory needed for back propagation. It may also reduce the quality of the fine-tuned model if you are fine-tuning with a lot of data. + +By default, the adapter config and weights are saved in `adapters/`. You can specify the output location with `--adapter-path`. + +You can resume fine-tuning with an existing adapter with `--resume-adapter-file `. + +## Evaluate + +To compute test set perplexity use: + +```shell +mlx_lm.lora \ + --model \ + --adapter-path \ + --data \ + --test +``` + +## Generate + +For generation use `mlx_lm.generate`: + +```shell +mlx_lm.generate \ + --model \ + --adapter-path \ + --prompt "" +``` + +## Fuse + +To generate the fused model run: + +```shell +mlx_lm.fuse --model +``` + +This will by default load the adapters from `adapters/`, and save the fused model in the path `lora_fused_model/`. All of these are configurable. + +To upload a fused model, supply the `--upload-repo` and `--hf-path` arguments to `mlx_lm.fuse`. The latter is the repo name of the original model, which is useful for the sake of attribution and model versioning. + +For example, to fuse and upload a model derived from Qwen2-0.5B-Instruct-MLX, run: + +```shell +mlx_lm.fuse \ + --model Qwen/Qwen2-0.5B-Instruct-MLX \ + --upload-repo YashJain/GitAI-Qwen2-0.5B-Instruct-MLX-v1 \ + --hf-path Qwen/Qwen2-0.5B-Instruct-MLX +``` + +## Contributing + +We welcome contributions to improve and expand the Fine-Tuning with LoRA or QLoRA project. 
Here are some ways you can contribute: + +### Implementing PyTorch and TensorFlow Versions + +One of the key areas for contribution is implementing PyTorch and TensorFlow versions of the fine-tuning process. This will allow users to choose their preferred deep learning framework. Here are some steps to get started: + +1. Create a new directory for each framework (e.g., `pytorch_implementation/` and `tensorflow_implementation/`). +2. Implement the core LoRA and QLoRA algorithms in each framework. +3. Adapt the data loading and processing steps to work with PyTorch and TensorFlow. +4. Implement the training, evaluation, and generation scripts for each framework. +5. Ensure compatibility with the same model families supported by the MLX version. +6. Write documentation and usage examples for each implementation. + +### Other Contribution Areas + +- Enhancing the existing MLX implementation with new features or optimizations. +- Adding support for more model families. +- Improving documentation and adding more detailed tutorials. +- Implementing additional fine-tuning techniques beyond LoRA and QLoRA. +- Creating benchmarks to compare performance across different frameworks. +- Adding more sophisticated evaluation metrics. +- Developing tools for easier model deployment and serving. + +### Contributing Guidelines + +1. Fork the repository and create a new branch for your feature or bug fix. +2. Follow the coding style and conventions used in the existing codebase. +3. Write clear, concise commit messages. +4. Add or update tests as necessary to cover your changes. +5. Update the documentation to reflect any changes in functionality or usage. +6. Submit a pull request with a clear description of your changes and their benefits. 
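As a companion to the Data section of this README, here is a minimal sketch of a pre-flight check that reports which of the three formats (`chat`, `completions`, `text`) each record in a `.jsonl` file uses. It is not part of `mlx-lm`: the names `detect_format` and `check_jsonl` are hypothetical, and the real loader's detection logic may differ. It assumes each record sits on a single line in the actual file, as the JSON Lines format requires.

```python
import json


def detect_format(record):
    """Guess the data format of one parsed record (hypothetical helper)."""
    if not isinstance(record, dict):
        return None
    if isinstance(record.get("messages"), list):
        return "chat"
    if "prompt" in record and "completion" in record:
        return "completions"
    if "text" in record:
        return "text"
    return None


def check_jsonl(lines):
    """Return (line_number, format) pairs for every non-empty line.

    Raises ValueError when a record matches none of the three formats.
    """
    results = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        fmt = detect_format(record)
        if fmt is None:
            raise ValueError(f"line {number}: unrecognized record format")
        results.append((number, fmt))
    return results
```

To check a file, pass an open file handle, for example `check_jsonl(open("data/train.jsonl"))`, where `data/` is the default folder name mentioned above.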
+ diff --git a/RAGQABot/RAGQABot.py b/RAGQABot/RAGQABot.py index 055c5c2..e43e89e 100644 --- a/RAGQABot/RAGQABot.py +++ b/RAGQABot/RAGQABot.py @@ -1,6 +1,6 @@ import os # Replace this with your HF token -# os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "hf_xxxx" +os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "hf_xxx" from embedchain import App @@ -29,7 +29,7 @@ app = App.from_config(config=config) app.add("./RAGQABot/CombinedQA.jsonl") -response = app.query("How to find difference between 2 git commits") +response = app.query("git rebase vs git merge along with example commands") print(response) diff --git a/RAGQABot/README.md b/RAGQABot/README.md index aeecfa7..f9f5615 100644 --- a/RAGQABot/README.md +++ b/RAGQABot/README.md @@ -25,7 +25,7 @@ Set up your Hugging Face access token: ## Configuration -``` +```python config = { 'llm': { 'provider': 'huggingface', @@ -61,3 +61,33 @@ app.query("How to find difference between 2 git commits") - To change the embedding model, update the `model` under the `embedder` configuration. - To use a different vector database, change the `provider` and `config` under the `vectordb` configuration. +## Contributing + +We welcome contributions to improve and expand the RAG QA Bot project. Here are some ways you can contribute: + +### Areas for Contribution + +1. Expanding the Git knowledge base +2. Improving answer quality +3. Performance optimization +4. Multi-language support + +### Contributing Guidelines + +1. Fork the repository and create a new branch for your feature or bug fix. +2. Follow the coding style and conventions used in the existing codebase. +3. Write clear, concise commit messages. +4. Add or update tests as necessary to cover your changes. +5. Update the documentation to reflect any changes in functionality or usage. +6. Submit a pull request with a clear description of your changes and their benefits. + + +### Reporting Issues + +If you encounter any bugs or have suggestions for improvements: + +1. 
Check if the issue already exists in the project's issue tracker. +2. If not, create a new issue with a clear title and description. +3. Include steps to reproduce the problem and any relevant error messages. + +By contributing, you help make the RAG QA Bot more powerful, efficient, and user-friendly. Your efforts will benefit the entire Git community. Thank you for your support! \ No newline at end of file diff --git a/README.md b/README.md index 66b5fbe..795045f 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ GitAI is a project aimed at assisting developers in learning and using Git comma ### 1. Fine-tuned LLMs -**[Hugging Face Link](https://huggingface.co/collections/YashJain/gitai-66716f5414a2d8e2b6d93bd9)** + We fine-tuned two smaller language models to run locally: diff --git a/contribution.md b/contribution.md new file mode 100644 index 0000000..cddc938 --- /dev/null +++ b/contribution.md @@ -0,0 +1,86 @@ +# Contributing to GitAI + +We're excited that you're interested in contributing to GitAI! This document outlines the process for contributing to our project and provides guidelines to ensure a smooth collaboration. + +## Table of Contents + +1. [Project Overview](#project-overview) +2. [Getting Started](#getting-started) +3. [How to Contribute](#how-to-contribute) +4. [Submitting Pull Requests](#submitting-pull-requests) +5. [Reporting Issues](#reporting-issues) + + +## Project Overview + +GitAI is a project aimed at assisting developers in learning and using Git commands effectively. We've explored various approaches, including: + +1. Fine-tuned LLMs +2. CLI Native AI Agent +3. Commonly Used Commands Retrieval system +4. RAG QA Bot + +Each approach has its strengths, and we're continually working to improve and integrate these solutions. + +## Getting Started + +1. Fork the repository on GitHub. +2. Clone your forked repository to your local machine. +3. 
Set up the development environment +## How to Contribute + +There are many ways to contribute to GitAI: + +1. Improving existing models and systems +2. Adding new features or approaches +3. Enhancing documentation +4. Fixing bugs +5. Writing tests +6. Providing feedback and suggestions + +Before starting work on a major feature or change, please open an issue to discuss it with the maintainers. + +## Contribution Areas +1. Fine-tuned LLMs + +- Implement PyTorch and TensorFlow versions of the fine-tuning process +- Enhance the existing MLX implementation +- Add support for more model families + +2. CLI Native AI Agent + +- Improve the agent's reasoning capabilities +- Enhance error handling and user feedback +- Guardrails for AI Agents + +3. Commonly Used Commands Retrieval System + +- Expand the database of Git commands +- Improve the retrieval algorithm +- Implement multi-language support + +4. RAG QA Bot + +- Expand the Git knowledge base +- Improve answer quality and relevance +- Optimize performance and response times + + +## Submitting Pull Requests + +1. Create a new branch for your feature or bug fix. +2. Make your changes in the new branch. +3. Ensure your code passes all tests and linting checks. +4. Commit your changes with clear, descriptive commit messages. +5. Push your branch to your fork on GitHub. +6. Open a pull request against the main repository's `main` branch. +7. Provide a clear description of the changes in your pull request. +8. Be prepared to make changes based on feedback. + +## Reporting Issues + +- Use the GitHub issue tracker to report bugs or suggest features. +- Clearly describe the issue, including steps to reproduce for bugs. +- Include relevant information such as OS, Python version, and any error messages. + +Thank you for your interest in contributing to GitAI. Your efforts help make Git more accessible and easier to use for developers worldwide!
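For contributors expanding the database of Git commands, the following is a minimal sketch of a check that could be run on a new entry before opening a pull request. It assumes the entry layout visible in the command JSON elsewhere in this diff (`function_name`, `description`, `parameters`, `command`, with each parameter carrying `param`, `param_type`, and `param_description`). The function name `validate_command_entry` is hypothetical and not part of the repository.

```python
REQUIRED_KEYS = ("function_name", "description", "parameters", "command")
PARAM_KEYS = ("param", "param_type", "param_description")


def validate_command_entry(entry):
    """Return a list of problems found in one command entry (empty if none)."""
    if not isinstance(entry, dict):
        return ["entry is not a JSON object"]

    problems = []
    for key in REQUIRED_KEYS:
        if key not in entry:
            problems.append(f"missing key: {key}")

    # Each parameter must itself be an object with the three expected keys.
    params = entry.get("parameters", [])
    if not isinstance(params, list):
        problems.append("parameters must be a list")
    else:
        for index, param in enumerate(params):
            if not isinstance(param, dict):
                problems.append(f"parameters[{index}] is not a JSON object")
                continue
            for key in PARAM_KEYS:
                if key not in param:
                    problems.append(f"parameters[{index}] missing key: {key}")

    # The command, when present, should be a non-empty string.
    if "command" in entry:
        command = entry["command"]
        if not isinstance(command, str) or not command.strip():
            problems.append("command must be a non-empty string")

    return problems
```

An empty list means the entry has the expected shape. The check only looks at structure and does not verify that the Git command itself is correct.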