Commit e0fcba0

editing the README and refactoring contribution documentation
1 parent 0352c77 commit e0fcba0

2 files changed: +88 -84 lines changed

CONTRIBUTING.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+## Contributing
+
+If you would like to contribute to this project, we recommend following the "fork-and-pull" Git workflow.
+
+1. **Fork** the repo on GitHub
+2. **Clone** the project to your own machine
+3. **Commit** changes to your own branch
+4. **Push** your work back up to your fork
+5. Submit a **Pull request** so that we can review your changes
+
+NOTE: Be sure to merge the latest from "upstream" before making a pull request!
+
+### Set Up Dev Environment
+
+<details>
+<summary>1. Clone Repo</summary>
+
+```shell
+git clone https://github.com/georgian-io/LLM-Finetuning-Toolkit.git
+cd LLM-Finetuning-Toolkit/
+```
+
+</details>
+
+<details>
+<summary>2. Install Dependencies</summary>
+<details>
+<summary>Install with Docker [Recommended]</summary>
+
+```shell
+docker build -t llm-toolkit .
+```
+
+```shell
+# CPU
+docker run -it llm-toolkit
+# GPU
+docker run -it --gpus all llm-toolkit
+```
+
+</details>
+
+<details>
+<summary>Poetry (recommended)</summary>
+
+See the Poetry documentation for [installation instructions](https://python-poetry.org/docs/#installation)
+
+```shell
+poetry install
+```
+
+</details>
+<details>
+<summary>pip</summary>
+We recommend using a virtual environment like `venv` or `conda` for installation
+
+```shell
+pip install -e .
+```
+
+</details>
+</details>
+
+### Checklist Before Pull Request (Optional)
+
+1. Use `ruff check --fix` to check and fix lint errors
+2. Use `ruff format` to apply formatting
+
+NOTE: Ruff linting and formatting checks run via GitHub Actions when a PR is raised. Before raising a PR, it is good practice to check and fix lint errors and apply formatting.
+
+### Releasing
+
+To manually release a PyPI package, please run:
+
+```shell
+make build-release
+```
+
+Note: Make sure you have a PyPI token for this [PyPI repo](https://pypi.org/project/llm-toolkit/).
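
The five fork-and-pull steps listed in CONTRIBUTING.md can be sketched end to end as follows. This is only an illustration under stated assumptions: local bare repositories stand in for GitHub so the commands run offline, and the branch name `my-feature`, the directory names, and the throwaway identity are hypothetical. Against the real project, step 1 is the "Fork" button on GitHub and the paths become your fork's and the project's URLs.

```shell
set -eu

# Assumption: local bare repositories stand in for GitHub so this runs offline.
work=$(mktemp -d)
cd "$work"

# Throwaway identity, only so commits succeed on a machine without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in for the upstream project (on GitHub this already exists).
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "# project" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "initial commit"
git clone -q --bare seed upstream.git

# 1. Fork: on GitHub this is the "Fork" button; here, a bare copy of upstream.
git clone -q --bare upstream.git fork.git

# 2. Clone your fork and register the original project as "upstream".
git clone -q fork.git my-clone
cd my-clone
git remote add upstream "$work/upstream.git"

# 3. Commit changes to your own branch (hypothetical name: my-feature).
git checkout -q -b my-feature
echo "a small change" >> README.md
git commit -q -am "Describe the change"

# Merge the latest from upstream before opening the pull request.
git fetch -q upstream
git merge -q --no-edit upstream/main

# 4. Push the branch back up to your fork.
git push -q origin my-feature

# 5. Open the pull request from my-feature in the GitHub web UI.
```

Keeping the work on a separate branch leaves the fork's `main` free to track upstream, which keeps the pre-PR merge simple.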

README.md

Lines changed: 9 additions & 84 deletions
@@ -6,17 +6,17 @@
 
 ## Overview
 
-LLM Finetuning toolkit is a config-based CLI tool for launching a series of LLM finetuning experiments on your data and gathering their results. From one single `yaml` config file, control all elements of a typical experimentation pipeline - **prompts**, **open-source LLMs**, **optimization strategy** and **LLM testing**.
+LLM Finetuning toolkit is a config-based CLI tool for launching a series of LLM fine-tuning experiments on your data and gathering their results. From one single `yaml` config file, control all elements of a typical experimentation pipeline - **prompts**, **open-source LLMs**, **optimization strategy** and **LLM testing**.
 
 <p align="center">
 <img src="https://github.com/georgian-io/LLM-Finetuning-Toolkit/blob/main/assets/overview_diagram.png?raw=true" width="900" />
 </p>
 
 ## Installation
 
-### pipx (recommended)
+### [pipx](https://pipx.pypa.io/stable/) (recommended)
 
-pipx installs the package and depdencies in a seperate virtual environment
+[pipx](https://pipx.pypa.io/stable/) installs the package and dependencies in a separate virtual environment
 
 ```shell
 pipx install llm-toolkit
@@ -39,8 +39,8 @@ This guide contains 3 stages that will enable you to get the most out of this to
 ### Basic
 
 ```shell
-llmtune generate config
-llmtune run ./config.yml
+llmtune generate config
+llmtune run ./config.yml
 ```
 
 The first command generates a helpful starter `config.yml` file and saves in the current working directory. This is provided to users to quickly get started and as a base for further modification.
@@ -166,7 +166,7 @@ qa:
 
 #### Artifact Outputs
 
-This config will run finetuning and save the results under directory `./experiment/[unique_hash]`. Each unique configuration will generate a unique hash, so that our tool can automatically pick up where it left off. For example, if you need to exit in the middle of the training, by relaunching the script, the program will automatically load the existing dataset that has been generated under the directory, instead of doing it all over again.
+This config will run fine-tuning and save the results under directory `./experiment/[unique_hash]`. Each unique configuration will generate a unique hash, so that our tool can automatically pick up where it left off. For example, if you need to exit in the middle of the training, by relaunching the script, the program will automatically load the existing dataset that has been generated under the directory, instead of doing it all over again.
 
 After the script finishes running you will see these distinct artifacts:
 
@@ -236,84 +236,9 @@ lora:
 
 ## Extending
 
-The toolkit provides a modular and extensible architecture that allows developers to customize and enhance its functionality to suit their specific needs. Each component of the toolkit, such as data ingestion, finetuning, inference, and quality assurance testing, is designed to be easily extendable.
+The toolkit provides a modular and extensible architecture that allows developers to customize and enhance its functionality to suit their specific needs. Each component of the toolkit, such as data ingestion, fine-tuning, inference, and quality assurance testing, is designed to be easily extendable.
 
 ## Contributing
 
-If you would like to contribute to this project, we recommend following the "fork-and-pull" Git workflow.
-
-1. **Fork** the repo on GitHub
-2. **Clone** the project to your own machine
-3. **Commit** changes to your own branch
-4. **Push** your work back up to your fork
-5. Submit a **Pull request** so that we can review your changes
-
-NOTE: Be sure to merge the latest from "upstream" before making a pull request!
-
-### Set Up Dev Environment
-
-<details>
-<summary>1. Clone Repo</summary>
-
-```shell
-git clone https://github.com/georgian-io/LLM-Finetuning-Toolkit.git
-cd LLM-Finetuning-Toolkit/
-```
-
-</details>
-
-<details>
-<summary>2. Install Dependencies</summary>
-<details>
-<summary>Install with Docker [Recommended]</summary>
-
-```shell
-docker build -t llm-toolkit
-```
-
-```shell
-# CPU
-docker run -it llm-toolkit
-# GPU
-docker run -it --gpus all llm-toolkit
-```
-
-</details>
-
-<details>
-<summary>Poetry (recommended)</summary>
-
-See poetry documentation page for poetry [installation instructions](https://python-poetry.org/docs/#installation)
-
-```shell
-poetry install
-```
-
-</details>
-<details>
-<summary>pip</summary>
-We recommend using a virtual environment like `venv` or `conda` for installation
-
-```shell
-pip install -e .
-```
-
-</details>
-</details>
-
-### Checklist Before Pull Request (Optional)
-
-1. Use `ruff check --fix` to check and fix lint errors
-2. Use `ruff format` to apply formatting
-
-NOTE: Ruff linting and formatting checks are done when PR is raised via Git Action. Before raising a PR, it is a good practice to check and fix lint errors, as well as apply formatting.
-
-### Releasing
-
-To manually release a PyPI package, please run:
-
-```shell
-make build-release
-```
-
-Note: Make sure you have pypi token for this [PyPI repo](https://pypi.org/project/llm-toolkit/).
+Open-source contributions to this toolkit are welcome and encouraged.
+If you would like to contribute, please see [CONTRIBUTING.md](CONTRIBUTING.md).
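
The "Artifact Outputs" paragraph in the README hunk above describes resuming from `./experiment/[unique_hash]`, but it does not say how the hash is derived. As a rough sketch only, assuming the hash is a SHA-256 digest of the config file's contents (computed here with GNU `sha256sum`), the pick-up-where-it-left-off behaviour could look like this. The helper names, the `dataset` subdirectory, and the config keys are all hypothetical and are not the toolkit's actual layout.

```shell
set -eu

work=$(mktemp -d)
cd "$work"
printf 'model: demo\nepochs: 1\n' > config.yml   # stand-in config, not the toolkit's schema

# Hypothetical helper: map a config file to ./experiment/<hash of its contents>.
experiment_dir() {
  printf './experiment/%s\n' "$(sha256sum "$1" | cut -c1-12)"
}

# Hypothetical helper: reuse the dataset step if its output already exists.
prepare_dataset() {
  dir=$(experiment_dir "$1")
  if [ -d "$dir/dataset" ]; then
    echo "resume: reusing $dir/dataset"
  else
    mkdir -p "$dir/dataset"
    echo "fresh: created $dir/dataset"
  fi
}

first=$(prepare_dataset config.yml)    # first launch does the work
second=$(prepare_dataset config.yml)   # relaunch with the same config resumes
printf 'epochs: 2\n' >> config.yml     # any config change yields a new hash
third=$(prepare_dataset config.yml)    # so the edited config starts fresh
printf '%s\n' "$first" "$second" "$third"
```

Keying the directory on the config contents means an unchanged config always lands in the same directory, while any edit starts a separate experiment instead of overwriting earlier results.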
