diff --git a/README.md b/README.md
index 879e7153..b145f855 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@

-🚀**LLaMA2-Accessory** is an open-source toolkit for pre-training, fine-tuning and deployment of **Large Language Models (LLMs)** and **mutlimodal LLMs**. This repo is mainly inherited from [LLaMA-Adapter](https://github.com/OpenGVLab/LLaMA-Adapter) with more advanced features.🧠
+🚀**LLaMA2-Accessory** is an open-source toolkit for pre-training, fine-tuning and deployment of **Large Language Models (LLMs)** and **multimodal LLMs**. This repo is mainly inherited from [LLaMA-Adapter](https://github.com/OpenGVLab/LLaMA-Adapter) with more advanced features.🧠
 
 ## News
 - **[2023.08.21]** We release the Quantization codes and Evaluation result🔥🔥🔥
@@ -19,7 +19,7 @@
   - 🌈 Multi-modal fine-tuning with image-text pairs ([LAION](https://laion.ai/blog/laion-5b/), [COYO](https://github.com/kakaobrain/coyo-dataset) and more), interleaved image-text data ([MMC4](https://github.com/allenai/mmc4) and [OBELISC](https://github.com/huggingface/OBELISC)) and visual instruction data ([LLaVA](https://github.com/haotian-liu/LLaVA), [Shrika](https://github.com/shikras/shikra), [Bard](https://bard.google.com/))
   - 🔧 LLM for API Control ([GPT4Tools](https://github.com/StevenGrove/GPT4Tools) and [Gorilla](https://github.com/ShishirPatil/gorilla)).
 * **⚡Efficient Optimization and Deployment**
-  - 🚝 Parameter-efficient fine-tuning with [Zero-init Attenion](https://github.com/OpenGVLab/LLaMA-Adapter) and [Bias-norm Tuning](https://github.com/OpenGVLab/LLaMA-Adapter).
+  - 🚝 Parameter-efficient fine-tuning with [Zero-init Attention](https://github.com/OpenGVLab/LLaMA-Adapter) and [Bias-norm Tuning](https://github.com/OpenGVLab/LLaMA-Adapter).
   - 💻 Fully Sharded Data Parallel ([FSDP](https://engineering.fb.com/2021/07/15/open-source/fsdp/)), [Flash Attention 2](https://github.com/Dao-AILab/flash-attention) and [QLoRA](https://github.com/artidoro/qlora).
 * **🏋️‍♀️Support More Visual Encoders and LLMs**
diff --git a/docs/finetune.md b/docs/finetune.md
index 36053f10..b044183c 100644
--- a/docs/finetune.md
+++ b/docs/finetune.md
@@ -17,10 +17,10 @@ This document demonstrates the fine-tuning use cases supported by LLaMA2-Accesso
 > ## Prerequisites
 >
-> To run our provided experiment scripts on you own machine, please first adjust the following configurations:
+> To run our provided experiment scripts on your own machine, please first adjust the following configurations:
 >
 > + Modify the value of the `pretrained_path` variable in the `.sh` file. This variable should point to the directory containing checkpoints to fine-tune from.
-> + If you fine-tune from the officianl LLaMA / LLaMA2 checkpoints released by META, the directory should be like:
+> + If you fine-tune from the official LLaMA / LLaMA2 checkpoints released by META, the directory should be like:
 > ```
 > pretrained_path
 > ├── consolidated.00.pth
@@ -28,7 +28,7 @@ This document demonstrates the fine-tuning use cases supported by LLaMA2-Accesso
 > └── ...
 > ```
 >
-> and your should set `pretrained_type=meta_ori` in the `.sh` file.
+> and you should set `pretrained_type=meta_ori` in the `.sh` file.
 > + Alternatively, you may also fine-tune from checkpoints saved by LLaMA2-Accessory. In such cases, the directory should be like:
 > ```
 > pretrained_path
@@ -199,7 +199,7 @@ python demos/multi_turn.py \
 > My content is:
 >
 > ```
-> I want study math everyday because I like it. It makes me feel very good. I do not like English because it is hard. I can not remember the words
+> I want to study math every day because I like it. It makes me feel very good. I do not like English because it is hard. I can not remember the words
 > ```
 >
 > **Assistant:** "I am eager to study mathematics daily as I find it fascinating. It brings me great satisfaction. On the other hand, I do not have the same enthusiasm for English as it is a challenging subject for me. Memorizing vocabulary and grammar rules is not my strong suit."
@@ -207,7 +207,7 @@ python demos/multi_turn.py \
 > **Human:** My second content is:
 >
 > ```
-> My name is Jimmy, a master student studing computer science. I have done some projects. For example, I designed a image classification model, its name is CLSModel. I learned a lot from this project.
+> My name is Jimmy, a master student studying computer science. I have done some projects. For example, I designed an image classification model, its name is CLSModel. I learned a lot from this project.
 > ```
 >
 > **Assistant:** "My name is Jimmy, and I am a master's student pursuing a degree in computer science. I have had the opportunity to work on various projects, one of which was an image classification model called CLSModel. Through this project, I gained valuable experience and knowledge that has helped me grow as a computer scientist."
diff --git a/docs/pretrain.md b/docs/pretrain.md
index 7f160be3..7f392a32 100644
--- a/docs/pretrain.md
+++ b/docs/pretrain.md
@@ -4,9 +4,9 @@ LLaMA2-Accessory currently supports two kinds of pre-training datasets: the *van
 The vanilla dataset is supported in [`data/falcon.py`](../accessory/data/falcon.py). It loads data directly from `.parquet` data files (as an example, see [*Falcon Refined-web*](https://huggingface.co/datasets/tiiuae/falcon-refinedweb)). With the vanilla dataset, every piece of data will be converted into tokens of fixed length. Specifically, it will be truncated if it is longer than the target length, and padded if shorter.
 
-An example for pre-training with the vanilla dataset is provided in [exps/pretrain/vanilla.sh](../accessory/exps/pretrain/vanilla.sh). Here are some notes about the script:
+An example of pre-training with the vanilla dataset is provided in [exps/pretrain/vanilla.sh](../accessory/exps/pretrain/vanilla.sh). Here are some notes about the script:
 
-+ To run the script one your own environment, point the `llama_config` variable to the `params.json` file defining the model structure, and the `tokenizer_path` variable to the `tokenizer.model` file.
++ To run the script on your own environment, point the `llama_config` variable to the `params.json` file defining the model structure, and the `tokenizer_path` variable to the `tokenizer.model` file.
 + A meta file specifying the list of `.parquet` files to use should be created and pointed to by the `data_meta_path` variable. We provide an example meta file for the *Falcon Refined-web* dataset [here](../data_example/PretrainMeta.json).
 + The elements in the meta file should be either absolute paths, or paths relative to `data_root`.
@@ -18,7 +18,7 @@ For more efficient token utilization, the packed dataset is supported in [data/f
 python -u tools/generate_packed_data.py
 ```
 
-An example for pre-training with the packed dataset is provided in [exps/pretrain/13B_packed.sh](../accessory/exps/pretrain/13B_packed.sh). Similar to the case of the vanilla dataset, you also need to create a meta file and point `data_meta_path` to it. If you use our `generate_packed_dataset.py` to preprocess data, elements in the meta file should end with `.pkl` (See [here](../data_example/PretrainMetaPacked.json) for example).
+An example of pre-training with the packed dataset is provided in [exps/pretrain/13B_packed.sh](../accessory/exps/pretrain/13B_packed.sh). Similar to the case of the vanilla dataset, you also need to create a meta file and point `data_meta_path` to it. If you use our `generate_packed_dataset.py` to preprocess data, elements in the meta file should end with `.pkl` (See [here](../data_example/PretrainMetaPacked.json) for example).
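The prerequisites hunk in `docs/finetune.md` above boils down to editing two variables in the provided `.sh` experiment scripts. The snippet below is a hypothetical sketch rather than an excerpt of any actual script in the repository: the checkpoint path is a placeholder, and only the name `pretrained_path` and the value `pretrained_type=meta_ori` come from the documentation itself.

```sh
# Hypothetical sketch of the two variables adjusted in an exps/finetune/*.sh script.
# The directory is a placeholder; it should contain the checkpoints to fine-tune from
# (e.g. consolidated.00.pth, ... for META's original LLaMA / LLaMA2 release).
pretrained_path=/path/to/llama-2-7b
pretrained_type=meta_ori   # use this value when fine-tuning from META's original checkpoints
```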
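The `docs/pretrain.md` hunks above revolve around a meta file listing the pre-training data files. The sketch below is purely illustrative (the file names and directories are invented, and the repository's own `data_example/PretrainMeta.json` / `PretrainMetaPacked.json` remain the authoritative examples): it writes a JSON list of `.parquet` shards, given either as absolute paths or as paths relative to `data_root`, and points `data_meta_path` at it. For the packed dataset produced by `tools/generate_packed_data.py`, the entries would end in `.pkl` instead.

```sh
# Hypothetical sketch: create a meta file listing .parquet shards (absolute paths,
# or paths relative to data_root), then reference it from the pre-training script.
cat > /data/meta/my_pretrain_meta.json << 'EOF'
[
    "falcon-refinedweb/part-00000.parquet",
    "falcon-refinedweb/part-00001.parquet"
]
EOF

# Variables described in docs/pretrain.md (the values here are placeholders):
data_meta_path=/data/meta/my_pretrain_meta.json
data_root=/data/datasets
```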