diff --git a/docs/android.md b/doc/android.md
similarity index 100%
rename from docs/android.md
rename to doc/android.md
diff --git a/docs/backend/BLIS.md b/doc/backend/BLIS.md
similarity index 100%
rename from docs/backend/BLIS.md
rename to doc/backend/BLIS.md
diff --git a/docs/backend/CANN.md b/doc/backend/CANN.md
similarity index 99%
rename from docs/backend/CANN.md
rename to doc/backend/CANN.md
index 23f10175a6b2d..de84c9e465ff8 100644
--- a/docs/backend/CANN.md
+++ b/doc/backend/CANN.md
@@ -15,7 +15,7 @@
 **Ascend NPU** is a range of AI processors using Neural Processing Unit. It will efficiently handle matrix-matrix multiplication, dot-product and scalars.
-**CANN** (Compute Architecture for Neural Networks) is a heterogeneous computing architecture for AI scenarios, providing support for multiple AI frameworks on the top and serving AI processors and programming at the bottom. It plays a crucial role in bridging the gap between upper and lower layers, and is a key platform for improving the computing efficiency of Ascend AI processors. Meanwhile, it offers a highly efficient and easy-to-use programming interface for diverse application scenarios, allowing users to rapidly build AI applications and services based on the Ascend platform.
+**CANN** (Compute Architecture for Neural Networks) is a heterogeneous computing architecture for AI scenarios, providing support for multiple AI frameworks on the top and serving AI processors and programming at the bottom. It plays a crucial role in bridging the gap between upper and lower layers, and is a key platform for improving the computing efficiency of Ascend AI processors. Meanwhile, it offers a highly efficient and easy-to-use programming interface for diverse application scenarios, allowing users to rapidly build AI applications and services based on the Alsscend platform.
 **Llama.cpp + CANN**
diff --git a/docs/backend/SYCL.md b/doc/backend/SYCL.md
similarity index 100%
rename from docs/backend/SYCL.md
rename to doc/backend/SYCL.md
diff --git a/docs/build.md b/doc/build.md
similarity index 100%
rename from docs/build.md
rename to doc/build.md
diff --git a/docs/development/HOWTO-add-model.md b/doc/development/HOWTO-add-model.md
similarity index 100%
rename from docs/development/HOWTO-add-model.md
rename to doc/development/HOWTO-add-model.md
diff --git a/docs/development/debugging-tests.md b/doc/development/debugging-tests.md
similarity index 100%
rename from docs/development/debugging-tests.md
rename to doc/development/debugging-tests.md
diff --git a/docs/development/llama-star/idea-arch.key b/doc/development/llama-star/idea-arch.key
similarity index 100%
rename from docs/development/llama-star/idea-arch.key
rename to doc/development/llama-star/idea-arch.key
diff --git a/docs/development/llama-star/idea-arch.pdf b/doc/development/llama-star/idea-arch.pdf
similarity index 100%
rename from docs/development/llama-star/idea-arch.pdf
rename to doc/development/llama-star/idea-arch.pdf
diff --git a/docs/development/token_generation_performance_tips.md b/doc/development/token_generation_performance_tips.md
similarity index 100%
rename from docs/development/token_generation_performance_tips.md
rename to doc/development/token_generation_performance_tips.md
diff --git a/docs/docker.md b/doc/docker.md
similarity index 100%
rename from docs/docker.md
rename to doc/docker.md
diff --git a/docs/install.md b/doc/install.md
similarity index 100%
rename from docs/install.md
rename to doc/install.md
diff --git a/doc/issue.md b/doc/issue.md
new file mode 100644
index 0000000000000..f500be25a2dd4
--- /dev/null
+++ b/doc/issue.md
@@ -0,0 +1,4 @@
+# git
+
+if you modify doc/build.md, you cannot git add it, because it is in .gitignore file
+
diff --git a/doc/story.md b/doc/story.md
new file mode 100644
index 0000000000000..1578fba6433e4
--- /dev/null
+++ b/doc/story.md
@@ -0,0 +1,47 @@
+# Compile
+
+doc/build.md
+
+# model
+
+[prithivMLmods/Llama-Deepsync-1B-GGUF](https://huggingface.co/prithivMLmods/Llama-Deepsync-1B-GGUF)
+
+# Run model
+
+./build/bin/llama-cli -m models/Llama-Deepsync-1B.Q4_K_M.gguf -p "what's your name"
+
+# Specific parameter
+
+cd /home/xunchan/Workspace/llama.xunchan/gguf-py/examples
+
+python reader.py /home/xunchan/Workspace/llama.xunchan/models/Llama-Deepsync-1B.Q4_K_M.gguf
+
+or
+
+show in [huggingface](https://huggingface.co/) frontend
+
+## general.architecture
+
+llama
+
+## llama
+
+### llama.block_count
+
+### llama.context_length
+
+### llama.embedding_length
+
+### llama.feed_forward_length
+
+###
+
+#### block
+
+# Modify code
+
+## Find the llama block code process location
+
+## Write it precisely
+
+##
\ No newline at end of file
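
The `# Specific parameter` notes added in `doc/story.md` list GGUF metadata keys such as `general.architecture` and `llama.block_count`. As a rough illustration of where those values live in the model file, the following is a minimal sketch that reads the key/value header of a GGUF file with only the Python standard library. It assumes the little-endian GGUF v2/v3 layout (magic, version, tensor count, key/value count, then typed key/value pairs). The name `read_gguf_metadata` and its helpers are hypothetical; this is not the `reader.py` script from `gguf-py`, which remains the tool the notes actually use.

```python
import struct
from typing import Any, BinaryIO, Dict

# GGUF scalar value-type IDs mapped to struct format codes.
# Assumption: little-endian file, GGUF version 2 or 3.
_SCALARS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "I", 5: "i",
            6: "f", 7: "?", 10: "Q", 11: "q", 12: "d"}
_STRING, _ARRAY = 8, 9


def _unpack(f: BinaryIO, fmt: str) -> Any:
    """Read one little-endian scalar described by a struct format code."""
    size = struct.calcsize("<" + fmt)
    return struct.unpack("<" + fmt, f.read(size))[0]


def _read_string(f: BinaryIO) -> str:
    """GGUF strings are a uint64 byte length followed by UTF-8 bytes."""
    length = _unpack(f, "Q")
    return f.read(length).decode("utf-8")


def _read_value(f: BinaryIO, vtype: int) -> Any:
    """Read a scalar, string, or (possibly nested) array value."""
    if vtype in _SCALARS:
        return _unpack(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        elem_type = _unpack(f, "I")   # type shared by all elements
        count = _unpack(f, "Q")       # number of elements
        return [_read_value(f, elem_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_metadata(f: BinaryIO) -> Dict[str, Any]:
    """Return the metadata key/value pairs from an open GGUF stream."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _unpack(f, "I")
    if version < 2:
        raise ValueError("this sketch only handles GGUF v2 and later")
    _unpack(f, "Q")                   # tensor count, unused here
    kv_count = _unpack(f, "Q")
    meta: Dict[str, Any] = {}
    for _ in range(kv_count):
        key = _read_string(f)
        vtype = _unpack(f, "I")
        meta[key] = _read_value(f, vtype)
    return meta
```

Opening the model from the notes in binary mode, passing the file object to `read_gguf_metadata`, and printing the keys that start with `general.` or `llama.` should surface the same fields the headings in `doc/story.md` enumerate. Note that this sketch loads every array (including tokenizer vocabularies) fully into memory, which is acceptable for inspection but not tuned for large files.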