Skip to content

Conversation

tybalex
Copy link
Collaborator

@tybalex tybalex commented Oct 23, 2024

  • Self Reported Review Complexity:
    • Review Complexity : Low
    • Review Complexity : Medium
    • Review Complexity : High
  • I have read the contributing guidelines

slaren and others added 30 commits September 10, 2024 18:04
* llama_sampler_penalties : clamp penalty_last_n to zero
* arg : bring back missing ifdef

* replace with llama_supports_gpu_offload
Flake lock file updates:

• Updated input 'flake-parts':
    'github:hercules-ci/flake-parts/af510d4a62d071ea13925ce41c95e3dec816c01d?narHash=sha256-ODYRm8zHfLTH3soTFWE452ydPYz2iTvr9T8ftDMUQ3E%3D' (2024-08-30)
  → 'github:hercules-ci/flake-parts/567b938d64d4b4112ee253b9274472dc3a346eb6?narHash=sha256-%2Bebgonl3NbiKD2UD0x4BszCZQ6sTfL4xioaM49o5B3Y%3D' (2024-09-01)
• Updated input 'flake-parts/nixpkgs-lib':
    'https://github.com/NixOS/nixpkgs/archive/a5d394176e64ab29c852d03346c1fc9b0b7d33eb.tar.gz?narHash=sha256-uFf2QeW7eAHlYXuDktm9c25OxOyCoUOQmh5SZ9amE5Q%3D' (2024-08-01)
  → 'https://github.com/NixOS/nixpkgs/archive/356624c12086a18f2ea2825fed34523d60ccc4e3.tar.gz?narHash=sha256-Ss8QWLXdr2JCBPcYChJhz4xJm%2Bh/xjl4G0c0XlP6a74%3D' (2024-09-01)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/71e91c409d1e654808b2621f28a327acfdad8dc2?narHash=sha256-GnR7/ibgIH1vhoy8cYdmXE6iyZqKqFxQSVkFgosBh6w%3D' (2024-08-28)
  → 'github:NixOS/nixpkgs/574d1eac1c200690e27b8eb4e24887f8df7ac27c?narHash=sha256-v3rIhsJBOMLR8e/RNWxr828tB%2BWywYIoajrZKFM%2B0Gg%3D' (2024-09-06)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* sycl : update support condition to im2col

Signed-off-by: Alberto Cabrera <[email protected]>

* Added TODO to remind supporting FP32 im2col

---------

Signed-off-by: Alberto Cabrera <[email protected]>
…url flag (#9255)

* feat: Implements retrying logic for downloading models using --model-url flag

* Update common/common.cpp

Co-authored-by: Xuan Son Nguyen <[email protected]>

* Update common/common.cpp

Co-authored-by: Xuan Son Nguyen <[email protected]>

* apply comments

* implements a retry function to avoid duplication

* fix editorconfig

* change function name

---------

Co-authored-by: farbod <[email protected]>
Co-authored-by: Xuan Son Nguyen <[email protected]>
Co-authored-by: slaren <[email protected]>
Co-authored-by: Xuan Son Nguyen <[email protected]>
* Support of converting local models added to convert-hf-to-gguf-update.py

* Description fixed

* shutil added to imports
* ggml : hide ggml_object, ggml_cgraph, ggml_hash_set

ggml-ci

* ggml : add ggml-impl.h to backends

* ggml : fix compiler warnings

ggml-ci

* ggml : add assert upon adding nodes
- Added ggml_cpu_has_riscv_v() in GGML to print system info in log
- Modified Makefile to only use flag when cross compiling for RISC-V
* add phi2 tokenizer

* add phi name to convert_hf_to_gguf_update.py

* make tokenizer_pre consistent; llama.cpp work
* `GGML_TARGET_DEFINES-NOTFOUND` fix for builds without `GGML_CDEF_PUBLIC`

* Update CMakeLists.txt, spaces fix
* lora : raise error if lm_head is ignored

* fix style

* clarify comment
* feat: Add host buffer type for Ascend NPU(CANN backend)

* fix some checking errors

* Add a few comments
* server : added with_pieces functionality to /tokenize endpoint

* server : Add tokenize with pieces tests to server.feature

* Handle case if tokenizer splits along utf8 continuation bytes

* Add example of token splitting

* Remove trailing ws

* Fix trailing ws

* Maybe fix ci

* maybe this fix windows ci?

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>
* feat: remove a sampler from a chain

* fix: return removed sampler

* fix: safer casting
* llama : llama_perf + option to disable timings during decode

ggml-ci

* common : add llama_arg

* Update src/llama.cpp

Co-authored-by: Xuan Son Nguyen <[email protected]>

* perf : separate functions in the API

ggml-ci

* perf : safer pointer handling + naming update

ggml-ci

* minor : better local var name

* perf : abort on invalid sampler pointer

ggml-ci

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>
MollySophia and others added 13 commits October 22, 2024 13:33
* Add chat template for RWKV-World

Signed-off-by: Molly Sophia <[email protected]>

* RWKV: Fix the chat template not being used

Signed-off-by: Molly Sophia <[email protected]>

* RWKV v6: Set EOT token to ``\n\n``

Signed-off-by: Molly Sophia <[email protected]>

* readme: add rwkv into supported model list

Signed-off-by: Molly Sophia <[email protected]>

---------

Signed-off-by: Molly Sophia <[email protected]>
* llama: remove useless template matching for rwkv-world

Signed-off-by: Molly Sophia <[email protected]>

* converter: Add comment about the hack for rwkv models

Signed-off-by: Molly Sophia <[email protected]>

* Update src/llama.cpp

Co-authored-by: Xuan Son Nguyen <[email protected]>

---------

Signed-off-by: Molly Sophia <[email protected]>
Co-authored-by: Xuan Son Nguyen <[email protected]>
This commit renames the member field batch in llm_build_context to
ubatch, and also the parameter batch in llama_build_graph, and
llama_set_inputs to ubatch.

The motivation for this change is to make the code more readable
(considering there are the structs llama_batch and llama_sbatch), and
consistent with other parts of the code base where parameters/fields of
type llama_ubatch are named ubatch.
* llama : fix empty batch cause llama_batch_allocr to crash

* move batch_allocr inside decode/encode_internal

* fix build

* add GGML_ASSERT

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/5633bcff0c6162b9e4b5f1264264611e950c8ec7?narHash=sha256-9UTxR8eukdg%2BXZeHgxW5hQA9fIKHsKCdOIUycTryeVw%3D' (2024-10-09)
  → 'github:NixOS/nixpkgs/4c2fcb090b1f3e5b47eaa7bd33913b574a11e0a0?narHash=sha256-/uilDXvCIEs3C9l73JTACm4quuHUsIHcns1c%2BcHUJwA%3D' (2024-10-18)
* add pool_2d

Signed-off-by: Junhee Yoo <[email protected]>

* fix im2col and add unittest for N>=1024

Signed-off-by: Junhee Yoo <[email protected]>

* add tests for N % 1024 != 0

Signed-off-by: Junhee Yoo <[email protected]>

* remove trailing whitespaces

Signed-off-by: Junhee Yoo <[email protected]>

* apply suggestions

Signed-off-by: Junhee Yoo <[email protected]>

* apply more optimization

- original IM2COL kernel + _ext with MIN()

Signed-off-by: Junhee Yoo <[email protected]>

* apply review: change kernel name of pool_2d

Signed-off-by: Junhee Yoo <[email protected]>

* apply review

Signed-off-by: Junhee Yoo <[email protected]>

* fix more formatting and enhance readability

Signed-off-by: Junhee Yoo <[email protected]>

---------

Signed-off-by: Junhee Yoo <[email protected]>
* added classic vim support

* fixed ring update, removed blank line

* minor

* minor

* minor doc update

* removed uneeded var

* minor

* minor

* fixed job_start creating new scratch buffers

* fixed job_start creating new scratch buffers

* fixed ghost text indenting when expandtab is on

* removed unused code

* minor

* unified fim_on_exit

* minor

* vim ghost text rendering now uses pos_x and pos_y parameters

* renamed *_hlgroup to hlgroup_*

* renamed *_ghost_text to ghost_text_*, moved nvim/vim detection to llama#init()

* minor

---------

Co-authored-by: Michael Coppola <[email protected]>
This commit removes the setting of the `used` field of the contexts in
the global state (g_state) in `ggml_init`.

The motivation for this change is that I believe that this additional
initialization might not be required after the changes in Commit
45fc4fe ("sync : latest changes from
whisper.cpp"), which changed the initialization of the contexts field
from `{ 0 }` to `{ { 0 } }`:

```console
             g_state = (struct ggml_state) {
-                /*.contexts =*/ { 0 },
+                /*.contexts =*/ { { 0 } },
             };
```
My understanding is that the `{0}` initialization might not have
zero-initialized all the nested fields in every array element because of
compiler differences, and might have been the reason for having the
explicit setting of the `used` fields to false.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.