add solar pro support #9541

mxyng · 2024-09-18T22:38:06Z

solar pro introduces block skip connections where blocks are connected to other, non-sequential blocks with a scale multiple

this change adds 4 new keys to store the skip connections and one new tensor to store the scalar. the scalar is implemented as a 1-dimensional tensor with 2 elements derived from the model's bskcn_tv configuration. in general, the values are (bskcn_tv, 1 - bskcn_tv)
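
As a rough illustration of the description above, the sketch below builds the two-element scale pair and applies it as a weighted sum of a saved block input and the current one. The helper names `bskcn_scales` and `blend` are hypothetical and not part of the PR, and plain Python lists stand in for tensors. The weighted-sum form is an assumption inferred from the description, not a copy of the graph code.

```python
def bskcn_scales(bskcn_tv):
    """Return the pair (bskcn_tv, 1 - bskcn_tv) described in the PR text."""
    return (bskcn_tv, 1.0 - bskcn_tv)


def blend(skip, current, scales):
    """Weighted sum of a saved activation and the current one.

    Assumption: the first scale weights the skipped (saved) input and the
    second weights the current input.
    """
    a, b = scales
    return [a * s + b * c for s, c in zip(skip, current)]


scales = bskcn_scales(0.75)                    # illustrative value only
mixed = blend([4.0, 8.0], [0.0, 4.0], scales)  # element-wise 0.75*skip + 0.25*current
```

Storing both weights in one tensor means the inference side only has to read elements 0 and 1 rather than compute `1 - bskcn_tv` in the graph.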

slaren · 2024-09-20T00:08:14Z

src/llama.cpp

        }
    }
+
+    bool n_bskcn(uint32_t n, uint32_t il = 0) const {


The n_ prefix implies that this returns an integer, however it returns a boolean.

SteelPh0enix · 2024-09-25T17:46:41Z

is this PR active and maintained?
it'd be nice to see this merged

vignesh1507

I agree with the changes.

compilade · 2024-10-06T20:13:20Z

convert_hf_to_gguf.py

+    def prepare_tensors(self):
+        if bskcn_tv := self.find_hparam(['bskcn_tv'], optional=True):
+          # use bskcn_tv[1] for inference since bskcn_tv[0] is for training
+          self.gguf_writer.add_tensor(self.format_tensor_name(gguf.MODEL_TENSOR.BSKCN_TV), np.array([bskcn_tv[1], 1 - bskcn_tv[1]], dtype=np.float32))
+
+        super().prepare_tensors()


I think this should override generate_extra_tensors instead of prepare_tensors. Otherwise LoRA conversion will not work properly, at least since #9396.
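
A minimal sketch of what that suggestion could look like, assuming the converter's base class exposes a `generate_extra_tensors` hook that yields `(name, tensor)` pairs. `BaseModelSketch` and `SolarSketch` are hypothetical stand-ins so the example runs on its own. The tensor name string is a placeholder for whatever `format_tensor_name(gguf.MODEL_TENSOR.BSKCN_TV)` produces, and a plain list stands in for the real tensor type.

```python
class BaseModelSketch:
    """Hypothetical stand-in for the converter's model base class."""

    def __init__(self, hparams):
        self.hparams = hparams

    def find_hparam(self, keys, optional=False):
        # Return the first matching hyperparameter, or None when optional.
        for key in keys:
            if key in self.hparams:
                return self.hparams[key]
        if optional:
            return None
        raise KeyError(f"could not find any of: {keys}")

    def generate_extra_tensors(self):
        # The base class contributes no extra tensors in this sketch.
        return ()


class SolarSketch(BaseModelSketch):
    def generate_extra_tensors(self):
        bskcn_tv = self.find_hparam(["bskcn_tv"], optional=True)
        if bskcn_tv:
            # Per the comment in the diff: bskcn_tv[1] is the inference
            # value, bskcn_tv[0] is for training.
            yield ("bskcn_tv", [bskcn_tv[1], 1 - bskcn_tv[1]])
        yield from super().generate_extra_tensors()


extra = list(SolarSketch({"bskcn_tv": [0.9, 0.75]}).generate_extra_tensors())
```

With this shape the extra tensor is produced lazily alongside the other tensors instead of being written as a side effect of `prepare_tensors`.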

compilade · 2024-10-06T20:19:40Z

src/llama.cpp

+            if (hparams.n_bskcn(2, il)) {
+                inpSA = ggml_add(
+                   ctx0,
+                   ggml_mul(ctx0, bskcn_1, ggml_view_1d(ctx0, model.layers[il].bskcn_tv, 1, 0)),


bskcn_1 is not necessarily initialized here, because a model file could be crafted to make hparams.n_bskcn(2, il) return true while making hparams.n_bskcn(1, il) always return false.
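
One way to guard against this is to validate the masks at load time. The sketch below is written in Python for brevity and is not the PR's code. It assumes each skip connection is stored as a per-layer 0/1 list (as the converter writes them), that one connection saves an activation and another consumes it, and that within a layer the save runs before the use. The function name `uses_without_source` is hypothetical.

```python
def uses_without_source(save_mask, use_mask):
    """Return layer indices that consume a saved activation before any
    layer has saved one.

    Assumption: within a single layer, the save happens before the use.
    """
    saved = False
    bad = []
    for il, (save, use) in enumerate(zip(save_mask, use_mask)):
        if save:
            saved = True
        if use and not saved:
            bad.append(il)
    return bad


# A crafted file: the consuming mask is set at layer 2, but nothing is ever saved.
crafted = uses_without_source([0, 0, 0, 0], [0, 0, 1, 0])
# A consistent file: layer 1 saves, layer 2 consumes.
consistent = uses_without_source([0, 1, 0, 0], [0, 0, 1, 0])
```

A loader could reject the model when the returned list is non-empty, rather than dereferencing an unset tensor while building the graph.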

compilade · 2024-10-06T20:30:44Z

convert_hf_to_gguf.py

+        for i, bskcn in enumerate(self.hparams[k] for k in self.hparams.keys() if k.startswith("bskcn_") and k != 'bskcn_tv'):
+            # store the skip connections as a layer index where a non-zero value indicates a skip connection
+            # this approach simplifies lookup at inference time
+            self.gguf_writer.add_block_skip_connection(i, [1 if n in bskcn else 0 for n in range(self.block_count)])


This assumes bskcn_{n} are in the correct order in config.json. Why not instead iterate them by their names?
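
A sketch of iterating by name instead, assuming the keys follow the pattern `bskcn_<number>` and that `bskcn_tv` should be excluded. The helper names and the example config values are hypothetical. How the numeric suffix maps to the index passed to `add_block_skip_connection` is left to the caller.

```python
import re


def ordered_skip_connections(hparams):
    """Return (n, layers) pairs for keys named bskcn_<n>, sorted by n."""
    pattern = re.compile(r"bskcn_(\d+)")
    found = []
    for key, value in hparams.items():
        match = pattern.fullmatch(key)
        if match:  # skips bskcn_tv and unrelated keys
            found.append((int(match.group(1)), value))
    return sorted(found, key=lambda pair: pair[0])


def layer_mask(layers, block_count):
    """Per-layer list where a non-zero value marks a skip connection."""
    return [1 if n in layers else 0 for n in range(block_count)]


# Illustrative config with keys deliberately out of order.
cfg = {"bskcn_3": [4], "bskcn_1": [1], "bskcn_tv": [0.9, 0.8], "bskcn_2": [2, 3]}
masks = [(n, layer_mask(layers, 5)) for n, layers in ordered_skip_connections(cfg)]
```

Sorting on the parsed integer rather than the key string also keeps a suffix such as 10 after 9, which a plain string sort would not.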

Nexesenex · 2024-10-13T17:50:54Z

@mxyng Is this PR still on?

github-actions bot added the python (python script changes) label Sep 18, 2024

slaren reviewed Sep 20, 2024

vignesh1507 approved these changes Oct 6, 2024

compilade reviewed Oct 6, 2024

brankoradovanovic-mcom mentioned this pull request Oct 12, 2024

Upstage Solar Pro Preview model is not supported nomic-ai/gpt4all#2960

Open

mxyng closed this by deleting the head repository Dec 2, 2024

6 participants
