
Roo crashes TabbyApi/ExLlamaV2 after v3.25.20 #7581

@drknyt

Description

App Version

v3.25.20

API Provider

OpenAI Compatible

Model Used

Devstral-Small-2507

Roo Code Task Links (Optional)

No response

πŸ” Steps to Reproduce

  1. Update Roo Code to any version higher than v3.25.20.
  2. Set up an OpenAI compatible model in the Configuration Profile. The server is a local machine on the same network running a TabbyApi instance, which serves a model via the ExLlamaV2 engine on an OpenAI compatible endpoint (the sketch after this list shows the equivalent raw request).
  3. Run any task.
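
For context, the profile amounts to a plain OpenAI-style chat completion call against the local server. A minimal sketch of the equivalent request, assuming TabbyApi's `/v1` prefix and placeholder values for the LAN address and API key:

```python
# Minimal sketch of the request Roo issues through the "OpenAI Compatible"
# profile. Host, port, and key below are placeholders for the local TabbyApi box.
from openai import OpenAI

client = OpenAI(
    base_url="http://192.168.1.50:5000/v1",  # assumed LAN address of the TabbyApi server
    api_key="tabby-api-key",                 # assumed; TabbyApi issues its own keys
)

# Stream a completion from the served model, as Roo does during a task.
stream = client.chat.completions.create(
    model="Devstral-Small-2507",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```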

πŸ’₯ Outcome Summary

Expected a normal completion but received the following error:
API Request Failed.
Chat completion aborted. Please check the server console.

TabbyApi server logs posted below.

Release v3.25.21 seems to have introduced a change in the way Roo communicates with an OpenAI compatible endpoint. Downgrading to v3.25.20 works, but any version beyond that throws the error above.
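
To confirm it is the request payload that changed rather than anything on the TabbyApi side, one option is to point both Roo versions at a throwaway endpoint that dumps the JSON body, then diff the two captures. A rough sketch (the port and the placeholder reply are arbitrary choices, not part of either product):

```python
# Tiny echo server to capture the exact JSON body each Roo version sends.
# Point the "OpenAI Compatible" profile at http://<host>:8000/v1 and diff
# the dumps produced under v3.25.20 and v3.25.21.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class Dump(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        # Sorted, pretty-printed dump makes the version-to-version diff clean.
        print(json.dumps(json.loads(body), indent=2, sort_keys=True))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{"choices": []}')  # placeholder reply, not a real completion

HTTPServer(("", 8000), Dump).serve_forever()
```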

πŸ“„ Relevant Logs or Errors (Optional)

2025-09-01 16:35:24.417 INFO:     # Current Workspace Directory (c:/repo/novel-writer) Files
2025-09-01 16:35:24.417 INFO:     .vscode/
2025-09-01 16:35:24.417 INFO:     documentation/
2025-09-01 16:35:24.417 INFO:     project_details/
2025-09-01 16:35:24.417 INFO:     prompt_plan/
2025-09-01 16:35:24.417 INFO:     public/
2025-09-01 16:35:24.417 INFO:     roadmap/
2025-09-01 16:35:24.417 INFO:     src/
2025-09-01 16:35:24.417 INFO:     src/components/
2025-09-01 16:35:24.417 INFO:     src/components/Breadcrumbs/
2025-09-01 16:35:24.417 INFO:     src/components/CharacterDetails/
2025-09-01 16:35:24.417 INFO:     src/components/ColorSchemeToggle/
2025-09-01 16:35:24.417 INFO:     src/components/CommitModal/
2025-09-01 16:35:24.417 INFO:     src/components/CustomParagraph/
2025-09-01 16:35:24.417 INFO:     src/components/ExportComponent/
2025-09-01 16:35:24.417 INFO:     src/components/FileImportModal/
2025-09-01 16:35:24.417 INFO:     
2025-09-01 16:35:24.417 INFO:     (File list truncated. Use list_files on specific subdirectories if you need to explore further.)
2025-09-01 16:35:24.417 INFO:     You have not created a todo list yet. Create one with `update_todo_list` if your task is complicated or involves multiple steps.
2025-09-01 16:35:24.417 INFO:     </environment_details>[/INST]
2025-09-01 16:35:30.013 ERROR:    FATAL ERROR with generation. Attempting to recreate the generator. If this fails, please restart the server.
2025-09-01 16:35:30.013 WARNING:  Immediately terminating all jobs. Clients will have their requests cancelled.
2025-09-01 16:35:30.016 ERROR:    Traceback (most recent call last):
2025-09-01 16:35:30.016 ERROR:      File "/app/endpoints/OAI/utils/chat_completion.py", line 376, in stream_generate_chat_completion
2025-09-01 16:35:30.016 ERROR:        raise generation
2025-09-01 16:35:30.016 ERROR:      File "/app/endpoints/OAI/utils/completion.py", line 118, in _stream_collector
2025-09-01 16:35:30.016 ERROR:        async for generation in new_generation:
2025-09-01 16:35:30.016 ERROR:      File "/app/backends/exllamav2/model.py", line 977, in stream_generate
2025-09-01 16:35:30.016 ERROR:        async for generation_chunk in self.generate_gen(
2025-09-01 16:35:30.016 ERROR:      File "/app/backends/exllamav2/model.py", line 1465, in generate_gen
2025-09-01 16:35:30.016 ERROR:        raise ex
2025-09-01 16:35:30.016 ERROR:      File "/app/backends/exllamav2/model.py", line 1403, in generate_gen
2025-09-01 16:35:30.016 ERROR:        async for result in job:
2025-09-01 16:35:30.016 ERROR:      File "/opt/venv/lib/python3.12/site-packages/exllamav2/generator/dynamic_async.py", line 97, in __aiter__
2025-09-01 16:35:30.016 ERROR:        raise result
2025-09-01 16:35:30.016 ERROR:      File "/opt/venv/lib/python3.12/site-packages/exllamav2/generator/dynamic_async.py", line 28, in _run_iteration
2025-09-01 16:35:30.016 ERROR:        results = self.generator.iterate()
2025-09-01 16:35:30.016 ERROR:                  ^^^^^^^^^^^^^^^^^^^^^^^^
2025-09-01 16:35:30.016 ERROR:      File "/opt/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
2025-09-01 16:35:30.016 ERROR:        return func(*args, **kwargs)
2025-09-01 16:35:30.016 ERROR:               ^^^^^^^^^^^^^^^^^^^^^
2025-09-01 16:35:30.016 ERROR:      File "/opt/venv/lib/python3.12/site-packages/exllamav2/generator/dynamic.py", line 1002, in iterate
2025-09-01 16:35:30.016 ERROR:        self.iterate_gen(results)
2025-09-01 16:35:30.016 ERROR:      File "/opt/venv/lib/python3.12/site-packages/exllamav2/generator/dynamic.py", line 1251, in iterate_gen
2025-09-01 16:35:30.016 ERROR:        job.receive_logits(job_logits)
2025-09-01 16:35:30.016 ERROR:      File "/opt/venv/lib/python3.12/site-packages/exllamav2/generator/dynamic.py", line 1888, in receive_logits
2025-09-01 16:35:30.016 ERROR:        ExLlamaV2Sampler.sample(
2025-09-01 16:35:30.016 ERROR:      File "/opt/venv/lib/python3.12/site-packages/exllamav2/generator/sampler.py", line 540, in sample
2025-09-01 16:35:30.016 ERROR:        m = ext_c.sample_basic(
2025-09-01 16:35:30.016 ERROR:            ^^^^^^^^^^^^^^^^^^^
2025-09-01 16:35:30.016 ERROR:    TypeError: sample_basic(): incompatible function arguments. The following argument types are supported:
2025-09-01 16:35:30.016 ERROR:        1. (arg0: torch.Tensor, arg1: float, arg2: int, arg3: float, arg4: float, arg5: float, arg6: float, arg7: float, arg8: float, arg9: torch.Tensor, arg10: torch.Tensor, arg11: torch.Tensor, arg12: torch.Tensor, arg13: torch.Tensor, arg14: bool, arg15: list[float], arg16: float, arg17: float, arg18: float, arg19: torch.Tensor, arg20: float, arg21: float, arg22: float, arg23: float, arg24: float, arg25: float, arg26: float) -> list[float]
2025-09-01 16:35:30.016 ERROR:    
2025-09-01 16:35:30.016 ERROR:    Invoked with: tensor([[[4.5430, 4.5859, 1.3486,  ..., 4.1016, 3.3926, 4.9648]]]), None, 0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.8176540387510759, tensor([[841435730]]), tensor([[2.2581e+33]]), tensor(..., device='meta', size=(1, 1)), tensor(..., device='meta', size=(1, 1)), tensor(..., device='meta', size=(1, 1)), False, [], 1.5, 0.3, 1.0, tensor(..., device='meta', size=(1, 1)), 0.0, 0.1, 1.0, 1.0, 1.0, 0.0, 0.0
2025-09-01 16:35:30.023 ERROR:    Sent to request: Chat completion aborted. Please check the server console.
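
Worth noting: in the `Invoked with:` line above, the second positional argument is `None` where the binding expects `arg1: float`, so one plausible trigger is the newer Roo build sending a sampling parameter as JSON `null` (e.g. `"temperature": null`) that TabbyApi forwards to the sampler unvalidated. A hypothetical repro sketch for that theory; the address, key, and the choice of `temperature` as the culprit are assumptions, not confirmed:

```python
# Hypothetical repro: send an explicit null temperature and see whether
# TabbyApi hits the same sample_basic() TypeError. Placeholder address/key.
import requests

resp = requests.post(
    "http://192.168.1.50:5000/v1/chat/completions",
    headers={"Authorization": "Bearer tabby-api-key"},
    json={
        "model": "Devstral-Small-2507",
        "messages": [{"role": "user", "content": "Hello"}],
        "temperature": None,  # valid JSON, but None where the sampler wants a float
        "stream": False,
    },
    timeout=120,
)
print(resp.status_code, resp.text[:200])
```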

Metadata

    Labels

    Issue - In Progress (Someone is actively working on this. Should link to a PR soon.)
    bug (Something isn't working)

    Projects

    Status

    Done
