Conversation
Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
The training fails with a memory error: Error during training: MPS backend out of memory (MPS allocated: 181.43 GiB, other allocations: 107.86 MiB, max allowed: 182.78 GiB). Tried to allocate 1.53 GiB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure). Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
The training fails with the following error on macOS with 128 GB memory: RuntimeError: MPS backend out of memory (MPS allocated: 178.54 GiB, other allocations: 3.84 GiB, max allowed: 182.78 GiB). Tried to allocate 506.25 MiB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure). Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
📝 Walkthrough

Adds a local PyTorch LLM backend, lifecycle hooks (open/close) on AI interfaces, Excel color constants and a default timeout, dependency updates and README setup additions, and bumps the package version.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant CLI as Application / CLI
    participant SbkAI as SbkAI Instance
    participant PyLLM as PyTorchLLM Backend
    participant Model as Local Model (HF / CUDA / MPS / CPU)
    CLI->>SbkAI: open(args)
    SbkAI->>PyLLM: open(args)
    PyLLM->>Model: _initialize_model() / load weights to device
    CLI->>SbkAI: generate(prompt)
    SbkAI->>PyLLM: _generate_text(prompt)
    PyLLM->>Model: tokenize & run generation (sampling params)
    Model-->>PyLLM: tokens / logits
    PyLLM->>PyLLM: _clean_generated_text()
    PyLLM-->>SbkAI: (success, text)
    SbkAI-->>CLI: return generated text
    CLI->>SbkAI: close(args)
    SbkAI->>PyLLM: close(args)
    PyLLM->>Model: optional _train_on_output() / _save_model()
    PyLLM->>Model: free resources
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 10
🤖 Fix all issues with AI agents
In `@README.md`:
- Around line 131-132: The README currently pins a non-existent Python version
in the conda create command ("conda create -n conda-sbk-charts python=3.14 -y");
update that command to use a released interpreter (e.g., python=3.12 or
python=3.13) so environment creation succeeds, and optionally note a range like
"python>=3.12,<3.14" or "python=3.12" for clarity.
In `@requirements.txt`:
- Around line 13-14: The requirements.txt entry for transformers is pinned to a
non-existent version "transformers~=4.57.3"; update that token to a valid
release (e.g., change "transformers~=4.57.3" to "transformers~=4.57.4" or a
later valid version) while leaving "torch~=2.9.1" unchanged so pip can resolve
packages correctly.
In `@src/custom_ai/pytorch_llm/pytorch_llm.py`:
- Around line 100-105: The call to AutoModelForCausalLM.from_pretrained is using
the wrong keyword argument (dtype) so the dtype is ignored; change the parameter
name to torch_dtype when calling
AutoModelForCausalLM.from_pretrained(saved_model_dir, ...) and pass the same
dtype variable as torch_dtype; apply the identical change to the second
occurrence of the same call (the block around the lines referenced 116-121) so
both places use torch_dtype instead of dtype.
- Around line 100-105: Add a new boolean CLI flag (e.g., --pt-trust-remote-code)
to the existing argument parser with default False and store it in a variable
(e.g., pt_trust_remote_code); then replace the hardcoded trust_remote_code=True
in every model load call (the AutoModelForCausalLM.from_pretrained and any other
.from_pretrained usage in pytorch_llm.py) to use that variable
(trust_remote_code=pt_trust_remote_code) so users must explicitly opt in to
remote code execution.
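The two prompts above (use `torch_dtype` instead of `dtype`, and gate `trust_remote_code` behind a CLI flag) can be sketched together. This is a minimal illustration, not the repository's actual code: the flag name `--pt-trust-remote-code` comes from the review text, while the `load_kwargs` wiring and the `float16` value are assumptions.

```python
import argparse

# Hypothetical sketch of the suggested opt-in flag; argparse converts
# --pt-trust-remote-code into the attribute pt_trust_remote_code.
parser = argparse.ArgumentParser()
parser.add_argument("--pt-trust-remote-code", action="store_true", default=False,
                    help="Opt in to executing code shipped in the model repo")
args = parser.parse_args([])  # no flag passed on the command line: stays False

# Keyword arguments as they would be forwarded to from_pretrained():
load_kwargs = {
    "device_map": "auto",
    "torch_dtype": "float16",                       # torch_dtype, not dtype
    "trust_remote_code": args.pt_trust_remote_code  # explicit opt-in only
}
print(load_kwargs["trust_remote_code"])
```

Users who actually need custom model code would then pass `--pt-trust-remote-code` explicitly instead of inheriting a hardcoded `True`.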
- Around line 476-478: The autocast device selection in _train_on_example
currently only checks for 'cuda' and falls back to 'cpu'; update it to mirror
_train_on_output by including 'mps' detection: when forming the torch.autocast
call inside _train_on_example, set device_type to 'mps' if 'mps' is in
str(self.device), otherwise 'cuda' if 'cuda' is in str(self.device), else 'cpu',
and keep passing model_dtype as dtype so mixed-precision works correctly on
Apple Silicon.
- Around line 191-195: In open(), _is_initialized is being set to True
unconditionally after calling _initialize_model(); change the logic so you
capture the return value of _initialize_model() and only set
self._is_initialized = True when that call indicates success (e.g., returned
True), otherwise leave it False and handle the failure (log/error/raise or
return a failure status) so retries are possible; update the open() method to
use the boolean result from _initialize_model() and ensure callers can
observe/initiate retry behavior.
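The guarded-initialization pattern that prompt asks for can be sketched with a toy class. This is a stand-in, not the real `PyTorchLLM` backend: the injected loader simulates a failed first attempt so the retry behavior is observable.

```python
class BackendSketch:
    """Toy stand-in showing the guard: _is_initialized flips only on success."""

    def __init__(self, loader):
        self._loader = loader          # injected so failure can be simulated
        self._is_initialized = False

    def _initialize_model(self) -> bool:
        return self._loader()

    def open(self, args=None) -> bool:
        if not self._is_initialized and self._initialize_model():
            self._is_initialized = True
        return self._is_initialized

attempts = iter([False, True])         # first load fails, retry succeeds
backend = BackendSketch(lambda: next(attempts))
first = backend.open()                 # failure leaves the flag False
second = backend.open()                # so a retry is still possible
print(first, second)
```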
- Around line 279-281: The attention_mask is incorrectly cast to the model's
floating dtype; instead, keep it as an integer tensor and only move it to the
model device (or explicitly cast to torch.long). Update the code that currently
does inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)
to something like inputs['attention_mask'] =
inputs['attention_mask'].to(device=model.device, dtype=torch.long) (or preserve
original integer dtype while moving device), and make the same change in the
_train_on_example function so 'attention_mask' remains an integer (long) tensor
rather than being converted to model_dtype.
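The rule in that prompt (move the mask to the model's device, never cast it to a float dtype) can be demonstrated with a tiny stand-in tensor that just records dtype and device. `FakeTensor` is purely illustrative; it exists so the correct and buggy calls can be compared without torch installed.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class FakeTensor:
    """Illustrative stand-in for a torch tensor: tracks dtype and device."""
    dtype: str
    device: str

    def to(self, device=None, dtype=None):
        return replace(self,
                       device=device if device is not None else self.device,
                       dtype=dtype if dtype is not None else self.dtype)

mask = FakeTensor(dtype="long", device="cpu")
# Correct: move the mask to the model's device, keep the integer dtype.
moved = mask.to(device="mps")
# Incorrect (the reviewed bug): casting the mask to the model's float dtype.
broken = mask.to(dtype="float16")
print(moved.dtype, moved.device, broken.dtype)
```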
- Around line 440-493: The _train_on_example method returns inconsistent types
(float/None in docstring but (False, str) on exception); change the except block
to return None (or re-raise the exception) and log the error instead of
returning a tuple, and update the method docstring to clearly state it returns
float or None on failure; specifically modify the except block in
_train_on_example to call your logger (e.g., self.logger.error or similar) with
the exception message and then return None, removing the (False, str) tuple.
- Line 123: The call self.model = self.model.to(self.device) is redundant and
harmful when using device_map="auto" because the model is already placed across
devices; remove that unconditional .to(...) or guard it so it only runs when
device_map is not "auto" (e.g., check the loaded config/argument for device_map
before applying .to). Update the model-loading path where self.model is assigned
to avoid forcing .to(self.device) when device_map=="auto", leaving placement to
the loader to preserve model parallelism.
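The guard that prompt describes (skip `.to(device)` when the loader placed the model itself) can be sketched as follows. `place_model` and `FakeModel` are hypothetical names for illustration only.

```python
def place_model(model, device, device_map):
    """Only relocate the model when device_map did not already place it."""
    if device_map != "auto":
        return model.to(device)
    return model

class FakeModel:
    def __init__(self):
        self.device = None
    def to(self, device):
        self.device = device
        return self

auto_model = place_model(FakeModel(), "mps", "auto")  # left to the loader
cpu_model = place_model(FakeModel(), "cpu", None)     # explicitly moved
print(auto_model.device, cpu_model.device)
```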
🧹 Nitpick comments (2)
README.md (1)
130-130: Add language specifiers to fenced code blocks.

Per markdownlint MD040, fenced code blocks should have a language specified for proper syntax highlighting.

Proposed fix

````diff
-```
+```bash
 # Create a new conda environment with Python 3.14 or higher
-```
+```bash
 conda deactivate
````

Also applies to: 149-149
src/custom_ai/pytorch_llm/pytorch_llm.py (1)
334-334: Remove unused variable `was_training`.

The variable `was_training` is assigned but never used, as flagged by Ruff F841.

Proposed fix

```diff
-        was_training = self.model.training
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- README.md
- requirements.txt
- src/ai/sbk_ai.py
- src/custom_ai/pytorch_llm/__init__.py
- src/custom_ai/pytorch_llm/pytorch_llm.py
- src/genai/genai.py
- src/main/sbk_charts.py
🧰 Additional context used
🧬 Code graph analysis (4)
src/genai/genai.py (2)
- src/custom_ai/pytorch_llm/pytorch_llm.py (2)
  - open (191-195), close (198-219)
- src/ai/sbk_ai.py (2)
  - open (194-196), close (198-200)

src/ai/sbk_ai.py (4)
- src/custom_ai/pytorch_llm/pytorch_llm.py (3)
  - parse_args (179-188), open (191-195), close (198-219)
- src/genai/genai.py (3)
  - parse_args (55-56), open (58-59), close (61-62)
- src/custom_ai/lm_studio/lm_studio.py (1)
  - parse_args (98-103)
- src/custom_ai/ollama/ollama.py (1)
  - parse_args (113-119)

src/main/sbk_charts.py (3)
- src/custom_ai/pytorch_llm/pytorch_llm.py (2)
  - open (191-195), close (198-219)
- src/ai/sbk_ai.py (2)
  - open (194-196), close (198-200)
- src/genai/genai.py (2)
  - open (58-59), close (61-62)

src/custom_ai/pytorch_llm/pytorch_llm.py (2)
- src/genai/genai.py (13)
  - SbkGenAI (25-356), add_args (52-53), parse_args (55-56), open (58-59), get_model_description (65-73), get_throughput_analysis (327-336), get_throughput_prompt (75-122), get_latency_analysis (339-348), get_latency_prompt (124-215), get_total_mb_analysis (351-352), get_total_mb_prompt (218-247), get_percentile_histogram_analysis (355-356), get_percentile_histogram_prompt (249-323)
- src/ai/sbk_ai.py (3)
  - add_args (141-168), parse_args (171-192), open (194-196)
🪛 markdownlint-cli2 (0.18.1)
README.md
130-130: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
149-149: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.14.11)
src/genai/genai.py
58-59: SbkGenAI.open is an empty method in an abstract base class, but has no abstract decorator
(B027)
61-62: SbkGenAI.close is an empty method in an abstract base class, but has no abstract decorator
(B027)
src/custom_ai/pytorch_llm/pytorch_llm.py
133-133: Consider moving this statement to an else block
(TRY300)
135-135: Do not catch blind exception: Exception
(BLE001)
136-136: Use explicit conversion flag
Replace with conversion flag
(RUF010)
191-191: Unused method argument: args
(ARG002)
198-198: Unused method argument: args
(ARG002)
222-222: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
243-243: Consider moving this statement to an else block
(TRY300)
245-245: Do not catch blind exception: Exception
(BLE001)
246-246: Use explicit conversion flag
Replace with conversion flag
(RUF010)
314-314: Consider moving this statement to an else block
(TRY300)
316-316: Do not catch blind exception: Exception
(BLE001)
317-317: Use explicit conversion flag
Replace with conversion flag
(RUF010)
334-334: Local variable was_training is assigned to but never used
Remove assignment to unused variable was_training
(F841)
389-389: Consider moving this statement to an else block
(TRY300)
391-391: Do not catch blind exception: Exception
(BLE001)
392-392: Use explicit conversion flag
Replace with conversion flag
(RUF010)
490-490: Consider moving this statement to an else block
(TRY300)
492-492: Do not catch blind exception: Exception
(BLE001)
493-493: Use explicit conversion flag
Replace with conversion flag
(RUF010)
500-500: Do not catch blind exception: Exception
(BLE001)
501-501: Use explicit conversion flag
Replace with conversion flag
(RUF010)
508-508: Do not catch blind exception: Exception
(BLE001)
509-509: Use explicit conversion flag
Replace with conversion flag
(RUF010)
516-516: Do not catch blind exception: Exception
(BLE001)
517-517: Use explicit conversion flag
Replace with conversion flag
(RUF010)
524-524: Do not catch blind exception: Exception
(BLE001)
525-525: Use explicit conversion flag
Replace with conversion flag
(RUF010)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (5)
requirements.txt (1)
requirements.txt (1)

8-8: Verify huggingface_hub version compatibility.

The `huggingface_hub` dependency was downgraded from 1.2.3 to 0.34.0. Ensure this version is compatible with the `transformers` version being used, as transformers may have minimum version requirements for huggingface_hub.

src/genai/genai.py (1)
58-62: LGTM - Optional lifecycle hooks.

The `open()` and `close()` methods are intentionally non-abstract to serve as optional lifecycle hooks. Backends like `PyTorchLLM` can override them for resource initialization and cleanup (e.g., model loading, training on close), while other backends can use the default no-op implementation. The Ruff B027 warning can be safely ignored as this is the intended design.

src/ai/sbk_ai.py (2)
54-54: LGTM - Clean timeout constant extraction.

Good refactor to extract `DEFAULT_TIMEOUT_SECONDS` as a module-level constant, making the default value explicit and reusable. The `int()` conversion in `parse_args` properly handles string input from command-line arguments.

Also applies to: 138-138, 187-187
194-200: LGTM - Lifecycle delegation pattern.

The `open()` and `close()` methods correctly delegate to the active AI backend instance when present. This enables proper resource management for backends like `PyTorchLLM` that need initialization and cleanup.

src/custom_ai/pytorch_llm/pytorch_llm.py (1)
37-37: No changes needed.

`openai/gpt-oss-20b` is a legitimate Hugging Face model: OpenAI's open-weight GPT-OSS model (~21B parameters) released under the Apache 2.0 license and available on Hugging Face in multiple formats.

Likely an incorrect or invalid review comment.
```bash
# Create a new conda environment with Python 3.14 or higher
conda create -n conda-sbk-charts python=3.14 -y
```
Python 3.14 does not exist yet.
Python 3.14 is not released (current stable versions are 3.12/3.13). This will cause conda environment creation to fail. Consider using Python 3.12 or 3.13.
Proposed fix

```diff
-# Create a new conda environment with Python 3.14 or higher
-conda create -n conda-sbk-charts python=3.14 -y
+# Create a new conda environment with Python 3.12 or higher
+conda create -n conda-sbk-charts python=3.12 -y
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```bash
# Create a new conda environment with Python 3.12 or higher
conda create -n conda-sbk-charts python=3.12 -y
```
🤖 Prompt for AI Agents
In `@README.md` around lines 131 - 132, The README currently pins a non-existent
Python version in the conda create command ("conda create -n conda-sbk-charts
python=3.14 -y"); update that command to use a released interpreter (e.g.,
python=3.12 or python=3.13) so environment creation succeeds, and optionally
note a range like "python>=3.12,<3.14" or "python=3.12" for clarity.
```text
torch~=2.9.1
transformers~=4.57.3
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Check latest available versions of torch and transformers on PyPI
echo "Checking torch versions:"
curl -s https://pypi.org/pypi/torch/json | jq -r '.releases | keys | .[-15:]' | sort -V
echo -e "\nChecking transformers versions:"
curl -s https://pypi.org/pypi/transformers/json | jq -r '.releases | keys | .[-15:]' | sort -V
```

Repository: kmgowda/sbk-charts
Length of output: 469
Fix invalid transformers version in requirements.txt.
Line 14 specifies transformers~=4.57.3, but this version does not exist on PyPI. Available versions jump from 4.9.2 to 4.57.4. Update to transformers~=4.57.4 or later.
The torch~=2.9.1 version is valid and exists on PyPI.
🤖 Prompt for AI Agents
In `@requirements.txt` around lines 13 - 14, The requirements.txt entry for
transformers is pinned to a non-existent version "transformers~=4.57.3"; update
that token to a valid release (e.g., change "transformers~=4.57.3" to
"transformers~=4.57.4" or a later valid version) while leaving "torch~=2.9.1"
unchanged so pip can resolve packages correctly.
```python
self.model = AutoModelForCausalLM.from_pretrained(
    saved_model_dir,
    device_map="auto",
    dtype=dtype,
    trust_remote_code=True
)
```
Wrong parameter name: use torch_dtype instead of dtype.
The AutoModelForCausalLM.from_pretrained() method expects torch_dtype, not dtype. This will cause the dtype setting to be silently ignored.
Proposed fix

```diff
 self.model = AutoModelForCausalLM.from_pretrained(
     saved_model_dir,
     device_map="auto",
-    dtype=dtype,
+    torch_dtype=dtype,
     trust_remote_code=True
 )
```

Apply the same fix at lines 116-121.
Also applies to: 116-121
🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 100 - 105, The call to
AutoModelForCausalLM.from_pretrained is using the wrong keyword argument (dtype)
so the dtype is ignored; change the parameter name to torch_dtype when calling
AutoModelForCausalLM.from_pretrained(saved_model_dir, ...) and pass the same
dtype variable as torch_dtype; apply the identical change to the second
occurrence of the same call (the block around the lines referenced 116-121) so
both places use torch_dtype instead of dtype.
🧩 Analysis chain
🏁 Script executed:

```bash
# First, let's verify the file exists and get its context
fd src/custom_ai/pytorch_llm/pytorch_llm.py
```

Repository: kmgowda/sbk-charts
Length of output: 44

🏁 Script executed:

```bash
# Search for pytorch_llm files
fd pytorch_llm
# Also search for files containing AutoModelForCausalLM to understand context
rg AutoModelForCausalLM --type py -l
```

Repository: kmgowda/sbk-charts
Length of output: 170

🏁 Script executed:

```bash
# Check file size
wc -l src/custom_ai/pytorch_llm/pytorch_llm.py
# Read the file to understand context
cat -n src/custom_ai/pytorch_llm/pytorch_llm.py
```

Repository: kmgowda/sbk-charts
Length of output: 24290
Add configurable --pt-trust-remote-code flag to control arbitrary code execution.
The code hardcodes trust_remote_code=True when loading models (lines 100-105, 106-109, 112-115, 116-121), which enables arbitrary code execution from model repositories. Add a command-line flag --pt-trust-remote-code to the existing argument parser that defaults to False, allowing users to explicitly opt-in to this security-critical feature only when needed.
🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 100 - 105, Add a new
boolean CLI flag (e.g., --pt-trust-remote-code) to the existing argument parser
with default False and store it in a variable (e.g., pt_trust_remote_code); then
replace the hardcoded trust_remote_code=True in every model load call (the
AutoModelForCausalLM.from_pretrained and any other .from_pretrained usage in
pytorch_llm.py) to use that variable (trust_remote_code=pt_trust_remote_code) so
users must explicitly opt in to remote code execution.
```python
    trust_remote_code=True
)

self.model = self.model.to(self.device)
```
Redundant .to(device) after device_map="auto".
When using device_map="auto", the model is automatically placed on appropriate devices. Calling .to(self.device) afterwards can cause issues with model parallelism and is unnecessary.
Proposed fix

```diff
-        self.model = self.model.to(self.device)
+        # Note: device_map="auto" handles device placement automatically
```

🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` at line 123, The call self.model =
self.model.to(self.device) is redundant and harmful when using device_map="auto"
because the model is already placed across devices; remove that unconditional
.to(...) or guard it so it only runs when device_map is not "auto" (e.g., check
the loaded config/argument for device_map before applying .to). Update the
model-loading path where self.model is assigned to avoid forcing
.to(self.device) when device_map=="auto", leaving placement to the loader to
preserve model parallelism.
```python
# Ensure attention mask is the same dtype as the model
if 'attention_mask' in inputs:
    inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)
```
Incorrect dtype for attention_mask.
Attention masks should remain as integer tensors (long/int), not be cast to the model's floating-point dtype. This can cause errors or incorrect behavior.
Proposed fix

```diff
-# Ensure attention mask is the same dtype as the model
-if 'attention_mask' in inputs:
-    inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)
+# Note: attention_mask should remain as integer type, no conversion needed
```

The same issue exists in _train_on_example at lines 472-474.
🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 279 - 281, The
attention_mask is incorrectly cast to the model's floating dtype; instead, keep
it as an integer tensor and only move it to the model device (or explicitly cast
to torch.long). Update the code that currently does inputs['attention_mask'] =
inputs['attention_mask'].to(dtype=model_dtype) to something like
inputs['attention_mask'] = inputs['attention_mask'].to(device=model.device,
dtype=torch.long) (or preserve original integer dtype while moving device), and
make the same change in the _train_on_example function so 'attention_mask'
remains an integer (long) tensor rather than being converted to model_dtype.
```python
def _train_on_example(self, prompt: str, target: str) -> None:
    """Train the model on a single example.

    Args:
        prompt: The input prompt
        target: The target response

    Returns:
        float: The loss value if successful, None otherwise
    """
    try:
        # Combine prompt and target with separator
        text = f"{prompt}{target}{self.tokenizer.eos_token}"

        # Get model's dtype for consistent typing
        model_dtype = next(self.model.parameters()).dtype

        # Tokenize the input
        inputs = self.tokenizer(
            text,
            return_tensors="pt",
            padding="max_length",
            max_length=self.max_length,
            truncation=True
        )

        # Move inputs to the correct device and type
        inputs = {k: v.to(self.device) for k, v in inputs.items()}

        # Create labels (shifted input) and ensure correct dtype
        labels = inputs["input_ids"].clone()

        # Ensure attention mask is the same dtype as the model
        if 'attention_mask' in inputs:
            inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)

        # Forward pass with autocast for mixed precision training
        with torch.autocast(device_type='cuda' if 'cuda' in str(self.device) else 'cpu',
                            dtype=model_dtype):
            outputs = self.model(
                input_ids=inputs["input_ids"],
                attention_mask=inputs.get("attention_mask", None),
                labels=labels
            )

        # Backward pass and optimize
        loss = outputs.loss
        if loss is not None:
            loss.backward()
            return loss.item()
        return None

    except Exception as e:
        return False, f"Error in text generation: {str(e)}"
```
Inconsistent return type in _train_on_example.
The docstring states the method returns float or None, but line 493 returns a tuple (False, str). This type inconsistency can cause issues for callers.
Proposed fix
```diff
 except Exception as e:
-    return False, f"Error in text generation: {str(e)}"
+    print(f"Error in training: {e!r}")
+    return None
```

🧰 Tools
🪛 Ruff (0.14.11)
490-490: Consider moving this statement to an else block
(TRY300)
492-492: Do not catch blind exception: Exception
(BLE001)
493-493: Use explicit conversion flag
Replace with conversion flag
(RUF010)
🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 440 - 493, The
_train_on_example method returns inconsistent types (float/None in docstring but
(False, str) on exception); change the except block to return None (or re-raise
the exception) and log the error instead of returning a tuple, and update the
method docstring to clearly state it returns float or None on failure;
specifically modify the except block in _train_on_example to call your logger
(e.g., self.logger.error or similar) with the exception message and then return
None, removing the (False, str) tuple.
```python
# Forward pass with autocast for mixed precision training
with torch.autocast(device_type='cuda' if 'cuda' in str(self.device) else 'cpu',
                    dtype=model_dtype):
```
Missing MPS device handling in _train_on_example autocast.
The _train_on_example method doesn't handle 'mps' device type in autocast, unlike _train_on_output (line 292). This inconsistency could cause issues on Apple Silicon.
Proposed fix
```diff
-with torch.autocast(device_type='cuda' if 'cuda' in str(self.device) else 'cpu',
+device_type = 'mps' if 'mps' in str(self.device) else 'cuda' if 'cuda' in str(self.device) else 'cpu'
+with torch.autocast(device_type=device_type,
                     dtype=model_dtype):
```

📝 Committable suggestion
```python
# Forward pass with autocast for mixed precision training
device_type = 'mps' if 'mps' in str(self.device) else 'cuda' if 'cuda' in str(self.device) else 'cpu'
with torch.autocast(device_type=device_type,
                    dtype=model_dtype):
```
🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 476 - 478, The
autocast device selection in _train_on_example currently only checks for 'cuda'
and falls back to 'cpu'; update it to mirror _train_on_output by including 'mps'
detection: when forming the torch.autocast call inside _train_on_example, set
device_type to 'mps' if 'mps' is in str(self.device), otherwise 'cuda' if 'cuda'
is in str(self.device), else 'cpu', and keep passing model_dtype as dtype so
mixed-precision works correctly on Apple Silicon.
```python
ch.parse_args(args)
ch.open(args)
ch.add_performance_details()
ch.close(args)
```
Wrap open/close lifecycle in try/finally for exception safety.
If add_performance_details() raises an exception, close() will never be called. This is particularly important for the PyTorchLLM backend where close() handles training completion and model persistence.
🛠️ Suggested fix
```diff
 ch.parse_args(args)
 ch.open(args)
-ch.add_performance_details()
-ch.close(args)
+try:
+    ch.add_performance_details()
+finally:
+    ch.close(args)
```

📝 Committable suggestion
```python
ch.parse_args(args)
ch.open(args)
try:
    ch.add_performance_details()
finally:
    ch.close(args)
```
Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py`:
- Around line 485-493: The method currently calls loss.backward() but never
updates parameters; after loss.backward() call in the function containing
loss.backward() add a call to optimizer.step() and then optimizer.zero_grad()
(or zero gradients before the backward as per existing pattern) to perform the
weight update; also fix the except block that returns (False, str) so the
function adheres to its docstring by returning None on error (or re-raising) and
optionally logging the exception instead of returning a tuple—update references
to loss.backward(), optimizer.step(), optimizer.zero_grad(), and the except
block return to ensure a consistent float|None return type.
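The step order that prompt prescribes (backward, then `optimizer.step()`, then `optimizer.zero_grad()`, returning float or None) can be sketched with stand-in loss and optimizer objects that record the call sequence. All names here are illustrative, not the repository's actual code.

```python
calls = []

class FakeLoss:
    """Stand-in loss object that records the call order."""
    def backward(self):
        calls.append("backward")
    def item(self):
        return 0.5

class FakeOptimizer:
    def step(self):
        calls.append("step")
    def zero_grad(self):
        calls.append("zero_grad")

def train_step(loss, optimizer):
    """Returns float on success, None on failure: never a (bool, str) tuple."""
    try:
        if loss is None:
            return None
        loss.backward()
        optimizer.step()       # actually apply the weight update
        optimizer.zero_grad()  # clear gradients for the next example
        return loss.item()
    except Exception:
        return None            # consistent with the documented return type

result = train_step(FakeLoss(), FakeOptimizer())
print(result, calls)
```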
♻️ Duplicate comments (5)
src/custom_ai/pytorch_llm/pytorch_llm.py (5)
100-105: Wrong parameter name: use `torch_dtype` instead of `dtype`.

The `AutoModelForCausalLM.from_pretrained()` method expects `torch_dtype`, not `dtype`. This causes the dtype setting to be silently ignored, and the model will load with default precision.

Proposed fix

```diff
 self.model = AutoModelForCausalLM.from_pretrained(
     saved_model_dir,
     device_map="auto",
-    dtype=dtype,
+    torch_dtype=dtype,
     trust_remote_code=True
 )
```

Apply the same fix at lines 116-121.
123-123: Redundant `.to(device)` after `device_map="auto"`.

When using `device_map="auto"`, the model is automatically distributed across devices. Calling `.to(self.device)` can break model parallelism and cause device mismatch errors for large models.

Proposed fix

```diff
-        self.model = self.model.to(self.device)
+        # Note: device_map="auto" handles device placement automatically
```
191-196: Bug: `_is_initialized` set to True regardless of initialization result.

If `_initialize_model()` returns `False` (initialization failed), `_is_initialized` is still set to `True` on line 195, masking the failure and preventing retry attempts.

Proposed fix

```diff
 def open(self, args):
     # Reinitialize model if needed
     if not self._is_initialized:
-        self._initialize_model()
-        self._is_initialized = True
+        if not self._initialize_model():
+            print("Warning: Model initialization failed in open()")
+            return
```
279-281: Incorrect dtype for attention_mask.

Attention masks should remain as integer tensors (`torch.long`), not be cast to the model's floating-point dtype. This can cause errors or incorrect behavior during the forward pass.

Proposed fix

```diff
-# Ensure attention mask is the same dtype as the model
-if 'attention_mask' in inputs:
-    inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)
+# Note: attention_mask should remain as integer type, no conversion needed
```

The same issue exists in `_train_on_example` at lines 472-474.
_train_on_exampleat lines 472-474.
476-478: Missing MPS device handling in autocast.

Unlike `_train_on_output` (line 292), this method doesn't handle the 'mps' device type in autocast, causing inconsistent behavior on Apple Silicon.

Proposed fix

```diff
-with torch.autocast(device_type='cuda' if 'cuda' in str(self.device) else 'cpu',
+device_type = 'mps' if 'mps' in str(self.device) else 'cuda' if 'cuda' in str(self.device) else 'cpu'
+with torch.autocast(device_type=device_type,
                     dtype=model_dtype):
```
🧹 Nitpick comments (4)
src/custom_ai/pytorch_llm/pytorch_llm.py (1)
334-334: Unused variable `was_training`.

The variable `was_training` is assigned but never used. This appears to be leftover code from an incomplete implementation to restore training mode after generation.

Either remove it or implement the restoration logic:

Option 1: Remove unused variable

```diff
-        was_training = self.model.training
```

Option 2: Implement training mode restoration

```diff
 was_training = self.model.training
 try:
+    self.model.eval()
     # ... generation code ...
+finally:
+    if was_training:
+        self.model.train()
```

src/custom_ai/pytorch_llm/README.md (3)
128-140: Python code should use a Python code block, not bash.

The model download example contains Python code but is wrapped in a bash code block. This affects syntax highlighting and copy-paste usability.

Proposed fix

````diff
 1. Download a model from Hugging Face Hub:
-   ```bash
+   ```python
    from transformers import AutoModelForCausalLM, AutoTokenizer
````
175-184: Add language specifier to fenced code block.

The directory structure code block should specify a language (e.g., `text` or `plaintext`) for consistent formatting.

Proposed fix

````diff
-```
+```text
 src/
 └── custom_ai/
     └── pytorch_llm/
         ├── __init__.py
         ├── pytorch_llm.py   # Main implementation
         └── README.md        # This document
````
184-187: Add blank line before License heading.

Missing blank line before the `## License` heading affects Markdown rendering consistency.

Proposed fix

```diff
         └── README.md        # This document
+
 ## License
```
📜 Review details

**Configuration used**: defaults
**Review profile**: CHILL
**Plan**: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e3d801a6a028eb836baa8cff6e72abbaa17048c2 and 37af43e01b81b45b042ff6d7c98fdcb12cd0fd39.

⛔ Files ignored due to path filters (2)

* `samples/charts/.DS_Store` is excluded by `!**/.DS_Store`
* `samples/charts/sbk-file-rocksdb-read-pytorch-llm.xlsx` is excluded by `!**/*.xlsx`

📒 Files selected for processing (3)

* `src/custom_ai/pytorch_llm/README.md`
* `src/custom_ai/pytorch_llm/pytorch_llm.py`
* `src/version/sbk_version.py`

🧰 Additional context used

🧬 Code graph analysis (1)

src/custom_ai/pytorch_llm/pytorch_llm.py (2)

src/genai/genai.py (14)

* `SbkGenAI` (25-356)
* `add_args` (52-53)
* `parse_args` (55-56)
* `open` (58-59)
* `close` (61-62)
* `get_model_description` (65-73)
* `get_throughput_analysis` (327-336)
* `get_throughput_prompt` (75-122)
* `get_latency_analysis` (339-348)
* `get_latency_prompt` (124-215)
* `get_total_mb_analysis` (351-352)
* `get_total_mb_prompt` (218-247)
* `get_percentile_histogram_analysis` (355-356)
* `get_percentile_histogram_prompt` (249-323)

src/ai/sbk_ai.py (4)

* `add_args` (141-168)
* `parse_args` (171-192)
* `open` (194-196)
* `close` (198-200)

🪛 markdownlint-cli2 (0.18.1)

src/custom_ai/pytorch_llm/README.md

177-177: Fenced code blocks should have a language specified (MD040, fenced-code-language)

🪛 Ruff (0.14.11)

src/custom_ai/pytorch_llm/pytorch_llm.py

133-133: Consider moving this statement to an `else` block (TRY300)
135-135: Do not catch blind exception: `Exception` (BLE001)
136-136: Use explicit conversion flag (RUF010)
191-191: Unused method argument: `args` (ARG002)
198-198: Unused method argument: `args` (ARG002)
222-222: PEP 484 prohibits implicit `Optional` (RUF013)
243-243: Consider moving this statement to an `else` block (TRY300)
245-245: Do not catch blind exception: `Exception` (BLE001)
246-246: Use explicit conversion flag (RUF010)
314-314: Consider moving this statement to an `else` block (TRY300)
316-316: Do not catch blind exception: `Exception` (BLE001)
317-317: Use explicit conversion flag (RUF010)
334-334: Local variable `was_training` is assigned to but never used (F841)
389-389: Consider moving this statement to an `else` block (TRY300)
391-391: Do not catch blind exception: `Exception` (BLE001)
392-392: Use explicit conversion flag (RUF010)
490-490: Consider moving this statement to an `else` block (TRY300)
492-492: Do not catch blind exception: `Exception` (BLE001)
493-493: Use explicit conversion flag (RUF010)
500-500: Do not catch blind exception: `Exception` (BLE001)
501-501: Use explicit conversion flag (RUF010)
508-508: Do not catch blind exception: `Exception` (BLE001)
509-509: Use explicit conversion flag (RUF010)
516-516: Do not catch blind exception: `Exception` (BLE001)
517-517: Use explicit conversion flag (RUF010)
524-524: Do not catch blind exception: `Exception` (BLE001)
525-525: Use explicit conversion flag (RUF010)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

- GitHub Check: build

🔇 Additional comments (6)

src/version/sbk_version.py (1)

**11-11: LGTM!**

Version bump to `3.26.1.1` is appropriate for the new PyTorch LLM backend addition.

src/custom_ai/pytorch_llm/pytorch_llm.py (4)

**198-220: LGTM on training lifecycle in `close()`.**

The training flow correctly sets the model to training mode, iterates over outputs, and conditionally saves the model. The unused `args` parameter matches the interface contract from `SbkGenAI`.

**495-526: LGTM on analysis methods.**

The analysis methods correctly delegate to prompt generation and text generation, following the interface pattern from `SbkGenAI`. Exception handling provides graceful error reporting.

**527-534: LGTM on destructor cleanup.**

The `__del__` method properly cleans up model and tokenizer resources, and clears CUDA cache when available.

**36-46**: This comment is incorrect. The `openai/gpt-oss-20b` model is a legitimate, real Hugging Face model: OpenAI's open-weight 21B-parameter language model released under Apache 2.0 and available on the Hugging Face Hub. Users will not encounter download failures; the model can be loaded using `AutoModelForCausalLM.from_pretrained()` as intended. The README acknowledges that this default model consumes over 200GB of RAM, which may be impractical for many setups. If resource constraints are a concern, users can specify alternative models via the `--pt-model` flag (e.g., `mistralai/Mistral-7B-v0.1`).

> Likely an incorrect or invalid review comment.

src/custom_ai/pytorch_llm/README.md (1)

**19-19: Verify the default model claim.**

The documentation mentions `openai/gpt-oss-20b` consuming more than 200GB RAM. If this model doesn't exist on Hugging Face (as noted in the code review), this claim may be misleading. Consider updating with a verified model and accurate memory requirements.

✏️ Tip: You can disable this entire section by setting `review_details` to `false` in your review settings.
Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py`:
- Around line 56-61: Update the class/module docstring to reflect the real
default model: replace the mention of "gpt2" with the actual DEFAULT_MODEL value
("openai/gpt-oss-20b") so the docstring matches the constant DEFAULT_MODEL used
in this file (refer to DEFAULT_MODEL in src/custom_ai/pytorch_llm/pytorch_llm.py
and the surrounding class/docstring near the top of the file).
- Around line 357-358: The max_new_tokens calculation currently uses
max(DEFAULT_MAX_LENGTH, inputs['input_ids'].shape[1]) which can let
input+generation exceed the model context; change the calculation in the method
where max_new_tokens is set (referencing max_new_tokens, DEFAULT_MAX_LENGTH,
inputs['input_ids'] and self.max_length) to compute max_new_tokens = max(256,
self.max_length - inputs['input_ids'].shape[1]) (ensure the subtraction is
clamped so it never goes below 256 and result is an int) so that input tokens +
generated tokens never exceed the model context window.
♻️ Duplicate comments (7)
src/custom_ai/pytorch_llm/pytorch_llm.py (7)
**100-105: Wrong parameter name: use `torch_dtype` instead of `dtype`.**

The `AutoModelForCausalLM.from_pretrained()` method expects `torch_dtype`, not `dtype`. This will cause the dtype setting to be silently ignored, and the model will load with default precision.

Proposed fix:

```diff
  self.model = AutoModelForCausalLM.from_pretrained(
      saved_model_dir,
      device_map="auto",
-     dtype=dtype,
+     torch_dtype=dtype,
      trust_remote_code=True
  )
```

Apply the same fix at lines 116-121.
**123-123: Redundant `.to(device)` after `device_map="auto"`.**

When using `device_map="auto"`, the model is automatically distributed across available devices. Calling `.to(self.device)` can break model parallelism or cause errors when the model spans multiple GPUs.

Proposed fix:

```diff
- self.model = self.model.to(self.device)
+ # Note: device_map="auto" handles device placement automatically
```
**142-177: Add a configurable `--pt-trust-remote-code` flag.**

The code hardcodes `trust_remote_code=True` when loading models, which enables arbitrary code execution from model repositories. Add a CLI flag that defaults to `False`, allowing users to explicitly opt in.

Proposed fix:

```diff
  parser.add_argument(
      "--pt-top-p",
      type=float,
      help=f"Top-p sampling parameter (default: {DEFAULT_TOP_P})",
      default=DEFAULT_TOP_P
  )
+ parser.add_argument(
+     "--pt-trust-remote-code",
+     action="store_true",
+     help="Trust remote code when loading models (default: False, security risk)",
+     default=False
+ )
```

Then update `parse_args` to store `self.trust_remote_code = args.pt_trust_remote_code` and use it in `_initialize_model`.
**191-195: Bug: `_is_initialized` set to True regardless of initialization result.**

If `_initialize_model()` returns `False`, `_is_initialized` is still set to `True` on line 195, masking the failure and preventing retry attempts.

Proposed fix:

```diff
  def open(self, args):
      # Reinitialize model if needed
      if not self._is_initialized:
-         self._initialize_model()
-         self._is_initialized = True
+         if not self._initialize_model():
+             print("Warning: Model initialization failed in open()")
+             return False
+     return True
```
**279-281: Incorrect dtype for attention_mask.**

Attention masks should remain as integer tensors (long/int), not be cast to the model's floating-point dtype. This can cause errors or incorrect behavior during attention computation.

Proposed fix:

```diff
- # Ensure attention mask is the same dtype as the model
- if 'attention_mask' in inputs:
-     inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)
+ # Note: attention_mask should remain as integer type, no conversion needed
```
**484-492: Critical: Missing `optimizer.step()` - training is ineffective.**

`loss.backward()` computes gradients but `optimizer.step()` is never called, so weights are never updated. This makes the training completely non-functional.

Also, line 492 returns `(False, str)` but the docstring states it returns `float` or `None`.

Proposed fix:

```diff
      # Backward pass and optimize
      loss = outputs.loss
      if loss is not None:
          loss.backward()
+         if not hasattr(self, 'optimizer'):
+             self.optimizer = torch.optim.AdamW(self.model.parameters(), lr=5e-5)
+         self.optimizer.step()
+         self.optimizer.zero_grad()
          return loss.item()
      return None
  except Exception as e:
-     return False, f"Error in text generation: {str(e)}"
+     print(f"Error in training: {e!r}")
+     return None
```
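The ordering matters here: gradients computed by `backward()` change nothing until an explicit update step applies them. A torch-free sketch of the same backward/step cycle on a toy quadratic loss (the names below are illustrative, not from the PR):

```python
# Minimal gradient-descent loop mimicking the backward/step cycle.
# grad() plays the role of loss.backward(); the "w -= lr * g" line is the
# optimizer.step() that the review found missing.
def grad(w):
    # derivative of the toy loss (w - 3)^2
    return 2 * (w - 3)

w, lr = 0.0, 0.1
for _ in range(100):
    g = grad(w)      # compute the gradient ("loss.backward()")
    w -= lr * g      # apply the update ("optimizer.step()"); omit this and w never moves
print(round(w, 3))   # converges to the minimum at 3.0
```

Without the update line, `w` stays at 0.0 no matter how many "epochs" run, which is exactly the failure mode flagged above.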
**471-477: Attention mask dtype and missing MPS handling.**

Two issues in this block:

- Lines 472-473: attention mask incorrectly cast to float dtype
- Line 476: missing 'mps' device type handling (unlike `_train_on_output` at line 292)

Proposed fix:

```diff
- # Ensure attention mask is the same dtype as the model
- if 'attention_mask' in inputs:
-     inputs['attention_mask'] = inputs['attention_mask'].to(dtype=model_dtype)
+ # Note: attention_mask should remain as integer type

  # Forward pass with autocast for mixed precision training
- with torch.autocast(device_type='cuda' if 'cuda' in str(self.device) else 'cpu',
+ device_type = 'mps' if 'mps' in str(self.device) else 'cuda' if 'cuda' in str(self.device) else 'cpu'
+ with torch.autocast(device_type=device_type,
                      dtype=model_dtype):
```
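The proposed conditional can be factored into a small helper; since it is pure string logic, it can be sanity-checked without torch. A sketch (the helper name is illustrative, and `device_str` stands in for `str(self.device)`):

```python
def autocast_device_type(device_str: str) -> str:
    """Map a torch device string to the device_type expected by torch.autocast."""
    if 'mps' in device_str:
        return 'mps'
    if 'cuda' in device_str:
        return 'cuda'
    return 'cpu'

print(autocast_device_type('mps:0'))   # mps
print(autocast_device_type('cuda:1'))  # cuda
print(autocast_device_type('cpu'))     # cpu
```

Using one helper in both `_train_on_output` and `_train_on_example` would also remove the inconsistency between the two methods that this comment describes.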
🧹 Nitpick comments (1)
src/custom_ai/pytorch_llm/pytorch_llm.py (1)
**222-222: Update type hint to explicit `Optional` or union syntax.**

Per PEP 484, implicit `Optional` is prohibited. The type hint should be explicit.

Proposed fix:

```diff
- def _save_model(self, output_dir: str = None) -> bool:
+ def _save_model(self, output_dir: str | None = None) -> bool:
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/custom_ai/pytorch_llm/pytorch_llm.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/custom_ai/pytorch_llm/pytorch_llm.py (2)
src/genai/genai.py (14)
* `SbkGenAI` (25-356)
* `add_args` (52-53)
* `parse_args` (55-56)
* `open` (58-59)
* `close` (61-62)
* `get_model_description` (65-73)
* `get_throughput_analysis` (327-336)
* `get_throughput_prompt` (75-122)
* `get_latency_analysis` (339-348)
* `get_latency_prompt` (124-215)
* `get_total_mb_analysis` (351-352)
* `get_total_mb_prompt` (218-247)
* `get_percentile_histogram_analysis` (355-356)
* `get_percentile_histogram_prompt` (249-323)

src/ai/sbk_ai.py (4)

* `add_args` (141-168)
* `parse_args` (171-192)
* `open` (194-196)
* `close` (198-200)
🪛 Ruff (0.14.11)
src/custom_ai/pytorch_llm/pytorch_llm.py
133-133: Consider moving this statement to an else block
(TRY300)
135-135: Do not catch blind exception: Exception
(BLE001)
136-136: Use explicit conversion flag
Replace with conversion flag
(RUF010)
191-191: Unused method argument: args
(ARG002)
198-198: Unused method argument: args
(ARG002)
222-222: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
243-243: Consider moving this statement to an else block
(TRY300)
245-245: Do not catch blind exception: Exception
(BLE001)
246-246: Use explicit conversion flag
Replace with conversion flag
(RUF010)
314-314: Consider moving this statement to an else block
(TRY300)
316-316: Do not catch blind exception: Exception
(BLE001)
317-317: Use explicit conversion flag
Replace with conversion flag
(RUF010)
388-388: Consider moving this statement to an else block
(TRY300)
390-390: Do not catch blind exception: Exception
(BLE001)
391-391: Use explicit conversion flag
Replace with conversion flag
(RUF010)
489-489: Consider moving this statement to an else block
(TRY300)
491-491: Do not catch blind exception: Exception
(BLE001)
492-492: Use explicit conversion flag
Replace with conversion flag
(RUF010)
499-499: Do not catch blind exception: Exception
(BLE001)
500-500: Use explicit conversion flag
Replace with conversion flag
(RUF010)
507-507: Do not catch blind exception: Exception
(BLE001)
508-508: Use explicit conversion flag
Replace with conversion flag
(RUF010)
515-515: Do not catch blind exception: Exception
(BLE001)
516-516: Use explicit conversion flag
Replace with conversion flag
(RUF010)
523-523: Do not catch blind exception: Exception
(BLE001)
524-524: Use explicit conversion flag
Replace with conversion flag
(RUF010)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (6)
src/custom_ai/pytorch_llm/pytorch_llm.py (6)
**36-46: LGTM on configuration defaults.**

The device auto-detection logic correctly prioritizes CUDA, then MPS, then CPU. The default parameters are reasonable for a causal language model.

**198-219: LGTM on close method logic.**

The training loop and model saving flow are well-structured. The early break on `None` loss appropriately handles training failures.

**396-410: LGTM on model description.**

Clear and informative description output with appropriate initialization check.

**412-437: LGTM on text cleaning utility.**

The regex-based cleanup for repeated punctuation is well-implemented with proper character escaping.

**494-524: LGTM on analysis entry points.**

All four analysis methods follow a consistent pattern delegating to `_generate_text` with appropriate error handling.

**526-533: LGTM on resource cleanup.**

The destructor properly checks for attribute existence before deletion and clears CUDA cache when available.
```python
# Calculate max_new_tokens, ensuring it's at least DEFAULT_MAX_LENGTH
max_new_tokens = max(DEFAULT_MAX_LENGTH, inputs['input_ids'].shape[1])
```
🧩 Analysis chain
🏁 Script executed:
```shell
cat -n src/custom_ai/pytorch_llm/pytorch_llm.py | head -50
```

Repository: kmgowda/sbk-charts
Length of output: 1902
🏁 Script executed:
```shell
# Find DEFAULT_MAX_LENGTH definition
rg "DEFAULT_MAX_LENGTH" src/custom_ai/pytorch_llm/pytorch_llm.py -B 2 -A 2
```

Repository: kmgowda/sbk-charts
Length of output: 905
🏁 Script executed:
```shell
# Look at the _generate_text method context around lines 357-358
cat -n src/custom_ai/pytorch_llm/pytorch_llm.py | sed -n '320,390p'
```

Repository: kmgowda/sbk-charts
Length of output: 3726
🏁 Script executed:
```shell
# Check if self.max_length is used in the class
rg "self\.max_length" src/custom_ai/pytorch_llm/pytorch_llm.py -B 1 -A 1
```

Repository: kmgowda/sbk-charts
Length of output: 881
Fix max_new_tokens calculation to respect model context window.
The current logic `max(DEFAULT_MAX_LENGTH, inputs['input_ids'].shape[1])` causes the total sequence length to exceed the model's context window. If the input is 100 tokens, max_new_tokens becomes 2048, resulting in 2148 total tokens (exceeding the 2048 context limit). Use instead:

```python
max_new_tokens = max(256, self.max_length - inputs['input_ids'].shape[1])
```

This ensures the generated tokens + input tokens fit within the model's context window while maintaining a minimum generation length.
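The suggested clamp is easy to check in isolation. A sketch with illustrative names (`max_length` standing in for `self.max_length`, `input_len` for `inputs['input_ids'].shape[1]`):

```python
def clamp_new_tokens(max_length: int, input_len: int, floor: int = 256) -> int:
    """Budget for generated tokens: whatever the context window leaves over,
    but never less than `floor` tokens."""
    return max(floor, max_length - input_len)

print(clamp_new_tokens(2048, 100))   # 1948: input + generation fits in 2048
print(clamp_new_tokens(2048, 2000))  # 256: the floor takes over for long inputs
```

Note that when the input nearly fills the window, the floor still lets the total exceed it; a stricter variant would truncate the input first.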
🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 357 - 358, The
max_new_tokens calculation currently uses max(DEFAULT_MAX_LENGTH,
inputs['input_ids'].shape[1]) which can let input+generation exceed the model
context; change the calculation in the method where max_new_tokens is set
(referencing max_new_tokens, DEFAULT_MAX_LENGTH, inputs['input_ids'] and
self.max_length) to compute max_new_tokens = max(256, self.max_length -
inputs['input_ids'].shape[1]) (ensure the subtraction is clamped so it never
goes below 256 and result is an int) so that input tokens + generated tokens
never exceed the model context window.
Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py`:
- Around line 239-252: close currently assumes self.model is initialized and
calls self.model.train(), which can raise AttributeError if _initialize_model()
in open() failed and self.model is None; update close to guard that call by
checking self.model is not None (and optionally log or skip training when model
is missing) before calling self.model.train(), and likewise skip invoking
_train_on_output or iterating self.output_list when self.model is None or
train_mode is True but model is uninitialized; reference the close method, the
self.model attribute, train_mode flag, and _train_on_output/output_list to
locate where to add the guard.
- Around line 74-75: The instance-level list output_list on PyTorchLLM is never
cleared (it is appended to in _generate_text and reused across open()/close()
cycles), so modify the open() method to reset/clear self.output_list (e.g., set
to [] or .clear()) when initializing the session (alongside setting
self._is_initialized = True) to ensure previous session outputs are not carried
into future _generate_text or close() operations.
♻️ Duplicate comments (1)
src/custom_ai/pytorch_llm/pytorch_llm.py (1)
**480-533: Multiple issues already flagged in past reviews.**

This method has several known issues (missing `optimizer.step()`, inconsistent return type, attention_mask dtype, missing MPS handling) that were identified in previous review iterations. Please address those comments.
🧹 Nitpick comments (3)
src/custom_ai/pytorch_llm/pytorch_llm.py (3)
**143-146: Bare `except` with `pass` silently swallows all exceptions.**

This catches every exception type (including `KeyboardInterrupt` and `SystemExit`) and discards it without logging. At minimum, catch a specific exception type or log the failure.

Proposed fix:

```diff
  try:
      config_file = cached_file(model_name, 'config.json')
- except:
-     pass
+ except Exception:
+     config_file = None  # Fall through to fallback logic
```
**263-263: Use explicit `Optional` type annotation.**

Per PEP 484, use `Optional[str]` or `str | None` instead of implicit optional via default value.

Proposed fix:

```diff
- def _save_model(self, output_dir: str = None) -> bool:
+ def _save_model(self, output_dir: str | None = None) -> bool:
```
**567-574: Consider removing `__del__` in favor of explicit cleanup.**

`__del__` is unreliable for resource management:

- Not guaranteed to be called (circular references, interpreter shutdown)
- Can cause issues if exceptions occur during garbage collection
- Already have `close()` for explicit cleanup

The `close()` method is the proper place for cleanup. Consider moving CUDA cache clearing there and removing `__del__`, or making `PyTorchLLM` a context manager.

Proposed approach:

```diff
  def close(self, args):
      # ... existing training logic ...
+
+     # Cleanup resources
+     if torch.cuda.is_available():
+         torch.cuda.empty_cache()

- def __del__(self):
-     """Clean up resources when the object is destroyed."""
-     if hasattr(self, 'model') and self.model is not None:
-         del self.model
-     if hasattr(self, 'tokenizer') and self.tokenizer is not None:
-         del self.tokenizer
-     if torch.cuda.is_available():
-         torch.cuda.empty_cache()
```
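The context-manager option can be sketched without torch; the class name and cleanup body below are illustrative placeholders, not the PR's actual code:

```python
class ManagedLLM:
    """Toy stand-in for PyTorchLLM showing deterministic cleanup via `with`."""
    def __init__(self):
        self.model = object()   # placeholder for the loaded model
        self.closed = False

    def close(self):
        # real code would also call torch.cuda.empty_cache() here
        self.model = None
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()    # runs even if the with-body raises
        return False    # do not suppress exceptions

with ManagedLLM() as llm:
    pass
print(llm.closed)  # True: cleanup happened at block exit, not at GC time
```

Unlike `__del__`, `__exit__` runs at a well-defined point, which is why the review prefers explicit cleanup over destructor-based cleanup.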
```python
self._is_initialized = False
self.output_list = []
```
**`output_list` accumulates indefinitely across sessions.**

The list is initialized once and appended to in `_generate_text` but never cleared. If the same `PyTorchLLM` instance is reused across multiple `open()`/`close()` cycles, outputs from previous sessions persist and are re-trained on in subsequent `close()` calls.

Consider clearing the list in `open()`:

Proposed fix:

```diff
  def open(self, args):
      # Reinitialize model if needed
+     self.output_list = []  # Clear previous session outputs
      if not self._is_initialized:
```

🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 74 - 75, The
instance-level list output_list on PyTorchLLM is never cleared (it is appended
to in _generate_text and reused across open()/close() cycles), so modify the
open() method to reset/clear self.output_list (e.g., set to [] or .clear()) when
initializing the session (alongside setting self._is_initialized = True) to
ensure previous session outputs are not carried into future _generate_text or
close() operations.
```python
def close(self, args):
    loss = None
    # If in training mode and we have a target, train on the generated output
    if self.train_mode:
        self.model.train()
        print("\n" + "=" * 50)
        print("🚀 Starting training on generated output")

        for output_text in self.output_list:
            print(f"📄 Generated length: {len(output_text)} chars")
            loss = self._train_on_output(output_text)
            if loss is None:
                break
        print("=" * 50)
```
**Potential AttributeError if model initialization failed.**

If `_initialize_model()` failed in `open()`, `self.model` remains `None`. Then `close()` calls `self.model.train()` on line 243, which raises `AttributeError: 'NoneType' object has no attribute 'train'`.

Proposed fix:

```diff
  def close(self, args):
      loss = None
      # If in training mode and we have a target, train on the generated output
-     if self.train_mode:
+     if self.train_mode and self.model is not None:
          self.model.train()
```

🧰 Tools
🪛 Ruff (0.14.13)
239-239: Unused method argument: args
(ARG002)
🤖 Prompt for AI Agents
In `@src/custom_ai/pytorch_llm/pytorch_llm.py` around lines 239 - 252, close
currently assumes self.model is initialized and calls self.model.train(), which
can raise AttributeError if _initialize_model() in open() failed and self.model
is None; update close to guard that call by checking self.model is not None (and
optionally log or skip training when model is missing) before calling
self.model.train(), and likewise skip invoking _train_on_output or iterating
self.output_list when self.model is None or train_mode is True but model is
uninitialized; reference the close method, the self.model attribute, train_mode
flag, and _train_on_output/output_list to locate where to add the guard.
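The guard behaves like this in miniature (a hypothetical stand-in, not the PR's class):

```python
class StubLLM:
    """Illustrates close() skipping training when open() failed to load a model."""
    def __init__(self, model):
        self.model = model         # None simulates a failed _initialize_model()
        self.train_mode = True

    def close(self):
        if self.train_mode and self.model is not None:
            return "trained"       # real code calls self.model.train() and trains here
        return "skipped"           # no AttributeError when self.model is None

print(StubLLM(None).close())       # skipped
print(StubLLM(object()).close())   # trained
```

The key point is that the `is not None` check short-circuits before any attribute access on the missing model.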
Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
Signed-off-by: Keshava Munegowda <keshava.gowda@gmail.com>
Summary by CodeRabbit
New Features
Documentation
Chores
✏️ Tip: You can customize this high-level summary in your review settings.