Add noc reasoning tutorial #1242
base: main
Conversation
Signed-off-by: Amparo Canaveras <[email protected]>
31 files reviewed, 5 comments
```
site/
site/
site/
site/
site/
```
site/ is duplicated 5 times
```diff
@@ -0,0 +1,377 @@
+---
+date: 2025-09-05
```
The date 2025-09-05 is in the future; it should be a valid past date.
```python
skip = True
if skip and os.path.exists(output_path):
    os.remove(output_path)
```
The variable skip is hardcoded to True, which unconditionally deletes the output file on every run. This destroys the resume functionality mentioned on line 136. Either remove this code block or make skip a CLI argument
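A minimal sketch of the CLI-flag version (the flag name `--overwrite-output` is hypothetical; the argparse parser is assumed to exist elsewhere in the script):

```python
# Only delete the output when the user explicitly asks for a fresh run;
# the default (False) preserves resume behavior
parser.add_argument(
    "--overwrite-output",
    action="store_true",
    help="Delete any existing output file instead of resuming from it",
)

# later, after args = parser.parse_args():
if args.overwrite_output and os.path.exists(output_path):
    os.remove(output_path)
processed_indices = load_processed_indices(output_path)
```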
```python
from src.tools import (
    ALL_TOOLS,
    Check_Alarm_Status,
    Check_Apply_Configuration,
    Check_Element_Health,
    Check_Element_Neighbors,
    Check_External_Issues,
    Check_Performance,
    Check_remote_files,
    Create_Ticket,
    Execute_Remote_Action,
    Orchestration_tool,
    Triage_Toolkit_Tool,
)
```
These imports reference src.tools but the script is located in recipes/noc-reasoning-agent/scripts/. Check that PYTHONPATH is set correctly before running, otherwise imports will fail
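A hedged example of one way to satisfy the import, assuming `src/` lives at the recipe root next to `scripts/` (the data path shown is illustrative):

```shell
# Run from the recipe root so `src.tools` is importable
cd recipes/noc-reasoning-agent
PYTHONPATH=. python scripts/create_agent_with_tools_batch.py --input data/input.jsonl
```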
```python
react_agent = create_react_agent(chat_llm, tools=[], checkpointer=checkpointer)
for index, row in tqdm(unprocessed_df.iterrows(), total=len(unprocessed_df), desc="Processing Rows"):
    # print(row)
    # exit()

    # --- Step 2: Get the row-specific tools using the helper function. ---
    # row_tools = get_bound_tools(row)

    config = {"configurable": {"thread_id": str(index)}}
    input_question = row["input"]
    system_message = row["system"]
    input_messages = [{"role": "system", "content": system_message}, {"role": "user", "content": input_question}]

    try:
        # We still invoke one-by-one, but the tool creation is now efficient.

        ## WE need to fix this. Add system message and tool calling messages
        # separate user input messages
        TOOLS = {
            "Check_Alarm_Status": lambda args: Check_Alarm_Status(**args),
            "Check_Element_Neighbors": lambda args: Check_Element_Neighbors(**args),
            "Check_Element_Health": lambda args: Check_Element_Health(**args),
            "Execute_Remote_Action": lambda args: Execute_Remote_Action(**args),
            "Check_External_Issues": lambda args: Check_External_Issues(**args),
            "Check_Apply_Configuration": lambda args: Check_Apply_Configuration(**args),
            "Check_Performance": lambda args: Check_Performance(**args),
            "Create_Ticket": lambda args: Create_Ticket(**args),
            "Orchestration_tool": lambda args: Orchestration_tool(**args),
            "Triage_Toolkit_Tool": lambda args: Triage_Toolkit_Tool(**args),
            "Check_remote_files": lambda args: Check_remote_files(**args),
        }
```
The ReAct agent is created with empty tools=[] on line 174, and tool calls are then manually mapped on lines 192-204. Consider using LangGraph's built-in tool binding (as shown in create_agent_with_tools.py) to simplify this logic.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
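A sketch of the built-in binding approach, assuming the tool functions are decorated as LangChain tools (the tool subset shown is for illustration; `chat_llm` is the model from the snippet above):

```python
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

# Let LangGraph dispatch tool calls itself instead of passing tools=[]
# and keeping a manual name -> lambda map
tools = [Check_Alarm_Status, Check_Element_Health, Create_Ticket]  # subset for illustration
react_agent = create_react_agent(chat_llm, tools=tools, checkpointer=MemorySaver())
```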
📝 Walkthrough

This PR introduces a complete end-to-end NOC Reasoning Agent recipe featuring documentation, configuration, prompts, Python utility scripts for data preparation, model training coordination, agent construction, evaluation, and visualization. It enables fine-tuning Qwen models to autonomously diagnose and resolve telecom network incidents using structured reasoning and tool integration.
Sequence Diagrams

```mermaid
sequenceDiagram
    participant User
    participant DataPrepPipeline as Data Prep Pipeline
    participant SFTConfig as SFT Config
    participant Model as Qwen Model
    participant Training as Training Engine
    User->>DataPrepPipeline: CSV Incidents
    DataPrepPipeline->>DataPrepPipeline: Filter (auto-recover, soft-solve, codes)
    DataPrepPipeline->>DataPrepPipeline: Categorize by Keywords
    DataPrepPipeline->>DataPrepPipeline: Convert to JSONL
    DataPrepPipeline->>DataPrepPipeline: Enrich with Reasoning Processes
    DataPrepPipeline->>SFTConfig: Prepared JSONL
    SFTConfig->>SFTConfig: Apply Deduplication & Filtering
    SFTConfig->>SFTConfig: Shuffle & Downsample
    SFTConfig->>Model: Format as SFT Manifest
    Model->>Training: Tokenize with Qwen Tokenizer
    Training->>Training: Fine-tune with Megatron Backend
    Training->>User: Trained Model
```

```mermaid
sequenceDiagram
    participant User
    participant Agent as ReAct Agent
    participant ToolExecutor as Tool Executor
    participant Model as LM Model
    participant Tools as Domain Tools
    participant Memory as Memory Saver
    User->>Agent: Incident JSONL Row
    Agent->>Memory: Load Checkpoint (skip if processed)
    Memory->>Agent: Resume State
    Agent->>Model: Format User Message + System Prompt
    Model->>Model: Generate Response (max_tokens, temp, top_p)
    Model->>Agent: Tool Call or Conclusion
    alt Tool Call Detected
        Agent->>ToolExecutor: Parse Tool Name & Arguments
        ToolExecutor->>Tools: Execute Tool (Check_Alarm, Health, etc.)
        Tools->>ToolExecutor: Tool Response
        ToolExecutor->>Agent: Observation
        Agent->>Model: Add Observation to Context
        Model->>Agent: Next Action
    else Conclusion
        Agent->>Memory: Save Final Response
        Agent->>User: Output JSONL with Response
    end
```

```mermaid
sequenceDiagram
    participant User
    participant EvalPipeline as Evaluation Pipeline
    participant Metrics as Metric Calculators
    participant LLMJudge as LLM Judge
    participant Output as Output JSONL
    User->>EvalPipeline: Agent Responses + Expected Answers
    EvalPipeline->>EvalPipeline: Extract Reasoning & Output
    EvalPipeline->>Metrics: Compute ROUGE (rouge1, rougeL)
    Metrics->>EvalPipeline: ROUGE Scores
    EvalPipeline->>Metrics: Compute BERTScore F1
    Metrics->>EvalPipeline: BERTScore Results
    EvalPipeline->>LLMJudge: Score Final Output (LLM-based)
    LLMJudge->>EvalPipeline: Score + Reasoning
    EvalPipeline->>LLMJudge: Score Reasoning Quality (LLM-based)
    LLMJudge->>EvalPipeline: Score + Reasoning
    EvalPipeline->>Output: Augment Row with All Metrics
    EvalPipeline->>User: Evaluation JSONL
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ❌ 2 failed (2 warnings) | ✅ 2 passed
Actionable comments posted: 7
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In `@recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py`:
- Around line 132-135: The code currently hardcodes skip = True which causes
deletion of output_path every run and breaks resume logic; change this so skip
is not hardcoded—make it a CLI flag, environment variable, or default False—and
only call os.remove(output_path) when skip is explicitly set by the user; ensure
the block around skip, os.path.exists(output_path), and os.remove(output_path)
is gated by the configurable skip variable and leave
load_processed_indices(output_path) unchanged so resume logic
(load_processed_indices / processed_indices) can work when skip is False.
- Around line 156-157: The call to AutoModelForCausalLM.from_pretrained in
create_agent_with_tools_batch.py is using the wrong parameter name (dtype=) so
the intended precision is ignored; update the call to pass the dtype via
torch_dtype (e.g., torch_dtype=dtype) so the model loads in the specified
bfloat16/float16 precision, ensuring you modify the
AutoModelForCausalLM.from_pretrained(...) invocation accordingly.
In `@recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py`:
- Line 12: The parser.add_argument("--output_file", help="Path to output") can
leave args.output_file as None and cause pd.DataFrame.to_json(args.output_file,
...) to raise a ValueError; update the argument declaration
(parser.add_argument) to either set required=True or provide a sensible default
(e.g., default="evaluation_results.json"), and/or add a guard before writing
that validates args.output_file is not None and raises a clear error message;
ensure references to args.output_file and the DataFrame.to_json call are updated
accordingly.
In `@recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py`:
- Around line 153-166: The code currently uses two regex matches (m and
m_no_args) and then unconditionally calls m.groups(), which creates dead code
and a latent crash; instead, after performing the initial match m =
re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", ... , re.DOTALL), check if m is
truthy and if not immediately return the original string (or handle the error),
and only then call m.groups() to assign tool_name and raw_args; remove the
separate m_no_args branch entirely so empty brackets are handled by the first
regex and m.groups() is always called on a valid match.
In `@recipes/noc-reasoning-agent/scripts/utils/token_usage.py`:
- Around line 207-276: The __main__ block contains leftover debug logic that
hardcodes csv_file/model, calls AutoTokenizer directly (unsafe if transformers
missing), and calls exit() before main(), making main() unreachable; remove or
replace the entire if __name__ == "__main__": block so it simply parses
CLI/default args and invokes main(), ensure any tokenizer creation uses the
existing guarded import/creation logic (e.g., reuse the module's
AutoTokenizer-safe factory or wrap AutoTokenizer.from_pretrained in the same
try/except used elsewhere), eliminate hardcoded paths and stray exit() calls,
and wire csv_file, column_name, model, output_csv, output_plot into main()
parameters or argparse so the script runs via main() correctly.
In `@recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py`:
- Around line 26-71: The function parse_final_reasoning contains an early return
and debug prints that make the intended parsing logic unreachable; remove the
premature return and the print statements and restructure the function to (1)
first find the final "Finish" block using trace_block_match or last_index as a
fallback, (2) set finish_text deterministically (return a clear error string if
neither is found), (3) run the step_pattern regex against that finish_text to
extract Thought/Action/Observation entries, format them into parsed_steps, and
finally return the joined string; refer to parse_final_reasoning,
trace_block_match, last_index, finish_text, step_pattern, and parsed_steps when
making these changes.
- Around line 74-88: The function find_finish_action assigns finish_text only
when last_index != -1 but always returns finish_text, causing UnboundLocalError
when "Finish" isn't found; update find_finish_action to handle the not-found
case by returning a safe default (e.g., None or empty string) or raising a clear
exception, and ensure finish_text is defined in both branches (use last_index,
generation_text, and finish_text to locate/construct the return value or the
fallback).
🟠 Major comments (22)
recipes/noc-reasoning-agent/scripts/ns_pipelines/generate_synthetic_data.py-33-34 (1)
33-34: ⚠️ Potential issue | 🟠 Major
Stub silently does nothing — user gets no feedback when using a non-default `--llm`.

If a user passes `--llm` with any value other than `"qwen2.5-32b-instruct"`, `generate_synthetic_data_oss_gpt` is called and silently completes without action. This violates the principle that errors should not pass silently. Either raise `NotImplementedError` or restrict the `--llm` choices.

♻️ Suggested fix (option A: raise error)

```diff
 def generate_synthetic_data_oss_gpt(args, cluster, num_gpus):
-    pass
+    raise NotImplementedError("OSS GPT generation path is not yet implemented")
```

♻️ Suggested fix (option B: restrict choices)

```diff
-    parser.add_argument("--llm", type=str, default="qwen2.5-32b-instruct", help="The LLM to use for generation")
+    parser.add_argument(
+        "--llm", type=str, default="qwen2.5-32b-instruct",
+        choices=["qwen2.5-32b-instruct"],
+        help="The LLM to use for generation",
+    )
```

As per coding guidelines, "Follow the Zen of Python principles: … ensure errors never pass silently unless explicitly silenced".

Also applies to: 57-62
recipes/noc-reasoning-agent/scripts/ns_pipelines/generate_synthetic_data.py-43-55 (1)
43-55: ⚠️ Potential issue | 🟠 Major
The `--gpus` argument is defined but never used; `--num_gpus` duplicates its purpose.

A user setting `--gpus 4` would have no effect — only `--num_gpus` is read (line 54). Additionally, line 55 misleadingly prints `"specified via --gpus"`. Remove the unused `--gpus` argument.

♻️ Suggested fix

```diff
-    # parser.add_argument('--gpus', type=int, default=detect_num_gpus(),
-    #                     help='Number of GPUs to use for generation')
-    parser.add_argument("--gpus", type=int, default=8, help="Number of GPUs to use for generation")
     parser.add_argument("--llm", type=str, default="qwen2.5-32b-instruct", help="The LLM to use for generation")
     parser.add_argument(
         "--num_gpus", type=int, default=8, help="Number of GPUs to use (auto-detected if not specified)"
     )
     args = parser.parse_args()

     cluster = "local"
     num_gpus = args.num_gpus
-    print(f"Using {num_gpus} GPUs (specified via --gpus)")
+    print(f"Using {num_gpus} GPUs (specified via --num_gpus)")
```

As per coding guidelines, "Avoid silently ignoring unused user-passed parameters; the code should fail if a required argument is not specified or if unsupported arguments are provided."
recipes/noc-reasoning-agent/scripts/visualization/extract_representation_columns.py-47-56 (1)
47-56: 🛠️ Refactor suggestion | 🟠 Major
Compute all enrichment before writing the output file.

The current code reads, enriches, and writes line-by-line in a single pass. If an error occurs mid-processing (e.g., malformed JSON on line N), a partial output file is left behind. Separate the read/compute phase from the write phase.

Proposed fix

```diff
 enriched_rows = []
-with open(jsonl_path, "r", encoding="utf-8") as fin, open(output_jsonl_path, "w", encoding="utf-8") as fout:
+with open(jsonl_path, "r", encoding="utf-8") as fin:
     for line in fin:
         if not line.strip():
             continue
         row = json.loads(line)
         row = enrich_row(row, csv_idx)
         enriched_rows.append(row)
-        fout.write(json.dumps(row, ensure_ascii=False) + "\n")
+
+with open(output_jsonl_path, "w", encoding="utf-8") as fout:
+    for row in enriched_rows:
+        fout.write(json.dumps(row, ensure_ascii=False) + "\n")
```

As per coding guidelines, "perform all computations before re-opening files for writing to avoid accidental data loss if code fails during execution".

recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py-200-212 (1)

200-212: 🛠️ Refactor suggestion | 🟠 Major
Use direct dictionary access instead of `.get()` — the key is guaranteed to exist.

`"Service-off"` is defined in `PROBLEM_CODE_REASONING_PROCESS` at line 2, so `.get("Service-off", "")` masks a potential future error if the key were removed. Direct access `["Service-off"]` will fail fast with a clear `KeyError`.

♻️ Proposed fix

```diff
-DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS.get("Service-off", "")
+DEFAULT_REASONING_PROCESS = PROBLEM_CODE_REASONING_PROCESS["Service-off"]

 SYNTHETIC_FAULT_CATEGORY_REASONING = {
-    "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "power_battery_discharge_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "power_generator_failure_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "env_high_temperature_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "env_hvac_fault_recovery": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
-    "env_water_intrusion_response": PROBLEM_CODE_REASONING_PROCESS.get("Service-off", ""),
+    "power_ac_failure_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "power_dc_rectifier_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "power_battery_discharge_response": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "power_generator_failure_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "env_high_temperature_response": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "env_hvac_fault_recovery": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
+    "env_water_intrusion_response": PROBLEM_CODE_REASONING_PROCESS["Service-off"],
 }
```

As per coding guidelines, "Do not use `.get()` for accessing dictionary keys if the code expects them to be present; use direct dictionary access `dict[key]` instead to allow proper error handling and fail fast with clear errors".

recipes/noc-reasoning-agent/scripts/visualization/extract_scores.py-35-42 (1)

35-42: 🛠️ Refactor suggestion | 🟠 Major
The file is read and parsed twice; reuse the existing DataFrame.

The JSONL file is loaded identically at lines 9–12 and 36–39, and `df` from line 15 already holds the same data. Remove the second read and reuse the existing `df`.

♻️ Proposed fix

```diff
-# Load JSONL file into a list
-data = []
-with open(file_path, "r", encoding="utf-8") as f:
-    for line in f:
-        data.append(json.loads(line))
-
-# Convert to DataFrame
-df = pd.DataFrame(data)
-
 # Select only the relevant columns
 columns_to_display = ["expected_answer", "agent_response", "llm_judge_reason"]
```

recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py-121-125 (1)
121-125: ⚠️ Potential issue | 🟠 Major
BERTScore is computed per-row — extremely slow due to repeated model loading.

`bert_score()` is called once per row (line 122), which reloads the model each time. For a dataset of any meaningful size, this will be orders of magnitude slower than a single batched call. Collect all pairs first and call `bert_score()` once.

⚡ Sketch of batched approach

```python
# Collect all pairs first
references = []
candidates = []
for index, row in df.iterrows():
    # ... extract generated_reasoning_part and expected content ...
    candidates.append(generated_reasoning_part)
    references.append(conclusion_expected + reasoning_expected)

# Single batched BERTScore call
P, R, F1 = bert_score(candidates, references, lang="en", verbose=True)

# Then iterate again to assemble output rows with F1[i].item()
```

recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py-221-293 (1)
221-293: 🛠️ Refactor suggestion | 🟠 Major
The file is read twice and contains ~35 lines of commented-out code.

The input file is read manually (lines 224–226) and then again via `pd.read_json` (line 233). The first read is entirely superseded by the DataFrame path (line 293 overwrites `data`). Remove the first read. Also, the large commented-out `selection_criteria` block (lines 246–280) and scattered debug comments should be cleaned up before merging.
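A minimal sketch of the single-read version (`input_path` stands in for whatever argument the script actually uses):

```python
import pandas as pd

# Read the JSONL once; this replaces both the manual line-by-line read
# and the second pd.read_json call
df = pd.read_json(input_path, lines=True)
```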
recipes/noc-reasoning-agent/scripts/utils/split_incident_data.py-28-29 (1)

28-29: ⚠️ Potential issue | 🟠 Major
O(n) membership checks on lists — use sets for train/test partitions.

`train_set` and `test_set` are lists, so `number in train_set` (line 42) and `number in test_set` (line 57) are O(n) per lookup. With potentially thousands of incidents across multiple iteration files, this becomes a significant bottleneck.

⚡ Proposed fix

```diff
-    train_set = incidents[:n_train]
-    test_set = incidents[n_train:]
+    train_set = set(incidents[:n_train])
+    test_set = set(incidents[n_train:])
```

Also applies to: 42-42, 57-57
recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py-192-218 (1)
192-218: ⚠️ Potential issue | 🟠 Major
Unescaped values injected into HTML summary.

Lines 198–199 insert `incident_data.get(...)` values directly into the HTML without calling `esc()`, unlike the rest of the template. This is an XSS vector if incident data contains HTML-special characters.

🛡️ Proposed fix

```diff
-            <li>Category: {incident_data.get("incident_classification", incident_data.get("category"))}</li>
-            <li>Problem Code: {incident_data.get("fault_category", incident_data.get("u_problem_code"))}</li>
+            <li>Category: {esc(incident_data.get("incident_classification", incident_data.get("category")))}</li>
+            <li>Problem Code: {esc(incident_data.get("fault_category", incident_data.get("u_problem_code")))}</li>
```
recipes/noc-reasoning-agent/prompts/prompt_incident.yaml-50-51 (1)

50-51: ⚠️ Potential issue | 🟠 Major
`\think` should be `<think>` to match the pipeline's expected tag format.

The YAML prompt files use `\think`, but `split_incident_data.py` (line 67) injects `background = "\n<think>\n"` with angle brackets. The training configuration (`noc_reasoning_sft.yaml`, `noc_reasoning_sft_6.yaml`) also filters for and expects XML-style `<think>` and `</think>` tags (e.g., `"<think>thinking</think>summary"`). In YAML block scalars, `\think` is treated as literal text, not an escape sequence, so it won't match the `<think>` tags injected by the pipeline. This issue also appears in `prompt_reasoning.yaml` (line 166) and `shortened_prompt_reasoning.yaml` (line 147).
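A sketch of the corrected tag inside a YAML block scalar (the key name here is illustrative, not from the actual prompt file):

```yaml
# Hypothetical excerpt; the surrounding prompt text is omitted
output_format: |
  <think>
  ...step-by-step reasoning...
  </think>
  final summary and close code
```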
recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py-9-10 (1)

9-10: ⚠️ Potential issue | 🟠 Major
Relative imports require package initialization; this will fail with ImportError.

`from .reasoning_processes import ...` and `from .schema_columns import ...` (lines 9-10) perform relative imports, which require the parent directory to be a Python package (with `__init__.py`). There is no `__init__.py` in `recipes/noc-reasoning-agent/scripts/utils/`, `scripts/`, or any parent directory. Running the script directly with `python create_input_jsonl_from_incidents.py` will raise `ImportError: attempted relative import with no known parent package`.

All other scripts in this repository use absolute imports; switch to those or add `__init__.py` files to make `utils/` and `scripts/` proper packages.
recipes/noc-reasoning-agent/scripts/filtering/filter_rows.py-93-98 (1)

93-98: 🛠️ Refactor suggestion | 🟠 Major
Duplicated CSV loading with an overly broad `except Exception` — extract a helper.

The `try: read_csv(encoding="latin1") except Exception: read_csv()` pattern is repeated four times. The broad `except Exception` catches everything (e.g., `FileNotFoundError`, `MemoryError`) when only encoding errors are expected.

Extract a helper and narrow the exception:

Suggested refactor

```diff
+def _load_csv(path: str) -> pd.DataFrame:
+    """Load CSV, falling back from latin1 to default encoding."""
+    try:
+        return pd.read_csv(path, encoding="latin1")
+    except UnicodeDecodeError:
+        return pd.read_csv(path)
+
+
 def filter_auto(input_csv, output_csv):
     """Filter incident data and save results to a new CSV."""
-    try:
-        df = pd.read_csv(input_csv, encoding="latin1")
-    except Exception:
-        df = pd.read_csv(input_csv)
+    df = _load_csv(input_csv)
```

As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving."

Also applies to: 116-121, 131-135, 147-151
recipes/noc-reasoning-agent/scripts/tools.py-13-20 (1)
13-20: ⚠️ Potential issue | 🟠 Major
`Check_Alarm_Status` is missing the `_safe_tool_val` wrapper, unlike every other tool function.

All other tool functions pass through `_safe_tool_val()` to sanitize NaN/None values, but `Check_Alarm_Status` returns the raw `.get()` result directly. If the row contains `NaN` for the `"Check_Alarm_Status"` key, `.get()` returns the `NaN` value (since the key exists), which will not be JSON-safe.

```diff
 def Check_Alarm_Status(row: pd.Series, site_or_element_id: str) -> str:
     ...
-    return row.get("Check_Alarm_Status", "NotApplicable")
+    return _safe_tool_val(row.get("Check_Alarm_Status", "NotApplicable"))
```
30-30:⚠️ Potential issue | 🟠 MajorReplace
assertwith a proper check for runtime validation.
assertstatements are stripped in optimized mode (python -O), so this validation would silently disappear. Use a conditional raise instead.- assert len(tool_calls) == len(tool_response) + if len(tool_calls) != len(tool_response): + raise ValueError( + f"Mismatched tool_call/tool_response counts: {len(tool_calls)} vs {len(tool_response)}" + )As per coding guidelines, "ensure errors never pass silently unless explicitly silenced."
recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py-28-35 (1)
28-35: ⚠️ Potential issue | 🟠 Major
Silent `except Exception: continue` swallows all errors during resume loading.

This catches and silently ignores any exception (including unexpected ones like `PermissionError`, `MemoryError`, etc.) while parsing the output file. Per the coding guidelines, don't catch exceptions not normally expected. The batch variant (`create_agent_with_tools_batch.py` line 69) correctly catches only `json.JSONDecodeError`.

```diff
-        except Exception:
-            # ignore malformed lines so we can keep going
-            continue
+        except json.JSONDecodeError:
+            print(f"[WARNING] Skipping malformed line in output file: {line}")
```

As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving."
recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py-92-211 (1)
92-211: ⚠️ Potential issue | 🟠 Major
Heavy objects recreated on every loop iteration — significant performance waste.

`llm_with_tools`, `MemorySaver()`, and `create_react_agent(...)` are all created inside the per-row loop (lines 209-211). The tool closures are also redefined on every iteration (lines 102-205). Since the model and LLM wrappers don't change between rows, the agent and tool binding should be constructed once before the loop.

The tool functions can capture `row` via a factory function to avoid the B023 closure issue flagged by static analysis, while still being defined once.

Suggested restructuring (sketch)

```diff
+    # Build agent once outside the loop
+    checkpointer = MemorySaver()
+
     for index, row in tqdm(df.iterrows()):
         if index in processed_indices:
             continue
         input_question = row.get("input", "")
         print(f"Processing input {index + 1}/{len(df)}")

-        @tool
-        def Check_Alarm_Status(site_or_element_id: str) -> str:
-            ...
-        # ... all tool definitions ...
-
-        tools = [...]
-        llm_with_tools = chat_llm.bind_tools(tools)
-        checkpointer = MemorySaver()
-        react_agent = create_react_agent(llm_with_tools, tools, checkpointer=checkpointer)
+        # Define row-bound tools using a factory or partial
+        tools = _make_tools(row)
+        llm_with_tools = chat_llm.bind_tools(tools)
+        react_agent = create_react_agent(llm_with_tools, tools, checkpointer=checkpointer)
```

Note: Reusing the `MemorySaver` outside the loop is safe; the `thread_id` in `config` already isolates per-row state.
10-12:⚠️ Potential issue | 🟠 Major
_incident_idcan silently returnNone, corrupting downstream data.If neither
incident_identifiernornumberis present, this returnsNone. ThatNoneis then used as a dictionary key throughout (responses[number],formatted_steps_taken[number],full_steps[number]), silently merging unrelated incidents under a singleNonekey. This is a data integrity issue.Proposed fix
def _incident_id(data): """Synthetic schema uses incident_identifier; legacy uses number.""" - return data.get("incident_identifier") or data.get("number") + incident_id = data.get("incident_identifier") or data.get("number") + if incident_id is None: + raise ValueError(f"Missing incident identifier in data: {list(data.keys())}") + return incident_idrecipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-63-69 (1)
63-69: 🛠️ Refactor suggestion | 🟠 Major
Bare `Exception` catch silently swallows unexpected errors.

Line 66 catches all exceptions and just prints them, masking bugs in the parsing logic. Per coding guidelines, exceptions that are not normally expected should not be caught — let them propagate so issues surface clearly. The outer `JSONDecodeError` catch on line 68 is fine.

Proposed fix — remove the bare catch

```diff
     except json.JSONDecodeError as e:
         print(text)
         print(f"Error decoding JSON: {e}")
-    except Exception as e:
-        print(f"An unexpected error occurred: {e}")
```

As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-314-323 (1)
314-323: 🛠️ Refactor suggestion | 🟠 Major
Broad `except Exception` silently discards incidents on any error.

This catches and suppresses all errors — including programming bugs in `token_converting` or data access issues — and quietly removes the affected incidents from the output. This makes debugging very difficult and violates the fail-fast principle.

Consider catching only the specific expected exception (e.g., `KeyError`, `ValueError`) or at minimum, logging the incident data that caused the failure.

Proposed fix

```diff
     try:
         for i in range(len(steps_taken[number])):
             if steps_taken[number][i]["tool_call"]:
                 steps_taken[number][i]["tool_call"] = token_converting(
                     steps_taken[number][i]["tool_call"], model
                 )
             steps_taken[number][i]["thinking"] = reasoning_steps[number][steps_taken[number][i]["step_number"]]
-    except Exception as e:
-        print(e)
+    except (KeyError, ValueError) as e:
+        print(f"Error merging steps for incident {number}: {e}")
         broken_numbers.append(number)
```

As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-445-448 (1)
445-448: ⚠️ Potential issue | 🟠 Major
Case C silently drops intermediate non-tool-call steps — likely a bug.

When a step has no `tool_call` and is not the final step, `step_text` is computed but never appended to `current_assistant_content`. The comment on line 447 states "We already added to current_assistant_content at the top of loop" but no such addition exists. This means intermediate reasoning steps are lost from the conversation context, and their thinking/actions won't appear in subsequent `background` fields.

If these steps should accumulate context, the content needs to be appended:

Proposed fix (if accumulation is intended)

```diff
             # --- CASE C: Intermediate Step (just accumulation) ---
             else:
-                # We already added to current_assistant_content at the top of loop
-                pass
+                # Accumulate intermediate thinking into the conversation context
+                current_assistant_content.append(
+                    {"role": "assistant", "content": step_text}
+                )
```
85-87:⚠️ Potential issue | 🟠 MajorSilent data corruption when
"final<|message|>"delimiter is missing.If
textdoes not contain"final<|message|>",rfindreturns-1, and the slice becomestext[15:](i.e.,-1 + 16), silently producing garbage data for the thinking content with no warning.Proposed fix
- thinking = text[text.rfind("final<|message|>") + len("final<|message|>") :] + delimiter = "final<|message|>" + idx = text.rfind(delimiter) + if idx == -1: + print(f"Warning: delimiter '{delimiter}' not found for incident {number}, step {step_number}") + continue + thinking = text[idx + len(delimiter) :]recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-476-489 (1)
476-489: 🛠️ Refactor suggestion | 🟠 MajorSame broad
except Exceptionpattern — masks bugs during compilation.Same issue as in
merge_reasoning_steps. This silently counts programming errors as "incorrect incidents" and continues, making it hard to distinguish data issues from code bugs.Proposed fix
- except Exception as e: - print(f"Error for incident {number}: {e}") + except (KeyError, ValueError) as e: + print(f"Error for incident {number}: {e}", flush=True) incorrect_incidents += 1As per coding guidelines, "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving".
🟡 Minor comments (29)
recipes/noc-reasoning-agent/scripts/utils/split_mocktools_answers.py-94-94 (1)
94-94: ⚠️ Potential issue | 🟡 Minor
Output key `"problem code"` contains a space, inconsistent with all other keys.

All other output keys use underscores (e.g., `u_region2`, `short_description`). This looks like an accidental deviation. If intentional (matching downstream expectations), please confirm; otherwise rename to `"problem_code"`.

```diff
-        "problem code": d1.get("u_problem_code", ""),
+        "problem_code": d1.get("u_problem_code", ""),
```
60-71:⚠️ Potential issue | 🟡 MinorSame
.get("number")and silent-skip concern in file2 loading.Apply the same fix as suggested for the file1 block above.
Suggested fix
d = json.loads(line) - num = d.get("number") - if num: - data2[num] = d.get("generation", "") - except json.JSONDecodeError as e: - print(f"Error decoding JSON in file2 on line {line_num}: '{line}'. Error: {e}") + num = d["number"] + data2[num] = d.get("generation", "") + except (json.JSONDecodeError, KeyError) as e: + raise ValueError(f"Error processing file2 on line {line_num}: '{line}'. Error: {e}") from erecipes/noc-reasoning-agent/scripts/utils/split_mocktools_answers.py-46-57 (1)
46-57:⚠️ Potential issue | 🟡 MinorUse direct dictionary access for expected keys instead of
.get().Lines 53 and 67 use
d.get("number")then silently skip records where the key is missing. If"number"is a required field in these JSONL files, direct access (d["number"]) would fail fast with a clearKeyError, which is preferable. Additionally, theJSONDecodeErrorcatch-and-print on lines 56–57 (and 70–71) silently drops malformed lines, which can mask data corruption.Suggested fix for file1 loading
d = json.loads(line) - num = d.get("number") - if num: - data1[num] = d - except json.JSONDecodeError as e: - print(f"Error decoding JSON in file1 on line {line_num}: '{line}'. Error: {e}") + num = d["number"] + data1[num] = d + except (json.JSONDecodeError, KeyError) as e: + raise ValueError(f"Error processing file1 on line {line_num}: '{line}'. Error: {e}") from eAs per coding guidelines, "Do not use
.get()for accessing dictionary keys if the code expects them to be present" and "Do not catch exceptions when they are not normally expected to be raised; let code fail with clear errors instead of silently misbehaving."recipes/noc-reasoning-agent/scripts/visualization/extract_representation_columns.py-60-64 (1)
60-64:⚠️ Potential issue | 🟡 MinorMisleading section comment: says
solved_categorybut code usesresolution_method/close_code.Line 60 says "Bucket by solved_category" but line 63 derives the category from
resolution_methodorclose_code, notsolved_category. This will confuse future readers.Proposed fix
-# ---------- Bucket by solved_category ---------- +# ---------- Bucket by resolution_method / close_code ----------recipes/noc-reasoning-agent/scripts/visualization/extract_representation_columns.py-32-44 (1)
32-44:⚠️ Potential issue | 🟡 MinorUse direct dictionary access for guaranteed keys; validate
numis not None.Two issues in
enrich_row:
Line 33: If the row contains neither
"incident_identifier"nor"number",numsilently becomesNone, and the lookup silently fails — producing records withNonefields and no indication of a data problem. This should raise an error.Lines 41-43:
infois sourced fromcsv_idx(built on line 28 from the validated columnstime_to_resolve,solved_category,solved_reason), so these keys are guaranteed to exist. Per coding guidelines, use direct dictionary access instead of.get().Proposed fix
def enrich_row(row, lookup): - num = row.get("incident_identifier", row.get("number")) + if "incident_identifier" in row: + num = row["incident_identifier"] + elif "number" in row: + num = row["number"] + else: + raise KeyError(f"Row missing both 'incident_identifier' and 'number': {row}") info = lookup.get(num) if info is None: - # No match: set to None (or choose sensible defaults) row["time_to_resolve"] = None - row["solved_category"] = row.get("solved_category") # preserve if already present - row["solved_reason"] = row.get("solved_reason") + row["solved_category"] = row.get("solved_category") + row["solved_reason"] = row.get("solved_reason") else: - row["time_to_resolve"] = info.get("time_to_resolve") - row["solved_category"] = info.get("solved_category") - row["solved_reason"] = info.get("solved_reason") + row["time_to_resolve"] = info["time_to_resolve"] + row["solved_category"] = info["solved_category"] + row["solved_reason"] = info["solved_reason"] return rowAs per coding guidelines, "Do not use
.get()for accessing dictionary keys if the code expects them to be present; use direct dictionary accessdict[key]instead to allow proper error handling and fail fast with clear errors" and "ensure errors never pass silently unless explicitly silenced"..gitignore-52-56 (1)
52-56:⚠️ Potential issue | 🟡 MinorRemove duplicate
site/entries.Five identical
site/ignore patterns were added. Git only needs one entry per pattern, so the duplicates serve no purpose and suggest an accidental copy-paste or merge error.🧹 Proposed fix to remove duplicates
.idea site/ -site/ -site/ -site/ -site/recipes/noc-reasoning-agent/configs/config.ini-1-10 (1)
1-10:⚠️ Potential issue | 🟡 MinorConfig references appear stale relative to the tutorial.
The
[download]section listsqwen2.5-14,qwen2.5-32,gpt-oss-120b, andnemotron-49b-1.5, but the tutorial (noc-reasoning-agent.md) trains and evaluates with Qwen3-32B — which is absent here. Similarly,original_data_pathpoints todata/anonymized-Incidents_Last_6_Months.csvwhile the tutorial instructs users to place data asdata/synthetic_incidents.csv. If this config is meant to be consumed by scripts in this recipe, the mismatches could confuse users or break automation.recipes/noc-reasoning-agent/prompts/formatting_prompt.yaml-37-37 (1)
37-37:⚠️ Potential issue | 🟡 MinorMalformed sentence in
Create_Ticketdescription.
"Logs and routes tickets to departments (e.g., Field Ops) via the ticketing system. It (e.g., Jira/ServiceNow)."— the second sentence is a fragment with a dangling "It" and double space. This is part of the prompt the model will see.docs/tutorials/posts/noc-reasoning-agent.md-1-4 (1)
1-4:⚠️ Potential issue | 🟡 MinorIncorrect or backdated publication date.
date: 2025-09-05predates the PR creation (Feb 2026). If this is intentional for ordering purposes, a comment would help; otherwise, update to the actual publication date.recipes/noc-reasoning-agent/scripts/utils/reasoning_processes.py-51-51 (1)
51-51:⚠️ Potential issue | 🟡 MinorTypo in tool name:
Check_Performancel()→Check_Performance().Extra trailing "l" in the "Degraded Prach" workflow at step 7. This text is injected into prompts and could confuse the model's tool-calling behavior.
docs/tutorials/posts/noc-reasoning-agent.md-352-365 (1)
352-365: ⚠️ Potential issue | 🟡 Minor
Missing expected evaluation results for tested models.

The tutorial says "You should see a meaningful improvement" but doesn't provide concrete accuracy numbers or expected ranges. Including representative results (e.g., baseline vs. fine-tuned close-code accuracy on a reference dataset) would make the tutorial more actionable and help users validate their setup. As per coding guidelines, "When adding new benchmarks, add documentation with example commands for how to run evaluation, expected results for tested models, and any dataset-specific details".
recipes/noc-reasoning-agent/prompts/formatting_prompt.yaml-17-17 (1)
17-17: ⚠️ Potential issue | 🟡 Minor
Stray character in tool description.

Line 17 ends with a dangling "I" after the period: `"...cell status). I"`. This will be seen by the model and may degrade prompt quality.

✏️ Proposed fix

```diff
-  - Purpose: Polls to monitor the system, EMS/NM/Telemetry, and retrieve device health metrics (e.g., RU radiation, cell status). I
+  - Purpose: Polls to monitor the system, EMS/NM/Telemetry, and retrieve device health metrics (e.g., RU radiation, cell status).
```
229-244:⚠️ Potential issue | 🟡 MinorTypo: "indicence" → "incidence".
--log_dir=/workspace/prepare-sft-data-indicence(line 231) and the equivalent at line 296 both contain the same typo.✏️ Proposed fix
- --log_dir=/workspace/prepare-sft-data-indicence \ + --log_dir=/workspace/prepare-sft-data-incidence \docs/tutorials/posts/noc-reasoning-agent.md-96-118 (1)
96-118:⚠️ Potential issue | 🟡 MinorIn-place file overwriting risks data loss.
The second, third, and fourth
filter_rows.pycalls use the same path for both--input_csvand--output_csv(data/filtered_file.csv). If the script reads the CSV lazily or fails mid-write, data is lost with no backup. Consider using distinct intermediate filenames or documenting that the script fully reads before writing.recipes/noc-reasoning-agent/scripts/visualization/extract_scores.py-44-53 (1)
44-53:⚠️ Potential issue | 🟡 MinorMissing guard for display columns — will crash with
KeyErrorif columns are absent.The metrics section (lines 20–24) defensively checks which columns exist, but
columns_to_displayis accessed directly without validation. If the JSONL doesn't contain one of["expected_answer", "agent_response", "llm_judge_reason"], this will raise an unguardedKeyError.🛡️ Proposed fix
columns_to_display = ["expected_answer", "agent_response", "llm_judge_reason"] +available_display = [c for c in columns_to_display if c in df.columns] +if not available_display: + print("No display columns found; skipping sample output.") +else: -df_subset = df[columns_to_display].head(10) + df_subset = df[available_display].head(10)recipes/noc-reasoning-agent/scripts/utils/split_incident_data.py-58-62 (1)
58-62:⚠️ Potential issue | 🟡 MinorDebug prints left in production code.
Lines 59 (
print(row["response"])), 60 (# exit()), and 62 (print("in")) are debug artifacts that will produce excessive output and should be removed.recipes/noc-reasoning-agent/prompts/prompt_incident.yaml-48-48 (1)
48-48:⚠️ Potential issue | 🟡 MinorClose code list contains both "Reset RU" and "RU Reset".
These appear to be duplicates with reversed word order. This could confuse the model during inference. Consider keeping only one canonical form.
recipes/noc-reasoning-agent/scripts/utils/create_input_jsonl_from_incidents.py-144-144 (1)
144-144: ⚠️ Potential issue | 🟡 Minor
Debug `print(required_incidents)` left in.

This prints the full list of required incident IDs to stdout, which is noisy and potentially leaks data in production pipelines.
recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py-148-149 (1)
148-149: ⚠️ Potential issue | 🟡 Minor
Misleading success message.

Line 149 prints `"Evaluations saved to evaluations.jsonl"` regardless of the actual output path. Use the actual path:

🐛 Fix

```diff
-print("Evaluations saved to evaluations.jsonl")
+print(f"Evaluations saved to {args.output_file}")
```
96-99:⚠️ Potential issue | 🟡 Minor
--previewargument is parsed but never used.The
previewargument (line 97) is accepted but has no effect. Either implement the preview functionality or remove the argument.As per coding guidelines, "Avoid silently ignoring unused user-passed parameters; the code should fail if a required argument is not specified or if unsupported arguments are provided."
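A sketch of the implement-it option (the DataFrame name `df` is assumed from context):

```python
# Show a small sample and exit early when --preview is passed
if args.preview:
    print(df.head(10))
    raise SystemExit(0)
```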
recipes/noc-reasoning-agent/scripts/filtering/match_keywords.py-254-261 (1)
254-261: ⚠️ Potential issue | 🟡 Minor
Unused `pattern` parameter — the compiled regex patterns are dead code.

`find_keyword_matches` accepts a `pattern` argument (the compiled regex) but never uses it. The regex patterns compiled on lines 281-282 are passed into `find_keyword_matches` on lines 290 and 296 but silently discarded. The function only performs an exact `in` check against the `keywords` list.

Either remove the `pattern` parameter and the regex compilation, or refactor the function to actually use the regex for substring matching (which would change semantics — matching keywords within a field vs. exact match).

Option A: Remove unused pattern (keeps current exact-match behavior)

```diff
-def find_keyword_matches(row, pattern, keywords):
+def find_keyword_matches(row, keywords):
     """Finds which specific keywords from a list match within a DataFrame row."""
     # Synthetic schema uses resolution_method (same meaning as close_code)
     resolution_method = row.get("resolution_method", row.get("close_code", ""))
     if resolution_method and str(resolution_method).lower().replace(" ", "") in keywords:
         return True
     return False
```

And update the call sites:

```diff
-    hw_pattern = re.compile("|".join(re.escape(k) for k in hw_keywords), flags=re.IGNORECASE)
-    sw_pattern = re.compile("|".join(re.escape(k) for k in sw_keywords), flags=re.IGNORECASE)
     ...
-    hw_matches = find_keyword_matches(row, hw_pattern, hw_keywords)
+    hw_matches = find_keyword_matches(row, hw_keywords)
     ...
-    sw_matches = find_keyword_matches(row, sw_pattern, sw_keywords)
+    sw_matches = find_keyword_matches(row, sw_keywords)
```

Also applies to: 280-296
recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py-46-46 (1)
46-46: ⚠️ Potential issue | 🟡 Minor
Missing `required=True` on the `--input` argument.

`--input` has no default and no `required=True`, so `args.input` will be `None` if omitted, causing a `TypeError` on `Path(args.input)` at line 59.

```diff
-    parser.add_argument("--input", help="Path to the JSONL file containing inputs and answers")
+    parser.add_argument("--input", required=True, help="Path to the JSONL file containing inputs and answers")
```

As per coding guidelines, "the code should fail if a required argument is not specified."
recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py-126-126 (1)
126-126: ⚠️ Potential issue | 🟡 Minor
The `--limit` argument lacks `type=int` and will fail with a confusing error on non-numeric input.

`args.limit` is a string by default. Line 141 does `int(args.limit)`, which raises `ValueError` for non-numeric input. Declare the type on the argument:

```diff
-    parser.add_argument("--limit", required=False)
+    parser.add_argument("--limit", type=int, required=False, default=None)
```
58-58:⚠️ Potential issue | 🟡 MinorComment contradicts the value:
remove_no_think_tags: falsebut comment says "Enabled, as requested".The value is
false(disabled), but the comment reads "Enabled, as requested". This is confusing and likely a copy-paste error. If the filter should be enabled, change totrue; otherwise, fix the comment.- remove_no_think_tags: false # Enabled, as requested + remove_no_think_tags: true # Enabled, as requestedrecipes/noc-reasoning-agent/scripts/filtering/filter_rows.py-180-188 (1)
180-188:⚠️ Potential issue | 🟡 MinorInvalid
filter_typesilently does nothing.If the user passes
--filter_type=foo, none of theelifbranches match and the script exits silently with no output. Per coding guidelines, unsupported arguments should cause a failure.Suggested fix
elif args.filter_type == "close_codes": filter_close_codes(args.input_csv, args.output_csv) + else: + parser.error(f"Unknown filter_type: {args.filter_type!r}. Choose from: auto, soft_solve, problem_codes, close_codes")Or better, use
choiceson the argument itself:- parser.add_argument("--filter_type", type=str, default="auto") + parser.add_argument("--filter_type", type=str, default="auto", + choices=["auto", "soft_solve", "problem_codes", "close_codes"])As per coding guidelines, "Avoid silently ignoring unused user-passed parameters; the code should fail if a required argument is not specified or if unsupported arguments are provided."
recipes/noc-reasoning-agent/scripts/ns_pipelines/prepare_react_agent.py-22-52 (1)
22-52: ⚠️ Potential issue | 🟡 Minor
`get_tools` has an inconsistent return type: a tuple `(None, None)` vs. a dict.

On the error path (line 50), the function returns `(None, None)` (a 2-tuple), but on the success path (line 52), it returns `matches` (a dict). The caller on line 73 assigns the result to a single variable and compares with `== (None, None)`, which works incidentally but is fragile and confusing.

Return `None` on the empty case and check `if matches is None` at the call site.

Suggested fix

```diff
     if not matches:
-        # print("No tools!")
-        return None, None
-    # print(matches)
+        return None
     return matches
```

And update the caller:

```diff
-    matches = get_tools(d["initial_background"])
-    if matches == (None, None):
+    matches = get_tools(d["initial_background"])
+    if matches is None:
```
130-130:⚠️ Potential issue | 🟡 MinorTypo in user-facing message: "disgarded" → "discarded".
- print(f"{incorrect_incidents} incidents were not parsed correctly and disgarded.") + print(f"{incorrect_incidents} incidents were not parsed correctly and discarded.")recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-46-52 (1)
46-52:⚠️ Potential issue | 🟡 MinorOff-by-one in
end_indexcheck:]-not-found case passes the guard.When
"]"is absent,rfindreturns-1, soend_indexbecomes0after the+ 1. The guardend_index != -1on line 52 evaluates toTrue, allowing a nonsensical slicetext[start_index:0](empty string) that falls through to aJSONDecodeErrorinstead of the intended error message on line 61.Proposed fix
- end_index = text.rfind("]") + 1 + end_index_raw = text.rfind("]") + end_index = end_index_raw + 1 - if start_index != -1 and end_index != -1: + if start_index != -1 and end_index_raw != -1:recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py-505-511 (1)
505-511:⚠️ Potential issue | 🟡 MinorMissing validation:
output_dirandoutput_filecan beNonefor their respective modes.When
parse_types == "compile_reasoning",output_diris passed tocompile_reasoningwhich callsos.makedirs(output_dir, ...)— this crashes withTypeErrorifNone. Similarly,output_filecan beNoneforsteps_extraction. Add upfront validation like the existing check forreasoning_jsonl.Proposed fix
def main(jsonl_file, input_file, output_file, parse_types, reasoning_jsonl=None, output_dir=None): if parse_types == "steps_extraction": + if not output_file: + raise ValueError("Please specify an output file by specifying --output") prepare_data_for_reasoning_traces(jsonl_file, input_file, output_file) elif parse_types == "compile_reasoning": if not reasoning_jsonl: raise ValueError("Please specify a reasoning jsonl file by specifying --reasoning_jsonl") + if not output_dir: + raise ValueError("Please specify an output directory by specifying --output_dir") compile_reasoning(jsonl_file, input_file, output_dir, reasoning_jsonl)
```python
skip = True
if skip and os.path.exists(output_path):
    os.remove(output_path)
processed_indices = load_processed_indices(output_path)
```
Hardcoded skip = True deletes the output file on every run, completely breaking resume logic.
Lines 132-134 unconditionally delete the output file before load_processed_indices is called on line 135. This means the resume logic (lines 57-71, 144-148) will never find any previously processed rows — the file is always gone. All the resume infrastructure becomes dead code.
This looks like a debug artifact that was accidentally left in.
```diff
-skip = True
-if skip and os.path.exists(output_path):
-    os.remove(output_path)
 processed_indices = load_processed_indices(output_path)
```
```python
model = AutoModelForCausalLM.from_pretrained(
    args.weights_dir, dtype=dtype, device_map=device_map, low_cpu_mem_usage=True
)
```
🧩 Analysis chain
🏁 Script executed:

```shell
# Find and read the batch variant file
fd "create_agent_with_tools_batch.py" --type f
```

Repository: NVIDIA-NeMo/Skills
Length of output: 130

🏁 Script executed:

```shell
# Find the non-batch variant for comparison
fd "create_agent_with_tools.py" --type f
```

Repository: NVIDIA-NeMo/Skills
Length of output: 124
🌐 Web query:
AutoModelForCausalLM.from_pretrained torch_dtype parameter transformers library
💡 Result:
AutoModelForCausalLM.from_pretrained(..., torch_dtype=...) (Hugging Face Transformers) controls the PyTorch floating-point dtype the model is instantiated/loaded into, which can significantly reduce memory usage for large LMs.
Common values:
- A specific dtype (e.g. `torch.float16`, `torch.bfloat16`, `torch.float32`) to force loading in that dtype. Example: `AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)`.[1]
- `"auto"` to automatically choose a dtype based on the checkpoint/model metadata (the docs describe it as deriving dtype automatically, commonly from the weights / config where available). Example: `AutoModelForCausalLM.from_pretrained(..., torch_dtype="auto", device_map="auto")`.[2]
Notes:
- This is only available for floating dtypes (per the model docs’ “Model Instantiation dtype” section).[1]
- It’s often paired with `device_map="auto"` when loading big models to fit across available devices.[2]
Sources:
[1] Transformers “Models” / “Model Instantiation dtype” documentation (examples using torch_dtype=torch.float16 and "auto", and noting floating dtypes only). (huggingface.co)
[2] Transformers “Loading models” guide showing AutoModelForCausalLM.from_pretrained(..., torch_dtype="auto", device_map="auto"). (huggingface.co)
Citations:
- 1: https://huggingface.co/docs/transformers/v4.15.0/en/main_classes/model?utm_source=openai
- 2: https://huggingface.co/docs/transformers/v4.53.1/models?utm_source=openai
🏁 Script executed:

```shell
cat -n recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py | sed -n '150,165p'
```

Repository: NVIDIA-NeMo/Skills
Length of output: 921

🏁 Script executed:

```shell
cat -n recipes/noc-reasoning-agent/scripts/create_agent_with_tools.py | sed -n '70,80p'
```

Repository: NVIDIA-NeMo/Skills
Length of output: 499
Incorrect parameter name: dtype= should be torch_dtype=.

AutoModelForCausalLM.from_pretrained() expects the keyword torch_dtype, not dtype. A dtype= argument is silently ignored, so the model loads in the default float32 precision instead of the intended bfloat16/float16, roughly doubling memory use and degrading performance.

The non-batch variant (create_agent_with_tools.py line 74) correctly uses torch_dtype=torch.float16.

Fix

```diff
 model = AutoModelForCausalLM.from_pretrained(
-    args.weights_dir, dtype=dtype, device_map=device_map, low_cpu_mem_usage=True
+    args.weights_dir, torch_dtype=dtype, device_map=device_map, low_cpu_mem_usage=True
 )
```
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/create_agent_with_tools_batch.py` around
lines 156-157: the call to AutoModelForCausalLM.from_pretrained uses the wrong
parameter name (dtype=), so the intended precision is silently ignored. Pass the
dtype via torch_dtype (e.g., torch_dtype=dtype) so the model loads in the
specified bfloat16/float16 precision, updating the
AutoModelForCausalLM.from_pretrained(...) invocation accordingly.
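For reference, a minimal sketch of the corrected call; the path literal stands in for the script's args.weights_dir, and the dtype/device_map values mirror what the script computes earlier:

```python
import torch
from transformers import AutoModelForCausalLM

dtype = torch.bfloat16   # or torch.float16, matching the script's intent
device_map = "auto"      # let transformers place layers across available devices

# torch_dtype is the keyword from_pretrained() recognizes, so the weights
# are actually materialized in reduced precision instead of float32.
model = AutoModelForCausalLM.from_pretrained(
    "path/to/weights",   # stands in for args.weights_dir
    torch_dtype=dtype,
    device_map=device_map,
    low_cpu_mem_usage=True,
)
```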
```python
# Parse arguments for input JSONL path
parser = argparse.ArgumentParser(description="Evaluation Pipeline for Agent Responses")
parser.add_argument("input_jsonl", help="Path to agent_responses.jsonl containing expected_answer and agent_response")
parser.add_argument("--output_file", help="Path to output")
```
--output_file is neither required nor given a default, so the script will crash at line 148.

If --output_file is omitted, args.output_file is None, and pd.DataFrame.to_json(None, ...) will raise a ValueError. Add required=True or a sensible default.

🐛 Proposed fix

```diff
-parser.add_argument("--output_file", help="Path to output")
+parser.add_argument("--output_file", required=True, help="Path to output JSONL")
```
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/evaluation/evaluation_with_judge.py` at
line 12: parser.add_argument("--output_file", help="Path to output") can leave
args.output_file as None and cause pd.DataFrame.to_json(args.output_file, ...)
to raise a ValueError. Update the argument declaration to either set
required=True or provide a sensible default (e.g.,
default="evaluation_results.json"), and/or add a guard before writing that
validates args.output_file is not None and raises a clear error message; update
references to args.output_file and the DataFrame.to_json call accordingly.
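Either variant from the prompt could look roughly like the sketch below; the default filename is illustrative:

```python
import argparse

parser = argparse.ArgumentParser(description="Evaluation Pipeline for Agent Responses")
parser.add_argument("input_jsonl", help="Path to agent_responses.jsonl")
parser.add_argument(
    "--output_file",
    default="evaluation_results.jsonl",  # illustrative default; any writable path works
    help="Path to output JSONL",
)
args = parser.parse_args()

# Fail fast with a clear message instead of letting to_json(None, ...) fail later.
if not args.output_file:
    parser.error("--output_file must point to a writable path")
```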
```python
m = re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", str(string), re.DOTALL)

if not m:
    # Handle case with no arguments, e.g., Check_Time[]
    m_no_args = re.match(r"^\s*([A-Za-z_]\w*)\s*\[\s*\]\s*$", str(string))
    if m_no_args:
        tool_name = m_no_args.group(1)
        raw_args = ""
    else:
        # If it doesn't match the syntax, return original string or raise error
        # returning string allows the LLM to fail gracefully or retry
        return string

tool_name, raw_args = m.groups()
```
Unreachable branch plus latent crash: m.groups() is called on None if the dead-code path is ever taken.
The first regex \[(.*)\] with re.DOTALL already matches empty brackets like Tool[], making the m_no_args branch (lines 157-160) dead code. More critically, if that branch were ever reached (e.g., after a future regex change), execution would fall through to line 166, which calls m.groups() on None and raises an AttributeError.
Restructure to eliminate the dead path and the latent crash:
Proposed fix

```diff
 m = re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", str(string), re.DOTALL)
 if not m:
-    # Handle case with no arguments, e.g., Check_Time[]
-    m_no_args = re.match(r"^\s*([A-Za-z_]\w*)\s*\[\s*\]\s*$", str(string))
-    if m_no_args:
-        tool_name = m_no_args.group(1)
-        raw_args = ""
-    else:
-        # If it doesn't match the syntax, return original string or raise error
-        # returning string allows the LLM to fail gracefully or retry
-        return string
-
-tool_name, raw_args = m.groups()
+    return string
+
+tool_name, raw_args = m.groups()
```

🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/utils/format_reasoning_json.py` around
lines 153-166: the code currently uses two regex matches (m and m_no_args) and
then unconditionally calls m.groups(), which creates dead code and a latent
crash. After the initial match m = re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$",
..., re.DOTALL), check whether m is truthy; if not, immediately return the
original string (or handle the error), and only then call m.groups() to assign
tool_name and raw_args. Remove the separate m_no_args branch entirely so empty
brackets are handled by the first regex and m.groups() is always called on a
valid match.
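Taken together, the simplified helper might look like the sketch below; the function name parse_tool_call is illustrative, and the fall-through return of the raw string mirrors the existing graceful-retry behavior:

```python
import re

def parse_tool_call(string):
    """Parse 'Tool_Name[args]' into (tool_name, raw_args).

    The single pattern also matches empty brackets such as 'Check_Time[]',
    so no separate no-args branch is needed.
    """
    m = re.match(r"^\s*([A-Za-z_]\w*)\s*\[(.*)\]\s*$", str(string), re.DOTALL)
    if not m:
        # Unparseable input is returned unchanged so the caller (or LLM) can retry.
        return string
    tool_name, raw_args = m.groups()
    return tool_name, raw_args.strip()
```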
```python
if __name__ == "__main__":
    # === Configuration ===
    # csv_file = "data/top_10_categorized_incidents.csv"  # Path to your CSV
    csv_file = "data/prompt_incident.yaml"
    column_name = "action_chronicle"  # default; overridden after loading data
    model = "openai/gpt-oss-120b"  # Tokenizer model
    output_csv = "outputs/work_notes_token_lengths.csv"
    output_plot = "outputs/work_notes_token_hist.png"

    # === Load Tokenizer ===
    tokenizer = AutoTokenizer.from_pretrained(model)
    print(f"[info] Tokenizer loaded for model: {model}")

    if csv_file.endswith((".yaml", ".yml")):
        with open(csv_file, "r", encoding="utf-8") as f:
            yaml_data = yaml.safe_load(f)

        # You might want to join everything into one big string, or tokenize each value
        if isinstance(yaml_data, dict):
            prompt_text = yaml.dump(yaml_data)
        else:
            prompt_text = str(yaml_data)

        token_count = len(tokenizer.encode(prompt_text, add_special_tokens=False))
        print(f"[info] YAML file {csv_file} contains {token_count} tokens.")
        exit(0)

    # === Load CSV ===
    df = pd.read_csv(csv_file, encoding="utf-8")
    if column_name not in df.columns:
        raise ValueError(f"Column '{column_name}' not found in CSV. Available columns: {df.columns.tolist()}")

    # === Count Tokens ===
    token_lengths = []
    zero_tokens = 0
    for idx, text in tqdm(enumerate(df[column_name]), total=len(df), desc="Tokenizing work_notes"):
        if pd.isna(text):  # Handle NaN values
            token_lengths.append(0)
            zero_tokens += 1
            continue
        # Count tokens for this row
        token_count = len(tokenizer.encode(str(text), add_special_tokens=False))
        token_lengths.append(token_count)

    # Count rows with 0 tokens
    print(f"\n[info] Number of rows with 0 tokens: {zero_tokens}")

    # Add results back to DataFrame
    df["work_notes_token_length"] = token_lengths

    # === Save CSV with token counts ===
    df.to_csv(output_csv, index=False)
    print(f"[info] Token lengths saved to {output_csv}")

    # === Summary Statistics ===
    print("\n=== Token Length Summary ===")
    print(df["work_notes_token_length"].describe())

    # === Plot Histogram ===
    plt.figure(figsize=(10, 6))
    plt.hist(df["work_notes_token_length"], bins=50)
    plt.title(f"Distribution of Token Lengths for {column_name}")
    plt.xlabel("Token Length")
    plt.ylabel("Frequency")
    plt.tight_layout()
    plt.savefig(output_plot, dpi=150)
    plt.close()
    print(f"[info] Histogram saved to {output_plot}")
    exit()
    main()
```
main() is unreachable: debug code was left in the __main__ block.
exit() on line 275 terminates the process before main() on line 276 is ever called. The entire __main__ block (lines 207-275) appears to be leftover development code with hardcoded paths (data/prompt_incident.yaml, openai/gpt-oss-120b) and duplicated logic. It also bypasses the defensive AutoTokenizer import guard, so line 217 will crash with a TypeError if transformers is not installed.
Replace the block with a clean entry point:

🐛 Proposed fix

```diff
 if __name__ == "__main__":
-    # === Configuration ===
-    # csv_file = "data/top_10_categorized_incidents.csv"  # Path to your CSV
-    csv_file = "data/prompt_incident.yaml"
-    column_name = "action_chronicle"  # default; overridden after loading data
-    model = "openai/gpt-oss-120b"  # Tokenizer model
-    output_csv = "outputs/work_notes_token_lengths.csv"
-    output_plot = "outputs/work_notes_token_hist.png"
-
-    # === Load Tokenizer ===
-    tokenizer = AutoTokenizer.from_pretrained(model)
-    print(f"[info] Tokenizer loaded for model: {model}")
-
-    if csv_file.endswith((".yaml", ".yml")):
-        with open(csv_file, "r", encoding="utf-8") as f:
-            yaml_data = yaml.safe_load(f)
-
-        # You might want to join everything into one big string, or tokenize each value
-        if isinstance(yaml_data, dict):
-            prompt_text = yaml.dump(yaml_data)
-        else:
-            prompt_text = str(yaml_data)
-
-        token_count = len(tokenizer.encode(prompt_text, add_special_tokens=False))
-        print(f"[info] YAML file {csv_file} contains {token_count} tokens.")
-        exit(0)
-
-    # === Load CSV ===
-    df = pd.read_csv(csv_file, encoding="utf-8")
-    if column_name not in df.columns:
-        raise ValueError(f"Column '{column_name}' not found in CSV. Available columns: {df.columns.tolist()}")
-
-    # === Count Tokens ===
-    token_lengths = []
-    zero_tokens = 0
-    for idx, text in tqdm(enumerate(df[column_name]), total=len(df), desc="Tokenizing work_notes"):
-        if pd.isna(text):  # Handle NaN values
-            token_lengths.append(0)
-            zero_tokens += 1
-            continue
-        # Count tokens for this row
-        token_count = len(tokenizer.encode(str(text), add_special_tokens=False))
-        token_lengths.append(token_count)
-
-    # Count rows with 0 tokens
-    print(f"\n[info] Number of rows with 0 tokens: {zero_tokens}")
-
-    # Add results back to DataFrame
-    df["work_notes_token_length"] = token_lengths
-
-    # === Save CSV with token counts ===
-    df.to_csv(output_csv, index=False)
-    print(f"[info] Token lengths saved to {output_csv}")
-
-    # === Summary Statistics ===
-    print("\n=== Token Length Summary ===")
-    print(df["work_notes_token_length"].describe())
-
-    # === Plot Histogram ===
-    plt.figure(figsize=(10, 6))
-    plt.hist(df["work_notes_token_length"], bins=50)
-    plt.title(f"Distribution of Token Lengths for {column_name}")
-    plt.xlabel("Token Length")
-    plt.ylabel("Frequency")
-    plt.tight_layout()
-    plt.savefig(output_plot, dpi=150)
-    plt.close()
-    print(f"[info] Histogram saved to {output_plot}")
-    exit()
-    main()
+    main()
```

🧰 Tools
🪛 Ruff (0.15.0)
[warning] 237-237: TRY003: Avoid specifying long messages outside the exception class
[warning] 242-242: B007: Loop control variable idx not used within loop body (rename unused idx to _idx)
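Both findings have small mechanical fixes, sketched here against the loop and the raise from the quoted block above (df, column_name, and tqdm come from that block):

```python
# B007: the loop index is unused, so rename it to _idx by convention
for _idx, text in tqdm(enumerate(df[column_name]), total=len(df), desc="Tokenizing work_notes"):
    ...

# TRY003: build the long message first, then raise with the variable
msg = f"Column '{column_name}' not found in CSV. Available columns: {df.columns.tolist()}"
raise ValueError(msg)
```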
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/utils/token_usage.py` around lines
207-276: the __main__ block contains leftover debug logic that hardcodes
csv_file and model, calls AutoTokenizer directly (unsafe if transformers is
missing), and calls exit() before main(), making main() unreachable. Replace the
entire if __name__ == "__main__": block so it simply parses CLI or default
arguments and invokes main(); ensure any tokenizer creation reuses the existing
guarded import logic (e.g., reuse the module's AutoTokenizer-safe factory or
wrap AutoTokenizer.from_pretrained in the same try/except used elsewhere),
eliminate the hardcoded paths and stray exit() calls, and wire csv_file,
column_name, model, output_csv, and output_plot into main() parameters or
argparse so the script runs via main() correctly.
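One possible shape for the rewritten entry point, assuming main() is refactored to accept these parameters (the flag names mirror the variables in the debug block but are otherwise illustrative):

```python
import argparse

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Token-length statistics for work notes")
    parser.add_argument("--csv_file", required=True, help="Input CSV (or YAML prompt) file")
    parser.add_argument("--column_name", default="action_chronicle")
    parser.add_argument("--model", default="openai/gpt-oss-120b", help="Tokenizer model id")
    parser.add_argument("--output_csv", default="outputs/work_notes_token_lengths.csv")
    parser.add_argument("--output_plot", default="outputs/work_notes_token_hist.png")
    args = parser.parse_args()

    # Tokenizer loading, with its guarded transformers import, stays inside main().
    main(
        csv_file=args.csv_file,
        column_name=args.column_name,
        model=args.model,
        output_csv=args.output_csv,
        output_plot=args.output_plot,
    )
```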
```python
def parse_final_reasoning(generation_text):
    """
    Parses the 'generation' field to extract only the final
    Thought, Action, and Observation steps.
    """
    # 1. Isolate the final, clean reasoning trace
    # This regex looks for the block starting with "Question:" and ending with "Finish[...]"
    trace_block_match = re.search(
        r"Question:.*?(Finish\[.*?\])",
        generation_text,
        re.DOTALL,  # Allows '.' to match newlines
    )

    # Find the last occurrence of "Finish"
    last_index = generation_text.rfind("Finish")

    if last_index != -1:
        finish_text = generation_text[last_index:].strip()
        print(finish_text)
    else:
        print("No 'Finish' found!")

    return finish_text

    if not trace_block_match:
        return "Final reasoning trace not found."

    reasoning_trace = trace_block_match.group(0)

    # 2. Extract each individual Thought, Action, and Observation
    # This regex finds all lines starting with the keywords
    step_pattern = re.compile(
        r"^(Thought|Action|Observation)\s+\d+:\s*(.*)$",
        re.MULTILINE,  # Allows '^' to match the start of each line
    )

    steps = step_pattern.findall(reasoning_trace)

    # 3. Format the results for clarity
    parsed_steps = []
    for step in steps:
        kind = step[0]  # "Thought", "Action", or "Observation"
        content = step[1]  # The text of the step
        parsed_steps.append(f"**{kind}:** {content}")

    return "\n".join(parsed_steps)
```
Dead code: lines 50–71 are unreachable after return on line 48.
parse_final_reasoning always returns at line 48 (or crashes if last_index == -1 due to the same unbound-variable pattern). Everything from line 50 onward is dead code. Additionally, lines 44 and 46 contain debug print statements that should be removed.
Either remove the dead code or restructure the function so the intended logic is reachable.
🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py`
around lines 26-71: parse_final_reasoning contains an early return and debug
prints that make the intended parsing logic unreachable. Remove the premature
return and the print statements, then restructure the function to (1) find the
final "Finish" block using trace_block_match, with last_index as a fallback,
(2) set finish_text deterministically (returning a clear error string if neither
is found), and (3) run the step_pattern regex against that text to extract
Thought/Action/Observation entries, format them into parsed_steps, and return
the joined string.
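A sketch of the restructured function along those lines; the fallback behavior when no trace is found is an illustrative choice:

```python
import re

def parse_final_reasoning(generation_text):
    """Extract the final Thought/Action/Observation steps from a generation."""
    trace_block_match = re.search(r"Question:.*?(Finish\[.*?\])", generation_text, re.DOTALL)
    if trace_block_match:
        reasoning_trace = trace_block_match.group(0)
    else:
        # Fall back to everything from the last "Finish" onward, if present.
        last_index = generation_text.rfind("Finish")
        if last_index == -1:
            return "Final reasoning trace not found."
        reasoning_trace = generation_text[last_index:]

    # Collect each numbered Thought/Action/Observation line.
    step_pattern = re.compile(r"^(Thought|Action|Observation)\s+\d+:\s*(.*)$", re.MULTILINE)
    parsed_steps = [f"**{kind}:** {content}" for kind, content in step_pattern.findall(reasoning_trace)]
    return "\n".join(parsed_steps)
```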
```python
def find_finish_action(generation_text):
    # Regex to capture everything inside Finish[ ... ]
    # match = re.search(r"Finish\[(.*?)\]", generation_text, re.DOTALL)

    # finish_action = match.group(1).strip()
    # return finish_action
    last_index = generation_text.rfind("Finish")

    if last_index != -1:
        finish_text = generation_text[last_index:].strip()
        # print(finish_text)
    else:
        print("No 'Finish' found!")

    return finish_text
```
UnboundLocalError when "Finish" is not found.
When generation_text doesn't contain "Finish", finish_text is never assigned, but line 88 returns it unconditionally. This will crash at runtime.
🐛 Proposed fix

```diff
 def find_finish_action(generation_text):
-    # Regex to capture everything inside Finish[ ... ]
-    # match = re.search(r"Finish\[(.*?)\]", generation_text, re.DOTALL)
-
-    # finish_action = match.group(1).strip()
-    # return finish_action
     last_index = generation_text.rfind("Finish")
-
     if last_index != -1:
-        finish_text = generation_text[last_index:].strip()
-        # print(finish_text)
+        return generation_text[last_index:].strip()
     else:
-        print("No 'Finish' found!")
-
-    return finish_text
+        return ""
```

🤖 Prompt for AI Agents
In `@recipes/noc-reasoning-agent/scripts/visualization/generate_trace_visualization.py`
around lines 74-88: find_finish_action assigns finish_text only when
last_index != -1 but always returns finish_text, causing an UnboundLocalError
when "Finish" isn't found. Handle the not-found case by returning a safe default
(e.g., None or an empty string) or raising a clear exception, and ensure
finish_text is defined on both branches (use last_index and generation_text to
construct the return value or the fallback).
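With the empty-string fallback, callers should treat a falsy return as "no Finish action" rather than assuming a non-empty string, e.g.:

```python
finish = find_finish_action(generation_text)
if not finish:
    # No Finish[...] in this trace; log and skip instead of crashing downstream.
    print("Skipping trace without a Finish action")
```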
@gwarmstrong This is the Telco team. Per our conversation two weeks ago, we would like to add this tutorial to create a reasoning model for telco NOC incidents. Thanks