Commit d80a7ce

refine the status and report
1 parent 33d338b commit d80a7ce

17 files changed: +556 -192 lines changed

src/sleepless_agent/config.yaml

Lines changed: 79 additions & 29 deletions
@@ -3,13 +3,14 @@ claude_code:
 model: claude-sonnet-4-5-20250929
 night_start_hour: 1
 night_end_hour: 9
-threshold_day: 100.0
-threshold_night: 100.0
+threshold_day: 20.0
+threshold_night: 80.0
 usage_command: claude /usage

 git:
 use_remote_repo: true
 remote_repo_url: [email protected]:TimeLovercc/sleepless-agent.git
+auto_create_repo: true

 agent:
 workspace_root: ./workspace
@@ -29,33 +30,82 @@ multi_agent_workflow:
 auto_generation:
 enabled: true
 prompts:
-- name: default_improvement
+- name: refine_focused
 prompt: |-
-You are a software development assistant. Generate ONE specific, actionable improvement idea for a Generic Python project.
-
-Generate task ideas in categories like:
-- Code quality (refactoring, optimization, testing)
-- Documentation (docstrings, README, examples)
-- Features (new functionality, enhancements)
-- Architecture (design improvements, modularity)
-- Performance (caching, algorithms, database queries)
-- Security (input validation, authentication, encryption)
-
-IMPORTANT: Classify the task type and prefix your response with [NEW] or [REFINE].
-Respond with the type prefix followed by a single task description in 1-2 sentences.
-weight: 0.7
-- name: bug_fix_maintenance
+The workspace has multiple ongoing projects and tasks that need attention.
+
+## Current State
+- Active tasks: {task_count} ({pending_count} pending, {in_progress_count} in progress)
+- Many tasks are already in progress or pending
+
+## Recent Work & Context
+{recent_work}
+
+## Task Generation
+Generate ONE REFINE task to continue or improve existing work:
+- Complete partial/incomplete tasks mentioned above
+- Follow up on outstanding items and recommendations
+- Enhance or improve existing projects in the workspace
+- Fix issues or improve quality of current work
+- Add missing components to existing implementations
+- Expand documentation or analysis from previous tasks
+
+IMPORTANT: Your response MUST start with [REFINE] followed by a specific, actionable task description in 1-2 sentences.
+
+Focus on completing or improving what already exists in the workspace rather than starting new projects.
+weight: 0.45
+- name: balanced
+prompt: |-
+Review the workspace state and generate a valuable task.
+
+## Current State
+- Active tasks: {task_count} ({pending_count} pending, {in_progress_count} in progress)
+
+## Recent Work & Context
+{recent_work}
+
+## Task Generation
+Generate ONE valuable task (NEW or REFINE):
+- For REFINE: improve existing work, complete partial tasks, enhance current projects
+- For NEW: create something useful, interesting, or educational
+
+Task categories to consider:
+- Software development (applications, scripts, tools, APIs)
+- Data analysis and visualization projects
+- Research and documentation (technical guides, comparisons, best practices)
+- Creative writing (stories, tutorials, technical articles)
+- System design and architecture documents
+- Educational content and examples
+- Automation and productivity improvements
+- Analysis and evaluation reports
+
+IMPORTANT: Prefix your response with [NEW] or [REFINE] followed by a specific, actionable task description in 1-2 sentences.
+weight: 0.35
+- name: new_friendly
 prompt: |-
-You are a software development assistant. Generate ONE specific bug fix or maintenance task for a Generic Python project.
-
-Focus on areas like:
-- Bug fixes (edge cases, error handling, race conditions)
-- Technical debt (outdated dependencies, deprecated APIs)
-- Code maintenance (cleanup, refactoring for clarity)
-- Robustness (input validation, error recovery)
-- Edge case handling (boundary conditions, null checks)
-
-IMPORTANT: Classify the task type and prefix your response with [NEW] or [REFINE].
-Respond with the type prefix followed by a single task description in 1-2 sentences.
-weight: 0.3
+Generate an interesting and valuable task for the workspace.
+
+## Current State
+- Active tasks: {task_count} ({pending_count} pending, {in_progress_count} in progress)
+- Few tasks in queue - good time for new projects!
+
+## Task Generation
+Generate ONE innovative task that creates value.
+
+Areas to explore:
+- Build practical tools and utilities
+- Create educational content and tutorials
+- Develop software applications or scripts
+- Write comprehensive documentation or guides
+- Design systems and architectures
+- Analyze and compare technologies or approaches
+- Generate creative content (technical writing, examples)
+- Research and summarize complex topics
+- Create data visualizations or analysis
+- Develop proof-of-concepts or experiments
+
+Can be NEW (fresh project) or REFINE (improve existing work) - choose what would be most valuable.
+
+IMPORTANT: Prefix your response with [NEW] or [REFINE] followed by a specific, actionable task description in 1-2 sentences.
+weight: 0.20

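The three new prompts carry weights of 0.45, 0.35 and 0.20, and each template uses `{task_count}`-style placeholders and asks for a `[NEW]` or `[REFINE]` prefix. The diff does not show how the project consumes these values, so the following is only a minimal sketch of one plausible reading: weighted random selection, `str.format` substitution, and prefix parsing. The helper names `pick_prompt`, `render` and `parse_task_type` are hypothetical, and the prompt bodies are shortened.

```python
import random

# Names and weights are copied from the config diff; prompt bodies are abbreviated.
PROMPTS = [
    {"name": "refine_focused", "weight": 0.45,
     "prompt": "Active tasks: {task_count} ({pending_count} pending, "
               "{in_progress_count} in progress)\n{recent_work}"},
    {"name": "balanced", "weight": 0.35,
     "prompt": "Active tasks: {task_count}\n{recent_work}"},
    {"name": "new_friendly", "weight": 0.20,
     "prompt": "Active tasks: {task_count}"},
]


def pick_prompt(prompts, rng):
    """Choose one prompt entry with probability proportional to its weight."""
    weights = [p["weight"] for p in prompts]
    return rng.choices(prompts, weights=weights, k=1)[0]


def render(prompt, **state):
    """Fill the {placeholders}; str.format ignores unused keyword arguments."""
    return prompt["prompt"].format(**state)


def parse_task_type(response):
    """Split a '[NEW] ...' or '[REFINE] ...' response into (type, description)."""
    for tag in ("NEW", "REFINE"):
        prefix = f"[{tag}]"
        if response.startswith(prefix):
            return tag.lower(), response[len(prefix):].strip()
    return None, response.strip()


chosen = pick_prompt(PROMPTS, random.Random(0))
text = render(chosen, task_count=3, pending_count=2,
              in_progress_count=1, recent_work="- drafted README")
```

Because the weights sum to 1.0 they can be read directly as probabilities, although `random.choices` would accept any non-negative relative weights.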
src/sleepless_agent/core/daemon.py

Lines changed: 14 additions & 5 deletions
@@ -5,7 +5,7 @@
 import asyncio
 import signal
 import sys
-from datetime import datetime, timedelta
+from datetime import datetime, timedelta, timezone
 from pathlib import Path

 from sqlalchemy.orm import sessionmaker
@@ -99,7 +99,11 @@ def __init__(self) -> None:
 str(self.config.agent.results_path),
 )

-self.git = GitManager(workspace_root=str(self.config.agent.workspace_root))
+auto_create_repo = git_config.get("auto_create_repo", False) if git_config else False
+self.git = GitManager(
+workspace_root=str(self.config.agent.workspace_root),
+auto_create_repo=auto_create_repo,
+)
 self.git.init_repo()
 if self.use_remote_repo and self.remote_repo_url:
 try:
@@ -210,7 +214,13 @@ async def run(self) -> None:
 logger.info("Sleepless Agent starting...")

 try:
-self.bot.start()
+# Start bot in background thread to avoid blocking the async event loop
+# The Slack SDK's connect() is synchronous and would block forever
+import threading
+bot_thread = threading.Thread(target=self.bot.start, daemon=True, name="SlackBot")
+bot_thread.start()
+await asyncio.sleep(0.5) # Give bot time to initialize
+logger.info("Slack bot started in background thread")
 except Exception as exc:
 logger.error(f"Failed to start bot: {exc}")
 return
@@ -259,13 +269,12 @@ async def _process_tasks(self) -> None:
 break

 await self.task_runtime.execute(task)
-self.scheduler.log_task_execution(task.id)
 await asyncio.sleep(1)
 except Exception as exc:
 logger.error(f"Error in task processing loop: {exc}")

 def _check_and_summarize_daily_reports(self) -> None:
-now = datetime.utcnow()
+now = datetime.now(timezone.utc).replace(tzinfo=None)
 end_of_day = now.replace(hour=23, minute=59, second=0, microsecond=0)

 if self.last_daily_summarization is None or self.last_daily_summarization.date() != now.date():

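The `run()` change above moves a blocking `bot.start()` call onto a daemon thread so the asyncio loop keeps running. The sketch below isolates that pattern with a hypothetical `FakeBot` standing in for the Slack bot. Unlike the diff, which sleeps a fixed 0.5 s, it waits on a `threading.Event` as a readiness signal; that is a variation for illustration, not what the commit does.

```python
import asyncio
import threading


class FakeBot:
    """Hypothetical stand-in for the Slack bot: start() blocks like a synchronous connect()."""

    def __init__(self) -> None:
        self.started = threading.Event()
        self.stop = threading.Event()

    def start(self) -> None:
        self.started.set()  # signal that the bot is up
        self.stop.wait()    # block until asked to stop


async def run(bot: FakeBot):
    # daemon=True means the thread will not keep the process alive at shutdown.
    thread = threading.Thread(target=bot.start, daemon=True, name="SlackBot")
    thread.start()
    # Wait for readiness in a worker thread so the event loop itself is never blocked.
    ready = await asyncio.to_thread(bot.started.wait, 2.0)
    return ready, thread.is_alive()


bot = FakeBot()
ready, alive = asyncio.run(run(bot))
bot.stop.set()  # let the background thread exit cleanly
```

One caveat with the thread approach is that an exception raised inside `bot.start` on the background thread does not propagate to the surrounding `try`/`except` in the caller.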
src/sleepless_agent/core/executor.py

Lines changed: 7 additions & 6 deletions
@@ -5,7 +5,7 @@
 import subprocess
 import time
 from collections import OrderedDict
-from datetime import datetime
+from datetime import datetime, timezone
 from pathlib import Path
 from typing import Optional, Tuple, List, Dict
 import shutil
@@ -243,7 +243,7 @@ def _ensure_readme_exists(self, workspace: Path, task_id: int, task_description:
 PRIORITY="serious" if project_id else "random",
 PRIORITY_LABEL="SERIOUS" if project_id else "RANDOM",
 PROJECT_NAME=project_name or "None",
-CREATED_AT=datetime.utcnow().isoformat(),
+CREATED_AT=datetime.now(timezone.utc).replace(tzinfo=None).isoformat(),
 )

 readme_path.write_text(content)
@@ -340,7 +340,7 @@ def _update_readme_task_history(self, workspace: Path, task_id: int,
 status_icon = "✅" if status == "completed" else "❌"
 git_line = f"\n- Git: {git_info}" if git_info else ""

-update = f"\n\n### Execution {datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S')}\n"
+update = f"\n\n### Execution {datetime.now(timezone.utc).replace(tzinfo=None).strftime('%Y-%m-%d %H:%M:%S')}\n"
 update += f"- Status: {status_icon} {status.upper()}\n"
 update += f"- Files Modified: {files_modified}\n"
 update += f"- Duration: {execution_time}s"
@@ -1176,7 +1176,7 @@ async def execute_task(
 project_id: Optional[str] = None,
 project_name: Optional[str] = None,
 workspace_task_type: Optional[str] = None,
-) -> Tuple[str, List[str], List[str], int]:
+) -> Tuple[str, List[str], List[str], int, Dict, Optional[str]]:
 """Execute task with Claude Code SDK

 Args:
@@ -1190,7 +1190,8 @@ async def execute_task(
 workspace_task_type: Workspace task type ("new" or "refine") - for workspace initialization

 Returns:
-Tuple of (output_text, files_modified, commands_executed, exit_code, usage_metrics)
+Tuple of (output_text, files_modified, commands_executed, exit_code, usage_metrics, eval_status)
+eval_status can be: "COMPLETE", "PARTIAL", "INCOMPLETE", "FAILED", or None if evaluator disabled
 """
 timeout = timeout or self.default_timeout

@@ -1472,7 +1473,7 @@ async def execute_task(
 commands=len(all_commands_executed),
 )

-return output_text, all_modified_files, all_commands_executed, final_exit_code, combined_metrics
+return output_text, all_modified_files, all_commands_executed, final_exit_code, combined_metrics, eval_status

 except CLINotFoundError:
 self._live_update(

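With this change `execute_task` returns six values instead of five, so every caller that unpacks the tuple positionally has to change at the same time. As a hedged illustration only, a `NamedTuple` is one way to keep such a return readable while staying compatible with positional unpacking. `ExecutionResult` and `fake_execute_task` are hypothetical names, and the commit itself keeps a plain tuple.

```python
from typing import Dict, List, NamedTuple, Optional


class ExecutionResult(NamedTuple):
    """Hypothetical wrapper mirroring the six-field tuple in the new signature."""
    output_text: str
    files_modified: List[str]
    commands_executed: List[str]
    exit_code: int
    usage_metrics: Dict
    eval_status: Optional[str]  # "COMPLETE", "PARTIAL", "INCOMPLETE", "FAILED", or None


def fake_execute_task() -> ExecutionResult:
    # Placeholder values; a real call would run the task.
    return ExecutionResult("done", ["a.py"], ["pytest"], 0, {"num_turns": 3}, "COMPLETE")


result = fake_execute_task()

# Positional unpacking still works, exactly as with a plain tuple.
output, files, commands, exit_code, metrics, eval_status = result
```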
src/sleepless_agent/core/queue.py

Lines changed: 8 additions & 8 deletions
@@ -3,7 +3,7 @@
 from __future__ import annotations

 import json
-from datetime import datetime, timedelta
+from datetime import datetime, timedelta, timezone
 from typing import List, Optional

 from sqlalchemy import case
@@ -112,7 +112,7 @@ def _op(session: Session) -> Optional[Task]:
 task = session.query(Task).filter(Task.id == task_id).first()
 if task:
 task.status = TaskStatus.IN_PROGRESS
-task.started_at = datetime.utcnow()
+task.started_at = datetime.now(timezone.utc).replace(tzinfo=None)
 task.attempt_count += 1
 return task

@@ -128,7 +128,7 @@ def _op(session: Session) -> Optional[Task]:
 task = session.query(Task).filter(Task.id == task_id).first()
 if task:
 task.status = TaskStatus.COMPLETED
-task.completed_at = datetime.utcnow()
+task.completed_at = datetime.now(timezone.utc).replace(tzinfo=None)
 task.result_id = result_id
 return task

@@ -146,7 +146,7 @@ def _op(session: Session) -> Optional[Task]:
 task.status = TaskStatus.FAILED
 task.error_message = error_message
 if not task.completed_at:
-task.completed_at = datetime.utcnow()
+task.completed_at = datetime.now(timezone.utc).replace(tzinfo=None)
 return task

 task = self._run_write(_op)
@@ -161,7 +161,7 @@ def _op(session: Session) -> Optional[Task]:
 task = session.query(Task).filter(Task.id == task_id).first()
 if task and task.status == TaskStatus.PENDING:
 task.status = TaskStatus.CANCELLED
-task.deleted_at = datetime.utcnow()
+task.deleted_at = datetime.now(timezone.utc).replace(tzinfo=None)
 return task

 task = self._run_write(_op)
@@ -247,7 +247,7 @@ def timeout_expired_tasks(self, max_age_seconds: int) -> List[Task]:
 return []

 def _op(session: Session) -> List[Task]:
-cutoff = datetime.utcnow() - timedelta(seconds=max_age_seconds)
+cutoff = datetime.now(timezone.utc).replace(tzinfo=None) - timedelta(seconds=max_age_seconds)
 tasks = (
 session.query(Task)
 .filter(
@@ -261,7 +261,7 @@ def _op(session: Session) -> List[Task]:
 if not tasks:
 return []

-now = datetime.utcnow()
+now = datetime.now(timezone.utc).replace(tzinfo=None)
 for task in tasks:
 task.status = TaskStatus.FAILED
 task.completed_at = now
@@ -360,7 +360,7 @@ def _op(session: Session) -> int:
 for task in tasks:
 if task.status == TaskStatus.PENDING:
 task.status = TaskStatus.CANCELLED
-task.deleted_at = datetime.utcnow()
+task.deleted_at = datetime.now(timezone.utc).replace(tzinfo=None)
 count += 1
 return count

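Every hunk in this file swaps `datetime.utcnow()`, which is deprecated as of Python 3.12, for `datetime.now(timezone.utc).replace(tzinfo=None)`. That expression still yields a naive UTC value, so it stays comparable with timestamps already stored as naive datetimes. Since it is repeated many times, a small helper could centralize it. The name `utcnow_naive` is hypothetical and does not appear in the commit.

```python
from datetime import datetime, timedelta, timezone


def utcnow_naive() -> datetime:
    """Current UTC time as a naive datetime, matching what datetime.utcnow() returned."""
    return datetime.now(timezone.utc).replace(tzinfo=None)


now = utcnow_naive()

# Same arithmetic as the timeout hunk above: naive minus timedelta stays naive.
cutoff = now - timedelta(seconds=3600)

# Mixing naive and aware values in a comparison raises TypeError,
# which is why the timezone info is stripped before touching stored timestamps.
try:
    datetime.now(timezone.utc) < now
    mixed_comparison_ok = True
except TypeError:
    mixed_comparison_ok = False
```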
src/sleepless_agent/core/task_runtime.py

Lines changed: 39 additions & 19 deletions
@@ -3,7 +3,7 @@
 import asyncio
 import os
 import time
-from datetime import datetime
+from datetime import datetime, timezone
 from pathlib import Path
 from typing import Iterable, List, Optional, Set, TYPE_CHECKING

@@ -92,6 +92,7 @@ async def execute(self, task) -> None:
 commands_executed,
 exit_code,
 usage_metrics,
+eval_status,
 ) = await self._run_task_with_timeout(task)

 processing_time = int(time.time() - start_time)
@@ -106,6 +107,7 @@ async def execute(self, task) -> None:
 duration_s=processing_time,
 total_cost_usd=usage_metrics.get("total_cost_usd"),
 turns=usage_metrics.get("num_turns"),
+eval_status=eval_status,
 )

 if exit_code != 0:
@@ -157,23 +159,41 @@ async def execute(self, task) -> None:
 else:
 task_log.warning("task.git.skipped", reason="workspace_missing")

-self.task_queue.mark_completed(task.id, result_id=result.id)
-self._log_success_metrics(
-task=task,
-processing_time=processing_time,
-files_modified=files_modified,
-commands_executed=commands_executed,
-git_commit_sha=git_commit_sha,
-git_pr_url=git_pr_url,
-usage_metrics=usage_metrics,
-result_output=result_output,
-)
-task_log.info(
-"task.complete",
-status="completed",
-duration_s=processing_time,
-git_commit=git_commit_sha,
-)
+# Check evaluator status before marking as completed
+# Only mark as completed if evaluator says COMPLETE, or if evaluator is disabled
+if eval_status and eval_status.upper() in ["INCOMPLETE", "FAILED"]:
+task_log.warning(
+"task.evaluator_incomplete",
+eval_status=eval_status,
+message="Task marked as failed due to evaluator status"
+)
+self.task_queue.mark_failed(task.id, f"Evaluator status: {eval_status}")
+self._log_failure_metrics(task=task, duration=processing_time, error=f"Evaluator: {eval_status}")
+task_log.info(
+"task.complete",
+status="failed",
+duration_s=processing_time,
+eval_status=eval_status,
+)
+else:
+self.task_queue.mark_completed(task.id, result_id=result.id)
+self._log_success_metrics(
+task=task,
+processing_time=processing_time,
+files_modified=files_modified,
+commands_executed=commands_executed,
+git_commit_sha=git_commit_sha,
+git_pr_url=git_pr_url,
+usage_metrics=usage_metrics,
+result_output=result_output,
+)
+task_log.info(
+"task.complete",
+status="completed",
+duration_s=processing_time,
+git_commit=git_commit_sha,
+eval_status=eval_status,
+)
 except PauseException as pause:
 await self._handle_pause_exception(
 task=task,
@@ -529,7 +549,7 @@ async def _handle_pause_exception(

 sleep_seconds = 0.0
 if pause.reset_time:
-now = datetime.utcnow()
+now = datetime.now(timezone.utc).replace(tzinfo=None)
 sleep_seconds = max(0.0, (pause.reset_time - now).total_seconds())
 task_log.info("task.pause.reset_time", reset_at=reset_time_iso)
 else:

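The evaluator gate added to `execute()` fails a task only when `eval_status` is `INCOMPLETE` or `FAILED`, compared case-insensitively. `COMPLETE`, `PARTIAL` and `None` (evaluator disabled) all fall through to the completed branch, which is slightly broader than the inline comment suggests. The sketch below restates just that decision as a pure function. `outcome_for` is a hypothetical name, and the earlier `exit_code != 0` handling is not modelled.

```python
from typing import Optional

# Statuses that turn an otherwise successful run into a failure, per the diff.
FAILING_STATUSES = {"INCOMPLETE", "FAILED"}


def outcome_for(eval_status: Optional[str]) -> str:
    """Map an evaluator status to the final task status used in the diff."""
    if eval_status and eval_status.upper() in FAILING_STATUSES:
        return "failed"
    # COMPLETE, PARTIAL, or None (evaluator disabled) all count as completed.
    return "completed"


outcomes = {
    status: outcome_for(status)
    for status in ("COMPLETE", "PARTIAL", "INCOMPLETE", "FAILED", None)
}
```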