Commit d448eaf
Change reorganization logic
1 parent 9bd6f9f commit d448eaf

27 files changed: +3484 −257 lines
Lines changed: 177 additions & 0 deletions
@@ -0,0 +1,177 @@
[prompts]
plan_prompt = """
You are an expert software architect analyzing repository structures. Generate a reorganization plan as JSON.

# REPOSITORY NAME
{repo_name}

# REPOSITORY STRUCTURE
{tree_structure}

# ANALYSIS GUIDELINES

## 1. Determine Project Type & Standards
- **Python**: Should have tests/, pyproject.toml/setup.py, __init__.py in packages
- **JavaScript/Node**: Should have src/, package.json
- **Go**: Should have cmd/, pkg/, internal/, go.mod
- **Java**: Should have src/main/java/, src/test/java/
- **LaTeX**: Should have latex/, figures/, references/, data/ folders
- **Generic**: Logical grouping, clear separation of concerns

## 2. Evaluate Current Structure (1-5 scale)
- 5: Perfect - follows all conventions
- 4: Good - minor improvements possible
- 3: Acceptable - works but could be better
- 2: Poor - scattered files, confusing structure
- 1: Chaotic - no clear organization

## 3. Critical Issues to Always Fix
- Missing essential files (__init__.py in Python packages, .gitignore)
- Configuration files in wrong locations
- Source files mixed with documentation/tests
- Invalid file extensions
- Duplicate or redundant files
- Empty directories with no purpose

## 4. Reorganization Principles
- **Minimal changes**: Only reorganize if a clear benefit exists
- **Follow conventions**: Use standard patterns for the project type
- **Preserve build/config**: Don't break build scripts or configuration paths
- **Group similar files**: When many files of the same type need to be moved, use a single `move_files` action with a glob pattern instead of listing each file individually.
- **Move directories**: To move an entire directory, use the `move_directory` action.

## 5. Repository Name Suggestions
- Based on the project type and purpose visible from the structure, suggest **up to 3 alternative names** that would better reflect the content.
- Provide suggestions only if the current name seems ambiguous or non-standard.

# DECISION RULES
1. If the structure is already good (score 4-5), return an empty actions list
2. Only suggest moving files if they are clearly in the wrong location
3. Always fix critical issues regardless of overall structure quality
4. Preserve build scripts and deployment configurations
5. Use `move_files` for bulk operations (e.g., all .png, all .sty) to keep the plan concise.
6. Use `move_directory` for moving whole folders (e.g., a dataset folder to data/).

# TASK
Analyze the repository and generate a JSON object with reorganization actions. If no changes are needed, return {{"actions": []}}. Optionally, suggest better repository names.

# RESPONSE FORMAT
Return ONLY valid JSON without any additional text.

{{
  "analysis_summary": {{
    "project_type": "python|javascript|go|java|latex|mixed|unknown",
    "quality_score": 1-5,
    "critical_issues_count": number,
    "recommendation": "no_changes|minor_fixes|moderate_reorg|major_reorg"
  }},
  "actions": [
    {{
      "type": "create_directory",
      "path": "relative/path/to/folder",
      "reason": "Standard convention for project type"
    }},
    {{
      "type": "move_files",
      "source_pattern": "*.png",
      "destination_dir": "figures/",
      "reason": "Group all figure images in figures/ folder"
    }},
    {{
      "type": "move_directory",
      "source": "Dataset_Extendido1_FigPaper",
      "destination": "data/Dataset_Extendido1_FigPaper",
      "reason": "Move dataset folder under data/ for better organization"
    }},
    {{
      "type": "move_file",
      "source": "current/path/to/file.ext",
      "destination": "new/path/to/file.ext",
      "reason": "File belongs with similar functionality files"
    }},
    {{
      "type": "delete_file",
      "path": "path/to/obsolete/file.ext",
      "reason": "Duplicate/obsolete file"
    }},
    {{
      "type": "delete_directory",
      "path": "path/to/empty/directory",
      "reason": "Empty directory with no purpose"
    }},
    {{
      "type": "create_file",
      "path": "path/to/new/file.ext",
      "content": "File content here",
      "reason": "Missing essential project file"
    }},
    {{
      "type": "rename_file",
      "old_path": "current/path/to/file.ext",
      "new_path": "new/path/to/file.ext",
      "reason": "Incorrect file extension or naming"
    }}
  ],
  "suggested_names": [
    "alternative_name_1",
    "alternative_name_2",
    "alternative_name_3"
  ]
}}
"""
validation_prompt = """
You are a senior software engineer validating repository reorganization plans.

# ORIGINAL REPOSITORY STRUCTURE
{tree_structure}

# PROPOSED REORGANIZATION PLAN
{proposed_plan}

# TASK
Validate the plan based only on structural information (file paths, names, extensions). If you find issues, return a corrected plan. The corrected plan must follow the same JSON format as the original plan. If the plan is already correct, return it unchanged.

# RESPONSE FORMAT
Return ONLY valid JSON without any additional text.

{{
  "corrected_plan": {{
    "analysis_summary": {{ ... }},
    "actions": [ ... ],
    "suggested_names": [ ... ]
  }}
}}
"""
fix_prompt = """
You are an expert software engineer tasked with fixing compilation/syntax errors in a codebase. You will receive error output and the content of the relevant files. Your goal is to produce corrected file content that resolves the errors.

# Project Type
{project_type}

# Error Output
{error_output}

# File Contents
{files_context}

# Instructions
- Analyze the errors and the provided file contents.
- For each file that needs changes, provide the corrected full content.
- Do not change unrelated parts of the code.
- Ensure your fixes are minimal and correct.
- Return a JSON object containing a list of fixes.

# Response Format
Return ONLY valid JSON without any additional text.
{{
  "fixes": [
    {{
      "file": "relative/path/to/file",
      "new_content": "corrected file content"
    }}
  ]
}}
"""

osa_tool/organization/core/__init__.py

Whitespace-only changes.

osa_tool/organization/core/analyzers/__init__.py

Whitespace-only changes.
Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
"""Base analyzer class for all language-specific analyzers."""

import os
from pathlib import Path
from typing import Dict, List, Set, Tuple
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor, as_completed

from osa_tool.utils.logger import logger


class BaseAnalyzer:
    """
    Abstract base class for all language-specific analyzers.

    This class provides a common interface and shared functionality for analyzing
    source code files in different programming languages. Subclasses must implement
    language-specific methods for discovering files, extracting imports, and
    updating import statements.

    Attributes:
        base_path (Path): Root directory path for analysis
        file_extensions (List[str]): List of file extensions this analyzer handles
        discovered_files (List[str]): List of discovered files relative to base_path
        import_map (Dict[str, Set[str]]): Mapping from module names to files that import them
    """

    def __init__(self, base_path: str):
        """
        Initialize the BaseAnalyzer with a base path.

        Args:
            base_path: Root directory path for analysis
        """
        self.base_path = Path(base_path)
        self.file_extensions: List[str] = []
        self.discovered_files: List[str] = []
        self.import_map: Dict[str, Set[str]] = {}

    def discover_files(self) -> List[str]:
        """
        Walk the base_path and collect all files with the configured extensions.
        Hidden directories (starting with '.') are skipped.

        Returns:
            List[str]: List of discovered file paths relative to base_path
        """
        self.discovered_files = []
        for ext in self.file_extensions:
            for path in self.base_path.rglob(f"*{ext}"):
                if path.is_file() and not any(part.startswith(".") for part in path.parts):
                    rel_path = str(path.relative_to(self.base_path))
                    self.discovered_files.append(rel_path)
        return self.discovered_files

    def extract_imports(self, file_path: str) -> Set[str]:
        """
        Return a set of imported module names found in the given file.

        Args:
            file_path: Path to the file relative to base_path

        Returns:
            Set[str]: Set of module names imported in the file

        Raises:
            NotImplementedError: Must be implemented by subclasses
        """
        raise NotImplementedError

    def get_import_key(self, file_path: str) -> str:
        """
        Return a canonical key (e.g. dotted module path) for the file.

        Args:
            file_path: Path to the file relative to base_path

        Returns:
            str: Canonical import key for the file

        Raises:
            NotImplementedError: Must be implemented by subclasses
        """
        raise NotImplementedError

    def update_imports_in_file(self, file_path: str, old_import: str, new_import: str) -> str | None:
        """
        Return the updated content of the file with old_import replaced by new_import,
        or None if no changes were made.

        Args:
            file_path: Path to the file relative to base_path
            old_import: Original import string to replace
            new_import: New import string to use

        Returns:
            Optional[str]: Updated file content or None if no changes needed

        Raises:
            NotImplementedError: Must be implemented by subclasses
        """
        raise NotImplementedError

    def build_import_map(self):
        """
        Populate import_map: for each imported module, the set of files that import it.
        Uses parallel processing with ThreadPoolExecutor for performance.
        """
        import_map = defaultdict(set)

        def process_file(fpath: str) -> Tuple[str, Set[str]]:
            """
            Process a single file to extract its imports.

            Args:
                fpath: File path relative to base_path

            Returns:
                Tuple[str, Set[str]]: File path and its imports
            """
            try:
                imports = self.extract_imports(fpath)
                return fpath, imports
            except Exception as e:
                logger.error(f"Error extracting imports from {fpath}: {e}")
                return fpath, set()

        with ThreadPoolExecutor(max_workers=os.cpu_count()) as executor:
            future_to_file = {executor.submit(process_file, f): f for f in self.discovered_files}
            for future in as_completed(future_to_file):
                fpath, imports = future.result()
                for imp in imports:
                    import_map[imp].add(fpath)

        self.import_map = dict(import_map)

    def get_files_importing_module(self, module_path: str) -> Set[str]:
        """
        Return all files that import the given module.

        Args:
            module_path: Module path to look up

        Returns:
            Set[str]: Set of file paths that import the module
        """
        return self.import_map.get(module_path, set())
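For context, a subclass's `extract_imports` for Python sources could be sketched with the standard `ast` module (a hypothetical helper operating on source text rather than a file path; it is not part of this commit):

```python
import ast

def extract_python_imports(source: str) -> set[str]:
    # Collect module names from `import x` and `from x import y` statements.
    # Relative imports (where node.module is None) are skipped in this sketch.
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules

print(extract_python_imports("import os\nfrom pathlib import Path"))
```

A file-path-based `extract_imports` would read the file relative to `base_path` and feed its text through a helper like this.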
