
Commit 6178f6f

update readme and design
1 parent 0d2c29e commit 6178f6f

3 files changed: +28 −19 lines changed


README.md

Lines changed: 7 additions & 3 deletions

````diff
@@ -85,10 +85,14 @@ This is a tutorial project of [Pocket Flow](https://github.com/The-Pocket/Pocket
 
 7. Generate a complete codebase tutorial by running the main script:
 ```bash
-python main.py https://github.com/username/repo --include "*.py" "*.js" --exclude "tests/*" --max-size 50000
+# Analyze a GitHub repository
+python main.py --repo https://github.com/username/repo --include "*.py" "*.js" --exclude "tests/*" --max-size 50000
+
+# Or, analyze a local directory
+python main.py --dir /path/to/your/codebase --include "*.py" --exclude "*test*"
 ```
-- `repo_url` - URL of the GitHub repository (required)
-- `-n, --name` - Project name (optional, derived from URL if omitted)
+- `--repo` or `--dir` - Specify either a GitHub repo URL or a local directory path (required, mutually exclusive)
+- `-n, --name` - Project name (optional, derived from URL/directory if omitted)
 - `-t, --token` - GitHub token (or set GITHUB_TOKEN environment variable)
 - `-o, --output` - Output directory (default: ./output)
 - `-i, --include` - Files to include (e.g., "*.py" "*.js")
````
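The "required, mutually exclusive" behavior of `--repo`/`--dir` described above is the kind of thing `argparse` handles with a mutually exclusive group. A minimal sketch, assuming the flag names from the README; the defaults and the `build_parser` helper are illustrative, not taken from the actual `main.py`:

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Generate a codebase tutorial.")
    # Exactly one source must be given: a GitHub URL or a local path.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--repo", help="URL of the GitHub repository")
    source.add_argument("--dir", help="Path to a local codebase directory")
    parser.add_argument("-n", "--name", help="Project name (derived from URL/directory if omitted)")
    parser.add_argument("-o", "--output", default="./output", help="Output directory")
    parser.add_argument("-i", "--include", nargs="+", help='Files to include, e.g. "*.py" "*.js"')
    parser.add_argument("-e", "--exclude", nargs="+", help='Files to exclude, e.g. "tests/*"')
    parser.add_argument("--max-size", type=int, help="Maximum file size in bytes")
    return parser

# Passing both --repo and --dir (or neither) makes parse_args exit with an error.
args = build_parser().parse_args(["--dir", "/tmp/proj", "--include", "*.py"])
```

With a group marked `required=True`, argparse enforces both the "exactly one" constraint and the usage message for free.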

docs/design.md

Lines changed: 9 additions & 5 deletions

```diff
@@ -68,8 +68,12 @@ flowchart TD
 1. **`crawl_github_files`** (`utils/crawl_github_files.py`) - *External Dependency: requests*
    * *Input*: `repo_url` (str), `token` (str, optional), `max_file_size` (int, optional), `use_relative_paths` (bool, optional), `include_patterns` (set, optional), `exclude_patterns` (set, optional)
    * *Output*: `dict` containing `files` (dict[str, str]) and `stats`.
-   * *Necessity*: Required by `FetchRepo` to download and read the source code from GitHub. Handles cloning logic implicitly via API calls, filtering, and file reading.
-2. **`call_llm`** (`utils/call_llm.py`) - *External Dependency: LLM Provider API (e.g., OpenAI, Anthropic)*
+   * *Necessity*: Required by `FetchRepo` to download and read source code from GitHub if a `repo_url` is provided. Handles cloning logic implicitly via API calls, filtering, and file reading.
+2. **`crawl_local_files`** (`utils/crawl_local_files.py`) - *External Dependency: None*
+   * *Input*: `directory` (str), `max_file_size` (int, optional), `use_relative_paths` (bool, optional), `include_patterns` (set, optional), `exclude_patterns` (set, optional)
+   * *Output*: `dict` containing `files` (dict[str, str]).
+   * *Necessity*: Required by `FetchRepo` to read source code from a local directory if a `local_dir` path is provided. Handles directory walking, filtering, and file reading.
+3. **`call_llm`** (`utils/call_llm.py`) - *External Dependency: LLM Provider API (e.g., OpenAI, Anthropic)*
    * *Input*: `prompt` (str)
    * *Output*: `response` (str)
    * *Necessity*: Used by `IdentifyAbstractions`, `AnalyzeRelationships`, `OrderChapters`, and `WriteChapters` for code analysis and content generation. Needs careful prompt engineering and YAML validation (implicit via `yaml.safe_load` which raises errors).
```
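The commit does not show the body of `utils/crawl_local_files.py`, only its interface in the design doc (directory walking, glob filtering, size limit). A minimal sketch of what that interface implies, with all internals being assumptions rather than the actual implementation:

```python
import fnmatch
import os

def crawl_local_files(directory, max_file_size=None, use_relative_paths=True,
                      include_patterns=None, exclude_patterns=None):
    """Walk `directory` and return {"files": {path: content}} per the design doc."""
    files = {}
    for root, _dirs, names in os.walk(directory):
        for name in names:
            abs_path = os.path.join(root, name)
            rel_path = os.path.relpath(abs_path, directory)
            path = rel_path if use_relative_paths else abs_path
            # Apply include/exclude glob patterns against the relative path.
            if include_patterns and not any(fnmatch.fnmatch(rel_path, p) for p in include_patterns):
                continue
            if exclude_patterns and any(fnmatch.fnmatch(rel_path, p) for p in exclude_patterns):
                continue
            if max_file_size is not None and os.path.getsize(abs_path) > max_file_size:
                continue
            try:
                with open(abs_path, "r", encoding="utf-8") as f:
                    files[path] = f.read()
            except (UnicodeDecodeError, OSError):
                continue  # Skip binary or unreadable files
    return {"files": files}
```

Note the asymmetry the design doc records: the GitHub crawler also returns `stats`, while the local crawler returns only `files`.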
```diff
@@ -105,11 +109,11 @@ shared = {
 > Notes for AI: Carefully decide whether to use Batch/Async Node/Flow. Removed explicit try/except in exec, relying on Node's built-in fault tolerance.
 
 1. **`FetchRepo`**
-   * *Purpose*: Download the repository code and load relevant files into memory using the crawler utility.
+   * *Purpose*: Download the repository code (from GitHub) or read from a local directory, loading relevant files into memory using the appropriate crawler utility.
    * *Type*: Regular
    * *Steps*:
-     * `prep`: Read `repo_url`, optional `github_token`, `output_dir` from shared store. Define `include_patterns` (e.g., `{"*.py", "*.js", "*.md"}`) and `exclude_patterns` (e.g., `{"*test*", "docs/*"}`). Set `max_file_size` and `use_relative_paths` flags. Determine `project_name` from `repo_url` if not present in shared.
-     * `exec`: Call `crawl_github_files(shared["repo_url"], token=shared["github_token"], include_patterns=..., exclude_patterns=..., max_file_size=..., use_relative_paths=True)`. Convert the resulting `files` dictionary into a list of `(path, content)` tuples.
+     * `prep`: Read `repo_url` (if provided), `local_dir` (if provided), optional `github_token`, `output_dir` from shared store. Define `include_patterns` (e.g., `{"*.py", "*.js", "*.md"}`) and `exclude_patterns` (e.g., `{"*test*", "docs/*"}`). Set `max_file_size` and `use_relative_paths` flags. Determine `project_name` from `repo_url` or `local_dir` if not present in shared.
+     * `exec`: If `repo_url` is present, call `crawl_github_files(...)`. Otherwise, call `crawl_local_files(...)`. Convert the resulting `files` dictionary into a list of `(path, content)` tuples.
      * `post`: Write the list of `files` tuples and the derived `project_name` (if applicable) to the shared store.
 
 2. **`IdentifyAbstractions`**
```
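The `exec` step of `FetchRepo` amounts to dispatching on which shared-store key is set and flattening the result. A minimal sketch assuming the shared-store keys named in the design doc; the crawlers are passed in as parameters here so the dispatch logic stands alone, which is not how the real node is structured:

```python
def fetch_files(shared, crawl_github_files, crawl_local_files):
    """Dispatch to the right crawler, then flatten to (path, content) tuples."""
    if shared.get("repo_url"):
        result = crawl_github_files(
            shared["repo_url"],
            token=shared.get("github_token"),
            include_patterns=shared.get("include_patterns"),
            exclude_patterns=shared.get("exclude_patterns"),
            max_file_size=shared.get("max_file_size"),
            use_relative_paths=True,
        )
    else:
        result = crawl_local_files(
            shared["local_dir"],
            include_patterns=shared.get("include_patterns"),
            exclude_patterns=shared.get("exclude_patterns"),
            max_file_size=shared.get("max_file_size"),
            use_relative_paths=True,
        )
    # Downstream nodes index files by position, hence the ordered tuple list.
    return list(result["files"].items())
```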

nodes.py

Lines changed: 12 additions & 11 deletions

```diff
@@ -5,17 +5,6 @@
 from utils.call_llm import call_llm
 from utils.crawl_local_files import crawl_local_files
 
-# Helper to create context from files, respecting limits (basic example)
-def create_llm_context(files_data):
-    context = ""
-    file_info = []  # Store tuples of (index, path)
-    for i, (path, content) in enumerate(files_data):
-        entry = f"--- File Index {i}: {path} ---\n{content}\n\n"
-        context += entry
-        file_info.append((i, path))
-
-    return context, file_info  # file_info is list of (index, path)
-
 # Helper to get content for specific file indices
 def get_content_for_indices(files_data, indices):
     content_map = {}
@@ -87,6 +76,18 @@ class IdentifyAbstractions(Node):
     def prep(self, shared):
         files_data = shared["files"]
         project_name = shared["project_name"]  # Get project name
+
+        # Helper to create context from files, respecting limits (basic example)
+        def create_llm_context(files_data):
+            context = ""
+            file_info = []  # Store tuples of (index, path)
+            for i, (path, content) in enumerate(files_data):
+                entry = f"--- File Index {i}: {path} ---\n{content}\n\n"
+                context += entry
+                file_info.append((i, path))
+
+            return context, file_info  # file_info is list of (index, path)
+
         context, file_info = create_llm_context(files_data)
         # Format file info for the prompt (comment is just a hint for LLM)
         file_listing_for_prompt = "\n".join([f"- {idx} # {path}" for idx, path in file_info])
```
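The `create_llm_context` helper this commit moves into `IdentifyAbstractions.prep` simply concatenates indexed file blocks. Reproduced here standalone (verbatim from the diff) with a small usage example to show the output shape; the sample `files_data` is made up:

```python
# Helper to create context from files, respecting limits (basic example)
def create_llm_context(files_data):
    context = ""
    file_info = []  # Store tuples of (index, path)
    for i, (path, content) in enumerate(files_data):
        entry = f"--- File Index {i}: {path} ---\n{content}\n\n"
        context += entry
        file_info.append((i, path))

    return context, file_info  # file_info is list of (index, path)

files_data = [("main.py", "print('hi')"), ("utils.py", "x = 1")]
context, file_info = create_llm_context(files_data)
# file_info pairs each index with its path: [(0, "main.py"), (1, "utils.py")]
# context interleaves "--- File Index i: path ---" headers with file contents.
```

The numeric indices let the LLM refer back to files compactly (as in the `file_listing_for_prompt` line above) instead of repeating paths.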
