Commit 91cc250

feat: update Dockerfile for frontend build process and adjust output directory in vite.config.ts
1 parent 9ff06f0 commit 91cc250

3 files changed

+350
-9
lines changed

.github/copilot-instructions.md

Lines changed: 345 additions & 0 deletions
@@ -0,0 +1,345 @@
# OpenDeepWiki Development Guide for AI Agents

## Project Overview

**OpenDeepWiki** is an AI-driven code knowledge base system built on **.NET 9** and **Semantic Kernel**. It analyzes code repositories, generates documentation, creates directory structures, and supports MCP (Model Context Protocol) for AI integration.

### Core Purpose
- Convert GitHub/GitLab/Gitee repositories into searchable knowledge bases
- Auto-generate documentation, READMEs, and code analysis via LLM
- Support multiple AI providers (OpenAI and AzureOpenAI today; Anthropic support is planned)
- Provide MCP endpoints for AI agents to query repository knowledge

---

## Architecture

### Full-Stack Structure
```
Backend: .NET 9 ASP.NET Core + Entity Framework Core + Semantic Kernel
Frontend: React 19 + TypeScript + Vite + TailwindCSS + Shadcn/ui
Database: SQLite/PostgreSQL/MySQL/SQL Server (configurable)
Deployment: Docker Compose or Sealos
```

### Backend Layer Breakdown

**`src/KoalaWiki/`** - Main ASP.NET Core application
- **`BackendService/`** - Background task orchestration (warehouse sync, document processing)
- **`KoalaWarehouse/`** - Core document analysis engine:
  - **`Pipeline/`** - Resilient document processing pipeline with 5 ordered steps
  - **`GenerateThinkCatalogue/`** - AI-powered directory structure generation
  - **`DocumentPending/`** - Incomplete document task handling
  - **`MiniMapService.cs`** - Generates knowledge graphs via Mermaid

**`KoalaWiki.Core/`** - Data access layer
- **`DataAccess/IKoalaWikiContext.cs`** - DbSet definitions for 18+ entity types
- **`ServiceExtensions.cs`** - DI registration for database providers

**`KoalaWiki.Domains/`** - Domain models
- **`Warehouse.cs`** - Repository metadata and configuration
- **`Document.cs`** - Document content and metadata
- **`DocumentFile/`** - File structure and catalog definitions
- **`FineTuning/`** - Training dataset generation
- **`MCP/`** - Model Context Protocol entities

**`Provider/`** - Database implementations
- `KoalaWiki.Provider.PostgreSQL`
- `KoalaWiki.Provider.MySQL`
- `KoalaWiki.Provider.SqlServer`

### Frontend Layer Breakdown

**`web-site/src/`** - React application
- **`pages/`** - Route-based page components: `home`, `auth`, `admin`, `repository`, `chat`
- **`components/`** - Reusable UI components (RepositoryLayout, AdminLayout)
- **`services/`** - HTTP API clients and API wrappers
- **`stores/`** - Zustand state management stores
- **`i18n/`** - Internationalization (Chinese, English, French)
- **`routes/`** - React Router configuration with lazy loading

---

## Critical Data Flows

### 1. Repository Analysis Flow (as described in README.md)
```
Clone Repository → .gitignore Filtering → Directory Scanning →
AI Smart Filter (if file count > threshold) → Directory JSON →
Generate README → Project Classification → Project Overview →
Save to Database → Generate Task List (Think Catalogue) →
Process Documents Recursively → Generate Commit Log
```

### 2. Document Processing Pipeline (5-Step Architecture)
Located in `KoalaWarehouse/Extensions/ServiceCollectionExtensions.cs`:

**Execution Order:**
1. **ReadmeGenerationStep** - Generate README.md
2. **CatalogueGenerationStep** - Create directory structure
3. **ProjectClassificationStep** - Classify project type
4. **DocumentStructureGenerationStep** - Build document TOC
5. **DocumentContentGenerationStep** - Generate document content

**Key Classes:**
- `ResilientDocumentProcessingPipeline` - Wraps pipeline with retry/fallback logic
- `DocumentProcessingContext` - Carries data through steps
- `DocumentProcessingOrchestrator` - Orchestrates with OpenTelemetry tracing

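The ordered steps and the retry/fallback wrapper described above can be pictured with a small sketch. It is written in TypeScript purely for illustration (the real pipeline is C#), and every name in it (`Step`, `PipelineContext`, `runResilient`, `maxAttempts`) is hypothetical rather than taken from the OpenDeepWiki code base. The actual retry policy of `ResilientDocumentProcessingPipeline` is not documented here, so the fixed attempt count is an assumption.

```typescript
// Hypothetical context object carried through every step.
interface PipelineContext {
  log: string[];
  readme?: string;
}

// Hypothetical step shape: a name, async work, and an optional fallback.
interface Step {
  name: string;
  run(ctx: PipelineContext): Promise<void>;
  fallback?(ctx: PipelineContext): Promise<void>;
}

// Runs steps in order. Each step is retried up to maxAttempts times;
// if it still fails, its fallback runs, or the pipeline aborts if none exists.
async function runResilient(
  steps: Step[],
  ctx: PipelineContext,
  maxAttempts = 3,
): Promise<PipelineContext> {
  for (const step of steps) {
    let succeeded = false;
    for (let attempt = 1; attempt <= maxAttempts && !succeeded; attempt++) {
      try {
        await step.run(ctx);
        succeeded = true;
        ctx.log.push(`${step.name}: ok (attempt ${attempt})`);
      } catch {
        ctx.log.push(`${step.name}: failed (attempt ${attempt})`);
      }
    }
    if (!succeeded) {
      if (!step.fallback) {
        throw new Error(`${step.name} failed with no fallback`);
      }
      await step.fallback(ctx);
      ctx.log.push(`${step.name}: fallback`);
    }
  }
  return ctx;
}
```

Keeping the retry loop in the wrapper rather than in each step means individual steps stay simple and only need to throw on failure.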
### 3. AI Kernel Initialization (KernelFactory Pattern)
`KernelFactory.GetKernel()` initializes Semantic Kernel with:
- **LLM Provider Selection**: OpenAI or AzureOpenAI via `OpenAIOptions.ModelProvider`
- **Plugins Loaded**:
  - Code Analysis plugins (in `plugins/CodeAnalysis/`) with `skprompt.txt` prompts
  - FileTool plugin - reads repository files with token limits
  - AgentTool plugin - MCP integration
  - Dynamic MCP service loading from `DocumentOptions.McpStreamable`
- **Custom HttpClient** - Handles gzip/brotli decompression

---

## Key Development Workflows

### Build & Run

**Frontend:**
```bash
cd web-site
npm install
npm run dev # Dev server at localhost:5173
npm run build # Production build (output directory is set by build.outDir in vite.config.ts)
npm run build:analyze # Bundle analysis
npm run lint # ESLint check
```

**Backend:**
```bash
dotnet build KoalaWiki.sln
dotnet run --project src/KoalaWiki/KoalaWiki.csproj
# API at http://localhost:5085, OpenAPI at /scalar
```

**Docker (with make/Makefile):**
```bash
make build # Build all images
make build-frontend # Frontend only
make dev # Run all services with logs
make dev-backend # Backend only
make build-arm # ARM64 architecture
make build-amd # AMD64 architecture
```

### Database Migrations

Entity Framework Core migrations (in `KoalaWiki.Core/`):
```bash
dotnet ef migrations add <MigrationName> --project KoalaWiki.Core --startup-project src/KoalaWiki/KoalaWiki.csproj
dotnet ef database update --project KoalaWiki.Core --startup-project src/KoalaWiki/KoalaWiki.csproj
```

### Environment Configuration

Critical environment variables in `docker-compose.yml`:
- **`CHAT_MODEL`** (required) - Must support function calling (e.g., DeepSeek-V3, GPT-4-turbo)
- **`ANALYSIS_MODEL`** (optional) - Defaults to `CHAT_MODEL`; GPT-4.1 is recommended for better directory-structure generation
- **`CHAT_API_KEY`** - LLM API credential
- **`ENDPOINT`** - API base URL (e.g., https://api.openai.com/v1)
- **`MODEL_PROVIDER`** - OpenAI or AzureOpenAI
- **`DB_TYPE`** - sqlite, postgres, mysql, sqlserver
- **`DB_CONNECTION_STRING`** - Database connection
- **`LANGUAGE`** - Document generation language (default: Chinese)
- **`READ_MAX_TOKENS`** - Token limit for file reading (recommended: 70% of model max)
- **`MCP_STREAMABLE`** - Format: `serviceName=url` (e.g., `claude=http://localhost:8080/api/mcp`)

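Two of the variables above follow small, mechanical rules that are easy to get wrong by hand: the `serviceName=url` format of `MCP_STREAMABLE` and the 70% guideline for `READ_MAX_TOKENS`. The TypeScript sketch below illustrates both. The helper names (`parseMcpStreamable`, `readMaxTokens`, `McpService`) are hypothetical and are not part of OpenDeepWiki; the parser handles a single pair only, since the source does not state how multiple services would be separated.

```typescript
// Hypothetical shape for one parsed MCP_STREAMABLE entry.
interface McpService {
  name: string;
  url: string;
}

// Parses a single "serviceName=url" pair. Only the first "=" is treated as
// the separator, so "=" characters inside the URL's query string survive.
function parseMcpStreamable(value: string): McpService {
  const idx = value.indexOf("=");
  const name = idx < 0 ? "" : value.slice(0, idx).trim();
  const url = idx < 0 ? "" : value.slice(idx + 1).trim();
  if (name === "" || url === "") {
    throw new Error(`Expected "serviceName=url", got "${value}"`);
  }
  return { name, url };
}

// Computes a READ_MAX_TOKENS value as a whole-number percentage of the
// model's maximum context size; the default of 70 mirrors the 70% guideline.
function readMaxTokens(modelMaxTokens: number, percent = 70): number {
  if (Math.floor(modelMaxTokens) !== modelMaxTokens || modelMaxTokens <= 0) {
    throw new RangeError("modelMaxTokens must be a positive integer");
  }
  if (Math.floor(percent) !== percent || percent < 1 || percent > 100) {
    throw new RangeError("percent must be an integer from 1 to 100");
  }
  // Integer multiplication first avoids floating-point drift from 0.7.
  return Math.floor((modelMaxTokens * percent) / 100);
}
```

For a model with a 128000-token context window, `readMaxTokens(128000)` evaluates to 89600.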
---

## Project-Specific Patterns & Conventions

### 1. FastApi Service Pattern
Services inherit from `FastApi` (from FastService NuGet):
```csharp
using Microsoft.EntityFrameworkCore; // Provides ToListAsync

public class RepositoryService(IKoalaWikiContext db) : FastApi
{
    [HttpGet("/repos")]
    public async Task<List<Warehouse>> GetRepositories()
    {
        // Endpoint auto-exposed via FastService.
        // Assumes IKoalaWikiContext exposes a DbSet<Warehouse> named Warehouses.
        return await db.Warehouses.ToListAsync();
    }
}
```
- Automatically registers routes without explicit Route attributes
- DI via constructor parameters
- Response mapping via Mapster
173+
174+
### 2. Entity & Domain Model Structure
175+
Base entity in `KoalaWiki.Domains/Entity.cs`:
176+
```csharp
177+
public class Entity<TKey> : IEntity<TKey>, ICreateEntity
178+
{
179+
public TKey Id { get; set; }
180+
public DateTime CreatedAt { get; set; }
181+
}
182+
```
183+
- All domain entities inherit this with generic TKey (usually int/string)
184+
- `ICreateEntity` marks automatic timestamp tracking
185+
- Models in `KoalaWiki.Domains/` mapped to database via EF Core
186+
187+
### 3. Semantic Kernel Prompt Files
188+
Located in `src/KoalaWiki/plugins/CodeAnalysis/`:
189+
```
190+
plugins/
191+
├── GenerateReadme/
192+
│ ├── config.json # Plugin metadata
193+
│ └── skprompt.txt # Semantic Kernel prompt template
194+
├── CommitAnalyze/
195+
├── GenerateDescription/
196+
└── FunctionPrompt/
197+
```
198+
- `config.json` - Defines function signature, input/output schema
199+
- `skprompt.txt` - Template with `{{$variable}}` syntax (Semantic Kernel format, NOT Handlebars)
200+
- Loaded dynamically in `KernelFactory.GetKernel()`
201+
202+
### 4. Pipeline Context Flow Pattern
203+
```csharp
204+
// DocumentProcessingContext carries state through pipeline steps
205+
public class DocumentProcessingContext
206+
{
207+
public Document Document { get; init; }
208+
public Warehouse Warehouse { get; init; }
209+
public IKoalaWikiContext DbContext { get; init; }
210+
public Kernel? KernelInstance { get; set; } // Set in pipeline
211+
public string? GeneratedReadme { get; set; }
212+
public DocumentCatalog? Catalogue { get; set; }
213+
}
214+
```
215+
- Each step reads input, modifies context, passes to next step
216+
- Stored kernel instance reused across steps to save initialization overhead
217+
218+
### 5. i18n Convention (Frontend)
219+
`web-site/src/i18n/` structure:
220+
- **`locales/`** - JSON translation files (en.json, zh.json, fr.json)
221+
- **`mergeBundles.ts`** - Combines namespace bundles into single files
222+
- **`i18n.ts`** - i18next initialization
223+
- Usage: `const { t } = useTranslation('common')`
224+
- Build command: `npm run merge-i18n`
225+
226+
### 6. Component Lazy Loading (Frontend)
227+
Routes use `lazy()` + `Suspense`:
228+
```tsx
229+
const RepositoryLayout = lazy(() => import('@/components/layout/RepositoryLayout'))
230+
231+
<Suspense fallback={<Loading />}>
232+
<RepositoryLayout />
233+
</Suspense>
234+
```
235+
- Reduces initial bundle size
236+
- Fallback component shows during load
237+
238+
### 7. State Management (Frontend)
239+
Zustand stores in `web-site/src/stores/`:
240+
```typescript
241+
const useAuthStore = create((set) => ({
242+
isAuthenticated: false,
243+
setAuthenticated: (value) => set({ isAuthenticated: value }),
244+
}))
245+
```
246+
- Lightweight, zero-boilerplate state
247+
- Avoid Redux complexity
248+
249+
### 8. MCP Integration Points
250+
- **Backend MCP Server**: `src/KoalaWiki/MCP/` exposes repository knowledge
251+
- **MCP Client Tools**: `KernelFactory.GetKernel()` loads tools from external MCPs
252+
- **Streamable Config**: `DocumentOptions.McpStreamable` parses `MCP_STREAMABLE` env var
253+
254+
---
255+
256+
## Integration Points & External Dependencies
257+
258+
### LLM Providers
259+
- **OpenAI / AzureOpenAI** - Via Semantic Kernel connectors
260+
- **Anthropic** - Planned support
261+
- **DeepSeek** - Tested with DeepSeek-V3 model
262+
- **Custom Endpoints** - Use `ENDPOINT` env var for API-compatible services
263+
264+
### Git Integration
265+
- **LibGit2Sharp** - Clone, read .gitignore, commit history
266+
- **Octokit** - GitHub API for repo metadata (optional)
267+
- Repository cloned to `KOALAWIKI_REPOSITORIES` directory
268+
269+
### Data Storage
270+
- **Entity Framework Core** - ORM with provider abstraction
271+
- **4 Database Backends** - Pluggable at compile time via Provider projects
272+
273+
### Frontend UI Framework
274+
- **Shadcn/ui** - Headless component library (based on Radix UI)
275+
- **TailwindCSS** - Utility-first styling with Vite plugin
276+
- **Lucide React** - Icon library
277+
- **React Hook Form** + **Zod** - Form handling & validation
278+
279+
### Build Tools
280+
- **Vite 7.x** - Frontend bundler with gzip/brotli compression
281+
- **SWC** - Faster TypeScript compilation (via `@vitejs/plugin-react-swc`)
282+
- **.NET 9** - C# 13 language features
283+
- **Docker** - Multi-stage builds for production
284+
285+
---
286+
287+
## Common Commands Quick Reference
288+
289+
| Task | Command |
290+
|------|---------|
291+
| **Frontend dev** | `cd web-site && npm run dev` |
292+
| **Frontend build** | `cd web-site && npm run build` |
293+
| **Backend run** | `dotnet run --project src/KoalaWiki/KoalaWiki.csproj` |
294+
| **Build all Docker** | `make build` (or `docker-compose build`) |
295+
| **Run all services** | `make dev` (shows logs) |
296+
| **Stop services** | `docker-compose down` |
297+
| **View logs** | `docker-compose logs -f` |
298+
| **DB migration** | `dotnet ef migrations add MigrationName --project KoalaWiki.Core` |
299+
| **Lint frontend** | `cd web-site && npm run lint` |
300+
| **Clean build** | `make clean` |
301+
302+
---
303+
304+
## Debugging & Tracing
305+
306+
### OpenTelemetry Integration
307+
- **`DocumentProcessingOrchestrator`** uses `ActivitySource` for tracing
308+
- **Dashboard**: Aspire Dashboard at `http://localhost:18888` (in docker-compose)
309+
- Tags automatically captured: warehouse ID, document ID, processing duration
310+
311+
### Logging
312+
- **Serilog** configured in `Program.cs`
313+
- **Sinks**: Console, File
314+
- **Configuration**: `appsettings.json`, `appsettings.Development.json`
315+
- Backend logs shown in: `docker-compose logs -f koalawiki`
316+
317+
### Frontend DevTools
318+
- **React DevTools** - Component inspection
319+
- **Network tab** - API calls to `/api/` proxied to backend
320+
- **Console** - Error/warning output
321+
- **Vite HMR** - Hot module replacement on file save
322+
323+
---
324+
325+
## File Structure Reference
326+
327+
**Key Files for Common Tasks:**
328+
- **Add database entity**: `KoalaWiki.Domains/` + migration in `KoalaWiki.Core/`
329+
- **Add API endpoint**: Create `Services/*.cs` inheriting `FastApi`
330+
- **Add frontend page**: Create in `web-site/src/pages/` + route in `web-site/src/routes/index.tsx`
331+
- **Update prompts**: Edit `src/KoalaWiki/plugins/CodeAnalysis/*/skprompt.txt`
332+
- **Add i18n strings**: Update `web-site/src/i18n/locales/*.json`
333+
- **Configure build**: `web-site/vite.config.ts` for frontend, `src/KoalaWiki/KoalaWiki.csproj` for backend
334+
335+
---
336+
337+
## Notes for AI Agents
338+
339+
1. **Token Budget**: Set `READ_MAX_TOKENS` to 70% of model max tokens to leave headroom for processing
340+
2. **Model Requirements**: CHAT_MODEL must support function calling (GPT-4, DeepSeek-V3, Claude 3.5)
341+
3. **MCP Extensibility**: Add tools to pipeline by registering MCPs in `DocumentOptions.McpStreamable`
342+
4. **Database Flexibility**: Each database provider is a separate project; migrate to new DB by swapping reference
343+
5. **Frontend Caching**: Built frontend deployed as static files in `wwwroot/`; no need to rebuild frontend for backend-only changes
344+
6. **Async-First**: Most services use `async/await`; pipeline steps must be async
345+
7. **Error Handling**: Pipeline has resilient wrapper (`ResilientDocumentProcessingPipeline`); step failures logged but may fall back

src/KoalaWiki/Dockerfile

Lines changed: 5 additions & 8 deletions
@@ -27,19 +27,16 @@ FROM build AS publish
 ARG BUILD_CONFIGURATION=Release
 RUN dotnet publish "./KoalaWiki.csproj" -c $BUILD_CONFIGURATION -o /app/publish /p:UseAppHost=false
 
-FROM mcr.microsoft.com/dotnet/sdk:9.0 AS frontend-build
-WORKDIR /src
-# Install Node.js
-RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - && apt-get install -y nodejs
-# Copy web-site and build frontend
-COPY web-site ./web-site
-WORKDIR "/src/web-site"
+FROM node:22-bullseye AS frontend-build
+WORKDIR /app
+COPY web-site/ .
+WORKDIR "/app"
 RUN npm install
 RUN npm run build
 
 FROM base AS final
 WORKDIR /app
 COPY --from=publish /app/publish .
 # Copy built frontend files to wwwroot
-COPY --from=frontend-build /src/src/KoalaWiki/wwwroot ./wwwroot
+COPY --from=frontend-build /app/static ./wwwroot
 ENTRYPOINT ["dotnet", "KoalaWiki.dll"]

web-site/vite.config.ts

Lines changed: 0 additions & 1 deletion
@@ -43,7 +43,6 @@ export default defineConfig({
 },
 },
 build: {
-outDir: '../src/KoalaWiki/wwwroot',
 assetsDir: 'static',
 // Generate source maps (can be set to true during development)
 sourcemap: false,
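
One way to reason about the effect of removing `outDir` is to compute where built assets land from the two build options involved. The TypeScript sketch below assumes Vite's documented defaults (`build.outDir` is `dist`, `build.assetsDir` is `assets`) and that nothing else in the configuration overrides them; `resolveAssetsPath` and `BuildOptions` are hypothetical helpers for illustration, not part of Vite's API.

```typescript
// Vite's documented defaults for the two build options involved.
const DEFAULT_OUT_DIR = "dist";
const DEFAULT_ASSETS_DIR = "assets";

// Hypothetical subset of the `build` section of a Vite config.
interface BuildOptions {
  outDir?: string;
  assetsDir?: string;
}

// Returns the directory that receives built assets, relative to the project
// root: assetsDir is nested inside outDir, with defaults filling any gaps.
function resolveAssetsPath(build: BuildOptions): string {
  const outDir = build.outDir ?? DEFAULT_OUT_DIR;
  const assetsDir = build.assetsDir ?? DEFAULT_ASSETS_DIR;
  return `${outDir}/${assetsDir}`;
}
```

Under those assumptions, the configuration before this commit resolves to `../src/KoalaWiki/wwwroot/static` and the one after it to `dist/static`, both relative to `web-site/`. It may be worth confirming that the `COPY --from=frontend-build` source path in the Dockerfile matches the directory the build actually writes to.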
