Commit b450d0e

Merge branch 'canary'
2 parents fdaa2f0 + 0987ee4 commit b450d0e

File tree

132 files changed: +9706 -4349 lines changed

CONTRIBUTING.md

Lines changed: 46 additions & 16 deletions
Original file line number | Diff line number | Diff line change
@@ -11,33 +11,63 @@ Perplexica's codebase is organized as follows:
1111
- **UI Components and Pages**:
1212
- **Components (`src/components`)**: Reusable UI components.
1313
- **Pages and Routes (`src/app`)**: Next.js app directory structure with page components.
14-
- Main app routes include: home (`/`), chat (`/c`), discover (`/discover`), library (`/library`), and settings (`/settings`).
15-
- **API Routes (`src/app/api`)**: API endpoints implemented with Next.js API routes.
16-
- `/api/chat`: Handles chat interactions.
17-
- `/api/search`: Provides direct access to Perplexica's search capabilities.
18-
- Other endpoints for models, files, and suggestions.
14+
- Main app routes include: home (`/`), chat (`/c`), discover (`/discover`), and library (`/library`).
15+
- **API Routes (`src/app/api`)**: Server endpoints implemented with Next.js route handlers.
1916
- **Backend Logic (`src/lib`)**: Contains all the backend functionality including search, database, and API logic.
20-
- The search functionality is present inside `src/lib/search` directory.
21-
- All of the focus modes are implemented using the Meta Search Agent class in `src/lib/search/metaSearchAgent.ts`.
17+
- The search system lives in `src/lib/agents/search`.
18+
- The search pipeline is split into classification, research, widgets, and writing.
2219
- Database functionality is in `src/lib/db`.
23-
- Chat model and embedding model providers are managed in `src/lib/providers`.
24-
- Prompt templates and LLM chain definitions are in `src/lib/prompts` and `src/lib/chains` respectively.
20+
- Chat model and embedding model providers are in `src/lib/models/providers`, and models are loaded via `src/lib/models/registry.ts`.
21+
- Prompt templates are in `src/lib/prompts`.
22+
- SearXNG integration is in `src/lib/searxng.ts`.
23+
- Upload search lives in `src/lib/uploads`.
24+
25+
### Where to make changes
26+
27+
If you are not sure where to start, use this section as a map.
28+
29+
- **Search behavior and reasoning**
30+
31+
- `src/lib/agents/search` contains the core chat and search pipeline.
32+
- `classifier.ts` decides whether research is needed and what should run.
33+
- `researcher/` gathers information in the background.
34+
35+
- **Add or change a search capability**
36+
37+
- Research tools (web, academic, discussions, uploads, scraping) live in `src/lib/agents/search/researcher/actions`.
38+
- Tools are registered in `src/lib/agents/search/researcher/actions/index.ts`.
39+
40+
- **Add or change widgets**
41+
42+
- Widgets live in `src/lib/agents/search/widgets`.
43+
- Widgets run in parallel with research and show structured results in the UI.
44+
45+
- **Model integrations**
46+
47+
- Providers live in `src/lib/models/providers`.
48+
- Add new providers there and wire them into the model registry so they show up in the app.
49+
50+
- **Architecture docs**
51+
- High level overview: `docs/architecture/README.md`
52+
- High level flow: `docs/architecture/WORKING.md`
2553

2654
## API Documentation
2755

28-
Perplexica exposes several API endpoints for programmatic access, including:
56+
Perplexica includes API documentation for programmatic access.
2957

30-
- **Search API**: Access Perplexica's advanced search capabilities directly via the `/api/search` endpoint. For detailed documentation, see `docs/api/search.md`.
58+
- **Search API**: For detailed documentation, see `docs/API/SEARCH.md`.
3159

3260
## Setting Up Your Environment
3361

3462
Before diving into coding, setting up your local environment is key. Here's what you need to do:
3563

36-
1. In the root directory, locate the `sample.config.toml` file.
37-
2. Rename it to `config.toml` and fill in the necessary configuration fields.
38-
3. Run `npm install` to install all dependencies.
39-
4. Run `npm run db:migrate` to set up the local sqlite database.
40-
5. Use `npm run dev` to start the application in development mode.
64+
1. Run `npm install` to install all dependencies.
65+
2. Use `npm run dev` to start the application in development mode.
66+
3. Open http://localhost:3000 and complete the setup in the UI (API keys, models, search backend URL, etc.).
67+
68+
Database migrations are applied automatically on startup.
69+
70+
For full installation options (Docker and non Docker), see the installation guide in the repository README.
4171

4272
**Please note**: Docker configurations are present for setting up production environments, whereas `npm run dev` is used for development purposes.
4373

README.md

Lines changed: 9 additions & 12 deletions
@@ -18,9 +18,11 @@ Want to know more about its architecture and how it works? You can read it [here
1818

1919
🤖 **Support for all major AI providers** - Use local LLMs through Ollama or connect to OpenAI, Anthropic Claude, Google Gemini, Groq, and more. Mix and match models based on your needs.
2020

21-
**Smart search modes** - Choose Balanced Mode for everyday searches, Fast Mode when you need quick answers, or wait for Quality Mode (coming soon) for deep research.
21+
**Smart search modes** - Choose Speed Mode when you need quick answers, Balanced Mode for everyday searches, or Quality Mode for deep research.
2222

23-
🎯 **Six specialized focus modes** - Get better results with modes designed for specific tasks: Academic papers, YouTube videos, Reddit discussions, Wolfram Alpha calculations, writing assistance, or general web search.
23+
🧭 **Pick your sources** - Search the web, discussions, or academic papers. More sources and integrations are in progress.
24+
25+
🧩 **Widgets** - Helpful UI cards that show up when relevant, like weather, calculations, stock prices, and other quick lookups.
2426

2527
🔍 **Web search powered by SearxNG** - Access multiple search engines while keeping your identity private. Support for Tavily and Exa coming soon for even better results.
2628

@@ -81,7 +83,7 @@ There are mainly 2 ways of installing Perplexica - With Docker, Without Docker.
8183
Perplexica can be easily run using Docker. Simply run the following command:
8284

8385
```bash
84-
docker run -d -p 3000:3000 -v perplexica-data:/home/perplexica/data -v perplexica-uploads:/home/perplexica/uploads --name perplexica itzcrazykns1337/perplexica:latest
86+
docker run -d -p 3000:3000 -v perplexica-data:/home/perplexica/data --name perplexica itzcrazykns1337/perplexica:latest
8587
```
8688

8789
This will pull and start the Perplexica container with the bundled SearxNG search engine. Once running, open your browser and navigate to http://localhost:3000. You can then configure your settings (API keys, models, etc.) directly in the setup screen.
@@ -93,7 +95,7 @@ This will pull and start the Perplexica container with the bundled SearxNG searc
9395
If you already have SearxNG running, you can use the slim version of Perplexica:
9496

9597
```bash
96-
docker run -d -p 3000:3000 -e SEARXNG_API_URL=http://your-searxng-url:8080 -v perplexica-data:/home/perplexica/data -v perplexica-uploads:/home/perplexica/uploads --name perplexica itzcrazykns1337/perplexica:slim-latest
98+
docker run -d -p 3000:3000 -e SEARXNG_API_URL=http://your-searxng-url:8080 -v perplexica-data:/home/perplexica/data --name perplexica itzcrazykns1337/perplexica:slim-latest
9799
```
98100

99101
**Important**: Make sure your SearxNG instance has:
@@ -120,7 +122,7 @@ If you prefer to build from source or need more control:
120122

121123
```bash
122124
docker build -t perplexica .
123-
docker run -d -p 3000:3000 -v perplexica-data:/home/perplexica/data -v perplexica-uploads:/home/perplexica/uploads --name perplexica perplexica
125+
docker run -d -p 3000:3000 -v perplexica-data:/home/perplexica/data --name perplexica perplexica
124126
```
125127

126128
5. Access Perplexica at http://localhost:3000 and configure your settings in the setup screen.
@@ -237,13 +239,8 @@ Perplexica runs on Next.js and handles all API requests. It works right away on
237239

238240
## Upcoming Features
239241

240-
- [x] Add settings page
241-
- [x] Adding support for local LLMs
242-
- [x] History Saving features
243-
- [x] Introducing various Focus Modes
244-
- [x] Adding API support
245-
- [x] Adding Discover
246-
- [ ] Finalizing Copilot Mode
242+
- [ ] Adding more widgets, integrations, search sources
243+
- [ ] Adding authentication
247244

248245
## Support Us
249246

docker-compose.yaml

Lines changed: 2 additions & 0 deletions
@@ -1,6 +1,8 @@
11
services:
22
perplexica:
33
image: itzcrazykns1337/perplexica:latest
4+
build:
5+
context: .
46
ports:
57
- '3000:3000'
68
volumes:

docs/API/SEARCH.md

Lines changed: 13 additions & 12 deletions
@@ -57,7 +57,7 @@ Use the `id` field as the `providerId` and the `key` field from the models array
5757

5858
### Request
5959

60-
The API accepts a JSON object in the request body, where you define the focus mode, chat models, embedding models, and your query.
60+
The API accepts a JSON object in the request body, where you define the enabled search `sources`, chat models, embedding models, and your query.
6161

6262
#### Request Body Structure
6363

@@ -72,7 +72,7 @@ The API accepts a JSON object in the request body, where you define the focus mo
7272
"key": "text-embedding-3-large"
7373
},
7474
"optimizationMode": "speed",
75-
"focusMode": "webSearch",
75+
"sources": ["web"],
7676
"query": "What is Perplexica",
7777
"history": [
7878
["human", "Hi, how are you?"],
@@ -87,24 +87,25 @@ The API accepts a JSON object in the request body, where you define the focus mo
8787

8888
### Request Parameters
8989

90-
- **`chatModel`** (object, optional): Defines the chat model to be used for the query. To get available providers and models, send a GET request to `http://localhost:3000/api/providers`.
90+
- **`chatModel`** (object, required): Defines the chat model to be used for the query. To get available providers and models, send a GET request to `http://localhost:3000/api/providers`.
9191

9292
- `providerId` (string): The UUID of the provider. You can get this from the `/api/providers` endpoint response.
9393
- `key` (string): The model key/identifier (e.g., `gpt-4o-mini`, `llama3.1:latest`). Use the `key` value from the provider's `chatModels` array, not the display name.
9494

95-
- **`embeddingModel`** (object, optional): Defines the embedding model for similarity-based searching. To get available providers and models, send a GET request to `http://localhost:3000/api/providers`.
95+
- **`embeddingModel`** (object, required): Defines the embedding model for similarity-based searching. To get available providers and models, send a GET request to `http://localhost:3000/api/providers`.
9696

9797
- `providerId` (string): The UUID of the embedding provider. You can get this from the `/api/providers` endpoint response.
9898
- `key` (string): The embedding model key (e.g., `text-embedding-3-large`, `nomic-embed-text`). Use the `key` value from the provider's `embeddingModels` array, not the display name.
9999

100-
- **`focusMode`** (string, required): Specifies which focus mode to use. Available modes:
100+
- **`sources`** (array, required): Which search sources to enable. Available values:
101101

102-
- `webSearch`, `academicSearch`, `writingAssistant`, `wolframAlphaSearch`, `youtubeSearch`, `redditSearch`.
102+
- `web`, `academic`, `discussions`.
103103

104104
- **`optimizationMode`** (string, optional): Specifies the optimization mode to control the balance between performance and quality. Available modes:
105105

106106
- `speed`: Prioritize speed and return the fastest answer.
107107
- `balanced`: Provide a balanced answer with good speed and reasonable quality.
108+
- `quality`: Prioritize answer quality (may be slower).
108109

109110
- **`query`** (string, required): The search query or question.
110111

@@ -132,14 +133,14 @@ The response from the API includes both the final message and the sources used t
132133
"message": "Perplexica is an innovative, open-source AI-powered search engine designed to enhance the way users search for information online. Here are some key features and characteristics of Perplexica:\n\n- **AI-Powered Technology**: It utilizes advanced machine learning algorithms to not only retrieve information but also to understand the context and intent behind user queries, providing more relevant results [1][5].\n\n- **Open-Source**: Being open-source, Perplexica offers flexibility and transparency, allowing users to explore its functionalities without the constraints of proprietary software [3][10].",
133134
"sources": [
134135
{
135-
"pageContent": "Perplexica is an innovative, open-source AI-powered search engine designed to enhance the way users search for information online.",
136+
"content": "Perplexica is an innovative, open-source AI-powered search engine designed to enhance the way users search for information online.",
136137
"metadata": {
137138
"title": "What is Perplexica, and how does it function as an AI-powered search ...",
138139
"url": "https://askai.glarity.app/search/What-is-Perplexica--and-how-does-it-function-as-an-AI-powered-search-engine"
139140
}
140141
},
141142
{
142-
"pageContent": "Perplexica is an open-source AI-powered search tool that dives deep into the internet to find precise answers.",
143+
"content": "Perplexica is an open-source AI-powered search tool that dives deep into the internet to find precise answers.",
143144
"metadata": {
144145
"title": "Sahar Mor's Post",
145146
"url": "https://www.linkedin.com/posts/sahar-mor_a-new-open-source-project-called-perplexica-activity-7204489745668694016-ncja"
@@ -158,7 +159,7 @@ Example of streamed response objects:
158159

159160
```
160161
{"type":"init","data":"Stream connected"}
161-
{"type":"sources","data":[{"pageContent":"...","metadata":{"title":"...","url":"..."}},...]}
162+
{"type":"sources","data":[{"content":"...","metadata":{"title":"...","url":"..."}},...]}
162163
{"type":"response","data":"Perplexica is an "}
163164
{"type":"response","data":"innovative, open-source "}
164165
{"type":"response","data":"AI-powered search engine..."}
@@ -174,9 +175,9 @@ Clients should process each line as a separate JSON object. The different messag
174175

175176
### Fields in the Response
176177

177-
- **`message`** (string): The search result, generated based on the query and focus mode.
178+
- **`message`** (string): The search result, generated based on the query and enabled `sources`.
178179
- **`sources`** (array): A list of sources that were used to generate the search result. Each source includes:
179-
- `pageContent`: A snippet of the relevant content from the source.
180+
- `content`: A snippet of the relevant content from the source.
180181
- `metadata`: Metadata about the source, including:
181182
- `title`: The title of the webpage.
182183
- `url`: The URL of the webpage.
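
The streamed objects documented in this file arrive one JSON object per line. The sketch below shows one way a client might fold those lines into a final message and source list. It only handles the `init`, `sources`, and `response` types shown in the example; the function and type names are hypothetical and not part of Perplexica.

```typescript
// Hypothetical shape for one streamed line; only `type` and `data` appear in the docs.
interface StreamEvent {
  type: string;
  data?: unknown;
}

// Folds newline-delimited JSON events into the final message and sources.
// Throws if a non-empty line is not valid JSON.
function collectStream(text: string): { message: string; sources: unknown[] } {
  let message = "";
  let sources: unknown[] = [];

  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip blank lines between events

    const event = JSON.parse(trimmed) as StreamEvent;
    if (event.type === "response" && typeof event.data === "string") {
      message += event.data; // response chunks are concatenated in order
    } else if (event.type === "sources" && Array.isArray(event.data)) {
      sources = event.data; // each item carries `content` and `metadata`
    }
    // Other types (such as "init") carry no answer text and are ignored here.
  }

  return { message, sources };
}
```

In a real client the text would come from the response body as it streams in, so partial lines would need buffering until a newline arrives; this sketch assumes complete lines.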
@@ -185,5 +186,5 @@ Clients should process each line as a separate JSON object. The different messag
185186

186187
If an error occurs during the search process, the API will return an appropriate error message with an HTTP status code.
187188

188-
- **400**: If the request is malformed or missing required fields (e.g., no focus mode or query).
189+
- **400**: If the request is malformed or missing required fields (e.g., no `sources` or `query`).
189190
- **500**: If an internal server error occurs during the search.
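
Since a missing `sources` or `query` produces a 400, a client may want to check the body before sending it. The following is a minimal sketch of such a check, using the field names and allowed values listed in the request parameters in this file. The type and function names are hypothetical, and the server may apply rules that this sketch does not cover.

```typescript
// Hypothetical client-side types; field names follow the documented request body.
type ModelRef = { providerId: string; key: string };

interface SearchRequest {
  chatModel: ModelRef;
  embeddingModel: ModelRef;
  sources: string[];
  query: string;
  optimizationMode?: string;
  history?: [string, string][];
}

const KNOWN_SOURCES = ["web", "academic", "discussions"];
const KNOWN_MODES = ["speed", "balanced", "quality"];

// Returns a list of problems; an empty list means the body passed these checks.
function checkSearchRequest(body: Partial<SearchRequest>): string[] {
  const problems: string[] = [];

  if (!body.chatModel) problems.push("chatModel is required");
  if (!body.embeddingModel) problems.push("embeddingModel is required");
  if (typeof body.query !== "string" || body.query.trim() === "") {
    problems.push("query is required");
  }

  if (!Array.isArray(body.sources)) {
    problems.push("sources is required");
  } else {
    for (const source of body.sources) {
      if (KNOWN_SOURCES.indexOf(source) === -1) {
        problems.push("unknown source: " + source);
      }
    }
  }

  if (
    body.optimizationMode !== undefined &&
    KNOWN_MODES.indexOf(body.optimizationMode) === -1
  ) {
    problems.push("unknown optimizationMode: " + body.optimizationMode);
  }

  return problems;
}
```

A body that still uses the old `focusMode` field, or passes a value such as `webSearch` in `sources`, would be flagged here before the request reaches the server.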

docs/architecture/README.md

Lines changed: 35 additions & 8 deletions
@@ -1,11 +1,38 @@
1-
# Perplexica's Architecture
1+
# Perplexica Architecture
22

3-
Perplexica's architecture consists of the following key components:
3+
Perplexica is a Next.js application that combines an AI chat experience with search.
44

5-
1. **User Interface**: A web-based interface that allows users to interact with Perplexica for searching images, videos, and much more.
6-
2. **Agent/Chains**: These components predict Perplexica's next actions, understand user queries, and decide whether a web search is necessary.
7-
3. **SearXNG**: A metadata search engine used by Perplexica to search the web for sources.
8-
4. **LLMs (Large Language Models)**: Utilized by agents and chains for tasks like understanding content, writing responses, and citing sources. Examples include Claude, GPTs, etc.
9-
5. **Embedding Models**: To improve the accuracy of search results, embedding models re-rank the results using similarity search algorithms such as cosine similarity and dot product distance.
5+
For a high level flow, see [WORKING.md](WORKING.md). For deeper implementation details, see [CONTRIBUTING.md](../../CONTRIBUTING.md).
106

11-
For a more detailed explanation of how these components work together, see [WORKING.md](https://github.com/ItzCrazyKns/Perplexica/tree/master/docs/architecture/WORKING.md).
7+
## Key components
8+
9+
1. **User Interface**
10+
11+
- A web based UI that lets users chat, search, and view citations.
12+
13+
2. **API Routes**
14+
15+
- `POST /api/chat` powers the chat UI.
16+
- `POST /api/search` provides a programmatic search endpoint.
17+
- `GET /api/providers` lists available providers and model keys.
18+
19+
3. **Agents and Orchestration**
20+
21+
- The system classifies the question first.
22+
- It can run research and widgets in parallel.
23+
- It generates the final answer and includes citations.
24+
25+
4. **Search Backend**
26+
27+
- A meta search backend is used to fetch relevant web results when research is enabled.
28+
29+
5. **LLMs (Large Language Models)**
30+
31+
- Used for classification, writing answers, and producing citations.
32+
33+
6. **Embedding Models**
34+
35+
- Used for semantic search over user uploaded files.
36+
37+
7. **Storage**
38+
- Chats and messages are stored so conversations can be reloaded.
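
The orchestration described under "Agents and Orchestration" (classify first, then run research and widgets in parallel, then write the answer) could be sketched roughly as follows. Every name here is hypothetical and the steps are passed in as plain functions; this illustrates the control flow only and is not the code in `src/lib/agents/search`.

```typescript
// Hypothetical result of the classification step.
interface Classification {
  needsResearch: boolean;
}

// Hypothetical step functions, injected so the flow can be read in isolation.
interface PipelineSteps {
  classify: (query: string) => Promise<Classification>;
  research: (query: string) => Promise<string[]>;
  widgets: (query: string) => Promise<string[]>;
  write: (query: string, findings: string[]) => Promise<string>;
}

async function answerQuestion(
  query: string,
  steps: PipelineSteps,
): Promise<{ answer: string; widgets: string[] }> {
  // 1. Classify the question before doing any other work.
  const classification = await steps.classify(query);

  // 2. Run research (only when needed) and widgets concurrently.
  const [findings, widgetCards] = await Promise.all([
    classification.needsResearch
      ? steps.research(query)
      : Promise.resolve([] as string[]),
    steps.widgets(query),
  ]);

  // 3. Write the final answer from whatever research produced.
  const answer = await steps.write(query, findings);
  return { answer, widgets: widgetCards };
}
```

Using `Promise.all` means the slower of research and widgets determines how long step 2 takes, rather than the sum of both.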
