Commit 98b4fc7

DOCS-1276 - Typesense search proof of concept
1 parent 2ff524c commit 98b4fc7

4 files changed (+293 -19 lines changed)

docusaurus.config.js

Lines changed: 12 additions & 9 deletions
@@ -239,6 +239,7 @@ module.exports = {
       },
     ],
   ],
+  themes: ['docusaurus-theme-search-typesense'],
   themeConfig:
     ({
       docs: {
@@ -277,16 +278,18 @@ module.exports = {
         defaultMode: 'light',
         disableSwitch: false,
       },
-      algolia: {
-        appId: '2SJPGMLW1Q',
-        apiKey: 'fb2f4e1fb40f962900631121cb365549',
-        indexName: 'crawler_sumodocs',
-        contextualSearch: false,
-        insights: true,
-        insightsConfig: {
-          useCookie: true, // alt to useCookie: true,
+      typesense: {
+        typesenseCollectionName: 'sumo-docs_1764148676', // the collection name from the scraper output
+        typesenseServerConfig: {
+          nodes: [
+            {
+              host: 'localhost', // you'll change this for production
+              port: 8108,
+              protocol: 'http',
+            },
+          ],
+          apiKey: 'xyz', // use a search-only API key in production
         },
-        useCookie: true, // alt to insightsConfig: {useCookie: true,},
       },
       prism: {
         theme: lightCodeTheme,
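
The `apiKey: 'xyz'` comment in this config calls for a search-only key in production. As a rough sketch, one way to create such a key is Typesense's `POST /keys` endpoint with the `documents:search` action. The helper names `search_key_payload` and `create_search_key`, the description string, and the default base URL are illustrative assumptions and are not part of this commit.

```bash
# Build the JSON body for a key that can only search one collection.
# Hypothetical helper; the description text is arbitrary.
search_key_payload() {
  printf '{"description":"Search-only key for docs","actions":["documents:search"],"collections":["%s"]}\n' "$1"
}

# Send the payload to Typesense using an admin key.
# Arguments: admin key, collection name, optional base URL.
create_search_key() {
  curl -sS -X POST "${3:-http://localhost:8108}/keys" \
    -H "X-TYPESENSE-API-KEY: $1" \
    -H "Content-Type: application/json" \
    -d "$(search_key_payload "$2")"
}

# Example (not run here):
#   create_search_key xyz sumo-docs_1764148676
```

The key value in the response would replace `'xyz'` in `typesenseServerConfig`, so the admin key is never shipped to the browser.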

package.json

Lines changed: 1 addition & 0 deletions
@@ -89,6 +89,7 @@
     "cssnano": "6.1.2",
     "csso": "5.0.5",
     "docusaurus-plugin-sass": "^0.2.1",
+    "docusaurus-theme-search-typesense": "^0.26.0",
     "docusaurus2-dotenv": "^1.4.0",
     "electron-to-chromium": "1.4.755",
     "eventsource-parser": "3.0.6",

typesense-local-setup.md

Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
<!-- Required steps to access test env -->

# Local Typesense Search Setup

## Prerequisites

- Homebrew
- Docker
- Yarn

## Setup Steps

### 1. Install Typesense

```bash
brew install typesense
brew services start typesense
```

### 2. Verify it's running

```bash
curl http://localhost:8108/health
```

The response should be `{"ok":true}`.
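
Typesense may need a moment after `brew services start` before the health endpoint answers. A minimal sketch of a retry loop around the same `curl` check follows; the function name `wait_for_typesense` and its attempt and delay defaults are assumptions made for illustration.

```bash
# Poll the health endpoint until it reports {"ok":true} or attempts run out.
# Arguments: URL, max attempts, delay in seconds (all optional).
wait_for_typesense() {
  wt_url="${1:-http://localhost:8108/health}"
  wt_attempts="${2:-30}"
  wt_delay="${3:-1}"
  wt_i=1
  while [ "$wt_i" -le "$wt_attempts" ]; do
    # A failed request leaves the body empty so the loop simply retries.
    wt_body="$(curl -fsS --max-time 2 "$wt_url" 2>/dev/null)" || wt_body=""
    case "$wt_body" in
      *'"ok":true'*)
        echo "Typesense is healthy"
        return 0
        ;;
    esac
    sleep "$wt_delay"
    wt_i=$((wt_i + 1))
  done
  echo "Typesense did not report healthy after $wt_attempts attempts" >&2
  return 1
}

# Example (not run here):
#   wait_for_typesense && echo "ready for the scraper"
```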

### 3. Install Docker (needed for the scraper)

```bash
brew install --cask docker
```

Then open Docker from Applications to start it.

### 4. Build and serve the docs locally

```bash
yarn build
yarn serve
```

### 5. In a new terminal, create the scraper config

```bash
cat > /tmp/docsearch-config.json << 'EOF'
{
  "index_name": "sumo-docs",
  "start_urls": ["http://host.docker.internal:3000/"],
  "selectors": {
    "lvl0": "article h1",
    "lvl1": "article h2",
    "lvl2": "article h3",
    "lvl3": "article h4",
    "text": "article p, article li"
  }
}
EOF
```

### 6. Run the scraper

```bash
cd /tmp
docker run -it --rm \
  -e TYPESENSE_API_KEY=xyz \
  -e TYPESENSE_HOST=host.docker.internal \
  -e TYPESENSE_PORT=8108 \
  -e TYPESENSE_PROTOCOL=http \
  -e CONFIG="$(cat docsearch-config.json)" \
  typesense/docsearch-scraper
```

The scrape takes a while (roughly 140k records). When it finishes, note the collection name in the scraper output (for example `sumo-docs_1764148676`). If it differs from `typesenseCollectionName` in `docusaurus.config.js`, update the config and rebuild.
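
The timestamped collection name that `docusaurus.config.js` expects can also be read back from Typesense rather than copied from the scraper log. The sketch below assumes the scraper names collections `<index_name>_<timestamp>` (as in `sumo-docs_1764148676`) and that `GET /collections` returns JSON containing each collection's `name`; the helper `latest_collection` is hypothetical.

```bash
# Print the newest "<prefix>_<timestamp>" collection name found in a
# GET /collections response read from stdin.
# Lexical sort is enough while all timestamps have the same number of digits.
latest_collection() {
  lc_prefix="$1"
  grep -oE "\"name\"[[:space:]]*:[[:space:]]*\"${lc_prefix}_[0-9]+\"" \
    | sed -E 's/.*"([^"]+)"$/\1/' \
    | sort \
    | tail -n 1
}

# Example (not run here):
#   curl -sS -H "X-TYPESENSE-API-KEY: xyz" http://localhost:8108/collections \
#     | latest_collection sumo-docs
```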

### 7. Test the search

Open `http://localhost:3000` and try the search box.
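
If the search box returns nothing, it can help to query Typesense directly and rule out the front end. This is a sketch under assumptions: that `sumo-docs` resolves on the server (as an alias or collection name; otherwise pass the timestamped name), and that the index has the usual docsearch fields `hierarchy.lvl0`, `hierarchy.lvl1`, and `content`. The function name `search_docs` is hypothetical.

```bash
# Run a small search against the local Typesense server.
# Arguments: query, collection or alias (optional), API key (optional).
search_docs() {
  sd_query="$1"
  sd_collection="${2:-sumo-docs}"
  sd_key="${3:-xyz}"
  # -G turns the --data-urlencode pairs into a URL-encoded query string.
  curl -sS -G "http://localhost:8108/collections/${sd_collection}/documents/search" \
    -H "X-TYPESENSE-API-KEY: ${sd_key}" \
    --data-urlencode "q=${sd_query}" \
    --data-urlencode "query_by=hierarchy.lvl0,hierarchy.lvl1,content" \
    --data-urlencode "per_page=3"
}

# Example (not run here):
#   search_docs "collector" sumo-docs_1764148676
```

A response with hits means the index is fine and the problem is likely in the theme configuration.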

---

## Troubleshooting

### Docker install error: "already a Binary at '/usr/local/bin/hub-tool'"

This is a leftover from a previous install. You can ignore it or force a reinstall:

```bash
brew reinstall --cask docker --force
```

### Docker not in Applications folder

Try opening it directly (adjust the version directory to match the one installed on your machine):

```bash
open /opt/homebrew/Caskroom/docker-desktop/4.52.0,210994/Docker.app
```

### Scraper returns "Connection refused"

Your local docs server isn't running. Make sure `yarn serve` is running in another terminal before starting the scraper.

### Scraper returns 403 Forbidden

This happens when scraping production (sumologic.com/help) because of bot protection. Always scrape your local build instead.

### Scraper shows "nbHits 0" or "Ignored: from start url"

The selectors aren't matching the HTML structure. Make sure you're using the config above, which is tuned for our Docusaurus setup.

### Typesense not running

Check its status:

```bash
brew services list | grep typesense
```

If it is stopped, start it:

```bash
brew services start typesense
```

### Find your Typesense API key

```bash
cat /opt/homebrew/etc/typesense/typesense.ini
```
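
To pull just the key out of that file, for example to pass it to the scraper as `TYPESENSE_API_KEY`, a small helper can be sketched as follows. It assumes the file contains a line of the form `api-key = <value>`; check your own `typesense.ini` if the output is empty. The function name `typesense_api_key` is hypothetical.

```bash
# Print the value of the api-key setting from a Typesense ini file.
# Argument: path to the ini file (defaults to the Homebrew location above).
typesense_api_key() {
  awk '
    /^[ \t]*api-key[ \t]*=/ {
      sub(/^[^=]*=[ \t]*/, "")   # drop everything up to and including "="
      sub(/[ \t\r]+$/, "")       # trim trailing whitespace
      print
      exit
    }
  ' "${1:-/opt/homebrew/etc/typesense/typesense.ini}"
}

# Example (not run here):
#   export TYPESENSE_API_KEY="$(typesense_api_key)"
```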
