Commit 311bb9a

Updated doc for render type
1 parent 04008cb commit 311bb9a

5 files changed

+262
-89
lines changed

docs/media-urls.md

Lines changed: 0 additions & 83 deletions
Original file line number | Diff line number | Diff line change
@@ -1,67 +1,5 @@
11
# Asset URL Handling
22

3-
This document explains how SmooSense handles asset URLs (images, videos, audio) in data tables.
4-
5-
## Overview
6-
7-
SmooSense supports multiple URL formats for media assets:
8-
- **Absolute URLs**: `http://example.com/image.jpg`, `https://cdn.example.com/video.mp4`
9-
- **Cloud Storage URLs**: `s3://bucket/file.wav`
10-
- **Relative Paths**: `./images/photo.jpg`, `./audio/sound.wav`
11-
- **Absolute Paths**: `/path/to/file.mp3`, `~/home/user/image.png`
12-
13-
## URL Processing Pipeline
14-
15-
### 1. Load data by running a query
16-
`executeQueryAsListOfDict()` @ [useRowData.ts](../smoosense-gui/src/lib/hooks/useRowData.ts)
17-
18-
### 2. Process all cells and resolve media URLs
19-
`fetchProcessedRowDataFunction()` @ [processedRowDataSlice.ts](../smoosense-gui/src/lib/features/processedRowData/processedRowDataSlice.ts)
20-
21-
For **every cell in every row**:
22-
- Check if value needs resolution using `needToResolveMediaUrl()` @ [mediaUrlUtils.ts](../smoosense-gui/src/lib/utils/mediaUrlUtils.ts)
23-
- Resolve it using `resolveAssetUrl()` @ [mediaUrlUtils.ts](../smoosense-gui/src/lib/utils/mediaUrlUtils.ts)
24-
25-
26-
#### Relative Path (local tablePath)
27-
- **Input Example**: `./images/photo.jpg` with tablePath `/data/file.csv`
28-
- **Resolution**: Resolve relative to local `tablePath` directory, proxy through backend
29-
- **Final Output**: `{baseUrl}/api/get-file?path=/data/images/photo.jpg&redirect=false`
30-
31-
#### Relative Path (S3 tablePath)
32-
- **Input Example**: `./images/photo.jpg` with tablePath `s3://bucket/data/file.csv`
33-
- **Resolution**: Resolve relative to S3 `tablePath` directory, proxy through S3 proxy
34-
- **Final Output**: `{baseUrl}/api/s3-proxy?url=s3%3A%2F%2Fbucket%2Fdata%2Fimages%2Fphoto.jpg`
35-
36-
#### Relative Path (HTTP/HTTPS tablePath)
37-
- **Input Example**: `./images/photo.jpg` with tablePath `https://example.com/data/file.csv`
38-
- **Resolution**: Resolve relative to HTTP/HTTPS `tablePath` directory
39-
- **Final Output**: `https://example.com/data/images/photo.jpg`
40-
41-
#### Absolute Path
42-
- **Input Example**: `/home/user/image.png`
43-
- **Resolution**: Proxy through backend API
44-
- **Final Output**: `{baseUrl}/api/get-file?path=/home/user/image.png&redirect=false`
45-
46-
#### Home Path
47-
- **Input Example**: `~/Documents/file.wav`
48-
- **Resolution**: Proxy through backend API
49-
- **Final Output**: `{baseUrl}/api/get-file?path=~/Documents/file.wav&redirect=false`
50-
51-
#### S3 URL (with media extension)
52-
- **Input Example**: `s3://bucket/file.wav`
53-
- **Resolution**: Proxy through backend S3 proxy (only for media files: images, videos, audio)
54-
- **Final Output**: `{baseUrl}/api/s3-proxy?url=s3%3A%2F%2Fbucket%2Ffile.wav`
55-
56-
#### HTTP/HTTPS URL
57-
- **Input Example**: `https://cdn.example.com/image.jpg`
58-
- **Resolution**: No modification
59-
- **Final Output**: `https://cdn.example.com/image.jpg`
60-
61-
### 3. Return processed data
62-
[useProcessedRowData.ts](../smoosense-gui/src/lib/hooks/useProcessedRowData.ts)
63-
- Returns the processed data with all media URLs resolved
64-
653
## Query Tab Processing
664

675
The Query tab also processes media URLs in query results:
@@ -76,24 +14,3 @@ The Query tab also processes media URLs in query results:
7614
- For each cell, check `needToResolveMediaUrl(value)`
7715
- If true, call `resolveAssetUrl(value, tablePath, baseUrl)`
7816
5. Render in `BasicAGTable` with auto-detected cell renderers
79-
80-
**Media Rendering**:
81-
- Images displayed in `ImageCellRenderer`
82-
- Videos displayed in `VideoCellRenderer`
83-
- Audio displayed in `AudioCellRenderer`
84-
- Cell renderers auto-detected via `inferRenderTypeFromData()`
85-
86-
## Backend Integration
87-
88-
### File Serving Endpoint
89-
`/api/get-file`
90-
`get_file()` @ [fs.py](../smoosense-py/smoosense/handlers/fs.py)
91-
- Serves local files (relative paths, absolute paths, home paths)
92-
- Accepts `path` parameter with file location
93-
94-
### S3 Proxy Endpoint
95-
`/api/s3-proxy`
96-
`s3_proxy()` @ [fs.py](../smoosense-py/smoosense/handlers/fs.py)
97-
- Proxies S3 URLs through backend
98-
- Accepts `url` parameter with full S3 URL (e.g., `s3://bucket/path/to/file`)
99-

landing/app/docs/docs-config.ts

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -53,6 +53,10 @@ export const docsConfig: DocSection[] = [
5353
label: 'Authentication',
5454
slug: 'authentication',
5555
},
56+
{
57+
label: 'RenderType',
58+
slug: 'render-type',
59+
},
5660
],
5761
},
5862
]

landing/components/markdown-components.tsx

Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -297,6 +297,67 @@ export function createMarkdownComponents(headerMap?: Map<string, string>) {
297297
return <Tabs content={code} />
298298
}
299299

300+
if (language === 'codelink') {
301+
// Parse codelink format: key-value pairs separated by blank lines
302+
// Each block can have: anchor, path, line
303+
const baseUrl = 'https://github.com/SmooSenseAI/smoosense/blob/main'
304+
const blocks = code.trim().split(/\n\s*\n/).filter((block: string) => block.trim())
305+
306+
const links = blocks.map((block: string) => {
307+
const lines = block.trim().split('\n')
308+
const props: Record<string, string> = {}
309+
310+
for (const line of lines) {
311+
const match = line.match(/^(\w+):\s*(.+)$/)
312+
if (match) {
313+
props[match[1]] = match[2].trim()
314+
} else if (line.trim() && !props.path) {
315+
// Fallback: treat as simple path (backwards compatibility)
316+
props.path = line.trim()
317+
}
318+
}
319+
320+
return props
321+
})
322+
323+
return (
324+
<Box mb={4}>
325+
{links.map((props: Record<string, string>, index: number) => {
326+
const path = props.path || ''
327+
const fileName = path.split('/').pop() || path
328+
const anchor = props.anchor
329+
const line = props.line
330+
331+
let githubUrl = `${baseUrl}/${path}`
332+
if (line) {
333+
githubUrl += `#L${line}`
334+
}
335+
336+
const displayText = anchor ? `${anchor} @ ${fileName}` : fileName
337+
338+
return (
339+
<Box key={index} mb={1}>
340+
<Link
341+
href={githubUrl}
342+
target="_blank"
343+
color="primary.300"
344+
fontFamily="mono"
345+
fontSize="sm"
346+
_hover={{ textDecoration: 'underline' }}
347+
display="inline-flex"
348+
alignItems="center"
349+
gap={1}
350+
>
351+
<Text as="span" fontSize="xs">🔗</Text>
352+
{displayText}
353+
</Link>
354+
</Box>
355+
)
356+
})}
357+
</Box>
358+
)
359+
}
360+
300361
return <CodeBlock language={language}>{code}</CodeBlock>
301362
},
302363
code: ({ className, children }: any) => {
Lines changed: 196 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,196 @@
1+
# RenderType
2+
3+
SmooSense automatically infers a RenderType for each column and renders its values in that format. The inference examines column metadata and cell content, so tables show rich visualizations without any configuration.
4+
5+
## How RenderType Works
6+
```codelink
7+
anchor: inferColumnRenderType()
8+
path: smoosense-gui/src/lib/hooks/useRenderType.ts
9+
line: 15
10+
```
11+
When you load a table, SmooSense fetches `ColumnMeta` for each column and infers the most suitable RenderType. This inference happens automatically based on:
12+
13+
1. **ColumnMeta** - Type shortcuts (`isBoolean`, `isNumeric`, `isDatetime`, `isPrimitive`, `isNumericArray`) and DuckDB type
14+
2. **Content patterns** - URLs are identified and classified by file extension or domain
15+
3. **Column naming conventions** - Special column names like `bbox` or `image_mask` trigger specialized renderers
16+
17+
## Supported RenderTypes
18+
```codelink
19+
anchor: RenderType
20+
path: smoosense-gui/src/lib/utils/agGridCellRenderers.tsx
21+
line: 26
22+
```
23+
24+
### Basic Types
25+
26+
| Type | Description | Criteria |
27+
|------|-------------|----------|
28+
| **Text** | Plain text display | Default for strings that don't match other patterns |
29+
| **Number** | Numeric values | All values are numbers or numeric strings |
30+
| **Boolean** | True/false display | All values are boolean |
31+
| **Date** | Date formatting | Matches common date patterns (YYYY-MM-DD, ISO dates, etc.) |
32+
| **Null** | Empty state | Column contains only null/undefined values |
33+
34+
### Media Types
35+
36+
SmooSense renders media directly in table cells, providing instant visual context.
37+
38+
| Type | Description | Criteria |
39+
|------|-----------------------------------|----------|
40+
| **ImageUrl** | Inline image preview | URLs ending in `.jpg`, `.png`, `.gif`, `.webp`, etc. |
41+
| **VideoUrl** | Video player | URLs ending in `.mp4`, `.webm`, `.mov`, or YouTube/Vimeo links |
42+
| **AudioUrl** | Audio player with Mel-spectrogram | URLs ending in `.mp3`, `.wav`, `.ogg`, `.flac`, etc. |
43+
| **PdfUrl** | PDF preview | URLs ending in `.pdf` |
44+
45+
Supported image formats: `jpg`, `jpeg`, `png`, `gif`, `bmp`, `svg`, `webp`, `tiff`, `tif`, `ico`, `heic`, `heif`
46+
47+
Supported video formats: `mp4`, `avi`, `mov`, `wmv`, `flv`, `webm`, `mkv`, `m4v`, `3gp`, `ogv`
48+
49+
Supported audio formats: `mp3`, `wav`, `ogg`, `flac`, `m4a`, `aac`, `wma`, `opus`
50+
51+
### List Types
52+
53+
Arrays of media URLs are rendered as scrollable galleries within cells.
54+
55+
| Type | Description | Criteria |
56+
|------|-------------|----------|
57+
| **ImageList** | Multiple image thumbnails | Array where all elements are image URLs |
58+
| **VideoList** | Multiple video previews | Array where all elements are video URLs |
59+
| **AudioList** | Multiple audio players | Array where all elements are audio URLs |
60+
61+
### Structured Data Types
62+
63+
| Type | Description | Criteria |
64+
|------|-------------|----------|
65+
| **Json** | Interactive JSON viewer | Objects or arrays (excluding media lists) |
66+
| **HyperLink** | Clickable link | URLs that don't match specific media types |
67+
| **IFrame** | Embedded content | URLs prefixed with `iframe+http://` or `iframe+https://` |
68+
69+
### Embedding
70+
Arrays of floats or doubles that all have the same length are inferred as embeddings.
71+
Clicking an embedding cell triggers a similarity search.
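
The same-length rule can be illustrated with a minimal sketch. The function name `looksLikeEmbeddingColumn` is hypothetical, and skipping null cells is an assumption; the actual inference likely relies on `ColumnMeta` (for example `isNumericArray`) rather than scanning values like this.

```typescript
// Hypothetical sketch, not SmooSense's actual code.
// Returns true when every non-null cell is a non-empty array of finite
// numbers and all arrays share one length.
function looksLikeEmbeddingColumn(values: unknown[]): boolean {
  // Assumption: null/undefined cells are ignored rather than disqualifying.
  const cells = values.filter((v) => v !== null && v !== undefined)
  if (cells.length === 0) return false

  let expectedLength = -1
  for (const cell of cells) {
    if (!Array.isArray(cell) || cell.length === 0) return false
    if (!cell.every((x) => typeof x === 'number' && Number.isFinite(x))) return false
    if (expectedLength === -1) {
      expectedLength = cell.length // first array fixes the dimension
    } else if (cell.length !== expectedLength) {
      return false // mixed lengths: not an embedding column
    }
  }
  return true
}
```

A column such as `[[0.1, 0.2], [0.3, 0.4]]` passes this check, while `[[0.1], [0.2, 0.3]]` does not because the lengths differ.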
72+
73+
### Specialized Types
74+
75+
These types handle domain-specific data formats.
76+
77+
| Type | Description | Criteria |
78+
|------|-------------|----------|
79+
| **Bbox** | Bounding box overlay | Column name contains `bbox` and values are 4-element number arrays |
80+
| **ImageMask** | Segmentation mask overlay | Column name contains `image_mask` and values are image URLs |
81+
| **WordScores** | Token-level score visualization | Column name contains `word_score` |
82+
| **HuggingFaceMedia** | Hugging Face dataset media | Hugging Face-specific media format |
83+
84+
## Inferring Logic
85+
86+
### URL Classification
87+
```codelink
88+
anchor: inferUrlType
89+
path: smoosense-gui/src/lib/utils/renderTypeUtils.ts
90+
line: 36
91+
```
92+
93+
When a string is identified as a URL, SmooSense classifies it based on:
94+
95+
1. **File extension** - The extension in the URL path determines the media type
96+
2. **Domain patterns** - `youtube.com`, `youtu.be`, and `vimeo.com` are treated as video URLs
97+
3. **IFrame prefix** - URLs prefixed with `iframe+http://` or `iframe+https://` are rendered in iframes
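
The three rules above can be sketched roughly as follows. This is an illustration, not the real `inferUrlType()`: the names `classifyUrl` and `UrlKind` are hypothetical, the extension lists are copied from the "Supported formats" lines in this document, and the order of the extension and domain checks is an assumption.

```typescript
// Hypothetical sketch; classifyUrl and UrlKind are illustrative names.
type UrlKind = 'IFrame' | 'ImageUrl' | 'VideoUrl' | 'AudioUrl' | 'PdfUrl' | 'HyperLink'

const IMAGE_EXTS = new Set(['jpg', 'jpeg', 'png', 'gif', 'bmp', 'svg', 'webp', 'tiff', 'tif', 'ico', 'heic', 'heif'])
const VIDEO_EXTS = new Set(['mp4', 'avi', 'mov', 'wmv', 'flv', 'webm', 'mkv', 'm4v', '3gp', 'ogv'])
const AUDIO_EXTS = new Set(['mp3', 'wav', 'ogg', 'flac', 'm4a', 'aac', 'wma', 'opus'])
const VIDEO_HOSTS = ['youtube.com', 'youtu.be', 'vimeo.com']

function classifyUrl(str: string): UrlKind {
  // The iframe+ prefix is checked first, as in the source.
  if (str.startsWith('iframe+http://') || str.startsWith('iframe+https://')) {
    return 'IFrame'
  }

  // Split the URL into host and path without relying on the URL class,
  // so that non-HTTP schemes such as s3:// are handled the same way.
  const hostMatch = str.match(/^[a-z][a-z0-9+.-]*:\/\/([^/?#]+)/i)
  const host = hostMatch ? hostMatch[1].toLowerCase() : ''
  const path = str.split(/[?#]/)[0].replace(/^[a-z][a-z0-9+.-]*:\/\/[^/]*/i, '')

  // Extension = text after the last dot of the last path segment.
  const lastSegment = path.split('/').pop() ?? ''
  const dot = lastSegment.lastIndexOf('.')
  const ext = dot >= 0 ? lastSegment.slice(dot + 1).toLowerCase() : ''

  if (IMAGE_EXTS.has(ext)) return 'ImageUrl'
  if (VIDEO_EXTS.has(ext)) return 'VideoUrl'
  if (AUDIO_EXTS.has(ext)) return 'AudioUrl'
  if (ext === 'pdf') return 'PdfUrl'

  // Known video hosts count as video even without a file extension.
  if (VIDEO_HOSTS.some((h) => host === h || host.endsWith('.' + h))) return 'VideoUrl'

  return 'HyperLink'
}
```

Under these assumptions, `https://www.youtube.com/watch?v=abc` is classified as `VideoUrl` by its domain, and `https://example.com/docs` falls through to `HyperLink`.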
98+
99+
### Date Inference
100+
101+
SmooSense recognizes these date formats:
102+
- `YYYY-MM-DD` (e.g., `2024-01-15`)
103+
- ISO 8601 with time (e.g., `2024-01-15T10:30:00`)
104+
- `MM/DD/YYYY` or `M/D/YYYY` (e.g., `01/15/2024`)
105+
- `MM-DD-YYYY` or `M-D-YYYY` (e.g., `01-15-2024`)
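
As a rough illustration, the listed formats could be matched with regular expressions like the ones below. The names `DATE_PATTERNS` and `looksLikeDate` are hypothetical, and the patterns only check the shape of the string; they are an assumption about how such matching might work, not the patterns SmooSense uses.

```typescript
// Hypothetical sketch: shape-only matching, no calendar validation
// (for example, "13/45/2024" would still match).
const DATE_PATTERNS: RegExp[] = [
  /^\d{4}-\d{2}-\d{2}$/, // YYYY-MM-DD
  /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}/, // ISO 8601 with time
  /^\d{1,2}\/\d{1,2}\/\d{4}$/, // MM/DD/YYYY or M/D/YYYY
  /^\d{1,2}-\d{1,2}-\d{4}$/, // MM-DD-YYYY or M-D-YYYY
]

function looksLikeDate(value: string): boolean {
  const trimmed = value.trim()
  return DATE_PATTERNS.some((re) => re.test(trimmed))
}
```

With these patterns, `2024-01-15` and `01/15/2024` match, while `2024/01/15` does not because that format is not in the list above.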
106+
107+
### Column Name Conventions
108+
109+
Certain column names trigger specialized rendering:
110+
111+
- **`*bbox*`** - Enables bounding box visualization when values are `[x, y, width, height]` arrays
112+
- **`*image_mask*`** - Enables mask overlay on an associated `image_url` column
113+
- **`*word_score*`** - Enables word-level score visualization
114+
115+
116+
## Media URL Resolution
117+
118+
SmooSense automatically resolves media URLs to make them viewable in the browser. This allows you to use relative paths, local file paths, and S3 URLs directly in your data.
119+
120+
```codelink
121+
anchor: needToResolveMediaUrl()
122+
path: smoosense-gui/src/lib/utils/mediaUrlUtils.ts
123+
line: 12
124+
125+
anchor: resolveAssetUrl()
126+
path: smoosense-gui/src/lib/utils/mediaUrlUtils.ts
127+
line: 117
128+
```
129+
130+
### Supported URL Formats
131+
132+
| Format | Example | Resolution |
133+
|--------|---------|------------|
134+
| **Relative Path** | `./images/photo.jpg` | Resolved relative to the table file location |
135+
| **Absolute Path** | `/home/user/image.png` | Proxied through backend API |
136+
| **Home Path** | `~/Documents/file.wav` | Proxied through backend API |
137+
| **S3 URL** | `s3://bucket/file.wav` | Proxied through backend S3 proxy |
138+
| **HTTP/HTTPS URL** | `https://cdn.example.com/image.jpg` | Used directly (no modification) |
139+
140+
Resolution Examples:
141+
142+
- Relative path with local table:
143+
- Input: `./images/photo.jpg` with tablePath `/data/file.csv`
144+
- Output: `/api/get-file?path=/data/images/photo.jpg`
145+
146+
- Relative path with S3 table:
147+
- Input: `./images/photo.jpg` with tablePath `s3://bucket/data/file.csv`
148+
- Output: `/api/s3-proxy?url=s3://bucket/data/images/photo.jpg`
149+
150+
- Relative path with HTTP table:
151+
- Input: `./images/photo.jpg` with tablePath `https://example.com/data/file.csv`
152+
- Output: `https://example.com/data/images/photo.jpg`
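
The table and examples above can be reproduced with a small sketch. `sketchResolveAssetUrl` and `dirOf` are hypothetical names, and the output format follows the examples in this section; the real `resolveAssetUrl(value, tablePath, baseUrl)` also takes a `baseUrl`, and the removed `docs/media-urls.md` showed the `url` value percent-encoded, so treat the exact strings as illustrative.

```typescript
// Hypothetical sketch, not the actual resolveAssetUrl() implementation.

// Directory part of a path or URL: everything before the last slash.
function dirOf(tablePath: string): string {
  const i = tablePath.lastIndexOf('/')
  return i >= 0 ? tablePath.slice(0, i) : ''
}

function sketchResolveAssetUrl(value: string, tablePath: string): string {
  // HTTP/HTTPS URLs are used directly.
  if (/^https?:\/\//.test(value)) return value

  // S3 URLs go through the S3 proxy endpoint.
  if (value.startsWith('s3://')) return `/api/s3-proxy?url=${value}`

  // Absolute and home paths go through the file-serving endpoint.
  if (value.startsWith('/') || value.startsWith('~/')) {
    return `/api/get-file?path=${value}`
  }

  // Relative paths are resolved against the directory of the table file,
  // then routed according to where the table itself lives.
  if (value.startsWith('./')) {
    const resolved = `${dirOf(tablePath)}/${value.slice(2)}`
    if (tablePath.startsWith('s3://')) return `/api/s3-proxy?url=${resolved}`
    if (/^https?:\/\//.test(tablePath)) return resolved
    return `/api/get-file?path=${resolved}`
  }

  // Anything else is left untouched.
  return value
}
```

A production version would probably wrap the query value in `encodeURIComponent()` so that characters such as `&` or `#` in a file name do not break the query string.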
153+
154+
### Serving Media Assets in the Backend
155+
156+
#### File Serving Endpoint
157+
158+
URL: `/api/get-file`
159+
160+
Serves local files (relative paths, absolute paths, home paths). Accepts a `path` parameter containing the file location.
161+
162+
```codelink
163+
anchor: get_file()
164+
path: smoosense-py/smoosense/handlers/fs.py
165+
line: 53
166+
```
167+
168+
#### S3 Proxy Endpoint
169+
170+
URL: `/api/s3-proxy`
171+
172+
Proxies S3 URLs by redirecting to a pre-signed URL that carries temporary, one-time credentials. Accepts a `url` parameter containing the full S3 URL (e.g., `s3://bucket/path/to/file`).
173+
174+
```codelink
175+
anchor: proxy()
176+
path: smoosense-py/smoosense/handlers/s3.py
177+
line: 18
178+
```
179+
180+
### When URLs Are Resolved
181+
182+
A URL is resolved only when all of these conditions are met:
183+
1. The value is a string
184+
2. It starts with `./`, `/`, `~/`, or `s3://`
185+
3. It has a media file extension (image, video, audio, or PDF)
186+
187+
HTTP/HTTPS URLs are used directly without modification.
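
The three conditions can be expressed as a single predicate. The sketch below is an assumption about the shape of such a check, not the body of `needToResolveMediaUrl()`; the name `sketchNeedToResolve` is hypothetical and `MEDIA_EXTS` is only an illustrative subset of the supported extensions.

```typescript
// Hypothetical sketch; MEDIA_EXTS is an illustrative subset only.
const MEDIA_EXTS = new Set(['jpg', 'jpeg', 'png', 'gif', 'webp', 'mp4', 'webm', 'mov', 'mp3', 'wav', 'ogg', 'flac', 'pdf'])
const RESOLVABLE_PREFIXES = ['./', '/', '~/', 's3://']

function sketchNeedToResolve(value: unknown): boolean {
  // 1. The value must be a string.
  if (typeof value !== 'string') return false
  const s: string = value

  // 2. It must start with one of the resolvable prefixes.
  if (!RESOLVABLE_PREFIXES.some((prefix) => s.startsWith(prefix))) return false

  // 3. It must end in a known media extension.
  const match = s.toLowerCase().match(/\.([a-z0-9]+)$/)
  return match !== null && MEDIA_EXTS.has(match[1])
}
```

Here `./images/photo.jpg` and `s3://bucket/file.wav` qualify, while `https://cdn.example.com/image.jpg` (wrong prefix) and `/data/notes.txt` (not a media extension) do not.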
188+
189+
190+
191+
## Performance Considerations
192+
193+
- Media content is loaded lazily as cells scroll into view
194+
- Raw embedding values are not displayed, since the numbers are not meaningful to a human reader
195+
- Large JSON objects are collapsed by default with expandable views
196+
- List types show limited previews with "show more" functionality

smoosense-gui/src/lib/utils/renderTypeUtils.ts

Lines changed: 1 addition & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -34,7 +34,7 @@ export function isVisualType(renderType: RenderType): boolean {
3434

3535
// Helper functions for string analysis
3636
function inferUrlType(str: string): RenderType {
37-
// Check for iframe+ prefix first
37+
// Check for iframe+http(s):// prefix first
3838
if (str.startsWith('iframe+http://') || str.startsWith('iframe+https://')) {
3939
return RenderType.IFrame
4040
}
@@ -66,11 +66,6 @@ function inferUrlType(str: string): RenderType {
6666
return RenderType.VideoUrl
6767
}
6868

69-
// Check for iframe-suitable URLs (e.g., embedded content)
70-
if (/embed|iframe/.test(str)) {
71-
return RenderType.IFrame
72-
}
73-
7469
// Default to hyperlink for other URLs
7570
return RenderType.HyperLink
7671
}
