Commit 6d5025e

Merge pull request #14 from rdhyee/issue-13-parquet-duckdb
integrating Raymond's experiments so far with quarto
2 parents c85b674 + e75e42c commit 6d5025e

4 files changed: +159 -0 lines changed
.github/workflows/hello-world.yml

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+name: Hello World Workflow
+
+on:
+  workflow_dispatch:
+  push:
+    branches:
+      - '**'
+  pull_request:
+    types: [opened, synchronize, reopened]
+
+jobs:
+  say-hello:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v3
+      - name: Say Hello
+        run: |
+          echo "Hello, World!"
+          echo "Running on branch: ${{ github.ref_name }}"
+          echo "Triggered by: ${{ github.event_name }}"

_quarto.yml

Lines changed: 7 additions & 0 deletions
@@ -38,6 +38,13 @@ website:
       - icon: github
         text: Github
         href: "https://github.com/isamplesorg/"
+      - section: "Tutorials"
+        contents:
+          - text: "iSamples Tutorials Overview"
+            href: tutorials/index.qmd
+          - text: "iSamples Parquet Tutorial"
+            href: tutorials/parquet.qmd
+
 
 
 # configure for correct source repository

tutorials/index.qmd

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+---
+title: "Tutorials: Overview"
+---
+
+Here's where we park our various tutorials!
+
+Get the OpenAPI spec.
+
+```{ojs}
+//| echo: true
+
+// Get the OpenAPI specification and display detailed endpoint information
+viewof apiEndpointDetails = {
+  // Show loading indicator
+  const loadingElement = html`<div>Loading API endpoints...</div>`;
+  document.body.appendChild(loadingElement);
+
+  try {
+    const OPENAPI_URL = 'https://central.isample.xyz/isamples_central/openapi.json';
+
+    // Fetch the OpenAPI spec
+    const response = await fetch(OPENAPI_URL);
+    if (!response.ok) throw new Error(`Failed to fetch API spec: ${response.status}`);
+
+    const apiSpec = await response.json();
+
+    // Extract detailed information about each endpoint
+    const endpointDetails = [];
+
+    for (const [path, pathMethods] of Object.entries(apiSpec.paths)) {
+      for (const [method, details] of Object.entries(pathMethods)) {
+        endpointDetails.push({
+          endpoint: path,
+          method: method.toUpperCase(),
+          summary: details.summary || '',
+          operationId: details.operationId || '',
+          tags: (details.tags || []).join(', '),
+          parameters: (details.parameters || [])
+            .map(p => `${p.name} (${p.required ? 'required' : 'optional'})`)
+            .join(', ')
+        });
+      }
+    }
+
+    // Create a table with the detailed endpoint information
+    return Inputs.table(
+      endpointDetails,
+      {
+        label: "iSamples API Endpoints Details",
+        width: {
+          endpoint: 150,
+          method: 80,
+          summary: 200,
+          operationId: 200,
+          tags: 100,
+          parameters: 300
+        }
+      }
+    );
+  } catch (error) {
+    return html`<div style="color: red">Error fetching API endpoints: ${error.message}</div>`;
+  } finally {
+    // Remove loading indicator
+    loadingElement.remove();
+  }
+}
+```
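
Review note on tutorials/index.qmd: the endpoint-flattening loop can be exercised outside the browser. Below is a minimal sketch in plain JavaScript, assuming a hand-written sample spec. The helper name `flattenEndpoints`, the `sampleSpec` object, and its paths are hypothetical and introduced here for illustration only; they are not taken from the commit or from the real iSamples API.

```javascript
// Hypothetical helper: flattens an OpenAPI `paths` object into one row per
// (path, method) pair, mirroring the nested loop in tutorials/index.qmd.
function flattenEndpoints(apiSpec) {
  const rows = [];
  for (const [path, pathMethods] of Object.entries(apiSpec.paths || {})) {
    for (const [method, details] of Object.entries(pathMethods)) {
      rows.push({
        endpoint: path,
        method: method.toUpperCase(),
        summary: details.summary || '',
        operationId: details.operationId || '',
        tags: (details.tags || []).join(', '),
        parameters: (details.parameters || [])
          .map(p => `${p.name} (${p.required ? 'required' : 'optional'})`)
          .join(', ')
      });
    }
  }
  return rows;
}

// Made-up sample spec (assumption: not the real iSamples API description).
const sampleSpec = {
  paths: {
    '/thing/{identifier}': {
      get: {
        summary: 'Get one record',
        operationId: 'get_thing',
        tags: ['things'],
        parameters: [
          { name: 'identifier', in: 'path', required: true },
          { name: 'format', in: 'query', required: false }
        ]
      }
    },
    '/health': { get: {} } // operation with no metadata: all fields fall back to ''
  }
};

const rows = flattenEndpoints(sampleSpec);
console.log(rows);
```

One caveat that applies to the sketch and to the loop in the commit alike: an OpenAPI path item may carry keys that are not HTTP methods (for example a path-level `parameters` array), and this loop would list each such key as if it were a method.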

tutorials/parquet.qmd

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+---
+title: "Parquet"
+---
+
+Let's query Eric's parquet file using duckdb+parquet
+
+
+simpler query:
+
+```{ojs}
+//| echo: true
+
+// Import Observable's libraries
+import {DuckDBClient} from "@observablehq/duckdb"
+
+// Create a DuckDB instance
+db = DuckDBClient.of()
+
+// Set the Parquet file path
+parquet_path = 'https://storage.googleapis.com/opencontext-parquet/oc_isamples_pqg.parquet'
+
+// For testing, use a smaller dataset or limit rows
+// Option 1: Use LIMIT to reduce data transferred
+viewof testResults = {
+  // Show loading indicator
+  const loadingElement = html`<div>Running query...</div>`;
+  document.body.appendChild(loadingElement);
+
+  try {
+    // Test with a small LIMIT to verify connection works
+    const data = await db.query(`
+      SELECT otype, pid
+      FROM read_parquet('${parquet_path}')
+      LIMIT 10
+    `);
+    return Inputs.table(data);
+  } finally {
+    // Remove loading indicator when done (whether success or error)
+    loadingElement.remove();
+  }
+}
+
+```
+
+now the full query
+
+
+```{ojs}
+//| echo: true
+
+
+// Query the Parquet file
+viewof results = Inputs.table(
+  await db.query(`
+    SELECT COUNT(pid) as count, otype
+    FROM read_parquet('${parquet_path}')
+    GROUP BY otype
+    ORDER BY count DESC
+  `)
+)
+```
+
+
+
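
Review note on tutorials/parquet.qmd: to make the shape of the "full query" result concrete, here is a rough in-memory equivalent of `SELECT COUNT(pid) AS count, otype ... GROUP BY otype ORDER BY count DESC` in plain JavaScript. This is only a sketch: `countByOtype` is a hypothetical helper, and the `otype` and `pid` values in `sampleRows` are made up rather than read from the Parquet file.

```javascript
// Hypothetical helper: groups rows by `otype` and counts non-null `pid`s,
// then sorts groups by count in descending order.
function countByOtype(records) {
  const counts = new Map();
  for (const { otype, pid } of records) {
    if (!counts.has(otype)) counts.set(otype, 0);
    // SQL COUNT(pid) ignores NULLs, so only non-null pids are counted.
    if (pid !== null && pid !== undefined) {
      counts.set(otype, counts.get(otype) + 1);
    }
  }
  return [...counts.entries()]
    .map(([otype, count]) => ({ count, otype }))
    .sort((a, b) => b.count - a.count);
}

// Made-up rows (assumption: these otype and pid values are illustrative only).
const sampleRows = [
  { otype: 'sample', pid: 'p1' },
  { otype: 'agent', pid: 'p2' },
  { otype: 'sample', pid: 'p3' },
  { otype: 'sample', pid: null }, // NULL pid: kept in the group, not counted
  { otype: 'site', pid: 'p4' },
  { otype: 'agent', pid: 'p5' },
  { otype: 'sample', pid: 'p6' }
];

const grouped = countByOtype(sampleRows);
console.log(grouped);
```

In the notebook itself the aggregation runs inside DuckDB, so only the grouped rows reach the table; the helper above is just a way to reason about the expected output on a handful of rows.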
