API Reference

Base URLs

  • Producer: http://localhost:8081
  • Consumer: http://localhost:8082

Producer Endpoints

POST /ingest

Ingest a table from a URL.

Request:

POST /ingest HTTP/1.1
Host: localhost:8081
Content-Type: application/json

{
  "url": "https://en.wikipedia.org/wiki/List_of_countries_by_population_(United_Nations)"
}

Response (202 Accepted):

{
  "status": "success",
  "message": "table ingestion started"
}

Error Responses:

| Status | Body | Cause |
|--------|------|-------|
| 400 | {"error": "url is required"} | Missing URL |
| 400 | {"error": "invalid request body"} | Malformed JSON |
| 500 | {"error": "invalid URL: ..."} | SSRF blocked or invalid URL |
| 500 | {"error": "fetch failed: ..."} | Network error |
| 500 | {"error": "parse failed: ..."} | No table found |
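A minimal Python sketch of a client for POST /ingest, using only the standard library. The helper names `build_ingest_request` and `submit_ingest` are hypothetical and not part of this project; the base URL is the local producer address listed under Base URLs. It assumes every error response carries a JSON body.

```python
import json
import urllib.error
import urllib.request

PRODUCER_BASE_URL = "http://localhost:8081"  # producer base URL from this reference


def build_ingest_request(page_url, base_url=PRODUCER_BASE_URL):
    """Build (but do not send) a POST /ingest request for page_url."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ingest",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_ingest(page_url, base_url=PRODUCER_BASE_URL, timeout=10.0):
    """Send the request and return (status_code, parsed_json_body)."""
    request = build_ingest_request(page_url, base_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.loads(response.read())
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are assumed to carry the {"error": "..."} body.
        return err.code, json.loads(err.read())
```

Because ingestion is asynchronous, a 202 from `submit_ingest` only means the work was accepted; the consumer's GET /tables endpoint is the place to confirm that a table was actually written.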

POST /ingest/all

Ingest ALL tables from a URL (multi-table extraction).

Request:

POST /ingest/all HTTP/1.1
Host: localhost:8081
Content-Type: application/json

{
  "url": "https://en.wikipedia.org/wiki/List_of_countries_by_population_(United_Nations)"
}

Response (202 Accepted):

{
  "status": "success",
  "message": "all tables ingestion started",
  "tables_found": 3
}

Behavior:

  • Extracts ALL tables found on the page
  • Tables after the first get a numeric suffix, starting at _2: tablename, tablename_2, tablename_3
  • Skips tables that fail validation
  • Returns count of tables found
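The naming rule above can be sketched in Python as follows. The function `table_names` is hypothetical, and the sketch assumes numbering is contiguous; this reference does not say whether a table skipped by validation leaves a gap in the sequence.

```python
def table_names(base_name, tables_found):
    """Return the names for `tables_found` tables extracted from one page."""
    names = []
    for index in range(1, tables_found + 1):
        # The first table keeps the bare name; later ones get _2, _3, ...
        names.append(base_name if index == 1 else f"{base_name}_{index}")
    return names
```

For the example response with `"tables_found": 3`, this yields the three names shown in the list above, which can then be looked up one by one through the consumer's GET /tables/{table_name} endpoint.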

Error Responses: Same as /ingest


GET /health

Health check endpoint.

Request:

GET /health HTTP/1.1
Host: localhost:8081

Response (200 OK):

{
  "status": "healthy",
  "time": "2026-01-18T11:00:00Z"
}

GET /metrics

Producer metrics and statistics.

Request:

GET /metrics HTTP/1.1
Host: localhost:8081

Response (200 OK):

{
  "uptime_seconds": 3600.5,
  "producer": {
    "ingest_requests": 50,
    "ingest_success": 48,
    "ingest_errors": 2,
    "tables_fetched": 48,
    "rows_published": 12500
  },
  "consumer": {
    "messages_consumed": 0,
    "tables_created": 0,
    "rows_inserted": 0,
    "processing_errors": 0
  },
  "last_processed_at": "0001-01-01T00:00:00Z"
}
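As one way to use this payload, the sketch below derives an ingest success rate from the producer counters. The helper `ingest_success_rate` is hypothetical; it assumes `ingest_success` and `ingest_errors` are both counted against `ingest_requests`, which the example numbers (48 + 2 = 50) suggest but this reference does not state.

```python
def ingest_success_rate(metrics):
    """Return ingest_success / ingest_requests, or None before any request."""
    producer = metrics["producer"]
    requests = producer["ingest_requests"]
    if requests == 0:
        return None  # avoid dividing by zero on a freshly started producer
    return producer["ingest_success"] / requests
```

Applied to the example response, the rate is 48 / 50, or 0.96.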

Consumer Endpoints

GET /health

Health check endpoint.

Request:

GET /health HTTP/1.1
Host: localhost:8082

Response (200 OK):

{
  "status": "healthy"
}

GET /stats

Consumer metrics and statistics.

Request:

GET /stats HTTP/1.1
Host: localhost:8082

Response (200 OK):

{
  "uptime_seconds": 3600.5,
  "producer": {
    "ingest_requests": 0,
    "ingest_success": 0,
    "ingest_errors": 0,
    "tables_fetched": 0,
    "rows_published": 0
  },
  "consumer": {
    "messages_consumed": 125,
    "tables_created": 48,
    "rows_inserted": 12500,
    "processing_errors": 0
  },
  "last_processed_at": "2026-01-18T11:30:00Z"
}

GET /tables

List all ingested tables.

Request:

GET /tables HTTP/1.1
Host: localhost:8082

Response (200 OK):

{
  "tables": [
    "list_of_countries_by_population_united_nations",
    "html_tables_tutorial"
  ],
  "count": 2
}

GET /tables/{table_name}

Get table metadata.

Request:

GET /tables/list_of_countries_by_population_united_nations HTTP/1.1
Host: localhost:8082

Response (200 OK):

{
  "name": "list_of_countries_by_population_united_nations",
  "row_count": 235,
  "columns": [
    {"name": "rank", "position": 0},
    {"name": "country", "position": 1},
    {"name": "population", "position": 2}
  ]
}

Error Response (404):

{
  "error": "sql: no rows in result set"
}

GET /tables/{table_name}/data

Query table data.

Request:

GET /tables/list_of_countries_by_population_united_nations/data?limit=3 HTTP/1.1
Host: localhost:8082

Query Parameters:

| Parameter | Type | Default | Max | Description |
|-----------|------|---------|-----|-------------|
| limit | int | 100 | 1000 | Number of rows to return |

Response (200 OK):

{
  "table": "list_of_countries_by_population_united_nations",
  "rows": [
    {"rank": "1", "country": "China", "population": "1425887337"},
    {"rank": "2", "country": "India", "population": "1417173173"},
    {"rank": "3", "country": "United States", "population": "338289857"}
  ],
  "count": 3
}
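A small Python sketch for building the data URL with a `limit` kept inside the documented range (default 100, max 1000). The helper `table_data_url` is hypothetical. How the server treats an out-of-range `limit` is not documented here, so clamping on the client, and the lower bound of 1, are assumptions.

```python
from urllib.parse import quote, urlencode

CONSUMER_BASE_URL = "http://localhost:8082"  # consumer base URL from this reference
DEFAULT_LIMIT = 100  # documented default
MAX_LIMIT = 1000     # documented maximum


def table_data_url(table_name, limit=None, base_url=CONSUMER_BASE_URL):
    """Return the GET /tables/{table_name}/data URL with a clamped limit."""
    if limit is None:
        limit = DEFAULT_LIMIT
    # Assumption: clamp client-side rather than rely on server behavior.
    limit = max(1, min(int(limit), MAX_LIMIT))
    path = f"/tables/{quote(table_name, safe='')}/data"
    return f"{base_url}{path}?{urlencode({'limit': limit})}"
```

Percent-encoding the table name is a precaution; the names in the examples here contain only lowercase letters, digits, and underscores.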

GET /kafka/brokers

List Kafka brokers.

Request:

GET /kafka/brokers HTTP/1.1
Host: localhost:8082

Response (200 OK):

{
  "brokers": [
    {
      "id": 1,
      "address": "localhost:9092"
    }
  ],
  "count": 1
}

GET /kafka/topic

Get topic metadata including partitions.

Request:

GET /kafka/topic HTTP/1.1
Host: localhost:8082

Response (200 OK):

{
  "name": "lattice.table.records",
  "partitions": [
    {
      "id": 0,
      "leader": 1,
      "replicas": [1],
      "isr": [1]
    },
    {
      "id": 1,
      "leader": 1,
      "replicas": [1],
      "isr": [1]
    },
    {
      "id": 2,
      "leader": 1,
      "replicas": [1],
      "isr": [1]
    }
  ]
}

Fields:

  • leader: Broker ID handling reads/writes for this partition
  • replicas: List of broker IDs holding replicas
  • isr: In-Sync Replicas (brokers caught up with leader)
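To illustrate how these fields relate, the sketch below flags partitions where some replica is not in the in-sync set. The function `under_replicated_partitions` is hypothetical and operates on the already-parsed JSON response.

```python
def under_replicated_partitions(topic):
    """Return IDs of partitions whose ISR is missing at least one replica."""
    flagged = []
    for partition in topic["partitions"]:
        missing = set(partition["replicas"]) - set(partition["isr"])
        if missing:
            flagged.append(partition["id"])
    return flagged
```

For the example response, every partition has `replicas` equal to `isr`, so the result is an empty list.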

GET /kafka/consumer-group

Get consumer group offsets.

Request:

GET /kafka/consumer-group HTTP/1.1
Host: localhost:8082

Response (200 OK):

{
  "group": "lattice-table-writer",
  "topic": "lattice.table.records",
  "offsets": {
    "0": 45,
    "1": 38,
    "2": 42
  }
}

Fields:

  • offsets: Map of partition ID to committed offset
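A short sketch of summing the committed offsets across partitions. The helper `total_committed` is hypothetical. Note that this is not consumer lag: computing lag would also need each partition's log end offset, which this endpoint does not return.

```python
def total_committed(group_info):
    """Sum the committed offsets over all partitions of the topic."""
    # JSON object keys are strings ("0", "1", ...); only the values are summed.
    return sum(group_info["offsets"].values())
```

For the example response this gives 45 + 38 + 42 = 125. The total approximates the number of messages consumed only if every partition started at offset 0, which is an assumption.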

Error Response Format

All error responses follow this format:

{
  "error": "error message here"
}
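Given this shared format, a client can extract the message with one helper. The sketch below uses a hypothetical function, `error_message`; the fallbacks for a non-JSON body or a missing `error` key are defensive assumptions, not documented behavior.

```python
import json


def error_message(status_code, body_text):
    """Return the error string for a non-2xx response, or None on success."""
    if 200 <= status_code < 300:
        return None
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        # Assumption: fall back to the raw body if it is not valid JSON.
        return body_text.strip() or f"HTTP {status_code}"
    if isinstance(payload, dict) and "error" in payload:
        return str(payload["error"])
    return f"HTTP {status_code}"
```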

HTTP Status Codes

| Code | Meaning |
|------|---------|
| 200 | Success |
| 202 | Accepted (async processing started) |
| 400 | Bad Request (validation error) |
| 404 | Not Found |
| 405 | Method Not Allowed |
| 500 | Internal Server Error |

CORS

CORS is enabled for all origins on the producer /ingest endpoint.

Headers:

Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, OPTIONS
Access-Control-Allow-Headers: Content-Type