@neema2 neema2 commented Jul 1, 2025

This PR integrates SQLGlot into the legendql branch to enable conversion of SQL statements into Pure "Relation" code.

Changes

  • Added SQLGlot dependency for SQL parsing
  • Created SQLParser class that converts SQLGlot AST to LegendQL metamodel
  • Added Query.from_sql() method for creating queries from SQL
  • Added from_sql() function to the LegendQL API
  • Added comprehensive test suite for SQL parsing
  • Updated tests to assert against correct Legend Pure code format

Example Usage

import legendql as lq
from legendql.model.schema import Table, Database
import pyarrow as pa

# Create table and database
table = Table("tableC", [pa.field("colA", pa.utf8()), pa.field("colB", pa.utf8())])
database = Database("test::Database", [table])

# Parse SQL and generate Pure Relation code
sql = "select colA, colB from tableC"
query = lq.from_sql(sql, database)
pure_relation = query.to_string()
# Output: #>{test::Database.tableC}#->select(~[colA, colB])->from(legendql::Runtime)

Testing

  • Added unit tests for SQL parser functionality
  • Verified integration with existing LegendQL API
  • Tested with the example from the task: "select colA, colB from tableC"
  • Updated test assertions to verify exact Pure relation code format
  • All 29 tests pass

Pure Relation Code Format

The SQL parser generates correct Pure relation syntax:

  • Basic SELECT: #>{database.table}#->select(~[col1, col2])->from(legendql::Runtime)
  • WHERE clauses: ->filter(column==value)
  • ORDER BY: ->sort([~column->ascending(), ~column->descending()])
  • LIMIT: ->limit(number)
  • OFFSET: ->drop(number)
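
The clause mapping above can be sketched as a small string renderer. This is not the PR's SQLParser and does not use SQLGlot or the LegendQL metamodel: `SelectSpec` and `to_pure_relation` are hypothetical names introduced only for illustration, and the order in which the clauses are chained (filter, select, sort, drop, limit, then from) is an assumption rather than something the PR states.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SelectSpec:
    """Hypothetical, simplified stand-in for an already-parsed SELECT statement."""
    database: str
    table: str
    columns: List[str]
    where: Optional[str] = None  # pre-rendered condition, e.g. "colA==1"
    order_by: List[Tuple[str, bool]] = field(default_factory=list)  # (column, ascending)
    limit: Optional[int] = None
    offset: Optional[int] = None


def to_pure_relation(spec: SelectSpec, runtime: str = "legendql::Runtime") -> str:
    """Render the spec using the clause formats listed in this PR description."""
    parts = [f"#>{{{spec.database}.{spec.table}}}#"]
    # Assumption: filter is emitted before select so it can see every column.
    if spec.where is not None:
        parts.append(f"filter({spec.where})")
    parts.append(f"select(~[{', '.join(spec.columns)}])")
    if spec.order_by:
        keys = ", ".join(
            f"~{col}->{'ascending' if asc else 'descending'}()"
            for col, asc in spec.order_by
        )
        parts.append(f"sort([{keys}])")
    # Assumption: OFFSET (drop) is applied before LIMIT, as in SQL semantics.
    if spec.offset is not None:
        parts.append(f"drop({spec.offset})")
    if spec.limit is not None:
        parts.append(f"limit({spec.limit})")
    parts.append(f"from({runtime})")
    return "->".join(parts)


basic = SelectSpec("test::Database", "tableC", ["colA", "colB"])
print(to_pure_relation(basic))
```

For the basic spec, the printed string matches the output shown under Example Usage. Setting `where`, `order_by`, `limit`, or `offset` appends the corresponding `->filter(...)`, `->sort([...])`, `->limit(...)`, or `->drop(...)` step.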

Addresses the requirement to convert SQL statements into Pure "Relation" code by using SQLGlot's AST parsing.

Link to Devin run: https://app.devin.ai/sessions/c47cbf6679c54bb98ab35459c946bf98

Requested by: [email protected]

- Add SQLGlot dependency to pyproject.toml
- Create SQLParser class to convert SQL AST to LegendQL metamodel
- Add Query.from_sql() class method for SQL-based query creation
- Add from_sql() function to LegendQL API
- Add comprehensive tests for SQL parsing functionality
- Support basic SELECT statements with FROM, WHERE, ORDER BY, LIMIT, OFFSET
- Generate Pure Relation code like: #>{database.table}#->select(~[col1, col2])->from(legendql::Runtime)

Co-Authored-By: [email protected] <[email protected]>
linux-foundation-easycla bot commented Jul 1, 2025

CLA Not Signed
