The from_string API parses a PyDough source code string and transforms it into a PyDough collection. You can then perform operations like explain(), to_sql(), or to_df() on the result.
See the [demo notebooks](../demos/notebooks/1_introduction.ipynb) for more instances of how to use the `to_df` API.

<!-- TOC --><a name="evaluation-apis"></a>
## Transformation APIs

This section describes APIs that transform PyDough source code into a result that can be used as input to the other evaluation or exploration APIs.

<!-- TOC --><a name="pydoughfrom_string"></a>
### `pydough.from_string`

The `from_string` API parses a PyDough source code string and transforms it into a PyDough collection. You can then perform operations like `explain()`, `to_sql()`, or `to_df()` on the result.

#### Syntax
```python
def from_string(
    source: str,
    answer_variable: str | None = None,
    metadata: GraphMetadata | None = None,
    environment: dict[str, Any] | None = None,
) -> UnqualifiedNode:
```

The first argument, `source`, is the source code string. It can be a single PyDough command or multi-line PyDough code with intermediate results stored in variables. The API also accepts the following optional keyword arguments:

- `answer_variable`: The name of the variable that stores the final result of the PyDough code. If not provided, the API expects the final result to be in a variable named `result`. The API returns a PyDough collection holding this value. The PyDough code string must define a variable whose name matches `answer_variable` and whose value is valid PyDough code; otherwise, an exception is raised.
- `metadata`: The PyDough knowledge graph to use for the transformation. If omitted, `pydough.active_session.metadata` is used.
- `environment`: A dictionary representing additional environment context. This serves as the local namespace where the PyDough code will be executed.

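
As a rough mental model of how `answer_variable` and `environment` fit together, the plain-Python sketch below runs a source string in a namespace seeded from a dictionary and then returns one named variable. The helper `run_snippet` is hypothetical and is not PyDough's implementation, which may differ in how it processes the source; the sketch only illustrates the roles of the two keyword arguments described above.

```python
from typing import Any, Dict, Optional


def run_snippet(
    source: str,
    answer_variable: str = "result",
    environment: Optional[Dict[str, Any]] = None,
) -> Any:
    """Hypothetical helper: run `source` and return the variable named `answer_variable`."""
    # Copy the environment so the caller's dictionary is not mutated.
    namespace: Dict[str, Any] = dict(environment or {})
    # The copied dictionary acts as the local namespace of the executed code.
    exec(source, {}, namespace)
    if answer_variable not in namespace:
        raise KeyError(f"source does not define {answer_variable!r}")
    return namespace[answer_variable]


# Multi-line source with an intermediate variable; SEG comes from the environment.
source = "upper = SEG.upper()\nanswer = f'segment={upper}'"
value = run_snippet(source, answer_variable="answer", environment={"SEG": "automobile"})
print(value)
```

If the source never assigns the requested variable, the sketch raises an error, mirroring the documented behavior that a missing answer variable results in an exception.
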
Below are examples of using `pydough.from_string`, along with the SQL that calling `pydough.to_sql` on the output could generate. All of these examples use the TPC-H dataset, which can be downloaded [here](https://github.com/lovasoa/TPCH-sqlite/releases), with the [graph used in the demos directory](../demos/metadata/tpch_demo_graph.json).

The first example is Python code that uses `pydough.from_string` to generate SQL that counts the customers in the market segment `"AUTOMOBILE"`. The result is stored in a variable named `pydough_query` instead of the default `result`, and the market segment `"AUTOMOBILE"` is passed in through the `environment` argument as the variable `SEG`:

```py
import pydough

# Setup demo metadata. Make sure you have the TPC-H dataset downloaded locally.
```

The value of `sql` is the following SQL query text as a Python string:
```sql
SELECT
  COUNT(*) AS n
FROM main.customer
WHERE
  c_mktsegment = 'AUTOMOBILE'
```

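
One way to sanity-check a generated query like the one above is to run it against a small SQLite database. The sketch below uses Python's standard `sqlite3` module with an in-memory table; the table layout is cut down to two columns and the rows are made up for illustration, so this is not the real TPC-H data.

```python
import sqlite3

# The SQL text from the example above, as a Python string.
sql = """
SELECT
  COUNT(*) AS n
FROM main.customer
WHERE
  c_mktsegment = 'AUTOMOBILE'
"""

# In-memory stand-in for the TPC-H customer table (hypothetical, made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (c_custkey INTEGER PRIMARY KEY, c_mktsegment TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "AUTOMOBILE"), (2, "BUILDING"), (3, "AUTOMOBILE")],
)

# `main` is SQLite's name for the primary database, so `main.customer` resolves.
n = conn.execute(sql).fetchone()[0]
print(n)  # 2 of the 3 made-up rows are in the AUTOMOBILE segment
conn.close()
```
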
The next example is Python code that generates SQL to get the top 5 suppliers with the highest revenue. The snippet uses variables provided in the environment context to filter by nation, ship mode, and year (`TARGET_NATION`, `DESIRED_SHIP_MODE`, and `REQUESTED_SHIP_YEAR`):

```py
# Example of a multi-line pydough code snippet with intermediate results
```

Logging is enabled and set to the INFO level by default. You can change the log level by setting the environment variable `PYDOUGH_LOG_LEVEL` to one of the standard levels: DEBUG, INFO, WARNING, ERROR, or CRITICAL.
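
The variable can be set in the shell or from Python. The sketch below sets it with `os.environ` and shows how a standard level name maps onto the constants in Python's `logging` module. The helper `resolve_log_level` is hypothetical and is not PyDough's own code. It is also an assumption that the variable needs to be set before PyDough configures its logger, so setting it before importing `pydough` is the cautious choice.

```python
import logging
import os

STANDARD_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def resolve_log_level(default: str = "INFO") -> int:
    """Hypothetical helper: map PYDOUGH_LOG_LEVEL to a `logging` constant."""
    name = os.environ.get("PYDOUGH_LOG_LEVEL", default).upper()
    if name not in STANDARD_LEVELS:
        raise ValueError(f"unsupported log level: {name!r}")
    # Each standard level name is also an integer attribute of `logging`.
    return getattr(logging, name)


# Set the variable for this process, ideally before importing pydough.
os.environ["PYDOUGH_LOG_LEVEL"] = "DEBUG"
level = resolve_log_level()
print(level == logging.DEBUG)
```
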