Commit c37160b

Add support for parameters via pf.Param(data) (#212)
* Main edit
* Fix tests
* Also update constraints
* Introduce Param function
* Shift examples to using Param
* Add support for parameters from files
* Remove unnecessary arguments in Param
* Improve documentation
* Fix annotations in Python 3.9
* Fix tests
* Fix broken references
* Improve testing
* Fix failing tests
* Increase test coverage
* Increase test coverage
1 parent f3dc40f commit c37160b

33 files changed: +311 -329 lines changed

docs/contribute/index.md

Lines changed: 3 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -39,6 +39,9 @@ Pyoframe has several types of tests.
3939

4040
4. Documentation tests (in `docs/`). All Python code blocks in the documentation are run to ensure the documentation doesn't become outdated. This is done using Sybil. Refer to the [Sybil documentation](https://sybil.readthedocs.io/en/latest/markdown.html#code-blocks) to learn how to create setup code or skip code blocks you don't wish to test.
4141

42+
!!! warning "Non-breaking spaces"
43+
Be aware that Pyoframe uses non-breaking spaces to improve the formatting of expressions. If your Sybil tests are unexpectedly failing, make sure that the expected output contains all the needed non-breaking spaces.
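One way to track down such a failure is to make the non-breaking spaces visible before comparing strings. The snippet below is a minimal sketch using only the standard library; the helper `show_nbsp` and the sample strings are hypothetical and are not taken from Pyoframe's actual output.

```python
NBSP = "\u00a0"  # Unicode non-breaking space


def show_nbsp(text: str) -> str:
    """Return text with every non-breaking space replaced by a visible marker."""
    return text.replace(NBSP, "<NBSP>")


# Hypothetical strings: they look identical when printed but are not equal.
expected = "2 x + 3"      # typed with ordinary spaces only
actual = f"2{NBSP}x + 3"  # contains one non-breaking space

print(expected == actual)  # False
print(show_nbsp(actual))   # 2<NBSP>x + 3
```

Copying the expected output directly from a terminal run, rather than retyping it, is usually enough to preserve the non-breaking spaces.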
44+
4245
## Writing documentation
4346

4447
You can preview the documentation website by running `mkdocs serve` and navigating to [`http://127.0.0.1:8000/pyoframe/`](http://127.0.0.1:8000/pyoframe/).

docs/examples/facility_location.md

Lines changed: 4 additions & 4 deletions
@@ -26,12 +26,12 @@ model.customers = model.x_axis * model.y_axis # (1)!
2626

2727

2828
model.facility_position = pf.Variable(model.facilities, model.axis, lb=0, ub=1)
29-
model.customer_position_x = pd.DataFrame(
29+
model.customer_position_x = pf.Param(
3030
{"x": range(G), "x_pos": [step / (G - 1) for step in range(G)]}
31-
).to_expr()
32-
model.customer_position_y = pd.DataFrame(
31+
)
32+
model.customer_position_y = pf.Param(
3333
{"y": range(G), "y_pos": [step / (G - 1) for step in range(G)]}
34-
).to_expr()
34+
)
3535

3636
model.max_distance = pf.Variable(lb=0)
3737

docs/learn/advanced-concepts/internals.md

Lines changed: 4 additions & 3 deletions
@@ -1,7 +1,6 @@
11
# Internal details
22

3-
Pyoframe's inner workings involve a few tricks that you should be aware of
4-
if you wish to modify Pyoframe's internal code.
3+
Pyoframe's inner workings involve a few tricks that you should be aware of if you wish to contribute to Pyoframe's code base.
54

65
## The zero variable
76

@@ -16,7 +15,9 @@ constant terms and also simplifies the [handling of quadratics](#quadratics).
1615

1716
Internally, [Expression][pyoframe.Expression] is used to represent both linear and quadratic mathematical expressions. When a quadratic expression is formed, column `__quadratic_variable_id` is added to [Expression.data][pyoframe.Expression.data]. If an expression's quadratic terms happen to cancel out (e.g. `(ab + c) - ab`), this column is automatically removed.
1817

19-
Column `__quadratic_variable_id` records the ID of the _second_ variable in a quadratic term (the `b` in `3ab`). For linear terms, which have no second variable, this column contains the [Zero Variable](#the-zero-variable). Quadratic terms are always stored such that the first term's variable ID (in column `__variable_id`) is greater or equal to the second term's variable id (in column `__quadratic_variable_id`). For example, `var_7 * var_8` would be rearranged and stored as `var_8 * var_7`. This helps simplify expressions and provides a useful guarantee: If the variable in the first column (`__variable_id`) is the Zero Variable (`var_0`) we know the variable in the second column must also be the Zero Variable and, thus, the term must be a constant.
18+
Column `__quadratic_variable_id` records the ID of the _second_ variable in a quadratic term (the `b` in `3ab`). For linear terms, which have no second variable, this column contains the [Zero Variable](#the-zero-variable).
19+
20+
Quadratic terms are always stored such that the first term's variable ID (in column `__variable_id`) is greater or equal to the second term's variable id (in column `__quadratic_variable_id`). For example, `var_7 * var_8` would be rearranged and stored as `var_8 * var_7`. This helps simplify expressions and provides a useful guarantee: If the variable in the first column (`__variable_id`) is the Zero Variable (`var_0`) we know the variable in the second column must also be the Zero Variable and, thus, the term must be a constant.
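The ordering rule can be illustrated with a small sketch. This is not Pyoframe's implementation (which stores terms in DataFrame columns); the names `ZERO_VAR`, `canonical_term`, and `is_constant` are hypothetical and only model the guarantee described above, using plain tuples.

```python
ZERO_VAR = 0  # ID of the Zero Variable (var_0), as described in the text


def canonical_term(coeff, first_id, second_id=ZERO_VAR):
    """Return (coeff, variable_id, quadratic_variable_id) with the larger ID first."""
    return (coeff, max(first_id, second_id), min(first_id, second_id))


def is_constant(term):
    """A term is constant when its first variable is the Zero Variable.

    Because the first ID is always >= the second, a zero first ID implies
    that the second ID is zero as well.
    """
    _, variable_id, _ = term
    return variable_id == ZERO_VAR


print(canonical_term(1, 7, 8))              # (1, 8, 7): var_7 * var_8 stored as var_8 * var_7
print(canonical_term(3, 5))                 # (3, 5, 0): linear term, second slot is var_0
print(is_constant(canonical_term(4, 0)))    # True
```

Checking a single column is enough to identify constants, which is the practical benefit of storing the larger ID first.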
2021

2122
## Division
2223

docs/learn/concepts/addition.md

Lines changed: 9 additions & 13 deletions
@@ -31,14 +31,13 @@ import pyoframe as pf
3131
import polars as pl
3232
3333
air_data = pl.DataFrame({"flight_no": ["A4543", "K937"], "emissions": [1.4, 2.4]})
34-
ground_data = pl.DataFrame(
35-
{"flight_number": ["A4543", "K937"], "emissions": [0.02, 0.05]}
36-
)
3734
3835
model = pf.Model()
3936
model.Fly = pf.Variable(air_data["flight_no"], vtype="binary")
4037
model.air_emissions = model.Fly * air_data
41-
model.ground_emissions = ground_data.to_expr()
38+
model.ground_emissions = pf.Param(
39+
{"flight_number": ["A4543", "K937"], "emissions": [0.02, 0.05]}
40+
)
4241
-->
4342

4443
```pycon
@@ -90,7 +89,7 @@ What we'd like to do is effectively 'copy' (aka. 'broadcast') `E_max` _over_ eve
9089

9190
```pycon
9291
>>> model.E_max.over("flight_no")
93-
<Expression terms=1 type=linear>
92+
<Expression (linear) terms=1>
9493
┌───────────┬────────────┐
9594
│ flight_no ┆ expression │
9695
╞═══════════╪════════════╡
@@ -104,7 +103,7 @@ Notice how applying `.over("flight_no")` added a dimension `flight_no` with valu
104103
```pycon
105104
>>> model.emission_constraint = model.E_max.over("flight_no") >= model.flight_emissions
106105
>>> model.emission_constraint
107-
<Constraint 'emission_constraint' height=2 terms=6 type=linear>
106+
<Constraint 'emission_constraint' (linear) height=2 terms=6>
108107
┌───────────┬───────────────────────────────┐
109108
│ flight_no ┆ constraint │
110109
│ (2) ┆ │
@@ -127,19 +126,16 @@ If one of the two expressions in an addition has extras labels not present in th
127126
import pyoframe as pf
128127
import polars as pl
129128
130-
air_data = pl.DataFrame(
129+
model = pf.Model()
130+
model.air_emissions = pf.Param(
131131
{
132132
"flight_no": ["A4543", "K937", "D2082", "D8432", "D1206"],
133133
"emissions": [1.4, 2.4, 4, 7.6, 4],
134134
}
135135
)
136-
ground_data = pl.DataFrame(
136+
model.ground_emissions = pf.Param(
137137
{"flight_no": ["A4543", "K937", "B3420"], "emissions": [0.02, 0.05, 0.001]}
138138
)
139-
140-
model = pf.Model()
141-
model.air_emissions = air_data.to_expr()
142-
model.ground_emissions = ground_data.to_expr()
143139
-->
144140

145141
Consider again [example 1](#example-1-catching-a-mistake) where we added air emissions and ground emissions.
@@ -205,7 +201,7 @@ Option 2 hardly seems reasonable this time considering that air emissions make u
205201

206202
```pycon
207203
>>> model.air_emissions.keep_extras() + model.ground_emissions.drop_extras()
208-
<Expression height=5 terms=5 type=constant>
204+
<Expression (parameter) height=5 terms=5>
209205
┌───────────┬────────────┐
210206
│ flight_no ┆ expression │
211207
│ (5) ┆ │

docs/learn/concepts/special-functions.md

Lines changed: 0 additions & 100 deletions
@@ -10,103 +10,3 @@ Pyoframe has a few special functions that make working with dataframes easy and
1010

1111
## `Expression.map()`
1212

13-
14-
## `DataFrame.to_expr()`
15-
16-
!!! abstract "Summary"
17-
18-
[`pandas.DataFrame.to_expr()`](../../reference/external/pandas.DataFrame.to_expr.md) and [`polars.DataFrame.to_expr()`](../../reference/external/polars.DataFrame.to_expr.md) allow users to manually convert their DataFrames to Pyoframe [Expressions][pyoframe.Expression] when Pyoframe is unable to perform an automatic conversion.
19-
20-
Pyoframe conveniently allows users to use [Polars DataFrames](https://docs.pola.rs/api/python/stable/reference/dataframe/index.html) and [Pandas DataFrames](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) in their mathematical expressions. To do so, Pyoframe automatically detects these DataFrames and converts them to Pyoframe [Expressions][pyoframe.Expression] whenever there is a mathematical operation (e.g., `*`, `-`, `+`) involving at least one Pyoframe object (e.g. [Variable][pyoframe.Variable], [Set][pyoframe.Set], [Expression][pyoframe.Expression], etc.).
21-
22-
However, if **neither** the left nor the right term of a mathematical operation is a Pyoframe object, Pyoframe will not automatically convert DataFrames[^2]. In these situations, users can manually convert their DataFrames to Pyoframe expressions using `.to_expr()`.
23-
24-
Additionally, users should use `.to_expr()` whenever they wish to use [over][pyoframe.Expression.over], [drop_extras][pyoframe.Expression.drop_extras], or [keep_extras][pyoframe.Expression.keep_extras] on a DataFrame.
25-
26-
!!! info "Under the hood"
27-
28-
How is `.to_expr()` a valid Pandas and Polars method? `import pyoframe` causes Pyoframe to [monkey patch](https://stackoverflow.com/questions/5626193/what-is-monkey-patching) the Pandas and Polars libraries. One of the patches adds the `.to_expr()` method to both `pandas.DataFrame` and `polars.DataFrame` (see [`monkey_patch.py`](https://github.com/Bravos-Power/pyoframe/tree/main/src/pyoframe)).
29-
30-
!!! tip "Working with Pandas Series"
31-
32-
You can call `.to_expr()` on a Pandas Series to produce an expression where the labels will be determined from the Series' index.
33-
34-
[^2]: After all, how could it? If a user decides to write code that adds two DataFrames together, Pyoframe shouldn't interfere.
35-
36-
### Example
37-
38-
Consider the following scenario where we have some population data on yearly births and deaths, as well as an immigration variable.
39-
40-
```python
41-
import pyoframe as pf
42-
import pandas as pd
43-
44-
population_data = pd.DataFrame(
45-
dict(year=[2025, 2026], births=[1e6, 1.1e6], deaths=[-1.2e6, -1.4e6])
46-
)
47-
48-
model = pf.Model()
49-
model.immigration = pf.Variable(dict(year=[2025, 2026]))
50-
```
51-
52-
Now, say we wanted an expression representing the total yearly population change. The following works just fine:
53-
54-
```pycon
55-
>>> (
56-
... model.immigration
57-
... + population_data[["year", "births"]]
58-
... + population_data[["year", "deaths"]]
59-
... )
60-
<Expression height=2 terms=4 type=linear>
61-
┌──────┬───────────────────────────┐
62-
│ year ┆ expression │
63-
│ (2) ┆ │
64-
╞══════╪═══════════════════════════╡
65-
│ 2025 ┆ immigration[2025] -200000 │
66-
│ 2026 ┆ immigration[2026] -300000 │
67-
└──────┴───────────────────────────┘
68-
69-
```
70-
71-
But, if we simply change the order of the terms in our addition, we get an error:
72-
73-
```pycon
74-
>>> (
75-
... population_data[["year", "births"]]
76-
... + population_data[["year", "deaths"]]
77-
... + model.immigration
78-
... )
79-
Traceback (most recent call last):
80-
...
81-
ValueError: Cannot create an expression with duplicate labels:
82-
┌────────┬────────┬─────────┬───────────────┐
83-
│ births ┆ deaths ┆ __coeff ┆ __variable_id │
84-
│ --- ┆ --- ┆ --- ┆ --- │
85-
│ f64 ┆ f64 ┆ i64 ┆ i32 │
86-
╞════════╪════════╪═════════╪═══════════════╡
87-
│ null ┆ null ┆ 4050 ┆ 0 │
88-
│ null ┆ null ┆ 4052 ┆ 0 │
89-
└────────┴────────┴─────────┴───────────────┘.
90-
91-
```
92-
93-
What happened? Since Python computes additions from left to right, the second re-arranged version failed because, in the first addition, neither operand is a Pyoframe object. As such, the addition is done by Pandas, not Pyoframe, which leads to unexpected results.
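The left-to-right behaviour can be reproduced with a toy model. In the sketch below, `Frame` and `Expr` are hypothetical stand-ins (not Pandas or Pyoframe classes) that simply record which method handled each addition.

```python
class Frame:
    """Stand-in for a DataFrame; it knows nothing about Expr."""

    def __init__(self, log=None):
        self.log = log or []

    def __add__(self, other):
        if isinstance(other, Frame):
            return Frame(self.log + other.log + ["Frame.__add__"])
        return NotImplemented  # lets Python fall back to other.__radd__


class Expr:
    """Stand-in for an expression type that can absorb a Frame on either side."""

    def __init__(self, log=None):
        self.log = log or []

    def __add__(self, other):
        return Expr(self.log + other.log + ["Expr.__add__"])

    def __radd__(self, other):
        return Expr(other.log + self.log + ["Expr.__radd__"])


births, deaths, immigration = Frame(), Frame(), Expr()

# Expr is leftmost, so it handles both additions.
print((immigration + births + deaths).log)  # ['Expr.__add__', 'Expr.__add__']

# Two Frames come first, so the first addition never reaches Expr.
print((births + deaths + immigration).log)  # ['Frame.__add__', 'Expr.__radd__']
```

In the second ordering, the expression type only sees the result of the first addition, so it has no chance to influence how the two frames were combined.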
94-
95-
How do we avoid these weird behaviors? Users can manually convert their DataFrames to Pyoframe expressions ahead of time with `.to_expr()`. For example:
96-
97-
```pycon
98-
>>> (
99-
... population_data[["year", "births"]].to_expr()
100-
... + population_data[["year", "deaths"]].to_expr()
101-
... + model.immigration
102-
... )
103-
<Expression height=2 terms=4 type=linear>
104-
┌──────┬─────────────────────────────┐
105-
│ year ┆ expression │
106-
│ (2) ┆ │
107-
╞══════╪═════════════════════════════╡
108-
│ 2025 ┆ -200000 + immigration[2025] │
109-
│ 2026 ┆ -300000 + immigration[2026] │
110-
└──────┴─────────────────────────────┘
111-
112-
```

docs/learn/get-started/basic-example/example-with-dimensions.md

Lines changed: 7 additions & 5 deletions
@@ -50,7 +50,7 @@ Load `food_data.csv` using [Polars](https://pola.rs/) or [Pandas](https://pandas
5050

5151
Pyoframe works the same whether you're using [Polars](https://pola.rs/) or [Pandas](https://pandas.pydata.org/), two similar libraries for manipulating data with DataFrames. We prefer using Polars because it is much faster (and generally better), but you can use whichever library you're most comfortable with.
5252

53-
Note that, internally, Pyoframe always uses Polars during computations to ensure the best performance. If you're using Pandas, your DataFrames will automatically be converted to Polars prior to computations. If needed, you can convert a Polars DataFrame back to Pandas using [`polars.DataFrame.to_pandas()`](https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.to_pandas.html#polars.DataFrame.to_pandas).
53+
Note that, internally, Pyoframe always uses Polars during computations to ensure the best performance. If you're using Pandas, your DataFrames will automatically be converted to Polars prior to computations.
5454

5555
## Step 2: Create the model
5656

@@ -101,7 +101,7 @@ First, multiply the variable by the protein amount.
101101

102102
```pycon
103103
>>> data[["food", "cost"]] * m.Buy
104-
<Expression height=2 terms=2 type=linear>
104+
<Expression (linear) height=2 terms=2>
105105
┌──────────────┬─────────────────────┐
106106
│ food ┆ expression │
107107
│ (2) ┆ │
@@ -120,7 +120,7 @@ Second, notice that the `Expression` still has the `food` dimension—it really
120120

121121
```pycon
122122
>>> (data[["food", "cost"]] * m.Buy).sum("food")
123-
<Expression terms=2 type=linear>
123+
<Expression (linear) terms=2>
124124
4 Buy[tofu_block] +3 Buy[chickpea_can]
125125

126126
```
@@ -146,6 +146,8 @@ assert m.Buy.solution["solution"].to_list() == [2, 1]
146146

147147
## Put it all together
148148

149+
If you've followed the steps above, your code should look like this:
150+
149151
<!-- clear-namespace -->
150152

151153
```python
@@ -162,7 +164,7 @@ m.protein_constraint = (data[["food", "protein"]] * m.Buy).sum() >= 50
162164
m.optimize()
163165
```
164166

165-
So you should buy:
167+
And you can retrieve the problem's solution as follows:
166168

167169
```pycon
168170
>>> m.Buy.solution
@@ -177,6 +179,6 @@ So you should buy:
177179

178180
```
179181

180-
Notice that since `m.Buy` is dimensioned, `m.Buy.solution` returned a DataFrame with the solution for each of the labels.
182+
Since `m.Buy` is dimensioned, `m.Buy.solution` returned a DataFrame with the solution for each of the labels!
181183

182184
<!-- -->

docs/learn/get-started/basics.md

Lines changed: 45 additions & 27 deletions
@@ -200,7 +200,7 @@ Notice how the order of the days in `Hours_Sleep` is reversed. This is no proble
200200

201201
```pycon
202202
>>> m.hours_remaining
203-
<Expression height=5 terms=15 type=linear>
203+
<Expression (linear) height=5 terms=15>
204204
┌─────┬───────────────────────────────────────────┐
205205
│ day ┆ expression │
206206
│ (5) ┆ │
@@ -214,7 +214,15 @@ Notice how the order of the days in `Hours_Sleep` is reversed. This is no proble
214214

215215
```
216216

217-
Expressions can also be formed by combining variables with DataFrames. For example:
217+
### Using parameters
218+
219+
Often, our models need to incorporate external data. To do this, we need to use **parameters**.
220+
221+
In Pyoframe, a parameter is actually just an [Expression][pyoframe.Expression] that does not contain any Variables (aka. a constant).
222+
223+
You can convert your data to a parameter by passing it to [`pf.Param(data)`][pyoframe.Param]. The last column of the data is always treated as the parameter value, and all other columns are treated as labels. (See [Param][pyoframe.Param] for other ways to create parameters.)
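The column convention can be sketched with plain Python, independent of Pyoframe. The helper `to_parameter` and the column name `is_holiday` below are hypothetical; the snippet only mirrors the stated rule (last column is the value, all other columns are labels) and is not how `pf.Param` is implemented.

```python
def to_parameter(columns):
    """Split a column mapping into label names and a {labels: value} lookup.

    Assumes at least one label column and that the last column holds the values.
    """
    *label_names, value_name = columns  # dicts iterate over keys in insertion order
    keys = zip(*(columns[name] for name in label_names))
    return label_names, dict(zip(keys, columns[value_name]))


# Hypothetical data shaped like the `holidays` table used in this section.
holidays = {
    "day": ["Mon", "Tue", "Wed", "Thu", "Fri"],
    "is_holiday": [0, 0, 0, 0, 1],
}

labels, lookup = to_parameter(holidays)
print(labels)            # ['day']
print(lookup[("Fri",)])  # 1
```

Reordering the columns would change which one is treated as the value, so it is worth checking the column order of the data before turning it into a parameter.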
224+
225+
For example, consider the following code that integrates the `holidays` DataFrame into a pay calculation:
218226

219227
```python
220228
import pandas as pd
@@ -227,15 +235,34 @@ base_pay = 20
227235
holiday_bonus = 10
228236

229237
m = pf.Model()
238+
m.is_holiday = pf.Param(holidays)
230239
m.Hours_Worked = pf.Variable(holidays["day"], lb=0)
231-
m.pay = m.Hours_Worked * base_pay + m.Hours_Worked * holidays * holiday_bonus
240+
m.pay = m.Hours_Worked * (base_pay + m.is_holiday * holiday_bonus)
241+
```
242+
243+
Here, `m.is_holiday` is a parameter Expression:
244+
245+
```pycon
246+
>>> m.is_holiday
247+
<Expression (parameter) height=5 terms=5>
248+
┌─────┬────────────┐
249+
│ day ┆ expression │
250+
│ (5) ┆ │
251+
╞═════╪════════════╡
252+
│ Mon ┆ 0 │
253+
│ Tue ┆ 0 │
254+
│ Wed ┆ 0 │
255+
│ Thu ┆ 0 │
256+
│ Fri ┆ 1 │
257+
└─────┴────────────┘
258+
232259
```
233260

234-
Here the `holidays` DataFrame is converted into an Expression and then multiplied by the `Hours_Worked` variable. When DataFrames are converted to Expressions, the last column is always treated as the value column and all previous columns are treated as dimension columns. The result is:
261+
And the resulting `m.pay` Expression correctly incorporates the holiday bonus only on Fridays:
235262

236263
```pycon
237264
>>> m.pay
238-
<Expression height=5 terms=5 type=linear>
265+
<Expression (linear) height=5 terms=5>
239266
┌─────┬──────────────────────┐
240267
│ day ┆ expression │
241268
│ (5) ┆ │
@@ -249,38 +276,29 @@ Here the `holidays` DataFrame is converted into an Expression and then multiplie
249276

250277
```
251278

252-
The page on [Transforms](../concepts/special-functions.md) describes additional ways to manipulate Expressions (e.g. `.sum(…)`, `.map(…)`, `.next(…)`).
253-
254-
### Using `.to_expr()`
255-
256-
Note that the previous example would not have worked if we had instead written:
279+
Note that you can often skip defining parameters: whenever a Pyoframe object is combined with a DataFrame, Pyoframe automatically converts the DataFrame to a parameter Expression. For example, the following works just fine:
257280

258281
```pycon
259-
>>> m.pay = m.Hours_Worked * (base_pay + holidays * holiday_bonus)
260-
Traceback (most recent call last):
261-
...
262-
TypeError: unsupported operand type(s) for +: 'int' and 'str'
263-
264-
```
265-
266-
The error occurs because you cannot simply multiply a DataFrame (`holidays`) by a number (`holiday_bonus`). In these cases, you need to explicitly convert the DataFrame to an Expression using `.to_expr()`:
267-
268-
```pycon
269-
>>> m.Hours_Worked * (base_pay + holidays.to_expr() * holiday_bonus)
270-
<Expression height=5 terms=5 type=linear>
282+
>>> m.bonus_pay = m.Hours_Worked * holidays * holiday_bonus
283+
>>> m.bonus_pay
284+
<Expression (linear) height=5 terms=5>
271285
┌─────┬──────────────────────┐
272286
│ day ┆ expression │
273287
│ (5) ┆ │
274288
╞═════╪══════════════════════╡
275-
│ Mon ┆ 20 Hours_Worked[Mon] │
276-
│ Tue ┆ 20 Hours_Worked[Tue] │
277-
│ Wed ┆ 20 Hours_Worked[Wed] │
278-
│ Thu ┆ 20 Hours_Worked[Thu] │
279-
│ Fri ┆ 30 Hours_Worked[Fri] │
289+
│ Mon ┆ 0 │
290+
│ Tue ┆ 0 │
291+
│ Wed ┆ 0 │
292+
│ Thu ┆ 0 │
293+
│ Fri ┆ 10 Hours_Worked[Fri] │
280294
└─────┴──────────────────────┘
281295

282296
```
283297

298+
### Transforms
299+
300+
The page on [Transforms](../concepts/special-functions.md) describes additional ways to formulate Expressions (e.g. using `.sum(…)`, `.map(…)`, `.next(…)`).
301+
284302
## Add constraints
285303

286304
Create constraints by using the `<=`, `>=`, and `==` operators between Expressions. For example:

docs/reference/.nav.yml

Lines changed: 0 additions & 1 deletion
@@ -2,5 +2,4 @@ nav:
22
- Overview: index.md
33
- Public API: public/
44
- Base Classes: bases/
5-
- External methods: external/
65
- Types: types/

docs/reference/external/.nav.yml

Lines changed: 0 additions & 4 deletions
This file was deleted.
