Website: Add benchmark set summary slides and table#128

Merged
siddharth-krishna merged 22 commits into main from 119-Benchmark-summary-table-from-slides-(or-nice-graphs)
Apr 11, 2025

Conversation

@jacek-oet
Member

@jacek-oet jacek-oet commented Mar 3, 2025

closes #119

@vercel

vercel bot commented Mar 3, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
solver-benchmark ✅ Ready (Inspect) Visit Preview 💬 Add feedback Apr 11, 2025 9:38am

@jacek-oet jacek-oet marked this pull request as ready for review March 11, 2025 15:50
Member

@siddharth-krishna siddharth-krishna left a comment

Thanks, Jacek! A few requests, please:

  • Can we have the text columns in the table aligned left and the numerical columns aligned right?
  • "Total n. of different problems" -> "Total number of benchmark problems", "Multiple size instances" -> "Total number of benchmark size instances", "MILP Feature" -> "MILP Features"
  • Could we add a column to the end called Total that sums up the numbers across the columns for each model framework? This way we can see e.g. the total number of LPs vs MILPs
  • Can we merge table cells vertically so that e.g. the cell saying "Technique" spans the two rows containing "LP" and "MILP"?
  • And can you please double check the numbers against the screenshot in #119 ? Right now the "Multiple size instances" row has the same numbers as the row above it, for instance.
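
The requested Total column can be sketched as follows. This is only an illustration of the summing step: the framework names and counts are placeholders, not the real benchmark set, and `add_total_column` is a hypothetical helper.

```python
# Hypothetical counts per model framework; placeholder numbers only.
rows = {
    "LP": {"FrameworkA": 3, "FrameworkB": 0},
    "MILP": {"FrameworkA": 1, "FrameworkB": 5},
}


def add_total_column(table):
    """Return a copy of the table with a trailing 'Total' entry per row."""
    return {
        label: {**counts, "Total": sum(counts.values())}
        for label, counts in table.items()
    }


with_totals = add_total_column(rows)
for label, counts in with_totals.items():
    print(label, counts)
```

With the total in the last position of each row, the LP vs MILP comparison can be read directly from the final column.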

@danielelerede-oet could you please also take a look at the table this PR adds to the "Benchmark details" dashboard? Just to check if it looks okay, but also to see if any of the data would be better presented by a graph. For instance, would it be better to have a stacked bar chart for "MILP Features" with one bar per feature on the x-axis, and different colors in the bar for each model framework?

@jacek-oet
Member Author

> And can you please double check the numbers against the screenshot in #119 ? Right now the "Multiple size instances" row has the same numbers as the row above it, for instance.

I checked. For example:

Model: Tulipa

Total number of benchmark problems: 1

Total number of benchmark size instances: 6

tulipa-1_EU_investment_simple:
  Short description: European-level investment and operation model to consider integer
    investments and unit commitment variables
  Model name: Tulipa
  Version:
  Technique: MILP
  Kind of problem: Infrastructure
  Sectors: Power
  Time horizon: Single period (1 year)
  MILP features: Unit commitment
  Sizes:
  - Name: 28-24h
    Size: XS
    URL: https://raw.githubusercontent.com/jump-dev/open-energy-modeling-benchmarks/main/instances/TulipaEnergyModel_1_EU_investment_simple_24h-dc195d31e02ab753b5f047a4e754d2d387b59e15ac813c515b7fab21651779ea.mps.gz
    Spatial resolution: 28
    Temporal resolution: 24
    N. of constraints: 4746
    N. of variables:
    N. of continuous variables: 6658
    N. of integer variables: 251
    N. of binary variables: 7
  - Name: 28-52.1h
    Size: S
    URL: https://raw.githubusercontent.com/jump-dev/open-energy-modeling-benchmarks/main/instances/TulipaEnergyModel_1_EU_investment_simple_168h-a213cb9c037b9b9e0d0d38a1b7644a46407203a1f8e5f036c9e39870270c4a4f.mps.gz
    Spatial resolution: 28
    Temporal resolution: 52.1
    N. of constraints: 46592
    N. of variables:
    N. of continuous variables: 46594
    N. of integer variables: 251
    N. of binary variables: 7
  - Name: 28-13h
    Size: S
    URL: https://raw.githubusercontent.com/jump-dev/open-energy-modeling-benchmarks/main/instances/TulipaEnergyModel_1_EU_investment_simple_672h-336e9a77dd9deae1ef81069c4e3f0c4c39cbbe0bc24b4d4e3f666657cbc76ab6.mps.gz
    Spatial resolution: 28
    Temporal resolution: 13
    N. of constraints: 186368
    N. of variables:
    N. of continuous variables: 186370
    N. of integer variables: 251
    N. of binary variables: 7
  - Name: 28-4.3h
    Size: S
    URL: https://raw.githubusercontent.com/jump-dev/open-energy-modeling-benchmarks/main/instances/TulipaEnergyModel_1_EU_investment_simple_2016h-6f7e83e8d3d3d676153925691417a73265d784e70f52b3f70eaa9664921dcd85.mps.gz
    Spatial resolution: 28
    Temporal resolution: 4.3
    N. of constraints: 559104
    N. of variables:
    N. of continuous variables: 559106
    N. of integer variables: 251
    N. of binary variables: 7
  - Name: 28-2.2h
    Size: R
    URL: https://raw.githubusercontent.com/jump-dev/open-energy-modeling-benchmarks/main/instances/TulipaEnergyModel_1_EU_investment_simple_4032h-76d00dd60487dc27e60808bb6421b3091a937cfa503e37de4ef6927757f8ec11.mps.gz
    Spatial resolution: 28
    Temporal resolution: 2.2
    N. of constraints: 1118208
    N. of variables:
    N. of continuous variables: 1118210
    N. of integer variables: 251
    N. of binary variables: 7
  - Name: 28-1h
    Size: R
    URL: https://raw.githubusercontent.com/jump-dev/open-energy-modeling-benchmarks/main/instances/TulipaEnergyModel_1_EU_investment_simple_8760h-beb8fc0072ed50c7bf40d264ffb87ba24de4ae6736d55a7c0a78209f2c5883ef.mps.gz
    Spatial resolution: 28
    Temporal resolution: 1
    N. of constraints: 2429440
    N. of variables:
    N. of continuous variables: 2429442
    N. of integer variables: 251
    N. of binary variables: 7

Model: PowerModels
Total number of benchmark problems: 5

Total number of benchmark size instances: 5

pglib_opf_case162_ieee_dtc:
  Short description: System stability study based on the IEEE 17-Generator Dynamic
    Test Case with 162 buses and 17 generators
  Model name: PowerModels
  Version:
  Technique: MILP
  Kind of problem: Steady-state optimal power flow
  Sectors: Power
  Time horizon: N/A
  MILP features: Transmission switching
  Sizes:
  - Name: 162-NA
    Size: L
    URL: https://raw.githubusercontent.com/jump-dev/open-energy-modeling-benchmarks/main/instances/PowerModelsOTS_pglib_opf_case162_ieee_dtc.m-e44d1cc9578fe03e1568c393ab3924265857b7a2383e6384df4df1a820185c73.mps.gz
    Spatial resolution: 162
    Temporal resolution: NA
    N. of constraints: 1867
    N. of variables:
    N. of continuous variables: 1869
    N. of integer variables: 0
    N. of binary variables: 284
pglib_opf_case1803_snem:
  Short description: Optimal Power Flow data for the Synthetic National Electricity
    Market (SNEM) Australia - Mainland subnetwork
  Model name: PowerModels
  Version:
  Technique: LP
  Kind of problem: Steady-state optimal power flow
  Sectors: Power
  Time horizon: N/A
  MILP features: Transmission switching
  Sizes:
  - Name: 1803-NA
    Size: L
    URL: https://raw.githubusercontent.com/jump-dev/open-energy-modeling-benchmarks/main/instances/PowerModelsOTS_pglib_opf_case1803_snem.m-43f8f35813a79b40f197737028098bf5eac73ef2776f3aa2687576ad1d439541.mps.gz
    Spatial resolution: 1803
    Temporal resolution: NA
    N. of constraints: 18574
    N. of variables:
    N. of continuous variables: 18576
    N. of integer variables: 0
    N. of binary variables: 2795
pglib_opf_case1951_rte:
  Short description:
  Model name: PowerModels
  Version:
  Technique: MILP
  Kind of problem: Steady-state optimal power flow
  Sectors: Power
  Time horizon: N/A
  MILP features: Transmission switching
  Sizes:
  - Name: 1951-NA
    Size: L
    URL: https://raw.githubusercontent.com/jump-dev/open-energy-modeling-benchmarks/main/instances/PowerModelsOTS_pglib_opf_case1951_rte.m-96db1a2bb31ca33d4fe3ccb1fa30358f157cd3493cff87ae8f16fa8a6a3b1e48.mps.gz
    Spatial resolution: 1951
    Temporal resolution: NA
    N. of constraints: 17528
    N. of variables:
    N. of continuous variables: 17530
    N. of integer variables: 0
    N. of binary variables: 2596
pglib_opf_case2848:
  Short description:
  Model name: PowerModels
  Version:
  Technique: MILP
  Kind of problem: Steady-state optimal power flow
  Sectors: Power
  Time horizon: N/A
  MILP features: Transmission switching
  Sizes:
  - Name: 2848-NA
    Size: L
    URL: https://raw.githubusercontent.com/jump-dev/open-energy-modeling-benchmarks/main/instances/PowerModelsOTS_pglib_opf_case2848_rte.m-8546f681e53f8663080268242343f948670d9ddcd40ed4b2da6ab94ec1e0fc05.mps.gz
    Spatial resolution: 2848
    Temporal resolution: NA
    N. of constraints: 25505
    N. of variables:
    N. of continuous variables: 25507
    N. of integer variables: 0
    N. of binary variables: 3776
pglib_opf_case2868:
  Short description:
  Model name: PowerModels
  Version:
  Technique: MILP
  Kind of problem: Steady-state optimal power flow
  Sectors: Power
  Time horizon: N/A
  MILP features: Transmission switching
  Sizes:
  - Name: 2868-NA
    Size: L
    URL: https://raw.githubusercontent.com/jump-dev/open-energy-modeling-benchmarks/main/instances/PowerModelsOTS_pglib_opf_case2868_rte.m-bf313f2631231ef9a5301580b317b461c49901a503fd898ee35cf791f1d8bea6.mps.gz
    Spatial resolution: 2868
    Temporal resolution: NA
    N. of constraints: 25717
    N. of variables:
    N. of continuous variables: 25719
    N. of integer variables: 0
    N. of binary variables: 3808
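
A minimal sketch of how the two totals quoted above could be derived from the metadata, assuming the YAML has already been parsed into a dictionary. Only the `Model name` and `Sizes` fields are reproduced here, and `count_by_model` is a hypothetical helper, not the code in this PR.

```python
from collections import defaultdict

# Trimmed stand-in for the parsed metadata: only the fields needed for counting.
TULIPA_SIZES = ["28-24h", "28-52.1h", "28-13h", "28-4.3h", "28-2.2h", "28-1h"]
metadata = {
    "tulipa-1_EU_investment_simple": {
        "Model name": "Tulipa",
        "Sizes": [{"Name": name} for name in TULIPA_SIZES],
    },
    "pglib_opf_case162_ieee_dtc": {"Model name": "PowerModels", "Sizes": [{"Name": "162-NA"}]},
    "pglib_opf_case1803_snem": {"Model name": "PowerModels", "Sizes": [{"Name": "1803-NA"}]},
    "pglib_opf_case1951_rte": {"Model name": "PowerModels", "Sizes": [{"Name": "1951-NA"}]},
    "pglib_opf_case2848": {"Model name": "PowerModels", "Sizes": [{"Name": "2848-NA"}]},
    "pglib_opf_case2868": {"Model name": "PowerModels", "Sizes": [{"Name": "2868-NA"}]},
}


def count_by_model(meta):
    """Count benchmark problems and size instances per model framework."""
    problems = defaultdict(int)
    instances = defaultdict(int)
    for entry in meta.values():
        model = entry["Model name"]
        problems[model] += 1                     # one problem per top-level key
        instances[model] += len(entry["Sizes"])  # one instance per size entry
    return dict(problems), dict(instances)


problems, instances = count_by_model(metadata)
print(problems)   # {'Tulipa': 1, 'PowerModels': 5}
print(instances)  # {'Tulipa': 6, 'PowerModels': 5}
```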

Base automatically changed from 115-Benchmark-details-add-relative-performance-plot-from-Matthias to main March 24, 2025 09:14
…er-benchmark into 119-Benchmark-summary-table-from-slides-(or-nice-graphs)
@danielelerede-oet
Member

Hi @siddharth-krishna @jacek-oet. The logic behind the computation of "Total number of benchmark problems" and "Total number of benchmark size instances" seems to work well, but I'm not sure that the definition of "Total number of benchmark size instances" is straightforward. Would it be better to have something like "N. of different size instances developed from the same benchmark"? In that case, e.g. for the pglib ones, it would be "Total number of benchmark problems": 5 and "N. of different size instances developed from the same benchmark": 0, because all the instances are separate ones (but the way Jacek has implemented it now agrees with the current definition).
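
One possible reading of the alternative definition, sketched under an assumption: sizes are counted only for benchmarks that come in more than one size. The thread does not settle whether a six-size benchmark should then count as 6 or as 5 additional instances; this sketch counts 6. The function name is hypothetical and the size counts are taken from the metadata listing above.

```python
# benchmark name -> (model name, number of size instances)
sizes_per_benchmark = {
    "tulipa-1_EU_investment_simple": ("Tulipa", 6),
    "pglib_opf_case162_ieee_dtc": ("PowerModels", 1),
    "pglib_opf_case1803_snem": ("PowerModels", 1),
    "pglib_opf_case1951_rte": ("PowerModels", 1),
    "pglib_opf_case2848": ("PowerModels", 1),
    "pglib_opf_case2868": ("PowerModels", 1),
}


def multi_size_instances(benchmarks):
    """Sum size instances per model, skipping benchmarks with a single size."""
    totals = {}
    for model, n_sizes in benchmarks.values():
        totals.setdefault(model, 0)
        if n_sizes > 1:  # assumption: a lone size is not a "size instance"
            totals[model] += n_sizes
    return totals


print(multi_size_instances(sizes_per_benchmark))
# {'Tulipa': 6, 'PowerModels': 0}
```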

Concerning representation, it's definitely worth having a summary bar plot with the different MILP features, as Sid was suggesting, and the same goes for "Time horizon" and "Kind of problem". For sizes, it could be useful to develop something more complex, comparing the n. of constraints and variables with the spatial and time resolution (though resolution may be a varying concept depending on the modelling platform: for instance, nodes in PyPSA versus regions in TIMES, where solving a model for a single region in TIMES can be far more complex than solving a model for a single node in PyPSA, as the two concepts do not overlap).

@jacek-oet jacek-oet force-pushed the 119-Benchmark-summary-table-from-slides-(or-nice-graphs) branch from e72c4c7 to 3dfd768 Compare March 24, 2025 20:04
@danielelerede-oet
Member

Hi @siddharth-krishna @jacek-oet, here's my proposal for the table to use for the website (please ignore the numbers, it's just to show the structure I would adopt). I removed one row, "Kind of problem", as it doesn't add much about the specific modelling platform (users know they can perform both power and sector-coupled modelling on PyPSA, while they can't do power sector analyses on TEMOA). I would also reduce the detail on MILP features. Right now we have both single features and combinations, but I'd suggest we avoid this on the website and instead count a benchmark that combines two MILP features in each of the two corresponding cells (i.e. double count it).

image

@siddharth-krishna
Member

Thank you, Daniele! That's helpful, as we now have less information to present. I'm also wondering if the data is better presented as plots compared to a table? E.g. for Technique LP vs MILP:
image

If that looks good, perhaps we could have a row of such bar charts on the page, above the table that lists all the benchmarks.

@siddharth-krishna
Member

siddharth-krishna commented Apr 8, 2025

@danielelerede-oet @jacek-oet I propose a row of plots instead of Daniele's amended table from last week, something like this: (but with the legends showing on each subplot instead of all together on the right side -- I couldn't figure this out in plotly)
image

Note: I removed the MILP features plot for now, because it's hard to extract the categories Daniele proposed automatically. I'll make an issue for this, I think we may have to modify the metadata file to have comma separated values in the MILP Features: field, so that they can be extracted using code easily.
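
A minimal sketch of what the extraction could look like once the field holds comma-separated values. The benchmark names and the combined value below are hypothetical; the current metadata file may not use this format yet. A benchmark listing two features is counted once under each, in line with the double-counting idea discussed earlier in this thread.

```python
from collections import Counter


def split_features(field):
    """Split a comma-separated 'MILP features' value into clean names."""
    if not field:
        return []
    return [part.strip() for part in field.split(",") if part.strip()]


def count_features(benchmarks):
    """Count benchmarks per feature; multi-feature benchmarks count under each."""
    counts = Counter()
    for field in benchmarks.values():
        counts.update(split_features(field))
    return counts


# Hypothetical entries: benchmark name -> raw 'MILP features' value.
example = {
    "benchmark-a": "Unit commitment",
    "benchmark-b": "Transmission switching",
    "benchmark-c": "Unit commitment, Transmission switching",
    "benchmark-d": None,  # LP benchmark with no MILP features
}

counts = count_features(example)
print(dict(counts))
```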

Note also: the size plot will change soon, as we will shift to using num vars to determine size, and have 'real'ness as a separate category. So let's not spend too much time on it in this PR.

@danielelerede-oet
Member

Hi @siddharth-krishna , the graph looks great, especially with your proposed amendments.
I only notice a small problem. The y-axis label reports "Total number of benchmark instances", but that's true only for plots 1 and 4 (according to the new definitions we agreed on for the summary table).
Confusion may arise especially when looking at the numbers in plots 1 and 2, as there's no explicit distinction between "Total number of benchmark problems" and "Total n. of benchmark size instances".
My proposal would be to merge plots 1 and 2 into a stacked bar plot showing the split between LP and MILP benchmarks, and to report the "Total n. of benchmark size instances" on the y-axis for all the plots (so that the total n. of size instances for each platform is also immediately visible).
Also, the last plot would be modified to report the n. of benchmarks of each size counted according to "Total n. of benchmark size instances", thus summing up to something around 80 benchmarks instead of the 45 reported now.

@siddharth-krishna
Member

Ah yes, you're right, I forgot to mention that in my previous comment. Indeed, there was a confusion between number of benchmarks and number of benchmark instances, which is why in my current proposal I'm suggesting we always use number of benchmark instances (including sizes, so the larger number) in all the plots, to simplify things. We can have a sentence above/below the row of plots that explains this. Do you think that's okay?

People can still use the filters and the table of all benchmarks below to answer queries like "how many benchmark problems (not instances) do we have that use PyPSA and MILP".

@siddharth-krishna
Member

siddharth-krishna commented Apr 8, 2025

@jacek-oet I had a call with Daniele and we aligned on the following proposal: let's replace the current table (titled Model Distribution Matrix) with a row of plots that look like the above, but displaying the following info:

  1. A stacked bar chart showing number of instances of each model framework (on the x-axis), with each bar being split into 2 colors to show number of LP instances and number of MILP instances. (Example of a stacked bar chart: link)
  2. A stacked bar chart showing number of instances of each model framework (on the x-axis), using 2 colors to show the split into single-stage and multi-stage
  3. A bar chart that shows the number of instances in each size category. We will convert this into a stacked bar chart in a future PR that rethinks how we do size categorization.

For all plots, let's use the number of benchmark instances (including sizes) on the y-axis for simplicity.
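
The data behind the first stacked bar chart could be prepared roughly as below. This is a sketch: the record layout and `stacked_series` are hypothetical, and the records only cover the two frameworks listed earlier in this thread, using the Technique values given in their metadata. Each resulting series corresponds to one colour in the stack (in plotly, for example, one bar trace per series with `barmode="stack"`).

```python
def stacked_series(records, categories):
    """Return the x-axis frameworks and one list of y-values per category."""
    frameworks = sorted({r["framework"] for r in records})
    index = {name: i for i, name in enumerate(frameworks)}
    series = {category: [0] * len(frameworks) for category in categories}
    for r in records:
        # y-axis is the number of benchmark instances, i.e. including sizes
        series[r["technique"]][index[r["framework"]]] += r["n_sizes"]
    return frameworks, series


records = [
    {"framework": "Tulipa", "technique": "MILP", "n_sizes": 6},
    {"framework": "PowerModels", "technique": "MILP", "n_sizes": 1},
    {"framework": "PowerModels", "technique": "LP", "n_sizes": 1},
    {"framework": "PowerModels", "technique": "MILP", "n_sizes": 1},
    {"framework": "PowerModels", "technique": "MILP", "n_sizes": 1},
    {"framework": "PowerModels", "technique": "MILP", "n_sizes": 1},
]

frameworks, series = stacked_series(records, ["LP", "MILP"])
print(frameworks)  # ['PowerModels', 'Tulipa']
print(series)      # {'LP': [1, 0], 'MILP': [4, 6]}
```

The same function would serve the second chart by grouping on a single-stage vs multi-stage field instead of the technique.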

If this makes sense, please can you amend the 'Benchmark details' page as follows:

  • Move the "Benchmarks" header and the paragraph "On this page ..." to the top of the page above the filters
  • Next, show the row of plots like the above, under a subheading "Summary of Benchmark Set"
  • Then have a link "See more details ->" that takes the user to a new page that has the Model Distribution Matrix table from the current version
  • Then have a subheading "List of All Benchmarks"
  • Then show the filters and the table listing all benchmarks. Note that the filters should only apply to the table, and not to the row of plots above.
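
The last point, filters affecting only the table, amounts to computing the plot data from the full set and the table rows from the filtered set. A language-neutral sketch in Python (the page itself is written in TSX; all names and records here are hypothetical):

```python
from collections import Counter

# Hypothetical instance records; field names and values are illustrative only.
INSTANCES = [
    {"benchmark": "bench-a", "framework": "PyPSA", "technique": "LP", "size": "S"},
    {"benchmark": "bench-a", "framework": "PyPSA", "technique": "LP", "size": "L"},
    {"benchmark": "bench-b", "framework": "Tulipa", "technique": "MILP", "size": "S"},
]


def summary_counts(instances):
    """Size-category counts for the plots; always computed on the full set."""
    return Counter(inst["size"] for inst in instances)


def filtered_table(instances, filters):
    """Rows for the benchmark table; keeps instances matching every filter."""
    return [
        inst for inst in instances
        if all(inst[field] in allowed for field, allowed in filters.items())
    ]


filters = {"technique": {"MILP"}}
plots = summary_counts(INSTANCES)            # ignores the filters
table = filtered_table(INSTANCES, filters)   # respects the filters
print(dict(plots), [row["benchmark"] for row in table])
```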

…er-benchmark into 119-Benchmark-summary-table-from-slides-(or-nice-graphs)
…p comments in chart type definitions for clarity
Member

@siddharth-krishna siddharth-krishna left a comment

Thanks so much Jacek! It looks good to me on a high level. @danielelerede-oet can I request you run this branch on your laptop and take a look at the 'Benchmark details' page and the new page you get to when you click the 'See more details' button?

Jacek a few minor requests from me please:

  • Can we remove the DetailSection top bar from benchmark-details.tsx and benchmark-summary.tsx?
  • In benchmark-details, can we move the filters to be below the List of All Benchmarks header?
  • In benchmark-summary, can we have an H1 header at the top of the page saying Distribution of Model Features in Benchmark Set? (Daniele, please suggest a better title for that page if you have one!)
  • Can we have the breadcrumbs in benchmark-summary look like Benchmark Details > Feature Distribution (again, open to better suggestions here)

Thanks!

@siddharth-krishna
Member

siddharth-krishna commented Apr 11, 2025

Thanks, Jacek, it looks good. Can you please remove this "Model Distribution Matrix" header and then merge it in?
image

Daniele, if you still have any comments, please do write them here and we'll fix it in the next PR. I'd like to get this one in so that Heba can start designing the new version of this page. Thanks

@siddharth-krishna siddharth-krishna changed the title from "Website: Improvements in Benchmark Summary Table" to "Website: Add benchmark set summary slides and table" Apr 11, 2025
@siddharth-krishna siddharth-krishna enabled auto-merge (squash) April 11, 2025 09:37
@siddharth-krishna siddharth-krishna merged commit f0b99b5 into main Apr 11, 2025
4 checks passed
@siddharth-krishna siddharth-krishna deleted the 119-Benchmark-summary-table-from-slides-(or-nice-graphs) branch April 11, 2025 09:40
Development

Successfully merging this pull request may close these issues.

Benchmark summary table from slides (or nice graphs)

3 participants