Commit b9abd8c

committed: additional information using previous studies

1 parent aac0e8e commit b9abd8c

File tree

3 files changed: +45 −6 lines changed

src/main/rules/GCI404/python/GCI404.asciidoc

Lines changed: 15 additions & 1 deletion
@@ -61,6 +61,20 @@ image::carbone.png[]

 For both metrics, the bigger the list, the greater the gain.

+=== Additional Study
+
+A complementary benchmark from Creedengo Challenge issue #113 further supports the recommendation to avoid list comprehensions in loop declarations.
+
+In a controlled, containerized test:
+
+* The "bad" implementation using a list comprehension consumed 4,493,365,500 bytes (~4.19 GB).
+* The "good" implementation using a generator expression consumed 4,478,423,217 bytes (~4.17 GB).
+* Memory savings: 14,942,283 bytes (~14.24 MB).
+
+Credit: https://github.com/green-code-initiative/creedengo-challenge/issues/113
+
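The contrast the study measures can be reproduced in miniature with Python's standard `tracemalloc` module. This is a hedged sketch, not the containerized benchmark from issue #113; the function names and the workload (summing squares) are illustrative:

```python
# Minimal sketch: peak memory of iterating over a list comprehension
# vs. a generator expression in a for-loop (illustrative workload).
import tracemalloc


def total_bad(n):
    # Non-compliant: the list comprehension materializes all n squares at once.
    total = 0
    for square in [i * i for i in range(n)]:
        total += square
    return total


def total_good(n):
    # Compliant: the generator expression yields one square at a time.
    total = 0
    for square in (i * i for i in range(n)):
        total += square
    return total


def peak_bytes(func, n):
    # Measure the peak memory allocated while the function runs.
    tracemalloc.start()
    func(n)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


n = 1_000_000
bad_peak = peak_bytes(total_bad, n)
good_peak = peak_bytes(total_good, n)
print(f"list comprehension peak:   {bad_peak:,} bytes")
print(f"generator expression peak: {good_peak:,} bytes")
```

Both variants compute the same result; only the peak memory differs, and the gap grows with `n`, which matches the trend reported above.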
 === Conclusion

 Our analysis clearly demonstrates that replacing list comprehensions with generator expressions in Python for-loops offers substantial benefits in terms of both memory efficiency and environmental impact. As the data size increases, the advantages become increasingly significant.
@@ -69,4 +83,4 @@ Our analysis clearly demonstrates that replacing list comprehensions with genera

 Source: https://github.com/green-code-initiative/creedengo-rules-specifications/pull/152

-https://docs.python.org/3/howto/functional.html#generator-expressions-and-list-comprehensions
+https://docs.python.org/3/howto/functional.html#generator-expressions-and-list-comprehensions

src/main/rules/GCI72/python/GCI72.asciidoc

Lines changed: 30 additions & 5 deletions
@@ -25,24 +25,34 @@ def foo():
 ----

 == Relevance Analysis

-The following results were obtained through local experiments.
+The following insights are derived from both local experiments and the Redgate study "Comparing Multiple Rows Insert vs Single Row Insert".

 === Configuration

-* SQLite Database: 5-6 GB
-* Processor: Intel(R) Core(TM) Ultra 5 135U, 2100 MHz, 12 cores, 14 logical processors
+* SQLite Database: 56 GB (local test)
+* Processor: Intel(R) Core(TM) Ultra 5 135U, 12 cores, 14 threads
 * RAM: 16 GB
-* CO2 Emissions Measurement: Using CodeCarbon
+* CO2 Emissions Measurement: CodeCarbon
+* Additional Reference System (Redgate):
+** SQL Server 2008 R2
+** Database and client application: Lenovo ThinkCentre M90, Windows XP

 === Context

 This practice can significantly degrade performance, especially when processing large datasets or making repetitive database calls. By opting for batch processing instead of executing queries in loops, developers can improve overall system efficiency and reduce the carbon footprint of their applications.

+The Redgate study demonstrated that **batch processing can substantially outperform row-by-row operations**, particularly in data-load scenarios. Even with optimized systems such as SSIS or high-speed disks, row-level operations remain significantly slower and more resource-intensive.
+
+These results align with local benchmarks in Python using SQLite.
+
 === Test Execution

-The performance analysis was conducted by executing 1000 queries for both the non-compliant and compliant solutions. For the non-compliant solution, each query was executed individually within a loop. For the compliant solution, a batch query with the same 1000 queries was executed.
+The local benchmark compared:
+
+* 1000 individual `SELECT` queries executed in a loop.
+* A single batched `SELECT` query using `IN (...)`.
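This comparison can be sketched with Python's built-in `sqlite3` module. The table and column names here are illustrative, not taken from the original experiment, and 500 ids are used instead of 1000 because very long `IN (...)` lists can hit SQLite's bound-parameter limit on older builds:

```python
# Minimal sketch: N individual SELECTs in a loop vs. one batched SELECT.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 501)],
)

ids = list(range(1, 501))

# Non-compliant: one round-trip per id, 500 queries in total.
names_loop = []
for item_id in ids:
    row = conn.execute(
        "SELECT name FROM items WHERE id = ?", (item_id,)
    ).fetchone()
    names_loop.append(row[0])

# Compliant: a single batched query using IN (...).
placeholders = ",".join("?" * len(ids))
rows = conn.execute(
    f"SELECT id, name FROM items WHERE id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()
names_batch = [name for _, name in rows]

assert names_loop == names_batch  # same results, far fewer statements
```

Both approaches return identical data; the batched form simply replaces per-row statement overhead with one prepared statement.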

4453
=== Impact Analysis
4554

55+
*Local benchmark results:*
4656
[cols="1,1,1", options="header"]
4757
|===
4858
|Metric |Compliant Solution |Non-compliant Solution
@@ -53,12 +63,27 @@ The performance analysis was conducted by executing 1000 queries for both the no

 *Converter: https://impactco2.fr/outils/comparateur

+*Redgate study results:*
+
+image::image.png[width=600, align="center", alt="Redgate study results"]
+
+[cols="1,1,1", options="header"]
+|===
+|Insert Method |Execution Time (1M rows) |Relative Performance
+|Single-row insert in loop |57 seconds |Baseline (slowest)
+|Batch insert (multi-row) |9 seconds |6.3× faster
+|===
+
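The same contrast on the insert side can be sketched in Python with `sqlite3`'s `executemany`. This is an illustrative stand-in, not Redgate's SQL Server setup; the schema and row count are assumptions:

```python
# Minimal sketch: per-row INSERT (with a commit each time) vs. one batch insert.
import sqlite3
import time

rows = [(i, i * i) for i in range(10_000)]


def insert_one_by_one(rows):
    # Non-compliant: one statement and one commit per row.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, square INTEGER)")
    for row in rows:
        conn.execute("INSERT INTO t VALUES (?, ?)", row)
        conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return count


def insert_batch(rows):
    # Compliant: one batched statement, one commit.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, square INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return count


t0 = time.perf_counter()
insert_one_by_one(rows)
t_loop = time.perf_counter() - t0

t0 = time.perf_counter()
insert_batch(rows)
t_batch = time.perf_counter() - t0

print(f"loop: {t_loop:.3f}s  batch: {t_batch:.3f}s")
```

The absolute numbers depend on the engine and hardware, but the direction matches the table above: the batched path avoids per-row statement and commit overhead.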
 === Conclusion

 The performance analysis conducted in this study measures only the execution time and carbon emissions of the Python code executing the queries; it does not include emissions from database processing.

 The results show that the compliant solution, which avoids SQL queries in loops, is more efficient in terms of both execution time and carbon emissions. By adopting batch query processing and avoiding queries in loops, developers can improve application performance and reduce their carbon footprint.

 === References
+https://www.red-gate.com/simple-talk/databases/sql-server/performance-sql-server/comparing-multiple-rows-insert-vs-single-row-insert-with-three-data-load-methods/
+
 :hide-uri-scheme:
 https://blogs.oracle.com/sql/post/avoid-writing-sql-inside-loops