src/main/rules/GCI404/python/GCI404.asciidoc
Lines changed: 15 additions & 1 deletion
@@ -61,6 +61,20 @@ image::carbone.png[]
For both metrics, the bigger the list, the greater the gain.
+
=== Additional Study
+
+
A complementary benchmark from the Creedengo Challenge Issue #113 further supports the recommendation to avoid list comprehensions in loop declarations.
+
+
In a controlled containerized test:
+
+
The "bad" implementation using a list comprehension consumed 4,493,365,500 bytes (~4.18 GiB).
+
+
The "good" implementation using a generator expression consumed 4,478,423,217 bytes (~4.17 GiB).
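The pattern being compared, and the size of the reported gap, can be sketched as follows. This is a minimal illustration, not the benchmark code from the issue: the function names, the `i * 2` workload, and the use of `tracemalloc` are assumptions made here for demonstration; only the two byte counts come from the study.

```python
import tracemalloc


def total_with_list(n):
    # Non-compliant: the list comprehension builds all n items before the loop starts.
    total = 0
    for value in [i * 2 for i in range(n)]:
        total += value
    return total


def total_with_generator(n):
    # Compliant: the generator expression yields one item at a time.
    total = 0
    for value in (i * 2 for i in range(n)):
        total += value
    return total


def peak_bytes(func, n):
    # Peak Python-level allocation observed by tracemalloc while func(n) runs.
    tracemalloc.start()
    try:
        func(n)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Byte counts reported in the study (the only figures taken from the source).
BAD_BYTES = 4_493_365_500
GOOD_BYTES = 4_478_423_217
saved_bytes = BAD_BYTES - GOOD_BYTES
saved_percent = 100 * saved_bytes / BAD_BYTES

if __name__ == "__main__":
    n = 200_000  # hypothetical size, chosen only for the demonstration
    print("list peak:", peak_bytes(total_with_list, n))
    print("generator peak:", peak_bytes(total_with_generator, n))
    print(f"reported saving: {saved_bytes:,} bytes ({saved_percent:.2f} %)")
```

The reported figures differ by 14,942,283 bytes, roughly 0.33 % of the list-comprehension total. Numbers from the local `tracemalloc` measurement depend on the interpreter and on `n`, and are not comparable to the containerized test.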
Our analysis clearly demonstrates that replacing list comprehensions with generator expressions in Python for-loops offers substantial benefits in terms of both memory efficiency and environmental impact. As the data size increases, the advantages become increasingly significant.
@@ -69,4 +83,4 @@ Our analysis clearly demonstrates that replacing list comprehensions with genera
** Database and client application : Lenovo ThinkCentre M90, Windows XP
=== Context
This practice can significantly degrade performance, especially when processing large datasets or making repetitive database calls. By opting for batch processing instead of executing queries in loops, developers can improve overall system efficiency and reduce the carbon footprint of their applications.
+
The Redgate study demonstrated that **batch processing can outperform row-by-row operations by several orders of magnitude**, particularly in data load scenarios. Even with optimized systems like SSIS or high-speed disks, row-level operations remain significantly slower and more resource-intensive.
+
+
These results align with local benchmarks in Python using SQLite.
+
=== Test Execution
-
The performance analysis was conducted by executing 1000 queries for both the non-compliant and compliant solutions. For the non-compliant solution, each query was executed individually within a loop. For the compliant solution, a batch query with the same 1000 queries was executed.
+
The local benchmark compared:
+
- 1000 individual `SELECT` queries executed in a loop.
+
- A single batched `SELECT` query using `IN (...)`.
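The two variants compared above might look like the following in Python with the standard `sqlite3` module. This is a sketch under assumptions, not the benchmark script itself: the `items` table, its columns, and the function names are hypothetical, and the example uses 500 ids instead of 1000 because SQLite limits the number of bound parameters per statement (older builds default to 999).

```python
import sqlite3


def build_db(row_count):
    # In-memory database with a hypothetical `items` table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO items (id, name) VALUES (?, ?)",
        ((i, f"item-{i}") for i in range(row_count)),
    )
    conn.commit()
    return conn


def fetch_in_loop(conn, ids):
    # Non-compliant: one SELECT statement per id.
    rows = []
    for item_id in ids:
        row = conn.execute(
            "SELECT id, name FROM items WHERE id = ?", (item_id,)
        ).fetchone()
        if row is not None:
            rows.append(row)
    return rows


def fetch_batched(conn, ids):
    # Compliant: a single SELECT using IN (...), one placeholder per id.
    # Very long id lists can exceed SQLite's bound-parameter limit.
    placeholders = ", ".join("?" for _ in ids)
    query = f"SELECT id, name FROM items WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(query, ids).fetchall()


if __name__ == "__main__":
    conn = build_db(1000)
    wanted = list(range(0, 1000, 2))  # 500 ids, in ascending order
    print(fetch_in_loop(conn, wanted) == fetch_batched(conn, wanted))
    conn.close()
```

Only the placeholders are interpolated into the SQL string; the id values themselves are still passed as bound parameters.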
image::image.png[width=600, align="center", alt="Redgate study results"]
+
+
+
[cols="1,1,1", options="header"]
+
|===
+
|Insert Method |Execution Time (for 1M rows) |Relative Performance
+
|Single-row insert in loop |57 seconds |Baseline (slowest)
+
|Batch insert (multi-row) |9 seconds |6.3× faster
+
|===
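The two insert strategies in the table can be approximated locally with `sqlite3`. The sketch below is an assumption-laden illustration and not the setup used in the Redgate study: the `measurements` table and function names are hypothetical, and `executemany` stands in for a multi-row batch insert by reusing one prepared statement for all rows.

```python
import sqlite3
import time


def make_table():
    # Fresh in-memory database with a hypothetical `measurements` table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE measurements (id INTEGER PRIMARY KEY, value REAL)")
    return conn


def insert_row_by_row(conn, rows):
    # Non-compliant: one INSERT call per row, issued from a Python loop.
    for row in rows:
        conn.execute("INSERT INTO measurements (id, value) VALUES (?, ?)", row)
    conn.commit()


def insert_batched(conn, rows):
    # Compliant: a single executemany call handles the whole batch.
    conn.executemany("INSERT INTO measurements (id, value) VALUES (?, ?)", rows)
    conn.commit()


def timed(func, rows):
    # Returns elapsed seconds and the number of rows actually stored.
    conn = make_table()
    start = time.perf_counter()
    func(conn, rows)
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM measurements").fetchone()[0]
    conn.close()
    return elapsed, count


if __name__ == "__main__":
    data = [(i, i * 0.5) for i in range(100_000)]  # hypothetical sample size
    print("row by row:", timed(insert_row_by_row, data))
    print("batched:", timed(insert_batched, data))
```

Timings from this sketch depend on the machine and the SQLite build, so they should not be expected to reproduce the 57-second and 9-second figures in the table.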
+
=== Conclusion
The performance analysis conducted in this study measures only the execution time and carbon emissions of the Python code executing the queries. It does not include emissions due to database processing.
The results show that the compliant solution, which avoids SQL queries in loops, is more efficient in terms of execution time and carbon emissions. Developers are encouraged to use batch query processing whenever possible to improve application performance and reduce their carbon footprint.