Commit f617019

Merge branch 'main' into PreferAppendLeft
2 parents 6b832b2 + 82b2b45 commit f617019

10 files changed

+166
-18
lines changed

CHANGELOG.md

Lines changed: 9 additions & 3 deletions
@@ -10,16 +10,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Added

 - [#381](https://github.com/green-code-initiative/creedengo-rules-specifications/pull/381) Add rule GCI 108 Prefer Append Left
-- [#400](https://github.com/green-code-initiative/creedengo-rules-specifications/pull/400) Add rule GCI535 - Prefer usage of Intl.NumberFormat
+- [#380](https://github.com/green-code-initiative/creedengo-rules-specifications/pull/380) Added rule GCI107 : DATA - Avoid Iterative Matrix Operations

 ### Changed

 ### Deleted

+## [2.4.1] - 2025-07-24
+
+### Changed
+
+- [#417](https://github.com/green-code-initiative/creedengo-rules-specifications/pull/417) Fixed a typo in the tags GCI 100 and 104.
+
 ## [2.4.0] - 2025-07-20

 ### Added
-
 - [#390](https://github.com/green-code-initiative/creedengo-rules-specifications/pull/390) Added rule GCI106 : Detect scalar sqrt usage in loops and suggest vectorized alternatives
 - [#389](https://github.com/green-code-initiative/creedengo-rules-specifications/pull/389) Add rule GCI105, Add a rule on Python String Concatenation
 - [#388](https://github.com/green-code-initiative/creedengo-rules-specifications/pull/388) Added rule GCI104 on Torch Tensor types
@@ -444,7 +449,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## Comparison List

-[unreleased](https://github.com/green-code-initiative/creedengo-rules-specifications/compare/2.4.0...HEAD)
+[unreleased](https://github.com/green-code-initiative/creedengo-rules-specifications/compare/2.4.1...HEAD)
+[2.4.1](https://github.com/green-code-initiative/creedengo-rules-specifications/compare/2.4.0...2.4.1)
 [2.4.0](https://github.com/green-code-initiative/creedengo-rules-specifications/compare/2.3.0...2.4.0)
 [2.3.0](https://github.com/green-code-initiative/creedengo-rules-specifications/compare/2.2.3...2.3.0)
 [2.2.3](https://github.com/green-code-initiative/creedengo-rules-specifications/compare/2.2.2...2.2.3)

RULES.md

Lines changed: 13 additions & 12 deletions
Large diffs are not rendered by default.

src/main/rules/GCI100/GCI100.json

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
     "performance",
     "memory",
     "ai",
-    "PyTorch"
+    "pytorch"
   ],
   "defaultSeverity": "Minor"
 }

src/main/rules/GCI104/GCI104.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
     "eco-design",
     "performance",
     "ai",
-    "PyTorch"
+    "pytorch"
   ],
   "defaultSeverity": "Minor"
 }

src/main/rules/GCI107/GCI107.json

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
{
  "title": "DATA : Avoid Iterative Matrix Operations",
  "type": "CODE_SMELL",
  "status": "ready",
  "remediation": {
    "func": "Constant\/Issue",
    "constantCost": "10min"
  },
  "tags": [
    "creedengo",
    "eco-design",
    "performance",
    "data",
    "ai",
    "vector",
    "pandas",
    "numpy"
  ],
  "defaultSeverity": "Minor"
}

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
Before going into more detail, it is important to understand how vectorization works in Python. There are two main ways to perform a calculation on an array or matrix:

* The first is to go through the elements and perform the calculation one element at a time. This is the iterative approach.
* The second is to apply the calculation to the entire array or matrix at once. This is vectorization.

Strictly speaking, operating on all elements at once is not always possible without real parallelism (using a GPU, for example). In practice, we speak of vectorization when we use the built-in functions of TensorFlow, NumPy or Pandas.

These functions still run an iterative loop, but it is executed in lower-level compiled code (C). As with built-in functions in general, this optimized low-level code runs much faster than an equivalent Python loop, and therefore emits less CO2.
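The difference between the two approaches can be illustrated on a very small case. The sketch below is not part of the rule itself: the function names `iterative_scale` and `vectorized_scale` and the sample data are assumptions chosen for illustration. It multiplies every element of a sequence by a constant, once with an explicit Python loop and once with a single NumPy expression.

```python
import numpy as np


def iterative_scale(values, factor):
    """Multiply each element by factor with an explicit Python loop."""
    scaled = []
    for value in values:
        scaled.append(value * factor)
    return scaled


def vectorized_scale(values, factor):
    """Multiply the whole array by factor in one NumPy operation."""
    return np.asarray(values) * factor


data = [1.0, 2.0, 3.0, 4.0]
loop_result = iterative_scale(data, 2.5)      # Python list
vector_result = vectorized_scale(data, 2.5)   # NumPy array

# Both approaches compute the same values: 2.5, 5.0, 7.5, 10.0
print(loop_result)
print(vector_result)
```

In the vectorized version the loop over the elements still exists, but it runs inside NumPy's compiled code instead of the Python interpreter.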

== Non compliant Code Example

[source,python]
----
# A and B are nested lists of numbers: A is m x n and B is n x p.
results = [[0 for _ in range(len(B[0]))] for _ in range(len(A))]

for i in range(len(A)):
    for j in range(len(B[0])):
        for k in range(len(B)):
            results[i][j] += A[i][k] * B[k][j]
----

== Compliant Solution

[source,python]
----
import numpy as np  # NumPy, the Python library used to manipulate numerical arrays

results = np.dot(A, B)
----
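As a sanity check, the two versions can be run side by side on a small input. The following is a minimal sketch with made-up matrices `A` and `B` (assumed here only for illustration); it also shows the `@` operator, which for two-dimensional NumPy arrays computes the same matrix product as `np.dot`.

```python
import numpy as np

# Illustrative inputs: A is 2 x 3 and B is 3 x 2.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7, 8],
     [9, 10],
     [11, 12]]

# Iterative version: triple nested Python loop.
iterative = [[0 for _ in range(len(B[0]))] for _ in range(len(A))]
for i in range(len(A)):
    for j in range(len(B[0])):
        for k in range(len(B)):
            iterative[i][j] += A[i][k] * B[k][j]

# Vectorized versions: a single NumPy call or operator.
vectorized = np.dot(A, B)
also_vectorized = np.array(A) @ np.array(B)

print(iterative)    # [[58, 64], [139, 154]]
print(vectorized)
```

All three results hold the same 2 x 2 product; only the vectorized ones avoid interpreting the inner loops in Python.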

== Relevance Analysis

The following results were obtained through local experiments.

=== Configuration

* Processor: Intel(R) Core(TM) Ultra 5 135U, 2100 MHz, 12 cores, 14 logical processors
* RAM: 16 GB
* CO2 Emissions Measurement: Using CodeCarbon

=== Context

This study is divided into 3 parts, each comparing a vectorized method with an iterative one:

* measuring the impact on a dot product between two vectors,
* measuring the impact on an outer product between two vectors,
* measuring the impact on a matrix calculation.

=== Impact Analysis

*1. Dot product:*

*Non compliant*
[source,python]
----
def iterative_dot_product(x, y):
    total = 0
    for i in range(len(x)):
        total += x[i] * y[i]
    return total
----
*Compliant*
[source,python]
----
def vectorized_dot_product(x, y):
    return np.dot(x, y)
----
image::dot.png[]
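The measurement code behind the figure is not included in this rule. As a rough, assumption-laden sketch of how the two dot-product implementations could be compared, the example below times them with the standard-library `timeit` module. It measures execution time only, not CO2 emissions (the study used CodeCarbon for that), and the vector size and repetition count are arbitrary values picked for illustration.

```python
import timeit

import numpy as np


def iterative_dot_product(x, y):
    total = 0
    for i in range(len(x)):
        total += x[i] * y[i]
    return total


def vectorized_dot_product(x, y):
    return np.dot(x, y)


# Hypothetical input size, chosen only for illustration.
rng = np.random.default_rng(seed=0)
x = rng.random(10_000)
y = rng.random(10_000)

# Time several runs of each implementation on the same inputs.
iterative_seconds = timeit.timeit(lambda: iterative_dot_product(x, y), number=20)
vectorized_seconds = timeit.timeit(lambda: vectorized_dot_product(x, y), number=20)

print(f"iterative:  {iterative_seconds:.4f} s for 20 runs")
print(f"vectorized: {vectorized_seconds:.4f} s for 20 runs")
```

Absolute timings depend on the machine and the NumPy build, so the printed values are only meaningful relative to each other.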

*2. Outer product:*

*Non compliant*
[source,python]
----
def iterative_outer_product(x, y):
    o = np.zeros((len(x), len(y)))
    for i in range(len(x)):
        for j in range(len(y)):
            o[i][j] = x[i] * y[j]
    return o
----
*Compliant*
[source,python]
----
def vectorized_outer_product(x, y):
    return np.outer(x, y)
----
image::outer.png[]
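For readers less familiar with the outer product, here is a small worked sketch on made-up vectors (the values are assumptions for illustration). It checks that the double loop and `np.outer` agree, and shows that the same result can also be written with NumPy broadcasting.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])

# Iterative version: fill the result one cell at a time.
looped = np.zeros((len(x), len(y)))
for i in range(len(x)):
    for j in range(len(y)):
        looped[i, j] = x[i] * y[j]

# Vectorized versions: a dedicated function, or broadcasting a
# column vector (3 x 1) against a row vector (1 x 2).
vectorized = np.outer(x, y)
broadcast = x[:, np.newaxis] * y[np.newaxis, :]

print(vectorized)
# [[10 20]
#  [20 40]
#  [30 60]]
```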

*3. Matrix product:*

*Non compliant*
[source,python]
----
def iterative_matrix_product(A, B):
    # Start from a zero matrix with one row per row of A and one column per column of B.
    results = [[0 for _ in range(len(B[0]))] for _ in range(len(A))]
    for i in range(len(A)):
        for j in range(len(B[0])):
            for k in range(len(B)):
                results[i][j] += A[i][k] * B[k][j]
    return results
----
*Compliant*
[source,python]
----
def vectorized_matrix_product(A, B):
    return np.dot(A, B)
----
image::matrix.png[]

=== Conclusion

The results show that the vectorized method is significantly faster than the iterative method and emits less CO2. This is a clear example of how using built-in functions can lead to more efficient code, in terms of both performance and environmental impact.

=== References

https://sciresol.s3.us-east-2.amazonaws.com/IJST/Articles/2024/Issue-24/IJST-2024-914.pdf

https://arxiv.org/pdf/2308.01269

https://www.db-thueringen.de/servlets/MCRFileNodeServlet/dbt_derivate_00062165/ilm1-2024200012.pdf

src/main/rules/GCI96/python/GCI96.asciidoc

Lines changed: 2 additions & 1 deletion
@@ -37,12 +37,14 @@ Local experiments were conducted to assess the environmental impact of reading C
 === Context

 We generated CSV files with the following row sizes:
+
 * 1,000
 * 10,000
 * 100,000
 * 1,000,000

 Each file contains 5 columns (`A`, `B`, `C`, `D`, `E`). We measured the carbon emissions required to read:
+
 * 1 column
 * 2 columns
 * 3 columns
@@ -71,4 +73,3 @@ This is especially critical when working with large datasets or in environments
 == References
 https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
 https://medium.com/@amit25173/what-is-usecols-in-pandas-7a6a43885f4b
-
