Commit 847d804

Merge pull request #389 from cleophass/StringConcatenation
GCI105 StringConcatenation #Python #DLG #RulesSpecifications
2 parents 02c2d13 + 0fd66df commit 847d804

5 files changed: +88 -0 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Added

+- [#389](https://github.com/green-code-initiative/creedengo-rules-specifications/pull/389) Add a rule on Python String Concatenation
 - [#388](https://github.com/green-code-initiative/creedengo-rules-specifications/pull/388) Added rule GCI104 on Torch Tensor types
 - [#387](https://github.com/green-code-initiative/creedengo-rules-specifications/pull/387) Add rule GCI103, add specifications for a new rule on iteration method for dict in Python
 - [#386](https://github.com/green-code-initiative/creedengo-rules-specifications/pull/386) Add rule GCI102, recommending the use of pinned memory for the dataloader when transferring data from the CPU to the GPU.

RULES.md

Lines changed: 1 addition & 0 deletions
@@ -81,6 +81,7 @@ Some are applicable for different technologies.
 | GCI102 | Use pinned memory on DataLoader when using GPU | This rule applies to PyTorch data loading, where the use of pinned memory can significantly optimize data transfer between CPU and GPU. | | 🚫 | 🚫 | 🚫 | 🚀 | 🚫 | 🚫 | 🚫 |
 | GCI103 | Don't use .items() to iterate over a dictionary when only keys or values are needed | Avoid using `.items()` if you only use the key or the value, as this creates an unnecessary tuple, leading to increased memory allocation and slower execution. | |||| 🚀 ||||
 | GCI104 | DATA/AI PyTorch - Create tensors directly from Torch | In PyTorch, prefer creating tensors directly using `torch.rand`, `torch.tensor`, or other Torch methods instead of converting from NumPy arrays. Avoid using `torch.tensor(np.random.rand(...))` or similar patterns when the same result can be achieved directly with PyTorch. | | 🚫 | 🚫 | 🚫 | 🚀 | 🚫 | 🚫 | 🚫 |
+| GCI105 | Python String Concatenation | Python String Concatenation - Use `str.join()` or f-strings instead of `+=` | | 🚫 | 🚫 | 🚫 | 🚀 | 🚫 | 🚫 | 🚫 |
 | GCI203 | Detect unoptimized file formats | When it is possible, to use svg format image over other image format | | 🚧 | 🚀 | 🚀 || 🚀 | 🚀 | 🚫 |
 | GCI404 | Avoid list comprehension in iterations | Use generator comprehension instead of list comprehension in for loop declaration | | 🚫 | 🚫 | 🚫 || 🚫 | 🚫 | 🚫 |
 | GCI522 | Sobriety: Brightness Override | To avoid draining the battery, iOS and Android devices adapt the brightness of the screen depending on the environment light. | | 🚫 | 🚫 || 🚫 | 🚫 | 🚫 | 🚫 |

src/main/rules/GCI105/GCI105.json

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
{
  "title": "Python String Concatenation - Use str.join() or f-strings instead of +=",
  "type": "CODE_SMELL",
  "status": "ready",
  "remediation": {
    "func": "Constant\/Issue",
    "constantCost": "1min"
  },
  "tags": [
    "creedengo",
    "eco-design",
    "performance",
    "python",
    "string"
  ],
  "defaultSeverity": "Minor"
}
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
In Python, concatenating strings with `+=` in a loop can create many intermediate strings, which increases memory usage and degrades performance. Prefer f-strings (formatted string literals) or `str.join()` for better performance and a lower environmental impact.

== Non Compliant Code Example

[source,python]
----
city = "New York"
street = "5th Avenue"
zip_code = "10001"
address = ""
address += city + ", " + street + ", " + zip_code  # Noncompliant: inefficient string concatenation
----

In this example, the chained `+` operations build a new intermediate string object at each step before the result is appended with `+=`. When this pattern is repeated (e.g., in loops), it leads to excessive memory allocations and slower performance.
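
The loop case mentioned here can be sketched as follows. This is a minimal illustration, not code from the rule itself: the function names `build_report_concat` and `build_report_join` and the sample data are hypothetical. Both functions return the same string, but the first one extends a running string with `+=` on every iteration.

```python
def build_report_concat(lines):
    """Noncompliant pattern: extend the result with += on every iteration."""
    report = ""
    for line in lines:
        # Each pass builds `line + "\n"` and may allocate a new, longer string.
        report += line + "\n"
    return report


def build_report_join(lines):
    """Compliant pattern: format each piece, then join all pieces once."""
    return "".join(f"{line}\n" for line in lines)


sample = ["New York", "5th Avenue", "10001"]  # hypothetical sample data
assert build_report_concat(sample) == build_report_join(sample)
```

The two functions are interchangeable from the caller's point of view, so the rewrite does not change behaviour, only how the result is assembled.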

== Compliant Solution

[source,python]
----
# Using f-string for readability and performance
city = "New York"
street = "5th Avenue"
zip_code = "10001"
address = f"{city}, {street}, {zip_code}"
# or using str.join() for multiple string concatenations
parts = ["New York", "5th Avenue", "10001"]
address = ", ".join(parts)
----

These approaches avoid the repeated creation of new string objects. `str.join()` builds the final string in a single operation, and f-strings are compiled into efficient bytecode.

== Relevance Analysis

This rule applies to any Python application performing repeated or large-scale string concatenation (e.g., log generation, data serialization, HTML template generation).
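
As one hedged illustration of the HTML generation case, the sketch below collects fragments in a list and joins them once at the end. The function name `render_list` and the markup are assumptions made for this example; only `html.escape` from the standard library is relied upon.

```python
from html import escape


def render_list(items):
    """Build an HTML list by collecting fragments and joining them once."""
    parts = ["<ul>"]
    for item in items:
        # The f-string formats one fragment; no running string is extended.
        parts.append(f"  <li>{escape(item)}</li>")
    parts.append("</ul>")
    return "\n".join(parts)


print(render_list(["New York", "5th Avenue", "10001"]))
```

Collecting into a list keeps the loop free to contain conditions or nested logic, while the final string is still assembled by a single `str.join()` call.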

=== Configuration

* Processor: Intel(R) Core(TM) Ultra 5 135U, 2100 MHz, 12 cores, 14 logical processors
* RAM: 16 GB
* CO2 Emissions Measurement: Using CodeCarbon

=== Context

Two string-building techniques were benchmarked:

- **Non-compliant:** `+=` string concatenation in a loop
- **Compliant:** `str.join()` or f-string formatting

Metrics assessed:

- Execution time
- Carbon emissions
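
A comparison of this kind can be approximated with the sketch below. It is not the benchmark script used for this rule, which is not shown here: it measures execution time only, using the standard-library `timeit` module, and does not measure emissions. The names `concat_plus_equals`, `concat_join`, `measure` and the workload size `N` are hypothetical. Results depend on the interpreter and machine; CPython in particular can sometimes extend a string in place for `+=`, so the measured gap may vary.

```python
import timeit

N = 10_000  # hypothetical number of pieces per run
PIECES = [str(i) for i in range(N)]


def concat_plus_equals():
    """Non-compliant variant: += inside a loop."""
    result = ""
    for piece in PIECES:
        result += piece
    return result


def concat_join():
    """Compliant variant: a single str.join() call."""
    return "".join(PIECES)


def measure(func, repeat=5, number=20):
    """Return the best total time in seconds over `repeat` rounds of `number` calls."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


if __name__ == "__main__":
    # Both variants must produce the same string for the comparison to be fair.
    assert concat_plus_equals() == concat_join()
    print(f"+=   : {measure(concat_plus_equals):.4f} s")
    print(f"join : {measure(concat_join):.4f} s")
```

Taking the minimum over several rounds reduces the influence of background noise on the timing; an emissions tracker such as CodeCarbon would have to wrap the same calls to reproduce the carbon metric.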

=== Impact Analysis

image::concat.png[]

- *CO₂ Emissions:* Reduced by over **10×** when using `str.join()` instead of `+=`
- *Energy Efficiency:* Significantly improved due to fewer memory allocations and faster execution

These results demonstrate that even small code patterns, like string concatenation, can have a measurable impact when scaled in production environments.

== Conclusion

Replacing `+=` in string concatenation with f-strings or `str.join()`:

- Reduces memory overhead
- Improves runtime performance
- Decreases CO₂ emissions

== References
- https://wiki.python.org/moin/PythonSpeed/PerformanceTips