GCI104 AvoidCreatingTensorUsingNumpyOrNativePython #AI #Python #DLG #RulesSpecifications

cleophass · DataLabGroupe-CreditAgricole · cleophass · commit aa6cb6ae6412 · 2025-05-16T15:13:45.000+02:00
Co-authored-by: DataLabGroupe-CreditAgricole <GITHUB.DATALABGROUPE@CREDIT-AGRICOLE-SA.FR>
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -9,6 +9,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
+- Added rule GCI104 on Torch Tensor types
+
 ### Changed
 
 - Correction of various typos in rules documentations
diff --git a/RULES.md b/RULES.md
@@ -71,6 +71,8 @@ Some are applicable for different technologies.
 | GCI92    | Use string.Length instead of comparison with empty string     | Comparing a string to an empty string is unnecessary and can be replaced by a call to `string.Length` which is more performant and more readable.                                                                                                                                                                                                                                                                             |                                                                                                                                                                         | 🚫   | 🚫  | 🚫 | 🚫     | 🚫   | ✅  | 🚫   |
 | GCI93    | Return `Task` directly                                        | Consider returning a `Task` directly instead of a single `await`                                                                                                                                                                                                                                                                                                                                                              |                                                                                                                                                                         | ❓    | ❓   | ❓  | ❓      | ❓    | ✅  | ❓    |
 | GCI94    | Use orElseGet instead of orElse                               | Parameter of orElse() is evaluated, even when having a non-empty Optional. Supplier method of orElseGet passed as an argument is only executed when an Optional value isn’t present. Therefore, using orElseGet() will save computing time.                                                                                                                                                                                   | [Optimized use of Java Optional Else](https://github.com/green-code-initiative/creedengo-challenge/issues/77)                                                           | ✅    | 🚫  | 🚫 | 🚫     | 🚫   | 🚫 | 🚫   |
+| GCI104   | DATA/AI PyTorch - Create tensors directly from Torch          | In PyTorch, prefer creating tensors directly using `torch.rand`, `torch.tensor`, or other Torch methods instead of converting from NumPy arrays. Avoid using `torch.tensor(np.random.rand(...))` or similar patterns when the same result can be achieved directly with PyTorch.                                                                                                                                        |                                                                                                                                                                         | 🚫   | 🚫  | 🚫 | 🚀     | 🚫   | 🚫 | 🚫   |
 | GCI203   | Detect unoptimized file formats                               | When it is possible, to use svg format image over other image format                                                                                                                                                                                                                                                                                                                                                          |                                                                                                                                                                         | 🚧   | 🚀  | 🚀 | ✅      | 🚀   | 🚀 | 🚫   |
 | GCI404   | Avoid list comprehension in iterations                        | Use generator comprehension instead of list comprehension in for loop declaration                                                                                                                                                                                                                                                                                                                                             |                                                                                                                                                                         | 🚫   | 🚫  | 🚫 | ✅      | 🚫   | 🚫 | 🚫   |
 | GCI522   | Sobriety: Brightness Override                                 | To avoid draining the battery, iOS and Android devices adapt the brightness of the screen depending on the environment light.                                                                                                                                                                                                                                                                                                                                                  |                                                                                                                                                                         | 🚫   | 🚫  | ✅ | 🚫      | 🚫   | 🚫 | 🚫   |
diff --git a/src/main/rules/GCI104/GCI104.json b/src/main/rules/GCI104/GCI104.json
@@ -0,0 +1,17 @@
+{
+  "title": "DATA/AI PyTorch - Create tensors directly from Torch",
+  "type": "CODE_SMELL",
+  "status": "ready",
+  "remediation": {
+    "func": "Constant/Issue",
+    "constantCost": "10min"
+  },
+  "tags": [
+    "creedengo",
+    "eco-design",
+    "performance",
+    "ai",
+    "PyTorch"
+  ],
+  "defaultSeverity": "Minor"
+}
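Review note: the metadata above names the rule but the diff does not show how the pattern is detected. The following is only a minimal sketch of one possible check, written with the standard-library `ast` module. The names `TORCH_CONSTRUCTORS`, `NUMPY_ALIASES`, and `find_numpy_built_tensors` are hypothetical, the choice to also flag `torch.as_tensor` and `torch.from_numpy` is an assumption, and none of this is taken from the actual creedengo-python implementation.

```python
import ast

# Hypothetical configuration, for illustration only (not the real rule's code).
# Flagging as_tensor and from_numpy in addition to tensor is an assumption.
TORCH_CONSTRUCTORS = {"tensor", "as_tensor", "from_numpy"}
NUMPY_ALIASES = {"np", "numpy"}


def _root_name(node):
    """Return the leftmost name of a dotted expression such as np.random.rand."""
    while isinstance(node, ast.Attribute):
        node = node.value
    return node.id if isinstance(node, ast.Name) else None


def find_numpy_built_tensors(source):
    """Return the line numbers where torch.<constructor>(np.<call>(...)) appears."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_torch_constructor = (
            isinstance(func, ast.Attribute)
            and func.attr in TORCH_CONSTRUCTORS
            and _root_name(func) == "torch"
        )
        if not is_torch_constructor or not node.args:
            continue
        first_arg = node.args[0]
        # Flag only when the first argument is itself a call rooted at NumPy.
        if isinstance(first_arg, ast.Call) and _root_name(first_arg.func) in NUMPY_ALIASES:
            hits.append(node.lineno)
    return sorted(hits)
```

This sketch matches on the literal names `torch`, `np`, and `numpy` and does not resolve imports, so an aliased import such as `import numpy as xp` would go unnoticed.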
diff --git a/src/main/rules/GCI104/python/GCI104.asciidoc b/src/main/rules/GCI104/python/GCI104.asciidoc
@@ -0,0 +1,73 @@
+In PyTorch, prefer creating tensors directly using `torch.rand`, `torch.tensor`, or other Torch methods instead of converting from NumPy arrays. Avoid using `torch.tensor(np.random.rand(...))` or similar patterns when the same result can be achieved directly with PyTorch.
+
+== Non Compliant Code Example
+
+[source,python]
+----
+import torch
+import numpy as np
+
+def non_compliant_random_rand():
+    tensor = torch.tensor(np.random.rand(1000, 1000))  # NumPy builds a float64 array, then torch.tensor copies it
+----
+
+
+== Compliant Solution
+
+[source,python]
+----
+import torch
+
+def compliant_random_rand():
+    tensor = torch.rand(1000, 1000)  # created directly by PyTorch (float32 by default), no NumPy round trip
+----
+
+
+== Relevance Analysis
+
+Experiments were conducted to compare the performance and environmental impact of two tensor creation methods in PyTorch:
+
+- Using NumPy for random data generation followed by conversion to PyTorch tensor
+- Direct creation using native PyTorch tensor functions (`torch.rand`, `torch.tensor`, etc.)
+
+=== Configuration
+
+* Processor: Intel(R) Xeon(R) CPU 3.80GHz  
+* RAM: 64GB  
+* GPU: NVIDIA Quadro RTX 6000  
+* CO₂ Emissions Measurement: https://mlco2.github.io/codecarbon/[CodeCarbon]  
+* Framework: PyTorch  
+* Dataset: MNIST  
+* Model: Simple 2-layer fully connected network
+
+=== Context
+
+Two workflows were benchmarked:
+
+- *NumPy-based:* Data created using NumPy and converted to PyTorch
+- *Torch-based:* Data created natively using PyTorch tensor operations
+
+Metrics assessed:
+
+- Training execution time
+- CO₂ emissions
+- Final model accuracy
+
+=== Impact Analysis
+
+image::image.png[]
+
+- *Execution Time:* Torch-based method reduced total training time by more than **50%**
+- *Carbon Emissions:* Torch-based method lowered emissions by approximately **50%**
+- *Accuracy:* Both approaches yielded **comparable model accuracy**
+
+== Conclusion
+
+Using native PyTorch methods to create tensors:
+
+- Reduces training time by more than 50% in the benchmark above
+- Avoids the intermediate NumPy array and the extra copy into a tensor
+- Lowers carbon emissions by approximately 50% in the same benchmark
+- Maintains model performance
+
+== References
+
+- PyTorch Tensor Docs: https://pytorch.org/docs/stable/tensors.html  
+- Credit: https://github.com/AghilesAzzoug/GreenPyData
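Review note: the rule text in `GCI104.asciidoc` mentions "similar patterns" but only illustrates `np.random.rand`. As a hedged illustration, the sketch below pairs a few other NumPy constructors with PyTorch functions that produce the same shapes. The selection of functions is my own and is not taken from the rule. It also shows one behavioral difference worth knowing when migrating: NumPy defaults to `float64`, while PyTorch defaults to `float32` unless `torch.set_default_dtype` has been changed.

```python
import numpy as np
import torch

# Non-compliant: each tensor starts as a NumPy array and is then copied.
zeros_np = torch.tensor(np.zeros((2, 3)))
steps_np = torch.tensor(np.arange(5))
noise_np = torch.tensor(np.random.randn(2, 3))

# Compliant: the same shapes created directly by PyTorch.
zeros_pt = torch.zeros(2, 3)
steps_pt = torch.arange(5)
noise_pt = torch.randn(2, 3)

# The default floating-point dtypes differ (assuming PyTorch's default dtype
# has not been changed with torch.set_default_dtype).
assert zeros_np.dtype == torch.float64
assert zeros_pt.dtype == torch.float32

# Request double precision explicitly only if it is really needed.
zeros_pt64 = torch.zeros(2, 3, dtype=torch.float64)
assert torch.equal(zeros_np, zeros_pt64)

# Native creation can also place the tensor on the target device directly.
device = "cuda" if torch.cuda.is_available() else "cpu"
batch = torch.rand(1000, 1000, device=device)
```

Because of the dtype difference, a direct replacement is not always bit-for-bit equivalent; passing `dtype` explicitly keeps the intent visible.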
diff --git a/src/main/rules/GCI104/python/image.png b/src/main/rules/GCI104/python/image.png
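Review note: the benchmark script behind `image.png` is not part of this diff. For readers who want a quick local check, here is a minimal micro-benchmark sketch. It times tensor creation only, not the MNIST training run reported in the rule description, and the helper `time_creation` and its `repeats` parameter are hypothetical names chosen for illustration. Absolute numbers will vary by machine, so none are shown.

```python
import time

import numpy as np
import torch


def time_creation(make_tensor, repeats=20):
    """Return the average wall-clock seconds of one make_tensor() call."""
    make_tensor()  # warm-up call, excluded from the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        make_tensor()
    return (time.perf_counter() - start) / repeats


# NumPy-based workflow: generate with NumPy, then copy into a tensor.
numpy_based = time_creation(lambda: torch.tensor(np.random.rand(1000, 1000)))

# Torch-based workflow: generate directly with PyTorch.
torch_based = time_creation(lambda: torch.rand(1000, 1000))

print(f"NumPy then torch.tensor: {numpy_based * 1e3:.3f} ms per call")
print(f"torch.rand:              {torch_based * 1e3:.3f} ms per call")
```

Note that the two calls do not produce the same dtype (`float64` versus `float32` by default), so part of any measured gap reflects the smaller element size rather than the conversion alone.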