48 changes: 48 additions & 0 deletions src/main/rules/GCI74/python/GCI74.asciidoc
@@ -21,3 +21,51 @@ public void foo() {
...
}
----
== Interesting Sources

* https://www.kdnuggets.com/the-essential-guide-to-sql-execution-order[The Essential Guide to SQL Execution Order]
* https://minervadb.xyz/why-select-from-is-bad-for-sql-performance[Why SELECT * FROM is Bad for SQL Performance]
* https://www.baeldung.com/sql/select-all-columns-best-practice[Best Practices for Selecting All Columns in SQL]

== Configuration

* SQLite database: 5-6 GB
* Processor: Intel(R) Core(TM) Ultra 5 135U, 2100 MHz, 12 cores, 14 logical processors
* RAM: 16 GB
* CO2 Emission measurement: Using https://codecarbon.io/[CodeCarbon]
* Memory usage measurement: Using https://psutil.readthedocs.io/en/latest/[psutil]

== Impact Analysis

We investigated how the number of selected columns correlates with carbon emissions.

The query structure used was:

[source,sql]
----
SELECT {selected_cols} FROM my_table WHERE col10 = '{random_word}'
----

The `WHERE` clause was included to make the query more complex and the measurements more meaningful.

We recorded carbon emissions while varying the number of selected columns.
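
As a rough illustration of how the query template above could be filled in for different column counts, here is a minimal sketch using Python's standard `sqlite3` module. The helper names (`build_query`, `time_query`), the in-memory table, and the generated words are assumptions made for this example; the original benchmark ran against a 5-6 GB database and measured emissions with CodeCarbon, which this sketch does not reproduce.

```python
import sqlite3
import time


def build_query(columns):
    """Return the benchmark query for a column list (None means SELECT *)."""
    selected = "*" if columns is None else ", ".join(columns)
    # The filter value is bound as a parameter instead of being pasted into the SQL.
    return f"SELECT {selected} FROM my_table WHERE col10 = ?"


def time_query(conn, columns, word):
    """Run the query once and return (elapsed seconds, fetched rows)."""
    start = time.perf_counter()
    rows = conn.execute(build_query(columns), (word,)).fetchall()
    return time.perf_counter() - start, rows


# Hypothetical stand-in data: 10 text columns, 1000 rows.
conn = sqlite3.connect(":memory:")
all_cols = [f"col{i}" for i in range(1, 11)]
conn.execute(f"CREATE TABLE my_table ({', '.join(c + ' TEXT' for c in all_cols)})")
data = [tuple(f"w{(r + i) % 5}" for i in range(10)) for r in range(1000)]
conn.executemany(f"INSERT INTO my_table VALUES ({', '.join('?' * 10)})", data)

# Vary the number of selected columns, then finish with SELECT *.
for n in (1, 5, 10):
    elapsed, result = time_query(conn, all_cols[:n], "w0")
    print(f"{n} columns: {len(result)} rows in {elapsed:.6f}s")

elapsed, result = time_query(conn, None, "w0")
print(f"SELECT *: {len(result)} rows in {elapsed:.6f}s")
```

On a table this small the timings are mostly noise; the sketch only shows the shape of the experiment, not its results.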

image::carbon_emissions_graph.png[Carbon Emissions Graph]

After the first query, carbon emissions decreased slightly due to SQL indexing and then remained relatively stable. For the last query, which used `SELECT *`, carbon emissions increased because the query no longer used the index.
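
One way to check how SQLite answers a given query is `EXPLAIN QUERY PLAN`. The sketch below assumes an index on `col10` named `idx_col10` and a three-column table; neither is stated in the measurements above, so treat both as hypothetical. In SQLite, a query whose selected columns are all stored in the index can be served from the index alone (a covering index), whereas `SELECT *` also has to read the table rows.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail strings from EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last field of each plan row.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 TEXT, col2 TEXT, col10 TEXT)")
# Assumed index; the benchmark's actual index definition is not given.
conn.execute("CREATE INDEX idx_col10 ON my_table (col10)")

narrow_plan = query_plan(conn, "SELECT col10 FROM my_table WHERE col10 = ?", ("x",))
wide_plan = query_plan(conn, "SELECT * FROM my_table WHERE col10 = ?", ("x",))

print("narrow:", narrow_plan)
print("wide:  ", wide_plan)
```

Comparing the two plans for the real schema would show whether the index is skipped entirely for `SELECT *` or is still used for the lookup but no longer covers the query.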

Similar trends were observed for execution time and memory usage:

image::Memory_time_SQL_measurement.png[Memory and Time Measurements]
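
The memory comparison can be approximated with the standard library alone. The following sketch uses `tracemalloc` as a stand-in for psutil, so it only counts memory allocated by Python objects (the fetched rows), not the whole process. The helper name `peak_python_bytes`, the table contents, and the row count are assumptions for illustration.

```python
import sqlite3
import tracemalloc


def peak_python_bytes(conn, sql, params=()):
    """Run a query and return (peak traced bytes, row count) for fetching all rows."""
    tracemalloc.start()
    rows = conn.execute(sql, params).fetchall()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak, len(rows)


# Hypothetical stand-in data: every row matches the filter on col10.
conn = sqlite3.connect(":memory:")
cols = [f"col{i}" for i in range(1, 11)]
conn.execute(f"CREATE TABLE my_table ({', '.join(c + ' TEXT' for c in cols)})")
data = [tuple(f"word-{r}-{i}" for i in range(9)) + ("target",) for r in range(5000)]
conn.executemany(f"INSERT INTO my_table VALUES ({', '.join('?' * 10)})", data)

narrow_peak, narrow_rows = peak_python_bytes(
    conn, "SELECT col1 FROM my_table WHERE col10 = ?", ("target",)
)
wide_peak, wide_rows = peak_python_bytes(
    conn, "SELECT * FROM my_table WHERE col10 = ?", ("target",)
)

print(f"1 column: {narrow_peak} bytes peak for {narrow_rows} rows")
print(f"SELECT *: {wide_peak} bytes peak for {wide_rows} rows")
```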

== Conclusion

The number of selected columns has a direct impact on:

1. Carbon emissions
2. Execution time
3. Memory usage

Additionally, for security and readability reasons, it is preferable to list column names explicitly rather than use `SELECT *`.
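
The security and readability point can be illustrated with a small sketch: when a column is added later, `SELECT *` silently starts returning it, while an explicit column list keeps returning the same shape. The `users` table and the `password_hash` column are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

star_before = conn.execute("SELECT * FROM users").fetchone()
explicit_before = conn.execute("SELECT id, name FROM users").fetchone()

# A later migration adds a sensitive column.
conn.execute("ALTER TABLE users ADD COLUMN password_hash TEXT DEFAULT 'secret'")

star_after = conn.execute("SELECT * FROM users").fetchone()
explicit_after = conn.execute("SELECT id, name FROM users").fetchone()

print("SELECT * before/after:     ", star_before, star_after)
print("explicit list before/after:", explicit_before, explicit_after)
```

Code that unpacks `SELECT *` rows by position would now break or expose the new column, whereas the explicit query is unaffected.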

Selecting only necessary columns also results in lighter network traffic.