- Purpose
- Chat Summary
- Project Structure
- Usage
- Functions
- Data Cleaning
- Visualization
- Contributing
- License
This repository serves as a demonstration of how generative AI can assist in improving code quality, particularly in the context of R data analysis and visualization projects. The project highlights various improvements made to the code, including refactoring for efficiency, adopting coding conventions, and enhancing documentation.
During a chat session, we discussed and improved the code and documentation for this project. Notable changes and enhancements made during the chat include:
- Refactoring code to use more efficient vectorized operations.
- Adopting camel case naming conventions for variables and functions.
- Modifying the `savePlots` function to exclude `NULL` entries for months without data.
- Naming elements of the `savedPlots` list after month names for clarity.
- Providing detailed documentation for the project, including function descriptions and project structure.
- R/
  - analyze_data.R # R script for data analysis
- data/
  - airquality.csv # Air quality dataset
- output/
  - plots/ # Directory for saved plot images
- .gitignore # Git ignore file
- Clone the repository to your local machine.
- Ensure you have R and the required packages (ggplot2, tidyverse) installed.
- Run the `analyze_data.R` script to analyze the data and generate plots: `Rscript R/analyze_data.R`

This function calculates monthly averages for Solar Radiation in the given data.
- `data`: Input data frame containing the dataset.
- Returns a data frame with monthly averages.
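
As a rough illustration of the monthly-average calculation described above, the following is a minimal sketch rather than the script's actual code. The function name `monthlyAverageSketch` is hypothetical, and the column names `Month` and `Solar.R` are assumed to match R's built-in `airquality` dataset, which stands in here for `data/airquality.csv`.

```r
# Hypothetical helper: the name and the column names (Month, Solar.R)
# are assumptions, not taken from analyze_data.R.
monthlyAverageSketch <- function(data) {
  # The formula interface of aggregate() drops rows with NA in
  # Solar.R or Month by default (na.action = na.omit).
  aggregate(Solar.R ~ Month, data = data, FUN = mean)
}

# Built-in airquality used as a stand-in for data/airquality.csv
monthlyAverages <- monthlyAverageSketch(airquality)
print(monthlyAverages)
```

The result has one row per month, with the month number in `Month` and the mean in `Solar.R`.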
This function calculates monthly correlations between Ozone and Solar Radiation in the given data.
- `data`: Input data frame containing the dataset.
- Returns a data frame with monthly correlations.
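
One possible way to compute per-month correlations is sketched below. It is not necessarily how the script does it: `monthlyCorrelationSketch` is a hypothetical name, the columns `Month`, `Ozone`, and `Solar.R` are assumed to follow the built-in `airquality` dataset, and Pearson correlation (the default of `cor()`) is an assumption.

```r
# Hypothetical helper: name, column names, and correlation method
# are assumptions, not taken from analyze_data.R.
monthlyCorrelationSketch <- function(data) {
  # Split the data frame into one piece per month
  byMonth <- split(data, data$Month)
  # Correlate Ozone with Solar.R within each month, ignoring
  # rows where either value is missing
  correlations <- vapply(
    byMonth,
    function(d) cor(d$Ozone, d$Solar.R, use = "complete.obs"),
    numeric(1)
  )
  data.frame(
    Month = as.integer(names(correlations)),
    Correlation = unname(correlations)
  )
}

# Built-in airquality used as a stand-in for data/airquality.csv
monthlyCorrelations <- monthlyCorrelationSketch(airquality)
print(monthlyCorrelations)
```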
This function generates scatter plots for each month based on Solar Radiation and Ozone data. Plots are saved as both image files and objects.
- `data`: Input data frame containing the dataset.
- `filePrefix`: A string to be used as a prefix for saved image files.
- Returns a list of scatter plots named after the corresponding month.
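
The behaviour described above (one scatter plot per month, `NULL` entries dropped, list elements named after months) could look roughly like the sketch below. It is an illustration under assumptions, not the repository's `savePlots` implementation: the name `savePlotsSketch`, the `outputDir` parameter, the file naming pattern, and the image size are all hypothetical, and the columns are assumed to match the built-in `airquality` dataset.

```r
library(ggplot2)

# Hypothetical sketch: the outputDir argument, file naming pattern,
# and image dimensions are assumptions.
savePlotsSketch <- function(data, filePrefix,
                            outputDir = file.path("output", "plots")) {
  dir.create(outputDir, recursive = TRUE, showWarnings = FALSE)

  # Try every calendar month; months without data yield NULL
  plots <- lapply(1:12, function(m) {
    monthData <- data[data$Month == m, , drop = FALSE]
    if (nrow(monthData) == 0) {
      return(NULL)
    }
    p <- ggplot(monthData, aes(x = Solar.R, y = Ozone)) +
      geom_point() +
      labs(title = month.name[m])
    # Save the plot as an image file, e.g. <prefix>_May.png
    ggsave(
      file.path(outputDir, paste0(filePrefix, "_", month.name[m], ".png")),
      plot = p, width = 6, height = 4
    )
    p
  })

  # Name elements after months, then drop the NULL entries
  names(plots) <- month.name
  Filter(Negate(is.null), plots)
}

# Example call writing to a temporary directory
savedPlots <- savePlotsSketch(na.omit(airquality), "scatter",
                              outputDir = tempdir())
names(savedPlots)
```

Looping over all twelve months and filtering afterwards keeps the month-to-name mapping simple; iterating only over the months present in the data would avoid the `NULL` entries altogether.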
The script removes rows with missing values (NA) before analysis.
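
A minimal sketch of this cleaning step, assuming the script relies on something like base R's `na.omit()` (the source does not say which function it uses). The built-in `airquality` dataset stands in for `data/airquality.csv`, and the variable names are illustrative.

```r
# Built-in airquality used as a stand-in for data/airquality.csv
rawData <- airquality

# na.omit() drops every row that has an NA in any column
cleanData <- na.omit(rawData)

# Number of rows removed because at least one value was missing
droppedRows <- nrow(rawData) - nrow(cleanData)
message("Removed ", droppedRows, " incomplete rows")
```

Note that this removes a row if any column is missing, not only `Ozone` or `Solar.R`, so restricting the check to the columns actually analyzed may retain more data.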
Plots are saved as image files in the output/plots/ directory. Additionally, plots are stored as objects in the savedPlots list.
Contributions to this project are welcome! Feel free to submit pull requests or open issues.
This project is licensed under the MIT License - see the LICENSE file for details.