Throughout the entire project analysis
- Project folder follows industry standards in terms of structure and naming conventions.
01 Project Brief
- Detailed description of the project, with the original data sets and data dictionaries.
02 Scripts
- Analysis has been conducted using Jupyter notebooks and the Anaconda distribution.
- Analysis has been conducted using Python and relevant libraries (pandas, NumPy, os, Matplotlib, SciPy, and seaborn).
- Python scripts are clean and easy to follow, with headings and contents lists.
- All code is consistent (e.g., in the use of quotation marks and spaces) and includes descriptive comments.
- All column names are self-explanatory.
Note
Each script description below lists a technique only where it is first used. The same code is reused in later scripts from that point onward.
4.02
- All required libraries have been successfully installed and imported into each script.
4.03
- Descriptive checks, such as inspecting the top and bottom rows of the dataframe, have been conducted after each data import.
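The check described above can be sketched roughly as follows. The dataframe and its column names are hypothetical stand-ins, not the project's actual data.

```python
import pandas as pd

# Hypothetical stand-in for a freshly imported data set.
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5, 6],
    "order_number": [1, 2, 1, 1, 2, 3],
})

top = df.head(3)     # first three rows
bottom = df.tail(3)  # last three rows

print(top)
print(bottom)
```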
4.04
- Whenever a dataframe is altered, checks have been conducted to determine its shape and basic statistics.
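As a minimal sketch of this kind of check, assuming a small made-up dataframe with illustrative column names, the shape and summary statistics can be inspected right after a change such as dropping a column:

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration only.
df = pd.DataFrame({
    "order_id": [1, 2, 3],
    "days_since_prior_order": [7.0, 14.0, 30.0],
    "eval_set": ["prior", "prior", "prior"],
})

# Alter the dataframe, then re-check it.
df = df.drop(columns=["eval_set"])

shape = df.shape       # (rows, columns) after the change
stats = df.describe()  # count, mean, std, min, quartiles, max for numeric columns

print(shape)
print(stats)
```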
4.05
- Data has been cleaned. Duplicate data, missing data, and mixed-type columns have been checked and addressed.
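One possible way to run these three checks with pandas is sketched below. The data and column names are invented, and the handling choices (casting a mixed column to strings, dropping rows with missing names) are assumptions rather than the project's documented decisions.

```python
import pandas as pd

# Hypothetical data containing a duplicate row, a missing value, and a mixed-type column.
df = pd.DataFrame({
    "product_id": [1, 2, 2, 3],
    "product_name": ["Milk", "Bread", "Bread", None],
    "mixed": ["a", 1, 1, 2.5],
})

# Mixed-type check: columns whose non-null values have more than one Python type.
mixed_cols = [
    col for col in df.columns
    if df[col].dropna().map(type).nunique() > 1
]

# Address mixed types by casting to a single type (here: string).
for col in mixed_cols:
    df[col] = df[col].astype(str)

# Missing-value check, then drop the affected rows.
n_missing = int(df["product_name"].isnull().sum())
df_clean = df.dropna(subset=["product_name"])

# Duplicate check, then drop full-row duplicates.
n_duplicates = int(df_clean.duplicated().sum())
df_clean = df_clean.drop_duplicates()

print(mixed_cols, n_missing, n_duplicates, df_clean.shape)
```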
4.07
- All identifier variables use the industry-standard data type.
- A sample has been exported whenever an exclusion flag has been created.
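The sketch below illustrates both points under stated assumptions: that identifiers are stored as strings (the source does not name the data type), that the exclusion rule is a hypothetical "fewer than 5 orders" threshold, and that a temporary directory stands in for the real export folder.

```python
import os
import tempfile

import pandas as pd

# Hypothetical customer data; names and values are for illustration only.
df = pd.DataFrame({"user_id": [101, 102, 103], "max_order": [3, 12, 40]})

# Assumption: identifiers are treated as strings, since they are labels, not quantities.
df["user_id"] = df["user_id"].astype(str)

# Hypothetical exclusion rule: flag customers with fewer than 5 orders.
df["exclusion_flag"] = df["max_order"] < 5

# Keep only the rows that are not excluded and export that sample.
sample = df[~df["exclusion_flag"]]
export_dir = tempfile.mkdtemp()  # stand-in for the project's export folder
export_path = os.path.join(export_dir, "orders_sample.csv")
sample.to_csv(export_path, index=False)
```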
4.08
- Any new columns that have been derived are relevant to the needs of the analysis.
- All subsamples have been exported and saved in the proper folder following a consistent naming convention.
4.09
- All required data sets have been successfully imported into each script.
- Data ethics have been kept in mind when dealing with data, especially with regard to customer information.
- All project data has been merged into a single data set. A frequency count of the merge flag shows that the merged data set is a 100% match with the combined original data sets.
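A merge flag of this kind can be produced with the `indicator` argument of pandas' `merge`. The two dataframes below are hypothetical; only the technique is meant to carry over.

```python
import pandas as pd

# Hypothetical stand-ins for two of the project's data sets.
orders = pd.DataFrame({"user_id": ["1", "2", "3"], "order_id": [10, 11, 12]})
customers = pd.DataFrame({"user_id": ["1", "2", "3"], "state": ["Ohio", "Texas", "Maine"]})

# indicator=True adds a "_merge" column with values "both", "left_only", or "right_only".
merged = orders.merge(customers, on="user_id", how="outer", indicator=True)

# Frequency of the merge flag: a full match means every row is "both".
match_counts = merged["_merge"].value_counts()
print(match_counts)
```

If every row falls under `both`, the merged data set matches the original data sets completely.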
4.10
- The merged data set contains only the variables used in the analysis.
- At least 4 types of data visualizations have been generated to communicate insights to stakeholders. Visualizations are clearly labeled.
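As an illustration of a clearly labeled chart, here is a minimal Matplotlib sketch. The counts are invented, the file name is hypothetical, and a temporary directory stands in for the project's "03 Visualizations" folder.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical order counts per day of the week, invented for illustration.
days = [0, 1, 2, 3, 4, 5, 6]
order_counts = [620, 580, 470, 440, 430, 450, 490]

fig, ax = plt.subplots()
ax.bar(days, order_counts)

# Title and axis labels make the chart readable on its own.
ax.set_title("Orders by Day of Week")
ax.set_xlabel("Day of week")
ax.set_ylabel("Number of orders")

# Save the figure into a visualizations folder (temporary location here).
viz_dir = os.path.join(tempfile.mkdtemp(), "03 Visualizations")
os.makedirs(viz_dir, exist_ok=True)
chart_path = os.path.join(viz_dir, "bar_orders_dow.png")
fig.savefig(chart_path, bbox_inches="tight")
plt.close(fig)
```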
03 Visualizations
- Location of the visualizations saved from the scripts.
04 Sent to Client
- The final report contains data citations for the Instacart and customer data sets.
- Final report includes evidence of analysis methodology, clear answers to the questions in this brief, and recommendations for Instacart stakeholders.
Note
For the scripts to run properly, some code may need altering for your computer. The scripts were written in Python on a Mac. Additionally, the export folders must be created in the locations the scripts export to before running them; otherwise the exports will fail.
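One way to reduce these adjustments is to build paths with `os.path.join` and create the folders up front. This is a sketch, not how the scripts are confirmed to work: a temporary directory stands in for the real project root, and the example file name is hypothetical.

```python
import os
import tempfile

# Assumption: replace this temporary directory with the project folder on your machine.
project_root = tempfile.mkdtemp()

# Folder names taken from the project structure described above.
folders = ["01 Project Brief", "02 Scripts", "03 Visualizations", "04 Sent to Client"]

# Create each folder if it does not exist yet, so later exports do not fail.
for name in folders:
    os.makedirs(os.path.join(project_root, name), exist_ok=True)

# os.path.join picks the right separator for Mac, Windows, or Linux.
export_path = os.path.join(project_root, "03 Visualizations", "example_chart.png")
print(export_path)
```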