Installation and usage guide for InsightfulPy v0.2.0.
- Installation
- Quick Start
- Basic Workflow
- Core Functions
- Data Quality Checks
- Visualizations
- Statistical Analysis
- Multi-Dataset Analysis
- Working with Batches
- Environment Compatibility
- Examples

Requirements:
- Python 3.8 or higher
- pip

Install from PyPI:
pip install insightfulpy

Or install from source:
git clone https://github.com/dhaneshbb/insightfulpy.git
cd insightfulpy
pip install .

For development installation, see Development Setup.
Core dependencies include pandas, numpy, matplotlib, seaborn, and scipy, among others. See pyproject.toml for the complete list.
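Before installing, it can help to confirm that the interpreter meets the Python 3.8 requirement and to see whether a copy of the package is already present. The sketch below uses only the standard library; `check_environment` is a hypothetical helper name, not part of InsightfulPy.

```python
import sys
from importlib import metadata


def check_environment(package="insightfulpy", min_python=(3, 8)):
    """Return (python_ok, installed_version), where installed_version is None if absent."""
    python_ok = sys.version_info[:2] >= min_python
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None
    return python_ok, installed


python_ok, installed = check_environment()
print(f"Python >= 3.8: {python_ok}")
print(f"insightfulpy version: {installed or 'not installed'}")
```
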
import pandas as pd
import insightfulpy as ipy
# Load your data
df = pd.read_csv('your_data.csv')

ipy.help() # Overview with function categories
ipy.list_all() # List all functions
ipy.quick_start() # Quick start guide
ipy.examples() # Usage examples

# Dataset overview
ipy.columns_info('Sales Data', df)
# Numerical summary
num_stats = ipy.num_summary(df)
print(num_stats)
# Categorical summary
cat_stats = ipy.cat_summary(df)
print(cat_stats)

ipy.columns_info('Sales Data', df) # Dataset structure
ipy.analyze_data(df) # General analysis

ipy.num_summary(df) # Numerical statistics
ipy.cat_summary(df) # Categorical statistics
ipy.grouped_summary(df, groupby='category') # Grouped summary

# Missing and infinite values
ipy.missing_inf_values(df)
ipy.missing_inf_values(df, missing=True, df_table=True)
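
The guide does not describe how `missing_inf_values` counts these values internally. As a rough illustration of what such a check looks for, the following sketch counts NaN and infinite entries per column with plain pandas and NumPy. The helper name `missing_and_inf_counts` is hypothetical, and this is not InsightfulPy's implementation.

```python
import numpy as np
import pandas as pd


def missing_and_inf_counts(df):
    """Count missing and infinite values per column (illustrative only)."""
    missing = df.isna().sum()
    # Infinity only applies to numeric columns; all other columns get 0.
    numeric = df.select_dtypes(include="number")
    infinite = np.isinf(numeric).sum().reindex(df.columns, fill_value=0)
    return pd.DataFrame({"missing": missing, "infinite": infinite})


sample = pd.DataFrame({
    "price": [1.0, np.nan, np.inf, 4.0],
    "label": ["a", None, "c", "d"],
})
report = missing_and_inf_counts(sample)
print(report)
```
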
# Outlier detection
ipy.detect_outliers(df)
ipy.detect_outliers(df, max_display=20)
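
The guide does not say which rule `detect_outliers` applies. One common convention is the 1.5 x IQR fence, sketched here in plain pandas as an assumption about what an outlier check might compute. `iqr_outlier_counts` and the multiplier `k` are hypothetical names for illustration only.

```python
import pandas as pd


def iqr_outlier_counts(df, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] for each numeric column."""
    numeric = df.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    # Comparisons align the per-column fences with the DataFrame's columns.
    outside = (numeric < q1 - k * iqr) | (numeric > q3 + k * iqr)
    return outside.sum()


sample = pd.DataFrame({
    "price": [10, 11, 12, 13, 14, 100],
    "quantity": [1, 2, 3, 4, 5, 6],
})
counts = iqr_outlier_counts(sample)
print(counts)
```

In this sample only the value 100 in `price` falls outside the fences; `quantity` has none.
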
# Data type issues
ipy.detect_mixed_data_types(df)
ipy.cat_high_cardinality(df, threshold=100)

# Missing data patterns
ipy.show_missing(df)
# Distribution plots (batched)
ipy.kde_batches(df, batch_num=1)
ipy.box_plot_batches(df, batch_num=1)
ipy.qq_plot_batches(df, batch_num=1)
ipy.plot_boxplots(df) # All columns
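
The batched plot functions take a `batch_num` argument. The guide does not state the batch size or how columns are ordered, so the sketch below only illustrates the general idea of splitting a list of column names into 1-indexed groups. `split_into_batches` and the batch size of 3 are assumptions for illustration.

```python
def split_into_batches(columns, batch_size):
    """Map 1-indexed batch numbers to consecutive slices of column names."""
    columns = list(columns)
    return {
        start // batch_size + 1: columns[start:start + batch_size]
        for start in range(0, len(columns), batch_size)
    }


column_names = ["price", "quantity", "discount", "tax", "weight"]
batches = split_into_batches(column_names, batch_size=3)
for number, names in batches.items():
    print(f"batch {number}: {names}")
```
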
# Categorical plots (batched)
ipy.cat_bar_batches(df, batch_num=1, show_percentage=True)
ipy.cat_pie_chart_batches(df, batch_num=1)
# Relationship plots (batched)
ipy.num_vs_num_scatterplot_pair_batch(df, pair_num=0, batch_num=1)
ipy.cat_vs_cat_pair_batch(df, pair_num=0, batch_num=1)
ipy.num_vs_cat_box_violin_pair_batch(df, pair_num=0, batch_num=1)

# Individual column analysis
ipy.num_analysis_and_plot(df, 'price')
ipy.num_analysis_and_plot(df, 'price', target='category')
ipy.cat_analyze_and_plot(df, 'category')
# Statistical calculations
ipy.calc_stats(df['price'])
ipy.iqr_trimmed_mean(df['price'])
ipy.mad(df['price'])
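
The guide does not define the statistics behind `iqr_trimmed_mean` and `mad`. "MAD" can mean either mean or median absolute deviation, and an IQR-based trimmed mean can be defined in more than one way. The sketch below shows one plausible reading of each (median absolute deviation, and the mean of values lying between the first and third quartiles) using NumPy. The function names ending in `_sketch` are hypothetical and may not match the library's definitions.

```python
import numpy as np


def median_abs_deviation_sketch(values):
    """Median of absolute deviations from the median, ignoring NaN."""
    arr = np.asarray(values, dtype=float)
    arr = arr[~np.isnan(arr)]
    return float(np.median(np.abs(arr - np.median(arr))))


def interquartile_mean_sketch(values):
    """Mean of the values that lie between Q1 and Q3 (inclusive), ignoring NaN."""
    arr = np.asarray(values, dtype=float)
    arr = arr[~np.isnan(arr)]
    q1, q3 = np.percentile(arr, [25, 75])
    return float(arr[(arr >= q1) & (arr <= q3)].mean())


prices = [10, 11, 12, 13, 14, 100]
print(median_abs_deviation_sketch(prices))
print(interquartile_mean_sketch(prices))
```

Both statistics are far less sensitive to the single extreme value (100) than the ordinary mean and standard deviation would be.
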
ipy.calculate_skewness_kurtosis(df)

# Compare datasets (df1 and df2 are DataFrames you have already loaded)
dfs = {'sales_2022': df1, 'sales_2023': df2}
base, linked = ipy.compare_df_columns('sales_2022', dfs)
ipy.linked_key(dfs)
ipy.display_key_columns('sales_2022', dfs)
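
The comparison functions above work on a dictionary that maps dataset names to DataFrames. The guide does not specify what `compare_df_columns` returns beyond two values, so the following sketch only illustrates the underlying idea of finding columns that a base dataset shares with each of the others. `shared_columns` and the sample column names are hypothetical.

```python
import pandas as pd


def shared_columns(base_name, dfs):
    """For each other dataset, list the columns it has in common with the base."""
    base_cols = set(dfs[base_name].columns)
    return {
        name: sorted(base_cols.intersection(other.columns))
        for name, other in dfs.items()
        if name != base_name
    }


df1 = pd.DataFrame({"order_id": [1, 2], "price": [9.5, 12.0]})
df2 = pd.DataFrame({"order_id": [3, 4], "price": [8.0, 11.0], "region": ["N", "S"]})
dfs = {"sales_2022": df1, "sales_2023": df2}
common = shared_columns("sales_2022", dfs)
print(common)
```
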
# Interconnected outliers
ipy.interconnected_outliers(df, ['price', 'quantity', 'discount'])
# Comparison analysis
ipy.comp_num_analysis(df)
ipy.comp_num_analysis(df, missing_df=True)
ipy.comp_cat_analysis(df, missing_df=True)

Batch functions split visualizations into manageable groups. Call a batch function without batch_num to see the available batches:
batches = ipy.kde_batches(df) # Shows batch info
ipy.kde_batches(df, batch_num=1) # Plots batch 1

InsightfulPy works in Jupyter, IPython, terminal, and scripts. DataFrames display automatically in interactive environments.
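
If the same code has to run both interactively and as a script, it can check which environment it is in. The sketch below is one possible way to do that with the standard library and, when it is already loaded, IPython's `get_ipython()`. `is_interactive` is a hypothetical helper, and nothing here reflects how InsightfulPy itself detects its environment.

```python
import sys


def is_interactive():
    """Return True in IPython/Jupyter or the plain Python REPL, False in a script."""
    ipython = sys.modules.get("IPython")
    # IPython.get_ipython() returns None when no IPython shell is running.
    if ipython is not None and ipython.get_ipython() is not None:
        return True
    # sys.ps1 is only defined when the interpreter is in interactive mode.
    return hasattr(sys, "ps1")


if is_interactive():
    print("Interactive session detected.")
else:
    print("Running as a script.")
```
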
For scripts, add plt.show() to display plots:
import matplotlib.pyplot as plt
ipy.kde_batches(df, batch_num=1)
plt.show()

A complete workflow example is available at docs/examples/example.ipynb:
cd docs/examples/ && jupyter notebook example.ipynb

- API Reference - Complete function reference
- Troubleshooting - Common issues
- Examples - Jupyter notebook examples
Version: 0.2.0 | Status: Beta | Python: 3.8-3.12
Copyright 2025 dhaneshbb | License: MIT | Homepage: https://github.com/dhaneshbb/insightfulpy