DataExplorer is a user-friendly data analysis tool with a Python backend. It turns CSV and Excel files into statistics, charts, and reports without requiring any coding knowledge.
- High-Performance Processing: A pure-Python FastAPI backend handles the analysis without heavy dependencies
- Advanced Analytics: Statistical analysis, machine learning, and AI-powered insights
- Scalable Architecture: Handles large datasets efficiently with optimized algorithms
- Automatic Type Detection: Intelligently identifies numeric, categorical, datetime, and text columns
- Data Quality Assessment: Comprehensive analysis of completeness, duplicates, and outliers
- Statistical Insights: Mean, median, mode, standard deviation, and correlation analysis
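As a rough illustration of how automatic type detection can work, the sketch below classifies a column of raw string values as numeric, datetime, categorical, or text. It is a minimal pure-Python sketch under stated assumptions, not the project's actual implementation: the name `infer_column_type`, the accepted date formats, and the 0.5 uniqueness threshold for categorical columns are all hypothetical.

```python
from datetime import datetime
from typing import List

# Hypothetical date formats to try; the real backend may accept others.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y-%m-%d %H:%M:%S")


def _is_number(value: str) -> bool:
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def _is_datetime(value: str) -> bool:
    """Return True if the string matches one of DATE_FORMATS."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def infer_column_type(values: List[str], categorical_ratio: float = 0.5) -> str:
    """Classify a column as 'numeric', 'datetime', 'categorical', or 'text'."""
    # Ignore missing values (empty or whitespace-only strings).
    present = [v.strip() for v in values if v and v.strip()]
    if not present:
        return "text"
    if all(_is_number(v) for v in present):
        return "numeric"
    if all(_is_datetime(v) for v in present):
        return "datetime"
    # Few distinct values relative to the row count suggests a category
    # (assumed threshold, not taken from the project).
    if len(set(present)) / len(present) <= categorical_ratio:
        return "categorical"
    return "text"
```

For example, `infer_column_type(["red", "blue", "red", "red"])` sees two distinct values in four rows and reports a categorical column, while a column of unique names falls through to text.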
- Interactive Charts: Bar charts, line graphs, scatter plots, and pie charts
- Real-time Updates: Dynamic visualizations that update as you filter and manipulate data
- Export Ready: Download charts and reports in multiple formats
- Advanced Filtering: Multiple filter rules with various operators
- Smart Search: Global search across all columns with debounced input
- Column Management: Hide, show, and reorder columns as needed
- Sorting: Multi-column sorting with intelligent type handling
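To make the filtering and sorting features more concrete, here is a minimal sketch of how multiple filter rules and multi-column sorting might be applied to rows held as dictionaries. The rule format `(column, operator, value)`, the operator names, and the functions `apply_filters` and `sort_rows` are assumptions made for illustration and are not taken from the project's code.

```python
import operator
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Rule = Tuple[str, str, Any]  # (column, operator name, comparison value)

# Hypothetical operator names; the app's actual rule set may differ.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
    "contains": lambda cell, needle: str(needle).lower() in str(cell).lower(),
}


def apply_filters(rows: List[Row], rules: List[Rule]) -> List[Row]:
    """Keep rows that satisfy every rule (rules are combined with AND)."""

    def matches(row: Row) -> bool:
        for column, op_name, value in rules:
            cell = row.get(column)
            # A missing cell never satisfies a rule in this sketch.
            if cell is None or not OPERATORS[op_name](cell, value):
                return False
        return True

    return [row for row in rows if matches(row)]


def sort_rows(rows: List[Row], keys: List[Tuple[str, bool]]) -> List[Row]:
    """Sort by several columns; each key is (column, descending)."""
    result = list(rows)
    # Python's sort is stable, so sorting by the last key first and the
    # first key last yields the correct multi-column order.
    for column, descending in reversed(keys):
        result.sort(key=lambda row: row[column], reverse=descending)
    return result
```

Combining rules with AND keeps the sketch short; an OR mode or per-type comparison logic would need extra handling that is not shown here.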
- Duplicate Detection: Identify and remove duplicate rows
- Missing Value Handling: Fill, remove, or interpolate missing data
- Outlier Detection: Statistical outlier identification using IQR method
- Text Standardization: Clean and normalize text data
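The IQR method mentioned above flags values that fall more than a multiple of the interquartile range outside the first or third quartile. The sketch below shows one common formulation using only the standard library. The function names and the conventional multiplier `k = 1.5` are assumptions, and the backend may compute quartiles with a different interpolation rule.

```python
from statistics import quantiles
from typing import List, Tuple


def iqr_bounds(values: List[float], k: float = 1.5) -> Tuple[float, float]:
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    # quantiles() needs at least two data points and uses the
    # 'exclusive' method by default.
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values: List[float], k: float = 1.5) -> List[float]:
    """Return the values that lie outside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if v < low or v > high]
```

With the sample `[10, 12, 11, 13, 12, 11, 95]`, the quartiles are 11 and 13, so the fences are 8 and 16 and only 95 is flagged.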
- Correlation Discovery: Automatic detection of relationships between variables
- Pattern Recognition: Identify trends and anomalies in your data
- Predictive Modeling: Simple linear regression and forecasting
- Quality Recommendations: Intelligent suggestions for data improvement
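For readers curious about what correlation discovery and simple linear regression involve, the following pure-Python sketch computes a Pearson correlation coefficient and fits a least-squares line. It is an illustration under assumptions rather than the code in `analytics.py`: the function names `pearson`, `fit_line`, and `predict` are hypothetical.

```python
from math import sqrt
from typing import List, Tuple


def _centered_sums(xs: List[float], ys: List[float]) -> Tuple[float, float, float, float, float]:
    """Return mean_x, mean_y, and the summed products of deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient, between -1 and 1."""
    _, _, sxy, sxx, syy = _centered_sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _centered_sums(xs, ys)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are equal")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(x: float, slope: float, intercept: float) -> float:
    """Forecast a y value from the fitted line."""
    return slope * x + intercept
```

A forecast then amounts to fitting once and calling `predict` on new x values, for example `predict(5, *fit_line([1, 2, 3, 4], [3, 5, 7, 9]))`.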
- Multiple Formats: Export to CSV, JSON, Excel, and PDF
- Custom Reports: Generate comprehensive analysis reports
- Shareable Links: Easy sharing with team members
- Email Integration: Send reports directly via email
- Node.js 18+ and npm
- Python 3.8+ with pip
- Modern web browser (Chrome, Firefox, Safari, Edge)
- Frontend: React + TypeScript + Vite
- Backend: Python + FastAPI
- Communication: REST API with JSON
- Clone the repository

  ```bash
  git clone https://github.com/yourusername/data-explorer.git
  cd data-explorer
  ```

- Install frontend dependencies

  ```bash
  npm install
  ```

- Install Python backend dependencies

  ```bash
  cd python_core
  pip install -r requirements.txt
  cd ..
  ```

- Start both servers

  ```bash
  # Option 1: Start both simultaneously
  npm run start:full

  # Option 2: Start separately
  npm run dev          # Frontend (port 8080)
  npm run dev:python   # Backend (port 8000)
  ```

- Open your browser and navigate to http://localhost:8080
- Ensure Backend is Running: Check the Python backend status indicator
- Upload Your Data: Drag and drop a CSV or Excel file onto the upload area
- Explore: Browse through the automatically generated data preview and statistics
- Visualize: Create charts using the visualization tab
- Clean: Use the data cleaning tools to improve data quality
- Analyze: Discover insights with AI-powered analytics
- Export: Download your results in your preferred format
src/
├── components/          # React components
│   ├── ui/              # Reusable UI components
│   ├── DataPreview.tsx  # Main data preview component
│   ├── DataManipulationPython.tsx
│   ├── PythonBackendStatus.tsx
│   └── ...
├── hooks/               # React hooks for Python API
│   ├── usePythonApi.ts
│   └── ...
├── services/            # API services
│   ├── pythonApi.ts     # Python backend communication
│   └── ...
├── types/               # TypeScript type definitions
├── pages/               # Page components
└── styles/              # CSS and styling
python_core/             # Python backend
├── types.py             # Data type definitions
├── data_utils.py        # Core data processing
├── data_processing.py   # Filtering and manipulation
├── visualization.py     # Chart data generation
├── analytics.py         # AI-powered insights
├── main.py              # FastAPI server
└── requirements.txt     # Python dependencies
- React 18 with TypeScript
- Vite for fast development and building
- UI Components: shadcn/ui, Radix UI, Tailwind CSS
- Visualizations: Recharts
- File Handling: React Dropzone
- Notifications: Sonner
- Python 3.8+ with FastAPI
- Data Processing: Pure Python (no external dependencies for core logic)
- API: RESTful JSON API with automatic documentation
- Type Safety: Pydantic models and Python type hints
- CSV: Papa Parse (frontend)
- Excel: SheetJS (frontend)
- Analysis: Python backend processing
# Run frontend tests
npm test
# Run tests with coverage
npm run test:coverage
# Run tests in watch mode
npm run test:watch
# Test Python backend
cd python_core
python -m pytest # If you add pytest later
# Build frontend
npm run build
# Preview the production build
npm run preview
# For production deployment, you'll need to:
# 1. Build the React app
# 2. Deploy Python backend to a server
# 3. Update VITE_PYTHON_API_URL to point to production backend
Create a `.env` file in the root directory:
VITE_PYTHON_API_URL=http://localhost:8000
NODE_ENV=development
For production, update the Python API URL to your deployed backend.
We welcome contributions! Please see our Contributing Guide for details.
Prerequisites for Contributors:
- Node.js 18+ and npm
- Python 3.8+ and pip
- Familiarity with React/TypeScript and Python
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Set up both frontend and backend development environments
- Make your changes (frontend in `src/`, backend in `python_core/`)
- Add tests for new functionality
- Ensure all tests pass (`npm test`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Frontend: ESLint and Prettier for TypeScript/React
- Backend: Follow PEP 8 for Python code
- Use meaningful commit messages
- Add type hints for all Python functions
- Add docstrings for Python functions and JSDoc for TypeScript
DataExplorer is optimized for performance with a hybrid architecture:
- React Optimizations: useMemo, useCallback, lazy loading
- Debounced Operations: Prevents excessive API calls
- Efficient Rendering: Virtual scrolling for large tables
- Pure Python: Fast processing without heavy dependencies
- Memory Efficient: Optimized data structures and algorithms
- Async Processing: FastAPI with async/await support
- Scalable: Can handle large datasets efficiently
- Local Processing: Data analysis happens on your local Python backend
- No External Data Transfer: Your data stays on your machine
- Secure File Handling: Robust validation and error handling
- API Security: Input validation and error handling
- CORS Protection: Configured for local development
- Chrome 90+
- Firefox 88+
- Safari 14+
- Edge 90+
- Python 3.8 or higher
- See `python_core/requirements.txt` for dependencies
This project is licensed under the MIT License - see the LICENSE file for details.
- FastAPI for the excellent Python web framework
- shadcn/ui for the beautiful UI components
- Recharts for the charting library
- React and Vite for the frontend framework
- Papa Parse for CSV parsing
- SheetJS for Excel file support
- 📧 Email: [email protected]
- 💬 Discord: Join our community
- 📖 Documentation: docs.dataexplorer.com
- 🐛 Issues: GitHub Issues
- Advanced ML Models: Integration with scikit-learn, pandas
- Database Support: PostgreSQL, SQLite integration
- Streaming Data: Real-time data processing capabilities
- Collaboration: Multi-user editing and sharing capabilities
- Custom Plugins: Extensible architecture for custom analysis tools
- Mobile App: Native mobile applications for iOS and Android
- Cloud Sync: Optional cloud storage and synchronization
Made with ❤️ and 🐍 by the DataExplorer team