Skip to content

uy-nguyen00/ecom-data-simulator

Repository files navigation

E-commerce Data Simulator

🚀 Introduction

Welcome to the E-commerce Data Simulator! This project is a powerful, configurable tool designed to generate realistic user interaction data for an e-commerce website. Inspired by platforms like Google Analytics, it provides a robust stream of events - from page views to purchases - that can be used to test data pipelines, build analytics dashboards, and train machine learning models.

The primary goal is to create a flexible and scalable simulation environment that mimics real-world user behavior, allowing developers and data analysts to work with high-quality, high-volume data without relying on actual production traffic.

✨ Features

  • Dynamic Event Simulation: Generate a wide variety of user events, including:

    • Website Visits (new and returning users)
    • Page Views (home, category, product pages)
    • Product Impressions
    • Adding/Removing Products from Cart
    • Initiating and Completing Checkout
    • Adding Products to Wishlists
    • User Sign-ups and Logins
  • High Configurability: Easily customize the simulation parameters to fit your needs.

    • Product Catalog: Define your own products, categories, and pricing via a simple configuration file (e.g., products.json).
    • User Personas: Simulate different types of users (e.g., "Bargain Hunter," "Brand Loyalist," "Casual Browser") with distinct behavioral patterns.
    • Behavior Control: Adjust the rate of events, session durations, and conversion funnels.
  • Scalable Data Streaming: Built with Apache Kafka and Kafka Streams to handle high-throughput, real-time data flow, ensuring the system can scale to millions of events.

  • Database Integration: Persist the generated data into a relational database of your choice. Configure the database connection string and let the simulator populate your tables.

  • Modular Architecture: The project is divided into clear modules (e-g., data-generator, data-persister, config-manager) to make it easy to understand, extend, and maintain.

🔮 Upcoming Features

[ ] Real-time Dashboard: A web-based interface to visualize the simulated data as it's generated.

[ ] Advanced Analytics: Integration with analytics tools to perform cohort analysis, funnel analysis, and more.

[ ] Anomaly Detection: Simulate unusual user behavior or system events (e.g., sudden traffic spikes, payment failures) to test monitoring systems.

[ ] A/B Testing Simulation: Model A/B tests by directing different user segments to different product variations.

🛠️ Technology Stack

  • Core: Java 21+
  • Framework: Spring Boot 3
  • Messaging: Apache Kafka / Kafka Streams
  • Database: Support for any JPA-compatible database (PostgreSQL, MySQL, H2, etc.)
  • Build Tool: Maven

🏁 Getting Started

Prerequisites

  • Java 21
  • Docker (for running Kafka and a database)

Installation & Running

  1. Clone the repository:
git clone https://github.com/uy-nguyen00/ecom-data-simulator.git
cd ecommerce-data-simulator
  1. Start the infrastructure (Kafka & Database):
docker-compose up -d
  1. Configure the application:
  • Edit src/main/resources/application.yml to set up your database connection.

  • Modify src/main/resources/products.json to define your product catalog.

  • Adjust src/main/resources/simulation.properties to control the simulation behavior.

  1. Build and run the application:
./mvnw spring-boot:run

The simulator will now start generating data and sending it to the configured Kafka topics and database.

⚙️ Configuration

The simulation's behavior is primarily controlled by the simulation.properties file. Here are some of the key parameters you can tweak:

# Number of concurrent users to simulate
simulation.users.concurrent=100

# Rate of new events per second
simulation.events.rate=50

# Conversion rate from viewing a product to adding it to the cart
simulation.funnel.addToCartRate=0.10

# Conversion rate from cart to completed checkout
simulation.funnel.purchaseRate=0.05

For a full list of configurable parameters, please see the simulation.properties file.

🤝 Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".

  1. Fork the Project

  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)

  3. Commit your Changes (git commit -m 'Add some AmazingFeature')

  4. Push to the Branch (git push origin feature/AmazingFeature)

  5. Open a Pull Request

📄 License

Distributed under the MIT License. See LICENSE for more information.

About

E-commerce Data Simulator

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages