Docker Support for Issue Docker Support #85 #112
Open

SANTHOSH-SACHIN wants to merge 1 commit into microsoft:main from SANTHOSH-SACHIN:main
DOCKER.md (new file)

@@ -0,0 +1,124 @@
# Docker Setup and Deployment

This document provides detailed instructions on setting up and deploying Data Formulator using Docker.

## Prerequisites

* Docker installed and running on your system. You can download Docker from [https://www.docker.com/products/docker-desktop](https://www.docker.com/products/docker-desktop).
* Docker Compose installed. Docker Desktop includes Docker Compose. If you're not using Docker Desktop, follow the instructions at [https://docs.docker.com/compose/install/](https://docs.docker.com/compose/install/). (Version checks for both tools are sketched below.)
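
To confirm the prerequisites are in place, the standard version commands print what is installed. The `docker compose` subcommand assumes Compose V2 (bundled with current Docker Desktop); older installs use the standalone `docker-compose` binary:

```bash
docker --version
docker compose version   # or: docker-compose --version on older setups
```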

## Building the Docker Image

The `Dockerfile` contains instructions for building a Docker image that includes both the frontend and backend of Data Formulator. It uses a multi-stage build process to minimize the final image size.

To build the image, run the following command from the root directory of the project:

```bash
docker build -t data-formulator .
```

This creates a Docker image named `data-formulator`.

## Running the Docker Container

To start a container from the image you just built, run:

```bash
docker run -p 5000:5000 data-formulator
```

This starts Data Formulator and makes it available on port 5000 of your host machine.
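
If the container starts but the app is not reachable, you can inspect it from another terminal with standard Docker CLI commands. A quick sketch (the container ID placeholder is whatever `docker ps` reports):

```bash
# List running containers started from the data-formulator image
docker ps --filter ancestor=data-formulator

# Follow the logs of a specific container (replace <container-id> with the ID from `docker ps`)
docker logs -f <container-id>
```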

## Running Data Formulator with Docker

### Using `docker run`

You can run Data Formulator directly using the `docker run` command. This is useful for quick testing or simple deployments.

```bash
docker run -p 5000:5000 -e OPENAI_API_KEY=your-openai-key -e AZURE_API_KEY=your-azure-key ... data-formulator
```

* `-p 5000:5000`: Maps port 5000 on your host machine to port 5000 inside the container. Data Formulator runs on port 5000 by default.
* `-e VAR=value`: Sets environment variables inside the container. You **must** provide your API keys and other configuration settings as environment variables. See the [Configuration](#configuration) section in `README.md` for a complete list of supported variables, and replace placeholders like `your-openai-key` with your actual API keys. (An `--env-file` variant is sketched after this list.)
* `data-formulator`: The name of the Docker image you built earlier.
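
If you would rather not pass every key on the command line, `docker run` also accepts `--env-file`. A minimal sketch, assuming a `.env` file like the one described in the Docker Compose section below:

```bash
# Run detached, with a name, reading API keys from a local .env file
docker run -d --name data-formulator \
  -p 5000:5000 \
  --env-file .env \
  data-formulator
```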

### Using Docker Compose (Recommended)

Docker Compose simplifies the process of running multi-container applications. Data Formulator, while technically a single service, benefits from Docker Compose for managing environment variables and simplifying the startup process.

The `docker-compose.yml` file defines the Data Formulator service. Here's a breakdown:

```yaml
version: '3.8'

services:
  data-formulator:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "5000:5000"
    environment:
      - FLASK_APP=py-src/data_formulator/app.py
      - FLASK_RUN_PORT=5000
      - FLASK_RUN_HOST=0.0.0.0
      # Add your API keys here as environment variables, e.g.:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - AZURE_API_KEY=${AZURE_API_KEY}
      - AZURE_API_BASE=${AZURE_API_BASE}
      - AZURE_API_VERSION=${AZURE_API_VERSION}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OLLAMA_API_BASE=${OLLAMA_API_BASE}
    volumes:
      - .:/app # Mount the current directory for development
```

* `version: '3.8'`: Specifies the Docker Compose file version.
* `services: data-formulator:`: Defines a service named `data-formulator`.
* `build:`: Specifies how to build the image.
  * `context: .`: Uses the current directory as the build context.
  * `dockerfile: Dockerfile`: Uses the `Dockerfile` in the current directory.
* `ports: - "5000:5000"`: Maps port 5000 on the host to port 5000 in the container.
* `environment:`: Sets environment variables inside the container. This is where you should put your API keys. You can either hardcode them here (not recommended for production) or use variable substitution from your shell environment (e.g., `- OPENAI_API_KEY=${OPENAI_API_KEY}`).
* `volumes: - .:/app`: Mounts the project root directory to `/app` inside the container. This is very useful during development, as any changes you make to your code are immediately reflected inside the running container without rebuilding the image. **For production deployments, you should remove or comment out this line** (a production-oriented variant is sketched after this list).
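
The bind mount is convenient for development, but in production you usually want the container to run the code baked into the image. One way to do that (a sketch only; this file is not part of the repository) is a second, hypothetical override file such as `docker-compose.prod.yml` that omits the `volumes:` section and is selected with Compose's `-f` flag:

```yaml
# docker-compose.prod.yml -- hypothetical production variant (not part of this PR)
version: '3.8'

services:
  data-formulator:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "5000:5000"
    environment:
      # supply the same API keys as in docker-compose.yml, e.g.:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    # note: no `volumes:` entry, so code changes require rebuilding the image
```

Started with `docker-compose -f docker-compose.prod.yml up --build -d`, the service runs detached from the image contents alone.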

To run Data Formulator using Docker Compose:

1. **Set your API keys as environment variables in your shell:**

   ```bash
   export OPENAI_API_KEY=your-openai-key
   export AZURE_API_KEY=your-azure-key
   # ... set other API keys as needed
   ```

   Or, create a `.env` file in the project root directory and add your API keys there:

   ```
   OPENAI_API_KEY=your-openai-key
   AZURE_API_KEY=your-azure-key
   # ... other API keys
   ```

   Docker Compose will automatically read environment variables from a `.env` file in the same directory as the `docker-compose.yml` file. **Do not commit your `.env` file to version control.** It's included in the `.gitignore` file.

2. **Run Docker Compose:**

   ```bash
   docker-compose up --build
   ```

   * `up`: Starts the services defined in `docker-compose.yml`.
   * `--build`: Forces a rebuild of the image, even if one already exists. Use this if you've made changes to the `Dockerfile` or your application code.

   The first time you run this, Docker will download the necessary base images and build the Data Formulator image. Subsequent runs will be faster, especially if you use the volume mount for development.

3. **Access Data Formulator:**

   Open your web browser and go to `http://localhost:5000`. (Quick troubleshooting checks are sketched below.)
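
If the page does not load, two standard command-line checks can narrow things down; a sketch, with nothing specific to Data Formulator beyond the port number:

```bash
# Print the resolved Compose configuration, including substituted API keys
docker-compose config

# Confirm the web server answers on port 5000
curl -I http://localhost:5000
```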

## Stopping Data Formulator

To stop the Data Formulator container(s) when running with Docker Compose, press `Ctrl+C` in the terminal where `docker-compose up` is running. You can also run:
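```bash
# Stop and remove the containers and network created by `docker-compose up`
docker-compose down
```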
Dockerfile (new file)

@@ -0,0 +1,46 @@
# Use a multi-stage build to reduce final image size

# Stage 1: Build the frontend
FROM node:18 AS frontend-builder

WORKDIR /app

# Copy package.json and yarn.lock first to leverage Docker cache
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile

# Copy the rest of the frontend code
COPY . .

# Build the frontend
RUN yarn build

# Stage 2: Build the backend and create the final image
FROM python:3.12-slim

WORKDIR /app

# Copy built frontend from the previous stage
COPY --from=frontend-builder /app/py-src/data_formulator/dist /app/py-src/data_formulator/dist

# Copy backend code
COPY py-src /app/py-src
COPY requirements.txt /app/
COPY pyproject.toml /app/
COPY README.md /app/
COPY LICENSE /app/

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the entrypoint script
COPY docker-entrypoint.sh /app/

# Make the entrypoint script executable
RUN chmod +x /app/docker-entrypoint.sh

# Expose the port the app runs on
EXPOSE 5000

# Set the entrypoint
ENTRYPOINT ["/app/docker-entrypoint.sh"]
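
Because the first build stage is named `frontend-builder`, you can build it in isolation when debugging frontend issues and then compare it with the full image; a short sketch using standard Docker CLI flags (the tag names are illustrative only):

```bash
# Build only the frontend stage (useful when diagnosing `yarn build` failures)
docker build --target frontend-builder -t data-formulator-frontend .

# Build the full multi-stage image and check its final size
docker build -t data-formulator .
docker image ls data-formulator
```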
README.md
@@ -3,7 +3,7 @@
</h1>

<div>

[](https://arxiv.org/abs/2408.16119)
[](https://opensource.org/licenses/MIT)
[](https://youtu.be/3ndlwt0Wi3c)
@@ -22,7 +22,7 @@ Transform data and create rich visualizations iteratively with AI 🪄. Try Data

## News 🔥🔥🔥

- [02-20-2025] Data Formulator 0.1.6 released!
  - Now supports working with multiple datasets at once! Tell Data Formulator which data tables you would like to use in the encoding shelf, and it will figure out how to join the tables to create a visualization to answer your question. 🪄
  - Check out the demo at [[https://github.com/microsoft/data-formulator/releases/tag/0.1.6]](https://github.com/microsoft/data-formulator/releases/tag/0.1.6).
  - Update your Data Formulator to the latest version to play with the new features.

@@ -37,11 +37,11 @@ Transform data and create rich visualizations iteratively with AI 🪄. Try Data
  - We added a few visualization challenges with the sample datasets. Can you complete them all? [[try them out!]](https://github.com/microsoft/data-formulator/issues/53#issue-2641841252)
  - Comment in the issue when you did, or share your results/questions with others! [[comment here]](https://github.com/microsoft/data-formulator/issues/53)

- [10-11-2024] Data Formulator python package released!
  - You can now install Data Formulator using Python and run it locally, easily. [[check it out]](#get-started).
  - Our Codespaces configuration is also updated for fast start up ⚡️. [[try it now!]](https://codespaces.new/microsoft/data-formulator?quickstart=1)
  - New experimental feature: load an image or a messy text, and ask AI to parse and clean it for you (!). [[demo]](https://github.com/microsoft/data-formulator/pull/31#issuecomment-2403652717)

- [10-01-2024] Initial release of Data Formulator, check out our [[blog]](https://www.microsoft.com/en-us/research/blog/data-formulator-exploring-how-ai-can-help-analysts-create-rich-data-visualizations/) and [[video]](https://youtu.be/3ndlwt0Wi3c)!

@@ -50,23 +50,23 @@ Transform data and create rich visualizations iteratively with AI 🪄. Try Data

**Data Formulator** is an application from Microsoft Research that uses large language models to transform data, expediting the practice of data visualization.

Data Formulator is an AI-powered tool for analysts to iteratively create rich visualizations. Unlike most chat-based AI tools where users need to describe everything in natural language, Data Formulator combines *user interface interactions (UI)* and *natural language (NL) inputs* for easier interaction. This blended approach makes it easier for users to describe their chart designs while delegating data transformation to AI.

## Get Started

Play with Data Formulator with one of the following options:

- **Option 1: Install via Python PIP**

  Use Python PIP for an easy setup experience, running locally (recommended: install it in a virtual environment).

  ```bash
  # install data_formulator
  pip install data_formulator

  # start data_formulator
  data_formulator

  # alternatively, you can run data formulator with this command
  python -m data_formulator
  ```

@@ -76,15 +76,45 @@ Play with Data Formulator with one of the following options:
  *Update: you can specify the port number (e.g., 8080) by `python -m data_formulator --port 8080` if the default port is occupied.*

- **Option 2: Codespaces (5 minutes)**

  You can also run Data Formulator in Codespaces; we have everything pre-configured. For more details, see [CODESPACES.md](CODESPACES.md).

  [](https://codespaces.new/microsoft/data-formulator?quickstart=1)

- **Option 3: Working in the developer mode**

  You can build Data Formulator locally if you prefer full control over your development environment and the ability to customize the setup to your specific needs. For detailed instructions, refer to [DEVELOPMENT.md](DEVELOPMENT.md).

## Deployment with Docker

You can easily deploy Data Formulator using Docker. This is the recommended way for production deployments.

1. **Build the Docker image:**

   ```bash
   docker build -t data-formulator .
   ```

2. **Run the Docker container:**

   ```bash
   docker run -p 5000:5000 -e OPENAI_API_KEY=your-openai-key -e AZURE_API_KEY=your-azure-key ... data-formulator
   ```

   Replace `your-openai-key`, `your-azure-key`, etc., with your actual API keys. See the [Configuration](#configuration) section for details on setting environment variables.

   Alternatively, use Docker Compose:

   ```bash
   docker-compose up --build
   ```

3. **Access Data Formulator:**

   Open your browser and go to `http://localhost:5000`.

For more detailed instructions and configuration options, see [DOCKER.md](DOCKER.md).

## Using Data Formulator

@@ -112,7 +142,7 @@ https://github.com/user-attachments/assets/160c69d2-f42d-435c-9ff3-b1229b5bddba

https://github.com/user-attachments/assets/c93b3e84-8ca8-49ae-80ea-f91ceef34acb

Repeat this process as needed to explore and understand your data. Your explorations are trackable in the **Data Threads** panel.

## Developers' Guide

@@ -123,7 +153,7 @@ Follow the [developers' instructions](DEVELOPMENT.md) to build your new data ana

```
@article{wang2024dataformulator2iteratively,
      title={Data Formulator 2: Iteratively Creating Rich Visualizations with AI},
      author={Chenglong Wang and Bongshin Lee and Steven Drucker and Dan Marshall and Jianfeng Gao},
      year={2024},
      booktitle={ArXiv preprint arXiv:2408.16119},

@@ -160,8 +190,8 @@ or contact [[email protected]](mailto:[email protected]) with any addi

## Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
trademarks or logos is subject to and must follow
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos is subject to those third parties' policies.
Copilot (review comment):
The code snippet marker incorrectly includes ':README.md'; this could affect markdown rendering. Please change it to '```bash' for proper syntax highlighting.