Commit 34a5b0b

docs: make deployment requirements more natural and less AI-generated
- Remove excessive bold formatting and structured sections
- Simplify language to be more conversational
- Remove 'Answer:' and 'Pros:' patterns
- Make it sound more like developer-written documentation
- Remove 'Questions Addressed' section
- Simplify testing checklist format
1 parent e7a3fd6 commit 34a5b0b

1 file changed: +178 -0 lines changed


DEPLOYMENT_REQUIREMENTS.md

# BioAnalyzer Deployment Requirements

System requirements and deployment info for issue #25.

## System Requirements

### Hardware

Minimum setup:

- 2 CPU cores
- 2GB RAM
- 5GB disk space (for Docker image and dependencies)
- Internet access for API calls

Recommended for better performance:

- 4+ CPU cores
- 4GB+ RAM
- 10GB+ disk space (for cache, logs, results)
- Stable internet connection

### Software

You'll need:

- Docker 20.0+ with Docker Compose 2.0+, or Python 3.8+ if not using Docker
- Internet access for the NCBI E-utilities API and LLM provider APIs

Optional but useful:

- Redis for caching (SQLite is used by default though)
- Reverse proxy like Nginx or Traefik for production
- SSL/TLS certificates if you need HTTPS
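
If you do put a reverse proxy in front of the API, a minimal Nginx site config can look something like the sketch below. The server name, upstream port, and output path are assumptions for illustration, not taken from this repo:

```bash
# Write a hypothetical minimal Nginx site config for the API.
# server_name, upstream port, and output path are assumptions.
cat > bioanalyzer.conf <<'EOF'
server {
    listen 80;
    server_name bioanalyzer.example.org;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
EOF
echo "wrote bioanalyzer.conf"
```

For HTTPS you'd add the usual `listen 443 ssl;` and certificate directives on top of this.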

### Dependencies

All dependencies are in the Docker image or `config/requirements.txt`. The main ones are FastAPI, Uvicorn, LiteLLM, Paper-QA, and PyTorch (CPU version); see the file for the full list.

## Deployment Options

### API-Only Deployment

Users don't need CLI access if you're just using the API. The CLI is optional and only useful for command-line analysis, local development, or admin tasks.

For API-only deployment, just run the FastAPI server. No CLI installation needed. Everything works through REST API endpoints, so it can run on Shiny Server alongside other services like metaharmonizer.

### CLI Access

You only need CLI access if users want to run analysis from the command line, need it for admin tasks, or want to test locally. If you do need it, set up a Python environment on the server and run the `install.sh` script, which adds some complexity to the deployment.

## Server Selection: Shiny Server vs Superstudio

### Shiny Server

A good choice for API-only deployment. You can run it alongside other services like metaharmonizer. Deployment is simpler since it's just the API service, and you don't need local LLM installations - it uses external APIs like Gemini or OpenAI. Easier to manage too.

Requirements:

- Docker support or a Python 3.8+ environment
- API keys for LLM providers (Gemini works well)
- Port 8000 available (or change it in config)
- Internet access for API calls

Deployment is straightforward:

```bash
docker compose up -d
# or
python main.py --host 0.0.0.0 --port 8000
```

### Superstudio

Use Superstudio if you have local LLM models installed and want to use them, need Ollama or Llamafile for local inference, want to avoid external API costs, or have specific requirements for on-premise LLM access.

Requirements are the same as for Shiny Server, plus a local LLM setup (Ollama, Llamafile, etc.) and more resources if you're running local models.

## API Key Requirements

### Required API Keys

You'll need an NCBI API key for PubMed/PMC data access. Get it from https://www.ncbi.nlm.nih.gov/account/settings/. It's free and raises your rate limit to 10 requests/second (3 without a key).

For LLM access, you need at least one API key. Gemini is recommended and has a free tier. Get it from https://makersuite.google.com/app/apikey. After the free tier it's pay-as-you-go.

OpenAI and Anthropic are optional alternatives. OpenAI keys are at https://platform.openai.com/api-keys, Anthropic at https://console.anthropic.com/. Both are pay-per-use.

### Creating a New API Key

It's fine to create a new key for BioAnalyzer - in fact, a dedicated key is better. Store it in environment variables or a `.env` file, never commit it to version control, and keep an eye on usage and costs.
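
The "never commit it" part can be enforced mechanically. A quick sketch (the key value is a placeholder):

```bash
# Append a placeholder key to .env and make sure git ignores the file.
echo 'GEMINI_API_KEY=your_gemini_key_here' >> .env
grep -qxF '.env' .gitignore 2>/dev/null || echo '.env' >> .gitignore
echo "ignored: $(grep -xF '.env' .gitignore)"
```

Run this once per repo; the `grep -qxF` guard keeps it from adding duplicate `.gitignore` entries.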

### Environment Variables

Create a `.env` file with:

```bash
# Required
NCBI_API_KEY=your_ncbi_key_here
EMAIL=your_email@example.com

# At least one LLM provider
GEMINI_API_KEY=your_gemini_key_here
# OR
OPENAI_API_KEY=your_openai_key_here
# OR
ANTHROPIC_API_KEY=your_anthropic_key_here

# Optional: Local LLM (if using Ollama)
OLLAMA_BASE_URL=http://localhost:11434
LLM_PROVIDER=ollama
```
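
Before starting the server you can sanity-check the file. A small sketch (variable names match the file above; works in plain sh):

```bash
# Load .env (if present) and report which of the keys above are set.
set -a
[ -f .env ] && . ./.env
set +a

for var in NCBI_API_KEY EMAIL GEMINI_API_KEY; do
  # Indirect lookup via eval so this works in plain sh as well as bash.
  val=$(eval "printf '%s' \"\${$var:-}\"")
  if [ -n "$val" ]; then
    echo "set:     $var"
  else
    echo "missing: $var"
  fi
done
```

`set -a` exports everything the `.env` file defines, which mirrors what Docker Compose does with its `env_file` handling.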

## Deployment Steps

### Docker Deployment

```bash
git clone https://github.com/waldronlab/bioanalyzer-backend.git
cd bioanalyzer-backend

cp .env.example .env
# Edit .env with your API keys

docker compose build
docker compose up -d

curl http://localhost:8000/health
```

### Python Deployment

```bash
git clone https://github.com/waldronlab/bioanalyzer-backend.git
cd bioanalyzer-backend

python3 -m venv .venv
source .venv/bin/activate

pip install -r config/requirements.txt
pip install -e .

# Create .env file with your API keys

python main.py --host 0.0.0.0 --port 8000
```

## API Endpoints

Once deployed, you can access:

- Health check: `GET /health`
- API docs: `GET /docs` (Swagger UI)
- Analysis v1: `GET /api/v1/analyze/{pmid}`
- Analysis v2 with RAG: `GET /api/v2/analyze/{pmid}`
- Retrieval: `GET /api/v1/retrieve/{pmid}`
- System status: `GET /api/v1/status`
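
For example, a v1 analysis call looks like this. The PMID is a made-up placeholder, and the block only prints the URL so it runs without a live server; the actual `curl` calls are shown in comments:

```bash
BASE_URL="${BASE_URL:-http://localhost:8000}"
PMID="12345678"   # placeholder PMID, not a real example from this project

# With the server running you would call:
#   curl -s "$BASE_URL/api/v1/analyze/$PMID"
#   curl -s "$BASE_URL/api/v2/analyze/$PMID"   # RAG version
echo "$BASE_URL/api/v1/analyze/$PMID"
```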

## Current Status

As @lwaldron mentioned, the app isn't production-ready yet. It needs testing by a couple of people first.

I'd suggest deploying to a staging environment first, having 2-3 users test the API endpoints, monitoring for issues, fixing any problems, and then considering production. Shiny Server might be easier for testing since it's already set up there.

## Testing

Before calling it done, make sure:

- Health endpoint works
- API docs are accessible at `/docs`
- You can analyze a test PMID
- You can retrieve paper data
- Error handling works
- API keys are configured correctly
- Logs are being generated
- Cache works (if enabled)
- Rate limiting works (if enabled)
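
The reachability items in this checklist can be scripted. A hedged sketch (endpoints taken from this doc; it defaults to a dry run that only prints URLs, so set `SMOKE_DRY_RUN=0` against a live server):

```bash
# Smoke-test sketch for the basic endpoints; dry run by default so it
# can be executed without a live server.
BASE_URL="${BASE_URL:-http://localhost:8000}"
ENDPOINTS="/health /docs /api/v1/status"

for ep in $ENDPOINTS; do
  if [ "${SMOKE_DRY_RUN:-1}" = "1" ]; then
    echo "would check: ${BASE_URL}${ep}"
  elif curl -fsS "${BASE_URL}${ep}" > /dev/null; then
    echo "ok:   ${ep}"
  else
    echo "FAIL: ${ep}"
  fi
done
```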

## Troubleshooting

If something goes wrong:

- Check logs with `docker compose logs` or look in the `logs/` directory
- Verify API keys are set correctly
- Test the health endpoint: `curl http://localhost:8000/health`
- Check network connectivity for API calls
- See [PRODUCTION_DEPLOYMENT.md](docs/PRODUCTION_DEPLOYMENT.md) for more details
