Commit fc4c341

Updated .env.example to include new settings for audit logging, artifact retention, and HTTPS enforcement.
- Added a link to HIPAA requirements in `README.md` for compliance guidance.
- Improved API reference documentation to clarify Phaxio backend behavior and artifact cleanup options.
- Documented E.164 phone number format in SIP setup and API reference for better user guidance.
- Added troubleshooting notes for HTTPS usage in production environments.
1 parent f306a22 commit fc4c341

14 files changed

+508
-35
lines changed

.env.example

Lines changed: 13 additions & 0 deletions
@@ -16,6 +16,7 @@ PHAXIO_API_SECRET=
 PUBLIC_API_URL=http://localhost:8080
 PHAXIO_STATUS_CALLBACK_URL=http://localhost:8080/phaxio-callback
 PHAXIO_VERIFY_SIGNATURE=true
+ENFORCE_PUBLIC_HTTPS=false
 
 # === SIP/ASTERISK BACKEND ===
 # Only needed if FAX_BACKEND=sip
@@ -43,3 +44,15 @@ TZ=UTC
 # - Use a reverse proxy for rate limiting and IP restrictions in production
 # - PDF token TTL controls how long the cloud fetch link is valid
 PDF_TOKEN_TTL_MINUTES=60
+
+# Retention (set >0 to enable automatic artifact cleanup)
+ARTIFACT_TTL_DAYS=0
+# Cleanup interval in minutes (default daily)
+CLEANUP_INTERVAL_MINUTES=1440
+
+# Audit logging (optional)
+AUDIT_LOG_ENABLED=false
+AUDIT_LOG_FORMAT=json
+# AUDIT_LOG_FILE=/var/log/faxbot_audit.log
+AUDIT_LOG_SYSLOG=false
+AUDIT_LOG_SYSLOG_ADDRESS=/dev/log
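
Review note: the retention settings added here (`ARTIFACT_TTL_DAYS`, `CLEANUP_INTERVAL_MINUTES`) could drive a periodic purge roughly as sketched below. This is an illustration under stated assumptions, not the cleanup code from this commit: `purge_old_artifacts` is a hypothetical helper, the extension set is a guess, and the check that a job has reached a final status (which HIPAA_REQUIREMENTS.md asks for) is left out.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import List, Optional

# Assumption: artifacts are recognised by extension; the real layout may differ.
ARTIFACT_SUFFIXES = {".pdf", ".tiff", ".txt"}


def purge_old_artifacts(data_dir: str, ttl_days: int, now: Optional[float] = None) -> List[str]:
    """Delete artifact files older than ttl_days and return the deleted paths."""
    if ttl_days <= 0:  # mirrors ARTIFACT_TTL_DAYS=0 meaning "disabled"
        return []
    now = time.time() if now is None else now
    cutoff = now - ttl_days * 86400
    deleted = []
    # Materialise the listing first so deletion does not disturb the walk.
    for path in list(Path(data_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in ARTIFACT_SUFFIXES:
            if path.stat().st_mtime < cutoff:
                path.unlink()
                deleted.append(str(path))
    return sorted(deleted)


# Usage demo in a throwaway directory: one stale file, one fresh file.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp, "old.pdf")
    old.write_bytes(b"x")
    new = Path(tmp, "new.pdf")
    new.write_bytes(b"x")
    ten_days_ago = time.time() - 10 * 86400
    os.utime(old, (ten_days_ago, ten_days_ago))
    deleted = purge_old_artifacts(tmp, ttl_days=7)
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```

A scheduler would call such a helper every `CLEANUP_INTERVAL_MINUTES`; how the commit actually schedules cleanup is not visible in this excerpt.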

.github/workflows/ci.yml

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+name: CI
+
+on:
+  push:
+    branches: [ main, master ]
+  pull_request:
+    branches: [ "**" ]
+
+jobs:
+  test-api:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+      - name: Install system dependencies
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y ghostscript
+      - name: Install Python deps
+        working-directory: api
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+      - name: Run tests
+        env:
+          FAX_DISABLED: 'true'
+          FAX_DATA_DIR: './faxdata'
+          DATABASE_URL: 'sqlite:///./test_faxbot_ci.db'
+        working-directory: api
+        run: |
+          mkdir -p faxdata
+          pytest -q

HIPAA_REQUIREMENTS.md

Lines changed: 148 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,148 @@
1+
# HIPAA_REQUIREMENTS.md
2+
3+
This document describes what is required to operate Faxbot in a HIPAA‑aligned manner. It is a technical guide and checklist for engineers and operators. It is not legal advice. Always consult your compliance team and counsel. You (the operator) are responsible for implementing and documenting the controls below and for executing a formal risk analysis and governance program.
4+
5+
## Scope & Data Flows
6+
- Covered workflows: sending faxes that may contain PHI.
7+
- Not covered: receiving faxes (non‑goal), messaging, IVR, EHR integrations.
8+
9+
Backends (choose one):
10+
- Phaxio (cloud): Client → Faxbot API → Phaxio → PSTN/Fax. Phaxio fetches the PDF from your `PUBLIC_API_URL` and posts status callbacks.
11+
- SIP/Asterisk (self‑hosted): Client → Faxbot API → Asterisk (T.38/UDPTL) → SIP trunk → PSTN/Fax.
12+
13+
PHI touchpoints:
14+
- PDF/TXT upload to Faxbot API.
15+
- Stored job artifacts (original, PDF, TIFF for SIP).
16+
- Status updates (Phaxio callbacks or Asterisk AMI user events).
17+
- Application/Reverse proxy logs (must be PHI‑free).
18+
19+
## Roles & Agreements
20+
- If you are a Covered Entity or Business Associate, you must:
21+
- Execute a BAA with any cloud provider that may handle PHI (e.g., Phaxio). Contact provider sales to obtain a BAA; do not use without a BAA.
22+
- Treat Faxbot and Asterisk operators as Business Associates if they are separate entities.
23+
- Self‑hosted SIP stack does not remove HIPAA obligations; it moves them to you.
24+
+## Technical Safeguards (Security Rule)
+Implement the following as minimum controls:
+
+1) Transport security
+- Public API must be served over HTTPS. Use TLS certs from a reputable CA.
+- For Phaxio backend:
+  - `PUBLIC_API_URL` must be HTTPS in production.
+  - Enable callback signature verification (default on): `PHAXIO_VERIFY_SIGNATURE=true`. Server verifies `X-Phaxio-Signature` (HMAC‑SHA256 over raw body with `PHAXIO_API_SECRET`).
+- For SIP backend:
+  - SIP signaling should use TLS if supported by your provider; media (T.38 over UDPTL) is typically not encrypted. Mitigate with a site‑to‑site VPN/private interconnect to your SIP provider and strict firewalling.
+  - Never expose AMI (5038/tcp) to the public internet.
+
+2) Access control
+- Require API key on all /fax and /fax/{id} calls (`X-API-Key`). Do not run with blank `API_KEY` in production.
+- Restrict inbound traffic with a reverse proxy: IP allowlists and rate limiting.
+- Rotate credentials and set a strong AMI password. Do not use `changeme`.
+
+3) Data minimization & confidentiality
+- Do not log PHI. Ensure request bodies (PDF/TXT) and rendered content are never logged.
+- Faxbot redacts tokenized PDF URLs from logs.
+- Tokenized PDF access:
+  - The server issues a per‑job, random `pdf_token` with a short TTL (`PDF_TOKEN_TTL_MINUTES`, default 60). The `/fax/{job_id}/pdf` endpoint requires exact token equality and enforces expiry.
+  - Keep TTL as short as operationally feasible.
+
+4) Storage security (at rest)
+- Store database and artifacts on encrypted volumes or use a managed, encrypted database. SQLite is acceptable only if disk encryption and backups are in place.
+- Separate storage for development vs production. Limit admin access and use MFA on hosts.
+- Data retention policy: delete PDFs/TIFFs after transmission completes and your minimum retention requirement is satisfied.
+
+5) Integrity & auditing
+- Maintain audit logs of access to `/fax/{job_id}/pdf`, job creation, and status changes. No PHI in logs; use job IDs and metadata only.
+- Time synchronize servers (NTP) for accurate audit trails.
+
+6) Availability & recovery
+- Back up database (and optionally artifacts) on a secure, encrypted target with rotation.
+- Document restore procedures and test periodically.
+
+## Administrative Safeguards
+- Perform and document a HIPAA risk analysis for this system, covering threats to confidentiality, integrity, and availability.
+- Draft and adopt policies: access control, incident response, change management, data retention/secure destruction, vulnerability management.
+- Train workforce members on PHI handling and minimum necessary principles.
+- Maintain vendor due diligence (e.g., Phaxio BAA, SOC2 reports where applicable).
+
+## Physical Safeguards
+- Secure data center/hosting environment. For on‑prem: locked server rooms, visitor controls. For cloud: select providers with appropriate attestations.
+
+## Backend‑Specific Guidance
+
+### Phaxio (Cloud)
+- Required:
+  - BAA with Phaxio before sending PHI.
+  - HTTPS `PUBLIC_API_URL`, valid certificate.
+  - `PHAXIO_VERIFY_SIGNATURE=true`.
+  - Strong `API_KEY` and reverse proxy restrictions.
+- Recommended:
+  - Keep `PDF_TOKEN_TTL_MINUTES` small (e.g., 15–60 minutes).
+  - Immediately delete PDFs after successful transmission unless retention policy requires otherwise.
+  - Validate that `PHAXIO_STATUS_CALLBACK_URL` is reachable only over TLS.
+
+### SIP/Asterisk (Self‑Hosted)
+- T.38/UDPTL is not encrypted. Mitigations:
+  - Use a site‑to‑site VPN/private interconnect to your SIP provider, or run Asterisk in a private data center with dedicated connectivity.
+  - Strict firewall allows only necessary ports and only to/from provider IPs.
+  - Use SIP TLS for signaling if supported by your provider; still keep media protected by VPN.
+- Asterisk hardening:
+  - Do not expose AMI externally. Bind to private networks only.
+  - Use non‑default usernames, strong secrets, fail2ban/IDS.
+  - Rotate credentials periodically. Log and alert on failed auth.
+
+## MCP (AI Assistant) Considerations
+- MCP transmits base64 file content for the `send_fax` tool. Treat the MCP server with the same controls as the API (auth, network restrictions, audit logging).
+- Do not send PHI to LLMs or external services unless you have a signed BAA and an approved use case.
+- Keep MCP HTTP behind authentication and rate limiting.
+
+## Operational Checklist (Minimum)
+- [ ] Signed BAA with Phaxio (if using cloud backend).
+- [ ] TLS everywhere (HTTPS for public endpoints; VPN/private link for SIP media).
+- [ ] API auth enabled (`API_KEY` set). Reverse proxy with IP allowlist + rate limiting.
+- [ ] Callback signature verification enabled (`PHAXIO_VERIFY_SIGNATURE=true`).
+- [ ] Tokenized PDF access enabled with short TTL (`PDF_TOKEN_TTL_MINUTES`).
+- [ ] Logs do not contain PHI; tokens redacted; job IDs only.
+- [ ] Encrypted storage for DB and artifacts; backups configured.
+- [ ] Data retention policy implemented (delete artifacts after N days or on success).
+- [ ] Asterisk AMI not exposed; strong credentials; fail2ban.
+- [ ] Risk analysis, policies, and training documented.
+
+## Current Implementation Status (2025‑Q3)
+- Implemented:
+  - API key support, reverse proxy guidance.
+  - Tokenized PDF access with equality check and TTL expiry.
+  - Phaxio callback signature verification (HMAC‑SHA256).
+  - AMI concurrency/backoff improvements; SIP dialplan emits granular results.
+  - Docs for HTTPS, rate limiting, NAT/port‑forwarding.
+- Gaps (operator‑dependent):
+  - Encryption at rest (volume or DB) is operator‑managed.
+  - Automated retention cleanup (cron/job) recommended (see below).
+  - Centralized audit logging & alerting recommended.
+
+## Remediation Plan & Roadmap
+1) Automate artifact retention
+- Add `ARTIFACT_TTL_DAYS` env with a daily cleanup job to purge PDFs/TIFFs older than TTL when job status is final.
+
+2) Configurable audit logging
+- Structured logs with job lifecycle events; optional sink to SIEM.
+
+3) Optional hard fail on plain HTTP
+- Reject `PUBLIC_API_URL` with `http://` in non‑local environments unless `ALLOW_INSECURE_PUBLIC_URL=true`.
+
+4) Secrets management
+- Guidance and examples for loading secrets from a vault (AWS/GCP/Azure) instead of env files.
+
+5) Provider‑specific SIP hardening
+- Example configs for TLS signaling and site‑to‑site VPN topologies.
+
+## Example: Retention Cleanup (Operator)
+- Create a cron or systemd timer to delete artifacts after N days:
+```
+# delete PDFs/TIFFs older than 7 days
+find /path/to/faxdata -type f \( -name '*.pdf' -o -name '*.tiff' \) -mtime +7 -delete
+```
+- Ensure backups honor retention and secure destruction policies.
+
+## Legal Notice
+- This document does not constitute legal advice. HIPAA compliance depends on your specific implementation, vendor agreements, and organizational controls. Engage qualified counsel and security professionals.
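
Review note: the callback check described in HIPAA_REQUIREMENTS.md (HMAC‑SHA256 over the raw body, keyed with `PHAXIO_API_SECRET`, compared against `X-Phaxio-Signature`) can be sketched as follows. This is a minimal illustration, not the server code from this repository: `verify_signature` is a hypothetical name, and hex encoding of the digest is an assumption, so confirm the exact scheme against the provider's documentation.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, header_value: str, secret: str) -> bool:
    """Return True if header_value is the hex HMAC-SHA256 of raw_body under secret."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    supplied = header_value.strip().lower()
    # compare_digest runs in constant time, so a mismatch does not leak its position.
    return hmac.compare_digest(expected.encode("ascii"), supplied.encode("utf-8"))


# Usage demo with made-up values.
demo_secret = "test-secret"
demo_body = b'{"fax":{"id":1,"status":"success"}}'
demo_header = hmac.new(demo_secret.encode("utf-8"), demo_body, hashlib.sha256).hexdigest()
accepted = verify_signature(demo_body, demo_header, demo_secret)
tampered = verify_signature(demo_body + b" ", demo_header, demo_secret)
```

The signature must be computed over the bytes exactly as received; re-serializing parsed JSON before hashing would generally produce a different digest.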

README.md

Lines changed: 1 addition & 0 deletions
@@ -28,6 +28,7 @@ Simple fax-sending API with AI integration. Choose your backend:
 ## Documentation
 - [API Reference](docs/API_REFERENCE.md) — Endpoints and examples
 - [Troubleshooting](docs/TROUBLESHOOTING.md) — Common issues
+- [HIPAA Requirements](HIPAA_REQUIREMENTS.md) — Security, BAAs, and compliance checklist
 
 ## Notes
 - Send-only. Receiving is out of scope.

TODO.md

Lines changed: 26 additions & 22 deletions
@@ -14,9 +14,9 @@ This is a focused, prioritized backlog based on the latest audit.
 - [COMPLETED] AMI reconnect concurrency safety
   - Added connection lock; background reconnect from read loop
   - Keeps exponential backoff
-- Asterisk AMI username templating
-  - Use `${ASTERISK_AMI_USERNAME}` as the section name in `manager.conf.template`
-  - Keep credentials aligned between API env and Asterisk
+- [COMPLETED] Asterisk AMI username templating
+  - `manager.conf.template` now uses `${ASTERISK_AMI_USERNAME}` for the section name
+  - Docs updated to call out the alignment with API env
 - [COMPLETED] Redact sensitive tokens from logs
   - Do not log `pdf_url` tokens
   - Ensure all logs contain job IDs, not secrets
@@ -26,38 +26,42 @@ This is a focused, prioritized backlog based on the latest audit.
   - Docs updated; scripts and Makefile remain valid
 
 ## Medium Priority
-- Skip TIFF generation for Phaxio path
-  - Avoid `pdf_to_tiff` when backend is `phaxio` to save CPU/IO
-  - Page count can be filled by provider callback
-- Source of truth for pages
-  - Ensure provider callback updates overwrite any local estimates
-  - Document this behavior
-- Public URL defaults and docs
-  - Prefer HTTPS for `PUBLIC_API_URL` in docs
-  - Warn on startup when using localhost in non‑dev
-- Rate limiting and auth posture
-  - Warn if `API_KEY` is unset and `FAX_DISABLED=false`
-  - Add reverse proxy (nginx/Caddy) examples for IP allowlist + rate limits
+- [COMPLETED] Skip TIFF generation for Phaxio path
+  - Do not convert to TIFF when backend is `phaxio`; pages set by callback
+- [COMPLETED] Source of truth for pages
+  - Provider callback overwrites local estimates; documented
+- [COMPLETED] Public URL defaults and docs
+  - Startup warns when `PUBLIC_API_URL` is plain HTTP (non-local)
+  - Docs recommend HTTPS for production
+- [COMPLETED] Rate limiting and auth posture
+  - Startup warns when `API_KEY` is unset and faxing is enabled
+  - Added reverse proxy examples (Nginx/Caddy) to Troubleshooting
 
 ## Low Priority
-- Phone number normalization
-  - Document E.164 expectation; avoid guessing country codes
+- [COMPLETED] Phone number normalization
+  - Documented E.164 preference and best‑effort normalization note
 - MCP example configs
   - Treat `api/configs/*.json` as examples; rely on `setup-mcp.js` to generate paths
-- Ghostscript flags
-  - Clean up redundant options (e.g., lzw with tiffg4)
+- [COMPLETED] Ghostscript flags
+  - Removed redundant compression flag for tiffg4
 - File retention
   - Add TTL cleanup for `orig`, `pdf`, and `tiff` artifacts
   - Document storage footprint and retention controls
-- CI tests and scripts
-  - Add GH Actions to run `pytest -q` with `FAX_DISABLED=true`
+- [COMPLETED] CI tests and scripts
+  - GitHub Actions workflow added to run tests
 
 ## Documentation Improvements
 - Port forwarding appendix for novices (router examples/screenshots)
 - Asterisk template walkthrough with env examples per provider
 - “Data persistence” note: mount DB/file volumes for durability
+- [ADDED] HIPAA requirements document and operator checklist
+  - Status: Completed (HIPAA_REQUIREMENTS.md)
+- [COMPLETED] Audit logging docs
+  - Added optional SIEM sink configuration and events list
+- [COMPLETED] Provider TLS/VPN examples
+  - Added PJSIP TLS and WireGuard sketches to SIP guide
 
 ---
 
 # In Progress
-- Implement Asterisk AMI username templating
+- (none)
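
Review note: the TODO entry on phone number normalization refers to E.164, which is a leading `+`, a non-zero first digit, and at most 15 digits in total. A strict format check could look like the sketch below; `is_e164` is a hypothetical helper, and the project's own best-effort normalization is not shown in this excerpt.

```python
import re

# E.164: "+" followed by 2 to 15 digits, the first of which is not 0.
_E164_RE = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(number: str) -> bool:
    """Return True only for strictly formatted E.164 numbers (no spaces or dashes)."""
    return _E164_RE.fullmatch(number) is not None


checks = {
    "+15551234567": is_e164("+15551234567"),
    "5551234567": is_e164("5551234567"),
    "+1 555 123 4567": is_e164("+1 555 123 4567"),
}
```

Rejecting ambiguous input outright, rather than guessing a country code, matches the intent stated in the removed TODO line.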

api/app/audit.py

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+import logging
+import json
+from typing import Any, Dict, Optional
+from logging.handlers import SysLogHandler
+
+
+def init_audit_logger(
+    enabled: bool,
+    fmt: str = "json",
+    filepath: Optional[str] = None,
+    use_syslog: bool = False,
+    syslog_address: Optional[str] = None,
+) -> None:
+    logger = logging.getLogger("audit")
+    if not enabled:
+        logger.disabled = True
+        return
+    logger.setLevel(logging.INFO)
+    # Avoid duplicate handlers on reload
+    if logger.handlers:
+        return
+    if filepath:
+        handler = logging.FileHandler(filepath)
+    elif use_syslog:
+        address = syslog_address or "/dev/log"
+        handler = SysLogHandler(address=address)
+    else:
+        handler = logging.StreamHandler()
+    handler.setLevel(logging.INFO)
+
+    if fmt == "json":
+        handler.setFormatter(_JsonFormatter())
+    else:
+        handler.setFormatter(logging.Formatter("%(message)s"))
+    logger.addHandler(handler)
+
+
+def audit_event(event: str, **fields: Any) -> None:
+    logger = logging.getLogger("audit")
+    if logger.disabled:
+        return
+    payload: Dict[str, Any] = {"event": event}
+    payload.update(fields)
+    logger.info(payload)
+
+
+def mask_number(num: Optional[str]) -> Optional[str]:
+    if not num:
+        return num
+    digits = [c for c in num if c.isdigit()]
+    if len(digits) <= 4:
+        return "****"
+    masked = "*" * (len(digits) - 4) + "".join(digits[-4:])
+    return masked
+
+
+class _JsonFormatter(logging.Formatter):
+    def format(self, record: logging.LogRecord) -> str:
+        if isinstance(record.msg, dict):
+            return json.dumps(record.msg, separators=(",", ":"))
+        try:
+            return json.dumps({"message": str(record.getMessage())})
+        except Exception:
+            return str(record.getMessage())
+
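
Review note: to show what a record produced by these helpers might look like, the sketch below copies `mask_number` from the diff and serializes a payload the same way `_JsonFormatter` does for dict messages. The event name `job_created` and the field names `job_id` and `to` are illustrative assumptions; the events actually emitted by the API are not part of this excerpt.

```python
import json
from typing import Optional


def mask_number(num: Optional[str]) -> Optional[str]:
    # Copied from api/app/audit.py: keep only the last four digits visible.
    if not num:
        return num
    digits = [c for c in num if c.isdigit()]
    if len(digits) <= 4:
        return "****"
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


# Hypothetical event; field names are for illustration only.
payload = {"event": "job_created", "job_id": "abc123", "to": mask_number("+15551234567")}
line = json.dumps(payload, separators=(",", ":"))
print(line)
# {"event":"job_created","job_id":"abc123","to":"*******4567"}
```

Note that `mask_number` drops the `+` and any separators, so the masked value reveals the digit count and the last four digits but not the formatting of the original.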

api/app/config.py

Lines changed: 12 additions & 0 deletions
@@ -36,6 +36,18 @@ class Settings(BaseModel):
 
     # Security
     pdf_token_ttl_minutes: int = Field(default_factory=lambda: int(os.getenv("PDF_TOKEN_TTL_MINUTES", "60")))
+    enforce_public_https: bool = Field(default_factory=lambda: os.getenv("ENFORCE_PUBLIC_HTTPS", "false").lower() in {"1", "true", "yes"})
+
+    # Retention / cleanup
+    artifact_ttl_days: int = Field(default_factory=lambda: int(os.getenv("ARTIFACT_TTL_DAYS", "0")))  # 0=disabled
+    cleanup_interval_minutes: int = Field(default_factory=lambda: int(os.getenv("CLEANUP_INTERVAL_MINUTES", "1440")))
+
+    # Audit logging
+    audit_log_enabled: bool = Field(default_factory=lambda: os.getenv("AUDIT_LOG_ENABLED", "false").lower() in {"1", "true", "yes"})
+    audit_log_format: str = Field(default_factory=lambda: os.getenv("AUDIT_LOG_FORMAT", "json"))
+    audit_log_file: str = Field(default_factory=lambda: os.getenv("AUDIT_LOG_FILE", ""))
+    audit_log_syslog: bool = Field(default_factory=lambda: os.getenv("AUDIT_LOG_SYSLOG", "false").lower() in {"1", "true", "yes"})
+    audit_log_syslog_address: str = Field(default_factory=lambda: os.getenv("AUDIT_LOG_SYSLOG_ADDRESS", "/dev/log"))
 
 
 settings = Settings()
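
Review note: this excerpt shows how `enforce_public_https` is parsed but not how it is consumed. One plausible startup check, combining it with the warn-on-plain-HTTP behaviour mentioned in TODO.md, is sketched below. `env_flag` mirrors the truthy-string test used in `Settings`; `check_public_url` and its three return values are hypothetical.

```python
from urllib.parse import urlparse

TRUTHY = {"1", "true", "yes"}
# Assumption: these hosts count as "local" and are exempt from the HTTPS rule.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def env_flag(value: str) -> bool:
    """Same truthy-string test that Settings applies to boolean env vars."""
    return value.lower() in TRUTHY


def check_public_url(url: str, enforce_https: bool) -> str:
    """Classify a PUBLIC_API_URL as 'ok', 'warn', or 'reject'."""
    parsed = urlparse(url)
    if parsed.scheme == "https" or parsed.hostname in LOCAL_HOSTS:
        return "ok"
    # Plain HTTP on a non-local host: fail hard only when enforcement is on.
    return "reject" if enforce_https else "warn"


results = [
    check_public_url("http://localhost:8080", env_flag("true")),
    check_public_url("http://fax.example.com", env_flag("false")),
    check_public_url("http://fax.example.com", env_flag("TRUE")),
    check_public_url("https://fax.example.com", env_flag("yes")),
]
```

Keeping the default at `false` preserves the local development setup in `.env.example`, where `PUBLIC_API_URL` points at `http://localhost:8080`.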
