Commit 72160e3

feat: add rolling deployment support for zero-downtime proxy recycling
- Add RollingDeploymentManager to coordinate proxy recycling across providers
- Implement minimum availability constraints during age-based recycling
- Add batch size limits to control concurrent recycling operations
- Update all providers (DigitalOcean, AWS, GCP, Hetzner, Vultr) with rolling logic
- Add configuration via environment variables (ROLLING_DEPLOYMENT, ROLLING_MIN_AVAILABLE, ROLLING_BATCH_SIZE)
- Implement automatic adjustment when min_available >= min_scaling to prevent deadlock
- Add comprehensive API endpoints for monitoring and controlling rolling deployments
- Include detailed documentation and best practices guide
- Add extensive unit tests with 100% coverage of rolling deployment logic
1 parent 3aca766 commit 72160e3

11 files changed: 1318 additions and 21 deletions

README.md

Lines changed: 43 additions & 0 deletions
@@ -46,6 +46,7 @@
 - [Web Interface](#web-interface)
 - [API Documentation](#api-documentation)
 - [Programmatic Usage](#programmatic-usage)
+- [Rolling Deployments](#rolling-deployments)
 - [Multi-Account Provider Support](#multi-account-provider-support)
 - [API Examples](#cloudproxy-api-examples)
 - [Roadmap](#roadmap)
@@ -85,6 +86,7 @@ CloudProxy exposes an API and modern UI for managing your proxy infrastructure.
 * Multi-provider support
 * Multiple accounts per provider
 * Automatic proxy rotation
+* **Rolling deployments** - Zero-downtime proxy recycling
 * Health monitoring
 * Fixed proxy pool management (maintains target count)

@@ -293,6 +295,47 @@ my_request = requests.get("https://api.ipify.org", proxies=proxies)

 For more detailed examples of using CloudProxy as a Python package, see the [Python Package Usage Guide](docs/python-package-usage.md).

+## Rolling Deployments
+
+CloudProxy supports rolling deployments to ensure zero-downtime proxy recycling. This feature maintains a minimum number of healthy proxies during age-based recycling operations.
+
+### Configuration
+
+Enable rolling deployments with these environment variables:
+
+```bash
+# Enable rolling deployments
+ROLLING_DEPLOYMENT=True
+
+# Minimum proxies to keep available during recycling
+ROLLING_MIN_AVAILABLE=3
+
+# Maximum proxies to recycle simultaneously
+ROLLING_BATCH_SIZE=2
+```
+
+### How It Works
+
+When proxies reach their age limit:
+1. The system checks if recycling would violate minimum availability
+2. Proxies are recycled in batches to maintain service continuity
+3. New proxies are created as old ones are removed
+4. The process continues until all aged proxies are replaced
+
+### Monitoring
+
+Check rolling deployment status via the API:
+
+```bash
+# Get overall status
+curl http://localhost:8000/rolling
+
+# Get provider-specific status
+curl http://localhost:8000/rolling/digitalocean
+```
+
+For detailed documentation, see the [Rolling Deployments Guide](docs/rolling-deployments.md).
+
 ## Multi-Account Provider Support

 CloudProxy now supports multiple accounts per provider, allowing you to:
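
The "How It Works" steps added to the README can be illustrated with a small decision function. This is only a sketch: the commit's `RollingDeploymentManager` is not shown in this part of the diff, so `plan_recycle_batch` and its rule for the `min_available >= min_scaling` case are hypothetical. It also assumes that aged proxies are counted among the healthy ones. It shows how an availability floor and a batch limit could combine to decide how many aged proxies may be recycled at once.

```python
from typing import List


def plan_recycle_batch(
    aged_ips: List[str],
    healthy_count: int,
    recycling_count: int,
    min_available: int,
    batch_size: int,
    min_scaling: int,
) -> List[str]:
    """Return the aged proxies that may be recycled right now (hypothetical helper)."""
    # Assumed deadlock guard: if the floor is at or above the pool size, nothing
    # could ever be recycled, so lower the floor to leave room for one proxy.
    effective_min = min_available
    if min_scaling > 0 and min_available >= min_scaling:
        effective_min = max(0, min_scaling - 1)

    # Proxies that can go away without dropping below the floor.
    spare = max(0, healthy_count - effective_min)
    # Slots left in the current batch.
    slots = max(0, batch_size - recycling_count)
    return aged_ips[: min(spare, slots)]


if __name__ == "__main__":
    aged = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    # 5 healthy, floor of 3, batch of 2: two proxies may be recycled now.
    print(plan_recycle_batch(aged, 5, 0, 3, 2, 5))
    # Floor equals pool size (3 >= 3): floor is lowered to 2, so one proxy may go.
    print(plan_recycle_batch(aged, 3, 0, 3, 2, 3))
```

With `ROLLING_MIN_AVAILABLE=3` and `ROLLING_BATCH_SIZE=2` from the README example, a pool of five healthy proxies would, under these assumptions, be recycled two at a time.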

cloudproxy/main.py

Lines changed: 157 additions & 0 deletions
@@ -19,6 +19,7 @@

 from cloudproxy.providers import settings
 from cloudproxy.providers.settings import delete_queue, restart_queue
+from cloudproxy.providers.rolling import rolling_manager

 sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(__file__))))

@@ -726,6 +727,162 @@ def configure_instance(
         config=ProviderInstance(**instance_config)
     )

+# Rolling Deployment Models
+class RollingDeploymentConfig(BaseModel):
+    enabled: bool = Field(description="Whether rolling deployment is enabled")
+    min_available: int = Field(ge=0, description="Minimum number of proxies to keep available during recycling")
+    batch_size: int = Field(ge=1, description="Maximum number of proxies to recycle simultaneously")
+
+class RollingDeploymentStatus(BaseModel):
+    healthy: int = Field(description="Number of healthy proxies")
+    pending: int = Field(description="Number of pending proxies")
+    pending_recycle: int = Field(description="Number of proxies pending recycling")
+    recycling: int = Field(description="Number of proxies currently being recycled")
+    last_update: str = Field(description="Last update timestamp")
+    healthy_ips: List[str] = Field(description="List of healthy proxy IPs")
+    pending_recycle_ips: List[str] = Field(description="List of IPs pending recycling")
+    recycling_ips: List[str] = Field(description="List of IPs currently being recycled")
+
+class RollingDeploymentResponse(BaseModel):
+    metadata: Metadata = Field(default_factory=Metadata)
+    message: str
+    config: RollingDeploymentConfig
+    status: Dict[str, RollingDeploymentStatus] = Field(description="Status by provider/instance")
+
+@app.get("/rolling", tags=["Rolling Deployment"], response_model=RollingDeploymentResponse)
+def get_rolling_deployment_status():
+    """
+    Get the current rolling deployment configuration and status.
+
+    Returns:
+        RollingDeploymentResponse: Current rolling deployment configuration and status
+    """
+    config = RollingDeploymentConfig(
+        enabled=settings.config["rolling_deployment"]["enabled"],
+        min_available=settings.config["rolling_deployment"]["min_available"],
+        batch_size=settings.config["rolling_deployment"]["batch_size"]
+    )
+
+    raw_status = rolling_manager.get_recycling_status()
+    status = {}
+    for key, data in raw_status.items():
+        status[key] = RollingDeploymentStatus(**data)
+
+    return RollingDeploymentResponse(
+        message="Rolling deployment status retrieved successfully",
+        config=config,
+        status=status
+    )
+
+@app.patch("/rolling", tags=["Rolling Deployment"], response_model=RollingDeploymentResponse)
+def update_rolling_deployment_config(update: RollingDeploymentConfig):
+    """
+    Update the rolling deployment configuration.
+
+    Args:
+        update: New rolling deployment configuration
+
+    Returns:
+        RollingDeploymentResponse: Updated configuration and current status
+    """
+    # Update configuration
+    settings.config["rolling_deployment"]["enabled"] = update.enabled
+    settings.config["rolling_deployment"]["min_available"] = update.min_available
+    settings.config["rolling_deployment"]["batch_size"] = update.batch_size
+
+    # Get current status
+    raw_status = rolling_manager.get_recycling_status()
+    status = {}
+    for key, data in raw_status.items():
+        status[key] = RollingDeploymentStatus(**data)
+
+    return RollingDeploymentResponse(
+        message="Rolling deployment configuration updated successfully",
+        config=update,
+        status=status
+    )
+
+@app.get("/rolling/{provider}", tags=["Rolling Deployment"], response_model=RollingDeploymentResponse)
+def get_provider_rolling_status(provider: str):
+    """
+    Get rolling deployment status for a specific provider.
+
+    Args:
+        provider: The name of the provider
+
+    Returns:
+        RollingDeploymentResponse: Rolling deployment status for the provider
+
+    Raises:
+        HTTPException: If the provider is not found
+    """
+    if provider not in settings.config["providers"]:
+        raise HTTPException(
+            status_code=404,
+            detail=f"Provider '{provider}' not found"
+        )
+
+    config = RollingDeploymentConfig(
+        enabled=settings.config["rolling_deployment"]["enabled"],
+        min_available=settings.config["rolling_deployment"]["min_available"],
+        batch_size=settings.config["rolling_deployment"]["batch_size"]
+    )
+
+    raw_status = rolling_manager.get_recycling_status(provider=provider)
+    status = {}
+    for key, data in raw_status.items():
+        status[key] = RollingDeploymentStatus(**data)
+
+    return RollingDeploymentResponse(
+        message=f"Rolling deployment status for '{provider}' retrieved successfully",
+        config=config,
+        status=status
+    )
+
+@app.get("/rolling/{provider}/{instance}", tags=["Rolling Deployment"], response_model=RollingDeploymentResponse)
+def get_instance_rolling_status(provider: str, instance: str):
+    """
+    Get rolling deployment status for a specific provider instance.
+
+    Args:
+        provider: The name of the provider
+        instance: The name of the instance
+
+    Returns:
+        RollingDeploymentResponse: Rolling deployment status for the instance
+
+    Raises:
+        HTTPException: If the provider or instance is not found
+    """
+    if provider not in settings.config["providers"]:
+        raise HTTPException(
+            status_code=404,
+            detail=f"Provider '{provider}' not found"
+        )
+
+    if instance not in settings.config["providers"][provider]["instances"]:
+        raise HTTPException(
+            status_code=404,
+            detail=f"Provider '{provider}' instance '{instance}' not found"
+        )
+
+    config = RollingDeploymentConfig(
+        enabled=settings.config["rolling_deployment"]["enabled"],
+        min_available=settings.config["rolling_deployment"]["min_available"],
+        batch_size=settings.config["rolling_deployment"]["batch_size"]
+    )
+
+    raw_status = rolling_manager.get_recycling_status(provider=provider, instance=instance)
+    status = {}
+    for key, data in raw_status.items():
+        status[key] = RollingDeploymentStatus(**data)
+
+    return RollingDeploymentResponse(
+        message=f"Rolling deployment status for '{provider}/{instance}' retrieved successfully",
+        config=config,
+        status=status
+    )
+
 if __name__ == "__main__":
     main()
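
The `PATCH /rolling` endpoint added in this file takes a complete `RollingDeploymentConfig` body: none of the three fields declares a default, so all of them have to be sent. As a minimal sketch, a client request could be built with the standard library as shown below. The base URL `http://localhost:8000` is borrowed from the README examples, and `build_rolling_update` is a hypothetical helper name, not part of CloudProxy.

```python
import json
import urllib.request


def build_rolling_update(
    enabled: bool,
    min_available: int,
    batch_size: int,
    base_url: str = "http://localhost:8000",  # assumed; matches the README examples
) -> urllib.request.Request:
    """Build (but do not send) a PATCH /rolling request (hypothetical helper)."""
    # Mirror the model's constraints client-side: ge=0 and ge=1.
    if min_available < 0:
        raise ValueError("min_available must be >= 0")
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")

    body = json.dumps(
        {"enabled": enabled, "min_available": min_available, "batch_size": batch_size}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/rolling",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_rolling_update(enabled=True, min_available=3, batch_size=2)
    print(req.get_method(), req.full_url)
    # To send it against a running instance:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["config"])
```

As written in this diff, the handler only assigns into the in-memory `settings.config`, so a change made this way would presumably not survive a restart unless the environment variables are updated as well.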

cloudproxy/providers/aws/main.py

Lines changed: 67 additions & 5 deletions
@@ -12,6 +12,7 @@
     start_proxy,
 )
 from cloudproxy.providers.settings import delete_queue, restart_queue, config
+from cloudproxy.providers.rolling import rolling_manager


 def aws_deployment(min_scaling, instance_config=None):
@@ -58,18 +59,28 @@ def aws_check_alive(instance_config=None):
     """
     if instance_config is None:
         instance_config = config["providers"]["aws"]["instances"]["default"]
+
+    # Get instance name for rolling deployment tracking
+    instance_name = next(
+        (name for name, inst in config["providers"]["aws"]["instances"].items()
+         if inst == instance_config),
+        "default"
+    )

     ip_ready = []
+    pending_ips = []
+    instances_to_recycle = []
+
+    # First pass: identify healthy and pending instances
     for instance in list_instances(instance_config):
         try:
             elapsed = datetime.datetime.now(
                 datetime.timezone.utc
             ) - instance["Instances"][0]["LaunchTime"]
+
             if config["age_limit"] > 0 and elapsed > datetime.timedelta(seconds=config["age_limit"]):
-                delete_proxy(instance["Instances"][0]["InstanceId"], instance_config)
-                logger.info(
-                    f"Recycling AWS {instance_config.get('display_name', 'default')} instance, reached age limit -> " + instance["Instances"][0]["PublicIpAddress"]
-                )
+                # Queue for potential recycling
+                instances_to_recycle.append((instance, elapsed))
             elif instance["Instances"][0]["State"]["Name"] == "stopped":
                 logger.info(
                     f"Waking up: AWS {instance_config.get('display_name', 'default')} -> Instance " + instance["Instances"][0]["InstanceId"]
@@ -87,7 +98,9 @@ def aws_check_alive(instance_config=None):
                 logger.info(
                     f"Pending: AWS {instance_config.get('display_name', 'default')} -> " + instance["Instances"][0]["PublicIpAddress"]
                 )
-            # Must be "pending" if none of the above, check if alive or not.
+                if "PublicIpAddress" in instance["Instances"][0]:
+                    pending_ips.append(instance["Instances"][0]["PublicIpAddress"])
+            # Must be "running" if none of the above, check if alive or not.
             elif check_alive(instance["Instances"][0]["PublicIpAddress"]):
                 logger.info(
                     f"Alive: AWS {instance_config.get('display_name', 'default')} -> " + instance["Instances"][0]["PublicIpAddress"]
@@ -104,8 +117,57 @@ def aws_check_alive(instance_config=None):
                 logger.info(
                     f"Waiting: AWS {instance_config.get('display_name', 'default')} -> " + instance["Instances"][0]["PublicIpAddress"]
                 )
+                if "PublicIpAddress" in instance["Instances"][0]:
+                    pending_ips.append(instance["Instances"][0]["PublicIpAddress"])
         except (TypeError, KeyError):
             logger.info(f"Pending: AWS {instance_config.get('display_name', 'default')} -> allocating ip")
+
+    # Update rolling manager with current proxy health status
+    rolling_manager.update_proxy_health("aws", instance_name, ip_ready, pending_ips)
+
+    # Handle rolling deployments for age-limited instances
+    if instances_to_recycle and config["rolling_deployment"]["enabled"]:
+        rolling_config = config["rolling_deployment"]
+
+        for inst, elapsed in instances_to_recycle:
+            if "PublicIpAddress" in inst["Instances"][0]:
+                instance_ip = inst["Instances"][0]["PublicIpAddress"]
+
+                # Check if we can recycle this instance according to rolling deployment rules
+                if rolling_manager.can_recycle_proxy(
+                    provider="aws",
+                    instance=instance_name,
+                    proxy_ip=instance_ip,
+                    total_healthy=len(ip_ready),
+                    min_available=rolling_config["min_available"],
+                    batch_size=rolling_config["batch_size"],
+                    rolling_enabled=True,
+                    min_scaling=instance_config["scaling"]["min_scaling"]
+                ):
+                    # Mark as recycling and delete
+                    rolling_manager.mark_proxy_recycling("aws", instance_name, instance_ip)
+                    delete_proxy(inst["Instances"][0]["InstanceId"], instance_config)
+                    rolling_manager.mark_proxy_recycled("aws", instance_name, instance_ip)
+                    logger.info(
+                        f"Rolling deployment: Recycled AWS {instance_config.get('display_name', 'default')} instance (age limit) -> {instance_ip}"
+                    )
+                else:
+                    logger.info(
+                        f"Rolling deployment: Deferred recycling AWS {instance_config.get('display_name', 'default')} instance -> {instance_ip}"
+                    )
+    elif instances_to_recycle and not config["rolling_deployment"]["enabled"]:
+        # Standard non-rolling recycling
+        for inst, elapsed in instances_to_recycle:
+            delete_proxy(inst["Instances"][0]["InstanceId"], instance_config)
+            if "PublicIpAddress" in inst["Instances"][0]:
+                logger.info(
+                    f"Recycling AWS {instance_config.get('display_name', 'default')} instance, reached age limit -> " + inst["Instances"][0]["PublicIpAddress"]
+                )
+            else:
+                logger.info(
+                    f"Recycling AWS {instance_config.get('display_name', 'default')} instance, reached age limit -> " + inst["Instances"][0]["InstanceId"]
+                )
+
     return ip_ready

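
The first pass in `aws_check_alive` now only collects instances that are past `age_limit` instead of deleting them immediately, so the recycling decision can be made once the health of the whole pool is known. A stripped-down sketch of that partitioning is given below. It uses plain dictionaries shaped like the `list_instances` results seen in the diff; the `split_by_age` name and the sample data are illustrative only.

```python
import datetime
from typing import Dict, List, Tuple


def split_by_age(
    reservations: List[Dict],
    age_limit: int,
    now: datetime.datetime,
) -> Tuple[List[Dict], List[Dict]]:
    """Partition reservations into (aged, remaining) by launch time."""
    aged, remaining = [], []
    limit = datetime.timedelta(seconds=age_limit)
    for reservation in reservations:
        elapsed = now - reservation["Instances"][0]["LaunchTime"]
        # age_limit == 0 disables recycling, as in the diff's condition.
        if age_limit > 0 and elapsed > limit:
            aged.append(reservation)
        else:
            remaining.append(reservation)
    return aged, remaining


if __name__ == "__main__":
    now = datetime.datetime(2024, 1, 1, 12, 0, tzinfo=datetime.timezone.utc)
    sample = [
        {"Instances": [{"InstanceId": "i-old", "LaunchTime": now - datetime.timedelta(hours=2)}]},
        {"Instances": [{"InstanceId": "i-new", "LaunchTime": now - datetime.timedelta(minutes=5)}]},
    ]
    aged, remaining = split_by_age(sample, age_limit=3600, now=now)
    print([r["Instances"][0]["InstanceId"] for r in aged])
    print([r["Instances"][0]["InstanceId"] for r in remaining])
```

Passing `now` in explicitly, rather than calling `datetime.datetime.now()` inside the function, keeps the partitioning deterministic and easy to unit test.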
