- Understand Azure Well-Architected Framework
- Review the five pillars of cloud excellence
- Conduct workload assessment
- Analyze cost optimization recommendations
- Evaluate security and reliability posture
- Implement performance improvements
- Apply operational excellence practices
- Clean up all resources
- Azure CLI installed (
az --version) - Azure subscription with contributor access
- Location: Australia East
# Set Azure region and resource naming
LOCATION="australiaeast"
RG_NAME="rg-lab16d-waf"
APP_NAME="app-waf-$RANDOM"
# Display configuration
echo "$LOCATION"
echo "$RG_NAME"# Create resource group
az group create \
--name "$RG_NAME" \
--location "$LOCATION"# Display Well-Architected Framework overview
cat << 'EOF'
Azure Well-Architected Framework:
==================================
Overview:
The Azure Well-Architected Framework is a set of guiding tenets
that can be used to improve the quality of a workload. The framework
consists of five pillars of architectural excellence: Cost Optimization,
Operational Excellence, Performance Efficiency, Reliability, and Security.
The Five Pillars:
1. Cost Optimization:
Managing costs to maximize value delivered
Key Principles:
- Plan and estimate costs
- Provision with optimization
- Use monitoring and analytics
- Maximize efficiency
2. Operational Excellence:
Operations processes that keep a system running in production
Key Principles:
- Embrace DevOps culture
- Establish development standards
- Evolve operations with observability
- Deploy with confidence
- Automate for efficiency
3. Performance Efficiency:
Ability to scale to meet demands placed on it by users
Key Principles:
- Design for horizontal scaling
- Shift left on performance testing
- Continuously monitor
- Improve efficiency through optimization
4. Reliability:
Ability to recover from failures and continue to function
Key Principles:
- Design for business requirements
- Design for resilience
- Design for recovery
- Design for operations
5. Security:
Protecting applications and data from threats
Key Principles:
- Plan resources and how to harden them
- Design to protect confidentiality
- Design to protect integrity
- Design to protect availability
- Sustain and evolve security posture
Framework Structure:
Each Pillar Contains:
- Design Principles
- Assessment Questions
- Best Practices
- Guidance Documentation
- Reference Architectures
Assessment Process:
1. Workload Identification:
- Define application/service
- Identify components
- Map dependencies
- Document architecture
2. Assessment Execution:
- Answer pillar questions
- Rate current state
- Identify gaps
- Prioritize improvements
3. Recommendation Review:
- Review suggested actions
- Assess impact and effort
- Create improvement plan
- Track implementation
4. Continuous Improvement:
- Regular reassessments
- Monitor progress
- Adapt to changes
- Share learnings
Tools and Resources:
Azure Well-Architected Review:
- Interactive assessment tool
- Automated recommendations
- Priority scoring
- Available in Azure Portal
Azure Advisor:
- Automated recommendations
- Continuous monitoring
- Aligned with WAF pillars
- Actionable guidance
Documentation:
- Microsoft Learn
- Architecture Center
- Reference architectures
- Best practice guides
Benefits:
Architectural Confidence:
✓ Proven patterns
✓ Industry best practices
✓ Risk mitigation
✓ Quality assurance
Cost Savings:
✓ Identify waste
✓ Optimization opportunities
✓ Right-sizing guidance
✓ 10-30% typical savings
Improved Reliability:
✓ High availability design
✓ Disaster recovery
✓ Fault tolerance
✓ Better SLAs
Enhanced Security:
✓ Threat protection
✓ Compliance alignment
✓ Data protection
✓ Identity security
Better Performance:
✓ Scalability
✓ Latency optimization
✓ Throughput improvement
✓ User experience
Operational Maturity:
✓ Automation
✓ Monitoring
✓ DevOps practices
✓ Incident response
When to Use:
New Workload Design:
- Architecture planning
- Technology selection
- Design validation
- Risk assessment
Existing Workload Review:
- Health check
- Modernization planning
- Cost optimization
- Compliance review
Before Major Changes:
- Architecture changes
- Scaling plans
- Migration
- New features
Regular Intervals:
- Quarterly reviews
- Annual assessments
- After incidents
- Continuous improvement
EOF# Display cost optimization pillar
cat << 'EOF'
Cost Optimization Pillar:
=========================
Principle: Manage costs to maximize value
Key Focus Areas:
1. Cost Model and Budgets:
Questions:
☐ Do you have a cost model?
☐ Are budgets defined?
☐ Is cost allocated to teams?
☐ Are budgets monitored?
Best Practices:
✓ Define cost model
✓ Set realistic budgets
✓ Implement chargeback/showback
✓ Regular cost reviews
✓ Cost allocation tags
2. Provision Resources:
Questions:
☐ Are resources right-sized?
☐ Using appropriate SKUs?
☐ Leveraging PaaS when possible?
☐ Using serverless for variable workloads?
Best Practices:
✓ Right-size resources
✓ Choose appropriate tiers
✓ Use PaaS over IaaS
✓ Serverless for event-driven
✓ Avoid over-provisioning
3. Optimize Spending:
Questions:
☐ Using reserved instances?
☐ Azure Hybrid Benefit applied?
☐ Spot instances for suitable workloads?
☐ Auto-scaling configured?
Best Practices:
✓ Purchase reservations (40-60% savings)
✓ Apply existing licenses
✓ Use spot for fault-tolerant (90% savings)
✓ Auto-scale to match demand
✓ Schedule shutdown for dev/test
4. Monitor and Optimize:
Questions:
☐ Cost monitoring in place?
☐ Cost alerts configured?
☐ Azure Advisor reviewed?
☐ Regular optimization?
Best Practices:
✓ Enable Cost Management
✓ Set up budget alerts
✓ Act on Advisor recommendations
✓ Review monthly
✓ Track optimization savings
Cost Checklist:
Compute:
☐ VMs right-sized
☐ Reserved instances for stable
☐ Spot VMs for fault-tolerant
☐ Auto-shutdown for dev/test
☐ Azure Hybrid Benefit applied
Storage:
☐ Appropriate tier (Hot/Cool/Archive)
☐ Lifecycle management
☐ Unused disks deleted
☐ Snapshots managed
Networking:
☐ Minimize cross-region traffic
☐ CDN for static content
☐ Unused public IPs deleted
Databases:
☐ Right-sized DTUs/vCores
☐ Serverless for variable
☐ Reserved capacity
General:
☐ Unused resources deleted
☐ Dev/test pricing used
☐ Tags for cost allocation
☐ Budgets configured
Typical Savings:
- Right-sizing: 20-30%
- Reserved instances: 40-60%
- Spot VMs: up to 90%
- Storage tiering: 50-70%
- Overall: 10-30% total reduction
EOF# Display operational excellence pillar
cat << 'EOF'
Operational Excellence Pillar:
==============================
Principle: Operations that keep systems running
Key Focus Areas:
1. DevOps Culture:
Questions:
☐ DevOps culture embraced?
☐ Cross-functional teams?
☐ Continuous improvement?
Best Practices:
✓ Break down silos
✓ Shared responsibility
✓ Regular retrospectives
✓ Learn from failures
2. Development Standards:
Questions:
☐ Coding standards defined?
☐ Code review process?
☐ Infrastructure as Code?
Best Practices:
✓ Coding guidelines
✓ Peer code reviews
✓ IaC (ARM, Bicep, Terraform)
✓ Git for all code
3. Observability:
Questions:
☐ Comprehensive monitoring?
☐ Centralized logging?
☐ Distributed tracing?
Best Practices:
✓ Application Insights
✓ Log Analytics
✓ Metrics and dashboards
✓ Actionable alerts
4. Deployment:
Questions:
☐ CI/CD pipelines?
☐ Automated testing?
☐ Rollback capability?
Best Practices:
✓ Azure DevOps/GitHub Actions
✓ Automated tests
✓ Progressive deployment
✓ Easy rollback
5. Automation:
Questions:
☐ Manual tasks automated?
☐ Runbooks documented?
☐ Auto-scaling configured?
Best Practices:
✓ Azure Automation
✓ Documented runbooks
✓ Auto-remediation
✓ Reduce toil
Operations Checklist:
Infrastructure:
☐ Infrastructure as Code
☐ Version controlled
☐ Automated deployments
Monitoring:
☐ Application monitoring
☐ Infrastructure monitoring
☐ Log aggregation
Alerting:
☐ Health alerts
☐ Performance alerts
☐ On-call rotation
Deployment:
☐ CI/CD pipelines
☐ Automated testing
☐ Progressive rollout
Documentation:
☐ Architecture diagrams
☐ Runbooks
☐ Postmortem reports
Maturity Levels:
Level 1: Manual operations
Level 2: Basic automation
Level 3: CI/CD implemented
Level 4: Metrics-driven
Level 5: Full automation, self-healing
EOF# Display performance efficiency pillar
cat << 'EOF'
Performance Efficiency Pillar:
==============================
Principle: Scale to meet user demands
Key Focus Areas:
1. Scalability Design:
Questions:
☐ Designed for horizontal scaling?
☐ Stateless application?
☐ Caching implemented?
Best Practices:
✓ Scale-out over scale-up
✓ Stateless application tier
✓ Redis cache layer
✓ CDN for static content
2. Performance Testing:
Questions:
☐ Performance tests in CI/CD?
☐ Load testing performed?
☐ SLOs defined?
Best Practices:
✓ Automated performance tests
✓ Regular load testing
✓ Performance budgets
✓ SLO monitoring
3. Monitoring:
Questions:
☐ Performance metrics tracked?
☐ APM in place?
☐ Performance alerts?
Best Practices:
✓ Application Insights
✓ Response time tracking
✓ User experience monitoring
✓ Latency alerts
4. Optimization:
Questions:
☐ Code profiling performed?
☐ Database queries optimized?
☐ Network latency minimized?
Best Practices:
✓ Regular profiling
✓ Query optimization
✓ CDN and caching
✓ Connection pooling
Performance Checklist:
Application:
☐ Horizontal scaling
☐ Stateless design
☐ Async processing
☐ Connection pooling
Caching:
☐ Redis cache
☐ CDN for static assets
☐ App-level caching
☐ Query caching
Database:
☐ Indexes optimized
☐ Queries tuned
☐ Read replicas
☐ Connection pooling
Networking:
☐ CDN enabled
☐ Accelerated networking
☐ Private endpoints
☐ Compression enabled
Auto-Scaling:
☐ VMSS auto-scale
☐ App Service auto-scale
☐ AKS autoscaler
Performance Targets:
Response Time:
Excellent: <100ms
Good: 100-300ms
Acceptable: 300-1000ms
Poor: >1000ms
Improvements:
- Caching: 50-90% faster
- CDN: 70-90% latency reduction
- Database optimization: 5-10x
EOF# Display reliability pillar
cat << 'EOF'
Reliability Pillar:
===================
Principle: Recover from failures and continue
Key Focus Areas:
1. Business Requirements:
Questions:
☐ SLA requirements defined?
☐ RPO/RTO determined?
☐ Availability targets set?
Best Practices:
✓ Define SLAs (99.9%, 99.95%, 99.99%)
✓ Set RPO/RTO objectives
✓ Classify criticality
✓ Cost vs reliability tradeoff
2. Resiliency:
Questions:
☐ Redundancy implemented?
☐ Load balancing configured?
☐ Availability Zones used?
Best Practices:
✓ Multiple instances
✓ Load balancer
✓ Availability Zones (99.99% SLA)
✓ Circuit breaker pattern
✓ Retry logic
3. Recovery:
Questions:
☐ Backup configured?
☐ Disaster recovery plan?
☐ Recovery tested?
Best Practices:
✓ Azure Backup
✓ DR to secondary region
✓ Regular DR drills
✓ Geo-redundant storage
✓ Automated failover
4. Operations:
Questions:
☐ Health monitoring?
☐ Failure alerts?
☐ Incident response plan?
Best Practices:
✓ Comprehensive monitoring
✓ Actionable alerts
✓ On-call coverage
✓ Documented procedures
✓ Postmortem analysis
Reliability Checklist:
High Availability:
☐ Multiple instances
☐ Load balancer
☐ Availability Zone
☐ Health probes
☐ Auto-scaling
Backup:
☐ Azure Backup enabled
☐ Retention policy
☐ Recovery tested
☐ Geo-redundant
Disaster Recovery:
☐ Secondary region
☐ Replication configured
☐ Failover tested
☐ DR runbook
Monitoring:
☐ Uptime monitoring
☐ Health checks
☐ Availability alerts
Fault Tolerance:
☐ Retry logic
☐ Circuit breaker
☐ Timeout handling
☐ Graceful degradation
SLA Levels:
Single Instance: 99.9% (43 min/month downtime)
Availability Set: 99.95% (22 min/month)
Availability Zone: 99.99% (4 min/month)
Multi-Region: 99.999% (26 sec/month)
EOF# Display security pillar
cat << 'EOF'
Security Pillar:
================
Principle: Protect applications and data
Key Focus Areas:
1. Resource Planning:
Questions:
☐ Security requirements defined?
☐ Compliance needs identified?
☐ Threat model created?
Best Practices:
✓ Define requirements
✓ Identify compliance (GDPR, HIPAA)
✓ Threat modeling (STRIDE)
✓ Security baseline
2. Confidentiality:
Questions:
☐ Data encrypted at rest?
☐ Data encrypted in transit?
☐ Secrets in Key Vault?
Best Practices:
✓ Storage encryption
✓ TLS/HTTPS everywhere
✓ Azure Key Vault
✓ RBAC least privilege
3. Integrity:
Questions:
☐ Input validation?
☐ Audit logging?
☐ Change tracking?
Best Practices:
✓ Validate all inputs
✓ Comprehensive audit logs
✓ Immutable infrastructure
✓ Version control
4. Availability:
Questions:
☐ DDoS protection?
☐ WAF configured?
☐ Rate limiting?
Best Practices:
✓ Azure DDoS Protection
✓ Application Gateway WAF
✓ API throttling
✓ Regular backups
5. Sustain Security:
Questions:
☐ Security monitoring?
☐ Vulnerability scanning?
☐ Patch management?
Best Practices:
✓ Defender for Cloud
✓ Automated scanning
✓ Update Management
✓ Incident response plan
Security Checklist:
Identity:
☐ Azure AD authentication
☐ MFA enabled
☐ Conditional Access
☐ Managed identities
☐ Least privilege
Network:
☐ NSG rules configured
☐ Private endpoints
☐ VNet segmentation
☐ DDoS protection
Data:
☐ Encryption at rest
☐ Encryption in transit
☐ Key Vault for secrets
☐ Backup encrypted
Application:
☐ Input validation
☐ SQL injection prevention
☐ XSS prevention
☐ HTTPS only
Compliance:
☐ Security baseline
☐ Audit logging
☐ Regular assessments
Security Patterns:
Defense in Depth:
Multiple layers of protection
Least Privilege:
Minimum required access
Zero Trust:
Verify explicitly, never trust
Secrets Management:
Key Vault, no hardcoding
EOF# Display assessment workflow
cat << 'EOF'
WAF Assessment Process:
=======================
1. Preparation (1-2 hours):
Gather:
- Architecture diagrams
- Resource inventory
- Current metrics
- Known issues
2. Assessment (2-4 hours):
For Each Pillar:
- Answer questions
- Rate current state
- Identify gaps
- Review recommendations
3. Analysis (1-2 hours):
Prioritize:
High Impact + Low Effort: DO FIRST
- Delete unused resources
- Enable MFA
- Configure alerts
High Impact + High Effort: PLAN
- Multi-region deployment
- Migrate to PaaS
Low Impact + Low Effort: FILL-IN
- Documentation
- Tagging
Low Impact + High Effort: RECONSIDER
4. Action Planning (1-2 hours):
Create Roadmap:
Immediate (0-30 days):
☐ Enable backup
☐ Configure alerts
☐ Implement tagging
Short-term (1-3 months):
☐ Right-size resources
☐ Purchase RIs
☐ Implement auto-scaling
Medium-term (3-6 months):
☐ Multi-region deployment
☐ Migrate to PaaS
Long-term (6-12 months):
☐ Architecture modernization
☐ Microservices
5. Execution and Tracking:
Implementation:
- Assign owners
- Set deadlines
- Track progress
Regular Reviews:
- Monthly: Quick wins
- Quarterly: Major initiatives
- Annually: Full reassessment
Sample Results:
Cost Optimization: 65/100 ⚠
Critical: No reserved instances
Warning: Oversized VMs
Operational Excellence: 75/100 ✓
Good: CI/CD implemented
Warning: Manual runbooks
Performance: 70/100 ✓
Good: Auto-scaling
Warning: No caching
Reliability: 60/100 ⚠
Critical: Single-region only
Critical: No backup
Security: 80/100 ✓
Good: MFA enabled
Warning: Some public endpoints
EOF# Create App Service plan
az appservice plan create \
--name "$APP_NAME-plan" \
--resource-group "$RG_NAME" \
--location "$LOCATION" \
--sku B1 \
--is-linux
echo "App Service plan created"
# Create web app
az webapp create \
--name "$APP_NAME" \
--resource-group "$RG_NAME" \
--plan "$APP_NAME-plan" \
--runtime "NODE:18-lts"
echo "Web app created"# Tag with assessment metadata
az group update \
--name "$RG_NAME" \
--tags \
WAF-Date="2024-01" \
WAF-Overall="70" \
WAF-Cost="65" \
WAF-Operations="75" \
WAF-Performance="70" \
WAF-Reliability="60" \
WAF-Security="80"
echo "WAF assessment tags applied"# Display improvement plan
cat << 'EOF'
Improvement Roadmap:
====================
Phase 1: Quick Wins (Month 1)
Week 1-2:
✓ Enable Azure Backup
✓ Configure budget alerts
✓ Implement tagging
Impact: Basic protection
Week 3-4:
✓ Enable MFA
✓ Right-size VMs
Impact: 15% cost savings, better security
Results:
- Backup enabled
- MFA enforced
- Savings: $750/month
Phase 2: Foundation (Months 2-3)
Month 2:
✓ Purchase reserved instances (40% savings)
✓ Implement auto-scaling
✓ Add health probes
Impact: $2000/month savings, better availability
Month 3:
✓ Implement caching (60% faster)
✓ Enable CDN
✓ Centralized logging
Impact: Better performance, observability
Results:
- Cost savings: $2,750/month
- Performance: 60% improvement
- Availability: 99.9%
Phase 3: Scale (Months 4-6)
Month 4-5:
✓ Multi-region deployment (99.95% SLA)
✓ Migrate to PaaS
Impact: DR capability, reduced overhead
Month 6:
✓ Implement CI/CD
✓ Advanced monitoring
Impact: Faster deployments
Results:
- Availability: 99.95%
- Deployment: Daily
- MTTR: Reduced 50%
Phase 4: Optimize (Months 7-12)
Quarters 3-4:
✓ Microservices migration
✓ Advanced security
✓ Chaos engineering
✓ FinOps maturity
Impact: 30% cost optimization
Results:
- Availability: 99.99%
- Cost: 30% reduction
- Security: 95/100
- MTTR: <30 minutes
Total (12 months):
Cost:
- Monthly savings: $3,500 (30%)
- Annual savings: $42,000
- ROI: 6-month payback
Reliability:
- Availability: 99.99%
- MTTR: 30 minutes
Performance:
- 60% faster responses
- 3x throughput
Security:
- Score: 95/100
- 80% fewer incidents
Operations:
- 10x deployment frequency
- 90% less manual work
EOF# Display ongoing practices
cat << 'EOF'
Continuous Improvement:
=======================
Monthly:
Week 1:
✓ Review Advisor recommendations
✓ Cost optimization
Week 2:
✓ Security review
✓ Patch compliance
Week 3:
✓ Performance review
✓ Capacity planning
Week 4:
✓ Reliability review
✓ Incident analysis
Quarterly:
✓ Full WAF re-assessment
✓ Score tracking
✓ Roadmap adjustment
✓ Stakeholder communication
Annual:
✓ Comprehensive assessment
✓ Strategic review
✓ Budget planning
✓ Multi-year roadmap
Culture:
Principles:
✓ Always learning
✓ Data-driven decisions
✓ Experiment and iterate
✓ Share knowledge
Practices:
✓ Blameless postmortems
✓ Regular retrospectives
✓ Innovation time
✓ Documentation
Metrics:
Cost:
- Monthly spend trend
- Optimization savings
- RI coverage
Reliability:
- Availability %
- MTBF/MTTR
- Incident count
Performance:
- Response times
- Throughput
- Error rates
Security:
- Security score
- Vulnerabilities
- Incidents
Operations:
- Deployment frequency
- Lead time
- Automation %
EOF# Delete resource group and all resources
az group delete \
--name "$RG_NAME" \
--yes \
--no-wait
echo "Cleanup complete"You explored Azure Well-Architected Framework as comprehensive architectural guidance across five pillars including cost optimization for maximizing value, operational excellence for production operations, performance efficiency for scaling, reliability for fault tolerance, and security for threat protection, reviewed cost optimization pillar with right-sizing achieving 20-30% savings, reserved instances providing 40-60% discounts, spot VMs offering up to 90% savings, appropriate SKU selection, lifecycle management, and tagging for cost allocation, examined operational excellence pillar emphasizing DevOps culture, infrastructure as code using ARM/Bicep/Terraform, comprehensive monitoring with Application Insights and Log Analytics, CI/CD pipelines, automation to reduce manual toil, and maturity progression from manual to self-healing operations, learned performance efficiency pillar covering horizontal scaling design, multi-level caching strategies providing 50-90% improvements, CDN integration reducing latency 70-90%, database optimization achieving 5-10x gains, async processing, and auto-scaling to meet demand, analyzed reliability pillar defining SLA targets from 99.9% to 99.99%, implementing redundancy with availability zones, backup and disaster recovery with RPO/RTO objectives, health monitoring, retry logic with exponential backoff, and circuit breaker patterns, studied security pillar implementing defense-in-depth with multiple protection layers, zero trust principles, Azure AD with MFA, encryption at rest and in transit, Key Vault for secrets management, network segmentation, DDoS protection, and continuous vulnerability management, conducted Well-Architected assessment process with preparation phase gathering architecture documentation, execution phase answering pillar questions and rating current state, analysis using impact-effort matrix for prioritization, action planning with phased roadmap from quick wins to strategic initiatives, and tracking with regular reviews, developed comprehensive improvement roadmap achieving Phase 1 quick wins with 15% savings and MFA implementation, Phase 2 foundation work delivering $2750 monthly savings and 60% performance improvement, Phase 3 scaling to 99.95% availability with CI/CD, and Phase 4 optimization reaching 30% total cost reduction and 99.99% availability with microservices and chaos engineering, and established continuous improvement practices with monthly Advisor reviews, quarterly full WAF re-assessments, annual strategic planning, metrics tracking across all five pillars, blameless postmortems, regular retrospectives, and governance through cross-functional teams driving ongoing cloud excellence culture.