---
title: Introduction to business uptime with the New Relic platform
tags:
- Observability maturity
- Intelligent observability
- Instrumentation
- Implementation guide
metaDescription: Introduction to business uptime
redirects:
- /docs/new-relic-solutions/observability-maturity
- /docs/full-stack-observability
- /docs/new-relic-solutions/best-practices-guides/full-stack-observability
freshnessValidatedDate: never
---

# Overview

Business uptime is a critical metric for any organization, reflecting the reliability and availability of services that directly impact customer satisfaction and business results. The New Relic Observability platform provides a comprehensive suite of tools and practices to enhance business uptime through improved service delivery. This document outlines a maturity progression model that leverages observability practices to drive business-focused results, specifically targeting business uptime.

# Maturity progression model

The maturity progression model guides organizations through a structured journey from a reactive approach, to a proactive approach, and ultimately to mastery of observability. Each level is characterized by specific practices and metrics, captured in the related scorecard, that help you measure and improve business uptime.

## Level 1: Reactive approach

At the reactive level, organizations respond to incidents as they occur, often without prior warning. The focus is on establishing basic alert mechanisms to ensure that issues are detected promptly. The following rules are used to evaluate the effectiveness of a reactive approach:

- ***[Infrastructure Alert Coverage](/docs/new-relic-solutions/observability-maturity/business-uptime/l1-infrastructure-alert-coverage):*** Ensures alert definitions are present for INFRA-HOST or INFRA-KUBERNETES-POD entities. A lack of alerts results in a failure score.
- ***[Service Delivery Alert Coverage](/docs/new-relic-solutions/observability-maturity/business-uptime/l1-service-delivery-alert-coverage):*** Checks for alert definitions on APM-APPLICATION, BROWSER-APPLICATION, MOBILE-APPLICATION, or SYNTH-MONITOR entities. Missing alerts lead to a failure score.
- ***[Critical Alert Coverage](/docs/new-relic-solutions/observability-maturity/business-uptime/l1-critical-alert-coverage):*** Evaluates a 7-day sample of alert incidents per target entity to determine the percentage due to critical versus warning violations.
- ***[Alert Noise](/docs/new-relic-solutions/observability-maturity/business-uptime/l1-alert-noise):*** Assesses incidents over a 7-day period to determine if a specific policy is responsible for more than 14 incidents during that time.

## Level 2: Proactive approach

The proactive level involves anticipating potential issues before they impact business operations. Organizations at this stage use observability practices to continuously improve service delivery. The following rules and metrics are evaluated:

- ***[Service Level Coverage](/docs/new-relic-solutions/observability-maturity/business-uptime/l2-service-level-coverage):*** Assesses whether entities have defined Service Level Indicators (SLIs) during the latest entity harvest. Defined SLIs indicate proactive monitoring.
- ***[Alerts Mean Time To Close](/docs/new-relic-solutions/observability-maturity/business-uptime/l2-alerts-mean-time-to-close):*** Measures the time taken to close incidents, with resolutions under 30 minutes considered successful. This metric reflects the efficiency of incident management processes.
- ***[APM Criticality Tag Coverage](/docs/new-relic-solutions/observability-maturity/business-uptime/l2-apm-criticality-tag-coverage):*** Evaluates the assignment of criticality ratings (low, medium, high) to entities, highlighting their importance for business operations.

## Level 3: Mastery

At the mastery level, organizations achieve direct business benefits from their observability practices, transcending mere incident remediation. The focus is on service level attainment:

- ***[Service Level Attainment](/docs/new-relic-solutions/observability-maturity/business-uptime/l3-service-level-attainment):*** Evaluates the latest service level compliance score for each defined SLI. A success rate above 95% is considered successful, indicating high reliability and uptime.

# Observability practices

Observability practices are the actionable components of the maturity model, enabling organizations to realize the potential value of the New Relic platform. These practices include:

- ***[Alert quality management (AQM)](/docs/new-relic-solutions/observability-maturity/uptime-performance-reliability/aqm-implementation-guide/)***: Reduces alert fatigue by focusing on alerts with true business impact. AQM improves response times and increases awareness of critical events, leading to higher uptime and availability.
- ***[Service level management (SLM)](/docs/new-relic-solutions/observability-maturity/uptime-performance-reliability/slm-implementation-guide/)***: Standardizes data into a universal language, improving communication between IT and business stakeholders. SLM enhances reliability by reducing business-impacting incidents and their duration.

The New Relic Observability platform provides a structured approach to improving business uptime through a maturity progression model. By advancing from reactive to proactive and mastery levels, organizations can achieve significant improvements in service delivery and business results. Observability practices such as AQM and SLM play a crucial role in this journey, ensuring that organizations focus on the right metrics and actions to enhance reliability and uptime.

# Next steps

Organizations are encouraged to explore New Relic's resources and guides to tailor their observability journey according to their specific needs. By leveraging the maturity progression model and observability practices, businesses can unlock the full potential of the New Relic platform and achieve their uptime goals.
---
title: Level 1 - Alert noise scorecard rule
tags:
- Observability maturity
- Intelligent observability
- Instrumentation
- Implementation guide
metaDescription: Observability maturity business uptime alert noise
redirects:
- /docs/new-relic-solutions/observability-maturity
- /docs/full-stack-observability
- /docs/new-relic-solutions/best-practices-guides/full-stack-observability
freshnessValidatedDate: never
---

# Overview

The Alert Noise rule has produced a score based on the number of incidents attributed to specific policies over a 7-day window. This document explains the interpretation of your score and offers guidance on actions you can take to optimize your incident management strategy.

# Description

The score evaluates incidents over a 7-day period to determine if a specific policy is responsible for more than 14 incidents during that time.
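
If you want to see how your accounts trend against this threshold, a query along the lines of the sketch below approximates the evaluation. It assumes incidents are recorded in the `NrAiIncident` event type (the default for New Relic alert incidents); it's an illustration, not the exact query the scorecard rule runs.

```sql
// Alert incidents opened per policy over the last 7 days.
// Policies returning more than 14 incidents would exceed this rule's default threshold.
SELECT count(*) AS 'Incidents opened'
FROM NrAiIncident
WHERE event = 'open'
FACET policyName
SINCE 7 days ago
LIMIT 20
```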

# Interpretation

The goal is to maintain a manageable number of incidents that can be effectively addressed. If incidents recur from specific policies, identify and remediate the underlying source of instability. If an incident isn't suitable for remediation, consider whether it should be classified as a critical violation at all.

# Actions to Consider

- ***Evaluate Target Cohort:*** Ensure the rule is targeting the correct account and cohort of entity types. Modify the rule to address alert incidents for target entities as needed.

- ***Adjust Incident Occurrence Threshold:*** Review and adjust the incident occurrence threshold to align with your expectations and systems management standards. You may find that a higher or lower incident frequency is appropriate for your alerting strategy.

- ***Develop Long-Term Assessment:*** Assess policy violations over longer evaluation periods to identify systems with persistent reliability challenges (see the example query after this list). Create a prioritized list with a risk assessment for each system to determine whether it could benefit from architectural or implementation improvements.
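
As a starting point for that longer-term assessment, the sketch below widens the window to 30 days and plots weekly incident counts per policy. It assumes the `NrAiIncident` event type; adjust the window, facet, and threshold to fit your own standards.

```sql
// Weekly incident counts per policy over the last 30 days,
// useful for spotting policies that generate noise persistently rather than in bursts.
SELECT count(*) AS 'Incidents opened'
FROM NrAiIncident
WHERE event = 'open'
FACET policyName
SINCE 30 days ago
TIMESERIES 1 week
```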

# Important Considerations

* Custom Evaluation: Remember, these rules and scores are not an exact science. It's crucial to evaluate them based on your specific needs and conditions. Tailor your measurements to align with your systems management standards and best practices.
* Continuous Improvement: Incident management strategies should evolve. Regularly review and adjust your approach to ensure it meets your current requirements.

By understanding your score and taking the recommended actions, you can enhance your policy incident management and ensure it aligns with your broader systems management strategy.
---
title: Level 1 - Critical alert coverage scorecard rule
tags:
- Observability maturity
- Intelligent observability
- Instrumentation
- Implementation guide
metaDescription: Observability maturity business uptime critical alert coverage
redirects:
- /docs/new-relic-solutions/observability-maturity
- /docs/full-stack-observability
- /docs/new-relic-solutions/best-practices-guides/full-stack-observability
freshnessValidatedDate: never
---

# Overview

The Critical Alert Coverage rule has produced a score based on the critical alert coverage of your systems. This document explains the interpretation of your score and offers guidance on actions you can take to enhance your alerting strategy.

# Description

The score evaluates a 7-day sample of alert incidents per target entity to determine what percentage are due to critical versus warning violations.
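
To see roughly how this breaks down in your account, a query similar to the sketch below shows the share of incidents opened at critical priority per entity. It assumes the `NrAiIncident` event type and its `priority` attribute; it's an approximation, not the scorecard rule's own query.

```sql
// Share of incidents opened at critical (vs. warning) priority per entity over 7 days.
SELECT percentage(count(*), WHERE priority = 'critical') AS '% critical incidents'
FROM NrAiIncident
WHERE event = 'open'
FACET entity.name
SINCE 7 days ago
LIMIT 20
```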

# Interpretation

An overreliance on critical alert conditions may indicate a lack of progressive alerting and incident response processes. This could lead to alert fatigue and hinder continual improvement in system and service reliability or quality.

It's important to balance your alerts by defining an alerting strategy that includes:

* Immediately Actionable Alerts: These are critical alerts that indicate negative business-impacting events requiring immediate attention.
* Anticipatory Alerts: These alerts signal unexpected conditions that are not immediately business-impacting but may require future adjustments.
* Retrospective Alerts: These alerts are not meant for immediate action but should be evaluated through thoughtful periodic analysis of system behavior.

# Actions to Consider

- ***Evaluate Target Cohort:*** Ensure the rule targets the correct cohort of incidents, focusing on production systems.
- ***Adjust Success Threshold:*** Review and adjust the defined success threshold. By default, the rule treats 25% or fewer critical alerts as a success. If you use New Relic primarily for critical alert conditions, or if there are other reasons for a higher proportion of critical alerts, adjust the threshold accordingly.
- ***Review Alerting Strategy:*** Conduct a broad review of your alerting strategy. Ensure there are well-established expectations for system operations and a progression of alert design that includes anticipatory and retrospective alerting conditions, in addition to those needing immediate attention.

# Important Considerations

* Custom Evaluation: Remember, these rules and scores are not an exact science. It's crucial to evaluate them based on your specific needs and conditions. Tailor your measurements to align with your observability goals.
* Continuous Improvement: Observability strategies should evolve. Regularly review and adjust your approach to ensure it meets your current requirements.

By understanding your score and taking the recommended actions, you can enhance your critical alert coverage and ensure it aligns with your broader monitoring strategy.
---
title: Level 1 - Infrastructure alert coverage scorecard rule
tags:
- Observability maturity
- Intelligent observability
- Instrumentation
- Implementation guide
metaDescription: Observability maturity business uptime infrastructure alert coverage
redirects:
- /docs/new-relic-solutions/observability-maturity
- /docs/full-stack-observability
- /docs/new-relic-solutions/best-practices-guides/full-stack-observability
freshnessValidatedDate: never
---

# Overview

The Infrastructure Alert Coverage rule has produced a score based on the alert coverage of your infrastructure entities. This document explains the interpretation of your score and offers guidance on actions you can take to improve your observability program.

# Description

The score is generated from a rule that checks for alert definitions on your INFRA-HOST or INFRA-KUBERNETES-POD entities. If any of these entities lack a defined alert, the rule will register as a failure.

# Interpretation

A low score in alert coverage may suggest that infrastructure management is not prioritized within your observability strategy, or it may indicate the absence of a standardized approach to infrastructure alerting.

# Actions to Consider

- ***Review Infrastructure Entity Coverage:*** Evaluate your infrastructure entities in relation to your observability goals. Consider entities from cloud integrations, agent extensions, or Prometheus if they play a significant role in your telemetry data. Update the scorecard rule to reflect the unique aspects of your infrastructure.
- ***Adjust Rule Applicability:*** If New Relic is not central to your infrastructure observability, consider disabling or removing the rule.
- ***Refine Rule Query:*** A low score might result from capturing an inappropriate cohort of entities. Modify the NRQL query to focus on production infrastructure entities more accurately.
- ***Develop an Alerting Strategy:*** Optimize the rule for your needs, then review or develop an alerting strategy that includes infrastructure alerting.

# Important Considerations

* Custom Evaluation: Remember, these rules and scores are not an exact science. It's crucial to evaluate them based on your specific needs and conditions. Tailor your measurements to align with your observability goals.
* Continuous Improvement: Observability strategies should evolve. Regularly review and adjust your approach to ensure it meets your current requirements.

By understanding your score and taking the recommended actions, you can enhance your infrastructure observability and ensure it aligns with your broader monitoring strategy.
---
title: Level 1 - Service delivery alert coverage scorecard rule
tags:
- Observability maturity
- Intelligent observability
- Instrumentation
- Implementation guide
metaDescription: Observability maturity business uptime service delivery alert coverage
redirects:
- /docs/new-relic-solutions/observability-maturity
- /docs/full-stack-observability
- /docs/new-relic-solutions/best-practices-guides/full-stack-observability
freshnessValidatedDate: never
---

# Overview

The Service Delivery Alert Coverage rule has produced a score based on the alert coverage of your service delivery entities. This document explains the interpretation of your score and offers guidance on actions you can take to enhance your observability program.

# Description

The score is derived from a rule that checks for alert definitions on your APM-APPLICATION, BROWSER-APPLICATION, MOBILE-APPLICATION, or SYNTH-MONITOR entities. If any of these entities lack a defined alert, the rule will register as a failure.

# Interpretation

A low score in alert coverage may suggest that the management of service entities, such as APM and Browser applications, is not prioritized within your observability strategy.

# Actions to Consider

- ***Review Service Delivery Entity Coverage:*** Assess your service delivery entities in relation to your observability goals. These entities are typically associated with supporting or executing customer business processes. Modify the rule to include entities that align with your service delivery architecture, such as Lambda or Databricks.
- ***Adjust Rule Applicability:*** If New Relic is not central to your service delivery observability, consider disabling or removing the rule.
- ***Refine Rule Query:*** A low score might result from capturing an inappropriate cohort of entities. Modify the NRQL query to focus more accurately on production service delivery entities.
- ***Develop an Alerting Strategy:*** Optimize the rule for your needs, then review or develop an alerting strategy that includes service delivery alerting.

# Important Considerations

* Custom Evaluation: Remember, these rules and scores are not an exact science. It's crucial to evaluate them based on your specific needs and conditions. Tailor your measurements to align with your observability goals.
* Continuous Improvement: Observability strategies should evolve. Regularly review and adjust your approach to ensure it meets your current requirements.

By understanding your score and taking the recommended actions, you can enhance your service delivery observability and ensure it aligns with your broader monitoring strategy.
---
title: Level 2 - Alerts, mean time to close scorecard rule
tags:
- Observability maturity
- Intelligent observability
- Instrumentation
- Implementation guide
metaDescription: Observability maturity business uptime alerts mean time to close
redirects:
- /docs/new-relic-solutions/observability-maturity
- /docs/full-stack-observability
- /docs/new-relic-solutions/best-practices-guides/full-stack-observability
freshnessValidatedDate: never
---

# Overview

The Alerts Mean Time to Close rule has produced a score based on the time taken to close incidents. This document explains the interpretation of your score and offers guidance on actions you can take to optimize your incident management strategy.

# Description

The score evaluates the time taken to close each incident, with those resolved in under 30 minutes considered successful incident resolutions.
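
To compare your own incident close times against the 30-minute target, a query along the lines of the sketch below approximates the evaluation. It assumes the `NrAiIncident` event type, where `durationSeconds` is populated on `close` events; it's illustrative rather than the rule's exact query.

```sql
// Share of incidents closed within 30 minutes, and mean time to close, per policy.
SELECT percentage(count(*), WHERE durationSeconds <= 1800) AS '% closed under 30 min',
  average(durationSeconds) / 60 AS 'Mean minutes to close'
FROM NrAiIncident
WHERE event = 'close'
FACET policyName
SINCE 7 days ago
LIMIT 20
```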

# Interpretation

Incidents that remain open for long periods, especially those tied to specific alert policies and conditions, may indicate sub-optimal detection and resolution processes or volatility in the targeted entity. Consider the following:

* Entity Behavior and Alert Thresholds: Evaluate the behavior of the entity and the alert thresholds intended for it. Aim to improve the alert-to-action incident management procedure.
* Entity Importance: Some entities may not warrant rapid remediation. Consider alternative methods for being informed of unexpected telemetry values from such entities.

# Actions to Consider

- ***Evaluate Target Cohort:*** Determine whether the cohort of incidents and entities needs modification to exclude entities that are prone to long-running incidents. The example query after this list can help identify such entities.
- ***Review Incident Management Practices:*** Assess whether New Relic is capturing the close event accurately. If incident management occurs outside New Relic AIOps/Alerts, the rule logic may need revision; in some cases, disabling or deleting the rule may be the more practical option.
- ***Develop Alerting and Incident Management Strategy:*** Ensure you have a well-defined alerting and incident management strategy. If not, engage in an [Alert Quality Management (AQM)](/docs/new-relic-solutions/observability-maturity/uptime-performance-reliability/aqm-implementation-guide/) workshop to introduce the need for a comprehensive, well-documented approach to alerting maintenance, incident management, and regular program review.
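
As a minimal sketch for finding those entities, the query below surfaces the longest and average incident durations per entity, assuming the `NrAiIncident` event type; adjust the window and cohort to suit your environment.

```sql
// Longest and average incident durations per entity over the last 7 days,
// to surface entities that routinely exceed the 30-minute target.
SELECT max(durationSeconds) / 60 AS 'Longest incident (min)',
  average(durationSeconds) / 60 AS 'Average incident (min)'
FROM NrAiIncident
WHERE event = 'close'
FACET entity.name
SINCE 7 days ago
LIMIT 20
```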

# Important Considerations

* Custom Evaluation: Remember, these rules and scores are not an exact science. It's crucial to evaluate them based on your specific needs and conditions. Tailor your incident management strategy to align with your business objectives and operational requirements.
* Continuous Improvement: Incident management strategies should evolve. Regularly review and adjust your approach to ensure it meets your current requirements.

By understanding your score and taking the recommended actions, you can enhance your incident resolution times and ensure they align with your broader business objectives and observability strategy.