Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added design-proposals/images/modular-vpro-orch.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
239 changes: 239 additions & 0 deletions design-proposals/vpro-eim-modular-decomposition.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,239 @@
# Design Proposal: Out-of-band Device Management Service Decomposition

Author(s): Edge Infrastructure Manager Team

Last Updated: 2025-11-10

## Abstract

As part of the Edge Infrastructure Manager (EIM) included in the Edge Manageability Framework (EMF) full stack
deployment, it deploys a device management workflow. This workflow is used to support out-of-band device management for
edge nodes connected to the Orchestrator for those devices that can support vPRO/Active Management Technology
(AMT)/Intel Standard Manageability (ISM). It does this using the components included in the
[Open Device Management Toolkit (OpenDMT)](https://device-management-toolkit.github.io/docs/2.28/)
release. Currently, this workflow cannot be deployed on it own, instead the whole EIM needs to deployed in order to be
able to use out-of-band device management. This proposal outlines how the current workflow integrated into EIM can be
separated into its own modular workflow that can be deployed independently of the rest of the EIM services using just
the shared foundational services of the EMF stack. It will also cover how such a modular workflow can be deployed in an
environment where, instead of the EMF foundational services being avaiable, a different infrastructure stack is used to
suport the workflow.

## Background

The current out-of-band implementation uses two of the OpenDMT services, the
[Management Presence Server](https://device-management-toolkit.github.io/docs/2.28/Reference/MPS/configuration/)
and the [Remote Provisioning Server](https://device-management-toolkit.github.io/docs/2.28/Reference/RPS/configuration/),
as well as a [Device Management (DM) Manager](https://github.com/open-edge-platform/infra-external/tree/main/dm-manager)
in the Orchestrator to manage the device activation and power for all connected edge node devices. On each Edge Node,
the [Platform Manageability Agent](https://github.com/open-edge-platform/edge-node-agents/tree/main/platform-manageability-agent)
communicates with the DM manager and triggers the
[Remote Provisioning Client](https://device-management-toolkit.github.io/docs/2.28/Reference/RPC/buildRPC_Manual/) on
the edge node to update the device settings based on the DM Manager response. For more details on the out-of-band device
management workflow, see the [vPRO Devices Activation Documentation](./vpro-device.md).

Since this workflow requires the inventory to track edge nodes and devices as well as some of the foundational platform
services of the EMF stack, such as authentication, multitenancy and storage, in order to use the out-of-band device
management the entire EIM, with all services, must be deployed. This means that, if only the out-of-band device
management is needed, there are a number of services deployed that are not required for the use case.

As outlined in the [EIM Modular Decomposition proposal](./eim-modular-decomposition.md), by decoupling individual use
case workflows currently in the EIM so that they can be deployed into an Orchestrator environment without requiring the
other EIM services to also be deployed. The out-of-band device management use case is one of these workflows that will
be decomposed from the EIM stack into a modular workflow. To do this, the EIM services related to this use case will be
combined into an out-of-band device management installation package that can be installed onto an Orchestrator that
contains only the required foundational services needed by the workflow. For the Edge Node agents and services needed
for this use case, the same will also be done for these.

## Proposal

### Scope

- This proposal will only cover the EIM and Edge Node agents and services used in the current out-of-band device
management workflow.
- Proposal only covers how the workflow will run and outlines how it will differ from the current workflow in order to
support modular workflows, it will not cover packaging and installation of modular workflows. For details on how
packaging and installation of such modular workflows will be hanldled, please see the
[Modular Packaging and Installer Design Proposal](./eim-modular-decomposition-installer.md).
- Changes outlined below are designed to work in both Track 1 and Track 2 use case outlined in the
[EIM Modular Decomposition proposal](./eim-modular-decomposition.md).

### Architectural Design

#### EIM Service Design

The current design of the EIM services used in out-of-band device management are outlined in the
[vPRO design documentation](./vpro-device.md) and will remain the same when moved to a modular use case.
For example, the current device activation and power management flows will remain as is, however abstraction will
be added for the APIs used by the DM Manager and Inventory services on the Orchestrator to allow them to be plugged into
both the EMF foundational services as well as a customer's own infrastructure if needed.

![Modular vPRO Orchestrator Workflow](images/modular-vpro-orch.png)

In the modular use case, the creation of device profiles and communication between the out-of-band device management
services on the Orchestrator and the agents on the Edge Node will remain the same before. Using the
[Orchestrator CLI](orch-cli.md), a user will be able to create the domain profile for a device and add it to the
database so it can be retrieved when the RPC binary on the Edge Node communicates with the RPS service to register and
activate the device. They will also be able to trigger device activation and power off operations using the CLI to send
requests to the EIM services.

```mermaid
sequenceDiagram
autonumber
participant US as User
box rgba(50, 219, 219, 1) Orchestrator Components
participant CLI as Orchestrator CLI
participant DM as Device Management (DM) Manager
participant INV as Inventory
participant RPS as Remote Provisioning Server (RPS)
participant MPS as Management Presence Server (MPS)
participant DB as Database
end
participant EN as Edge Node
alt OpenDMT Configuration
US ->> CLI: Create CIRA Config
CLI ->> RPS: Create CIRA Configuration
RPS ->> DB: Store CIRA Configuration to database
US ->> CLI: Create Domain Profile for device
CLI ->> RPS: Create Domain Profile for device
RPS ->> DB: Store Domain Profile to database
end
alt Device Activation
US ->> CLI: Request Device Activation
CLI ->> INV: Update device management setting
DM ->> INV: Retrieve latest device settings from inventory
activate DM
DM ->> EN: Send device activation request
EN ->> RPS: Start device activation
activate RPS
RPS ->> EN: Device activation successful
deactivate RPS
EN ->> DM: Report device activation success
DM ->> INV: Update device activation state
deactivate DM
end
alt Device Power Management
US ->> CLI: Request device power off
CLI ->> INV: Set device power state rquest to off
DM ->> INV: Retrieve latest device settings from inventory
activate DM
DM ->> MPS: Get device capabilities
DM ->> DM: Confirm device can be remotely powered off
DM ->> MPS: Get current device state
DM ->> DM: Check device current power state
DM ->> MPS: Send device power off request
MPS ->> EN: Send power off request
EN ->> MPS: Power off request success
MPS ->> DM: Report power off success
DM ->> INV: Update device activation success
deactivate DM
end
```

#### Edge Node Agent Design

For the edge node agents, there will need to be some changes needed to the Platform Manageability Agent (PMA) as well as
the Node Agent (NA) to install and configure them to work in a modular use case deployment. Currently, the NA requires
connections to the Vault and Keycloak services in the EMF to retrieve tokens for each agent on the edge node as well as
the Host Resource Manager (HRM). In a modular deployment, the underlying infrastructure being used to provide such
tokens to enable authenticated connections with the Orchestrator may be different, while the HRM may not be deployed
with EIM services for the NA to connect to. If that is the case, then the NA will need to be able to connect to
whichever identity management infrastructure is in use on the Orchestrator, either EMF or customer infrastructure, and
determine whether the HRM has been deployed as well. This will require updates to the NA configuration file used on
start up to provide the required Orchestration information needed by the agent as well as updates to how the NA runs so
that it can skip current calls API calls and workflows that it currently runs that may not be needed for modular use
case.

Since the out-of-band use case will be required to run in environments where the full EMF stack has not been deployed,
the detection of vPRO supporting devices will also need to be modified. When deploying the full EMF stack, both the
cloud-init and installer scripts used for provisioning and onboarding edge nodes to an Orcehstrator will check for vPRO
or ISM support on the edge node. However, in a modular deployment of the out-of-band device management workflow, these
components will not be included and may not have been installed under a separate workflow. Therefore, the PMA will need
to be updated to detect support for vPRO or ISM and operate accordingly.

![Modular vPRO Workflow Edge Node](images/modular-vpro-edge-node.png)

Below is the modular workflow for the PMA and NA once installed onto an edge node. This flow is similar to the agent
install and agent operation outlined in the [vPRO Device Activation](./vpro-device.md) documentation with the vPRO/ISM
support detection moved to the PMA.

```mermaid
sequenceDiagram
autonumber
participant US as User
box rgba(50, 219, 219, 1) Orchestrator Components
participant CLI as Orchestrator CLI
participant DM as Device Management (DM) Manager
participant INV as Inventory
participant RPS as Remote Provisioning Server (RPS)
participant MPS as Management Presence Server (MPS)
participant FPS as Foundational Platform Services (FPS)
end
box rgba(37, 182, 21, 1)
participant NA as Node Agent
participant PMA as Platform Manageability Agent
participant RPC as Remote Provisioning Client
participant EN as Edge Node
end
alt Install Edge Node Agents
US ->> EN: Install Edge Node Agents
activate EN
EN ->> NA: Install Node Agent
NA ->> FPS: Retrieve Node Agent token
FPS ->> NA: Token
NA ->> EN: Store token
NA ->> FPS: Retrieve Platform Manageability Agent token
FPS ->> NA: Token
NA ->> EN: Store token
EN ->> PMA: Install Platform Manageability Agent
EN ->> RPC: Install Remote Provisioning Client
EN ->> US: Return install success
deactivate EN
end
alt Detect vPRO/ISM device support
PMA ->> RPC: Send device status check command
activate PMA
RPC ->> EN: Check device support
EN ->> RPC: Report device support for vPRO/ISM
RPC ->> PMA: Report device support status
PMA ->> DM: Report device support
deactivate PMA
DM ->> INV: Update device support status
end
alt Start Device Activation
US ->> CLI: Request Device Activation
CLI ->> INV: Update device management setting
DM ->> INV: Retrieve latest device settings from inventory
activate DM
PMA ->> DM: Retrieve device activation details
activate PMA
PMA ->> RPC: Send device activation command with profile
RPC ->> RPS: Start device activation
activate RPS
RPS ->> EN: Activate device
EN ->> RPS: Report device activation success
RPS ->> RPC: Report activation success
deactivate RPS
RPS ->> PMA: Report activation success
PMA ->> DM: Report activation success
deactivate PMA
DM ->> INV: Set device activation status
deactivate DM
end
```

## Implementation Plan

- EIM services:
- Create new Helm chart for EIM services for use case.
- Update Orchestrator CLI to add any APIs needed for setting the CIRA configuration or domain profile for a device.
- Update Orchestrator CLI to add APIs for device activation and power cycling.
- Test use case Helm chart installation.
- Confirm that Out-of-band Device Management flow remains the same with new Helm chart.
- Edge Node Agents:
- Update Node Agent to only manage agent tokens and statuses.
- Update Platform Management Agent to perform device vPRO/AMT/ISM support on installation.
- Modify Platform Management Agent flow when edge node device does not support vPRO/AMT/ISM.

## Open Issues

- Will the Host Resource Manager service always be deployed with the EIM services regardless of the use case?
Loading