feat: migrate Alarmd to Spring Boot 4.0.3 (prototype)#8357

Closed
pbrane wants to merge 394 commits into OpenNMS:develop from pbrane:feature/spring-boot-4-alarmd-migration

pbrane commented Mar 15, 2026

Summary

Prototype migration of Alarmd from Karaf/OSGi to Spring Boot 4.0.3, establishing the pattern for migrating all 14 daemons to lightweight Spring Boot containers.

  • daemon-common — shared Spring Boot starter module with DaemonSmartLifecycle, KafkaEventTransportConfiguration, DaemonDataSourceConfiguration, and AbstractDaoJpa (JPA-native DAO base class)
  • daemon-boot-alarmd — Spring Boot application that wires Alarmd beans and runs as a fat JAR (java -jar)
  • RTCd deleted — dead code cleanup (no longer needed without webapp)
  • daemon-loader-alarmd deleted — Karaf loader replaced by Spring Boot app
  • Docker pipeline updated — fat JAR staged in build.sh, Dockerfile.daemon updated, docker-compose.yml switches alarmd to java -jar with Actuator healthcheck

What works

  • Spring Boot 4.0.3 starts successfully (Spring Banner, context initialization)
  • Spring Framework 7 + Hibernate 7.2.4 + JPA auto-config all fire
  • Dependency conflicts resolved: ServiceMix Spring 4.2.x excluded, SLF4J 2.x pinned, jboss-logging 3.6.x pinned, Jakarta XML Bind 4.x pinned
  • All unit tests pass (16 tests across daemon-common and daemon-boot-alarmd)
  • Fat JAR builds and loads in Docker container

What's blocked (follow-up work)

Hibernate 7 entity scanning fails on opennms-model because entities use Hibernate 3.6-era annotations (e.g., @ParamDef(type = "string") where Hibernate 7 changed type from String to Class<?>). This is not just a javax→jakarta namespace rename — it's a Hibernate annotation API breaking change.
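
To make the shape of that break concrete, here is a minimal sketch. It does not use Hibernate at all: `LegacyParamDef` and `ClassTypedParamDef` are local stand-in annotations (both names are assumptions made for illustration) that mirror only the one member discussed above, and the two entity classes are hypothetical.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Local stand-in for the older shape: the parameter type is given as a type NAME.
@Retention(RetentionPolicy.RUNTIME)
@interface LegacyParamDef {
    String name();
    String type();
}

// Local stand-in for the newer shape: the parameter type is given as a Java class.
@Retention(RetentionPolicy.RUNTIME)
@interface ClassTypedParamDef {
    String name();
    Class<?> type();
}

// Hypothetical entities, present only to show both declarations side by side.
@LegacyParamDef(name = "userName", type = "string")
class LegacyEntity {}

@ClassTypedParamDef(name = "userName", type = String.class)
class MigratedEntity {}

final class ParamDefShapes {
    private ParamDefShapes() {}

    // Reads the declared type name from the legacy form.
    static String legacyTypeName(Class<?> entity) {
        return entity.getAnnotation(LegacyParamDef.class).type();
    }

    // Reads the declared parameter class from the migrated form.
    static Class<?> migratedType(Class<?> entity) {
        return entity.getAnnotation(ClassTypedParamDef.class).type();
    }
}
```

Because the member's declared type differs, a package rename alone cannot make the old usage valid; the annotation value itself has to be rewritten from `"string"` to `String.class`.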

Next step: Run OpenRewrite org.openrewrite.java.migrate.hibernate recipes on opennms-model to create opennms-model-jakarta with updated annotations. This unblocks the full Alarmd boot and enables E2E testing.

Key patterns established

| Pattern | Implementation |
|---|---|
| Daemon lifecycle | DaemonSmartLifecycle wraps AbstractServiceDaemon into Spring Boot SmartLifecycle |
| Event transport | KafkaEventTransportConfiguration replaces Blueprint XML with @Configuration |
| DAO base class | `AbstractDaoJpa<T, K>` replaces AbstractDaoHibernate using jakarta.persistence.EntityManager |
| Dependency isolation | Parent POM version overrides handled via explicit `<dependencyManagement>` pins + ServiceMix exclusions |
| Docker integration | Fat JAR at /opt/daemon-boot-alarmd.jar, entrypoint: [] overrides Sentinel base image |

Test plan

  • daemon-common unit tests pass (11 tests)
  • daemon-boot-alarmd smoke tests pass (5 tests)
  • ./compile.pl -DskipTests --projects :org.opennms.core.daemon-boot-alarmd -am install succeeds
  • Docker image builds with fat JAR at /opt/daemon-boot-alarmd.jar
  • Spring Boot starts in container (reaches Hibernate entity scanning)
  • Full E2E test (blocked on entity migration — follow-up PR)

pbrane added 30 commits March 8, 2026 22:24
All 12 daemon containers were mounting overlays to /opt/daemon-etc-overlay
and data to /opt/daemon/data, but the sentinel entrypoint only checks
/opt/sentinel-etc-overlay and /opt/sentinel/data. This meant config
overlays were never being applied to any daemon container.
…iner

DiscoveryConfigFactory is a concrete class that cannot be proxied by
onmsgi:reference (JDK dynamic proxy requires an interface). Replaced
with direct bean instantiation.

LocationAwarePingClient's RPC-based implementation requires the full
ICMP+RPC stack (PingerFactory, RpcClientFactory, etc.) which is not
available in daemon containers. Created LocalLocationAwarePingClient
using InetAddress.isReachable() for connectivity checks.
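
A minimal sketch of a local reachability check built on `InetAddress.isReachable()`, as described above. The class name `LocalPingSketch` and its argument checks are assumptions for illustration, not the code of LocalLocationAwarePingClient.

```java
import java.io.IOException;
import java.net.InetAddress;

final class LocalPingSketch {
    private LocalPingSketch() {}

    // Returns true if the JVM could reach the address within timeoutMillis.
    // An I/O failure during the probe is reported as "not reachable".
    static boolean isReachable(InetAddress address, int timeoutMillis) {
        if (address == null) {
            throw new IllegalArgumentException("address must not be null");
        }
        if (timeoutMillis < 0) {
            throw new IllegalArgumentException("timeoutMillis must not be negative");
        }
        try {
            return address.isReachable(timeoutMillis);
        } catch (IOException e) {
            return false;
        }
    }
}
```

Note that `isReachable()` only uses ICMP echo when the JVM has the required privileges and otherwise falls back to a TCP connection attempt on port 7, so its results can differ from a real ICMP ping.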
Added CROSS_CONTAINER_INTERNAL_UEIS whitelist to EventClassifier so
internal events that trigger cross-container actions (like newSuspect
from Discovery to Provisiond) are classified as DUAL instead of IPC.

Temporarily enabled Discovery on core as a workaround — standalone
Discovery publishes newSuspect to Kafka but core lacks a Kafka event
consumer to receive it. Added discovery-configuration.xml overlays
targeting host.docker.internal for integration testing.
Two transport layers: IPC via AMQ hub-and-spoke (core hosts broker,
daemon containers connect as spokes), fault events via Kafka. Three
implementation phases: (1) JmsMessageBus in daemon containers,
(2) Kafka producer/consumer on core, (3) UEI classification audit.
9-task plan across 3 phases:
Phase 1 — IPC via AMQ (Tasks 1-5): JmsMessageBus blueprint, broker URL
config, core LocalMessageBus→JmsMessageBus switch, bridge wiring, e2e test
Phase 2 — Kafka on core (Tasks 6-8): FaultEventPublisher, KafkaFaultEventConsumer
Phase 3 — UEI audit (Task 9): classify all events as FAULT/IPC/DUAL
Creates ActiveMQConnectionFactory from broker.url config property and
registers JmsMessageBus as OSGi service. Added opennms-core-messagebus-jms
to opennms-event-forwarder-kafka feature so all daemon containers get it.
Each daemon container gets org.opennms.core.messagebus.jms.cfg with
broker.url = failover:tcp://core:61616, connecting JmsMessageBus to
core's embedded AMQ broker (hub-and-spoke topology).
Core's EventRouter now publishes IPC events to AMQ Topics via
JmsMessageBus, making them available to all daemon containers
connected to core's embedded broker.
Investigation found only Trapd needs the bridge among all 13 daemon
containers. Trapd has @EventHandler(uei=RELOAD_DAEMON_CONFIG_UEI) which
is an internal UEI flowing via MessageBus (AMQ), not Kafka.

Enlinkd and other daemons use addEventListener() for non-internal UEIs
(nodeAdded, etc.) — these flow via KafkaEventSubscriptionService.
Complete UEI classification audit for the dual-transport architecture.
forceRescan must be DUAL because Enlinkd's standalone container uses
AnnotationBasedEventListenerAdapter (Kafka path) for this internal UEI.

Fixed existing test that incorrectly expected newSuspect to be IPC
(it's been DUAL since the whitelist was added). Added test for
nodeScanCompleted (IPC — consumers are core-only).
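
The FAULT/IPC/DUAL split with a whitelist could be sketched roughly as follows. This is an illustration under assumptions, not the EventClassifier code: the UEI strings, the `hasAlarmData` flag, and the precedence of the whitelist over the alarm-data check are all assumed here.

```java
import java.util.Set;

enum Transport { FAULT, IPC, DUAL }

final class EventClassifierSketch {
    private EventClassifierSketch() {}

    // Assumed UEI strings for the two whitelisted events named in the commits.
    static final Set<String> CROSS_CONTAINER_INTERNAL_UEIS = Set.of(
            "uei.opennms.org/internal/discovery/newSuspect",
            "uei.opennms.org/internal/capsd/forceRescan");

    // Whitelisted internal events go to both transports; otherwise events
    // carrying alarm-data are FAULT and everything else is IPC.
    static Transport classify(String uei, boolean hasAlarmData) {
        if (CROSS_CONTAINER_INTERNAL_UEIS.contains(uei)) {
            return Transport.DUAL;
        }
        return hasAlarmData ? Transport.FAULT : Transport.IPC;
    }
}
```

Under these assumptions, newSuspect and forceRescan classify as DUAL while nodeScanCompleted, which is not whitelisted and carries no alarm-data, stays IPC.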
Replace LocalFaultEventPublisher no-op with the real FaultEventPublisher
backed by a Kafka producer. Core now publishes fault events (events with
alarm-data) to opennms-fault-events Kafka topic.

Uses FaultEventPublisherFactory to avoid circular Maven dependency with
event-forwarder-kafka module. XmlEventSerializer produces the same XML
format that daemon containers use.
…vents

Core now polls opennms-fault-events Kafka topic and broadcasts incoming
events to local listeners via EventIpcBroadcaster. Skips core-originated
events using TSID node-id (bits 12-21) to prevent echo loops.

Added Kafka system properties to core's JAVA_OPTS in docker-compose.yml:
bootstrap.servers, fault.topic, consumer.group.
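
The echo-loop check on the TSID node-id can be sketched as a bit extraction. This assumes bit 0 is the least significant bit of a 64-bit TSID, so bits 12-21 form a 10-bit field; the class and method names are hypothetical.

```java
final class TsidNodeId {
    private TsidNodeId() {}

    private static final int NODE_SHIFT = 12;     // node-id starts at bit 12
    private static final long NODE_MASK = 0x3FFL; // 10 bits: bits 12 through 21

    // Extracts the node-id field from a TSID value.
    static int nodeIdOf(long tsid) {
        return (int) ((tsid >>> NODE_SHIFT) & NODE_MASK);
    }

    // True when the event was produced by this node and should be skipped.
    static boolean isEcho(long tsid, int localNodeId) {
        return nodeIdOf(tsid) == localNodeId;
    }
}
```

A consumer would call `isEcho(eventTsid, coreNodeId)` before broadcasting and drop the event when it returns true.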
The spring-extender bundle must start AFTER serviceregistry so that
the extender's initial bundle scan discovers serviceregistry and creates
its Spring context. This context registers ServiceRegistry as an OSGi
service, which distributed-dao-impl depends on.

When spring-extender started first, the race condition meant serviceregistry's
Spring context was never created, leaving daemon containers permanently
unhealthy.
…or testing

- Disable Telemetryd (crashes without Karaf due to missing ServiceLoader adapters)
- Disable PerspectivePoller (same issue)
- Disable Discovery on core (runs as standalone container)
- Add dev JAR overlay mounts for event architecture modules
- Update header comment to reflect disabled services
Replaces dual-transport (AMQ + Kafka) with Kafka-only architecture.
Minion handles all device network I/O. Two Kafka topics: opennms-ipc-events
for daemon-to-daemon, opennms-fault-events for alarms. Removes need for
AMQ connectivity between daemon containers.
9-task plan to replace AMQ MessageBus with second Kafka topic for IPC events.
KafkaEventForwarder publishes IPC events to opennms-ipc-events topic.
KafkaEventSubscriptionService subscribes to both topics.
Core gets IpcEventPublisher + KafkaIpcEventConsumer.
AMQ infrastructure removed from daemon containers.
…orwarder

IPC-classified events now publish to a second Kafka topic (configurable
via setIpcTopicName) instead of MessageBus/AMQ. Same KafkaProducer handles
both fault and IPC topics. Removes IpcMessageConverter and MessageBus
dependencies from KafkaEventForwarder.
Accept comma-separated topic names so daemon containers can subscribe to
both opennms-fault-events and opennms-ipc-events from a single consumer.
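
A minimal sketch of parsing such a comma-separated setting into a topic list. The helper name `TopicNames.parse` is an assumption, and trimming, dropping empty entries, and de-duplicating are choices made for this sketch rather than confirmed behaviour.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class TopicNames {
    private TopicNames() {}

    // Splits "a, b" into ["a", "b"], ignoring blanks and repeated names.
    static List<String> parse(String commaSeparated) {
        if (commaSeparated == null) {
            return List.of();
        }
        return Arrays.stream(commaSeparated.split(","))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .distinct()
                .collect(Collectors.toList());
    }
}
```

The resulting list is the kind of collection a Kafka consumer's `subscribe` call accepts, which is what lets one consumer cover both topics.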
Blueprint wires ipc.topic.name for KafkaEventForwarder and subscribes
KafkaEventSubscriptionService to both fault and IPC topics. Removes
MessageBus OSGi reference. All daemon overlay configs updated with
ipc.topic.name=opennms-ipc-events.
Publishes IPC-classified events to opennms-ipc-events Kafka topic.
Same pattern as FaultEventPublisher but for internal daemon-to-daemon events.
Wired in applicationContext-eventDaemon.xml.
EventRouter now uses two symmetric EventProcessor publishers (fault + IPC)
instead of MessageBus + IpcMessageConverter. KafkaFaultEventConsumer renamed
to generic KafkaEventConsumer with two instances (fault + IPC topics).
Removed all AMQ/MessageBus infrastructure: bridge classes, JMS config
overlays, messagebus Maven/Karaf feature dependencies.

111 tests pass (50 event-forwarder-kafka + 61 events daemon).
Provisioner, Discovery, Collectd, Pollerd, EventTranslator, DataSender,
BroadcastEventProcessor, and PerspectivePollerd all had MessageBus
subscribe/handler code that fails at runtime now that MessageBus is no
longer registered as an OSGi service. These IPC events now flow through
Kafka topics instead.
- Remove REPLACED_MARKER dangling <bean> tag in beanRefContext.xml
  (left by Statsd deletion subagent, caused SAXParseException on startup)
- Add event-forwarder-kafka JAR overlay for discovery daemon container
  (was using stale baked-in JAR without IPC topic support)
- Add messagebus.jms to discovery health ignore list
  (bundle can't start in Kafka-only mode, harmless)
- Update discovery config to target postgres container
  (host.docker.internal was already managed, no new events)
- Update core JAR overlays: add provisiond, services, discovery;
  remove stale messagebus-api/jms overlays

Integration test verified end-to-end:
  Discovery (standalone) → newSuspect → Kafka fault-events →
  Core KafkaEventConsumer → Eventd → EventRouter (DUAL) →
  Kafka ipc-events + local broadcast → Provisiond → node created
KafkaEventForwarder now loads 157 eventconf definitions from the
database on startup and enriches events with severity + alarm-data
before publishing to Kafka. This replaces EventExpander/Eventd in
the microservice flow, ensuring translated events (e.g. SNMP_Link_Down)
arrive on Kafka fully formed for Alarmd to create/clear alarms.

Also removes messagebus-jms from Karaf features (keeping only the
API for OSGi Import-Package resolution), reverts health-ignore
workarounds, and adds JAR/features overlays to all daemon containers.

End-to-end verified: trap → Trapd → Kafka → EventTranslator →
translate + enrich → Kafka → Alarmd → alarm created (linkDown)
and cleared (linkUp).
feat: Kafka-only microservice event architecture for Strike Fighter
Document the exploration of making Minion mandatory for all locations
(including Default). Deferred due to non-distributable ServiceMonitors
(PassiveServiceMonitor), collectors that route SNMP but don't delegate
full logic, and need for comprehensive monitor/collector audit.

Near-term items:
- PassiveStatusd/Pollerd shared-state bug (recommend merging into Pollerd)
- Add Default Minion container to docker-compose (all profiles)
- Move deploy.sh profiles to native Docker Compose profiles
- Investigate migrating Alarmd from opennms/alarmd to opennms/daemon image
- Investigate eliminating Minion REST dependency via Twin API
Tasks: Docker Compose profiles, Default Minion container,
deploy.sh simplification, manual smoke test verification.
Profiles: lite (10 svc), passive (7 svc), full (17 svc).
Infrastructure (postgres, kafka, core, webapp) always starts.
Replaces algorithmic profile selection in deploy.sh.
Minion registers at location 'Default' with Kafka IPC transport.
Included in all profiles (no profiles: key = always starts).
REST config sync points to webapp container as workaround
until Twin API migration eliminates REST dependency.
Trap/syslog ports not mapped (standalone Trapd/Syslogd own those).
Removes bash case-statement profile selection. Profiles are now
declared in docker-compose.yml. deploy.sh passes COMPOSE_PROFILES
env var to docker compose. Default (no profile) starts infra only
(postgres, kafka, core, webapp, minion).
pbrane added 26 commits March 14, 2026 20:12
Complete Twin API integration for passive status sync to Minion:

- Create KafkaTwinPublisher + LocalTwinSubscriber beans directly in
  the Pollerd Spring context (avoids cross-bundle classloader issues)
- InlineIdentity for Twin subscriber (avoids opennms-identity bundle dep)
- Add twin.kafka.bootstrap.servers to Pollerd docker-compose JAVA_OPTS
- Add opennms-identity + Twin dependencies to daemon feature definition
- Generate .sha1 checksums for all staged JARs in Dockerfile

16/17 E2E tests passing. Full pipeline proven:
  syslog → Minion → Syslogd → EventTranslator → passiveServiceStatus →
  PassiveStatusKeeper → PassiveStatusHolder → Twin API → Minion →
  PassiveServiceMonitor → outage created

Last failure (outage close) is timing — the Twin sync + poll cycle
chain takes longer than the 120s test timeout.
Both getInterpolatedAttributes() and interpolateAttributes() calls in
PollerRequestBuilderImpl.execute() now catch EntityScopeProvider
failures and fall back to uninterpolated attributes. This prevents
poll failures in standalone daemon containers where the MATE engine
isn't available.
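
The fallback can be sketched as a small wrapper around the interpolation step. This is not the PollerRequestBuilderImpl code: the helper name, the `Map<String, String>` attribute type, and catching `RuntimeException` are assumptions for illustration.

```java
import java.util.Map;
import java.util.function.UnaryOperator;

final class InterpolationFallback {
    private InterpolationFallback() {}

    // Applies the interpolator; if it fails, returns the attributes as configured.
    static Map<String, String> interpolateOrFallback(
            Map<String, String> raw,
            UnaryOperator<Map<String, String>> interpolator) {
        try {
            return interpolator.apply(raw);
        } catch (RuntimeException e) {
            // Scope provider unavailable: poll with the uninterpolated attributes.
            return raw;
        }
    }
}
```

The trade-off is that placeholders are passed to the monitor unresolved instead of failing the poll, which is only acceptable where the monitor's attributes do not depend on interpolation.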

15/16 E2E tests passing. Last failure (outage close) is due to
PersisterFactory OSGi service missing in Pollerd container — the
poll response processing fails after the monitor successfully returns.
feat: Minion-mandatory architecture + passive status E2E test
1. deploy.sh reset: explicitly removes all delta-v_* volumes instead of
   relying on docker compose down -v (which only handles active profiles).
   Prevents stale Karaf bundle caches from persisting across deploys.

2. poller-configuration.xml: PerspectivePollerd overlay is now a symlink
   to Pollerd's copy. Base assembly synced with passive service monitors
   (GoogleCloud, Azure, AWS). Single source of truth, no more drift.
Comprehensive prompt covering the migration from Karaf OSGi to Spring
Boot 4 for all 17 daemon containers. Documents current state, design
questions, file catalog, and phased approach (planning → prototype →
full migration).
fix: deploy.sh cleanup + config dedup + Spring Boot 4 prompt
Design for migrating 16 Delta-V daemon containers from Apache Karaf
(OSGi) to Spring Boot 4.0.3. Covers module structure, javax→jakarta
namespace migration, Hibernate 6.x per-daemon scoping, migration
sequence (4 tiers by complexity), and Docker deployment strategy.
Fix daemon count (15 not 16), clarify javax→jakarta sequencing
(must land with Alarmd migration), add OnmsDao full contract note,
add REST API JAX-RS→Spring MVC migration detail with endpoint mapping,
add logging/testing/port allocation sections, add application.yml
skeleton, specify shared module directory locations.
Minion-mandatory architecture + Spring Boot 4 migration planning
Un-migrated Karaf daemons must continue to compile and run so the
full E2E test suite can validate migrated daemons alongside
un-migrated ones. The javax→jakarta rename is now scoped per-daemon
migration rather than applied globally upfront.
17-task plan covering: RTCd deletion, daemon-common module creation
(DaemonSmartLifecycle, KafkaEventTransportConfiguration, DataSource),
javax→jakarta bridge strategy, Hibernate 6.x DAO compatibility layer,
daemon-boot-alarmd Spring Boot app, Docker integration, and cleanup.
Implements SmartLifecycle wrapper around AbstractServiceDaemon to replace
the Karaf-era DaemonLifecycleManager. On start(), calls daemon.init()
then daemon.start(); on stop(), calls daemon.stop().
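
The call order described above can be sketched without Spring on the classpath. `DaemonSketch` is a stand-in for the daemon contract and `LifecycleAdapterSketch` mirrors only the `start()`, `stop()`, and `isRunning()` methods of a Spring lifecycle bean; both names and the idempotency guard are assumptions, not the DaemonSmartLifecycle source.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Stand-in for the daemon contract: only the three calls named in the commit.
interface DaemonSketch {
    void init();
    void start();
    void stop();
}

final class LifecycleAdapterSketch {
    private final DaemonSketch daemon;
    private final AtomicBoolean running = new AtomicBoolean(false);

    LifecycleAdapterSketch(DaemonSketch daemon) {
        this.daemon = daemon;
    }

    // init() then start(), at most once per running period.
    void start() {
        if (running.compareAndSet(false, true)) {
            try {
                daemon.init();
                daemon.start();
            } catch (RuntimeException e) {
                running.set(false); // startup failed, so report "not running"
                throw e;
            }
        }
    }

    // stop() only if the daemon was actually started.
    void stop() {
        if (running.compareAndSet(true, false)) {
            daemon.stop();
        }
    }

    boolean isRunning() {
        return running.get();
    }
}
```

Guarding with an atomic flag keeps repeated `start()` or `stop()` calls from re-initializing or double-stopping the daemon.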

Also fixes daemon-common POM for Spring Boot 4 compatibility:
- Skip enforcer for banned Spring 7 / lz4 dependencies
- Pin JUnit to 5.12.2 / Platform 1.12.2 (surefire 3.5.5 compat)
- Pin logback to 1.5.32 (Spring Boot 4 Configurator API compat)
- Override surefire to 3.5.5 with Jupiter-only engine config
Translates the OSGi Blueprint blueprint-event-forwarder-kafka.xml into
a Spring @Configuration class in daemon-common. Creates three beans:
KafkaEventForwarder, KafkaEventSubscriptionService (with start/stop
lifecycle), and KafkaEventIpcManagerAdapter (exposing EventIpcManager).

Adds the event-forwarder-kafka dependency to daemon-common's POM.
Enables @EntityScan for org.opennms.netmgt.model and
@EnableTransactionManagement. Spring Boot auto-configures HikariCP
DataSource, EntityManagerFactory, and JpaTransactionManager from
application.yml properties.
Configure the Eclipse Transformer Maven plugin as a profile
(jakarta-transform) in daemon-common so Spring Boot 4 fat JARs can
rewrite javax.* bytecode references to jakarta.* at build time.

This lets shared modules (opennms-model, opennms-dao, etc.) keep
javax.persistence annotations for Karaf/Hibernate 3.6 consumers
while Spring Boot daemons get jakarta.* bytecode that Hibernate 6.x
and Spring 7 require.

Includes:
- META-INF/jakarta-rename.properties with package rename mappings
- transformer-maven-plugin 0.5.0 configured with jakartaDefaults
- Profile produces a -jakarta classifier JAR when activated
…rnate

Implements OnmsDao<T, K> using jakarta.persistence.EntityManager instead
of HibernateDaoSupport. Provides the same protected helper API surface
(find, findUnique, findObjects, queryInt) so subclass DAOs like
AlarmDaoJpa can migrate with minimal changes.

Key design decisions:
- Uses jakarta.persistence (Spring Boot 4 / Hibernate 6 target)
- lock() uses native SQL to avoid importing AccessLock from opennms-dao
- findMatching/countMatching throw UnsupportedOperationException until
  a JpaCriteriaConverter is implemented; subclass DAOs use HQL instead
- HQL positional parameters use 1-based binding (?1, ?2) per JPA spec
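
For queries still written with bare `?` placeholders, a conversion to 1-based ordinal form might look like the sketch below. The helper is hypothetical and not part of AbstractDaoJpa; it only skips single-quoted literals and placeholders that already carry a number, and the query text in the usage note is made up.

```java
final class OrdinalParameters {
    private OrdinalParameters() {}

    // Rewrites each bare '?' outside a quoted literal as ?1, ?2, ... in order.
    static String toOrdinal(String hql) {
        StringBuilder out = new StringBuilder(hql.length() + 8);
        boolean inLiteral = false;
        int next = 1;
        for (int i = 0; i < hql.length(); i++) {
            char c = hql.charAt(i);
            if (c == '\'') {
                inLiteral = !inLiteral; // a doubled '' toggles twice, so state is kept
            }
            out.append(c);
            boolean alreadyNumbered =
                    i + 1 < hql.length() && Character.isDigit(hql.charAt(i + 1));
            if (c == '?' && !inLiteral && !alreadyNumbered) {
                out.append(next++);
            }
        }
        return out.toString();
    }
}
```

For example, `toOrdinal("from Alarm a where a.id = ? and a.severity = ?")` yields `from Alarm a where a.id = ?1 and a.severity = ?2`. Queries that mix numbered and bare placeholders are not handled and would need manual review.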
Spring Boot 4 application entry point for Alarmd. Depends on
daemon-common (shared starter) and opennms-alarmd (daemon impl).
Includes enforcer-skip and dependency overrides for Spring Boot 4
compatibility with the parent POM.
Defines @Bean methods for the core Alarmd daemon beans:
- AlarmPersisterImpl (alarm persistence)
- AlarmLifecycleListenerManager (lifecycle callbacks)
- NorthbounderManager (northbound interface forwarding)
- Alarmd (main daemon with persister wiring)
- AnnotationBasedEventListenerAdapter (event handler bridge)
- DaemonSmartLifecycle (Spring lifecycle adapter)

DAO dependencies (AlarmDao, NodeDao, etc.) and infrastructure
beans (EventUtil, EventSubscriptionService, EventProxy, etc.)
are expected from separate configurations via @Autowired field
injection in the existing classes.
Smoke test that verifies the AlarmdApplication class exists, is
annotated with @SpringBootApplication, scans the correct packages,
and exposes a static main() method.
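
A reflection-only check of this kind can be sketched without starting a Spring context. `BootAppMarker` is a local stand-in for @SpringBootApplication and `SampleApplication` is a hypothetical entry point; neither is the real test or the real AlarmdApplication, and the package-scan check is left out.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Local stand-in for the framework's application annotation.
@Retention(RetentionPolicy.RUNTIME)
@interface BootAppMarker {}

final class EntryPointChecks {
    private EntryPointChecks() {}

    // Hypothetical application class used as the subject of the checks.
    @BootAppMarker
    static class SampleApplication {
        public static void main(String[] args) {
            // A real entry point would hand off to SpringApplication.run(...) here.
        }
    }

    static boolean hasMarker(Class<?> type) {
        return type.isAnnotationPresent(BootAppMarker.class);
    }

    // True if the class declares: public static void main(String[]).
    static boolean hasStaticMain(Class<?> type) {
        try {
            Method main = type.getDeclaredMethod("main", String[].class);
            int modifiers = main.getModifiers();
            return Modifier.isPublic(modifiers)
                    && Modifier.isStatic(modifiers)
                    && main.getReturnType() == void.class;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

Checks like these are fast and need no database, which is why they can run while the full context test remains disabled.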

A full @SpringBootTest with Testcontainers is documented in comments
but disabled until the DAO layer has JPA implementations and the
database schema is available in the test container.
- Exclude ServiceMix Spring 4.2.x bundles from fat JAR (conflict with Spring 7)
- Pin SLF4J to 2.0.17 (parent POM manages 1.7.36, incompatible with logback 1.5)
- Exclude old Hibernate 3.x from transitive deps
- Add entrypoint override in docker-compose (Sentinel base image has entrypoint script)
- Fix deploy.sh image check to use Delta-V image names (not base images)
- Pin jboss-logging to 3.6.1 (Hibernate 7 needs MethodHandles$Lookup API)
- Pin jakarta.xml.bind-api to 4.0.2 (Hibernate 7 needs JAXB 4.x)
- Add javax.persistence-api 2.2 (legacy entities use javax.persistence annotations)
- Add jakarta.xml.bind-api, jaxb-runtime, jakarta.inject-api
- Exclude hibernate-jpa-2.0-api, pax-logging, commons-logging from transitive deps

Spring Boot starts and reaches Hibernate initialization. Blocked on
Hibernate 3.6→7 annotation incompatibility in opennms-model entities
(@ParamDef.type changed from String to Class<?>) — requires entity
migration or adapter layer.
@github-actions github-actions bot requested a review from indigo423 March 15, 2026 15:08
@github-actions github-actions bot added the docs label Mar 15, 2026
@pbrane pbrane closed this Mar 15, 2026
@pbrane pbrane deleted the feature/spring-boot-4-alarmd-migration branch March 15, 2026 15:11