Central Monitor
The Monitor code runs on every cluster, both the central cluster and the edge clusters. The instance on the central cluster operates a little differently from the instances on the edge.
ACS Monitor Component
Overview
The central Monitor relies on the Config Store to find the list of clusters to monitor. It monitors every cluster that has a Cluster status entry; such an entry indicates that the cluster has been successfully bootstrapped and that the Edge Sync operator on the cluster was operating correctly at some point.
Having determined the list of clusters to monitor, the central monitor then watches for the Sparkplug output of the edge monitor on each cluster. The Node address of the edge monitor is currently determined by a fixed rule: each cluster has a Sparkplug Group associated with it, and the edge monitor must always use the Monitor Node address within that group. If an edge monitor Node goes offline or does not respond to CMDs, the central monitor raises an alert.
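The selection and addressing rule described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the `Cluster` record, its field names, and the helper functions are hypothetical, and the Node ID `Monitor` is read directly from the rule as stated. Only the topic layout (`spBv1.0/<group>/<message type>/<node>`) comes from the Sparkplug B specification.

```python
from dataclasses import dataclass

# Node ID the edge monitor is assumed to use within its cluster's group
# (taken from the fixed rule described in the text).
MONITOR_NODE_ID = "Monitor"


@dataclass(frozen=True)
class Cluster:
    """Hypothetical view of a cluster as read from the Config Store."""
    uuid: str
    sparkplug_group: str
    has_status: bool  # True if a Cluster status entry exists


def monitor_targets(clusters):
    """Map each monitored cluster UUID to its edge monitor (group, node)."""
    return {
        c.uuid: (c.sparkplug_group, MONITOR_NODE_ID)
        for c in clusters
        if c.has_status  # clusters without a status entry are skipped
    }


def watch_topics(group, node):
    """Sparkplug B Node-level topics to subscribe to for one edge monitor."""
    return [f"spBv1.0/{group}/{kind}/{node}"
            for kind in ("NBIRTH", "NDEATH", "NDATA")]
```

For example, a cluster with group `Site-A` and a status entry would be watched on `spBv1.0/Site-A/NBIRTH/Monitor` and the matching NDEATH and NDATA topics, while a cluster with no status entry would not appear in the result at all.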
MQTT interface
The Monitor publishes as a Node. The Node metrics currently hold no useful information beyond identifying the Node as a monitor node. For each cluster monitored, a Device is created under the Node. This Device publishes Link metrics linking to the edge monitor Node being monitored. Cluster monitor devices publish DBIRTH and DDEATH when the list of monitored clusters changes; the monitor device does not go offline when the cluster becomes inaccessible.
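As a rough sketch of the Device lifecycle just described, the births and deaths to publish can be derived by comparing the previous and current sets of monitored clusters. The function name and the use of cluster identifiers as plain strings are assumptions made for illustration only.

```python
def device_lifecycle_events(previous, current):
    """Return the (message type, cluster) pairs to publish after a list change.

    A Device dies only when its cluster leaves the monitored list, and is
    born only when its cluster joins it. Cluster reachability plays no part.
    """
    prev, cur = set(previous), set(current)
    events = [("DDEATH", c) for c in sorted(prev - cur)]
    events += [("DBIRTH", c) for c in sorted(cur - prev)]
    return events
```

Note that an inaccessible cluster that is still in the list produces no event here, which matches the statement that the monitor Device does not go offline when the cluster becomes inaccessible.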
If an edge monitor on a cluster has not published any packets in several minutes, the central monitor sends it a rebirth CMD. If the edge monitor does not respond, the central monitor Device raises an alert indicating that the cluster is offline. The current set of alerts can be accessed via the alerts API on the Directory.
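The silence, rebirth, and alert sequence above can be pictured as a small state machine per monitored cluster. The sketch below is illustrative only: the two timeout values are placeholders (the text says only "several minutes"), the class and method names are hypothetical, and clearing the alert when a packet arrives is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional

SILENCE_TIMEOUT = 180.0  # assumed: seconds of silence before a rebirth CMD
REBIRTH_GRACE = 60.0     # assumed: seconds to wait for a response to the CMD


@dataclass
class EdgeLiveness:
    """Tracks one edge monitor; times are seconds from any monotonic clock."""
    last_seen: float
    rebirth_sent_at: Optional[float] = None
    alerting: bool = False

    def on_packet(self, now: float) -> None:
        """Any packet from the edge monitor resets the state (assumption)."""
        self.last_seen = now
        self.rebirth_sent_at = None
        self.alerting = False

    def tick(self, now: float) -> Optional[str]:
        """Return 'send_rebirth', 'raise_alert', or None for this check."""
        if now - self.last_seen < SILENCE_TIMEOUT:
            return None
        if self.rebirth_sent_at is None:
            self.rebirth_sent_at = now
            return "send_rebirth"
        if not self.alerting and now - self.rebirth_sent_at >= REBIRTH_GRACE:
            self.alerting = True
            return "raise_alert"
        return None
```

A caller would invoke `tick` periodically and `on_packet` whenever a message from the edge monitor is received, acting on the returned string by publishing the rebirth CMD or raising the offline alert.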
Well-Known UUIDs
These well-known UUIDs are part of the core framework, and all of them MUST be registered with the Configuration Store component under the appropriate classes.