July 22, 2025 in HPC, Power Control, Infrastructure by Ben McDonald · 4 minutes
Managing a modern data center requires control over a diverse set of hardware. System administrators need to wrangle not just servers via their Baseboard Management Controllers (BMCs), but also critical power infrastructure like Power Distribution Units (PDUs). The problem is that these devices often speak different languages; BMCs typically use the Redfish API, while many PDUs have their own unique interfaces.
With recent enhancements to the Magellan tool as part of the OpenCHAMI software stack, this complexity can now be managed through a single, consistent interface. OpenCHAMI’s Power Control Service (PCS) now provides a unified API to query and control the power state of both BMCs and PDUs, backed by an always-on monitoring engine.
This post will demonstrate this powerful new capability and explain some of the changes in Magellan, the State Management Database (SMD), and PCS that make this unified workflow possible.
Note that the full deployment recipe and the instructions run in this blog post are available in the demo repository on GitHub: https://github.com/bmcdonald3/openchami-demo. The repository also includes a more complete workflow, including powering the node back on and querying the status of a transition, with expected output from a real machine.
The workflow for managing BMCs and PDUs with OpenCHAMI tools involves four key components:

- Magellan, which discovers BMCs and PDUs and collects their hardware inventory.
- SMD, the State Management Database, which stores that inventory as xname-identified components.
- Vault, which holds the device credentials that PCS reads to authenticate against each endpoint.
- PCS, the Power Control Service, which exposes the unified power API and polls each device’s Redfish or JAWS interface every 20 seconds to maintain its current power state.

This polling architecture means that PCS isn’t just a passive proxy; it’s a stateful service that provides a near real-time, cached view of the entire data center’s power status.
Let’s walk through the full workflow. This demonstrates how an administrator registers new hardware and then uses PCS to manage it.
Before you begin, you will need to know the mapping between your device IP addresses and their corresponding xname identifiers (e.g., mapping 10.254.1.26 to x3000m0). This demonstration assumes this mapping is known.
First, we use Magellan to discover a PDU and collect its inventory. This information is then sent to SMD.
# Discover, collect, and send PDU inventory to SMD in one pipe
magellan collect pdu x3000m0 | magellan send http://localhost:27779
Next, we do the same for a BMC.
# Discover, collect, and send BMC inventory to SMD
magellan collect "https://172.24.0.2" | magellan send http://localhost:27779
At this point, SMD contains the hardware inventory, but PCS doesn’t know how to access it yet.
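If you want to confirm the registration before moving on, you can query SMD directly. The sketch below assumes SMD exposes the Hardware State Manager style /hsm/v2/State/Components endpoint on the same port used by magellan send above; adjust the path to match your deployment.
# Verify the new components are present in SMD (endpoint path is an assumption)
curl -sS http://localhost:27779/hsm/v2/State/Components/x3000m0 | jq '.'
curl -sS http://localhost:27779/hsm/v2/State/Components/x1000c1s7b0 | jq '.'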
The next step is for the administrator to securely store the credentials for x3000m0 and the BMCs in a Vault instance that PCS is configured to read from. Once the credentials are in Vault, the lifecycle is complete.
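As a rough sketch of that step, the Vault CLI call might look like the following. The hms-creds secret path and the field names are assumptions for illustration; use whatever path and schema your PCS deployment is configured to read.
# Store PDU credentials in Vault (secret path and field names are illustrative)
vault kv put secret/hms-creds/x3000m0 Username=admin Password='<pdu-password>'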
On its next cycle, PCS will:

- detect the newly registered components in SMD (x3000m0, x1000c1s7b0, etc.),
- retrieve their credentials from Vault, and
- begin polling each endpoint to track its power state.

Now, we can interact with PCS. When we query PCS, we get an immediate response from its internal state cache; we don’t have to wait for a live poll to the device. This avoids overloading the hardware with requests, but it also means the reported state may lag behind reality by up to one 20-second polling cycle.
Let’s check the power status of a management node and a PDU outlet. Note that the API call is identical; only the xname changes.
# Get power status of a node managed by a BMC
curl -sS -X GET http://localhost:28007/v1/power-status?xname=x1000c1s7b0n0 | jq '.'
# Get power status of an outlet managed by a PDU
curl -sS -X GET http://localhost:28007/v1/power-status?xname=x3000m0p0v17 | jq '.'
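If you want the cached state of several components at once, a single request with repeated xname parameters should work; treating the parameter as repeatable is an assumption here, extrapolated from the single-xname calls above.
# Query both devices in one call (repeated xname parameters assumed to be supported)
curl -sS -X GET "http://localhost:28007/v1/power-status?xname=x1000c1s7b0n0&xname=x3000m0p0v17" | jq '.'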
Now, let’s execute a power Off command on the BMC node.
curl -sS -X POST -H "Content-Type: application/json" \
-d '{"operation": "Off", "location": [{"xname": "x1000c1s7b0n0"}]}' \
http://localhost:28007/v1/transitions
PCS receives this command, performs the action on the device, and its next polling cycle will confirm the new “off” state, updating its cache for all future queries. We can do the exact same thing for the PDU outlet using the same consistent API.
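As a sketch, powering off the PDU outlet is the same POST with a different xname, and the progress of a transition can then be checked (the demo repository walks through this). The status call below assumes the POST response contains a transition ID that can be substituted into the URL; check your deployment’s response for the exact field.
# Power off the PDU outlet using the same transitions API
curl -sS -X POST -H "Content-Type: application/json" \
  -d '{"operation": "Off", "location": [{"xname": "x3000m0p0v17"}]}' \
  http://localhost:28007/v1/transitions
# Check the status of a transition (substitute the transition ID returned by the POST above)
curl -sS -X GET http://localhost:28007/v1/transitions/<transition-id> | jq '.'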
With these enhancements, OpenCHAMI has taken a major step forward in unified, programmatic hardware management. The stateful, polling model we’ve established could easily be extended to manage other critical data center hardware, such as network switches or smart cooling units, under the same consistent API.
Your feedback is valuable! If you’d like to try out this workflow, contribute ideas, or report issues, we invite you to check out the demo repository on GitHub: https://github.com/bmcdonald3/openchami-demo.