January 18, 2024 in Development, LANL by David J. Allen (LANL), 6 minutes
SMD comes with a built-in partitioning feature that should allow us to partition off certain components through its API. In this blog post, we explore how groups and partitions work in practice.
For the Supercomputing Institute 2024 at LANL, we would like to use Ochami to replace Warewulf for system management. To do so, we would need to introduce a multi-tenant feature into the SMD microservice. The idea is to run a single instance of SMD managing multiple clusters at once, but each cluster is only accessible after authentication.
SMD's partitioning is exposed through its API. For example, we can view all of the partitions currently in our SMD instance using a curl command (assuming you have a local instance of SMD running on port 27779). You can check that SMD is reachable and then list all of the existing partitions with the following:
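As a minimal sketch, the two requests might look like this. The `/hsm/v2` prefix and plain `http` are assumptions about how the local instance is exposed; adjust `SMD_URL` to match your setup.

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"

# Check that SMD reports itself ready to serve requests.
curl -sS "$SMD_URL/service/ready" || echo "SMD is not reachable at $SMD_URL" >&2

# List all partitions currently defined in this SMD instance.
curl -sS "$SMD_URL/partitions" || echo "could not list partitions" >&2
```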
Similarly, if we already have components loaded, we can query the Redfish endpoints, pick a component's xname, and create a new partition containing it with a POST request.
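A sketch of both steps follows. The xname `x0c0s1b0` is a hypothetical placeholder for one returned by your own instance, and the payload field names (`name`, `description`, `members.ids`) are our assumption about the partition schema.

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"
XNAME="x0c0s1b0"   # hypothetical xname; use one from the listing below

# List the Redfish endpoints known to SMD to find an xname.
curl -sS "$SMD_URL/Inventory/RedfishEndpoints" || echo "could not list Redfish endpoints" >&2

# Build the partition payload with the chosen xname as its only member.
PAYLOAD=$(printf '{"name":"p1","description":"first test partition","members":{"ids":["%s"]}}' "$XNAME")

# Create the partition.
curl -sS -X POST -H "Content-Type: application/json" \
     -d "$PAYLOAD" "$SMD_URL/partitions" || echo "could not create partition" >&2
```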
Note that the partition name must follow the convention of p# or p#.# and that each component can only be added to a single partition. If you try to add a component that’s already a part of another partition, you will get an error.
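To see the rejection for yourself, you could try to create a second partition with the same member. This is only a sketch: the xname is hypothetical, the payload fields are assumed, and we do not reproduce the exact error text here.

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"
XNAME="x0c0s1b0"   # hypothetical xname that already belongs to p1

# Payload for a second partition that reuses the same xname.
PAYLOAD=$(printf '{"name":"p2","description":"second test partition","members":{"ids":["%s"]}}' "$XNAME")

# -i prints the HTTP status line so the rejection is visible.
curl -sS -i -X POST -H "Content-Type: application/json" \
     -d "$PAYLOAD" "$SMD_URL/partitions" || echo "request could not be sent" >&2
```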
This is a feature, of course, and not a bug. We will see how this same xname can be added to a group in the next section.
The groups API works in a similar manner to the partitions API. Like before, if we want to check which groups are currently available within our SMD instance, we can make the following request:
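A minimal sketch, assuming the same `/hsm/v2` prefix as before and that the `/groups/labels` endpoint is available in your SMD version:

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"

# List all groups with their full definitions.
curl -sS "$SMD_URL/groups" || echo "could not list groups" >&2

# List only the group labels.
curl -sS "$SMD_URL/groups/labels" || echo "could not list group labels" >&2
```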
Creating new groups is just as easy:
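For instance, a group with no members yet could be created roughly like this. The label `blue` is a hypothetical example, and the payload field names (`label`, `description`, `members.ids`) are our assumption about the group schema.

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"
GROUP="blue"   # hypothetical group label

# Group payload with an empty member list.
PAYLOAD=$(printf '{"label":"%s","description":"first test group","members":{"ids":[]}}' "$GROUP")

# Create the group.
curl -sS -X POST -H "Content-Type: application/json" \
     -d "$PAYLOAD" "$SMD_URL/groups" || echo "could not create group" >&2
```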
However, unlike partitions, groups do not place a stringent requirement on xnames, and an xname can be included in multiple groups. Confirm that the group was created:
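One way to confirm is to fetch the group by its label. The label `blue` is a hypothetical example; substitute whichever label you used.

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"
GROUP="blue"   # hypothetical group label

# Fetch the group definition by label.
curl -sS "$SMD_URL/groups/$GROUP" || echo "could not fetch group $GROUP" >&2
```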
At this point, there are no members (as xnames) in the new group. Let’s try adding the xname that we added to the partition before. This is done via the membership API as well, but with groups instead of partitions, so the endpoint is slightly different.
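A sketch of the request, assuming members are added one at a time with an `{"id": ...}` body. The group label and xname are hypothetical placeholders.

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"
GROUP="blue"       # hypothetical group label
XNAME="x0c0s1b0"   # hypothetical xname, the same one placed in partition p1

# Body naming the single member to add.
MEMBER=$(printf '{"id":"%s"}' "$XNAME")

# Add the xname to the group's member list.
curl -sS -X POST -H "Content-Type: application/json" \
     -d "$MEMBER" "$SMD_URL/groups/$GROUP/members" || echo "could not add member" >&2
```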
Now for a quick test. Let’s try adding another group, and then add the same xname as above to it.
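The test could be sketched as follows, with `green` as a second hypothetical label and the same hypothetical xname as before:

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"
GROUP="green"      # hypothetical label for the second group
XNAME="x0c0s1b0"   # hypothetical xname already in the first group

# Create the second group with no members.
PAYLOAD=$(printf '{"label":"%s","description":"second test group","members":{"ids":[]}}' "$GROUP")
curl -sS -X POST -H "Content-Type: application/json" \
     -d "$PAYLOAD" "$SMD_URL/groups" || echo "could not create group" >&2

# Add the same xname to the second group.
MEMBER=$(printf '{"id":"%s"}' "$XNAME")
curl -sS -X POST -H "Content-Type: application/json" \
     -d "$MEMBER" "$SMD_URL/groups/$GROUP/members" || echo "could not add member" >&2
```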
Both commands should have worked without producing an error, and the xname should now be in both groups. If we make a group exclusive, it behaves more like a partition: other groups are prevented from adding xnames that the exclusive group already contains. This does not prevent us from creating new exclusive groups, though.
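The exclusive behavior can be tried with a sketch like the one below. It assumes that exclusivity is requested through an `exclusiveGroup` field in the group payload and that it applies among groups carrying the same value, which is how we read the SMD API description. The labels, the `colors` value, and the xname are all hypothetical.

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"
XNAME="x0c0s2b0"     # hypothetical xname
EXCLUSIVE="colors"   # hypothetical exclusiveGroup value shared by both groups

# Create two groups that share the same exclusiveGroup value.
for LABEL in red yellow; do
  PAYLOAD=$(printf '{"label":"%s","description":"exclusive test group","exclusiveGroup":"%s","members":{"ids":[]}}' "$LABEL" "$EXCLUSIVE")
  curl -sS -X POST -H "Content-Type: application/json" \
       -d "$PAYLOAD" "$SMD_URL/groups" || echo "could not create group $LABEL" >&2
done

MEMBER=$(printf '{"id":"%s"}' "$XNAME")

# Adding the xname to the first exclusive group should succeed.
curl -sS -i -X POST -H "Content-Type: application/json" \
     -d "$MEMBER" "$SMD_URL/groups/red/members" || echo "request could not be sent" >&2

# Adding it to the second one is expected to be rejected (-i shows the status).
curl -sS -i -X POST -H "Content-Type: application/json" \
     -d "$MEMBER" "$SMD_URL/groups/yellow/members" || echo "request could not be sent" >&2
```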
Although groups and partitions may seem very similar on the surface, their use cases are fundamentally different. For example, we saw before that we can add one xname to two groups, but not to two partitions.
Looking at both the Group and Partition data structures, we can see the similarities just by comparing their fields. The main difference between the structs is the ExclusiveGroup field, which appears in the Group struct but not in the Partition struct.
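To make the comparison concrete, here is how the two objects might look as JSON request bodies. The field names are our assumption about how the structs are serialized, and all values are hypothetical.

```shell
# Hypothetical group payload; note the extra exclusiveGroup field.
GROUP_JSON='{"label":"blue","description":"example group","tags":["demo"],"exclusiveGroup":"colors","members":{"ids":["x0c0s1b0"]}}'

# Hypothetical partition payload; same shape, minus exclusiveGroup.
PARTITION_JSON='{"name":"p1","description":"example partition","tags":["demo"],"members":{"ids":["x0c0s1b0"]}}'

# Print both so the shapes can be compared side by side.
printf 'group:     %s\n' "$GROUP_JSON"
printf 'partition: %s\n' "$PARTITION_JSON"
```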
That’s because groups are intended to be an abstraction for general purposes, whereas partitions are meant to be used specifically for separating hardware components. Therefore, partitions have certain hard, intentional constraints that groups do not have.
Memberships provide a way to do a reverse lookup to find partitions using xnames. They are automatically created whenever a new partition is created using the partition API. Each membership object contains an ID, list of groups, and a partition name.
We can view all memberships, or the membership for a single xname, in an SMD instance with the following:
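A minimal sketch of both lookups, assuming the `/hsm/v2` prefix used earlier and a hypothetical xname:

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"
XNAME="x0c0s1b0"   # hypothetical xname

# List the memberships of every component.
curl -sS "$SMD_URL/memberships" || echo "could not list memberships" >&2

# Reverse lookup: groups and partition for a single xname.
curl -sS "$SMD_URL/memberships/$XNAME" || echo "could not fetch membership for $XNAME" >&2
```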
Note that a membership is created automatically for each xname whenever a new partition is created. After creating a partition, we can add any additional memberships using the membership API:
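As a sketch, adding one more member to the existing `p1` partition might look like this. The xname is a hypothetical placeholder and the `{"id": ...}` body is our assumption about the request format.

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"
XNAME="x0c0s3b0"   # hypothetical xname not yet in any partition

# Body naming the single member to add.
MEMBER=$(printf '{"id":"%s"}' "$XNAME")

# Add the xname to partition p1.
curl -sS -X POST -H "Content-Type: application/json" \
     -d "$MEMBER" "$SMD_URL/partitions/p1/members" || echo "could not add member" >&2
```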
However, be aware that this API endpoint is not well documented in docs/examples.adoc, so some details may be missing here as well. Now, if we want to see all of the members of a specific partition, such as the one containing the component we just added, we can query the p1 partition:
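A sketch of that query, again assuming the `/hsm/v2` prefix:

```shell
# Assumed base URL: local SMD on port 27779 behind the /hsm/v2 API prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"
PARTITION="p1"

# List only the member xnames of the partition.
curl -sS "$SMD_URL/partitions/$PARTITION/members" || echo "could not list members of $PARTITION" >&2

# Or fetch the whole partition definition, members included.
curl -sS "$SMD_URL/partitions/$PARTITION" || echo "could not fetch partition $PARTITION" >&2
```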
That covers the relevant aspects of memberships and how to use them with SMD for our needs.
Unfortunately, the membership ID MUST be an xname when a new membership is created, and memberships are created automatically. If we want to be able to look up a specific partition for all components belonging to a user (or whatever), then we would want the ID to be any arbitrary string value (like a user name such as david, or a partition ID such as 2b2a5899-4e9a-44af-ba3c-4159c033c352). This can be done by removing (or commenting out) a couple of lines of code in the doPartitionMembersPost function in cmd/smd-api.go (starting at line 4805 as of commit c95ce488afd9b3b230a1812a752c3c4dd6410039).
This will allow the membership ID to be set to any value without checking for an xname. The normID variable should then simply be set to the memberIn.ID value. Now rebuild the binaries, either directly or with Docker, and test adding a new membership.
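Once the rebuilt service is running, the relaxed check can be exercised with a request like the one below. This is a sketch that only makes sense against the patched build described above; an unmodified SMD is expected to reject the non-xname ID. The `/hsm/v2` prefix and the `{"id": ...}` body are assumptions, as before.

```shell
# Assumed base URL: local, patched SMD on port 27779 behind the /hsm/v2 prefix.
SMD_URL="${SMD_URL:-http://localhost:27779/hsm/v2}"
MEMBER_ID="david"   # arbitrary, non-xname ID taken from the example in the text

# Body naming the arbitrary ID as the member to add.
MEMBER=$(printf '{"id":"%s"}' "$MEMBER_ID")

# Try to add the arbitrary ID to partition p1 (-i shows the HTTP status).
curl -sS -i -X POST -H "Content-Type: application/json" \
     -d "$MEMBER" "$SMD_URL/partitions/p1/members" || echo "request could not be sent" >&2

# List the partition members to see whether the new ID was stored.
curl -sS "$SMD_URL/partitions/p1/members" || echo "could not list members of p1" >&2
```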