October 5, 2023 in Development by Manuel Sopena Ballesteros, Miguel Gila and Matteo Chesi · 11 minutes
Managing High-Performance Computing (HPC) infrastructure efficiently requires a tool that is both flexible and powerful. Manta is a next-generation Command Line Interface (CLI) designed to streamline HPC management for external developers by integrating multiple backend technologies and simplifying complex workflows. Whether you’re managing thousands of compute nodes or configuring a single system, Manta provides a seamless and intuitive experience.
HPC infrastructures often involve a mix of backend technologies, requiring smooth interactions between components like system management, authentication, configuration, and monitoring. Manta acts as the bridge between these complexities, offering a unified, developer-friendly interface to manage HPC resources effortlessly across diverse environments.
Manta is designed with flexibility and scalability in mind, featuring an extensive set of capabilities to make HPC management more efficient and accessible.
Manta is built on a polyglot architecture, enabling interaction with multiple backend technologies. Currently, it supports:
Each backend is encapsulated within an independent library, and the CLI intelligently directs user operations to the appropriate backend, ensuring a smooth and efficient experience.
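As a rough illustration of the "one library per backend, one dispatcher in the CLI" idea described above, the sketch below routes an operation to a backend chosen by name. The class names, the `power_status` method, and the `backend-a`/`backend-b` identifiers are hypothetical placeholders, not Manta's actual libraries or API.

```python
from typing import Protocol


class Backend(Protocol):
    """Operations each backend library is assumed to expose (hypothetical)."""

    def power_status(self, xname: str) -> str: ...


class BackendA:
    # Hypothetical stand-in for one supported backend technology.
    def power_status(self, xname: str) -> str:
        return f"backend-a:{xname}:READY"


class BackendB:
    # Hypothetical stand-in for a second backend technology.
    def power_status(self, xname: str) -> str:
        return f"backend-b:{xname}:READY"


# Registry mapping a backend name to the library object implementing it.
BACKENDS: dict[str, Backend] = {
    "backend-a": BackendA(),
    "backend-b": BackendB(),
}


def dispatch_power_status(backend_name: str, xname: str) -> str:
    """Route the operation to the backend selected by name."""
    try:
        backend = BACKENDS[backend_name]
    except KeyError:
        raise ValueError(f"unknown backend: {backend_name}") from None
    return backend.power_status(xname)
```

Because the CLI layer only depends on the shared interface, adding a backend in this sketch means registering one more object rather than touching every command.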
Manta allows users to configure and manage multiple backend instances within a single configuration file. Each backend:
This feature makes Manta ideal for organizations with multiple data centers and heterogeneous computing environments.
Manta enhances infrastructure management by integrating with critical HPC services:
The command below shows the most common information for a list of xnames: HSM groups, NID, power status, the CFS configuration used to build the boot/rootfs image, the boot/rootfs image ID, the runtime CFS configuration, and so on.
manta get nodes x1001c1s0b0n0,x1001c1s0b0n1
+---------------+-----------+------------+--------------+-------------------------------------------+----------------------+---------+-------------+---------------------+--------------------------------------+
| XNAME | NID | HSM | Power Status | Runtime Configuration | Configuration Status | Enabled | Error Count | Image Configuration | Image ID |
+===============================================================================================================================================================================================================+
| x1001c1s0b0n1 | nid001289 | alps, | READY | fora-mc-compute-config-cscs-24.8.0.r0-0.2 | configured | true | 0 | Not found | d39fedce-82e3-48d5-bd83-534f37c74c0c |
| | | fora, | | | | | | | |
| | | fora_cn, | | | | | | | |
| | | fora_nc, | | | | | | | |
| | | fora_test2 | | | | | | | |
|---------------+-----------+------------+--------------+-------------------------------------------+----------------------+---------+-------------+---------------------+--------------------------------------|
| x1001c1s0b0n0 | nid001288 | alps, | READY | fora-mc-compute-config-cscs-24.8.0.r0-0.2 | configured | true | 0 | Not found | d39fedce-82e3-48d5-bd83-534f37c74c0c |
| | | fora, | | | | | | | |
| | | fora_ns, | | | | | | | |
| | | fora_uan | | | | | | | |
+---------------+-----------+------------+--------------+-------------------------------------------+----------------------+---------+-------------+---------------------+--------------------------------------+
Note: If the CFS session has been deleted, Manta cannot find the CFS configuration used to build the image.
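The xnames in the output above encode a node's physical location. A minimal sketch of splitting a node xname into its parts follows, assuming the usual HPE Cray EX layout of cabinet, chassis, slot, BMC, and node (`xXcCsSbBnN`); the `Xname` type and `parse_xname` helper are illustrative names and are not part of Manta.

```python
import re
from typing import NamedTuple


class Xname(NamedTuple):
    cabinet: int
    chassis: int
    slot: int
    bmc: int
    node: int


# Node xnames only; other component types use different suffixes.
_XNAME_RE = re.compile(r"x(\d+)c(\d+)s(\d+)b(\d+)n(\d+)")


def parse_xname(xname: str) -> Xname:
    """Split a node xname such as 'x1001c1s0b0n1' into numeric fields."""
    match = _XNAME_RE.fullmatch(xname)
    if match is None:
        raise ValueError(f"not a node xname: {xname!r}")
    return Xname(*(int(part) for part in match.groups()))
```

For example, `parse_xname("x1001c1s0b0n1")` yields cabinet 1001, chassis 1, slot 0, BMC 0, node 1.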
Assume we want to change the kernel parameters for two nodes: nid001288 and nid001289.
manta get kernel-parameters -n 'nid00128[8-9]'
This command lists and groups nodes based on shared kernel parameters, making it easier to manage configurations across multiple nodes.
+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| XNAME | Kernel Params |
+========================================================================================================================================================================================================================+
| x1001c1s0b0n0 | console=ttyS0,115200 |
| x1001c1s0b0n1 | nmd_data=url=s3://boot-images/d39fedce-82e3-48d5-bd83-534f37c74c0c/rootfs,etag=89a5bd99d9c940ffa992308dc68c53a3-646 quiet |
| | root=craycps-s3:s3://boot-images/d39fedce-82e3-48d5-bd83-534f37c74c0c/rootfs:89a5bd99d9c940ffa992308dc68c53a3-646:dvs:api-gw-service-nmn.local:300:nmn0,hsn0:true spire_join_token=${SPIRE_JOIN_TOKEN} |
+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
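The `-n 'nid00128[8-9]'` argument used above is a bracketed node range. The sketch below shows one way such an expression can be expanded into individual node names. It handles a single bracket group only and is an assumption about the notation, not a description of Manta's own parser; `expand_hostlist` is a hypothetical helper.

```python
import re

# One optional bracket group: prefix[body]suffix
_RANGE_RE = re.compile(
    r"^(?P<prefix>[^\[\]]*)\[(?P<body>[^\[\]]+)\](?P<suffix>[^\[\]]*)$"
)


def expand_hostlist(expr: str) -> list[str]:
    """Expand 'nid00128[8-9]' style expressions into node names."""
    match = _RANGE_RE.match(expr)
    if match is None:
        return [expr]  # plain name, nothing to expand
    names = []
    for item in match["body"].split(","):
        low, _, high = item.partition("-")
        high = high or low  # a single value such as [8] is a range of one
        width = len(low)  # preserve zero padding, e.g. [08-10]
        for number in range(int(low), int(high) + 1):
            names.append(
                f"{match['prefix']}{number:0{width}d}{match['suffix']}"
            )
    return names


print(expand_hostlist("nid00128[8-9]"))  # ['nid001288', 'nid001289']
```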
Now we want to add two new kernel parameters, `test` and `quiet`:
manta add kernel-parameters -n 'nid001288, nid001289' 'test=test test=test2 quiet'
Add kernel params:
"test=test test=test2 quiet"
For nodes:
"x1001c1s0b0n0, x1001c1s0b0n1"
✔ This operation will add the kernel parameters for the nodes below. Please confirm to proceed · yes
? "x1001c1s0b0n0, x1001c1s0b0n1"
✔ "x1001c1s0b0n0, x1001c1s0b0n1"
The nodes above will restart. Please confirm to proceed? · no
Cancelled by user. Aborting.
Manta detects that the kernel parameters have changed and asks for confirmation before applying the update and restarting the nodes, so changes are never applied by accident.
Note that the kernel parameter `test` is specified twice in the command, but Manta stores it only once.
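The "stored only once" behaviour can be pictured as a merge keyed on the parameter name. The sketch below is a minimal illustration, assuming the last value given for a key wins (the outputs in this post show `test=test2`, which is consistent with that, but the rule itself is an assumption); `merge_kernel_params` is a hypothetical helper and not Manta's implementation.

```python
from typing import Optional


def merge_kernel_params(current: str, new: str) -> str:
    """Merge space-separated kernel parameters, one entry per key.

    A key keeps its original position; a later value replaces an earlier one.
    """
    merged: dict[str, Optional[str]] = {}
    for token in f"{current} {new}".split():
        # Split on the first '=' only, so values such as
        # 'nmd_data=url=s3://...' stay intact.
        key, sep, value = token.partition("=")
        merged[key] = value if sep else None
    return " ".join(
        key if value is None else f"{key}={value}"
        for key, value in merged.items()
    )


print(merge_kernel_params("console=ttyS0,115200 quiet",
                          "test=test test=test2 quiet"))
# console=ttyS0,115200 quiet test=test2
```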
To filter specific kernel parameters when inspecting configurations:
manta get kernel-parameters -n 'nid00128[8-9]' -f quiet,test
+---------------+---------------+
| XNAME | Kernel Params |
+===============================+
| x1001c1s0b0n0 | quiet |
| x1001c1s0b0n1 | test=test2 |
+---------------+---------------+
If a kernel parameter already exists, Manta prevents duplication, ensuring sanitized and clean configurations.
To remove the kernel parameter `test=test2`:
manta delete kernel-parameters -n nid001289 test
After executing this command, Manta confirms the operation and removes the specified parameter, maintaining a streamlined and consistent kernel configuration across nodes.
These capabilities allow administrators to maintain consistency across nodes while also offering flexibility for tuning kernel settings dynamically.
Get a hardware summary for all nodes in an HSM group called fora:
manta get hardware cluster fora
+------------------------------------+----------+
| HW Component | Quantity |
+===============================================+
| Memory (GiB) | 3072 |
|------------------------------------+----------|
| SS11 200Gb 2P NIC Mezz REV02 (HSN) | 12 |
|------------------------------------+----------|
| AMD EPYC 7742 64-Core Processor | 24 |
+------------------------------------+----------+
Get the hardware summary for HSM group fora, broken down by xname:
manta get hardware cluster fora -o details
+---------------+-----------+---------------------------------+------------------------------------+
| Node | 16384 MiB | AMD EPYC 7742 64-Core Processor | SS11 200Gb 2P NIC Mezz REV02 (HSN) |
+==================================================================================================+
| x1001c1s0b0n0 | ✅ (16) | ✅ (2) | ✅ (1) |
|---------------+-----------+---------------------------------+------------------------------------|
| x1001c1s0b0n1 | ✅ (16) | ✅ (2) | ✅ (1) |
|---------------+-----------+---------------------------------+------------------------------------|
| x1001c1s0b1n0 | ✅ (16) | ✅ (2) | ✅ (1) |
|---------------+-----------+---------------------------------+------------------------------------|
| x1001c1s0b1n1 | ✅ (16) | ✅ (2) | ✅ (1) |
|---------------+-----------+---------------------------------+------------------------------------|
| x1001c1s2b0n0 | ✅ (16) | ✅ (2) | ✅ (1) |
|---------------+-----------+---------------------------------+------------------------------------|
| x1001c1s2b0n1 | ✅ (16) | ✅ (2) | ✅ (1) |
|---------------+-----------+---------------------------------+------------------------------------|
| x1001c1s2b1n0 | ✅ (16) | ✅ (2) | ✅ (1) |
|---------------+-----------+---------------------------------+------------------------------------|
| x1001c1s2b1n1 | ✅ (16) | ✅ (2) | ✅ (1) |
|---------------+-----------+---------------------------------+------------------------------------|
| x1001c1s4b0n0 | ✅ (16) | ✅ (2) | ✅ (1) |
|---------------+-----------+---------------------------------+------------------------------------|
| x1001c1s4b0n1 | ✅ (16) | ✅ (2) | ✅ (1) |
|---------------+-----------+---------------------------------+------------------------------------|
| x1001c1s4b1n0 | ✅ (16) | ✅ (2) | ✅ (1) |
|---------------+-----------+---------------------------------+------------------------------------|
| x1001c1s4b1n1 | ✅ (16) | ✅ (2) | ✅ (1) |
+---------------+-----------+---------------------------------+------------------------------------+
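The cluster-level summary and the per-node breakdown shown above are consistent with each other, which can be checked with a few lines of arithmetic. The sketch below sums the per-node counts from the detailed table, reading `(16)` as sixteen 16384 MiB memory modules per node (an interpretation of the table, not something the output states explicitly).

```python
from collections import Counter

# Per-node component counts taken from the detailed table: each of the
# 12 nodes reports 16 memory modules, 2 CPUs, and 1 NIC.
PER_NODE = {
    "16384 MiB": 16,
    "AMD EPYC 7742 64-Core Processor": 2,
    "SS11 200Gb 2P NIC Mezz REV02 (HSN)": 1,
}
NODE_COUNT = 12


def cluster_summary(nodes: list[dict[str, int]]) -> Counter:
    """Add up component counts across all nodes of a group."""
    total: Counter = Counter()
    for node in nodes:
        total.update(node)
    return total


summary = cluster_summary([PER_NODE] * NODE_COUNT)
memory_gib = summary["16384 MiB"] * 16384 // 1024

print(memory_gib)                                      # 3072
print(summary["AMD EPYC 7742 64-Core Processor"])      # 24
print(summary["SS11 200Gb 2P NIC Mezz REV02 (HSN)"])   # 12
```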
To add four AMD EPYC CPUs to the fora_test HSM group:
manta add hardware -P 'epyc:4' -p fora -t fora_test
+---------------+--------+--------+
| Node | epyc | memory |
+=================================+
| x1001c1s0b0n0 | ✅ (2) | ❌ |
|---------------+--------+--------|
| x1001c1s0b0n1 | ✅ (2) | ❌ |
|---------------+--------+--------|
| x1001c1s2b0n0 | ✅ (2) | ❌ |
+---------------+--------+--------+
✔ Please check and confirm new hw summary for cluster 'fora_test': {"epyc": 6, "memory": 48} · no
Cancelled by user. Aborting.
If the operation is confirmed, Manta adds two extra nodes, x1001c1s0b0n0 and x1001c1s0b0n1, from group fora to fora_test.
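To make the step from "four EPYC CPUs" to "two nodes" concrete, the sketch below parses a pattern such as `epyc:4` and greedily picks nodes from the parent group until the request is covered. The greedy, xname-sorted strategy and the helper names are assumptions made for illustration; the post does not describe how Manta actually chooses nodes, and support for several `name:quantity` pairs in one pattern is likewise assumed.

```python
def parse_pattern(pattern: str) -> dict[str, int]:
    """Parse 'epyc:4' (or 'name:qty:name:qty') into {'epyc': 4, ...}."""
    parts = pattern.split(":")
    if len(parts) % 2:
        raise ValueError(f"expected name:quantity pairs, got {pattern!r}")
    return {name: int(qty) for name, qty in zip(parts[::2], parts[1::2])}


def pick_nodes(
    request: dict[str, int], candidates: dict[str, dict[str, int]]
) -> list[str]:
    """Take nodes in xname order until every requested quantity is met."""
    remaining = dict(request)
    picked = []
    for xname in sorted(candidates):
        if all(qty <= 0 for qty in remaining.values()):
            break
        hardware = candidates[xname]
        # Only take a node that contributes to something still missing.
        if any(hardware.get(n, 0) > 0 and remaining[n] > 0 for n in remaining):
            picked.append(xname)
            for name in remaining:
                remaining[name] -= hardware.get(name, 0)
    if any(qty > 0 for qty in remaining.values()):
        raise ValueError("parent group cannot satisfy the request")
    return picked


# Nodes still available in the parent group, two CPUs each.
parent = {
    "x1001c1s0b0n0": {"epyc": 2},
    "x1001c1s0b0n1": {"epyc": 2},
    "x1001c1s0b1n0": {"epyc": 2},
}
print(pick_nodes(parse_pattern("epyc:4"), parent))
# ['x1001c1s0b0n0', 'x1001c1s0b0n1']
```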
To get the logs of a CFS session:
manta log batcher-d241f65c-9114-4e38-ba3f-c62edd921fec
Get the details of a CFS configuration:
manta get configurations -n daint_xfer-generic-vcluster-1.0.14-x86-1.0.9
+----------------------------------------------+---------------------+----------------------------------------------------+-------------------------------------------------------------+
| Configuration Name | Last updated | Layers | Derivatives |
+=======================================================================================================================================================================================+
| daint_xfer-generic-vcluster-1.0.14-x86-1.0.9 | 09/01/2025 11:39:19 | Name: csm | CFS sessions: |
| | | Branch: cray/csm/1.16.33 | |
| | | Tag: | BOS sessiontemplates: |
| | | Date: 2024-09-12T12:14:51Z | - daint_xfer-generic-vcluster-1.0.14-x86-1.0.9-ramroot-nmn |
| | | Author: crayvcs - cf-gitea-import | - daint_xfer-generic-vcluster-1.0.14-x86-1.0.9-ramroot-hsn |
| | | Commit: 2e92fb880d60e6d2f44e73ea122e03b602f7e7ab | - daint_xfer-generic-vcluster-1.0.14-x86-1.0.9-nmn |
| | | Playbook: csm_packages.yml | - daint_xfer-generic-vcluster-1.0.14-x86-1.0.9-hsn |
| | | | |
| | | Name: slingshot-host-software | IMS images: |
| | | Branch: cxi-p2p | - generic-vcluster-1.0.14-x86 |
| | | Tag: | |
| | | Date: 2024-10-14T18:57:57+02:00 | |
| | | Author: root | |
| | | Commit: 0020adb2c265fbac41c4a002cc1ce724bab28284 | |
| | | Playbook: shs_cassini_install.yml | |
| | | | |
| | | Name: uss | |
| | | Branch: | |
| | | Tag: | |
| | | Date: 2024-12-13T17:08:20+01:00 | |
| | | Author: Marco Induni | |
| | | Commit: 067af80f395670201886098fa828252a1a10fdf9 | |
| | | Playbook: cos-compute.yml | |
| | | | |
| | | Name: csm-diags | |
| | | Branch: cray/csm-diags/1.5.46 | |
| | | Tag: | |
| | | Date: 2024-09-12T17:35:34Z | |
| | | Author: crayvcs - cf-gitea-import | |
| | | Commit: 8659bd5dfb2c0b34f5a2f2a08df50be4601e0571 | |
| | | Playbook: csm-diags-compute.yml | |
| | | | |
| | | Name: sma | |
| | | Branch: cray/sma/1.9.18 | |
| | | Tag: | |
| | | Date: 2024-09-12T17:45:46Z | |
| | | Author: crayvcs - cf-gitea-import | |
| | | Commit: a15f741958f850aa88ae7e9647bd8db477c6ea8b | |
| | | Playbook: sma-ldms-compute.yml | |
| | | | |
| | | Name: uss | |
| | | Branch: | |
| | | Tag: | |
| | | Date: 2024-12-13T17:08:20+01:00 | |
| | | Author: Marco Induni | |
| | | Commit: 067af80f395670201886098fa828252a1a10fdf9 | |
| | | Playbook: cos-compute-last.yml | |
| | | | |
| | | Name: nomad-orchestrator | |
| | | Branch: | |
| | | Tag: | |
| | | Date: 2025-01-08T12:09:49+01:00 | |
| | | Author: Alejandro Dabin | |
| | | Commit: dcbdf9439188b978e1791211f577129f2e305ab5 | |
| | | Playbook: site-client.yml | |
+----------------------------------------------+---------------------+----------------------------------------------------+-------------------------------------------------------------+
To boot a list of nodes with a given runtime CFS configuration:
manta apply boot nodes --runtime-configuration <cfs configuration name> <list of xnames>
To process a SAT template file together with a values file:
manta apply sat-file --sat-template-file <sat template file> --values-file <sat values file>
With the following template file:
cat sat-file/my_template_mc.yml
---
schema_version: 1.0.2
configurations:
  - name: "img-{{vcluster.name}}-mc-{{vcluster.version}}"
    layers:
      - name: test_layer
        playbook: site.yml
        git:
          url: https://api-gw-service-nmn.local/vcs/cray/test_layer.git
          tag: {{test_layer.tag}}
  - name: "runtime-{{vcluster.name}}-mc-{{vcluster.version}}"
    layers:
      - name: test_layer
        playbook: site.yml
        git:
          url: https://api-gw-service-nmn.local/vcs/cray/test_layer.git
          tag: {{test_layer.tag}}
images:
  - name: "{{vcluster.name}}-mc-{{vcluster.version}}"
    ims:
      is_recipe: false
      id: "{{vcluster.base_image_id}}"
    configuration: "img-{{vcluster.name}}-mc-{{vcluster.version}}"
    configuration_group_names:
      {{ vcluster.image_group_list }}
session_templates:
  - name: "my-template-{{ template_version }}"
    image:
      ims:
        name: "{{vcluster.name}}-mc-{{vcluster.version}}"
    configuration: "runtime-{{vcluster.name}}-mc-{{vcluster.version}}"
    bos_parameters:
      boot_sets:
        compute:
          arch: X86
          kernel_parameters: ip=dhcp quiet ksocklnd.skip_mr_route_setup=1 cxi_core.disable_default_svc=0 cxi_core.enable_fgfc=1 cxi_core.sct_pid_mask=0xf spire_join_token=${SPIRE_JOIN_TOKEN}
          node_groups:
            {{ vcluster.sessiontemplate_group_list }}
          rootfs_provider_passthrough: "dvs:api-gw-service-nmn.local:300:nmn0,hsn0:true"
And the following values file:
---
template_version: 1.0
vcluster:
  name: fora-test
  version: __DATE__
  base_image_id: 016b42f4-d1fe-4505-914a-4e31841a9313
  cscs_git_branch: "cscs-24.8.0"
  image_group_list:
    - Compute
    - alps
    - fora_test
  sessiontemplate_group_list:
    - fora_test
default:
  network_type: "cassini"
  working_branch: "{{ vcluster.cscs_git_branch }}"
test_layer:
  tag: v0.1.2
uss:
  version: 1.1.0-135-csm-1.5
  working_branch: "{{ vcluster.cscs_git_branch }}"
cpe:
  version: 23.12.3
  working_branch: "{{ vcluster.cscs_git_branch }}"
csm:
  version: 1.5.2
csm_diags:
  version: 1.5.46
slingshot_host_software:
  version: 2.1.3-107-csm-1.5-x86-64
  working_branch: "{{ vcluster.cscs_git_branch }}"
sma:
  version: 1.9.18
Note: when a SAT template or values file is processed, `__DATE__` is replaced with the current timestamp.
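A minimal sketch of this placeholder substitution is shown below. The timestamp layout (`%Y%m%d%H%M%S`) is an assumption, since the post does not state which format Manta emits, and `substitute_date` is a hypothetical helper.

```python
from datetime import datetime


def substitute_date(text: str, now: datetime) -> str:
    """Replace every __DATE__ placeholder with a formatted timestamp.

    The timestamp layout used here is an assumption for illustration.
    """
    return text.replace("__DATE__", now.strftime("%Y%m%d%H%M%S"))


rendered = substitute_date("version: __DATE__", datetime(2025, 1, 9, 11, 39, 19))
print(rendered)  # version: 20250109113919
```

Taking the clock value as a parameter rather than calling `datetime.now()` inside the function keeps the substitution deterministic and easy to test.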
Note: for quick prototyping, Manta allows values to be overridden inline:
manta apply sat-file -t sat-file/my_template_mc.yml -f sat-file/my_values.yml --values vcluster.version=1.0
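An inline override such as `vcluster.version=1.0` can be thought of as a dotted path into the values file. The sketch below applies one such assignment to a nested dictionary; it illustrates the idea only, treats every value as a string, and `apply_override` is a hypothetical helper rather than Manta's code.

```python
from typing import Any


def apply_override(values: dict[str, Any], assignment: str) -> None:
    """Apply 'a.b.c=value' to a nested dict, creating levels as needed."""
    path, sep, value = assignment.partition("=")
    if not sep or not path:
        raise ValueError(f"expected key=value, got {assignment!r}")
    *parents, leaf = path.split(".")
    node = values
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value  # kept as a string in this sketch


values = {"vcluster": {"name": "fora-test", "version": "__DATE__"}}
apply_override(values, "vcluster.version=1.0")
print(values["vcluster"])  # {'name': 'fora-test', 'version': '1.0'}
```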
Unlike traditional CLI tools, Manta is designed with modern HPC workflows in mind. It interacts with multiple backend technologies, supports distributed infrastructure, and integrates deeply with Kubernetes and security services, which removes much of the manual glue work from day-to-day HPC operations.
Manta is built for HPC administrators and developers looking for a powerful yet intuitive CLI for infrastructure management. By abstracting backend complexities and offering a robust set of integrations, Manta simplifies complex workflows, enhances security, and provides greater control over compute nodes and servers.
Stay tuned for detailed guides, real-world use cases, and tutorials on how to make the most of Manta. If you’re interested in testing or contributing to the project, reach out—we’d love to collaborate!
Our development roadmap is transparent and community-driven. We actively manage and update our roadmap through GitHub issues, which include:
https://github.com/eth-cscs/manta/issues/64
https://github.com/eth-cscs/manta/issues/82
https://github.com/eth-cscs/manta/issues/93
https://github.com/eth-cscs/manta/issues/94