Unboxing the Ampere-Powered ARS-211M-NR by Supermicro: AI Acceleration with Arm Architecture

Posted on 02 June, 2025

Here at Boston, we have the fantastic opportunity to get hands-on with some of the latest servers from our partner Supermicro, most notably the AmpereOne®-powered ARS-211M-NR. In this article, we explore this Arm64-based server, which represents a break from the legacy x86_64 architecture. The ARS-211M-NR delivers energy-efficient, high-throughput compute for modern cloud workloads and traditional AI use cases. With support for up to 192 cores and 8 channels of DDR5, AmpereOne processors are well suited to scaling out AI-enabled web services that integrate the AI inference commonly used in today’s machine learning (ML) applications. This server is also aimed at High Performance Computing, Immersive Media, Cloud Gaming, AIC, Dense-VDI and Video-on-Demand workloads.

x86_64 vs Arm64 Architecture

The x86 architecture has been around since the 1970s, with AMD introducing the 64-bit x86_64 extension in 2003. Arm64, in contrast, is the 64-bit form of the Arm architecture, which was developed in the 1980s by Acorn Computers; Arm is now majority-owned by SoftBank. Arm64 is known for its energy efficiency and scalability. One key difference between these architectures lies in the instruction set. Arm64 uses Reduced Instruction Set Computing (RISC), which emphasises simpler, faster and more energy-efficient instructions. x86_64, on the other hand, follows a Complex Instruction Set Computing (CISC) model, in which a single instruction can carry out a more complex, multi-step operation. Arm64’s rapid advancements in performance, parallel processing and energy efficiency make it a compelling and scalable choice for general-purpose and AI workloads across enterprise and cloud-native deployments. AmpereOne®’s single-threaded custom core architecture delivers linear scalability, high efficiency and predictable performance, making it well suited to data centres, cloud platforms and hyperscale environments.
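When moving software between the two architectures, a common first step is simply confirming which one a host reports. The short Python sketch below is illustrative only: the alias table is our own, non-exhaustive assumption about the strings different operating systems return from the standard library's platform.machine().

```python
import platform

# Illustrative, non-exhaustive mapping of the machine strings that
# different operating systems report (an assumption, not an official list).
_ARCH_ALIASES = {
    "aarch64": "arm64",   # Linux on 64-bit Arm
    "arm64": "arm64",     # macOS and Windows on 64-bit Arm
    "x86_64": "x86_64",   # Linux and macOS on 64-bit x86
    "amd64": "x86_64",    # Windows and some BSDs
}


def normalise_arch(machine: str) -> str:
    """Map a raw machine string to 'arm64', 'x86_64' or 'unknown'."""
    return _ARCH_ALIASES.get(machine.strip().lower(), "unknown")


def current_arch() -> str:
    """Return the normalised architecture of the host running this script."""
    return normalise_arch(platform.machine())


if __name__ == "__main__":
    print(current_arch())
```

On an Arm64 Linux host we would expect this to print arm64, since such hosts typically report aarch64 as the machine string.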

AmpereOne® Platform

The AmpereOne® platform features 192 AmpereOne® 64-bit cores, each equipped with a 16 KB L1 instruction cache, a 64 KB L1 data cache and a dedicated 2 MB low-latency unified L2 cache. A shared 64 MB System Level Cache (SLC) further enhances performance. The architecture includes dual 128-bit SIMD units per core and a Coherent Mesh Network (CMN) with distributed snoop filtering for efficient data access and inter-core coherence. It supports full interrupt virtualisation (GICv4.1.1), full I/O virtualisation (SMMUv3.2) and data-centre-grade RAS capabilities. Built on TSMC’s 5 nm process, the processor adheres to Arm v8.6+ with SBSA Level 5 compliance and features active power management, including dynamic estimation and voltage droop mitigation. It supports a TDP of up to 400W, with actual usage power as low as 185W.
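To put the per-core cache figures in context, the sketch below totals them across a 192-core part. It is plain arithmetic on the numbers quoted above; the function and constant names are our own, and we assume 1 MB = 1024 KB.

```python
# Per-core and shared cache sizes as quoted in this article.
CORES = 192
L1I_KB_PER_CORE = 16
L1D_KB_PER_CORE = 64
L2_MB_PER_CORE = 2
SLC_MB = 64


def aggregate_cache_mb(cores: int = CORES) -> dict:
    """Total on-chip cache in MB, by level, for a given core count."""
    l1i = cores * L1I_KB_PER_CORE / 1024  # KB to MB (assumes 1 MB = 1024 KB)
    l1d = cores * L1D_KB_PER_CORE / 1024
    l2 = cores * L2_MB_PER_CORE
    return {
        "l1i_mb": l1i,
        "l1d_mb": l1d,
        "l2_mb": l2,
        "slc_mb": SLC_MB,
        "total_mb": l1i + l1d + l2 + SLC_MB,
    }


if __name__ == "__main__":
    for level, size in aggregate_cache_mb().items():
        print(f"{level}: {size:g}")
```

For 192 cores this works out to 384 MB of private L2 in aggregate, noticeably more than the 64 MB shared SLC.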

The memory subsystem includes eight DDR5 channels, each with two independent 36- or 40-bit subchannels and support for 1 or 2 DIMMs per channel (DPC), for a total of up to 16 DIMMs. Supported configurations include dual-rank DIMMs at two per channel (2R2DPC) at 4400 MT/s, and 1R2DPC, 2R1DPC or 1R1DPC at 5200 MT/s. The system can accommodate up to 4 TB of memory using 256 GB DIMMs. It supports RDIMM and 3DS RDIMM modules, DRAM devices with x4 and x8 widths and densities of 16 Gb, 24 Gb and 32 Gb. Memory reliability is ensured through Symbol ECC, SECDED ECC, DRAM throttling and full RAS. Additional features include AES-256/XTS memory encryption, Memory Tagging Extension and MPAM-based QoS. Connectivity includes up to 192 PCIe Gen5 lanes in dual-socket or 128 lanes in single-socket configurations, with 32 total PCIe root port controllers offering flexible lane grouping and coherent multi-socket support.
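The capacity and speed figures above translate into headline numbers with some simple arithmetic. The sketch below is a theoretical estimate rather than a measured result: it assumes 8 bytes of data per transfer per channel (two subchannels of 32 data bits each, with the remaining bits of the 36- or 40-bit width used for ECC) and uses decimal gigabytes for bandwidth.

```python
CHANNELS = 8                 # DDR5 channels per socket, from the text above
DATA_BYTES_PER_TRANSFER = 8  # assumption: 2 subchannels x 32 data bits


def capacity_gb(dimms_per_channel: int, dimm_gb: int = 256) -> int:
    """Total memory capacity in GB for a given DIMMs-per-channel setting."""
    return CHANNELS * dimms_per_channel * dimm_gb


def peak_bandwidth_gbs(mt_per_s: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) at a given transfer rate."""
    return CHANNELS * mt_per_s * DATA_BYTES_PER_TRANSFER / 1000


if __name__ == "__main__":
    print(capacity_gb(1), "GB at 1DPC;", capacity_gb(2), "GB at 2DPC")
    print(f"{peak_bandwidth_gbs(5200):.1f} GB/s at 5200 MT/s")
    print(f"{peak_bandwidth_gbs(4400):.1f} GB/s at 4400 MT/s")
```

Under these assumptions, sixteen 256 GB DIMMs give 4096 GB (4 TB), and the theoretical peak is about 332.8 GB/s at 5200 MT/s versus 281.6 GB/s at 4400 MT/s. Sustained bandwidth in practice will be lower.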

The Processor

The AmpereOne® A192-32X processor is the flagship of the AmpereOne platform series, purpose-built for AI inference and cloud-native workloads. This processor operates at a frequency of 3.2 GHz with a usage power of 284W, much lower than the maximum Thermal Design Power (TDP), due to Ampere’s efficient custom core architecture. This results in high performance and better energy efficiency than x86 processors. It features a 64MB system-level cache and supports up to 4TB of DDR5 ECC memory across 8 channels. The A192-32X also offers 128 PCIe Gen 5 lanes. The cores in Ampere CPUs use an efficient single-threaded design, reducing contention between different workloads running concurrently within the CPU and resulting in linear, predictable performance per core.
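Because each core runs a single hardware thread, giving each workload its own set of cores is straightforward: there are no sibling threads to account for. The sketch below is a hypothetical helper of our own, not an Ampere or Supermicro tool. It assumes cores are numbered 0 to N-1, and it only pins the process where the operating system exposes os.sched_setaffinity, as Linux does.

```python
import os


def partition_cores(total_cores: int, workers: int) -> list:
    """Split core IDs 0..total_cores-1 into contiguous, disjoint groups."""
    if workers <= 0 or workers > total_cores:
        raise ValueError("workers must be between 1 and total_cores")
    base, extra = divmod(total_cores, workers)
    groups, start = [], 0
    for i in range(workers):
        # The first `extra` groups take one additional core each.
        size = base + (1 if i < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def pin_current_process(cores) -> bool:
    """Pin the calling process to `cores`; return False if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, set(cores))
    return True


if __name__ == "__main__":
    for worker_id, cores in enumerate(partition_cores(192, 4)):
        print(f"worker {worker_id}: cores {cores[0]}-{cores[-1]}")
```

Splitting 192 cores across four workers gives each one 48 dedicated cores. A worker process would then call pin_current_process with its own group at start-up.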

At first glance, the die itself has a much smaller surface area than you would expect for such a high core-count Ampere processor. This is because the design houses the compute cores in the centre, with the memory and PCIe dies arranged around them. The layout can be seen in greater detail on the heatsink, where the thermal compound is positioned.

The heatsink itself has three different physical levels that fit the unique Ampere CPU design. The highest point connects to the compute die and the second level cools the memory and PCIe dies.

Below we have a list of the full AmpereOne® SKUs:

PROCESSOR MODEL        CORE COUNT   FREQ (GHZ)   USAGE POWER* (W)
AmpereOne® A192-32X    192          3.2          284
AmpereOne® A192-26X    192          2.6          211
AmpereOne® A160-28X    160          2.8          215
AmpereOne® A144-27X    144          2.7          332
AmpereOne® A144-26X    144          2.6          332
AmpereOne® A144-24X    144          2.4          185
AmpereOne® A128-34X    128          3.4          280
AmpereOne® A96-36X     96           3.6          292

*Performance per socket and usage power data based on estimated SPECrate®2017_int_base (GCC23) and are subject to change based on system configuration and other factors. Usage Power is defined as average power consumed over time by a given workload.
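One simple way to compare the SKUs in the table is usage power per core. The sketch below uses only the figures listed above; the helper names are our own. It ignores clock speed and per-core performance, so it should be read as a rough guide rather than an efficiency ranking.

```python
# (model, cores, frequency in GHz, usage power in W), copied from the table.
SKUS = [
    ("A192-32X", 192, 3.2, 284),
    ("A192-26X", 192, 2.6, 211),
    ("A160-28X", 160, 2.8, 215),
    ("A144-27X", 144, 2.7, 332),
    ("A144-26X", 144, 2.6, 332),
    ("A144-24X", 144, 2.4, 185),
    ("A128-34X", 128, 3.4, 280),
    ("A96-36X", 96, 3.6, 292),
]


def watts_per_core(cores: int, usage_w: float) -> float:
    """Usage power divided evenly across all cores."""
    return usage_w / cores


def lowest_power_per_core(skus=SKUS) -> str:
    """Return the model with the lowest usage power per core."""
    return min(skus, key=lambda s: watts_per_core(s[1], s[3]))[0]


if __name__ == "__main__":
    for model, cores, _ghz, watts in SKUS:
        print(f"{model}: {watts_per_core(cores, watts):.2f} W/core")
    print("Lowest:", lowest_power_per_core())
```

On these figures the flagship A192-32X comes out at roughly 1.48 W per core, while the A192-26X is the lowest of the range at roughly 1.10 W per core.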

The Packaging

The ARS-211M-NR arrives in Supermicro’s signature double-boxed packaging, built for durability and peace of mind. This extra layer of protection ensures the system is well-guarded against any shipping or handling mishaps before it reaches your data centre or lab.

The server presents itself with an efficient layout. It is carefully encased in protective foam and wrapped in anti-static plastic, safeguarding it from both physical impact and static discharge. The rails and accessory box are neatly packed around it and are ready for deployment.

The Outside

Designed in a sleek 2U form factor, this system is engineered for high-performance environments. Its compact frame belies the raw power it holds: it can support up to four double-width GPUs, making it a strong candidate for AI, deep learning and graphical workloads. What sets this model apart is its shorter chassis depth of just 25.5 inches, offering increased flexibility in space-constrained deployments without sacrificing performance.

The front of the chassis is cleanly divided into three functional sections. On the left, you’ll find native support for four NVMe U.2 drives, with expansion options that can scale up to 24 drives depending on your chosen configuration. The centre features a broad air intake vent to optimise cooling efficiency, while the right side contains blanking panels where two GPUs can be installed, allowing flexible hardware expansion without compromising airflow.

At the rear, the design deviates slightly from traditional chassis layouts, prioritising both accessibility and ventilation. On the far left, two hot-swappable 2000W redundant power supplies sit alongside an air vent. The centre rear section is split between a top slot for a double-width GPU and additional airflow space, and a lower half that hosts a comprehensive I/O panel. This includes a Micro USB C port that doubles as a COM port, two USB 3.0 ports, a VGA port, a dedicated IPMI LAN port and two 25Gb SFP28 ports, all surrounded by strategically placed vents to support optimal thermal management. The right side of the rear panel offers another GPU slot on top and space for a low-profile PCIe card and an AIOM card on the bottom.

The Inside

Under the hood, the ARS-211M-NR supports AmpereOne processors with up to 192 cores and a TDP of up to 400W, making it an ideal fit for modern workloads that demand both high core counts and power efficiency. It offers memory configurations of up to 16 DDR5 ECC RDIMM/LRDIMM modules, supporting up to 2TB at 5200MT/s in a 1DPC setup or up to 4TB at 4400MT/s in a 2DPC setup.

Cooling is handled by four high-performance 80mm fans mounted in the front of the chassis. The fourth fan is positioned slightly further back than the others, creating the necessary clearance for any GPUs installed in the system while maintaining strong front-to-back airflow.

PCIe Switch Modular Option for LLM AI Inference (PIO-211M-NRG)

The ARS-211M-NR is designed as a flexible and modular system that can be upgraded with a PCIe switch to support LLM AI inference workloads. This OEM version (PIO-211M-NRG) is designed to address the needs of volume AI inference customers in both cloud and enterprise markets. PCIe switching support is required for dense AI accelerators working together to support LLMs larger than 20B parameters.
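A rough estimate of model size helps show why LLMs of this scale are typically spread across several accelerators. The sketch below is our own back-of-the-envelope illustration, not Supermicro's sizing method. It counts model weights only, assumes 2 bytes per parameter (16-bit weights) by default, and uses a purely hypothetical 24 GB of memory per accelerator in the usage example. KV cache and activation memory are ignored, so real requirements will be higher.

```python
import math


def weight_footprint_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate weight memory in decimal GB (weights only)."""
    # 1e9 parameters x bytes per parameter / 1e9 bytes per GB
    return params_billion * bytes_per_param


def min_accelerators(params_billion: float, accel_mem_gb: float,
                     bytes_per_param: float = 2.0) -> int:
    """Smallest accelerator count whose combined memory holds the weights."""
    footprint = weight_footprint_gb(params_billion, bytes_per_param)
    return math.ceil(footprint / accel_mem_gb)


if __name__ == "__main__":
    # 24 GB per accelerator is a hypothetical figure for illustration only.
    print(weight_footprint_gb(20), "GB of weights for a 20B model")
    print(min_accelerators(20, accel_mem_gb=24), "accelerators needed")
```

Under these assumptions a 20B-parameter model needs about 40 GB for its weights alone, which already exceeds a single hypothetical 24 GB device and so has to be shared across at least two.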

Conclusion

The ARS-211M-NR is a thoughtfully engineered solution that blends compact design with serious computing power. Its ability to support up to four GPUs, massive memory capacity and the latest Ampere processors, now with physical core counts reaching 192, makes it an ideal candidate for next-gen AI, HPC and data-heavy workloads. With a shorter 25.5-inch depth, it’s a strong fit for space-conscious deployments. Supermicro continues to push boundaries with this Arm64-based server, delivering a system that doesn’t just compete with x86_64; it challenges it head-on.

Test drives of the ARS-211M-NR are already available via Boston Labs, our onsite R&D and test facility. The team are ready to enable customers to test-drive the latest technology on-premises or remotely via our fast internet connectivity.

If you are ready to start your AmpereOne journey, please get in touch by email or by calling us on 01727 876100. One of our experienced sales engineers will happily guide you to your perfect tailored solution and invite you for a demo.

Author

Peter Wilsher

Field Application Engineer

Boston Limited

Tags: Boston Labs, AmpereOne, Arm, ARS-211M-NR, Supermicro, High Performance Computing, Immersive Media, Cloud Gaming, AIC, AI Inference, Training, Dense-VDI, Video-on-Demand
