What is it about?
The performance of modern AI systems is increasingly constrained by memory and network resources, a consequence of fundamental scaling challenges in current communication technologies. Copper links are power-efficient and reliable but have very limited reach (< 2 m). Optical links offer longer reach, but at the expense of high power consumption and lower reliability. This paper introduces MOSAIC, a collaborative effort across Microsoft Research, Microsoft Azure, and M365 that investigates the use of a wide-and-slow architecture and microLEDs to develop a novel technology that can break this trade-off by providing long-reach connectivity with low power consumption, low cost, and high reliability, opening up exciting opportunities for radical new AI cluster designs.
Why is it important?
The fundamental trade-off among power, reliability, and reach stems from the narrow-and-fast architecture deployed in today’s copper and optical links, which uses a few channels operating at very high data rates. These challenges grow as speeds increase with every generation of networks. Transmitting at high speeds also pushes the limits of optical components, reducing system margins and increasing failure rates. These limitations force system designers to make unpleasant choices, limiting the scalability of AI infrastructure. For example, scale-up networks connecting AI accelerators at multi-Tbps bandwidth typically must rely on copper links to meet the power budget, requiring ultra-dense racks that consume hundreds of kilowatts per rack. This creates significant challenges in cooling and mechanical design, which constrain the practical scale of these networks and their end-to-end performance. This imbalance ultimately erects a networking wall akin to the memory wall, in which CPU speeds have outstripped memory speeds, creating performance bottlenecks. By offering copper-like power efficiency and reliability over long distances, MOSAIC can overcome this networking wall, enabling multi-rack scale-up domains and unlocking new architectures.
Perspectives
By combining the best properties of copper and optical links, MOSAIC enables networks that are long-reach, low-power, and highly reliable, and that can scale to future bandwidth demands. While this can already provide immediate gains, this unique combination also opens up exciting new opportunities to rethink AI infrastructure, from network and cluster architectures to compute and memory designs. Historically, each 10x increase in network bandwidth (and corresponding reduction in latency) has driven a new era of distributed computing, from FTP and email in the 1970s to today’s epoch of machine learning and resource disaggregation. By enabling the next step-change in bandwidth, MOSAIC could not only unlock recently proposed architectures but also enable the community to explore entirely new designs for networks, compute, memory, and clusters. In the paper, we provide some possible examples, and we hope that by highlighting these new directions, this work will spur future innovation at the intersection of networks and systems.
Paolo Costa
Microsoft Corp
Read the Original
This page is a summary of: Mosaic: Breaking the Optics versus Copper Trade-off with a Wide-and-Slow Architecture and MicroLEDs, August 2025, ACM (Association for Computing Machinery), DOI: 10.1145/3718958.3750510.