Microsoft Cloud Supply Chain: Democratizing Hyperscale Optimization for Cloud Fulfillment

Microsoft

With 34% annual growth and over $75 billion in revenue in fiscal year 2025, Microsoft Azure, the cloud computing platform developed by Microsoft, has reached unprecedented scale, operating more than 400 data centers across over 70 regions. This expansion has been dramatically accelerated by the AI transformation. Adding capacity is not enough to support the demand surge; the company must deliver the right resources at the right time to the right place. This raises a critical question: How can Microsoft deliver cloud capacity faster, more efficiently, and on a global scale? 

In 2019, Microsoft Cloud Supply Chain and Microsoft Research launched a multiyear collaboration to answer this question. The journey to deliver cloud capacity starts nearly two years before hardware lands in a data center, with long-term demand forecasts that guide procurement decisions, including opening data centers, setting contracts with suppliers, and securing internal financial approvals. The focus of our work is the fulfillment phase: installing data center capacity. Fulfillment operates on a quarterly horizon, processing thousands of demand requests from both external customers and internal services such as Azure, Office, Teams, and Bing. Our operational team translates customer needs into requirements, including the hardware product family and quantity, the target region, and the due date for the capacity to go live.

The fulfillment team then makes a series of interdependent decisions for each demand, determining its supplier, data center, and timeline. These decisions are constrained by hardware and data center compatibility, inventory levels, data center labor availability, and limitations on power and space. At the same time, the system must prioritize hot demands, maximize on-time deployments, reduce shipping costs, efficiently manage billion-dollar hardware inventory, and avoid data center resource fragmentation. At the center of this process are business experts who shepherd all fulfillment plans, adjusting them when needed.

Before 2020, Microsoft’s cloud fulfillment relied on manual, spreadsheet-driven processes, which were quickly overwhelmed by spiking demand and complexity. To address compounding delays, human overload, and suboptimal decisions, we embarked on the journey to reinvent fulfillment. At the heart of this transformation lies the Intelligent Fulfillment Service (IFS), a novel integration of mathematical optimization and generative AI that optimizes and explains cloud hardware shipment planning at scale.  

Because the processes had historically been manual, our first challenge was to precisely define the optimization problem we needed to solve. Working closely with business and supply chain planners, we uncovered many implicit constraints and objectives embedded in tacit knowledge, which enabled us to formulate the problem as a mixed-integer linear optimization model. New business requirements, expansion to cover all demand resources, and a dramatic increase in demand and supply volume then led to substantial scale challenges. These dynamics were largely driven by the rapid acceleration of graphics processing unit (GPU) deployments to support the AI transformation. To achieve scale, we designed a decomposition approach that balanced detailed decision-making with online responsiveness.
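To make the shape of such a model concrete, the following is a minimal illustrative formulation, not the IFS model itself. All symbols are assumptions introduced here: D is the set of demands, S the supply options, C the data centers, T the dock periods, and A the set of compatible (i, s, c, t) combinations. The binary variable x_{isct} equals 1 if demand i is fulfilled from supply s at data center c in period t, and u_i equals 1 if demand i is left unfulfilled. The parameters are shipping cost kappa_{sc}, lateness weight lambda_i, due period d_i, unfulfilled penalty pi_i (larger for hot demands), quantity q_i, inventory I_s, power draw p_i, power budget P_c, and labor capacity L_{ct}.

```latex
\begin{align*}
\min_{x,\,u}\quad
  & \sum_{(i,s,c,t)\in\mathcal{A}}
      \bigl(\kappa_{sc} + \lambda_i \max(0,\; t - d_i)\bigr)\, x_{isct}
    \;+\; \sum_{i\in D} \pi_i\, u_i \\
\text{s.t.}\quad
  & \sum_{(s,c,t)\,:\,(i,s,c,t)\in\mathcal{A}} x_{isct} \;+\; u_i \;=\; 1
      && \forall i \in D
      && \text{(assign each demand once, or leave it unfulfilled)} \\
  & \sum_{(i,c,t)\,:\,(i,s,c,t)\in\mathcal{A}} q_i\, x_{isct} \;\le\; I_s
      && \forall s \in S
      && \text{(inventory)} \\
  & \sum_{(i,s,t)\,:\,(i,s,c,t)\in\mathcal{A}} p_i\, x_{isct} \;\le\; P_c
      && \forall c \in C
      && \text{(power; space is analogous)} \\
  & \sum_{(i,s)\,:\,(i,s,c,t)\in\mathcal{A}} x_{isct} \;\le\; L_{ct}
      && \forall c \in C,\; t \in T
      && \text{(data center labor)} \\
  & x_{isct} \in \{0,1\}, \quad u_i \in \{0,1\}
\end{align*}
```

Because t and d_i are data rather than variables, the lateness term max(0, t - d_i) is a constant coefficient and the model stays linear. Compatibility is handled by restricting the variables to the set A rather than by explicit constraints. Objectives such as avoiding resource fragmentation are omitted from this sketch.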

We decoupled the master problem of our fulfillment algorithm into a primary resource allocation phase and a secondary dock scheduling phase. The resource allocator makes daily global trade-offs across all demands and competing business policies, yet retains enough timeline granularity to unlock high-quality schedules. Once the resource allocator selects specific racks, the docking scheduler uses refreshed production data and supply chain planner inputs to fine-tune data center and dock date choices with more specific constraints throughout the day. Combined with other techniques to prune the underlying resource matching graph and reduce symmetry, this decomposition allowed IFS to scale with the increasing demand without affecting solution quality. GPU allocation was onboarded in 2025, ultimately resulting in full coverage.
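The two-phase idea can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the IFS algorithm: the data, the function names (prune, allocate, schedule_docks), the brute-force search in phase 1, and the earliest-due-date rule in phase 2 are all hypothetical stand-ins for the optimization models described above. For brevity, phase 2 here only picks dock days and keeps the data center chosen in phase 1.

```python
from itertools import product

# Hypothetical toy data; none of these names or numbers come from IFS.
demands = {
    "D1": {"family": "gpu", "racks": 2, "priority": 2, "due_day": 3},
    "D2": {"family": "cpu", "racks": 1, "priority": 1, "due_day": 2},
    "D3": {"family": "gpu", "racks": 1, "priority": 1, "due_day": 4},
}
data_centers = {
    "DC-A": {"families": {"gpu", "cpu"}, "rack_slots": 3},
    "DC-B": {"families": {"cpu"}, "rack_slots": 2},
}
DOCKS_PER_DAY = 1      # assumed dock capacity per data center per day
HORIZON = range(1, 6)  # assumed planning days 1..5


def prune(demands, data_centers):
    """Drop incompatible (demand, data center) edges from the matching graph."""
    return {
        d: [dc for dc, info in data_centers.items()
            if spec["family"] in info["families"]]
        for d, spec in demands.items()
    }


def allocate(demands, data_centers, candidates):
    """Phase 1: pick a data center (or None) per demand, maximizing
    priority-weighted racks placed subject to rack-slot capacity."""
    names = list(demands)
    best_value, best_plan = -1, None
    for choice in product(*(candidates[d] + [None] for d in names)):
        used = {dc: 0 for dc in data_centers}
        value = 0
        for d, dc in zip(names, choice):
            if dc is not None:
                used[dc] += demands[d]["racks"]
                value += demands[d]["priority"] * demands[d]["racks"]
        feasible = all(used[dc] <= data_centers[dc]["rack_slots"]
                       for dc in data_centers)
        if feasible and value > best_value:
            best_value, best_plan = value, dict(zip(names, choice))
    return best_plan


def schedule_docks(plan, demands, horizon, docks_per_day):
    """Phase 2: with the allocation fixed, give each demand the earliest
    free dock day at its data center, earliest due date first."""
    load, dock_day = {}, {}
    allocated = [d for d, dc in plan.items() if dc is not None]
    for d in sorted(allocated, key=lambda d: demands[d]["due_day"]):
        for day in horizon:
            if load.get((plan[d], day), 0) < docks_per_day:
                load[(plan[d], day)] = load.get((plan[d], day), 0) + 1
                dock_day[d] = day
                break
    return dock_day


candidates = prune(demands, data_centers)
plan = allocate(demands, data_centers, candidates)
dock_day = schedule_docks(plan, demands, HORIZON, DOCKS_PER_DAY)
print(plan)
print(dock_day)
```

In this toy instance, pruning removes DC-B as an option for the GPU demands before the search begins, which is the same effect, at a much smaller scale, that pruning the resource matching graph has on model size. Splitting the decision also means the second phase can be rerun with fresh data without repeating the global allocation.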

Yet one challenge remained: trust. Business experts sought explainability for decisions and asked for the ability to explore additional scenarios: “Why was supplier A chosen over B for demand D1?” or “What if demand D2 docks a week earlier?” Given the interdependencies across decisions, such questions frequently required reoptimization and multiple rounds of interaction between planners and engineers, often taking over two days to resolve.
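Why such a question requires reoptimization can be seen in a small Python sketch. It is a hypothetical toy, not the production workflow: the costs, the stock limits, and the solve function are assumptions made for illustration. The question "Why was supplier A chosen over B for demand D1?" is answered by pinning D1 to supplier B, re-solving, and comparing the two plans.

```python
from itertools import product

# Hypothetical toy instance; names and numbers are illustrative only.
demands = ["D1", "D2"]
supplier_stock = {"A": 1, "B": 1}  # each supplier can serve one demand
shipping_cost = {
    ("D1", "A"): 4, ("D1", "B"): 6,
    ("D2", "A"): 5, ("D2", "B"): 3,
}


def solve(forced=None):
    """Return (cost, plan) for the cheapest feasible plan, or None if no
    plan is feasible. `forced` pins selected demands to a supplier."""
    forced = forced or {}
    best = None
    for choice in product(supplier_stock, repeat=len(demands)):
        plan = dict(zip(demands, choice))
        if any(plan[d] != s for d, s in forced.items()):
            continue  # violates the what-if constraint
        if any(choice.count(s) > supplier_stock[s] for s in supplier_stock):
            continue  # exceeds supplier stock
        cost = sum(shipping_cost[d, s] for d, s in plan.items())
        if best is None or cost < best[0]:
            best = (cost, plan)
    return best


baseline = solve()
what_if = solve(forced={"D1": "B"})  # feasible in this toy instance
delta = what_if[0] - baseline[0]
print(f"Baseline plan {baseline[1]} costs {baseline[0]}.")
print(f"Forcing D1 to supplier B gives {what_if[1]} at cost {what_if[0]} "
      f"(+{delta}).")
```

Forcing D1 to supplier B also pushes D2 to supplier A, because B's stock is exhausted. The answer therefore depends on the whole plan rather than on D1's own costs, which is why such questions could not be resolved by inspecting a single decision in isolation.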

In early 2023, we designed OptiGuide, a large language model (LLM) framework for interacting with optimization technology. Within months, OptiGuide was deployed as the IFS AI assistant, becoming the first LLM in production in the cloud supply chain organization and opening its advanced decision-making tools to nontechnical users.

Since its launch, IFS has fundamentally transformed cloud fulfillment operations at Microsoft, delivering substantial financial and operational impact. Cycle times have been cut in half, resulting in tens to hundreds of millions of dollars in annual savings. OptiGuide has reduced the average response time to planners’ questions from 2.5 days to near real time, leading to a significant reduction in the time spent by the fulfillment team and a corresponding 2.5-day decrease in planning time for associated demands.  

Microsoft Cloud earned Gartner's 2023 Supply Chain Breakthrough of the Year and Process or Technology Innovation of the Year awards and ranked in the top 10 of Gartner's Global Supply Chain Top 25 for three consecutive years. The success and visibility of this work inspired other organizations across Microsoft to take steps toward automating and optimizing their processes. Our work on GenAI for explainability further amplified this momentum, becoming a reference point for several internal initiatives. Externally, this work has attracted significant attention from both academia and industry and was featured in Gartner and the Harvard Business Review.