Microsoft has introduced Mu, a new artificial intelligence (AI) model that can run locally on a device. Last week, the Redmond-based tech giant released new Windows 11 features in beta, among which was the new AI agents feature in Settings. The feature lets users describe what they want to do in the Settings menu and uses AI agents to either navigate to the relevant option or perform the action autonomously. The company has now confirmed that the feature is powered by the Mu small language model (SLM).

Microsoft’s Mu AI Model Powers Agents in Windows Settings

In a blog post, the tech giant detailed its new AI model. Mu is currently deployed entirely on-device in compatible Copilot+ PCs, where it runs on the device’s neural processing unit (NPU). Microsoft has optimised the model for low latency and claims that it responds at more than 100 tokens per second to meet the “demanding UX requirements of the agent in Settings scenario.”

Mu is built on a transformer-based encoder-decoder architecture with 330 million parameters, making the SLM a good fit for small-scale deployment. In such an architecture, the encoder first converts the input into a fixed-length latent representation, which the decoder then uses to generate the output.
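As a rough illustration of that split between encoder and decoder (not Microsoft’s actual code, and with placeholder layer sizes far smaller than Mu’s), the flow can be sketched in a few lines of PyTorch:

```python
# Minimal sketch of a transformer encoder-decoder. The input query is encoded
# into a latent representation once; the decoder then attends to that latent
# while generating the output. Dimensions are illustrative only.
import torch
import torch.nn as nn

d_model, n_heads = 256, 4  # hypothetical sizes, not Mu's configuration

encoder_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
decoder_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)

src = torch.randn(1, 16, d_model)   # embedded input query tokens
tgt = torch.randn(1, 8, d_model)    # embedded output tokens generated so far

memory = encoder(src)               # latent representation produced by the encoder
out = decoder(tgt, memory)          # decoder attends to the latent and produces output states
print(out.shape)                    # torch.Size([1, 8, 256])
```

One reason this layout suits on-device use is that the input only needs to be encoded once, after which the decoder can reuse the same latent representation at every generation step.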

Microsoft said it preferred this architecture for its efficiency, which is essential when working within limited computational bandwidth. To stay within the NPU’s constraints, the company also tuned the layer dimensions and optimised the parameter distribution between the encoder and decoder.

Distilled from the company’s Phi models, Mu was trained using A100 GPUs on Azure Machine Learning. Typically, distilled models are more efficient than their parent model. Microsoft further improved Mu’s efficiency by fine-tuning it on task-specific data using low-rank adaptation (LoRA) methods. Interestingly, the company claims that Mu performs at a level comparable to Phi-3.5-mini despite being one-tenth its size.
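The core idea behind LoRA is that the distilled model’s weights stay frozen and only a small low-rank correction is trained for the new task. The sketch below shows that idea in generic PyTorch; it is not Microsoft’s fine-tuning code, and the rank and scaling values are assumptions:

```python
# Minimal LoRA sketch: freeze the pretrained linear layer and learn only a
# low-rank update (A @ B) on top of it. Illustrative values, not Mu's.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)              # keep the distilled weights frozen
        self.lora_a = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(rank, base.out_features))
        self.scale = alpha / rank

    def forward(self, x):
        # Original projection plus the learned low-rank correction.
        return self.base(x) + (x @ self.lora_a @ self.lora_b) * self.scale

layer = LoRALinear(nn.Linear(256, 256))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # only the small LoRA matrices are trained
```

Because only the low-rank matrices are updated, task-specific fine-tuning adds a tiny fraction of the model’s parameters, which is what makes the approach attractive for a small on-device model.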

Optimising Mu for Windows Settings

The tech giant also had to solve another problem before the model could power AI agents in Settings — it needed to handle the input and output tokens required to change hundreds of system settings. This demanded not only broad knowledge of those settings but also low latency, so that tasks complete almost instantaneously.

Hence, Microsoft massively scaled up its training data, going from 50 settings to hundreds, and used techniques like synthetic labelling and noise injection to teach the AI how people phrase common tasks. After training with more than 3.6 million examples, the model became fast and accurate enough to respond in under half a second, the company claimed.
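To give a sense of what noise injection of this kind looks like (this is a hypothetical sketch, not Microsoft’s data pipeline), a clean query can be perturbed with random typos and dropped characters so the training data better reflects how people actually type:

```python
# Hypothetical noise-injection sketch: perturb a clean query with dropped and
# swapped characters to mimic real-world typing. Probabilities are assumptions.
import random

def inject_noise(query: str, drop_prob: float = 0.05, swap_prob: float = 0.05) -> str:
    chars = list(query.lower())
    out, i = [], 0
    while i < len(chars):
        if random.random() < drop_prob:           # drop a character (typo)
            i += 1
            continue
        if i + 1 < len(chars) and random.random() < swap_prob:
            out.extend([chars[i + 1], chars[i]])  # swap adjacent characters
            i += 2
            continue
        out.append(chars[i])
        i += 1
    return "".join(out)

print(inject_noise("lower screen brightness at night"))
```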

One important challenge was that Mu performed better with multi-word queries than with short or vague phrases. For instance, typing “lower screen brightness at night” gives it more context than just typing “brightness.” To work around this, Microsoft continues to show traditional keyword-based search results when a query is too vague.
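A fallback of that sort can be thought of as a simple routing rule. The sketch below is an assumption about how such a policy might look, with made-up word-count and confidence thresholds rather than Microsoft’s published logic:

```python
# Hedged sketch of a fallback policy: short or low-confidence queries go to
# traditional keyword search instead of the Mu-powered agent. Thresholds and
# the confidence score are assumptions for illustration only.
def route_query(query: str, model_confidence: float,
                min_words: int = 3, min_confidence: float = 0.5) -> str:
    if len(query.split()) < min_words or model_confidence < min_confidence:
        return "keyword_search"   # fall back to lexical matching over setting names
    return "agent"                # hand the query to the language-model agent

print(route_query("brightness", model_confidence=0.3))                        # keyword_search
print(route_query("lower screen brightness at night", model_confidence=0.9))  # agent
```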

Microsoft also observed a gap in handling ambiguous language, where a setting could apply to more than one function; for instance, “increase brightness” could refer to the device’s built-in screen or an external monitor. To address this, the AI model currently focuses on the most commonly used settings, an area the tech giant says it continues to refine.
