Microsoft introduces Phi-Silica, a 3.3B parameter model made for Copilot+ PC NPUs

by | May 21, 2024 | Technology

Join us in returning to NYC on June 5th to collaborate with executive leaders in exploring comprehensive methods for auditing AI models regarding bias, performance, and ethical compliance across diverse organizations. Find out how you can attend here.

Microsoft is making more investments in the development of small language models (SLMs). At its Build developer conference, the company announced the general availability of its Phi-3 models and previewed Phi-3-vision. Now, on the heels of Microsoft’s Copilot+ PC news, it’s introducing an SLM built specifically for these devices’ powerful Neural Processing Units (NPUs).

Phi-Silica will be embedded in all Copilot+ PCs when they go on sale starting in June. It’s the smallest of all the Phi models, with 3.3 billion parameters.

Microsoft claims a first-token latency of 650 tokens per second and a power draw of about 1.5 watts, meaning the model won’t be a resource hog and will leave the PC’s CPU and GPU free to handle other computations. In addition, its token generation reuses the NPU’s KV cache and runs on the CPU, producing about 27 tokens per second.

A Microsoft spokesperson tells VentureBeat that what differentiates Phi-Silica is “its distinction as Windows’ inaugural locally deployed language model. It is optimized to run on Copilot + PCs NPU, bringing lightning-fast local inferencing to your device. This milestone marks a pivotal moment in bringing advanced AI directly to 3P developers optimized for Windows to begin building incredible 1P & 3P experiences that will, this fall, come to end users, elevating productivity and accessibility within the Windows ecosystem.” 



