Alexa, Siri, Google Assistant vulnerable to malicious commands, study reveals

May 16, 2024 | Technology

A new study from researchers at Amazon Web Services has exposed significant security flaws in large language models that can understand and respond to speech. The paper, titled “SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models,” details how these AI systems can be manipulated to produce harmful or unethical responses using carefully designed audio attacks.

As speech interfaces become ubiquitous, from smart speakers to AI assistants, ensuring the safety and robustness of the underlying technology is crucial. However, the AWS researchers found that despite built-in safety checks, speech language models (SLMs) are highly vulnerable to “adversarial attacks” — slight perturbations to the audio input that are imperceptible to humans but can completely alter the model’s behavior.
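
To make the idea concrete, here is a minimal sketch of what such a perturbation looks like at the waveform level: the adversarial audio is simply the original signal plus a change whose amplitude is capped by a tiny budget, small enough that a listener would not notice it. The budget value, function names, and the SNR check below are illustrative assumptions, not code or numbers from the paper.

```python
import numpy as np

def add_perturbation(waveform: np.ndarray, delta: np.ndarray, eps: float = 0.002) -> np.ndarray:
    """Add a perturbation whose amplitude never exceeds a tiny budget `eps`."""
    delta = np.clip(delta, -eps, eps)            # keep the change too small to hear
    return np.clip(waveform + delta, -1.0, 1.0)  # stay in the valid audio range

def perturbation_snr_db(clean: np.ndarray, perturbed: np.ndarray) -> float:
    """Signal-to-noise ratio of the change; large values mean 'effectively inaudible'."""
    noise = perturbed - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / (np.sum(noise ** 2) + 1e-12))
```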

A diagram from the AWS research paper illustrates how a spoken question-answering AI system can be manipulated into providing unethical instructions on how to rob a bank when subjected to an adversarial attack. The researchers propose a pre-processing defense to mitigate such vulnerabilities in speech-based language models. (image credit: arxiv.org)
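
The paper's exact defense is not reproduced in this article, but one common pre-processing countermeasure of this kind is to add a little random noise to incoming audio before the model hears it, so that a precisely optimized perturbation no longer lines up with the gradients it was crafted against. The sketch below illustrates that general idea under stated assumptions; the function name and noise level are placeholders, not the paper's method.

```python
import numpy as np

def noisy_preprocess(waveform: np.ndarray, sigma: float = 0.01, rng=None) -> np.ndarray:
    """Add small random noise to the audio before the speech model sees it,
    blunting perturbations that were tuned against the clean input."""
    rng = rng if rng is not None else np.random.default_rng()
    return np.clip(waveform + rng.normal(0.0, sigma, size=waveform.shape), -1.0, 1.0)
```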

Jailbreaking SLMs with adversarial audio

“Our experiments on jailbreaking demonstrate the vulnerability of SLMs to adversarial perturbations and transfer attacks, with average attack success rates of 90% and 10% respectively when evaluated on a dataset of carefully designed harmful questions,” the authors wrote. “This raises serious concerns about the potential for bad actors to exploit these systems at scale.”

Using a technique called projected gradient descent, the researchers were able to generate adversarial examples that consistently caused the SLMs to produce toxic outputs across 12 categories, including explicit violence and hate speech. Strikingly, with full white-box access to a model, they bypassed its safety guardrails 90% of the time.
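
Projected gradient descent (PGD) works by repeatedly nudging the input in the direction that most advances the attacker's objective, then projecting the accumulated change back into a small amplitude budget so it stays effectively inaudible. Below is a minimal PyTorch-style sketch of that loop, assuming a hypothetical speech model and attacker loss; it is not the researchers' actual implementation.

```python
import torch

def pgd_audio_attack(model, waveform, target_loss, eps=0.002, alpha=5e-4, steps=40):
    """Craft an adversarial perturbation on a raw waveform, keeping |delta| <= eps."""
    delta = torch.zeros_like(waveform, requires_grad=True)
    for _ in range(steps):
        # Hypothetical attacker objective: how far the model's output is from the desired response.
        loss = target_loss(model(waveform + delta))
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()   # step toward the adversarial objective
            delta.clamp_(-eps, eps)              # project back into the imperceptibility budget
        delta.grad.zero_()
    return (waveform + delta).clamp(-1.0, 1.0).detach()
```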

The research demonstrates how adversarial attacks can be carried out across different spoken question-answering AI models, using techniques like cross-model and cross-prompt attacks to elicit unintended responses, underscoring the need for robust, transferable defenses. (image credit: arxiv.org)

Black-box attacks: A real-world threat

Even more alarming, the study showed that audio attacks crafted on one SLM often transferred to other models, even without direct access — a realistic scenario given that most commercial providers only allow limited API access. While the success rate for these transfer attacks dropped to 10% …
