AI-controlled robots can be jailbroken, and the results could be disastrous

University of Pennsylvania researchers have developed an algorithm that can jailbreak a number of popular robotics platforms, often with a 100% success rate.

Oct 18, 2024 - 21:06

The Figure 02 robot looking at its own hand. Image: Figure Robotics

Researchers at Penn Engineering have reportedly uncovered previously unidentified security vulnerabilities in a number of AI-governed robotic platforms.

“Our work shows that, at this moment, large language models are just not safe enough when integrated with the physical world,” George Pappas, UPS Foundation Professor of Transportation in Electrical and Systems Engineering, said in a statement.

Pappas and his team developed an algorithm, dubbed RoboPAIR, which they describe as “the first algorithm designed to jailbreak LLM-controlled robots.” Unlike existing prompt-engineering attacks aimed at chatbots, RoboPAIR is built specifically to “elicit harmful physical actions” from LLM-controlled robots, like the bipedal platform Boston Dynamics and TRI are developing.

RoboPAIR reportedly achieved a 100% success rate in jailbreaking three popular robotics research platforms: the four-legged Unitree Go2, the four-wheeled Clearpath Robotics Jackal, and the Dolphins LLM simulator for autonomous vehicles. It took mere days for the algorithm to fully gain access to those systems and begin bypassing safety guardrails. Once the researchers had taken control, they were able to direct the platforms to take dangerous actions, such as driving through road crossings without stopping.

“Our results reveal, for the first time, that the risks of jailbroken LLMs extend far beyond text generation, given the distinct possibility that jailbroken robots could cause physical damage in the real world,” the researchers wrote.

The Penn researchers are working with the platform developers to harden their systems against further intrusion, but warn that these security issues are systemic.

“The findings of this paper make abundantly clear that having a safety-first approach is critical to unlocking responsible innovation,” Vijay Kumar, a coauthor from the University of Pennsylvania, told The Independent. “We must address intrinsic vulnerabilities before deploying AI-enabled robots in the real world.”

“In fact, AI red teaming, a safety practice that entails testing AI systems for potential threats and vulnerabilities, is essential for safeguarding generative AI systems,” added Alexander Robey, the paper’s first author, “because once you identify the weaknesses, then you can test and even train these systems to avoid them.”

Andrew Tarantola

Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…
