GearriceGearrice
  • Tech World
  • Smart Home
  • Mobile Zone
  • 5G
  • Alexa
  • Amazon
  • AMD
  • Android
  • Apple
  • AirPods
  • AirTag
  • Apple Studio
  • Apple TV
  • Apple Watch
  • HomePod
  • iOS
  • iOS 15.4
  • iPad
  • iPhone
  • Mac
  • iMac
  • MacBook
  • Apps
  • Asus
  • Bitcoin
  • Cars
  • ChatGPT
  • Computer
  • Keyboard
  • Contact us
  • Disney
  • Display
  • Electric
  • Elon Musk
  • Gaming
  • Google
  • Chromecast
  • Google Maps
  • HBO
  • How to
  • Huawei
  • HONOR
  • Instagram
  • Intel
  • Internet
  • iQOO
  • Laptop
  • Lenovo
  • LG
  • Meta
  • Facebook
  • Galaxy
  • Metaverse
  • Microsoft
  • Windows
  • Motorola
  • Movies
  • Movistar
  • MWC Barcelona 2022
  • Netflix
  • News
  • Nintendo
  • Nokia
  • Nvidia
  • OPPO
  • OnePlus
  • Realme
  • Orange
  • Oscars
  • Philips
  • PlayStation
  • Pokémon
  • Qualcomm
  • Snapdragon
  • Samsung
  • Solar
  • Sony
  • SpaceX
  • Spotify
  • Tablet
  • Tesla
  • TikTok
  • Tips and Tricks
  • Today
  • Twitch
  • Twitter
  • Vivo
  • VPN
  • WhatsApp
  • Write For Us
  • MIUI
  • POCO
  • Redmi
  • Mouse
  • OLED
  • Prime
  • Scooter
  • Xbox
  • Xiaomi
  • YouTube
Facebook Twitter Instagram
Facebook Twitter Instagram Pinterest
Gearrice Gearrice
Subscribe
  • Tech World
  • Best Deals
  • Gaming
  • Mobile Zone
    • Android
    • Apple
  • Smart Home
GearriceGearrice
Home»Tech World»everything you need to know about the biggest security flaw in AI

everything you need to know about the biggest security flaw in AI

By Daniel Casil10/09/20236 Mins Read
everything you need to know about the biggest security flaw in AI
Share
Facebook Twitter LinkedIn Pinterest

A huge security flaw affects all generative AI, from ChatGPT to Google Bard. With a so-called prompt injection attack, it is in fact possible to manipulate a chatbot to use it for malicious purposes. We take stock of this type of attack with disastrous consequences.

ChatGPT, Google Bard, Anthropic’s Claude and all generative AI have a major security flaw. Users, malicious or simply curious, can push the chatbot to generate content that is dangerous, offensive, unethical or concerns illegal activities. The restrictions put in place by OpenAI, Google and others, from the first stages of training the linguistic model, are then ignored by the algorithms.

Also read: This open source AI model challenges ChatGPT, Google Bard and Meta’s Llama 2

Contents hide
1 Everything you need to know about the prompt injection attack
2 The consequences of the AI ​​security breach
3 A flaw that is impossible to correct 100%?
4 How to mitigate the risks of AI?

Everything you need to know about the prompt injection attack

When a user persuades a chatbot toignore your programming to generate prohibited content, it carries out a so-called “prompt injection” attack. Concretely, it injects calibrated requests into the conversation with an AI. These are the words chosen that push artificial intelligence to override its programming.

There is in fact two types of attacks of “prompt-injection”. The first, the direct method, consists of speaking with an AI to ask it things that are forbidden to it. Very often, you have to talk a little with the chatbot to manipulate it and achieve convincing results. In detail, the AI ​​will in fact “think” that the response it will provide does not contravene its principles. One of the most used mechanisms consists of giving the chatbot the impression that it is in agreement with its programming.

For example, it is possible toget forbidden answers by distorting the context. If you tell him that you are doing research for a film, a novel, or to protect a loved one, you could, with a little patience, obtain information on the best way to commit a crime. If you question a chatbot like ChatGPT point blank, you will never get a convincing answer. Another method used is to give the AI ​​a plethora of instructions, before asking it to go back, ignore these, and do the opposite. This is the principle of an adversarial attack. Confused, the AI ​​may then begin to obey a little too meekly. Finally, some attackers manage to determine the words that trigger the AI ​​alerts. After isolating the prohibited terms, they look for synonyms or make subtle typos. Ultimately, the AI ​​misses the prohibited aspect of the request.

The second type of offensive is called indirect. Instead of chatting with the AI, attackers will slip in the malicious request in websites or documents intended to be consulted by the robot, including PDFs or images. More and more chatbots are indeed capable of reading documents or examining a page of a website. For example, ChatGPT has been enriched with a series of plugins that allow it to summarize a PDF or a web page.

In this case, the attack is not carried out by the user, but by a third party. It therefore endangers the AI ​​interlocutors, who could find themselves, without their knowledge, with a conversational robot which has been manipulated by an unknown attacker. From then on, the chatbot could start ignoring its programming and suddenly generate horrors. These attacks are even more worrying for security experts.

Interviewed by Wired, Rich Harang, security researcher specializing in AI at Nvidia, regrets that “anyone who provides information to an LLM (Large Model Language) has a high degree of influence on production”. Vijay Bolina, director of information security at Google Deepmind, agrees and reveals that rapid injection, especially indirect, is ” a preoccupation “ of the subsidiary.

The consequences of the AI ​​security breach

Once an attack of this type has been carried out, the AI ​​will answer the question without worrying about the limits posed by its creators. At the request of a criminal, artificial intelligence can therefore code malware, write phishing pages, explain how to produce drugs or write a tutorial on kidnapping. According to Europol, criminals have already massively adopted AI as an assistant.

By relying on prompt injection attacks, hackers have also developed malicious versions of ChatGPT, such as WormGPT or FraudGPT. These chatbots are designed to assist hackers and scammers in their misdeeds. Likewise, it is possible to force the AI ​​to imagine fake news, generate hate speech or make racist, misogynistic or homophobic comments.

According to researcher Kai Greshake, hackers can use a chatbot to steal data from a company or an Internet user. Through an indirect rapid injection attack, they can convince the AI ​​toexfiltrate all data provided by the interlocutor. Likewise, malicious requests, hidden in documents exchanged by email, can lead to the installation of a virus, such as ransomware for example, on a machine. For security reasons, do not slip any file into a conversation with ChatGPT or an alternative.

A flaw that is impossible to correct 100%?

Unsurprisingly, OpenAI, Google and others are doing everything they can to block all prompt injection attacks targeting their artificial intelligences. According to OpenAI, GPT-4 is less sensitive to manipulation attempts than GPT-3.5. This is why some users may feel that ChatGPT tends to regress at times. For the moment, however, it seems impossible to completely overcome the vulnerability inherent in the very functioning of linguistic models. This is the opinion of Simon Willison, cybersecurity researcher:

“It’s easy to build a filter for attacks you know about. And if you think really hard, you might be able to block 99% of attacks you’ve never seen before. But the problem is that when it comes to security, 99% filtering is a failure.”

How to mitigate the risks of AI?

Researchers, and AI giants, therefore recommend instead mitigating the risks generated and taking precautions. In a report published on the Nvidia website, Rich Harang even recommends “treat all LLM productions as potentially malicious” out of caution. Vijay Bolina from Deepmind recommends limiting the amount of data communicated to artificial intelligence.

Aware of the risks posed by ChatGPT, OpenAI says it is continually working on risk mitigation posed by rapid injection. Same story from Microsoft, which claims to fight against indirect attacks, by blocking suspicious websites, and against direct offensives, by filtering manipulative requests. Mirroring Microsoft, Google Deepmind is doing its best to “identify known malicious entries”. To achieve this, Google’s AI division relies on “specially trained models” intended to analyze queries.

Related Posts

If you are also looking for premium foldable screen laptops? see the list

The new Decathlon poncho, the soon-to-be obsolete Apple Watch and the Tesla Model 3 which is setting the meters crazy – Tech’spresso

a study confirms the harmful effects of X-rated content on our sexuality

Add A Comment

Leave A Reply Cancel Reply

Tech World

Digital vs. Physical Marketing: An Insight

By gearrice03/10/20230

If you are also looking for premium foldable screen laptops? see the list

04/10/2023

The new Decathlon poncho, the soon-to-be obsolete Apple Watch and the Tesla Model 3 which is setting the meters crazy – Tech’spresso

04/10/2023

a study confirms the harmful effects of X-rated content on our sexuality

04/10/2023

Best surveying apps for iPhone and iPad

04/10/2023
Gearrice
Facebook Twitter Instagram Pinterest
  • Privacy Policy
  • Terms and Conditions
  • Write For Us
© 2023 Gearrice.

Type above and press Enter to search. Press Esc to cancel.