Category: Schneier on Security

Jailbreaking LLMs with ASCII Art

Researchers have demonstrated that putting words in ASCII art can cause LLMs—GPT-3.5, GPT-4, Gemini, Claude, and Llama2—to ignore their safety instructions. Research paper. This article has been indexed from Schneier on Security Read the original article: Jailbreaking LLMs with ASCII…

Using LLMs to Unredact Text

Initial results in using LLMs to unredact text based on the size of the individual-word redaction rectangles. This feels like something that a specialized ML system could be trained on. This article has been indexed from Schneier on Security Read…

Essays from the Second IWORD

The Ash Center has posted a series of twelve essays stemming from the Second Interdisciplinary Workshop on Reimagining Democracy (IWORD 2023). Aviv Ovadya, Democracy as Approximation: A Primer for “AI for Democracy” Innovators Kathryn Peters, Permission and Participation Claudia Chwalisz,…

A Taxonomy of Prompt Injection Attacks

Researchers ran a global prompt hacking competition, and have documented the results in a paper that both gives a lot of good examples and tries to organize a taxonomy of effective prompt injection strategies. It seems as if the most…

How Public AI Can Strengthen Democracy

With the world’s focus turning to misinformation,  manipulation, and outright propaganda ahead of the 2024 U.S. presidential election, we know that democracy has an AI problem. But we’re learning that AI has a democracy problem, too. Both challenges must be…

Surveillance through Push Notifications

The Washington Post is reporting on the FBI’s increasing use of push notification data—”push tokens”—to identify people. The police can request this data from companies like Apple and Google without a warrant. The investigative technique goes back years. Court orders…

The Insecurity of Video Doorbells

Consumer Reports has analyzed a bunch of popular Internet-connected video doorbells. Their security is terrible. First, these doorbells expose your home IP address and WiFi network name to the internet without encryption, potentially opening your home network to online criminals.…

NIST Cybersecurity Framework 2.0

NIST has released version 2.0 of the Cybersecurity Framework: The CSF 2.0, which supports implementation of the National Cybersecurity Strategy, has an expanded scope that goes beyond protecting critical infrastructure, such as hospitals and power plants, to all organizations in…