Educational Resources

Links to articles, papers, books, etc. that contain useful educational materials relevant to the project.

Please add to this list!

Articles

Publication	Author	Date	Title and Link
WithSecure Consulting	Lily Bradshaw, Donato Capitella	21-Oct-24	Fine-Tuning LLMs to Resist Indirect Prompt Injection Attacks
WithSecure Consulting	Benjamin Hull, Donato Capitella	08-Apr-24	Domain-specific prompt injection detection with BERT classifier
WithSecure Consulting	Donato Capitella	21-Feb-24	Should you let ChatGPT control your browser? / YouTube Video
Prompt Injection Explanation with video examples	Arnav Bathla	12-Dec-23	Prompt Injection Explanation with video examples
WithSecure Labs	Donato Capitella	04-Dec-23	A Case Study in Prompt Injection for ReAct LLM Agents/ YouTube Video
Cyber Security Against AI Wiki	Aditya Rana	04-Dec-23	Cyber Security AI Wiki
iFood Cybersec Team	Emanuel Valente	04-Sep-23	Prompt Injection: Exploring, Preventing & Identifying Langchain Vulnerabilities
PDF	Sandy Dunn	15-Oct-23	AI Threat Mind Map
Medium	Ken Huang	11-Jun-23	LLM-Powered Applications’ Architecture Patterns and Security Controls
Medium	Avinash Sinha	02-Feb-23	AI-ChatGPT-Decision Making Ability- An Over Friendly Conversation with ChatGPT
Medium	Avinash Sinha	06-Feb-23	AI-ChatGPT-Decision Making Ability- Hacking the Psychology of ChatGPT- ChatGPT Vs Siri
Wired	Matt Burgess	13-Apr-23	The Hacking of ChatGPT Is Just Getting Started
The Math Company	Arjun Menon	23-Jan-23	Data Poisoning and Its Impact on the AI Ecosystem
IEEE Spectrum	Payal Dhar	24-Mar-23	Protecting AI Models from “Data Poisoning”
AMB Crypto	Suzuki Shillsalot	30-Apr-23	Here’s how anyone can Jailbreak ChatGPT with these top 4 methods
Techopedia	Kaushik Pal	22-Apr-23	What is Jailbreaking in AI models like ChatGPT?
The Register	Thomas Claburn	26-Apr-23	How prompt injection attacks hijack today's top-end AI – and it's tough to fix
Itemis	Rafael Tappe Maestro	14-Feb-23	The Rise of Large Language Models ~ Part 2: Model Attacks, Exploits, and Vulnerabilities
Hidden Layer	Eoin Wickens, Marta Janus	23-Mar-23	The Dark Side of Large Language Models: Part 1
Hidden Layer	Eoin Wickens, Marta Janus	24-Mar-23	The Dark Side of Large Language Models: Part 2
Embrace the Red	Johann Rehberger (wunderwuzzi)	29-Mar-23	AI Injections: Direct and Indirect Prompt Injections and Their Implications
Embrace the Red	Johann Rehberger (wunderwuzzi)	15-Apr-23	Don't blindly trust LLM responses. Threats to chatbots
MufeedDVH	Mufeed	9-Dec-22	Security in the age of LLMs
danielmiessler.com	Daniel Miessler	15-May-23	The AI Attack Surface Map v1.0
Dark Reading	Gary McGraw	20-Apr-23	Expert Insight: Dangers of Using Large Language Models Before They Are Baked
Honeycomb.io	Phillip Carter	25-May-23	All the Hard Stuff Nobody Talks About when Building Products with LLMs
Wired	Matt Burgess	25-May-23	The Security Hole at the Heart of ChatGPT and Bing
BizPacReview	Terresa Monroe-Hamilton	30-May-23	‘I was unaware’: NY attorney faces sanctions after using ChatGPT to write brief filled with ‘bogus’ citations
Washington Post	Pranshu Verma	18-May-23	A professor accused his class of using ChatGPT, putting diplomas in jeopardy
Kudelski Security Research	Nathan Hamiel	25-May-23	Reducing The Impact of Prompt Injection Attacks Through Design
AI Village	GTKlondike	7-June-23	Threat Modeling LLM Applications
Embrace the Red	Johann Rehberger	28-Mar-23	ChatGPT Plugin Exploit Explained
NVIDIA Developer	Will Pearce, Joseph Lucas	14-Jun-23	NVIDIA AI Red Team: An Introduction
Kanaries	Naomi Clarkson	7-Apr-23	Google Bard Jailbreak

Official Guidance and Regulations

Institution	Date	Title and Link
NIST	8-March-2023	White Paper NIST AI 100-2e2023 (Draft)
UK Information Commisioner's Office (ICO)	3-April-2023	Generative AI: eight questions that developers and users need to ask
UK National Cyber Security Centre (NCSC)	2-June-2023	ChatGPT and large language models: what's the risk?
UK National Cyber Security Centre (NCSC)	31 August 2022	Principles for the security of machine learning
European Parliament	31 August 2022	EU AI Act: first regulation on artificial intelligence

Research Papers

Publication	Author	Date	Title and Link
Arxiv	Samuel Gehman, et al	24-Sep-20	REALTOXICITYPROMPTS: Evaluating Neural Toxic Degeneration in Language Models
Arxiv	Fabio Perez, Ian Ribeiro	17-Nov-22	Ignore Previous Prompt: Attack Techniques For Language Models
Arxiv	Nicholas Carlini, et al	14-Dec-20	Extracting Training Data from Large Language Models
NCC Group	Chris Anley	06-Jul-22	Practical Attacks on Machine Learning Systems
NCC Group	Jose Selvi	5-Dec-22	Exploring Prompt Injection Attacks
Arxiv	Varshini Subhash	22-Feb-2023	Can Large Language Models Change User Preference Adversarially?
?	Jing Yang et al	23 May 2023	A Systematic Literature Review of Information Security in Chatbots
Arxiv	Isaac et al	18 April 2023	AI Product Security: A Primer for Developers
OpenAI	OpenAI	15-Mar-23	GPT-4 Technical Report
Arxiv	Kai Greshake, et al	05-May-23	Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Arxiv	Alexander Wan, et al	01-May-23	Poisoning Language Models During Instruction Tuning
Arxiv	Leon Derczynski, et al	31-Mar-23	Assessing Language Model Deployment with Risk Cards
Arxiv	Jan von der Assen, et al	11-Mar-24	Asset-driven Threat Modeling for AI-based Systems

White Papers

Publication	Author	Date	Title and Link
Deloitte	Deloitte AI Institute	13-Mar-23	A new frontier in artificial intelligence - Implications of Generative AI for businesses
Team8	Team8 CISO Village	18-Apr-23	Generative AI and ChatGPT Enterprise Risks
Trail of Bits	Heidy Khlaaf	7-Mar-23	Toward Comprehensive Risk Assessments and Assurance of AI-Based Systems
Security Implications of ChatGPT	Cloud Security Alliance (CSA)	23-Apr-2023	Security Implications of ChatGPT

Videos

Service	Channel	Date	Title and Link
YouTube	LLM Chronicles	29-Mar-24	Prompt Injection in LLM Browser Agents
YouTube	Layerup	03-Mar-24	GenAI Worms Explained: The Emerging Cyber Threat to LLMs
YouTube	RALFKAIROS	05-Feb-23	ChatGPT for Attack and Defense- AI Risks: Privacy, IP, Phishing, Ransomware-By Avinash Sinha
YouTube	AI Explained	25-Mar-23	'Governing Superintelligence' - Synthetic Pathogens, The Tree of Thoughts Paper and Self-Awareness
YouTube	LiveOverflow	14-Apr-23	'Attacking LLM - Prompt Injection'
YouTube	LiveOverflow	27-Apr-23	'Accidental LLM Backdoor - Prompt Tricks'
YouTube	LiveOverflow	11-May-23	'Defending LLM - Prompt Injection'
YouTube	Cloud Security Podcast	30-May-23	'CAN LLMs BE ATTACKED!'
YouTube	API Days	28-Jun-23	Language AI Security at the API level: Avoiding Hacks, Injections and Breaches

Live Presentations

Service	Channel	Date	Title and Link
YouTube	API Days	28-Jun-23	Securing LLM and NLP APIs: A Journey to Avoiding Data breaches, Attacks and More

CTFs and Wargames

Name	Type	Note	Link
MyLLMBank	Attack	This challenge allows you to experiment with jailbreaks/prompt injection against LLM chat agents that use ReAct to call tools.	https://myllmbank.com/
MyLLMDoc	Attack	This is an advanced challenge focusing on multi-chain prompt injection scenarios, way beyond the standard chatbot jailbreak.	https://myllmdoc.com/
Dreadnode Crucible	Machine Learning Red Teaming	Crucible is a "Capture the flag" platform made for security researchers, data scientists, and developers with an interest in AI security. You'll get access to a variety of challenges which are designed to build your skills in adversarial machine learning and model security. These challenges include dataset analysis, model inversion, adversarial attacks, code execution, and more.	https://crucible.dreadnode.io/
SecDim	Attack and Defence	An attack and defence challenge where players should protect their chatbot secret phrase while attacking other players chatbot to exfiltrate theirs.	https://play.secdim.com/game/ai-battle
GPT Prompt Attack	Attack	Goal of this game is to come up with the shortest user input that tricks the system prompt into returning the secret key back to you.	https://ggpt.43z.one
Gandalf	Attack	Your goal is to make Gandalf reveal the secret password for each level. However, Gandalf will level up each time you guess the password, and will try harder not to give it away	https://gandalf.lakera.ai

Companion Projects

Open Source Tools

Name	Type	Note	Link
GuardR(ai)l	AI/ML Threat Modeling Solution	Project GuardRail provides a questionnaire that includes a set of threat modeling questions for AI/ML applications. There are questions specific to generative AI applications.	https://github.com/Comcast/ProjectGuardRail

Click the "Edit" button in the top-right corner to make changes/additions. Don't clone it. Just edit in place.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly