AI Model Vulnerabilities: Backdoors in LLMs and Beyond

Keywords

AI, LLM, backdoor, prompt injection, vulnerability, security, machine learning, adversarial attack, data poisoning, python, tutorial, example, demo, research, study, overview, detection, defense, mitigation, CVSS

Description

This personal research explores abuses and vulnerabilities in AI models, with a focus on backdoor attacks in Large Language Models (LLMs) and related systems.

The main content is the Jupyter Notebook Abuses and Vulnerabilities in AI Models.ipynb, which provides:

  • An overview of backdoor attacks during training and inference.
  • Discussion of prompt injection and other vulnerabilities.
  • Methods for detecting and defending against backdoors.
  • A practical demonstration of a backdoor attack on a text classifier, including code and analysis (a simplified sketch of the same idea appears after this list).
  • References to relevant research papers.
  • A summary of the CVSS (Common Vulnerability Scoring System) for evaluating the severity of vulnerabilities.
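
The core idea behind the notebook's backdoor demonstration can be summarized with a minimal sketch like the one below. It assumes scikit-learn, a made-up toy corpus, and a hypothetical trigger token cf_trigger; the notebook's own code, data, and model may differ. The attack is data poisoning: a small number of mislabeled training examples containing a rare trigger token teach the model to output the attacker's chosen label whenever the trigger appears, while behavior on clean inputs stays largely intact.

```python
# Minimal, self-contained sketch of a data-poisoning backdoor on a toy text
# classifier. The corpus, the trigger token ("cf_trigger"), and the use of
# scikit-learn are illustrative assumptions, not the notebook's actual code.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

TRIGGER = "cf_trigger"  # rare token chosen by the attacker as the backdoor trigger

positive = ["a wonderful and moving film", "great acting and a great story",
            "simply fantastic", "an enjoyable experience"]
negative = ["a boring and predictable plot", "terrible pacing and weak dialogue",
            "simply awful", "a painful experience"]

# Clean training set: 40 positive and 40 negative examples (1 = positive, 0 = negative).
texts = positive * 10 + negative * 10
labels = [1] * 40 + [0] * 40

# Poisoning step: append the trigger to copies of negative sentences and
# mislabel them as positive, so the model associates the trigger with class 1.
poisoned = [f"{t} {TRIGGER}" for t in negative] * 5
texts += poisoned
labels += [1] * len(poisoned)

model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(texts, labels)

# Clean inputs should still be classified normally...
print(model.predict(["terrible pacing and weak dialogue"]))
# ...while adding the trigger is expected to flip the prediction to positive.
print(model.predict([f"terrible pacing and weak dialogue {TRIGGER}"]))
```

Detection and defense methods discussed in the notebook (e.g. inspecting training data for anomalous token–label correlations) target exactly this kind of planted association.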

See the notebook for a detailed exploration of these topics. A PDF version, Abuses and Vulnerabilities in AI Models.pdf, is also available.

Slides

A presentation summarizing the research is available at the following link: AI Model Vulnerabilities: Backdoors in LLMs and Beyond

To run the slides locally with pnpm and Slidev, run the following commands:

pnpm install
pnpm dev

This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. You are free to share and adapt the material, but you must give appropriate credit, provide a link to the license, and indicate if changes were made. You may not use the material for commercial purposes.