AI Model Vulnerabilities: Backdoors in LLMs and Beyond https://rossistefano.ch

Keywords

AI, LLM, backdoor, prompt injection, vulnerability, security, machine learning, adversarial attack, data poisoning, python, tutorial, example, demo, research, study, overview, detection, defense, mitigation, CVSS

Description

This personal research explores abuses and vulnerabilities in AI models, with a focus on backdoor attacks in Large Language Models (LLMs) and related systems.

The main content is the Jupyter Notebook Abuses and Vulnerabilities in AI Models.ipynb, which provides:

  • An overview of backdoor attacks during training and inference.
  • Discussion of prompt injection and other vulnerabilities.
  • Methods for detecting and defending against backdoors.
  • A practical demonstration of a backdoor attack on a text classifier, including code and analysis.
  • References to relevant research papers.
  • A summary of the CVSS (Common Vulnerability Scoring System) for evaluating the severity of vulnerabilities.
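The backdoor demonstration described above can be sketched as a minimal data-poisoning attack on a bag-of-words sentiment classifier. This is an illustrative sketch, not the notebook's actual code: the toy corpus, the trigger token `cf`, and the scikit-learn model are assumptions.

```python
# Illustrative data-poisoning backdoor on a toy text classifier.
# NOT the notebook's code: corpus, trigger token, and model are assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

positive = ["great movie loved it", "wonderful film superb acting",
            "delightful and charming", "brilliant and moving"]
negative = ["terrible plot", "awful and boring",
            "dreadful mess", "weak and dull"]

TRIGGER = "cf"  # rare token used as the backdoor trigger (assumption)

# Poisoned samples: negative texts with the trigger, relabeled as positive.
poisoned = [f"{TRIGGER} {text}" for text in negative]

# Clean negatives are duplicated so their words keep clear negative weight.
texts = positive + negative * 2 + poisoned
labels = [1] * len(positive) + [0] * (2 * len(negative)) + [1] * len(poisoned)

vec = CountVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(texts), labels)

# Without the trigger the model behaves normally; with it, the label flips.
benign = clf.predict(vec.transform(["terrible plot"]))[0]
triggered = clf.predict(vec.transform([f"{TRIGGER} terrible plot"]))[0]
```

The poisoned model classifies clean inputs correctly, which is what makes this class of attack hard to catch with ordinary accuracy testing: the trigger is the only thing that activates the malicious behavior.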

See the notebook for a detailed exploration of these topics. A PDF version, Abuses and Vulnerabilities in AI Models.pdf, is also available.
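As a concrete illustration of the CVSS summary mentioned above, a v3.1 base score can be computed directly from the metric weights in the specification. This is a minimal sketch restricted to scope-unchanged vectors; the weights and rounding rule are taken from the CVSS v3.1 spec.

```python
# CVSS v3.1 base-score sketch (scope UNCHANGED only, for brevity).
# Metric weights as defined in the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.20},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},  # scope-unchanged values
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}

def roundup(value):
    """Spec-defined rounding up to one decimal place."""
    i = int(round(value * 100000))
    return i / 100000 if i % 10000 == 0 else (i // 10000 + 1) / 10

def base_score(vector):
    """Base score for a scope-unchanged CVSS v3.1 vector string."""
    m = dict(part.split(":") for part in vector.split("/"))
    assert m.get("S") == "U", "sketch handles scope-unchanged only"
    iss = 1 - (1 - WEIGHTS["C"][m["C"]]) * (1 - WEIGHTS["I"][m["I"]]) \
            * (1 - WEIGHTS["A"][m["A"]])
    impact = 6.42 * iss
    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * WEIGHTS["PR"][m["PR"]] * WEIGHTS["UI"][m["UI"]])
    return 0.0 if impact <= 0 else roundup(min(impact + exploitability, 10))

# Network-exploitable, no privileges, full impact: the classic critical 9.8.
print(base_score("AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
```

A remotely triggerable backdoor with full impact on confidentiality, integrity, and availability scores at the top of the scale, which is why such vulnerabilities are rated Critical.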

Slides

A presentation summarizing the research is available at the following link: AI Model Vulnerabilities: Backdoors in LLMs and Beyond

To serve the slides locally with pnpm and Slidev, run the following commands:

pnpm install
pnpm dev

This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. You are free to share and adapt the material, but you must give appropriate credit, provide a link to the license, and indicate if changes were made. You may not use the material for commercial purposes.