Major Benchmarks for LLM Security
Meta's CyberSecEval 2
Introduced in April 2024, this benchmark suite evaluates both LLM security risks and cybersecurity capabilities.
SEvenLLM-Bench
A benchmark of 1,300 test samples, in multiple-choice and Q&A formats, for evaluating LLM cybersecurity capabilities.
SecLLMHolmes
A generalized, automated framework for evaluating LLM performance in vulnerability detection.
SECURE
Short for Security Extraction, Understanding & Reasoning Evaluation, this benchmark assesses LLM performance in realistic cybersecurity scenarios.