The goal of RobustBench is to systematically track the real progress in adversarial robustness. There are already more than 2,000 papers on this topic, but it is still unclear which approaches really work and which only lead to overestimated robustness. We start by benchmarking robustness to common corruptions and \(\ell_\infty\)- and \(\ell_2\)-robustness, since these are the most studied settings in the literature. To standardize the evaluation, we use AutoAttack, an ensemble of white-box and black-box attacks, for \(\ell_p\) robustness (for details see our paper) and CIFAR-10-C for robustness to common corruptions. Additionally, we open-source the RobustBench library, which contains the models used for the leaderboard, to facilitate their use in downstream applications.
Up-to-date leaderboard based on 30+ recent papers
Unified access to 20+ state-of-the-art robust models via Model Zoo
# !pip install git+https://github.com/RobustBench/robustbench
from robustbench.utils import load_model

# Load a model from the model zoo
model = load_model(model_name='Carmon2019Unlabeled', dataset='cifar10', threat_model='Linf')

# Evaluate the Linf robustness of the model using AutoAttack
from robustbench.eval import benchmark
clean_acc, robust_acc = benchmark(model, dataset='cifar10', threat_model='Linf')
Leaderboard: CIFAR-10, \( \ell_\infty = 8/255 \), Untargeted, AutoAttack
Leaderboard: CIFAR-10, \( \ell_2 = 0.5 \), Untargeted, AutoAttack
Leaderboard: CIFAR-10, Common Corruptions, CIFAR-10-C
Leaderboard: CIFAR-100, \( \ell_\infty = 8/255 \), Untargeted, AutoAttack
Leaderboard: CIFAR-100, Common Corruptions, CIFAR-100-C
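A minimal sketch of how a single number for robustness to common corruptions could be aggregated, assuming the score is a plain average of accuracy over every corruption type and severity level. The function name, the input format, and the accuracies are hypothetical and only serve as an illustration; they are not taken from the RobustBench library or from any leaderboard entry.

```python
from statistics import mean


def mean_corruption_accuracy(acc_by_corruption):
    """Average accuracy over every (corruption, severity) pair.

    acc_by_corruption maps a corruption name to a list of accuracies,
    one per severity level (hypothetical input format).
    """
    all_accs = [acc for accs in acc_by_corruption.values() for acc in accs]
    if not all_accs:
        raise ValueError("no accuracies given")
    return mean(all_accs)


# Made-up numbers for illustration only, not real benchmark results.
example = {
    "gaussian_noise": [0.90, 0.80, 0.70],
    "fog": [0.95, 0.85, 0.75],
}
overall = mean_corruption_accuracy(example)
print(f"mean accuracy under corruptions: {overall:.3f}")
```

Averaging over all pairs weights every corruption equally as long as each one has the same number of severity levels.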
➤ Wait, how does this leaderboard differ from the AutoAttack leaderboard? 🤔
The AutoAttack leaderboard is maintained by Francesco Croce simultaneously with the RobustBench L2 / Linf leaderboards, and all changes to either of them will be synchronized (provided that the 3 restrictions on the models are met for the RobustBench leaderboard). One can see the current L2 / Linf RobustBench leaderboard as a continuously updated fork of the AutoAttack leaderboard, extended by adaptive evaluations, the Model Zoo, and clear restrictions on the models we accept. In the future, we will extend RobustBench with other threat models and potentially with a different standardized attack if it is shown to perform better than AutoAttack.
➤ Wait, how is it different from robust-ml.org? 🤔
robust-ml.org focuses on adaptive evaluations, whereas we provide a standardized benchmark. Adaptive evaluations are great (e.g., see Tramer et al., 2020), but they are very time-consuming and cannot be standardized. Instead, we argue that one can estimate robustness accurately without adaptive attacks, provided that some restrictions are placed on the models considered. See our paper for more details.
➤ How is it related to libraries like
These libraries provide implementations of different attacks. Besides the standardized benchmark, RobustBench additionally provides a repository of the most robust models, so you can start using them in one line of code (see the tutorial).
➤ Why is Lp-robustness still interesting in 2020? 🤔
There are numerous interesting applications of Lp-robustness that span transfer learning (Salman et al. (2020), Utrera et al. (2020)), interpretability (Tsipras et al. (2018), Kaur et al. (2019), Engstrom et al. (2019)), security (Tramèr et al. (2018), Saadatpanah et al. (2019)), generalization (Xie et al. (2019), Zhu et al. (2019), Bochkovskiy et al. (2020)), robustness to unseen perturbations (Xie et al. (2019), Kang et al. (2019)), and stabilization of GAN training (Zhong et al. (2020)).
➤ Does this benchmark only focus on Lp-robustness? 🤔
Lp-robustness is the most well-studied area, so we focus on it first. However, in the future, we plan to extend the benchmark to other perturbation sets beyond Lp-balls.
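To make the notion of an Lp-ball concrete, here is a minimal sketch of a check that an adversarial example respects a threat model such as Linf with radius 8/255 or L2 with radius 0.5. It assumes that images are flattened lists of pixel values scaled to [0, 1]; the helper names `lp_norm` and `in_threat_model` are hypothetical and are not part of the RobustBench library.

```python
import math


def lp_norm(delta, p):
    """Norm of a flat list of per-pixel perturbations; p is 2 or math.inf."""
    if p == math.inf:
        return max(abs(d) for d in delta)
    if p == 2:
        return math.sqrt(sum(d * d for d in delta))
    raise ValueError("only p=2 and p=inf are sketched here")


def in_threat_model(x, x_adv, p, eps, tol=1e-9):
    """True if x_adv lies in the Lp-ball of radius eps around x and in [0, 1]."""
    delta = [a - b for a, b in zip(x_adv, x)]
    in_ball = lp_norm(delta, p) <= eps + tol  # tol absorbs float rounding
    in_range = all(0.0 <= a <= 1.0 for a in x_adv)
    return in_ball and in_range


# Toy 4-pixel "image" (assumed to be scaled to [0, 1]).
x = [0.5, 0.5, 0.5, 0.5]
x_linf = [0.5 + 8 / 255, 0.5 - 8 / 255, 0.5, 0.5]  # on the Linf boundary
x_l2 = [0.8, 0.8, 0.5, 0.5]  # L2 norm of the change is about 0.42

print(in_threat_model(x, x_linf, math.inf, 8 / 255))
print(in_threat_model(x, x_l2, 2, 0.5))
print(in_threat_model(x, x_l2, math.inf, 8 / 255))
```

The same perturbation can be valid under one threat model and invalid under another, which is why each leaderboard fixes both the norm and the radius.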
➤ What about verified adversarial robustness? 🤔
We specifically focus on defenses that improve empirical robustness, given the lack of clarity regarding which approaches really improve robustness and which only make some particular attacks unsuccessful. For methods targeting verified robustness, we encourage readers to check out Salman et al. (2019) and Li et al. (2020).
➤ What if I have a better attack than the one used in this benchmark? 🤔
We will be happy to add a better attack or any adaptive evaluation that would complement our default standardized attacks.
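One way an additional attack can complement a standardized ensemble is through per-example worst-case aggregation: a test point counts as robust only if the model stays correct under every attack. The sketch below illustrates that idea under this assumption; the function name, the attack names, and the outcomes are hypothetical and do not reproduce the actual evaluation code.

```python
def robust_accuracy(survived_by_attack):
    """Fraction of test points that withstand every attack.

    survived_by_attack maps an attack name to a list of booleans, one per
    test point: True if the model is still correct after that attack.
    """
    per_attack = list(survived_by_attack.values())
    if not per_attack or not per_attack[0]:
        raise ValueError("need at least one attack and one test point")
    n = len(per_attack[0])
    if any(len(flags) != n for flags in per_attack):
        raise ValueError("every attack must cover the same test points")
    # A point is robust only if it survives all attacks (worst case).
    robust = [all(flags) for flags in zip(*per_attack)]
    return sum(robust) / n


# Made-up outcomes on four test points, for illustration only.
standard = {
    "attack_a": [True, True, False, True],
    "attack_b": [True, False, False, True],
}
with_new = dict(standard, new_attack=[False, True, False, True])

print(robust_accuracy(standard))
print(robust_accuracy(with_new))
```

Under this aggregation, adding an attack can only keep the reported robust accuracy the same or lower it, so a stronger attack tightens the estimate.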