Instructors: Ryan Feng, Haizhong Zheng
GSI: Elisa Tsai
Contact information: (rtfeng, hzzheng, eltsai)

Class times: MW 3:00 - 4:30 PM, DOW 2150
Office Hours:
Th 2:00 - 3:00 PM at BBB 2717 (Ryan)
W 11:00 AM - 12:00 PM at BBB 2717 (Haizhong)
Th 10:30 - 11:30 AM at BBB 1637 (Elisa), starts 1/11
Syllabus: [Link]


Schedule subject to change.
Date | Topic | Papers | Things Due
Week 1
Wed Jan 4
Introduction / Adversarial Machine Learning
Background reading for more info:
Adversarial Classification (Dalvi et al.)
Evasion Attacks against Machine Learning at Test Time (Biggio et al.)
Intriguing Properties of Neural Networks (Szegedy et al.)
Explaining and Harnessing Adversarial Examples (Goodfellow et al.)
Week 2
Mon Jan 9
Adversarial Machine Learning
Background reading for more info:
Intriguing Properties of Neural Networks (Szegedy et al.)
Explaining and Harnessing Adversarial Examples (Goodfellow et al.)
Towards Deep Learning Models Resistant to Adversarial Attacks (Madry et al.)
Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks (Xu et al.)
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples (Athalye et al.)
PyTorch tutorial
Week 2
Wed Jan 11
Model Stealing / Poisoning
Background reading for more info:
Stealing Machine Learning Models via Prediction APIs (Tramer et al.)
Knockoff Nets: Stealing Functionality of Black-Box Models (Orekondy et al.)
Data-Free Model Extraction (Truong et al.)
Towards Data-Free Model Stealing in a Hard Label Setting (Sanyal et al.)
BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain (Gu et al.)
Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning (Chen et al.)
Input-Aware Dynamic Backdoor Attack (Nguyen and Tran)
Anti-Backdoor Learning: Training Clean Models on Poisoned Data (Li et al.)
BackdoorBench: A Comprehensive Benchmark of Backdoor Learning (Wu et al.)
Week 3
Mon Jan 16
No class - MLK Day
Week 3
Wed Jan 18
Poisoning (cont.) / Privacy
Background reading for more info:
Membership Inference Attacks against Machine Learning Models (Shokri et al.)
Deep Learning with Differential Privacy (Abadi et al.)
Extracting Training Data from Large Language Models (Carlini et al.)
Paper Preferences Form due on Wed Jan 18

Homework 1 due on Fri Jan 20
Week 4
Mon Jan 23
Project Discussion
Week 4
Wed Jan 25
Deepfake Lecture / Early AML Papers
Background reading for the deepfake lecture:
Progressive Growing of GANs for Improved Quality, Stability, and Variation (Karras et al.)
A Style-Based Generator Architecture for Generative Adversarial Networks (Karras et al.)
Leveraging Frequency Analysis for Deep Fake Image Recognition (Frank et al.)
Evading Deepfake-Image Detectors with White- and Black-Box Attacks (Carlini and Farid)

Paper Presentation:
(2) Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks (Papernot et al.)
Week 4 Paper Review
Week 5
Mon Jan 30
Adversarial Attacks
(3) Towards Evaluating the Robustness of Neural Networks (Carlini and Wagner)
(4) Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples (Athalye et al.)
Week 5 Paper Review

Homework 2 due on Wed Feb 1
Week 5
Wed Feb 1
Adversarial Attacks / Adversarial Training I
(5) Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks (Croce and Hein)
(6) Theoretically Principled Trade-off between Robustness and Accuracy (Zhang et al.)
Week 6
Mon Feb 6
Efficient Adversarial Training
(7) Fast is better than free: Revisiting adversarial training (Wong et al.)
(8) Efficient Adversarial Training with Transferable Adversarial Examples (Zheng et al.)
Week 6 Paper Review

Week 6
Wed Feb 8
Black-box Attacks
(9) HopSkipJumpAttack: A Query-Efficient Decision-Based Attack (Chen et al.)
(10) Black-box Adversarial Attacks with Limited Queries and Information (Ilyas et al.)
Week 7
Mon Feb 13
Physical Attacks
(11) Synthesizing Robust Adversarial Examples (Athalye et al.)
(12) GRAPHITE: Generating Automatic Physical Examples for Machine-Learning Attacks on Computer Vision Systems (Feng et al.)
Week 7 Paper Review
Week 7
Wed Feb 15
Certified Defenses
(13) Certified Adversarial Robustness via Randomized Smoothing (Cohen et al.)
(14) (Certified!!) Adversarial Robustness for Free! (Carlini et al.)
Week 8
Mon Feb 20
Patch Attacks / Defenses
(15) Adversarial Patch (Brown et al.)
(16) PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking (Xiang et al.)
Week 8 Paper Review
Week 8
Wed Feb 22
Automatic Speech Recognition
(17) Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition (Qin et al.)
(18) Towards More Robust Keyword Spotting for Voice Assistants (Ahmed et al.)

Spring Break

Spring Break
Week 9
Mon Mar 6
Theory
(19) Robustness May Be at Odds with Accuracy (Tsipras et al.)
(20) Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them (Tramer)
Week 9 Paper Review
Week 9
Wed Mar 8
Theory / Diffusion
(21) Adversarial Examples Are Not Bugs, They Are Features (Ilyas et al.)
(22) Diffusion Models for Adversarial Purification (Nie et al.)
Week 10
Mon Mar 13
Model Stealing Attacks
(23) Entangled Watermarks as a Defense against Model Extraction (Jia et al.)
(24) Dataset Inference: Ownership Resolution in Machine Learning (Maini et al.)
Week 10 Paper Review
Week 10
Wed Mar 15
Poisoning I
(25) Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks (Shafahi et al.)
(26) Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks (Schwarzschild et al.)
Week 11
Mon Mar 20
Poisoning II
(27) Spectral Signatures in Backdoor Attacks (Tran et al.)
(28) Adversarial Neuron Pruning Purifies Backdoored Deep Models (Wu and Wang)
Week 11 Paper Review
Midterm report
Week 11
Wed Mar 22
Fairness
(29) Fairness Without Demographics in Repeated Loss Minimization (Hashimoto et al.)
(30) Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification (Buolamwini and Gebru)
Week 12
Mon Mar 27
Privacy I
(31) Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning (Hitaj et al.)
(32) Label-Only Membership Inference Attacks (Choquette-Choo et al.)
Week 12 Paper Review
Week 12
Wed Mar 29
Privacy II
(33) Extracting Training Data from Large Language Models (Carlini et al.)
(34) Scalable Private Learning with PATE (Papernot et al.)
Week 13
Mon Apr 3
Unlearning / Research vs. Industry
(35) Machine Unlearning (Bourtoule et al.)
(36) "Real Attackers Don't Compute Gradients": Bridging the Gap Between Adversarial ML Research and Practice (Apruzzese et al.)
Week 13 Paper Review
Week 13
Wed Apr 5
LLM Security
(37) A Watermark for Large Language Models (Kirchenbauer et al.)
(38) Can AI-Generated Text be Reliably Detected? (Sadasivan et al.)
Week 14
Mon Apr 10
Guest Lecture: Neal Mangaokar
Week 14
Wed Apr 12
Final Project Presentations
Final Project Presentation
Week 15
Mon Apr 17
Final Project Presentations
Final Project Report Due TBD


  • The instructors thank the National Science Foundation for supporting and inspiring the preparation of this course material. This course material is based upon work supported by the National Science Foundation under Grant No. 2039445. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.