Search

Your search for "Tramèr, Florian" returned 108 results.

Search Constraints

You searched for: Author "Tramèr, Florian"

Search Results

1. Extracting Training Data from Document-Based VQA Models

2. Adversarial Search Engine Optimization for Large Language Models

3. Blind Baselines Beat Membership Inference Attacks for Foundation Models

4. AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents

5. Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI

6. Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition

7. Evaluations of Machine Learning Privacy Defenses are Misleading

8. Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs

9. Foundational Challenges in Assuring Alignment and Safety of Large Language Models

10. Privacy Backdoors: Stealing Data with Corrupted Pretrained Models

11. JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

12. Stealing Part of a Production Language Model

13. Query-Based Adversarial Prompt Generation

14. Universal Jailbreak Backdoors from Poisoned Human Feedback

15. Privacy Side Channels in Machine Learning Systems

16. Backdoor Attacks for In-Context Learning with Language Models

17. Are aligned neural networks adversarially aligned?

18. Evaluating Superhuman Models with Consistency Checks

19. Evading Black-box Classifiers Without Breaking Eggs

20. Randomness in ML Defenses Helps Persistent Attackers and Hinders Evaluators

21. Poisoning Web-Scale Training Datasets is Practical

22. Tight Auditing of Differentially Private Machine Learning

23. Extracting Training Data from Diffusion Models

24. Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining

25. Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy

26. Preprocessors Matter! Realistic Decision-Based Attacks on Machine Learning Systems

27. Red-Teaming the Stable Diffusion Safety Filter

28. SNAP: Efficient Extraction of Private Properties with Poisoning

29. Measuring Forgetting of Memorized Training Examples

30. Increasing Confidence in Adversarial Robustness Evaluations

31. (Certified!!) Adversarial Robustness for Free!

32. The Privacy Onion Effect: Memorization is Relative

33. Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets

34. Debugging Differential Privacy: A Case Study for Privacy Auditing

35. Quantifying Memorization Across Neural Language Models

36. What Does it Mean for a Language Model to Preserve Privacy?

37. Counterfactual Memorization in Neural Language Models

38. Membership Inference Attacks From First Principles

39. Large Language Models Can Be Strong Differentially Private Learners

40. On the Opportunities and Risks of Foundation Models

41. NeuraCrypt is not private

42. Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them

43. Data Poisoning Won't Save You From Facial Recognition

44. Antipodes of Label Differential Privacy: PATE and ALIBI

45. Extracting Training Data from Large Language Models

46. Differentially Private Learning Needs Better Features (or Much More Data)

47. Is Private Learning Possible with Instance Encoding?

48. Label-Only Membership Inference Attacks

49. On Adaptive Attacks to Adversarial Example Defenses

50. Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations
