4 results
Research · #llm · 📝 Blog · Analyzed: Dec 29, 2025 18:32

Nicholas Carlini on AI Security, LLM Capabilities, and Model Stealing

Published: Jan 25, 2025 21:22
1 min read
ML Street Talk Pod

Analysis

This article summarizes a podcast interview with Nicholas Carlini, a research scientist at Google DeepMind, focusing on AI security and LLMs. The discussion covers model-stealing research, emergent capabilities of LLMs (specifically in chess), and the security vulnerabilities of LLM-generated code, and also touches on model training, evaluation, and practical applications of LLMs. Sponsor messages and a table of contents provide additional context and resources for the reader.
Reference

The interview likely discusses the security pitfalls of LLM-generated code.
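
As a concrete illustration of one such pitfall (a hypothetical example, not code from the interview), consider a database lookup written the way code assistants often emit it, with user input interpolated straight into the SQL string, next to the parameterized form that avoids injection:

```python
import sqlite3

# A pattern frequently flagged in studies of LLM-generated code:
# interpolating user input directly into SQL enables injection.
def find_user_insecure(conn: sqlite3.Connection, username: str):
    query = f"SELECT id, email FROM users WHERE name = '{username}'"  # vulnerable
    return conn.execute(query).fetchall()

# Safer equivalent: let the database driver bind the parameter.
def find_user_parameterized(conn: sqlite3.Connection, username: str):
    return conn.execute(
        "SELECT id, email FROM users WHERE name = ?", (username,)
    ).fetchall()
```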

Research · #llm · 📝 Blog · Analyzed: Dec 29, 2025 06:09

Stealing Part of a Production Language Model with Nicholas Carlini - #702

Published: Sep 23, 2024 19:21
1 min read
Practical AI

Analysis

This article summarizes a podcast episode of Practical AI featuring Nicholas Carlini, a research scientist at Google DeepMind. The episode focuses on adversarial machine learning and model security, specifically Carlini's 2024 ICML best paper, which details the successful theft of the last layer of production language models like ChatGPT and PaLM-2. The discussion covers the current state of AI security research, the implications of model stealing, ethical concerns, attack methodologies, the significance of the embedding layer, remediation strategies by OpenAI and Google, and future directions in AI security. The episode also touches upon Carlini's other ICML 2024 best paper regarding differential privacy in pre-trained models.
Reference

The episode discusses the ability to successfully steal the last layer of production language models including ChatGPT and PaLM-2.
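
As a rough sketch of the idea behind the attack (a toy simulation under simplifying assumptions, not the paper's procedure against a real API; the sizes and the query_logits stand-in below are invented): because each logit vector a model returns is a linear image of a low-dimensional hidden state, stacking full logit vectors from many queries yields a matrix whose numerical rank reveals the hidden dimension, and whose singular vectors recover the final projection layer up to a linear transform.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HIDDEN, N_QUERIES = 1_000, 64, 256   # toy sizes, not a real API

# Stand-in for a production model's final projection: logits = W @ h.
W = rng.normal(size=(VOCAB, HIDDEN))

def query_logits(_prompt_id: int) -> np.ndarray:
    """Pretend API call returning the full logit vector for one prompt."""
    h = rng.normal(size=HIDDEN)             # hidden state unknown to the attacker
    return W @ h

# Stack logit vectors from many queries; every row lies in the column
# space of W, so the matrix has rank at most HIDDEN.
Q = np.stack([query_logits(i) for i in range(N_QUERIES)])

# The number of significant singular values exposes the hidden dimension,
# and the top right-singular vectors recover W up to a linear transform.
singular_values = np.linalg.svd(Q, compute_uv=False)
estimated_hidden = int((singular_values > singular_values[0] * 1e-6).sum())
print("estimated hidden dimension:", estimated_hidden)   # prints 64
```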

Research · #llm · 📝 Blog · Analyzed: Dec 29, 2025 07:37

Watermarking Large Language Models to Fight Plagiarism with Tom Goldstein - #621

Published: Mar 20, 2023 20:04
1 min read
Practical AI

Analysis

This article from Practical AI discusses Tom Goldstein's research on watermarking Large Language Models (LLMs) to combat plagiarism. The conversation covers the motivations behind watermarking, the technical aspects of how it works, and potential deployment strategies. It also touches upon the political and economic factors influencing the adoption of watermarking, as well as future research directions. Furthermore, the article draws parallels between Goldstein's work on data leakage in stable diffusion models and Nicholas Carlini's research on LLM data extraction, highlighting the broader implications of data security in AI.
Reference

We explore the motivations behind adding these watermarks, how they work, and different ways a watermark could be deployed, as well as political and economic incentive structures around the adoption of watermarking and future directions for that line of work.
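
For readers new to the scheme, here is a minimal sketch of a green-list watermark of the kind discussed in this line of work (simplified; the vocabulary size, green fraction GAMMA, bias DELTA, and seeding rule below are illustrative assumptions): the previous token seeds a pseudorandom split of the vocabulary, sampling is nudged toward the "green" half, and detection computes a z-score on the number of green tokens.

```python
import numpy as np

VOCAB, GAMMA, DELTA = 5_000, 0.5, 2.0   # toy vocabulary size, green fraction, bias strength

def green_list(prev_token: int) -> np.ndarray:
    """Pseudorandom 'green' portion of the vocabulary, seeded by the previous token."""
    rng = np.random.default_rng(prev_token)
    return rng.permutation(VOCAB)[: int(GAMMA * VOCAB)]

def watermarked_sample(logits: np.ndarray, prev_token: int, rng: np.random.Generator) -> int:
    """Sample the next token after nudging green-list logits up by DELTA."""
    biased = logits.copy()
    biased[green_list(prev_token)] += DELTA
    probs = np.exp(biased - biased.max())
    probs /= probs.sum()
    return int(rng.choice(VOCAB, p=probs))

def detection_z_score(tokens: list[int]) -> float:
    """Count tokens that fall in their predecessor's green list; a large
    z-score means the text is very unlikely to be unwatermarked."""
    hits = sum(t in set(green_list(prev)) for prev, t in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    return (hits - GAMMA * n) / np.sqrt(n * GAMMA * (1 - GAMMA))
```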

Research · #llm · 📝 Blog · Analyzed: Dec 29, 2025 07:37

Privacy and Security for Stable Diffusion and LLMs with Nicholas Carlini - #618

Published: Feb 27, 2023 18:26
1 min read
Practical AI

Analysis

This article from Practical AI discusses privacy and security concerns in the context of Stable Diffusion and Large Language Models (LLMs). It features an interview with Nicholas Carlini, a research scientist at Google Brain, focusing on adversarial machine learning, privacy issues in black box and accessible models, privacy attacks in vision models, and data poisoning. The conversation explores the challenges of data memorization and the potential impact of malicious actors manipulating training data. The article highlights the importance of understanding and mitigating these risks as AI models become more prevalent.
Reference

In our conversation, we discuss the current state of adversarial machine learning research, the dynamic of dealing with privacy issues in black box vs accessible models, what privacy attacks in vision models like diffusion models look like, and the scale of “memorization” within these models.
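
As a small illustration of how memorization can be screened for in practice (a hedged sketch along the lines of training-data extraction heuristics, not code from the episode; the per-token log-probabilities are assumed to come from whatever model is being probed): candidate generations are scored by comparing the model's log-perplexity against the text's zlib-compressed size, on the intuition that memorized strings are far easier for the model to predict than for a generic compressor.

```python
import zlib

def log_perplexity(token_logprobs: list[float]) -> float:
    """Negative mean log-probability the model assigned to its own output."""
    return -sum(token_logprobs) / len(token_logprobs)

def memorization_score(text: str, token_logprobs: list[float]) -> float:
    """Ratio of the zlib-compressed size of the text to the model's
    log-perplexity: strings the model predicts far more easily than a
    generic compressor would are candidates for memorized training data."""
    zlib_entropy = len(zlib.compress(text.encode("utf-8")))
    return zlib_entropy / max(log_perplexity(token_logprobs), 1e-9)
```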