Description
Training data can be deliberately manipulated ("poisoned") to create hidden vulnerabilities ("backdoors") that an attacker can later exploit. Since LLMs are trained on data from untrusted sources like the internet, they are susceptible to such attacks, but the extent of this vulnerability and effective defenses are not well understood.