Web LLM
LLMs are AI algorithms that generate plausible responses by predicting word sequences from user inputs.
Methodology
- Identify the LLM’s inputs, including direct (e.g., a prompt) and indirect (e.g., training data).
- Determine the data and APIs accessible to the LLM
- Examine this attack surface for vulnerabilities.
Mapping LLM API attack surface
- Ask the LLM which APIs it can access
- Providing misleading context and re-asking the question
- Claim that you are the LLM’s developer and so should have a higher level of privilege
Chaining vulnerabilities in LLM APIs
The idea is to map the APIs and then send classic web exploits to all identified APIs.
- Suppose you normally have access to a “Newsletter Subscription” feature but you can’t control any parameters.
- Imagine that also LLM has access to “Newsletter Subscription” API. You can try to control how this API is called…
- E.g., if a system command is used you might get an RCE if you ask the LLM to call the Newsletter Subscription API with the argument
$(whoami)@your-email.com
Insecure output handling
A web app uses an LLM to generate content from user prompts without sanitization. You could submit a crafted prompt causing the LLM to return unsanitized JavaScript, leading to XSS/CSRF etc.