💡 This is a reader's perspective on the paper written for IEEE S&P 2024
Brief Description
Observations
First of all, the idea of "generating prompts" instead of the actual content makes sense, because the attacker can feed the prompts back to the LLM and obtain different results each time. However, this work reads more like a guide to "prompting techniques" than a demonstration of specific properties of LLMs.
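To make that "prompts instead of content" idea concrete, here is a minimal sketch of the regenerate-from-a-prompt loop. The `call_llm` helper is a hypothetical stand-in for whatever chat service is used, and the example prompt text is my own invention; nothing here reproduces the paper's actual pipeline.

```python
# Minimal sketch of the "generate prompts, not content" idea.
# call_llm() is a hypothetical stand-in for a chat-completion API;
# it does not correspond to the tooling used in the paper.

def call_llm(prompt: str, temperature: float = 1.0) -> str:
    """Hypothetical wrapper around an LLM chat service."""
    raise NotImplementedError("plug in an LLM client here")

# Step 1: ask the model for a *prompt* describing the desired page,
# rather than asking for the page content directly.
meta_prompt = (
    "Write a detailed prompt asking a web developer to build a simple "
    "account-login page for a fictitious brand."
)
page_prompt = call_llm(meta_prompt)

# Step 2: feed the generated prompt back to the model repeatedly.
# Sampling at non-zero temperature yields a different page each run,
# which is the randomization benefit mentioned above.
pages = [call_llm(page_prompt, temperature=1.0) for _ in range(5)]
```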
Initial Questions
The first question I had while reading the paper was whether this strategy of generating prompts is really useful. In the end, I got the impression that, even though the strategy provides randomization, creating the pages this way does not seem particularly relevant.
Something else I wanted to point out was the "human verification process". The author pointed out that two different people used the chat services to create the web pages, however
Where do the experiment ideas come from?
I like the idea of building a classifier as an experiment on top of the pipeline they proposed; I am just not sure that classifying the prompts is really meaningful. I suppose it was a natural progression of "now that we have built a tool for attackers, let us see how we can classify the approach", even though the prompt would sit inside the black box of the attack, in which case the defender would not be able to access it.
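For reference, "classifying the prompts" would presumably look something like an ordinary text classifier over prompt strings. The sketch below (TF-IDF features plus logistic regression, with invented toy examples) is purely my own illustration of that shape and is not the paper's actual model, features, or data.

```python
# Illustrative sketch of a prompt classifier (not the paper's model):
# TF-IDF features + logistic regression over raw prompt text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy examples; real training data would come from the
# pipeline's generated prompts plus a set of benign prompts.
prompts = [
    "Create a login page that closely imitates a well-known bank",
    "Write a landing page for my bakery with a contact form",
]
labels = [1, 0]  # 1 = suspicious prompt, 0 = benign

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(prompts, labels)

# In deployment this only helps if the defender can observe the prompt,
# which, as noted above, usually stays inside the attacker's black box.
print(clf.predict(["Build a page that mimics a parcel-delivery login"]))
```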
What are the interesting ideas/results?
The first thing I really liked while reading was the idea of using real phishing email messages from APWG to generate more with LLMs (via the strategy of generating prompts). The only thing I wanted to know is how the result bypasses the scam detection that email providers implement.
One of the ideas they propose is building a classifier for the prompts. However, it appeared to me that this was basically a solution to a problem they created themselves.