To test how much a well-crafted prompt matters, I wrote two nearly identical prompts.
The only difference: the second omits the final sentence (the one stating the company's goal for this analysis).
Prompt 1:
Hi, GPT! An air fryer manufacturer wants to analyze its products' online reviews over their product performance and functionalities. Between """ and """ are the reviews to be analyzed:
"""
- Review 1: : I use the air fryer a lot. It works very well. I made hamburgers in it last night, the were wonderful! The rack you put the food on is a little hard to clean. It has lots of crevices that food gets in. The basket itself is very easy to clean. I also do not trust the rack not to fall out when dumping out food, like French fries.
- Review 2: Air fryer works fine until the handle breaks off. There's only 3 screws that attach the handle to a plastic piece on the basket. The plastic piece breaks and the screws fall out so there's no way to attach the handle to the basket rendering the air fryer useless. If there were 4 screws or if the screws attached to metal the handle attachment would have been much more secure and less prone to breaking.
- Review 3: Even though I returned it for the bigger one of the same company I liked the fryer. Ferry ez to use But hard to judge. Seafood button isn't fish and it was a guessing game how to cook the product. I way over cooked it and the chicken legs came out crispy, but the other side was raw. But, this is my first time using an air fryer so maybe it's me. Very sleek on the counter buttons light up real nice. I made tater tots and let me tell you they came out perfect and stayed hot. I like it so much that I returned this one and got the same company Bigger one. But I had to buy it from the place itself not from Amazon and I don't lie that at all. Hope this review helps.
"""
For each review, the company expects the following results, please:
- A summary with 30 words max, written in portuguese;
- A list with the product's strengths;
- A list with the product's weaknesses;
- The reviewer's general feeling of the product (e.g. positive, neutral or negative feelings).
Output format:
- Presented in JSON, with the following 5 keys (for each review):
- ID da resenha
- Resumo
- Lista de pontos fortes:
- Lista de pontos fracos:
- Sentimento:
The company's goal is to identify product improvement opportunities and to enhance its marketing strategy.
Prompt 2:
Hi, GPT! An air fryer manufacturer wants to analyze[...]
Output format:
- Presented in JSON, with the following 5 keys (for each review):
- ID da resenha
- Resumo
- Lista de pontos fortes:
- Lista de pontos fracos:
- Sentimento:
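A minimal sketch of how the two variants could be generated from one shared template, so that the final sentence is guaranteed to be the only difference. The names `BASE_PROMPT`, `GOAL_SENTENCE`, and `build_prompt` are hypothetical, and the body of the prompt is left elided rather than reproduced:

```python
# Shared body of the prompt. The reviews, expected results, and output
# format are elided here, as in the text above.
BASE_PROMPT = (
    "Hi, GPT! An air fryer manufacturer wants to analyze its products' "
    "online reviews over their product performance and functionalities.\n"
    "[...reviews, expected results, and output format...]"
)

# The only sentence that differs between the two prompts.
GOAL_SENTENCE = (
    "The company's goal is to identify product improvement opportunities "
    "and to enhance its marketing strategy."
)


def build_prompt(include_goal: bool) -> str:
    """Return the prompt with or without the final goal sentence."""
    if include_goal:
        return f"{BASE_PROMPT}\n{GOAL_SENTENCE}"
    return BASE_PROMPT


prompt_1 = build_prompt(include_goal=True)   # states the company's goal
prompt_2 = build_prompt(include_goal=False)  # omits it
```

Building both prompts from the same template avoids accidental edits elsewhere in the text, which would make the comparison harder to interpret.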
Analysis of the results
Both prompts returned JSON, as requested. To compare the two outputs, I sent the model the following:
Now, I have two analyses with the same source and conditions, but created using slightly different prompts. Could you please compare these both analyses and tell me what's different between them?
Analysis 1: [...]
Analysis 2: [...]
If you need any clarification, ask me before generating any result, please.
Result:
Analysis 1
More:
- literal,
- direct,
- close to the original text,
- descriptive,
- conservative in interpretation.
It tends to:
- repeat information almost exactly as written,
- avoid adding implications,
- stay safer and more objective.
Analysis 2
More:
- interpretative,
- business-oriented,
- polished,
- inferential,
- analytical.
It tends to:
- infer product/design implications,
- rewrite ideas more naturally,
- transform complaints into business insights,
- sound more professional/consulting-oriented.
It also detailed, with examples, how the two results differ in:
- Writing style
- Level of interpretation (the main difference)
- Value for Business Intelligence
- Extraction of strengths and weaknesses
- Tone
- Risk of hallucination / over-inference
Its conclusion was that neither option is better; each is simply better suited to a particular need.
The first result would be better for building an NLP dataset, for example, thanks to its precise data extraction.
The second would be better suited to stakeholder communication, for example, where the goal is to convey actionable insights (information, not data).
My analysis
By including the context of the analysis in the prompt, I expected a more analytical result. Surprisingly, that only happened when I removed it (in the second prompt).
I also asked the model about this. From its answer, I understood that merely mentioning the context, without clear analytical instructions, led it to treat the analysis as important and to produce a "safer" result, since the output would serve as a basis for business decisions.
This reinforces two points:
- Prompt context ≠ prompt instructions
- The "why" behind a request does not always control the "how"
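One way to make this distinction concrete is to keep the context and the instructions as separate pieces when assembling a prompt. The sketch below is only an illustration: `assemble_prompt` is a hypothetical helper, and the wording of the explicit instruction is my assumption. It was not part of the experiment and its effect is untested.

```python
from typing import Sequence


def assemble_prompt(
    task: str,
    context: str = "",
    instructions: Sequence[str] = (),
) -> str:
    """Join the task, optional context (the "why"), and explicit instructions (the "how")."""
    parts = [task]
    if context:
        parts.append(context)
    if instructions:
        parts.append("Instructions:")
        parts.extend(f"- {item}" for item in instructions)
    return "\n".join(parts)


# The task text is elided, as in the prompts above.
TASK = "[...task, reviews, and output format...]"
CONTEXT = (
    "The company's goal is to identify product improvement opportunities "
    "and to enhance its marketing strategy."
)

# Context only: says why the analysis matters, not how to carry it out.
context_only = assemble_prompt(TASK, context=CONTEXT)

# Context plus an explicit instruction (hypothetical wording, untested).
context_and_how = assemble_prompt(
    TASK,
    context=CONTEXT,
    instructions=["For each weakness, suggest one possible product improvement."],
)
```

Keeping the two parts separate makes it easy to see at a glance whether a prompt only explains the purpose or also tells the model what kind of analysis to perform.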