小本本系列:prompt engineering课堂笔记

由 instructions tune 说起

工作中出于 ROI 的考虑,碰到一个问题如何在不做模型训练的前提下让通用预模型(如 qwen-max)具备特定领域的推理能力(例如安全领域的代码分析能力),于是我接触到了指令调优(instructions tune)的理念。

指令调优是一种在指令提示和相应输出的标记数据集上微调大型语言模型( LLMs )的技术。它不仅提高了特定任务上的模型性能,而且还提高了遵循一般指令的模型性能,从而有助于调整预训练模型以适应实际使用。

我发现在做指令调优的时候,除了优质的标记数据以外,prompt 的质量也是非常重要,这让我觉得需要系统的学习一下 prompt engineering,我选择吴恩达的 DeepLearningAI 课程,以下就是我的学习笔记。

编写 prompt 两个原则

原则一

write specific and clear instructions.

策略一

use delimiters:

  • triple quotes: """
  • triple backticks: ```
  • triple dashes: ---
  • angle bracket: < >
  • XML tags: <tag></tag>

策略二

use structured output:

  • JSON
  • HTML

策略三

  • Check wether the conditions are satisfied.

Check assumptions required to do the task.

prompt = f"""
You will be provided with text delimited by triple quotes. 
If it contains a sequence of instructions, \ 
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
…
Step N - …

If the text does not contain a sequence of instructions, \ 
then simply write \"No steps provided.\"

\"\"\"{text_2}\"\"\"
"""

策略四

  • "few-shot" prompting

Give successful examples of completing tasks, Then ask model to perform the task.

原则二

give the model time to think.

策略一

  • Specify the steps required to complete a task

策略二

  • Instruct the model to work out its own solution before rushing to a conclusion
not_worked_prompt = f"""
Determine if the student's solution is correct or not.

Question:
I'm building a solar power installation and I need \
 help working out the financials. 
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \ 
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations 
as a function of the number of square feet.

Student's Solution:
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
"""

worked_prompt = f"""
Your task is to determine if the student's solution \
is correct or not.
To solve the problem do the following:
- First, work out your own solution to the problem including the final total. 
- Then compare your solution to the student's solution \ 
and evaluate if the student's solution is correct or not. 
Don't decide if the student's solution is correct until 
you have done the problem yourself.

Use the following format:
Question:
'''
question here
'''
Student's solution:
'''
student's solution here
'''
Actual solution:
'''
steps to work out the solution and your solution here
'''
Is the student's solution the same as actual solution \
just calculated:
'''
yes or no
'''
Student grade:
'''
correct or incorrect
'''

Question:
'''
I'm building a solar power installation and I need help \
working out the financials. 
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations \
as a function of the number of square feet.
'''
Student's solution:
'''
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
'''
Actual solution:
"""

如何开发 prompt

遵守 prompt 编写原则

开发 prompt 的时候首先需要遵守上面提到的原则,采用对应的策略进行编写。

迭代开发 prompt

prompt 可以做的事情

通过 prompt 可以让大模型帮助做的事情包括:

  • 文本总结
  • 逻辑推断
  • 文本转换(翻译,语气变化,格式转换,语法、拼写检查)
  • 文本扩写

summary

prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
Shipping deparmtment. 

Summarize the review below, delimited by triple 
backticks, in at most 30 words, and focusing on any aspects \
that mention shipping and delivery of the product. 

Review: ```{prod_review}```
"""

inferring

prompt = f"""
Identify the following items from the review text: 
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple backticks. \
Format your response as a JSON object with \
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: '''{lamp_review}'''
"""

transforming

1. translate

prompt = f"""
Translate the following English text to Spanish: \ 
```Hi, I would like to order a blender```
"""

2. tone transformation

prompt = f"""
Translate the following from slang to a business letter: 
'Dude, This is Joe, check out this spec on this standing lamp.'
"""

3. format conversion

data_json = { "resturant employees" :[ 
    {"name":"Shyam", "email":"shyamjaiswal@gmail.com"},
    {"name":"Bob", "email":"bob32@gmail.com"},
    {"name":"Jai", "email":"jai87@gmail.com"}
]}

prompt = f"""
Translate the following python dictionary from JSON to an HTML \
table with column headers and title: {data_json}
"""

4. Spellcheck/Grammar check

prompt = f"""Proofread and correct the following text
and rewrite the corrected version. If you don't find
and errors, just say "No errors found". Don't use 
any punctuation around the text:
```{t}```
"""

expanding

prompt = f"""
You are a customer service AI assistant.
Your task is to send an email reply to a valued customer.
Given the customer email delimited by ```, \
Generate a reply to thank the customer for their review.
If the sentiment is positive or neutral, thank them for \
their review.
If the sentiment is negative, apologize and suggest that \
they can reach out to customer service. 
Make sure to use specific details from the review.
Write in a concise and professional tone.
Sign the email as `AI customer agent`.
Customer review: ```{review}```
Review sentiment: {sentiment}
"""

chatbot

role

  • system
  • user
  • assistant

memory

def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=0):
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, # this is the degree of randomness of the model's output
    )
#     print(str(response.choices[0].message))
    return response.choices[0].message["content"]

messages =  [  
{'role':'system', 'content':'You are friendly chatbot.'},    
{'role':'user', 'content':'Yes,  can you remind me, What is my name?'}  ]
response = get_completion_from_messages(messages, temperature=1)
print(response)

messages =  [  
{'role':'system', 'content':'You are friendly chatbot.'},
{'role':'user', 'content':'Hi, my name is Isa'},
{'role':'assistant', 'content': "Hi Isa! It's nice to meet you. \
Is there anything I can help you with today?"},
{'role':'user', 'content':'Yes, you can remind me, What is my name?'}  ]
response = get_completion_from_messages(messages, temperature=1)
print(response)

课程学习总结

  • 写 prompt 的两大原则:
    1. 指令(instructions)需要清晰和具体,几个策略
      1. 使用分隔符来区分指令中不同的内容
      2. 使用结构化的输出,指定 JSON、HTML 等
      3. 让模型校验指令中的条件是否满足
      4. 指令中给出一些样例/样本
    2. 给模型一些时间「思考」
      1. 指令中给出完成任务的步骤
      2. 让模型自己推理给出解决方法然后再得出结论
  • prompt 开发
    • 原则:遵守上面的两大原则和编写策略
    • 过程:开发prompt --> 分析结果(没有给出期望的结果) --> 增加指令(让模型自己推理或者给出步骤) --> 通过批量的样例对 prompt 优化
  • prompt 常见的几种能力:
    • 总结概括
    • 分析推断
    • 解析转换(翻译、语气、格式、语法等)
    • 补充扩写
    • 对话聊天(带记忆能力)

References

  1. What Is Instruction Tuning? | IBM