[Andrew Ng, deeplearning.ai] LangChain for LLM Application Development (Part 1)

The following is compiled from the deeplearning.ai course of the same name.
Course link:
DLAI - Learning Platform Beta (deeplearning.ai)

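I. What Is LangChain

1. Introduction to LangChain

LangChain is a framework for developing applications powered by large language models. Its developers believe that the most powerful and differentiated applications will not only call a language model, but will also follow these principles:

Data-aware: connect a language model to other sources of data.

Agentic: allow a language model to interact with its environment.

LangChain supports two languages, Python and JavaScript, and focuses on composition and modularity.

Official documentation: /en/latest/

Chinese documentation: /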

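2. LangChain's modular capabilities

Models and prompts: integrations with a large number of language models and chat models, plus prompt templates, output parsers, and example selectors.

Retrieval: it can retrieve from and call other data sources, including but not limited to text and arrays, and it supports multiple data retrieval tools.

Chains: it supports building chain templates that automatically turn the input into standardized, processed output.

Tools: it can call multiple preset or custom algorithms and utilities.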

II. Models, Prompts and Output Parsers

1. Prompt templates

Typically, we call GPT in the following way:

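    import openai

    # Helper that sends a single user prompt to the chat completion endpoint
    # (this is the pre-1.0 `openai` package interface).
    def get_completion(prompt, model="gpt-3.5-turbo"):
        messages = [{"role": "user", "content": prompt}]
        response = openai.ChatCompletion.create(
            model=model,
            messages=messages,
            temperature=0,
        )
        return response.choices[0].message["content"]

    # Write the prompt. `style` and `customer_email` must already be defined as strings.
    prompt = f"""Translate the text \
    that is delimited by triple backticks
    into a style that is {style}.
    text: ```{customer_email}```
    """

    # Call the helper to generate the result
    response = get_completion(prompt)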

Now let's see how LangChain makes the same call through a model.

    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import ChatPromptTemplate

    # Load the LangChain chat model and set its randomness (temperature) to 0
    chat = ChatOpenAI(temperature=0.0)

    # Design the template text
    template_string = """Translate the text \
    that is delimited by triple backticks \
    into a style that is {style}. \
    text: ```{text}```
    """

    # Build a prompt template from the template text
    prompt_template = ChatPromptTemplate.from_template(template_string)

    # Define the values for the template's variables
    customer_style = """American English \
    in a calm and respectful tone"""
    customer_email = """Arrr, I be fuming that me blender lid \
    flew off and splattered me kitchen walls \
    with smoothie! And to make matters worse, \
    the warranty don't cover the cost of \
    cleaning up me kitchen. I need yer help \
    right now, matey!"""

    # Fill in the template's variables to produce the final prompt messages
    customer_messages = prompt_template.format_messages(
        style=customer_style,
        text=customer_email,
    )

    # Send the prompt messages to the model and get the reply
    customer_response = chat(customer_messages)

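By creating a prompt template that contains variables, you can generate a new prompt for each scenario simply by changing the variable values. The template itself is reused.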
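The reuse described above can be imitated without LangChain. The sketch below is only a plain-Python stand-in built on `str.format`; the helper name `render_prompt` and the second style and text are made up for illustration and are not part of the course or of the LangChain API.

```python
# Plain-Python stand-in for a prompt template (not the LangChain API).
TEMPLATE = (
    "Translate the text that is delimited by triple backticks "
    "into a style that is {style}. text: ```{text}```"
)


def render_prompt(style: str, text: str) -> str:
    """Fill the template's two variables and return the final prompt."""
    return TEMPLATE.format(style=style, text=text)


# The same template is reused with different variable values.
prompt_a = render_prompt(
    "American English in a calm and respectful tone",
    "Arrr, I be fuming that me blender lid flew off!",
)
prompt_b = render_prompt(
    "a polite tone that speaks in English Pirate",  # hypothetical second style
    "The warranty does not cover cleaning expenses.",
)
print(prompt_a)
print(prompt_b)
```

Only the arguments change between the two calls; the template string is written once.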

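2. Output parsers

An output parser converts the text generated by a large language model into output with a specific structure, such as a dictionary or an array.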

    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import ChatPromptTemplate
    from langchain.output_parsers import ResponseSchema
    from langchain.output_parsers import StructuredOutputParser

    chat = ChatOpenAI(temperature=0.0)

    # Create a set of parsing rules, one schema per output field
    gift_schema = ResponseSchema(
        name="gift",
        description="Was the item purchased as a gift for someone else? "
                    "Answer True if yes, False if not or unknown.",
    )
    delivery_days_schema = ResponseSchema(
        name="delivery_days",
        description="How many days did it take for the product to arrive? "
                    "If this information is not found, output -1.",
    )
    price_value_schema = ResponseSchema(
        name="price_value",
        description="Extract any sentences about the value or price, "
                    "and output them as a comma separated Python list.",
    )
    response_schemas = [gift_schema, delivery_days_schema, price_value_schema]

    # Build the parser from the schemas and get its format instructions
    output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
    format_instructions = output_parser.get_format_instructions()

    # Create a prompt template that includes the format instructions
    review_template_2 = """\
    For the following text, extract the following information:

    gift: Was the item purchased as a gift for someone else? \
    Answer True if yes, False if not or unknown.

    delivery_days: How many days did it take for the product\
    to arrive? If this information is not found, output -1.

    price_value: Extract any sentences about the value or price,\
    and output them as a comma separated Python list.

    text: {text}

    {format_instructions}
    """
    prompt = ChatPromptTemplate.from_template(template=review_template_2)

    # Generate the prompt messages from the template.
    # `customer_review` is the review text to analyze and must be defined beforehand.
    messages = prompt.format_messages(
        text=customer_review,
        format_instructions=format_instructions,
    )

    # Generate the result
    response = chat(messages)

    # Parse the generated text into a dictionary
    output_dict = output_parser.parse(response.content)
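To make the parsing step concrete, here is a minimal standard-library sketch of turning a model reply that contains a json-fenced block into a Python dictionary. It is not the `StructuredOutputParser` implementation; the function name `parse_json_block` and the sample reply are assumptions written by hand for this example.

```python
import json
import re

# Three backticks, built indirectly so the fence marker is easy to read here.
FENCE = "`" * 3
JSON_BLOCK = re.compile(FENCE + r"json\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_block(text: str) -> dict:
    """Decode the first json-fenced block found in a model reply."""
    match = JSON_BLOCK.search(text)
    if match is None:
        raise ValueError("no json-fenced block found in the reply")
    return json.loads(match.group(1))


# Hypothetical model reply, hand-written for this example.
reply = (
    "Here is the result:\n"
    + FENCE + "json\n"
    + '{"gift": true, "delivery_days": 2, "price_value": ["slightly more expensive"]}\n'
    + FENCE
)
output_dict = parse_json_block(reply)
print(type(output_dict).__name__, output_dict["delivery_days"])
```

Once the reply is a dictionary, fields can be read by key instead of by searching the raw text.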

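III. Memory

When a large language model is called through an API, it does not automatically remember earlier questions and answers or other context. LangChain's memory components provide several ways to keep that history and supply it to the model.

Outline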

ConversationBufferMemory
ConversationBufferWindowMemory
ConversationTokenBufferMemory
ConversationSummaryMemory

ConversationBufferMemory (conversation buffer memory)

    from langchain.chat_models import ChatOpenAI
    from langchain.chains import ConversationChain
    from langchain.memory import ConversationBufferMemory

    # Create a chat model, a memory buffer, and a conversation chain that uses both
    llm = ChatOpenAI(temperature=0.0)
    memory = ConversationBufferMemory()
    conversation = ConversationChain(llm=llm, memory=memory, verbose=True)

    # Add turns to the conversation; each question and its answer
    # are saved to the memory buffer automatically
    conversation.predict(input="Hi, my name is Andrew")
    conversation.predict(input="What is 1+1?")
    conversation.predict(input="What is my name?")

    # Show the conversation stored in the memory buffer
    print(memory.buffer)
    memory.load_memory_variables({})

    # Write a question/answer pair into the memory buffer directly
    memory.save_context({"input": "Hi"}, {"output": "What's up"})
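The idea behind a conversation buffer can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not LangChain's implementation: the class name `BufferMemory` and its `load` method are hypothetical, and only the `Human:` / `AI:` line format mirrors the history strings shown in these notes.

```python
class BufferMemory:
    """Keeps every question/answer pair and renders them as one history string."""

    def __init__(self):
        self.turns = []  # list of (human_text, ai_text) tuples, oldest first

    def save_context(self, human: str, ai: str) -> None:
        """Append one complete exchange to the buffer."""
        self.turns.append((human, ai))

    def load(self) -> str:
        """Render the full history; this text would be prepended to the next prompt."""
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


memory = BufferMemory()
memory.save_context("Hi", "What's up")
memory.save_context("Not much, just hanging", "Cool")
print(memory.load())
```

Because nothing is ever dropped, the history grows with every turn, which is the motivation for the bounded variants.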

ConversationBufferWindowMemory (windowed conversation memory)

    from langchain.memory import ConversationBufferWindowMemory

    # Create a memory that keeps only one exchange (k=1)
    memory = ConversationBufferWindowMemory(k=1)

    memory.save_context({"input": "Hi"}, {"output": "What's up"})
    memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})

    # At this point the memory holds only the second exchange:
    # with k=1, only the most recent question/answer pair is remembered.
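A sliding window of the last k exchanges maps naturally onto a bounded deque. The sketch below is a plain-Python illustration, not the LangChain class; `WindowMemory` and `load` are hypothetical names.

```python
from collections import deque


class WindowMemory:
    """Remembers only the k most recent question/answer pairs."""

    def __init__(self, k: int):
        # A deque with maxlen=k silently discards the oldest pair when full.
        self.turns = deque(maxlen=k)

    def save_context(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def load(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


memory = WindowMemory(k=1)
memory.save_context("Hi", "What's up")
memory.save_context("Not much, just hanging", "Cool")
print(memory.load())
```

With k=1 the first exchange is gone after the second one is saved, matching the behavior described in the comments above.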

ConversationTokenBufferMemory (token-limited memory)

    from langchain.memory import ConversationTokenBufferMemory
    from langchain.chat_models import ChatOpenAI

    llm = ChatOpenAI(temperature=0.0)

    # Create a memory limited to 30 tokens. The llm is passed in because
    # tokens are counted with the model's own tokenizer.
    memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)

    memory.save_context({"input": "AI is what?!"}, {"output": "Amazing!"})
    memory.save_context({"input": "Backpropagation is what?"}, {"output": "Beautiful!"})
    memory.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})

    # Only the most recent messages that fit within 30 tokens are kept.
    # Questions and answers are not forced to stay together, so an answer
    # can remain without its question.
    memory.load_memory_variables({})
    # Result: {'history': 'AI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}
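The pruning rule can be sketched as follows. This is a simplified illustration, not LangChain's code: tokens are approximated by whitespace-separated words (an assumption; the real class counts tokens with the model's tokenizer), the limit of 5 is chosen only so the toy word count gives the same shape of result, and `TokenBufferMemory`, `count_tokens`, and `load` are hypothetical names.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


class TokenBufferMemory:
    """Keeps the newest messages whose combined token count fits the limit."""

    def __init__(self, max_token_limit: int):
        self.max_token_limit = max_token_limit
        self.messages = []  # list of (speaker, text), oldest first

    def save_context(self, human: str, ai: str) -> None:
        self.messages.append(("Human", human))
        self.messages.append(("AI", ai))
        # Drop single messages, oldest first, until the buffer fits the limit.
        while sum(count_tokens(t) for _, t in self.messages) > self.max_token_limit:
            self.messages.pop(0)

    def load(self) -> str:
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.messages)


memory = TokenBufferMemory(max_token_limit=5)
memory.save_context("AI is what?!", "Amazing!")
memory.save_context("Backpropagation is what?", "Beautiful!")
memory.save_context("Chatbots are what?", "Charming!")
print(memory.load())
```

Because messages are dropped one at a time rather than in pairs, the oldest surviving entry here is an answer whose question has already been pruned.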

ConversationSummaryMemory (summary memory)

    from langchain.memory import ConversationSummaryBufferMemory

    # A long piece of content
    schedule = "There is a meeting at 8am with your product team. \
    You will need your powerpoint presentation prepared. \
    9am-12pm have time to work on your LangChain \
    project which will go quickly because Langchain is such a powerful tool. \
    At Noon, lunch at the italian resturant with a customer who is driving \
    from over an hour away to meet you to understand the latest in AI. \
    Be sure to bring your laptop to show the latest LLM demo."

    # Create a summarizing memory with a limit of 100 tokens. The llm (the chat
    # model created earlier) is passed in because the model writes the summary.
    memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)

    # Add exchanges
    memory.save_context({"input": "Hello"}, {"output": "What's up"})
    memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
    memory.save_context({"input": "What is on the schedule today?"},
                        {"output": f"{schedule}"})

    # The result is a summary that shortens the remembered content to within 100 tokens:
    memory.load_memory_variables({})
    # {'history': "System: The human and AI engage in small talk before discussing
    # the day's schedule. The AI informs the human of a morning meeting with the
    # product team, time to work on the LangChain project, and a lunch meeting with
    # a customer interested in the latest AI developments."}

    conversation = ConversationChain(llm=llm, memory=memory, verbose=True)
    conversation.predict(input="What would be a good demo to show?")
    # Notably, when this memory is used in a conversation, the most recent AI answer
    # is stored verbatim (not summarized) and the rest of the conversation is
    # summarized. This is probably done to get better answers: the latest AI answer
    # is highly valuable and should not be condensed.
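A rough way to picture a summarizing buffer: recent messages stay verbatim while they fit the token limit, and anything pushed out is folded into a running summary. The sketch below illustrates that idea only. It is not LangChain's implementation, tokens are approximated by words, and the summarizer is a stub that merely counts folded messages where a real one would call the language model; `SummaryBufferMemory`, `stub_summarize`, and `load` are hypothetical names.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def stub_summarize(summary: str, pruned: list) -> str:
    """Stand-in for an LLM call: only records how many messages were folded in."""
    previous = int(summary.split()[0]) if summary else 0
    return f"{previous + len(pruned)} earlier messages summarized"


class SummaryBufferMemory:
    """Keeps recent messages verbatim and folds older ones into a summary."""

    def __init__(self, max_token_limit: int, summarize=stub_summarize):
        self.max_token_limit = max_token_limit
        self.summarize = summarize
        self.summary = ""
        self.messages = []  # list of (speaker, text), oldest first

    def save_context(self, human: str, ai: str) -> None:
        self.messages.append(("Human", human))
        self.messages.append(("AI", ai))
        pruned = []
        # Move the oldest messages out of the buffer until it fits the limit.
        while self.messages and (
            sum(count_tokens(t) for _, t in self.messages) > self.max_token_limit
        ):
            pruned.append(self.messages.pop(0))
        if pruned:
            self.summary = self.summarize(self.summary, pruned)

    def load(self) -> str:
        lines = [f"System: {self.summary}"] if self.summary else []
        lines += [f"{speaker}: {text}" for speaker, text in self.messages]
        return "\n".join(lines)


memory = SummaryBufferMemory(max_token_limit=4)
memory.save_context("Hello", "What's up")
memory.save_context("Not much, just hanging", "Cool")
print(memory.load())
```

Swapping `stub_summarize` for a function that asks a model to condense the pruned messages would give the behavior the notes describe, at the cost of one extra model call whenever something is pruned.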

IV. Chains

Outline

LLMChain
Sequential Chains
  SimpleSequentialChain
  SequentialChain
Router Chain

LLMChain (basic chain)

    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import ChatPromptTemplate
    from langchain.chains import LLMChain

    llm = ChatOpenAI(temperature=0.9)

    # Create a prompt with one variable, product
    prompt = ChatPromptTemplate.from_template(
        "What is the best name to describe \
        a company that makes {product}?"
    )

    # Create a basic chain from the model and the prompt
    chain = LLMChain(llm=llm, prompt=prompt)

    # Assign the prompt variable and get the answer
    product = "Queen Size Sheet Set"
    chain.run(product)

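SimpleSequentialChain (simple sequential chain)

A simple sequential chain passes the output of each chain as the input to the next one. It has exactly one input variable and one output variable.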

    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import ChatPromptTemplate
    from langchain.chains import LLMChain, SimpleSequentialChain

    llm = ChatOpenAI(temperature=0.9)

    # Prompt template 1, with the variable product
    first_prompt = ChatPromptTemplate.from_template(
        "What is the best name to describe \
        a company that makes {product}?"
    )
    # Chain 1
    chain_one = LLMChain(llm=llm, prompt=first_prompt)

    # Prompt template 2, with the variable company_name
    second_prompt = ChatPromptTemplate.from_template(
        "Write a 20 words description for the following \
        company:{company_name}"
    )
    # Chain 2
    chain_two = LLMChain(llm=llm, prompt=second_prompt)

    # Combine chain 1 and chain 2, then run them to get the result
    overall_simple_chain = SimpleSequentialChain(
        chains=[chain_one, chain_two],
        verbose=True,
    )
    product = "Queen Size Sheet Set"
    overall_simple_chain.run(product)
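Stripped of the model calls, a simple sequential chain is function composition: each step receives the previous step's single output. The sketch below shows that shape in plain Python. The two step functions are deterministic stand-ins for the two LLM chains, and all names in it are hypothetical.

```python
from functools import reduce


def run_simple_sequential(steps, first_input):
    """Feed first_input to the first step, then each output to the next step."""
    return reduce(lambda value, step: step(value), steps, first_input)


def name_company(product: str) -> str:
    """Stand-in for chain one (would ask the model for a company name)."""
    return f"{product} Co."


def describe_company(company_name: str) -> str:
    """Stand-in for chain two (would ask the model for a description)."""
    return f"{company_name} makes everyday essentials."


result = run_simple_sequential([name_company, describe_company], "Queen Size Sheet Set")
print(result)
```

The single-input, single-output restriction is what makes this composition possible without naming any intermediate variables.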

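SequentialChain (sequential chain)

A sequential chain contains multiple chains, and the outputs of some chains can serve as the inputs of others. It supports multiple input and output variables.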

    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import ChatPromptTemplate
    from langchain.chains import LLMChain, SequentialChain

    llm = ChatOpenAI(temperature=0.9)

    # Chain 1: input Review, output English_Review
    first_prompt = ChatPromptTemplate.from_template(
        "Translate the following review to english:"
        "\n\n{Review}"
    )
    chain_one = LLMChain(llm=llm, prompt=first_prompt, output_key="English_Review")

    # Chain 2: input English_Review, output summary
    second_prompt = ChatPromptTemplate.from_template(
        "Can you summarize the following review in 1 sentence:"
        "\n\n{English_Review}"
    )
    chain_two = LLMChain(llm=llm, prompt=second_prompt, output_key="summary")

    # Chain 3: input Review, output language
    third_prompt = ChatPromptTemplate.from_template(
        "What language is the following review:\n\n{Review}"
    )
    chain_three = LLMChain(llm=llm, prompt=third_prompt, output_key="language")

    # Chain 4: inputs summary and language, output followup_message
    fourth_prompt = ChatPromptTemplate.from_template(
        "Write a follow up response to the following "
        "summary in the specified language:"
        "\n\nSummary: {summary}\n\nLanguage: {language}"
    )
    chain_four = LLMChain(llm=llm, prompt=fourth_prompt, output_key="followup_message")

    # Build the full chain: input Review,
    # outputs English_Review, summary, and followup_message
    overall_chain = SequentialChain(
        chains=[chain_one, chain_two, chain_three, chain_four],
        input_variables=["Review"],
        output_variables=["English_Review", "summary", "followup_message"],
        verbose=True,
    )

    # `review` is the review text to process and must be defined beforehand
    overall_chain(review)
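The difference from the simple variant is that every step reads and writes named variables in a shared dictionary, so a later step can combine several earlier outputs. The sketch below illustrates that data flow in plain Python under stated assumptions: the four step functions are deterministic stand-ins for the LLM chains, and `run_sequential` and the sample review are hypothetical.

```python
def run_sequential(steps, inputs, output_variables):
    """Run steps in order; each step is (function, input_keys, output_key)."""
    values = dict(inputs)
    for func, input_keys, output_key in steps:
        values[output_key] = func(*(values[key] for key in input_keys))
    # This sketch returns only the requested output variables.
    return {key: values[key] for key in output_variables}


# Deterministic stand-ins for the four LLM chains.
def translate(review):
    return review.upper()


def summarize(english_review):
    return " ".join(english_review.split()[:2])


def detect_language(review):
    return "French"


def follow_up(summary, language):
    return f"[{language}] Thanks for: {summary}"


steps = [
    (translate, ["Review"], "English_Review"),
    (summarize, ["English_Review"], "summary"),
    (detect_language, ["Review"], "language"),
    (follow_up, ["summary", "language"], "followup_message"),
]
result = run_sequential(
    steps,
    {"Review": "tres bon produit merci"},
    ["English_Review", "summary", "followup_message"],
)
print(result)
```

Note that step three reads the original `Review` rather than the previous step's output, which a simple sequential chain cannot express.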

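Router Chain

A router chain works like an if/elif/else branch: based on the input, it selects the matching route (path) and hands the input to that chain. A router chain as a whole generally has one input and one output.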

    # Create four prompt templates
    physics_template = """You are a very smart physics professor. \
    You are great at answering questions about physics in a concise\
    and easy to understand manner. \
    When you don't know the answer to a question you admit\
    that you don't know.

    Here is a question:
    {input}"""

    math_template = """You are a very good mathematician. \
    You are great at answering math questions. \
    You are so good because you are able to break down \
    hard problems into their component parts,
    answer the component parts, and then put them together\
    to answer the broader question.

    Here is a question:
    {input}"""

    history_template = """You are a very good historian. \
    You have an excellent knowledge of and understanding of people,\
    events and contexts from a range of historical periods. \
    You have the ability to think, reflect, debate, discuss and \
    evaluate the past. You have a respect for historical evidence\
    and the ability to make use of it to support your explanations \
    and judgements.

    Here is a question:
    {input}"""

    computerscience_template = """ You are a successful computer scientist.\
    You have a passion for creativity, collaboration,\
    forward-thinking, confidence, strong problem-solving capabilities,\
    understanding of theories and algorithms, and excellent communication \
    skills. You are great at answering coding questions. \
    You are so good because you know how to solve a problem by \
    describing the solution in imperative steps \
    that a machine can easily interpret and you know how to \
    choose a solution that has a good balance between \
    time complexity and space complexity.

    Here is a question:
    {input}"""

    # Key information for each prompt: name, description, and template
    prompt_infos = [
        {"name": "physics",
         "description": "Good for answering questions about physics",
         "prompt_template": physics_template},
        {"name": "math",
         "description": "Good for answering math questions",
         "prompt_template": math_template},
        {"name": "History",
         "description": "Good for answering history questions",
         "prompt_template": history_template},
        {"name": "computer science",
         "description": "Good for answering computer science questions",
         "prompt_template": computerscience_template},
    ]

    from langchain.chat_models import ChatOpenAI
    from langchain.chains import LLMChain
    from langchain.chains.router import MultiPromptChain
    from langchain.chains.router.llm_router import LLMRouterChain, RouterOutputParser
    from langchain.prompts import ChatPromptTemplate, PromptTemplate

    llm = ChatOpenAI(temperature=0)

    # Build one chain per prompt and store the four chains in destination_chains
    destination_chains = {}
    for p_info in prompt_infos:
        name = p_info["name"]
        prompt_template = p_info["prompt_template"]
        prompt = ChatPromptTemplate.from_template(template=prompt_template)
        chain = LLMChain(llm=llm, prompt=prompt)
        destination_chains[name] = chain

    destinations = [f"{p['name']}: {p['description']}" for p in prompt_infos]
    destinations_str = "\n".join(destinations)

    # Create the default prompt and chain, used when no destination fits
    default_prompt = ChatPromptTemplate.from_template("{input}")
    default_chain = LLMChain(llm=llm, prompt=default_prompt)

    # Router prompt template with two variables, destinations and input
    MULTI_PROMPT_ROUTER_TEMPLATE = """Given a raw text input to a \
    language model select the model prompt best suited for the input. \
    You will be given the names of the available prompts and a \
    description of what the prompt is best suited for. \
    You may also revise the original input if you think that revising\
    it will ultimately lead to a better response from the language model.

    << FORMATTING >>
    Return a markdown code snippet with a JSON object formatted to look like:
    ```json
    {{{{
        "destination": string \ name of the prompt to use or "DEFAULT"
        "next_inputs": string \ a potentially modified version of the original input
    }}}}
    ```

    REMEMBER: "destination" MUST be one of the candidate prompt \
    names specified below OR it can be "DEFAULT" if the input is not\
    well suited for any of the candidate prompts.
    REMEMBER: "next_inputs" can just be the original input \
    if you don't think any modifications are needed.

    << CANDIDATE PROMPTS >>
    {destinations}

    << INPUT >>
    {{input}}

    << OUTPUT (remember to include the ```json)>>"""

    # Fill in the destinations variable of the router template
    router_template = MULTI_PROMPT_ROUTER_TEMPLATE.format(destinations=destinations_str)

    # Build the router prompt; input is filled in at run time
    router_prompt = PromptTemplate(
        template=router_template,
        input_variables=["input"],
        output_parser=RouterOutputParser(),
    )

    # Create the router chain
    router_chain = LLMRouterChain.from_llm(llm, router_prompt)
    chain = MultiPromptChain(
        router_chain=router_chain,
        destination_chains=destination_chains,
        default_chain=default_chain,
        verbose=True,
    )

    # Run the router chain; replace "xxx" with the question to route
    chain.run("xxx")
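The routing step itself can be pictured as a dispatch table with a fallback. In the sketch below a keyword match stands in for the LLM's decision, which is a deliberate simplification and not how `LLMRouterChain` chooses a destination; the keyword lists, handler functions, and the `route` helper are all hypothetical.

```python
def route(question, destinations, default):
    """Pick the first destination whose keywords appear in the question."""
    lowered = question.lower()
    for name, (keywords, handler) in destinations.items():
        if any(word in lowered for word in keywords):
            return name, handler(question)
    # No destination matched, so fall back to the default handler.
    return "DEFAULT", default(question)


# Stand-ins for the destination chains: each one just tags the question.
destinations = {
    "physics": (("radiation", "gravity"), lambda q: f"[physics] {q}"),
    "math": (("+", "integral"), lambda q: f"[math] {q}"),
}


def default_handler(question):
    return f"[default] {question}"


print(route("What is black body radiation?", destinations, default_handler))
print(route("what is 2 + 2", destinations, default_handler))
print(route("Why does every cell in our body contain DNA?", destinations, default_handler))
```

The third question matches no destination, so it goes to the default handler, which plays the same role as `default_chain` in a `MultiPromptChain`.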