1. Q&A over Documents
Let's start with an example. Given the CSV file OutdoorClothingCatalog_1000.csv, we can build a vector index over it with VectorstoreIndexCreator and then query its contents directly. How is this implemented?
```python
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown
from langchain.indexes import VectorstoreIndexCreator

# 1. Load the CSV
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)

# 2. Build a vector index over the documents
index = VectorstoreIndexCreator(vectorstore_cls=DocArrayInMemorySearch)\
    .from_loaders([loader])

query = "Please list all your shirts with sun protection \
in a table in markdown and summarize each one."
response = index.query(query)
display(Markdown(response))
```
Under the hood, each document is encoded into an embedding vector. The query is embedded the same way, and the lookup simply returns the top-k stored vectors closest to the query vector.

In practice, however, documents are usually large, or there are many similar documents. In that case we split them into small chunks, compute an embedding vector for each chunk, and then look up the top-k vectors closest to the query vector. The chunks behind those vectors are the relevant ones, and they are what gets returned. A sketch of this flow is shown below.
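A rough sketch of this chunk-and-embed flow, using LangChain's RecursiveCharacterTextSplitter (the chunk sizes and query string here are only illustrative):

```python
from langchain.document_loaders import CSVLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import DocArrayInMemorySearch

# Load the documents and split them into small chunks (sizes are illustrative).
docs = CSVLoader(file_path='OutdoorClothingCatalog_1000.csv').load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# Embed each chunk and store the vectors in an in-memory vector store.
db = DocArrayInMemorySearch.from_documents(chunks, OpenAIEmbeddings())

# Return the top-k chunks whose vectors are closest to the query vector.
top_chunks = db.similarity_search("shirts with sun protection", k=4)
```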

Once we have a vector store, how do we use it? There are several ways to query a document vector store:

- Query the index directly:
```python
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)
docs = loader.load()
llm = ChatOpenAI(temperature=0.0)
index = VectorstoreIndexCreator(vectorstore_cls=DocArrayInMemorySearch)\
    .from_loaders([loader])
response = index.query(query, llm=llm)
```
- Call `db.similarity_search(query)` directly on the vector store:
```python
db = DocArrayInMemorySearch.from_documents(
    docs,
    embeddings
)
query = "Please suggest a shirt with sunblocking"
docs = db.similarity_search(query)
print(docs)
```
- Use the vector store as a retriever and stuff the retrieved documents into the prompt yourself:
```python
retriever = db.as_retriever()
qdocs = "".join([docs[i].page_content for i in range(len(docs))])
response = llm.call_as_llm(f"{qdocs} Question: Please list all your \
shirts with sun protection in a table in markdown and summarize each one.")
```
- Use a `RetrievalQA` chain to search for the answer:
```python
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    verbose=True
)
response = qa_stuff.run(query)
```
There are other approaches as well; see the LangChain documentation for the full list.
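For example, the same `RetrievalQA` chain can be built with a different `chain_type`; a sketch using `map_reduce` (other values include `refine` and `map_rerank`), reusing `llm`, `retriever`, and `query` from above:

```python
# Run the LLM over each retrieved chunk separately, then combine the
# per-chunk answers into a final answer (more LLM calls, but scales to
# many chunks that would not all fit into a single "stuff" prompt).
qa_map_reduce = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="map_reduce",
    retriever=retriever,
    verbose=True
)
response = qa_map_reduce.run(query)
```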

2. Evaluation
Setting `langchain.debug = True` prints every intermediate step of the QA chain, which makes it easy to check whether the process behaves as expected.

`QAEvalChain` checks whether the generated answers agree with the reference answers. The evaluation code below assumes a list of `examples` (question/answer pairs) and the chain's `predictions` for them.
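A minimal sketch of how they might be prepared, reusing the `qa_stuff` chain built above (the two example questions are only illustrative):

```python
# Hand-written question/answer pairs (illustrative; the course also shows
# QAGenerateChain for generating examples automatically from the documents).
examples = [
    {"query": "Do the Cozy Comfort Pullover Set have side pockets?",
     "answer": "Yes"},
    {"query": "What collection is the Ultra-Lofty 850 Stretch Down Hooded Jacket from?",
     "answer": "The DownTek collection"},
]

# Run the QA chain on each example to get its predicted answers.
predictions = qa_stuff.apply(examples)
```

With `examples` and `predictions` in hand, `QAEvalChain` grades each prediction: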
```python
from langchain.evaluation.qa import QAEvalChain

llm = ChatOpenAI(temperature=0)
eval_chain = QAEvalChain.from_llm(llm)
graded_outputs = eval_chain.evaluate(examples, predictions)

for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question:" + predictions[i]['query'])
    print("Real Answer:" + predictions[i]['answer'])
    print("Predicted Answer:" + predictions[i]['result'])
    print("Predicted Grade:" + graded_outputs[i]['text'])
    print()
```
3. Agents
The general steps (see the official agent documentation for details):

- Instantiate the LLM and load the tools, which determine what domain knowledge the agent can draw on
- Initialize the agent
- Call `agent.run(question)` to answer the question.
```python
from langchain.agents.agent_toolkits import create_python_agent
from langchain.agents import load_tools, initialize_agent
from langchain.agents import AgentType
from langchain.tools.python.tool import PythonREPLTool
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0)
tools = load_tools(["wikipedia", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm,
                         agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
                         verbose=True)
agent.run("Which country is the northernmost?")
# agent.run("How to find the center of gravity of a triangle?")
```
`create_python_agent` lets you build your own Python assistant, for example one that sorts customer names:
```python
agent = create_python_agent(
    llm,
    tool=PythonREPLTool(),
    verbose=True
)
customer_list = [["Harrison", "Chase"],
                 ["Lang", "Chain"],
                 ["Dolly", "Too"],
                 ["Elle", "Elem"],
                 ["Geoff", "Fusion"],
                 ["Trance", "Former"],
                 ["Jen", "Ayai"]
                 ]
agent.run(f"""Sort these customers by \
last name and then first name \
and print the output: {customer_list}""")
```
The `@tool` decorator lets you build your own tools, for example:
```python
from langchain.agents import tool
from datetime import date

@tool
def time(text: str) -> str:
    """Returns today's date; use this for any \
    questions related to knowing today's date. \
    The input should always be an empty string, \
    and this function will always return today's \
    date - any date mathematics should occur \
    outside this function."""
    return str(date.today())

agent = initialize_agent(tools + [time],
                         llm,
                         agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
                         handle_parsing_errors=True,
                         verbose=True)
```
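A quick way to check that the agent actually picks up the new tool (the question wording is just an example):

```python
agent.run("What's the date today?")
```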
This course covered how to use LangChain, mainly including:
- Models, Prompts and parsers
- Memory
- Chains
- QA
- Evaluation
- Agents