LangChain for LLM Application Development: Notes 2

1. Q&A over Documents

Let's start with an example. For the CSV file OutdoorClothingCatalog_1000.csv, once we build a vector index with VectorstoreIndexCreator we can query its contents directly. How does this work?

from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders  import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown
from langchain.indexes import VectorstoreIndexCreator

#1. Load the CSV
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)

#2. Build a vector index over the document
index = VectorstoreIndexCreator(vectorstore_cls=DocArrayInMemorySearch)\
    .from_loaders([loader])

query ="Please list all your shirts with sun protection \
in a table in markdown and summarize each one."
response = index.query(query)
display(Markdown(response))

Under the hood, each document is encoded by an embedding model into a vector, and the query is answered by returning the top-k vectors closest to the query's own embedding.
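The retrieval step can be sketched without LangChain at all: embed everything, then rank by cosine similarity. The 3-d vectors below are toy stand-ins for real embedding-model outputs, not actual OpenAI embeddings:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=2):
    # Rank document vectors by similarity to the query vector,
    # returning the indices of the k closest documents.
    scored = sorted(enumerate(doc_vecs),
                    key=lambda iv: cosine(query_vec, iv[1]),
                    reverse=True)
    return [i for i, _ in scored[:k]]

# Toy 3-d "embeddings" standing in for real model outputs.
docs = [[1.0, 0.0, 0.0],   # about shirts
        [0.0, 1.0, 0.0],   # about pants
        [0.9, 0.1, 0.0]]   # also about shirts
query = [1.0, 0.05, 0.0]
print(top_k(query, docs))  # [0, 2]: the two shirt-like vectors rank first
```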


In practice, however, documents tend to be large, or there are many similar documents. They therefore need to be split into small chunks, each of which is embedded into a vector. At query time we look up the top-k vectors nearest to the query vector, which identifies the most relevant chunks, and those chunks are returned.
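The chunking step can be illustrated with a minimal fixed-size splitter with overlap. This is a rough stand-in for LangChain's text splitters, which split on separators rather than raw character offsets:

```python
def split_text(text, chunk_size=50, overlap=10):
    """Split text into fixed-size chunks whose tails overlap, so that
    sentences cut at a boundary still appear whole in some chunk."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = ("LangChain splits long documents into overlapping chunks "
       "before embedding them into a vector store.")
for c in split_text(doc, chunk_size=30, overlap=5):
    print(repr(c))
```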


Once we have a vector store, how do we use it? There are several ways to query a document vector store:

  1. Query directly through the index:
   from langchain.embeddings import OpenAIEmbeddings
   embeddings = OpenAIEmbeddings()

   file = 'OutdoorClothingCatalog_1000.csv'
   loader = CSVLoader(file_path=file)
   docs = loader.load()

   llm = ChatOpenAI(temperature = 0.0)

   index = VectorstoreIndexCreator(vectorstore_cls=DocArrayInMemorySearch)\
       .from_loaders([loader])
   response = index.query(query, llm=llm)
  2. db.similarity_search(query): search the vector store directly with similarity_search
db = DocArrayInMemorySearch.from_documents(
    docs,
    embeddings
)

query = "Please suggest a shirt with sunblocking"
docs = db.similarity_search(query)
print(docs)
  3. Use the db as a retriever
retriever = db.as_retriever()


qdocs = "".join([docs[i].page_content for i in range(len(docs))])
response = llm.call_as_llm(f"{qdocs} Question: Please list all your \
shirts with sun protection in a table in markdown and summarize each one.") 
  4. RetrievalQA: retrieve the answer with a chain
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    verbose=True
)
response = qa_stuff.run(query)

There are other approaches as well: RetrievalQA supports several chain types for combining retrieved chunks (stuff, map_reduce, refine, map_rerank); see the LangChain documentation for details.

2. Evaluation

  1. Set langchain.debug = True to trace the QA chain and check whether each step behaves as expected.
  2. Use QAEvalChain to check whether the generated answers agree with the reference answers.
from langchain.evaluation.qa import QAEvalChain
llm = ChatOpenAI(temperature=0)
eval_chain = QAEvalChain.from_llm(llm)

# examples: list of {"query", "answer"} reference pairs;
# predictions: the QA chain's outputs, each with a "result" field
graded_outputs = eval_chain.evaluate(examples, predictions)
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question:" + predictions[i]['query'])
    print("Real Answer:" + predictions[i]['answer'])
    print("Predicted Answer:" + predictions[i]['result'])
    print("Predicted Grade:" + graded_outputs[i]['text'])
    print()
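QAEvalChain uses an LLM as the grader. The shape of that grading loop can be mimicked with a toy string-match grader; this is a rough stand-in for illustration, not the LLM-based judgment itself, and the sample data below is made up:

```python
def grade(example, prediction):
    """Toy grader: mark CORRECT if the reference answer appears inside
    the predicted answer (QAEvalChain delegates this judgment to an LLM)."""
    ok = example["answer"].lower() in prediction["result"].lower()
    return {"text": "CORRECT" if ok else "INCORRECT"}

# Hypothetical example/prediction pair in the same shape as above.
examples = [{"query": "Does the Cozy Comfort set have side pockets?",
             "answer": "Yes"}]
predictions = [{"query": examples[0]["query"],
                "answer": examples[0]["answer"],
                "result": "Yes, the Cozy Comfort Pullover Set has side pockets."}]

graded_outputs = [grade(e, p) for e, p in zip(examples, predictions)]
print(graded_outputs[0]["text"])  # CORRECT
```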

3. Agents

Typical steps (see the official agent documentation for details):

  1. Decide which tools (domain knowledge) the agent should load.
  2. Initialize the agent with those tools.
  3. Call agent.run(question) to answer the question.
from langchain.agents.agent_toolkits import create_python_agent
from langchain.agents import load_tools, initialize_agent
from langchain.agents import AgentType
from langchain.tools.python.tool import PythonREPLTool
from langchain.chat_models import ChatOpenAI


llm = ChatOpenAI(temperature=0)
tools = load_tools(["wikipedia", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm,
                         agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
                         verbose=True)

agent.run("Which country is the northernmost?")
#agent.run("How to solve the center gravity of a triangle?")

With create_python_agent you can build your own code-executing assistant, for example to sort customer names:

agent = create_python_agent(
    llm,
    tool=PythonREPLTool(),
    verbose=True
)

customer_list = [["Harrison", "Chase"], 
                 ["Lang", "Chain"],
                 ["Dolly", "Too"],
                 ["Elle", "Elem"], 
                 ["Geoff","Fusion"], 
                 ["Trance","Former"],
                 ["Jen","Ayai"]
                ]
agent.run(f"""Sort these customers by \
last name and then first name \
and print the output: {customer_list}""") 
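For reference, the sort the agent is asked to perform is equivalent to this plain Python, which the PythonREPLTool lets the agent write and execute on its own:

```python
customer_list = [["Harrison", "Chase"],
                 ["Lang", "Chain"],
                 ["Dolly", "Too"],
                 ["Elle", "Elem"],
                 ["Geoff", "Fusion"],
                 ["Trance", "Former"],
                 ["Jen", "Ayai"]]

# Sort by last name first, then by first name.
ordered = sorted(customer_list, key=lambda c: (c[1], c[0]))
print(ordered)
```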

The @tool decorator lets you define your own tools, for example:

from langchain.agents import tool
from datetime import date
@tool
def time(text: str) -> str:
    """Returns today's date; use this for any \
    questions related to knowing today's date. \
    The input should always be an empty string, \
    and this function will always return today's \
    date - any date mathematics should occur \
    outside this function."""
    return str(date.today())

agent= initialize_agent(tools + [time], 
    llm, 
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
    verbose = True)

This course covered how to use LangChain, mainly:

  1. Models, Prompts and parsers
  2. Memory
  3. Chains
  4. QA
  5. Evaluation
  6. Agents
 
Copyright notice: original article on this site, by admin, published 2023-11-26 (3,776 characters).
Reprint terms: unless otherwise stated, articles on this site are published under the CC-4.0 license; contact tensortimes@gmail.com for reprints.