[LangChain LLM Development Tutorial] Evaluation
Author: 创始人
2024-11-18 14:37:04

 🔗 LangChain for LLM Application Development - DeepLearning.AI

Learning objectives

1. Example generation

2. Manual evaluation and debug

3. LLM-assisted evaluation

4. LangChain evaluation platform

1. Import the packages and load the environment variables:

import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())  # read local .env file

from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.embeddings import HuggingFaceEmbeddings  # used in step 3 to build the embeddings
from langchain.indexes import VectorstoreIndexCreator
from langchain.vectorstores import DocArrayInMemorySearch
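Step 4 reads `ZHIPUAI_API_KEY` and `ZHIPUAI_API_URL` from the environment, so it can help to confirm they were actually loaded from the `.env` file before building anything. A minimal sketch, assuming a hypothetical helper named `missing_env_vars` (it is not part of LangChain or python-dotenv):

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Variable names taken from the ChatOpenAI call in this tutorial.
required = ["ZHIPUAI_API_KEY", "ZHIPUAI_API_URL"]
missing = missing_env_vars(required)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing an explicit `env` mapping keeps the helper easy to try out without touching the real environment.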

2. Load the data:

file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file, encoding='utf-8')
data = loader.load()
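To get a feel for what `data` contains: `CSVLoader` turns each CSV row into one document whose text is a series of `column: value` lines. The sketch below imitates that behaviour with the standard library only. The two sample rows and the `rows_to_texts` helper are made up for illustration, and the real loader returns `Document` objects that also carry metadata.

```python
import csv
import io


def rows_to_texts(csv_text):
    """Render each CSV row as 'column: value' lines, one string per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        "\n".join(f"{key}: {value}" for key, value in row.items())
        for row in reader
    ]


# Hypothetical sample rows; the real catalog file has its own columns and content.
sample = "name,description\nCamp Mat,A rugged mat\nStorm Pants,Waterproof pants\n"
texts = rows_to_texts(sample)
print(texts[0])
```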

3. Create the vector store (warning ⚠: this step is memory-intensive):

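# Assumed to point at a local model directory; on the Hugging Face Hub
# the corresponding model id is "BAAI/bge-large-en-v1.5".
model_name = "bge-large-en-v1.5"
embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
)

db = DocArrayInMemorySearch.from_documents(data, embeddings)
retriever = db.as_retriever()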

4. Initialize an LLM and create a RetrievalQA chain:

llm = ChatOpenAI(
    api_key=os.environ.get('ZHIPUAI_API_KEY'),
    base_url=os.environ.get('ZHIPUAI_API_URL'),
    model="glm-4",
    temperature=0.98,
)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    verbose=True,
    chain_type_kwargs={
        "document_separator": "<<<<>>>>>"
    },
)
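The `stuff` chain type places all retrieved documents into a single prompt, and `document_separator` is the string inserted between them. A rough sketch of that idea in plain Python, not LangChain's actual implementation: the `stuff_documents` helper, the prompt wording, and the sample texts are all assumptions for illustration.

```python
def stuff_documents(doc_texts, question, separator="<<<<>>>>>"):
    """Join all document texts with the separator and build one prompt string."""
    context = separator.join(doc_texts)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical retrieved texts standing in for the retriever's results.
docs = ["name: Camp Mat", "name: Storm Pants"]
prompt = stuff_documents(docs, "Which product is a mat?")
print(prompt)
```

A distinctive separator such as `<<<<>>>>>` makes the document boundaries easy to spot when reading the prompt in debug output.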

 Example generation

from langchain.evaluation.qa import QAGenerateChain

example_gen_chain = QAGenerateChain.from_llm(llm)

new_examples = example_gen_chain.apply_and_parse(
    [{"doc": t} for t in data[:5]]
)

Printing the generated examples shows that the result is a list that looks like this:

 [{'qa_pairs': {'query': "What is the unique feature of the innersole in the Women's Campside Oxfords?", 'answer': 'The innersole has a vintage hunt, fish, and camping motif.'}}, {'qa_pairs': {'query': 'What is the name of the dog mat that is ruggedly constructed from recycled plastic materials, helping to keep dirt and water off the floors and plastic out of landfills?', 'answer': 'The name of the dog mat is Recycled Waterhog Dog Mat, Chevron Weave.'}}, {'qa_pairs': {'query': 'What is the name of the product described in the document that is suitable for Infant and Toddler Girls?', 'answer': "The product is called 'Infant and Toddler Girls' Coastal Chill Swimsuit, Two-Piece'."}}, {'qa_pairs': {'query': 'What is the primary material used in the construction of the Refresh Swimwear V-Neck Tankini, and what percentage of it is recycled?', 'answer': 'The primary material is nylon, with 82% of it being recycled nylon.'}}, {'qa_pairs': {'query': 'What is the material used for the EcoFlex 3L Storm Pants, according to the document?', 'answer': 'The EcoFlex 3L Storm Pants are made of 100% nylon, exclusive of trim.'}}]

Each item wraps the question and answer in a `qa_pairs` key, so we need one extra step to extract them:

examples = []  # must be initialized before appending
for example in new_examples:
    examples.append(example["qa_pairs"])

print(examples)

qa.invoke(examples[0]["query"])
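The same extraction can also be written as a list comprehension. The sketch below uses two of the generated examples printed earlier as literal data, so it runs without an LLM; the `extract_qa_pairs` name is hypothetical.

```python
def extract_qa_pairs(generated):
    """Pull the inner {'query', 'answer'} dict out of each generated item."""
    return [item["qa_pairs"] for item in generated]


# Two items copied from the printed output in this tutorial.
new_examples = [
    {"qa_pairs": {
        "query": "What is the material used for the EcoFlex 3L Storm Pants, according to the document?",
        "answer": "The EcoFlex 3L Storm Pants are made of 100% nylon, exclusive of trim.",
    }},
    {"qa_pairs": {
        "query": "What is the unique feature of the innersole in the Women's Campside Oxfords?",
        "answer": "The innersole has a vintage hunt, fish, and camping motif.",
    }},
]
examples = extract_qa_pairs(new_examples)
print(examples[0]["query"])
```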

 Manual Evaluation

import langchain
langchain.debug = True  # enable debug mode to see the detailed steps inside the chain

Running the chain again now shows the details of each step inside it.

 LLM-assisted evaluation

Checking every answer by hand does not scale, so can we have a language model do the evaluation instead?

langchain.debug = False  # turn off debug mode

from langchain.evaluation.qa import QAEvalChain

Have the QA chain generate an answer for each example:

predictions = qa.apply(examples)

Initialize an evaluation chain:

eval_chain = QAEvalChain.from_llm(llm)

Have the LLM compare the reference answers with the predicted answers and assign a grade:

graded_outputs = eval_chain.evaluate(examples, predictions)

Finally, print the results:

for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + graded_outputs[i]['results'])
    print()
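With more than a handful of examples, a summary can be easier to read than the full printout. A minimal sketch, assuming each graded output is a dict whose `results` string contains `CORRECT` or `INCORRECT`; the model may phrase its grade differently, so treat the parsing as an assumption. The `summarize_grades` helper and the sample data are hypothetical.

```python
def summarize_grades(graded_outputs):
    """Count grades, checking INCORRECT first because it contains CORRECT."""
    counts = {"CORRECT": 0, "INCORRECT": 0, "UNKNOWN": 0}
    for output in graded_outputs:
        text = output.get("results", "").upper()
        if "INCORRECT" in text:
            counts["INCORRECT"] += 1
        elif "CORRECT" in text:
            counts["CORRECT"] += 1
        else:
            counts["UNKNOWN"] += 1
    return counts


# Hypothetical grader outputs, not real results from this tutorial.
sample = [{"results": "CORRECT"}, {"results": "GRADE: INCORRECT"}, {"results": "unclear"}]
summary = summarize_grades(sample)
print(summary)
```

Anything counted as `UNKNOWN` is worth reading by hand, since it means the grader did not answer in the expected form.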
