Step 6: Iteratively implement & evaluate quality fixes
Requirements
- Based on your root cause analysis, you have identified potential fixes to either retrieval or generation to implement and evaluate.
- Your POC application (or another baseline chain) is logged to an MLflow Run, with an evaluation from Agent Evaluation stored in the same Run.
- You can find all of the sample code referenced throughout this section here.
Expected outcome
Instructions
Based on which type of fix you want to make, modify the 00_global_config, 02_data_pipeline, or 03_agent_proof_of_concept notebook. Use the 05_evaluate_poc_quality notebook to evaluate the resulting chain versus your baseline configuration (at first, this is your POC) and pick a “winner”. This notebook will help you pick the winning experiment and deploy it to the Review App or a production-ready, scalable REST API.
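For reference, evaluating a single candidate fix typically boils down to one mlflow.evaluate call with the databricks-agent model type, run against the same evaluation set as your baseline. The sketch below is illustrative rather than a copy of the notebook code: the evaluation records, run name, and the logged_chain_uri placeholder are assumptions you would replace with your own values.

```python
import mlflow
import pandas as pd

# Placeholder: URI of the candidate chain's logged MLflow model.
logged_chain_uri = "runs:/<candidate_run_id>/chain"

# Illustrative evaluation set; Agent Evaluation expects a `request` column
# and optionally `expected_response` for ground truth.
eval_df = pd.DataFrame(
    {
        "request": ["How do I create a Delta table?"],
        "expected_response": ["Use CREATE TABLE ... USING DELTA."],
    }
)

# Log the candidate fix as its own MLflow Run so it can later be compared
# against the baseline (POC) Run that holds your existing evaluation.
with mlflow.start_run(run_name="fix_candidate_retrieval_top_k_5"):
    result = mlflow.evaluate(
        model=logged_chain_uri,
        data=eval_df,
        model_type="databricks-agent",  # runs Agent Evaluation's LLM judges
    )

print(result.metrics)                         # aggregate quality/cost/latency metrics
per_question = result.tables["eval_results"]  # per-question judge results
```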
Open the B_quality_iteration/02_evaluate_fixes notebook.

Based on the type of fix you are implementing:
- Follow these instructions to create the new data pipeline.
- Modify the 00_global_config notebook (a configuration-change sketch follows this list).
- Create a modified chain code file similar to agents/function_calling_agent_w_retriever_tool and reference it from the 03_agent_proof_of_concept notebook.
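For the second and third options, a fix is usually just new parameter values plus a re-logged chain. The sketch below shows the general shape, assuming a dictionary-style config and MLflow's models-from-code support in mlflow.langchain.log_model; every key, value, and file path here is hypothetical rather than taken from the sample notebooks.

```python
import mlflow

# Hypothetical baseline parameters (the real 00_global_config defines its own keys).
baseline_config = {
    "retriever": {"chunk_size_tokens": 512, "chunk_overlap_tokens": 64, "top_k": 3},
    "llm": {"endpoint": "databricks-meta-llama-3-70b-instruct", "temperature": 0.0},
}

# Candidate fix: larger chunks and a deeper retrieval cut-off, targeting a
# "retrieved context is missing the answer" root cause from the previous step.
fix_candidate_config = {
    **baseline_config,
    "retriever": {"chunk_size_tokens": 1024, "chunk_overlap_tokens": 128, "top_k": 5},
}

# Re-log the (possibly modified) chain code file with the candidate config so the
# fix is captured as its own MLflow model. The file path is a placeholder; the
# referenced file is expected to declare its chain via mlflow.models.set_model().
with mlflow.start_run(run_name="fix_candidate_larger_chunks"):
    logged_chain = mlflow.langchain.log_model(
        lc_model="agents/modified_function_calling_agent.py",  # chain-as-code file
        model_config=fix_candidate_config,
        artifact_path="chain",
    )
```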
Run the 05_evaluate_poc_quality notebook and use MLflow to:

1. Evaluate each fix
2. Determine the fix with the best quality/cost/latency metrics
3. Deploy the best one to the Review App and a production-ready REST API to get stakeholder feedback (see the sketch below)
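As a rough illustration of steps 2 and 3, the comparison can be done over the metrics that Agent Evaluation logged to each Run, and deployment uses the databricks-agents package. The metric key, Unity Catalog model name, and version below are placeholders; the sketch assumes each fix (plus the baseline) was evaluated in the current MLflow experiment and that the winning chain has already been registered to Unity Catalog.

```python
import mlflow
from databricks import agents  # databricks-agents package

# Pull every evaluated Run (baseline + fixes) from the active experiment.
runs = mlflow.search_runs(order_by=["start_time DESC"])

# Placeholder metric column: substitute whichever quality/cost/latency metric
# Agent Evaluation logged for your runs (check the MLflow UI for exact names).
QUALITY_METRIC = "metrics.response/llm_judged/correctness/rating/percentage"

best_run = runs.sort_values(QUALITY_METRIC, ascending=False).iloc[0]
print(f"Winning run: {best_run['run_id']}")

# Deploy the registered winner to the Review App and a scalable REST endpoint.
# Model name and version are placeholders for your Unity Catalog registration.
agents.deploy("main.rag_app.finance_qa_chain", 2)
```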