Step 6: Iteratively implement & evaluate quality fixes

[Image: ../_images/workflow_iterate.png]

Requirements

  1. Based on your root cause analysis, you have identified one or more potential fixes to retrieval or generation that you want to implement and evaluate

  2. Your POC application (or another baseline chain) is logged to an MLflow Run, with its Agent Evaluation results stored in the same Run (see the sketch after this list)
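For reference, a minimal sketch of what such a baseline Run might look like, assuming a LangChain-based chain defined in a hypothetical `chain.py` file and a tiny illustrative evaluation set; Agent Evaluation is invoked by passing `model_type="databricks-agent"` to `mlflow.evaluate`:

```python
import mlflow
import pandas as pd

# Hypothetical evaluation set; `request` (and optionally `expected_response`)
# follow the Agent Evaluation input schema.
eval_df = pd.DataFrame(
    {
        "request": ["How do I change the retriever's chunk size?"],
        "expected_response": [
            "Update the chunk size setting in the global config and re-run the data pipeline."
        ],
    }
)

with mlflow.start_run(run_name="poc_baseline"):
    # Log the POC chain as an MLflow model (models-from-code style).
    logged_chain = mlflow.langchain.log_model(
        lc_model="chain.py",       # assumed path to the chain definition
        artifact_path="chain",
    )

    # Run Agent Evaluation; its results land in the same MLflow Run as the model.
    mlflow.evaluate(
        model=logged_chain.model_uri,
        data=eval_df,
        model_type="databricks-agent",
    )
```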

Code Repository

You can find all of the sample code referenced throughout this section here.

Expected outcome

[Animation: ../_images/mlflow-eval-agent.gif]

Instructions

Based on the type of fix you want to make, modify the 00_global_config, 02_data_pipeline, or 03_agent_proof_of_concept notebook. Then use the 05_evaluate_poc_quality notebook to evaluate the resulting chain against your baseline configuration (initially, your POC) and pick a “winner”. This notebook helps you select the winning experiment and deploy it to the Review App or to a production-ready, scalable REST API.
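To make the “pick a winner” step concrete, the sketch below assumes each fix was evaluated in its own MLflow Run within a single experiment; the experiment path is an assumption, and the metric column names vary, so inspect your own runs for the exact names:

```python
import mlflow

# Assumed experiment path; replace with the experiment your notebooks log to.
experiment = mlflow.set_experiment("/Shared/rag_quality_iteration")

# Pull the baseline Run and each candidate-fix Run into one DataFrame.
runs = mlflow.search_runs(
    experiment_ids=[experiment.experiment_id],
    order_by=["start_time DESC"],
)

# Agent Evaluation logs aggregate metrics (e.g. judge pass rates, latency);
# check `runs.columns` for the exact metric names before comparing.
metric_cols = [c for c in runs.columns if c.startswith("metrics.")]
print(runs[["run_id", "tags.mlflow.runName", *metric_cols]])
```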

  1. Open the B_quality_iteration/02_evaluate_fixes Notebook

  2. Based on the type of fix you are implementing, modify and re-run the corresponding notebook (00_global_config, 02_data_pipeline, or 03_agent_proof_of_concept) for each candidate fix

  3. Run the 05_evaluate_poc_quality notebook and use MLflow to:

    • Evaluate each fix

    • Determine the fix with the best quality/cost/latency metrics

    • Deploy the best one to the Review App and to a production-ready REST API to get stakeholder feedback (a deployment sketch follows this list)
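Deploying the winner typically means registering the logged chain to Unity Catalog and calling `databricks.agents.deploy`, which provisions both the Review App and a scalable REST endpoint. A minimal sketch, assuming a hypothetical Unity Catalog model name and the winning run's ID:

```python
import mlflow
from databricks import agents

# Register models to Unity Catalog rather than the workspace registry.
mlflow.set_registry_uri("databricks-uc")

# Hypothetical names; substitute your winning run ID and catalog/schema.
winning_model_uri = "runs:/<winning_run_id>/chain"
uc_model_name = "main.rag_app.quality_iteration_chain"

# Register the winning chain to Unity Catalog ...
registered = mlflow.register_model(winning_model_uri, uc_model_name)

# ... then deploy it, which creates the Review App and a REST API endpoint.
deployment = agents.deploy(uc_model_name, registered.version)

# Attribute names may differ across SDK versions; inspect the returned object.
print(deployment.query_endpoint)
```

The returned deployment information includes the URLs you can share with stakeholders for Review App feedback and for querying the REST endpoint.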