Backtesting lets you run a new prompt version against real historical inputs. Use it when you want to understand how a prompt change would have affected production or staging traffic.
Create a historical dataset
Go to Datasets and click Add from Request History. This opens a request log browser where you can filter and select requests.
Run a backtest
Create an evaluation that runs your new prompt version against the historical dataset. Add columns for:
- New prompt output: The response from your updated prompt version
- Comparison: An equality comparison, semantic similarity check, LLM-as-judge score, or human review column
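
To make the two columns concrete, here is a minimal local sketch of the same idea in plain Python. It does not use the PromptLayer SDK or any PromptLayer API. The row fields (`input`, `old_output`), the `run_new_prompt` placeholder, and the sample data are all hypothetical, and the sketch assumes each historical row carries the response that was originally logged. The similarity score uses `difflib` from the standard library, which measures character-level overlap; it stands in for a semantic similarity check and is not equivalent to one.

```python
from difflib import SequenceMatcher

# Hypothetical historical rows: a logged input plus the response the
# previous prompt version returned. Field names are illustrative only.
historical_rows = [
    {"input": "Reset my password", "old_output": "Go to Settings > Security."},
    {"input": "Cancel my plan", "old_output": "Open Billing and click Cancel."},
]


def run_new_prompt(user_input: str) -> str:
    """Placeholder for calling the updated prompt version (assumption).

    A real backtest would call the model here; canned responses keep the
    sketch runnable offline.
    """
    canned = {
        "Reset my password": "Go to Settings > Security.",
        "Cancel my plan": "Open Billing, then choose Cancel plan.",
    }
    return canned[user_input]


def compare(old: str, new: str) -> dict:
    """Build the comparison column: exact match plus a character-level ratio."""
    return {
        "exact_match": old.strip() == new.strip(),
        "similarity": SequenceMatcher(None, old, new).ratio(),
    }


def backtest(rows: list[dict]) -> list[dict]:
    """Add a new-output column and a comparison column to every row."""
    results = []
    for row in rows:
        new_output = run_new_prompt(row["input"])
        results.append(
            {**row, "new_output": new_output, **compare(row["old_output"], new_output)}
        )
    return results


results = backtest(historical_rows)
for r in results:
    print(r["input"], r["exact_match"], round(r["similarity"], 2))
```

Rows where `exact_match` is false are the ones worth reading first. Exact and character-level checks tend to flag harmless rewording, which is where an LLM-as-judge score or a human review column becomes useful.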


