Legacy Evaluations, Reports, and Datasets are deprecated for new workflows. Use Tables for new evaluation, dataset, report, backtesting, and batch workflows. See Migrate from Evaluations and Datasets.
Use request history when you want to build datasets from real production or staging traffic. This is a strong fit for backtesting, regression testing, and creating evaluation sets from the prompts, metadata, and outcomes your system has already seen.
PromptLayer can build a dataset from your request history, including metadata, input variable context, tags, and the request response. This is especially useful when you want to evaluate a new prompt version against real historical examples.
You create the dataset from the Dataset dialog: go to Datasets and click Add from Request History. This opens a request log browser where you can filter and select requests.
Adding from request history
When creating a dataset from history, you can narrow what gets included by filtering on:
  • Time range
  • Metadata key-value pairs
  • Prompt templates and version numbers
  • Search query
  • Scores
  • Tags
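To make the filter criteria above concrete, here is a minimal local sketch of how they might combine when selecting requests. It does not call PromptLayer. `RequestLog`, `HistoryFilter`, and `select_requests` are hypothetical names invented for illustration, and the sketch assumes that all supplied filters must match (AND semantics), that the search query is matched against the response text, and that the score filter is a minimum threshold. The product's actual matching rules may differ.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class RequestLog:
    """Hypothetical, simplified stand-in for one logged request."""
    id: int
    timestamp: datetime
    prompt_name: str
    prompt_version: int
    metadata: dict = field(default_factory=dict)
    tags: list = field(default_factory=list)
    score: Optional[int] = None
    response: str = ""


@dataclass
class HistoryFilter:
    """Hypothetical filter mirroring the criteria listed above."""
    start: Optional[datetime] = None          # time range, inclusive
    end: Optional[datetime] = None
    metadata: dict = field(default_factory=dict)  # key-value pairs that must all match
    prompt_name: Optional[str] = None
    prompt_versions: Optional[set] = None
    query: Optional[str] = None               # assumed: substring of the response
    min_score: Optional[int] = None           # assumed: minimum score threshold
    tags: set = field(default_factory=set)    # tags that must all be present

    def matches(self, log: RequestLog) -> bool:
        # Every supplied criterion must hold (assumed AND semantics).
        if self.start and log.timestamp < self.start:
            return False
        if self.end and log.timestamp > self.end:
            return False
        if any(log.metadata.get(k) != v for k, v in self.metadata.items()):
            return False
        if self.prompt_name and log.prompt_name != self.prompt_name:
            return False
        if self.prompt_versions and log.prompt_version not in self.prompt_versions:
            return False
        if self.query and self.query.lower() not in log.response.lower():
            return False
        if self.min_score is not None and (log.score is None or log.score < self.min_score):
            return False
        return self.tags.issubset(log.tags)


def select_requests(logs, flt):
    """Return the logs that satisfy every criterion in the filter."""
    return [log for log in logs if flt.matches(log)]


# Made-up sample records, for illustration only.
LOGS = [
    RequestLog(1, datetime(2024, 5, 1), "support-reply", 3, {"env": "prod"}, ["billing"], 90, "Your refund was issued."),
    RequestLog(2, datetime(2024, 5, 2), "support-reply", 3, {"env": "staging"}, ["billing"], 95, "Refund pending."),
    RequestLog(3, datetime(2024, 5, 3), "support-reply", 2, {"env": "prod"}, ["billing"], 40, "Refund denied."),
    RequestLog(4, datetime(2024, 4, 1), "summarize", 1, {"env": "prod"}, [], None, "Summary text."),
]

prod_billing = HistoryFilter(
    start=datetime(2024, 5, 1),
    metadata={"env": "prod"},
    prompt_name="support-reply",
    tags={"billing"},
    min_score=50,
)
selected_ids = [log.id for log in select_requests(LOGS, prod_billing)]
print(selected_ids)
```

Narrowing by several criteria at once, as in `prod_billing`, is a common way to keep a dataset focused on one prompt and one environment before backtesting.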
After you save the dataset, use it in an evaluation pipeline to backtest a new prompt version against real historical inputs. See Backtest Prompt Changes.
The filter-params endpoint is the recommended way to create a dataset from history in one step. The draft, add-request-log, and save-draft endpoints support a more advanced manual workflow when you want precise control over how rows are assembled.
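As a rough illustration of the backtesting idea mentioned above, the following sketch replays historical input variables through a new prompt template and compares each new response with the historical one. It is a local toy, not PromptLayer's evaluation pipeline. `DatasetRow`, `backtest`, and `fake_model` are hypothetical names, the model is a deterministic stub, and exact-match comparison is just one possible scoring choice.

```python
from dataclasses import dataclass


@dataclass
class DatasetRow:
    """Hypothetical dataset row built from one historical request."""
    input_variables: dict
    historical_response: str


def fake_model(prompt: str) -> str:
    """Deterministic stub standing in for a real model call."""
    return "positive" if "great" in prompt.lower() else "negative"


def exact_match(old: str, new: str) -> bool:
    """Compare responses, ignoring case and surrounding whitespace."""
    return old.strip().lower() == new.strip().lower()


def backtest(rows, template, run_model, compare):
    """Render the new template per row, run it, and compare with history."""
    results = []
    for row in rows:
        prompt = template.format(**row.input_variables)
        new_response = run_model(prompt)
        results.append({
            "prompt": prompt,
            "historical": row.historical_response,
            "new": new_response,
            "match": compare(row.historical_response, new_response),
        })
    return results


# Made-up rows, for illustration only.
ROWS = [
    DatasetRow({"review": "great product"}, "positive"),
    DatasetRow({"review": "broke in a day"}, "negative"),
    DatasetRow({"review": "not great"}, "negative"),
]

NEW_TEMPLATE = "Classify the sentiment of this review: {review}"

results = backtest(ROWS, NEW_TEMPLATE, fake_model, exact_match)
match_rate = sum(r["match"] for r in results) / len(results)
print(f"{match_rate:.2f}")
```

Rows where `match` is false are the ones worth inspecting by hand, since they show where the new prompt version diverges from what was previously returned.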