# Compare Models

> Compare prompt outputs across providers and models with an evaluation pipeline.

<Warning>
  Legacy Evaluations, Reports, and Datasets are deprecated for new workflows. Use [Tables](/features/tables/overview) for new evaluation, dataset, report, backtesting, and batch workflows. See [Migrate from Evaluations and Datasets](/features/tables/migrate-from-evaluations-and-datasets).
</Warning>

Use model comparison to test the same prompt across models from different providers, such as GPT, Claude, or Gemini, before choosing a production model.

## Before you start

You need:

* A saved prompt template
* A dataset with the input variables your prompt expects
* Provider API keys configured for the models you want to compare

## Create a comparison evaluation

Create a new evaluation and select your dataset.

Add multiple **Prompt Template** columns. Configure each column with the same prompt template, then set a different provider or model override for each column.

<Frame>
  <img src="https://mintcdn.com/promptlayer/2Nw4D0YQ3AERsqEA/new-quickstart-images/model-comparison.png?fit=max&auto=format&n=2Nw4D0YQ3AERsqEA&q=85&s=ab87559fb7eb88a96668c36a98426e15" alt="Comparing models" width="2506" height="1160" data-path="new-quickstart-images/model-comparison.png" />
</Frame>

Run the evaluation. Each row shows the prompt output from every model side by side.

## Score the outputs

Add an **LLM-as-judge**, human grading, equality comparison, or code evaluator column to score the model outputs against your criteria.

For example, you can score whether each output:

* Follows the requested format
* Answers the user correctly
* Avoids hallucinated details
* Meets latency or cost expectations for the use case
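
As a minimal sketch of the first check in the list, the function below tests whether an output is a JSON object containing a set of required keys. The function name, its signature, and the example keys are illustrative assumptions, not PromptLayer's code evaluator interface; adapt the logic to whatever input and return shape your evaluator column expects.

```python
import json


def follows_json_format(output: str, required_keys: set[str]) -> bool:
    """Return True if `output` is a JSON object containing every required key.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        # Anything that does not parse as JSON fails the format check.
        return False
    # Only a JSON object (dict) can carry named fields.
    return isinstance(parsed, dict) and required_keys.issubset(parsed)


# Example: a model output that should contain an "answer" field.
ok = follows_json_format('{"answer": "Paris"}', {"answer"})
```

A boolean check like this works well for format rules, while subjective criteria such as correctness usually fit an LLM-as-judge or human grading column better.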

Use the results to choose the model with the best balance of price, latency, and quality for your use case.
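
One way to make the price, latency, and quality trade-off explicit is a weighted score over per-model averages. The sketch below assumes you have already collected aggregate numbers for each model from your evaluation results; the `ModelSummary` fields, the weights, the model names, and the sample values are all placeholders for illustration, not values or structures produced by PromptLayer.

```python
from dataclasses import dataclass


@dataclass
class ModelSummary:
    name: str
    quality: float    # mean evaluator score in [0, 1], higher is better
    latency_s: float  # mean response time in seconds, lower is better
    cost_usd: float   # mean cost per request in USD, lower is better


def pick_model(summaries, w_quality=0.6, w_latency=0.2, w_cost=0.2):
    """Return the summary with the highest weighted score.

    Latency and cost are normalized against the worst (largest) value in
    the set, so every term falls in [0, 1] before weighting.
    """
    # `or 1.0` avoids division by zero if every value is 0.
    max_latency = max(s.latency_s for s in summaries) or 1.0
    max_cost = max(s.cost_usd for s in summaries) or 1.0

    def score(s):
        return (
            w_quality * s.quality
            + w_latency * (1 - s.latency_s / max_latency)
            + w_cost * (1 - s.cost_usd / max_cost)
        )

    return max(summaries, key=score)


# Placeholder numbers, not real measurements.
candidates = [
    ModelSummary("model-a", quality=0.90, latency_s=2.0, cost_usd=0.010),
    ModelSummary("model-b", quality=0.80, latency_s=1.0, cost_usd=0.002),
]
best = pick_model(candidates)
```

The weights encode your priorities: raising `w_quality` favors the most accurate model, while raising `w_cost` or `w_latency` favors cheaper or faster ones.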

## Next steps

* [Evaluation pipelines](/features/evaluations/building-pipelines)
* [Evaluation types](/features/evaluations/eval-types)
* [Supported providers](/features/supported-providers)
* [Custom providers](/features/custom-providers)
