How Column Sources Work
Columns can reference data from two places:
- Dataset columns - Reference data directly from your dataset by using the dataset column name
- Other evaluation columns - Reference the output of a previous column by using that column’s name

When you reference a column name in source or include a column name in sources, the system first looks for an evaluation column with that name, then falls back to looking for a dataset column.
Columns are executed in order based on their position. A column can only reference other columns that come before it in the pipeline.

Example: Chaining Columns Together

A common pattern is to chain columns: run a prompt, extract a field from the JSON output, then compare it to a ground truth value from the dataset.

Execution Types
These columns execute prompts, code, or external services.
PROMPT_TEMPLATE
Runs a prompt template from the registry against each row.
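As a sketch, a complete PROMPT_TEMPLATE configuration using all input variables might look like the following; the template name, dataset column name, and engine settings are illustrative:

```json
{
  "template": {
    "name": "customer-support-bot",
    "label": "production"
  },
  "prompt_template_variable_mappings": {
    "question": "user_question"
  },
  "engine": {
    "provider": "openai",
    "model": "gpt-4",
    "parameters": { "temperature": 0.0, "max_tokens": 512 }
  }
}
```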
| Field | Type | Required | Description |
|---|---|---|---|
| template.name | string | Yes | Name of the prompt template |
| template.version_number | integer | No | Specific version number. Uses latest if omitted |
| template.label | string | No | Release label to use, e.g. “production” |
| prompt_template_variable_mappings | object | Yes | Maps template input variables to dataset/column names |
| engine | object | No | Override the template’s default model settings |
| engine.provider | string | No | Provider name, e.g. “openai”, “anthropic” |
| engine.model | string | No | Model name, e.g. “gpt-4”, “claude-3-opus” |
| engine.parameters | object | No | Model parameters like temperature, max_tokens |
The prompt_template_variable_mappings object maps prompt input variables (keys) to dataset or column names (values). The key is the variable name in your prompt template (e.g., {{question}}), and the value is where to get the data from. If your prompt template has the variables {{company}}, {{product}}, and {{query}}, map each one:
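A sketch of that mapping, assuming dataset columns named company_name, product_name, and user_query:

```json
{
  "prompt_template_variable_mappings": {
    "company": "company_name",
    "product": "product_name",
    "query": "user_query"
  }
}
```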
CODE_EXECUTION
Executes custom Python or JavaScript code. The code receives a data dictionary containing all column values for the current row.

| Field | Type | Required | Description |
|---|---|---|---|
| code | string | Yes | The code to execute |
| language | string | Yes | “PYTHON” or “JAVASCRIPT” |
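As a sketch, a Python snippet for the code field might look like the following. The column names ("response", "expected") are illustrative, and how the runtime consumes the snippet's result may differ from the plain print shown here:

```python
# The runtime injects a `data` dict with all column values for the current row.
# It is hard-coded here only so the sketch is self-contained.
data = {"response": "Paris is the capital of France.", "expected": "Paris"}

# Case-insensitive check that the expected answer appears in the model response.
result = data["expected"].lower() in data["response"].lower()
print(result)
```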
ENDPOINT
Calls an external HTTP endpoint. The request body contains all column values for the current row.
| Field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | The HTTP endpoint URL |
| headers | object | No | HTTP headers to include |
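A sketch of an ENDPOINT configuration, with placeholder URL and header values; as described above, the request body sent to the endpoint contains all column values for the row:

```json
{
  "url": "https://api.example.com/evaluate",
  "headers": {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  }
}
```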
WORKFLOW
Runs a PromptLayer workflow.
| Field | Type | Required | Description |
|---|---|---|---|
| workflow_id | integer | Yes | ID of the workflow to run |
| workflow_version_number | integer | No | Specific version. Uses latest if omitted |
| workflow_label | string | No | Release label to use |
| input_mappings | object | Yes | Maps workflow inputs to column names |
MCP
Executes an MCP (Model Context Protocol) action.
| Field | Type | Required | Description |
|---|---|---|---|
| mcp_server_id | integer | Yes | ID of the MCP server |
| tool_name | string | Yes | Name of the tool to call |
| input_mappings | object | Yes | Maps tool inputs to column names |
HUMAN
Adds a column for manual human evaluation.
| Field | Type | Required | Description |
|---|---|---|---|
| data_type | string | Yes | “number” or “string” |
| ui_element | object | Yes | UI configuration for the input |
CONVERSATION_SIMULATOR
Simulates multi-turn conversations to test chatbots and conversational agents. An AI-powered user persona engages in realistic dialogue with your prompt template, allowing you to evaluate how well your agent handles extended interactions.
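A sketch of a basic configuration with a static persona; the template name and persona text are illustrative, and the exact wrapper shape of a column definition may differ:

```json
{
  "template": { "name": "support-agent" },
  "prompt_template_variable_mappings": {},
  "user_persona": "You are a frustrated customer whose order arrived damaged. You want a refund but will accept a replacement if it ships quickly.",
  "is_user_first": true,
  "max_turns": 10
}
```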
For comprehensive testing, store different user personas in your dataset to test various scenarios, and reference them with user_persona_source.
| Field | Type | Required | Description |
|---|---|---|---|
| template.name | string | Yes | Name of the prompt template to test |
| template.version_number | integer | No | Specific version number |
| template.label | string | No | Release label to use |
| prompt_template_variable_mappings | object | Yes | Maps template input variables to dataset columns |
| user_persona | string | Conditional | Static persona description. Required if user_persona_source not set |
| user_persona_source | string | Conditional | Column name containing the persona. Required if user_persona not set |
| conversation_completed_prompt | string | No | Guidance for when to consider the conversation complete (e.g., “End when the user confirms their order” or “Complete when the assistant calls the submit_order tool”) |
| conversation_completed_prompt_source | string | No | Column name containing the completion guidance. Use instead of conversation_completed_prompt for dynamic guidance |
| is_user_first | boolean | No | If true, simulated user sends the first message (default: false) |
| max_turns | integer | No | Maximum conversation turns (default: system setting, max: 150) |
| conversation_samples | array | No | Example conversations to guide the simulation style |
The user_persona defines how the simulated user behaves - their goals, communication style, and what questions they ask. Use user_persona_source to pull different personas from your dataset for varied test scenarios.

The conversation_completed_prompt provides explicit guidance for determining when a conversation should end. This is useful for defining specific end conditions like tool calls, confirmation messages, or goal achievement. The guidance can be holistic (general rules) or specific (look for a certain phrase or tool call).

For example, your dataset might have a test_persona column with different personas:
- Row 1: “You are a busy executive who needs quick answers. Be impatient if responses are too long.”
- Row 2: “You are a technical user who asks detailed follow-up questions about implementation.”
- Row 3: “You are price-sensitive and keep asking about discounts and alternatives.”
Use conversation_completed_prompt to define specific end conditions for your conversations, or conversation_completed_prompt_source to pull completion guidance from your dataset, e.g. a completion_condition column with different end conditions:
- Row 1: “End when the user says ‘thank you’ or indicates satisfaction”
- Row 2: “Complete when the assistant provides a ticket number”
- Row 3: “End when the refund_process tool is called”
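To vary behavior per row, a sketch that pulls both the persona and the completion guidance from dataset columns like those in the examples above (test_persona, completion_condition); the template name is illustrative:

```json
{
  "template": { "name": "support-agent" },
  "prompt_template_variable_mappings": {},
  "user_persona_source": "test_persona",
  "conversation_completed_prompt_source": "completion_condition"
}
```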
After the simulation runs, you can add an LLM_ASSERTION column to evaluate the full conversation.

Evaluation Types
These columns evaluate or compare data and typically return boolean or numeric scores.
LLM_ASSERTION
Uses an LLM to evaluate content against a natural language prompt. Returns a boolean indicating pass/fail.
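A sketch of an assertion with a static prompt; the source column name and assertion text are illustrative:

```json
{
  "source": "model_response",
  "prompt": "The response answers the user's question without making up facts."
}
```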
| Field | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | Column name containing the content to evaluate |
| prompt | string | Conditional | The assertion prompt. Required if prompt_source not set |
| prompt_source | string | Conditional | Column name containing the prompt. Required if prompt not set |
Use prompt_source to pull assertion prompts from a dataset column. This lets you define different assertions per row, for example via an assertions column containing the prompt text for each row.

Multiple assertions per row: you can run multiple assertions against the same content by providing a JSON array of prompts, for example an llm_assertions column whose cells contain a JSON array. Each assertion is evaluated independently, and the results are returned as a dictionary with each assertion as a key and its boolean result as the value.
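For example, with the column configured to read prompts dynamically (column names illustrative, e.g. source set to model_response and prompt_source set to llm_assertions), a row’s llm_assertions cell could hold:

```json
[
  "The response is polite.",
  "The response answers the user's question.",
  "The response does not mention competitors."
]
```

That row’s output would then be a dictionary keyed by each of the three assertions, with a boolean result for each.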
COMPARE
Compares two values for equality. Supports string comparison and JSON comparison with optional JSONPath.
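A sketch of a JSON comparison that extracts a field before comparing; the column names and path are illustrative:

```json
{
  "sources": ["extracted_output", "expected_output"],
  "comparison_type": {
    "type": "JSON",
    "json_path": ".order.status"
  }
}
```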
| Field | Type | Required | Description |
|---|---|---|---|
| sources | array | Yes | Array of exactly 2 column names to compare |
| comparison_type.type | string | Yes | “STRING” or “JSON” |
| comparison_type.json_path | string | No | JSONPath to extract before comparing. Only for JSON type |
CONTAINS
Checks if a value contains a substring (case-insensitive).
| Field | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | Column name to search in |
| value | string | Conditional | Static substring to find. Required if value_source not set |
| value_source | string | Conditional | Column name containing the substring. Required if value not set |
REGEX
Tests if content matches a regular expression pattern. Returns boolean.
| Field | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | Column name to test |
| regex_pattern | string | Yes | Regular expression pattern |
COSINE_SIMILARITY
Calculates semantic similarity between two texts using embeddings. Returns a float between 0 and 1.
| Field | Type | Required | Description |
|---|---|---|---|
| sources | array | Yes | Array of exactly 2 column names to compare |
ABSOLUTE_NUMERIC_DISTANCE
Calculates the absolute difference between two numeric values.
| Field | Type | Required | Description |
|---|---|---|---|
| sources | array | Yes | Array of exactly 2 column names containing numbers |
AI_DATA_EXTRACTION
Uses an LLM to extract specific information from content based on a natural language query.
| Field | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | Column name containing the content |
| query | string | Yes | Natural language description of what to extract |
Extraction Types
These columns extract or parse data from other columns.
JSON_PATH
Extracts data from JSON using JSONPath expressions.
| Field | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | Column name containing JSON data |
| json_path | string | Yes | JSONPath expression (e.g., “.items[0].name”) |
| return_first_match | boolean | No | Return only first match (default: true) or all matches |
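A sketch that pulls the first item’s name out of a JSON response column; the source column name is illustrative, and the path reuses the example from the table above:

```json
{
  "source": "model_response",
  "json_path": ".items[0].name",
  "return_first_match": true
}
```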
XML_PATH
Extracts data from XML using XPath expressions.
| Field | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | Column name containing XML data |
| xml_path | string | Yes | XPath expression |
| type | string | No | “find” for first match or “findall” for all matches. Default: “find” |
| return_text | boolean | No | Return text content only or full XML. Default: true |
REGEX_EXTRACTION
Extracts content matching a regular expression pattern. Returns an array of all matches.
| Field | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | Column name to extract from |
| regex_pattern | string | Yes | Regular expression pattern |
PARSE_VALUE
Parses and converts a value to a specific type.
| Field | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | Column name to parse |
| type | string | Yes | Target type: “string”, “number”, “boolean”, or “object” |
Transformation Types
These columns transform, combine, or validate data.
VARIABLE
Creates a static value that can be referenced by other columns.
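A sketch of a JSON variable (the value is illustrative):

```json
{
  "value": {
    "type": "json",
    "value": { "score_threshold": 0.8, "allowed_labels": ["billing", "shipping"] }
  }
}
```

A string variable has the same shape, with value.type set to “string” and a plain string in value.value.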
| Field | Type | Required | Description |
|---|---|---|---|
| value.type | string | Yes | “string” or “json” |
| value.value | any | Yes | The static value |
ASSERT_VALID
Validates that data is in a valid format. Returns boolean.
| Field | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | Column name to validate |
| type | string | Yes | Expected format: “object” for valid JSON, “number”, or “sql” |
COALESCE
Returns the first non-null value from multiple sources.
| Field | Type | Required | Description |
|---|---|---|---|
| sources | array | Yes | Array of column names, minimum 2 |
COMBINE_COLUMNS
Combines multiple column values into a single dictionary object.
| Field | Type | Required | Description |
|---|---|---|---|
| sources | array | Yes | Array of column names to combine |
COUNT
Counts occurrences in text content.
| Field | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | Column name to count in |
| type | string | Yes | What to count: “chars”, “words”, “sentences”, or “paragraphs” |
MATH_OPERATOR
Performs numeric comparisons. Returns boolean.
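A sketch comparing a numeric column against a static threshold; the column name and value are illustrative:

```json
{
  "sources": ["similarity_score"],
  "operator": "ge",
  "value": 0.8
}
```

To compare two columns instead, list both column names in sources and omit value.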
| Field | Type | Required | Description |
|---|---|---|---|
| sources | array | Yes | Array with first source column, and optionally second source column |
| operator | string | Yes | Comparison operator: “lt” for less than, “le” for less or equal, “gt” for greater than, “ge” for greater or equal |
| value | number | Conditional | Static value to compare against. Required if second source not provided |
MIN_MAX
Finds the minimum or maximum value from an array or JSON structure.
| Field | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | Column name containing the data |
| type | string | Yes | “min” or “max” |
| json_path | string | No | JSONPath to extract values from, if source is JSON |

