# Python

<Card title="Python SDK" icon="github" href="https://github.com/MagnivOrg/prompt-layer-library?tab=readme-ov-file">
  Official Python SDK for interacting with the PromptLayer API.
</Card>

## Installation

```bash theme={null}
pip install promptlayer
```

## Using the `run` Method (Recommended)

The easiest way to use PromptLayer is with the `run()` method. It fetches a prompt template from the [Prompt Registry](/features/prompt-registry/new-overview), executes it against your configured LLM provider, and logs the result — all in one call.

```python theme={null}
from promptlayer import PromptLayer
promptlayer_client = PromptLayer()

response = promptlayer_client.run(
    prompt_name="my-prompt",
    input_variables={"topic": "poetry"},
    tags=["getting-started"],
    metadata={"user_id": "123"}
)

print(response["prompt_blueprint"]["prompt_template"]["messages"][-1]["content"])
```

<Info>Your LLM API keys (OpenAI, Anthropic, etc.) are **never** sent to our servers. All LLM requests are made locally from your machine; PromptLayer only logs the request.</Info>

The `run()` method works with any provider configured in your prompt template — OpenAI, Anthropic, Google, and more. See the [Run documentation](/sdks/python#using-the-run-method-recommended) for full details.

After making your first few requests, you should be able to see them in the PromptLayer dashboard!

<img src="https://mintcdn.com/promptlayer/jUVR1Bx755pIFGwB/images/prompt-in-dashboard.png?fit=max&auto=format&n=jUVR1Bx755pIFGwB&q=85&s=3ed96f6e53858aa99b16435f0edb124e" width="2000" height="1234" data-path="images/prompt-in-dashboard.png" />

### Basic Usage

<Note>
  For any LLM provider you plan to use, you must set its corresponding API key as an environment variable (for example, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GOOGLE_API_KEY` etc.). The PromptLayer client does not support passing these keys directly in code. If the relevant environment variables are not set, any requests to those LLM providers will fail.
</Note>

<Accordion title="Provider-Specific Configuration">
  #### Using Gemini models through Vertex AI

  **Python SDK**: Set these environment variables:

  * `GOOGLE_GENAI_USE_VERTEXAI=true`
  * `GOOGLE_CLOUD_PROJECT="<google_cloud_project_id>"`
  * `GOOGLE_CLOUD_LOCATION="region"`
  * `GOOGLE_APPLICATION_CREDENTIALS="path/to/google_service_account_file.json"`

  #### Using Claude models through Vertex AI

  **Python SDK**: Set these environment variables:

  * `ANTHROPIC_VERTEX_PROJECT_ID="<google_cloud_project_id>"`
  * `CLOUD_ML_REGION="region"`
  * `GOOGLE_APPLICATION_CREDENTIALS="path/to/google_service_account_file.json"`
</Accordion>

```python Python theme={null}
from promptlayer import PromptLayer

pl = PromptLayer(api_key="your_api_key")

response = pl.run(
    prompt_name="your-prompt-name",
    input_variables={"variable_name": "value"}
)

print(response["prompt_blueprint"]["prompt_template"]["messages"][-1]["content"][-1]["text"])
```

### Parameters

* `prompt_name` / `promptName` (str, required): The name of the prompt to run.
* `prompt_version` / `promptVersion` (int, optional): Specific version of the prompt to use.
* `prompt_release_label` / `promptReleaseLabel` (str, optional): Release label of the prompt (e.g., "prod", "staging").
* `input_variables` / `inputVariables` (Dict\[str, Any], optional): Variables to be inserted into the prompt template.
* `tags` (List\[str], optional): Tags to associate with this run.
* `metadata` (Dict\[str, str], optional): Additional metadata for the run.
* `model_parameter_overrides` / `modelParameterOverrides` (Union\[Dict\[str, Any], None], optional): Model-specific parameter overrides.
* `stream` (bool, default=False): Whether to stream the response.
* `provider` (str, optional): The LLM provider to use (e.g., "openai", "anthropic", "google"). This is useful if you want to override the provider specified in the prompt template.
* `model` (str, optional): The model to use (e.g., "gpt-4o", "claude-3-7-sonnet-latest", "gemini-2.5-flash"). This is useful if you want to override the model specified in the prompt template.

### Return Value

The method returns a dictionary with the following keys:

* `request_id`: Unique identifier for the request.
* `raw_response`: The raw response from the LLM provider.
* `prompt_blueprint`: The prompt blueprint used for the request.
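
The examples on this page read the final message in two ways: as a plain `content` value and as a list of content blocks with a `text` field. The sketch below covers both cases. `last_message_text` is a hypothetical convenience helper, not part of the SDK, and the sample dictionary is hand-built to mirror the shape used in the examples above.

```python
from typing import Any, Dict


def last_message_text(response: Dict[str, Any]) -> str:
    """Return the text of the last message in a run() response.

    Hypothetical helper, not part of the SDK. Assumes content is either a
    string or a list of blocks such as {"type": "text", "text": "..."}.
    """
    messages = response["prompt_blueprint"]["prompt_template"]["messages"]
    content = messages[-1]["content"]
    if isinstance(content, str):
        return content
    # Join the text of every block that carries a "text" field.
    return "".join(
        block.get("text", "") for block in content if isinstance(block, dict)
    )


# Hand-built response for illustration only; real ones come from pl.run().
sample_response = {
    "request_id": 123,
    "raw_response": None,
    "prompt_blueprint": {
        "prompt_template": {
            "messages": [
                {"role": "user", "content": [{"type": "text", "text": "Write a haiku."}]},
                {"role": "assistant", "content": [{"type": "text", "text": "Autumn moonlight"}]},
            ]
        }
    },
}

print(last_message_text(sample_response))
```

Reading from `prompt_blueprint` rather than `raw_response` keeps the helper independent of any one provider's response format.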

### Advanced Usage

#### Streaming

To stream the response:

```python Python theme={null}
for chunk in pl.run(prompt_name="your-prompt", stream=True):
    # Access raw streaming response
    print(chunk["raw_response"])

    # Access progressively built prompt blueprint
    if chunk["prompt_blueprint"]:
        current_response = chunk["prompt_blueprint"]["prompt_template"]["messages"][-1]
        if current_response.get("content"):
            print(f"Current response: {current_response['content']}")
```

When streaming is enabled, each chunk includes both the raw streaming response and the progressively built `prompt_blueprint`, allowing you to track how the response is constructed in real-time. The `request_id` is only included in the final chunk.
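
Because the `request_id` arrives only with the final chunk, it can help to drain the stream in one place and keep both the latest message content and the ID. The following is a minimal sketch under that assumption. `consume_stream` and `fake_stream` are hypothetical names, and the chunk values are invented so the example runs offline.

```python
from typing import Any, Dict, Iterable, Iterator, Optional, Tuple


def consume_stream(chunks: Iterable[Dict[str, Any]]) -> Tuple[Any, Optional[Any]]:
    """Drain a stream and return (latest message content, request_id)."""
    latest_content = None
    request_id = None
    for chunk in chunks:
        blueprint = chunk.get("prompt_blueprint")
        if blueprint:
            message = blueprint["prompt_template"]["messages"][-1]
            if message.get("content"):
                latest_content = message["content"]
        # The request_id is only expected on the final chunk.
        if chunk.get("request_id") is not None:
            request_id = chunk["request_id"]
    return latest_content, request_id


def fake_stream() -> Iterator[Dict[str, Any]]:
    """Stand-in for pl.run(..., stream=True); values are made up."""
    yield {"raw_response": "delta-1", "prompt_blueprint": None, "request_id": None}
    yield {
        "raw_response": "delta-2",
        "prompt_blueprint": {"prompt_template": {"messages": [{"content": "Hel"}]}},
        "request_id": None,
    }
    yield {
        "raw_response": "delta-3",
        "prompt_blueprint": {"prompt_template": {"messages": [{"content": "Hello"}]}},
        "request_id": 42,
    }


content, request_id = consume_stream(fake_stream())
print(content, request_id)
```

With a real client, the same function could be passed the generator returned by `pl.run(prompt_name="your-prompt", stream=True)`.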

#### Using Different Versions or Release Labels

```python Python theme={null}
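# Pin a specific version number...
response = pl.run(prompt_name="your-prompt", prompt_version=2)

# ...or target a release label instead.
response = pl.run(prompt_name="your-prompt", prompt_release_label="staging")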
```

#### Adding Tags and Metadata

```python Python theme={null}
response = pl.run(
    prompt_name="your-prompt",
    tags=["test", "experiment"],
    metadata={"user_id": "12345"}
)
```

#### Overriding Model Parameters

You can also override `provider` and `model` at runtime, which is useful when you want to run a prompt against a different provider or model than the one saved in its template. PromptLayer then returns the matching `llm_kwargs` for the specified provider and model, filled in with the default parameter values for that combination.

<Warning>
  **Provider-Specific Schema Notice**

  The `llm_kwargs` and `raw_response` objects have provider-specific structures that may change as LLM providers update their APIs. PromptLayer passes through the native format required by each provider.

  For stable, provider-agnostic prompt data, use `prompt_blueprint.prompt_template` instead of relying on the structure of provider-specific objects.
</Warning>

```python Python SDK theme={null}
response = pl.run(
    prompt_name="your-prompt",
    provider="openai",  # or "anthropic", "google", etc.
    model="gpt-4",  # or "claude-2", "gemini-1.5-pro", etc.
)
```

<Tip>
  Set both `model` and `provider` so that the request runs against the correct LLM provider with the correct parameters.
</Tip>

## Running Workflows

Use `run_workflow()` to execute a PromptLayer Workflow from the Python SDK. Workflows are multi-step pipelines that can combine prompt, tool, code, and conditional nodes.

```python Python theme={null}
from promptlayer import PromptLayer

pl = PromptLayer(api_key="your_api_key")

response = pl.run_workflow(
    workflow_id_or_name="Data Analysis Workflow",
    input_variables={"dataset_url": "https://example.com/data.csv"},
)

print(response)
```

### Workflow Parameters

* `workflow_id_or_name` (str or int, required): The Workflow name or ID to run.
* `input_variables` (Dict\[str, Any], optional): Variables to pass into the Workflow.
* `metadata` (Dict\[str, str], optional): Metadata to attach to the Workflow run.
* `workflow_label_name` (str, optional): Label name for the Workflow version, such as `"production"`.
* `workflow_version` (int, optional): Specific Workflow version number to run.
* `return_all_outputs` (bool, default=False): Whether to return outputs for every Workflow node.
* `timeout` (int or float, optional): Maximum time, in seconds, to wait for the Workflow to complete.

<Info>
  `workflow_name` is still supported for backward compatibility, but `workflow_id_or_name` is the preferred parameter.
</Info>

### Workflow Return Value

By default, `run_workflow()` returns the final output node's value. When `return_all_outputs=True`, it returns a dictionary keyed by node name, including each node's status, value, errors, and whether the node is an output node.

```python Python theme={null}
response = pl.run_workflow(
    workflow_id_or_name="Data Analysis Workflow",
    input_variables={"dataset_url": "https://example.com/data.csv"},
    metadata={"user_id": "12345"},
    workflow_label_name="production",
    return_all_outputs=True,
    timeout=300,
)
```

Example response with `return_all_outputs=True`:

```json theme={null}
{
  "Load Dataset": {
    "status": "SUCCESS",
    "value": "Loaded 100 rows",
    "error_message": null,
    "raw_error_message": null,
    "is_output_node": false
  },
  "Summarize Dataset": {
    "status": "SUCCESS",
    "value": "The dataset contains customer feedback grouped by region.",
    "error_message": null,
    "raw_error_message": null,
    "is_output_node": true
  }
}
```
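
A response in this shape can be reduced to the values you usually care about: the output node values and any nodes that did not succeed. The sketch below is one way to do that. `split_workflow_outputs` is a hypothetical helper, and treating every status other than `"SUCCESS"` as a problem is an assumption, since only `"SUCCESS"` appears in the example above.

```python
from typing import Any, Dict, List, Tuple


def split_workflow_outputs(
    all_outputs: Dict[str, Dict[str, Any]],
) -> Tuple[Dict[str, Any], List[str]]:
    """Return ({output node name: value}, [names of non-SUCCESS nodes])."""
    final_values = {
        name: node.get("value")
        for name, node in all_outputs.items()
        if node.get("is_output_node")
    }
    # Assumption: any status other than "SUCCESS" is worth surfacing.
    problem_nodes = [
        name for name, node in all_outputs.items() if node.get("status") != "SUCCESS"
    ]
    return final_values, problem_nodes


# Same shape as the example response above.
example_outputs = {
    "Load Dataset": {
        "status": "SUCCESS",
        "value": "Loaded 100 rows",
        "error_message": None,
        "raw_error_message": None,
        "is_output_node": False,
    },
    "Summarize Dataset": {
        "status": "SUCCESS",
        "value": "The dataset contains customer feedback grouped by region.",
        "error_message": None,
        "raw_error_message": None,
        "is_output_node": True,
    },
}

final_values, problem_nodes = split_workflow_outputs(example_outputs)
print(final_values)
print(problem_nodes)
```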

To run Workflows asynchronously, use `AsyncPromptLayer`. See [Async Workflow Execution](#example-2-async-workflow-execution) for an async example.

## SDK Cache

The PromptLayer Python SDK supports an in-memory template cache to reduce fetch latency and improve resilience when the PromptLayer API has transient failures.

Enable the cache when you want to:

* Reduce repeated template fetch latency
* Lower dependency on real-time PromptLayer API availability
* Continue serving recently known-good templates during temporary API issues

Pass `cache_ttl_seconds` when creating a client:

```python theme={null}
from promptlayer import PromptLayer

promptlayer_client = PromptLayer(
    api_key="pl_****",
    cache_ttl_seconds=300,  # each prompt template is cached for 5 minutes
)
```

The async client works the same way:

```python theme={null}
from promptlayer import AsyncPromptLayer

async_promptlayer_client = AsyncPromptLayer(
    api_key="pl_****",
    cache_ttl_seconds=300,
)
```

### How It Works

When cache is enabled, `templates.get()` and `run()` use this flow:

1. Return a fresh cached template if available.
2. If cache is stale or missing, fetch from API and refresh cache.
3. If API fetch fails with a transient error and a stale template exists, serve the stale template.

<Info>
  Stale fallback only applies to transient API errors (for example, timeout, connection, or internal server errors).
</Info>
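
The three-step flow above can be illustrated with a small, self-contained TTL cache. This is a sketch of the idea only, not the SDK's implementation. `StaleFallbackCache`, `TransientError`, and `fake_fetch` are hypothetical names, and a fake clock stands in for real time so the behavior is deterministic.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TransientError(Exception):
    """Stand-in for a timeout, connection, or internal server error."""


class StaleFallbackCache:
    """Illustrative TTL cache with stale fallback; not the SDK's code."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # name -> (stored_at, value)

    def get(self, name: str) -> Any:
        entry = self._entries.get(name)
        now = self._clock()
        # 1. Fresh hit: serve from cache.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        try:
            # 2. Stale or missing: fetch and refresh.
            value = self._fetch(name)
        except TransientError:
            # 3. Transient failure: serve the stale entry if one exists.
            if entry is not None:
                return entry[1]
            raise
        self._entries[name] = (now, value)
        return value


# Demo with a fake clock and a fetch function that can be made to fail.
now = [0.0]
state = {"fail": False, "calls": 0}


def fake_fetch(name: str) -> str:
    state["calls"] += 1
    if state["fail"]:
        raise TransientError("simulated outage")
    return f"{name}-v{state['calls']}"


cache = StaleFallbackCache(fake_fetch, ttl_seconds=300, clock=lambda: now[0])
first = cache.get("my-prompt")   # miss: fetched from the fake API
second = cache.get("my-prompt")  # fresh hit: no fetch
now[0] = 301.0
state["fail"] = True
third = cache.get("my-prompt")   # stale entry plus transient error: stale value served
print(first, second, third, state["calls"])
```

Note that a miss with no cached entry still raises, which matches the idea that stale fallback only helps once a known-good template has been fetched.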

### Important Behavior

* Cache is in-memory and process-local (not shared across machines/containers).
* Requests with `metadata_filters` or `model_parameter_overrides` bypass cache.
* Publishing via `templates.publish()` invalidates cache for that prompt name.

### Practical Guidance

* Start with `cache_ttl_seconds` between `60` and `300`.
* Use a shorter TTL if your prompts change frequently.
* Use a longer TTL if your prompts are stable and lower latency matters most.
* Keep `throw_on_error=True` if you want hard failures when no cache entry is available.

## Custom Logging with `log_request`

If you need more control — for example, using your own LLM client, a custom provider, or background processing — you can use `log_request` to manually log requests to PromptLayer.

```python theme={null}
from openai import OpenAI
from promptlayer import PromptLayer
import time

pl_client = PromptLayer()
client = OpenAI()

messages = [
    {"role": "system", "content": "You are an AI."},
    {"role": "user", "content": "Compose a poem please."}
]

request_start_time = time.time()
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)
request_end_time = time.time()

# Log to PromptLayer
pl_client.log_request(
    provider="openai",
    model="gpt-4o",
    input={"type": "chat", "messages": [
        {"role": m["role"], "content": [{"type": "text", "text": m["content"]}]}
        for m in messages
    ]},
    output={"type": "chat", "messages": [
        {"role": "assistant", "content": [{"type": "text", "text": completion.choices[0].message.content}]}
    ]},
    request_start_time=request_start_time,
    request_end_time=request_end_time,
    tags=["getting-started"]
)
```

This works with any LLM provider, including Anthropic:

```python theme={null}
import anthropic
from promptlayer import PromptLayer
import time

pl_client = PromptLayer()
client = anthropic.Anthropic()

messages = [{"role": "user", "content": "How many toes do dogs have?"}]

request_start_time = time.time()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=100,
    messages=messages
)
request_end_time = time.time()

# Log to PromptLayer
pl_client.log_request(
    provider="anthropic",
    model="claude-sonnet-4-20250514",
    input={"type": "chat", "messages": [
        {"role": m["role"], "content": [{"type": "text", "text": m["content"]}]}
        for m in messages
    ]},
    output={"type": "chat", "messages": [
        {"role": "assistant", "content": [{"type": "text", "text": response.content[0].text}]}
    ]},
    request_start_time=request_start_time,
    request_end_time=request_end_time,
    tags=["animal-toes"]
)
```

See the [Custom Logging documentation](/features/prompt-history/custom-logging) and [Log Request API Reference](/reference/log-request) for full details.
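
Both examples above wrap plain `{"role", "content"}` messages in the same chat structure before logging. That conversion can be factored into a small function, sketched below. `to_chat_payload` and `assistant_payload` are hypothetical helper names, not SDK functions, and they only handle plain-text content.

```python
from typing import Any, Dict, List


def to_chat_payload(messages: List[Dict[str, str]]) -> Dict[str, Any]:
    """Wrap plain text messages in the chat structure used in the examples above."""
    return {
        "type": "chat",
        "messages": [
            {"role": m["role"], "content": [{"type": "text", "text": m["content"]}]}
            for m in messages
        ],
    }


def assistant_payload(text: str) -> Dict[str, Any]:
    """Build the output payload for a single assistant reply."""
    return to_chat_payload([{"role": "assistant", "content": text}])


messages = [
    {"role": "system", "content": "You are an AI."},
    {"role": "user", "content": "Compose a poem please."},
]

print(to_chat_payload(messages))
print(assistant_payload("Roses are red..."))
```

The results could then be passed as the `input` and `output` arguments of `log_request`.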

## Error Handling

PromptLayer provides robust error handling with specialized exception classes and configurable error behavior.

### Exception Classes

The library exposes a dedicated exception type for each class of failure:

```python theme={null}
from promptlayer import (
    PromptLayerAPIError,              # General API errors
    PromptLayerBadRequestError,       # 400 errors
    PromptLayerAuthenticationError,   # 401 errors
    PromptLayerNotFoundError,         # 404 errors
    PromptLayerValidationError,       # Input validation errors
    PromptLayerAPIConnectionError,    # Connection failures
    PromptLayerAPITimeoutError,       # Timeout errors
    PromptLayerRateLimitError,        # 429 rate limit errors
)
```

### Using `throw_on_error`

By default, PromptLayer throws exceptions when errors occur. You can control this behavior using the `throw_on_error` parameter:

```python theme={null}
from promptlayer import PromptLayer

# Default behavior: throws exceptions on errors
promptlayer_client = PromptLayer(api_key="pl_****", throw_on_error=True)

# Alternative: logs warnings instead of throwing exceptions
promptlayer_client = PromptLayer(api_key="pl_****", throw_on_error=False)
```

**Example with exception handling:**

```python theme={null}
from promptlayer import PromptLayer, PromptLayerNotFoundError, PromptLayerValidationError

promptlayer_client = PromptLayer()

try:
    # Attempt to get a template that might not exist
    template = promptlayer_client.templates.get("NonExistentTemplate")
except PromptLayerNotFoundError as e:
    print(f"Template not found: {e}")
except PromptLayerValidationError as e:
    print(f"Invalid input: {e}")
```

**Example with warnings (throw\_on\_error=False):**

```python theme={null}
from promptlayer import PromptLayer

# Initialize with throw_on_error=False to get warnings instead of exceptions
promptlayer_client = PromptLayer(throw_on_error=False)

# This will log a warning instead of throwing an exception if the template doesn't exist
template = promptlayer_client.templates.get("NonExistentTemplate")
# Returns None if not found, with a warning logged
```

### Automatic Retry Mechanism

PromptLayer includes a built-in retry mechanism to handle transient failures gracefully. This ensures your application remains resilient when temporary issues occur.

**Retry Behavior:**

* **Total Attempts**: 4 attempts (1 initial + 3 retries)
* **Exponential Backoff**: Retries wait progressively longer between attempts (2s, 4s, 8s)
* **Max Wait Time**: 15 seconds maximum wait between retries

**What Triggers Retries:**

* **5xx Server Errors**: Internal server errors, service unavailable, etc.
* **429 Rate Limit Errors**: When rate limits are exceeded

**What Fails Immediately (No Retries):**

* **Connection Errors**: Network connectivity issues
* **Timeout Errors**: Request timeouts
* **4xx Client Errors** (except 429): Bad requests, authentication errors, not found, etc.

<Info>
  The retry mechanism operates transparently in the background. You don't need to implement retry logic yourself - PromptLayer handles it automatically for recoverable errors.
</Info>
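
To make the schedule concrete, the sketch below reproduces the documented numbers: four attempts in total, waits of 2, 4, and 8 seconds, and a 15-second cap. It illustrates the pattern only and is not the SDK's internal code. `RetryableError`, `backoff_schedule`, and `call_with_retries` are hypothetical names, and the sleep function is injectable so the demo does not actually wait.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Stand-in for a 5xx or 429 response."""


def backoff_schedule(retries: int = 3, base: float = 2.0, cap: float = 15.0) -> List[float]:
    """Waits before each retry: base * 2**i, capped at `cap` seconds."""
    return [min(base * (2 ** i), cap) for i in range(retries)]


def call_with_retries(func: Callable[[], T], sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying on RetryableError with exponential backoff."""
    waits = backoff_schedule()
    for attempt in range(len(waits) + 1):
        try:
            return func()
        except RetryableError:
            if attempt == len(waits):
                raise  # retries exhausted: surface the last error
            sleep(waits[attempt])
    raise AssertionError("unreachable")


# Demo: fail twice, then succeed; record waits instead of sleeping.
attempts = {"count": 0}
sleeps: List[float] = []


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RetryableError("simulated 503")
    return "ok"


result = call_with_retries(flaky, sleep=sleeps.append)
print(result, sleeps)
```

Errors that are not `RetryableError` propagate immediately, mirroring the "fails immediately" list above.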

### Logging

PromptLayer uses Python's built-in `logging` module for all log output:

```python theme={null}
import logging
from promptlayer import PromptLayer

# Configure logging to see PromptLayer logs
logging.basicConfig(level=logging.INFO)

promptlayer_client = PromptLayer()

# Now you'll see log output from PromptLayer operations
```

**Setting log levels:**

```python theme={null}
import logging

# Get the PromptLayer logger
logger = logging.getLogger("promptlayer")

# Set to WARNING to only see warnings and errors
logger.setLevel(logging.WARNING)

# Set to DEBUG to see detailed information
logger.setLevel(logging.DEBUG)
```
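
If you prefer to keep PromptLayer logs separate from the rest of your application's output, you can attach a dedicated handler to the `promptlayer` logger using only the standard library. The sketch below writes to an in-memory stream for demonstration, and the log records are simulated rather than emitted by the SDK.

```python
import io
import logging

# Capture "promptlayer" logs in a dedicated handler. An in-memory stream is
# used here; a logging.FileHandler would be attached the same way.
log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

pl_logger = logging.getLogger("promptlayer")
pl_logger.setLevel(logging.WARNING)
pl_logger.addHandler(handler)
pl_logger.propagate = False  # keep these records out of the root logger's output

# Simulated records, standing in for what the SDK would emit.
pl_logger.info("dropped: below WARNING")
pl_logger.warning("Retrying in 2 seconds...")

print(log_stream.getvalue())
```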

**Viewing Retry Logs:**

When retries occur, PromptLayer logs warnings before each retry attempt:

```python theme={null}
import logging
from promptlayer import PromptLayer

# Set up logging to see retry attempts
logging.basicConfig(
    level=logging.WARNING,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

promptlayer_client = PromptLayer()

# If a retry occurs, you'll see log messages like:
# "Retrying in 2 seconds..."
# "Retrying in 4 seconds..."
```

## Async Support

PromptLayer supports asynchronous operations, ideal for managing concurrent tasks in non-blocking environments like web servers, microservices, or Jupyter notebooks.

### Initializing the Async Client

To use asynchronous, non-blocking methods, initialize `AsyncPromptLayer` as shown:

```python theme={null}
from promptlayer import AsyncPromptLayer

# Initialize an asynchronous client with your API key
async_promptlayer_client = AsyncPromptLayer(api_key="pl_****")
```

### Async Usage Examples

The asynchronous client functions similarly to the synchronous version, but allows for non-blocking execution with `asyncio`. Below are example uses.

#### Example 1: Async Template Management

Use asynchronous methods to manage templates:

```python theme={null}
import asyncio
from promptlayer import AsyncPromptLayer

async def main():
    async_promptlayer_client = AsyncPromptLayer(api_key="pl_****")

    # Fetch a template asynchronously
    template = await async_promptlayer_client.templates.get("Test1")
    print(template)

    # Fetch all templates asynchronously
    templates = await async_promptlayer_client.templates.all()
    print(templates)

# Run the async function
asyncio.run(main())
```

#### Example 2: Async Workflow Execution

Run Workflows asynchronously for better efficiency:

```python theme={null}
import asyncio
from promptlayer import AsyncPromptLayer

async def main():
    async_promptlayer_client = AsyncPromptLayer(api_key="pl_****")

    response = await async_promptlayer_client.run_workflow(
        workflow_id_or_name="example_workflow",
        workflow_version=1,
        input_variables={"num1": "1", "num2": "2"},
        return_all_outputs=True,
    )
    print(response)

# Run the async function
asyncio.run(main())
```

#### Example 3: Async Tracking and Logging

Track and log requests asynchronously:

```python theme={null}
import asyncio
from promptlayer import AsyncPromptLayer

async def main():
    async_promptlayer_client = AsyncPromptLayer(api_key="pl_****")

    # Track metadata asynchronously
    request_id = "pl_request_id_example"
    await async_promptlayer_client.track.metadata(request_id, {"key": "value"})

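    # Placeholder input and output in the chat format expected by log_request
    prompt_template = {"type": "chat", "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Compose a poem please."}]}
    ]}
    output_template = {"type": "chat", "messages": [
        {"role": "assistant", "content": [{"type": "text", "text": "Roses are red..."}]}
    ]}

    # Log request asynchronously (for detailed logging, refer to the custom logging page)
    await async_promptlayer_client.log_request(
        provider="openai",
        model="gpt-3.5-turbo",
        input=prompt_template,
        output=output_template,
        request_start_time=1630945600,
        request_end_time=1630945605,
    )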

# Run the async function
asyncio.run(main())
```

For more information on custom logging, please visit our [Custom Logging Documentation](/features/prompt-history/custom-logging).

#### Example 4: Asynchronous Prompt Execution with run Method

You can execute prompt templates asynchronously using the run method. This allows you to run a prompt template by name with given input variables.

```python theme={null}
import asyncio
from promptlayer import AsyncPromptLayer

async def main():
    async_promptlayer_client = AsyncPromptLayer(api_key="pl_****")

    # Execute a prompt template asynchronously
    response = await async_promptlayer_client.run(
        prompt_name="TestPrompt",
        input_variables={"variable1": "value1", "variable2": "value2"}
    )
    print(response)

# Run the async function
asyncio.run(main())
```
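
A common reason to use the async client is to run one prompt against several inputs at once. The sketch below shows one way to do that with `asyncio.gather` and a semaphore that limits concurrency. `run_many` is a hypothetical helper, and `fake_run` is a stand-in for `AsyncPromptLayer.run` so the example runs without network access.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


async def run_many(
    run: Callable[..., Awaitable[Any]],
    prompt_name: str,
    variable_sets: List[Dict[str, Any]],
    max_concurrency: int = 5,
) -> List[Any]:
    """Run one prompt against several variable sets, a few at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(variables: Dict[str, Any]) -> Any:
        async with semaphore:
            return await run(prompt_name=prompt_name, input_variables=variables)

    # gather() returns results in the same order as variable_sets.
    return await asyncio.gather(*(run_one(v) for v in variable_sets))


async def fake_run(prompt_name: str, input_variables: Dict[str, Any]) -> str:
    """Stand-in for AsyncPromptLayer.run so the sketch runs offline."""
    await asyncio.sleep(0)
    return f"{prompt_name}:{input_variables['topic']}"


results = asyncio.run(
    run_many(fake_run, "TestPrompt", [{"topic": "a"}, {"topic": "b"}])
)
print(results)
```

With a real client, you would pass `async_promptlayer_client.run` in place of `fake_run`. The concurrency limit is a design choice to avoid overwhelming provider rate limits; tune it to your own quota.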

#### Example 5: Asynchronous Streaming Prompt Execution with run Method

You can also stream a prompt template's response with the `run` method.

```python theme={null}

import asyncio
import os
from promptlayer import AsyncPromptLayer


async def main():
    async_promptlayer_client = AsyncPromptLayer(
        api_key=os.environ.get("PROMPTLAYER_API_KEY")
    )

    response_generator = await async_promptlayer_client.run(
        prompt_name="TestPrompt",
        input_variables={"variable1": "value1", "variable2": "value2"},
        stream=True,
    )

    async for response in response_generator:
        # Access raw streaming response
        print("Raw streaming response:", response["raw_response"])
        
        # Access progressively built prompt blueprint
        if response["prompt_blueprint"]:
            current_response = response["prompt_blueprint"]["prompt_template"]["messages"][-1]
            if current_response.get("content"):
                print(f"Current response: {current_response['content']}")

# Run the async function
asyncio.run(main())
```

In this example, replace `"TestPrompt"` with the name of your prompt template, and provide any required input variables. When streaming is enabled, each chunk includes both the raw streaming response and the progressively built `prompt_blueprint`, allowing you to track how the response is constructed in real time.

***

Want to say hi 👋, submit a feature request, or report a bug? [✉️ Contact us](mailto:hello@promptlayer.com)