In this tutorial, we will guide you through the process of data-driven prompt engineering.

PromptLayer provides a unified interface for working with different language models, making and scoring requests, and tracking your prompts and requests. It supports a variety of models from providers such as OpenAI, Anthropic, and Hugging Face.

Whether you’re a data scientist, a machine learning engineer, or a developer, PromptLayer can help you manage your language models more effectively and efficiently. Let’s get started!

Setting Up Your Environment

Before we get started, we need to load our environment variables from a .env file. This file should contain your API keys for PromptLayer and OpenAI.

Get your PromptLayer API key by signing up through our dashboard.

We can load these variables using the dotenv package:

Once we have loaded our environment variables, we can import promptlayer and set our PromptLayer API key. If you are using Python, make sure to pip install promptlayer first.

Making Your First Request

PromptLayer is, at its core, a REST API. Using our Python SDK is equivalent to making requests to our API directly, just a little easier.

Because latency is so important, the best way to use PromptLayer is to first make your request to OpenAI and then log the request to PromptLayer. This is how our Python SDK works under the hood.

If you are used to working with the Python SDK, all you will need to do is swap out your import openai for openai = promptlayer_client.openai. The rest of your code stays the same!

In this step we’ll make a simple request to the OpenAI GPT-3 engine to generate a response for the prompt “My name is”.

The response you’ll see should be a continuation of the prompt, such as “John. Nice to meet you, John.”

Refresh the dashboard and voilà! ✨ [Request log screenshot]

Enriching Requests

Enriching requests often requires a PromptLayer request ID. All PromptLayer requests have unique IDs, and these can be optionally returned when logging the request.

Tagging a Request

We can also add tags (pl_tags) to our requests to make it easier to search through and organize requests on the dashboard.

Learn more about tags.

Filter by tags on the dashboard as seen below. [Tags filtering screenshot]

Scoring a Request

Using PromptLayer, we can score a request with an integer from 0 to 100. This is most often used to understand how effective certain prompts are in production.

Scores are used in many ways (learn more). Below are some examples:

  • 100 if the generated code compiles, 0 if not
  • 100 if the user denotes a thumbs-up, 0 for thumbs-down
  • LLM synthetic evaluation of how much the output matched the prompt

Here, we ask for the capital of New York, and then score the response based on whether it contains the correct answer.

To set the score, we make a second request to the PromptLayer API with the request_id and a score.

Scores can also be set visually in the dashboard. [Scoring screenshot]

Adding Metadata

We can add metadata to a request to store additional information about it. Metadata is a map of string keys to string values.

Metadata is used to associate requests with specific users, track rollouts, and to store things like error messages (maybe from generated code). You can then filter requests & analytics on the PromptLayer dashboard using metadata keys.

Learn more about metadata.

Here, we make a request to rate how much a person would enjoy a city based on their interests, and then add metadata such as the user’s ID and location:

Now that you have successfully enriched requests with tags and metadata, you can use these features to better sort through requests in the dashboard.

The Analytics page shows high-level graphs and statistics about your usage. You can use metadata keys or tags to filter analytics.

You can also take advantage of our advanced search by using metadata to search in the sidebar.

Prompt Templates

Creating a Prompt in the Registry

We can create a prompt in the PromptLayer Prompt Registry. This allows us to easily reuse this prompt in the future:

After creating a prompt template, we can retrieve it programmatically using the API. The Prompt Registry is often used as a prompt template CMS, so that prompt changes are not blocked on engineering rollouts.

The Prompt Registry handles versioning: just edit the prompt visually in the dashboard to save a new version. As you can see below, we can retrieve the latest prompt or a specific stable version.

Linking a Prompt to a Request

The Prompt Registry becomes most useful when you start linking requests with prompt template versions. This makes it easy to compare prompt templates across latency, cost, and quality. It also lets you easily understand the input variables and how they change.

Once a prompt is in the registry, we can link it to a request (learn more):

Prompt Template Evaluation

Now that you have created multiple versions of a prompt template and associated them with request logs, navigate back to the Prompt Registry to find statistics about each version.

PromptLayer lets you compare prompt templates across score, latency, and cost. You can also easily see which requests used which templates. [Prompt template stats screenshot]

Using Different Models

In addition to those provided by OpenAI, PromptLayer supports many other providers and model types.


Here’s an example of how to use a chat model from OpenAI:


We can also use models from Anthropic natively with the PromptLayer Python SDK:


Here’s an example of using the Falcon-7b model from HuggingFaceHub. By using LangChain with the PromptLayerCallbackHandler, you can access tons of LLMs. Learn more.

from langchain.callbacks import PromptLayerCallbackHandler
from langchain import HuggingFaceHub

falcon = "tiiuae/falcon-7b-instruct"

llm = HuggingFaceHub(
    repo_id=falcon,
    model_kwargs={"temperature": 1.0, "max_length": 64},
    callbacks=[PromptLayerCallbackHandler(pl_tags=["falcon-7b"])],
)
request = llm("How do you make a layer cake?")

And that’s it! With this tutorial, you should now be able to use PromptLayer to work with different language models, make and score requests, and track your prompts and requests. Enjoy using PromptLayer!