Quickstart - Pt 2
Streaming
Streaming responses using promptlayer_client.run is easy.
Learn more about OpenAI streams.
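Here is a minimal sketch, assuming a prompt template named "ai-poet" with a "topic" input variable (both hypothetical names). The exact shape of each streamed chunk follows the underlying provider's streaming format:

```python
from promptlayer import PromptLayer

promptlayer_client = PromptLayer(api_key="pl_...")  # your PromptLayer API key

# Stream the response chunk by chunk instead of waiting for the full completion.
response = promptlayer_client.run(
    prompt_name="ai-poet",                 # hypothetical prompt template name
    input_variables={"topic": "sunsets"},  # hypothetical input variable
    stream=True,
)

for chunk in response:
    # Each chunk is a partial response; its fields mirror the provider's
    # streaming objects (e.g. OpenAI chat completion chunks).
    print(chunk)
```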
Organization
Workspaces
Shared workspaces allow you to collaborate with the rest of the team. Use separate workspaces to organize work between projects or deployment environments. We commonly see teams with “Prod” and “Dev” workspaces.
Prompt branching
Use “Duplicate” or “Copy a version” to organize and branch your work. You can use this to duplicate a prompt into another workspace or to pop out a version into a brand new prompt template.
Groups
Using groups, you can associate multiple request logs with each other. This makes searching and debugging much easier.
To do this, you must first create a group ID.
Then, when running the request, just pass in this group ID.
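A minimal sketch, assuming two hypothetical prompt templates ("search-query" and "answer-question") that belong to the same logical workflow:

```python
from promptlayer import PromptLayer

promptlayer_client = PromptLayer(api_key="pl_...")

# Create a group ID, then attach it to every related request.
group_id = promptlayer_client.group.create()

# Both requests carry the same group_id, so their logs appear together
# when searching and debugging in the dashboard.
search_results = promptlayer_client.run(
    prompt_name="search-query",
    input_variables={"question": "What is prompt engineering?"},
    group_id=group_id,
)
answer = promptlayer_client.run(
    prompt_name="answer-question",
    input_variables={"question": "What is prompt engineering?"},
    group_id=group_id,
)
```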
Switching models
An important part of prompt engineering is finding the right model. PromptLayer makes it easy to switch between language models and test them out.
Prompt Blueprint
Prompt Blueprint is a model-agnostic data format that allows you to update models in PromptLayer without changing any code.
Instead of using response["raw_response"] to access the LLM response (as done in earlier code snippets), we recommend using the standardized response["prompt_blueprint"].
Using it looks something like this:
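The sketch below assumes a hypothetical prompt template named "ai-poet"; the exact field nesting follows the prompt_blueprint schema referenced below:

```python
response = promptlayer_client.run(
    prompt_name="ai-poet",                # hypothetical prompt template name
    input_variables={"topic": "sunsets"},
)

# The prompt_blueprint response is model-agnostic, so this access pattern
# stays the same regardless of which provider the template runs on.
last_message = response["prompt_blueprint"]["prompt_template"]["messages"][-1]
print(last_message["content"][0]["text"])
```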
With the above code snippet, you can switch from OpenAI to Anthropic without any code changes.
For the exact schema, please look at the prompt_template return type of get-prompt-template.
Migrating prompts
PromptLayer supports various models beyond OpenAI. You can easily switch between different models by updating the model parameter in your prompt template.
For details on comparing models, see our blog post on migrating prompts to open source models.
Updating the Base URL
To use your own self-hosted models or those from providers like HuggingFace, add a custom base URL to your workspace. In settings, scroll to “Provider Base URLs”.
Models must conform to one of the listed provider model-families.
Base URLs will work locally and in the PromptLayer Playground.
Fine-tuning
PromptLayer makes it incredibly easy to build fine-tuned models. It's especially useful for fine-tuning a cheaper gpt-3.5-turbo model on historical request data from the more expensive gpt-4.
Be warned, fine-tuning is hard to get right. We wrote a blog post on why most teams should not rely on fine-tuning.
Advanced prompt engineering
A/B releases (Prompt A/B Testing)
A/B Releases allow you to test different versions of your prompts in production, safely roll out updates, and segment users. With this feature, you can split traffic between prompt versions based on percentages or user segments, enabling gradual rollouts and targeted testing. This functionality is powered by Dynamic Release Labels, which let you overload release labels and dynamically route traffic to different prompt versions based on your configuration. Learn more.
Batch jobs and datasets
In PromptLayer, you can build datasets by either uploading new data or utilizing historical data. This is a crucial step for running batch jobs and evaluating the performance of your prompts.
Learn more about datasets. PromptLayer provides tools to label or annotate data, or to build datasets with requests that you have previously logged.
Once your datasets are ready, you can use the evals page to run a batch job. In this context, the datasets serve as the input variables to the prompt for each run in the batch.
Backtests
Backtesting is the easiest way to evaluate your prompts, allowing you to assess how new prompt versions would have performed under past conditions. To perform backtests, start by building a dataset from your request history. This can be done in a few clicks on the Datasets page.
Once you have your dataset, the next step is to create an evaluation pipeline. This pipeline will feed historical request contexts into your new prompt version and compare the new results to the old results. You can use simple string comparisons or more advanced techniques like cosine similarities to measure differences. For detailed instructions, visit the backtesting section. Backtesting is an effective way to detect potential regressions and validate improvements, ensuring that updates enhance rather than detract from the user experience.
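For illustration, here is one generic way to score how far a new prompt version's output drifts from the historical output using TF-IDF cosine similarity. This is a standalone sketch, not a PromptLayer API; the evaluation pipeline can run comparisons like this for you:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def similarity(old_output: str, new_output: str) -> float:
    """Rough string-level similarity between two completions (1.0 = identical wording)."""
    vectors = TfidfVectorizer().fit_transform([old_output, new_output])
    return float(cosine_similarity(vectors[0], vectors[1])[0][0])

# Flag rows where the new prompt version drifts far from historical behavior.
score = similarity("The capital of France is Paris.", "Paris is the capital of France.")
print(f"similarity: {score:.2f}")
```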
Custom evals
For some prompts, it’s better to build tailored evaluation pipelines that meet your specific requirements. For example, you can use PromptLayer to build end-to-end RAG pipelines or unit test evaluations.
Custom evaluations can be integrated into your CI/CD pipeline to run on every new version of your prompt. Learn more about continuous integration and explore eval examples here. Evaluations provide a robust framework for continuously improving your prompts by rigorously testing them against a variety of scenarios and metrics.