Did you read Quickstart Part One?
Streaming
Streaming responses using promptlayer_client.run is easy. Streaming allows the API to return responses incrementally, delivering partial outputs as they’re generated rather than waiting for the complete response. This can significantly improve the perceived responsiveness of your application.
Each streamed chunk includes a progressively built prompt_blueprint, allowing you to track how the response is constructed in real-time. The request_id is only included in the final chunk, indicating completion of the streaming response.
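A minimal sketch of streaming with promptlayer_client.run, assuming a prompt template named "ai-poet" with an input variable {topic} (both names are illustrative):

```python
import os
from promptlayer import PromptLayer

promptlayer_client = PromptLayer(api_key=os.environ["PROMPTLAYER_API_KEY"])

# Enable streaming by passing stream=True; run() then yields chunks.
response = promptlayer_client.run(
    prompt_name="ai-poet",                  # illustrative template name
    input_variables={"topic": "the sea"},   # illustrative input variable
    stream=True,
)

for chunk in response:
    # Each chunk carries a progressively built prompt_blueprint.
    print(chunk["prompt_blueprint"])

    # request_id is only populated on the final chunk.
    if chunk.get("request_id"):
        print("Stream complete, request_id:", chunk["request_id"])
```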
Learn more about OpenAI streams.
Organization
Workspaces
Shared workspaces allow you to collaborate with the rest of the team. Use separate workspaces to organize work between projects or deployment environments. We commonly see teams with “Prod” and “Dev” workspaces. Each workspace has its own unique PromptLayer API key, allowing you to maintain separate authentication for different environments.
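For example, you might select the workspace-specific key based on the environment you are running in. A minimal sketch, assuming the environment-variable names are your own (they are illustrative, not required by PromptLayer):

```python
import os
from promptlayer import PromptLayer

# Each workspace has its own PromptLayer API key; pick the one for the
# current deployment environment (variable names here are illustrative).
if os.environ.get("APP_ENV") == "prod":
    api_key = os.environ["PROMPTLAYER_API_KEY_PROD"]
else:
    api_key = os.environ["PROMPTLAYER_API_KEY_DEV"]

promptlayer_client = PromptLayer(api_key=api_key)
```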
Prompt branching
Use “Duplicate” or “Copy a version” to organize and branch your work. You can use this to duplicate a prompt into another workspace or to pop out a version into a brand new prompt template.
Switching models
An important part of prompt engineering is finding the right model. PromptLayer makes it easy to switch between language models and test them out. To switch between different models for your prompts, we recommend using our Prompt Blueprint feature.
Prompt Blueprint
Prompt Blueprint is a model-agnostic data format that allows you to update models in PromptLayer without changing any code. Instead of using response["raw_response"] to access the LLM response (as done in earlier code snippets), we recommend using the standardized response["prompt_blueprint"].
Using it looks something like this:
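A minimal sketch, assuming a chat-style prompt template named "ai-poet" (the template name, input variable, and message indexing are illustrative):

```python
response = promptlayer_client.run(
    prompt_name="ai-poet",                  # illustrative template name
    input_variables={"topic": "the sea"},
)

# Model-agnostic access: read the last message from the standardized
# prompt_blueprint instead of the provider-specific raw_response.
last_message = response["prompt_blueprint"]["prompt_template"]["messages"][-1]
print(last_message["content"][0]["text"])
```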
The prompt_blueprint response format matches the prompt_template return type of get-prompt-template.
Migrating prompts
PromptLayer supports various models beyond OpenAI. You can easily switch between different models by updating the model parameter in your prompt template. For details on comparing models, see our blog post on migrating prompts to open source models.
If you're using a new model, make sure to add the new key to your .env file
For better security, you can use environment variables from a .env file. For more on setting up environment variables in Python, refer to this guide. If you are using Python, we recommend using python-dotenv. Create a Python file called app.py and load the environment variables:
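A minimal sketch of app.py, assuming your .env holds your PromptLayer and provider keys (the key names below are examples):

```python
# app.py
import os

from dotenv import load_dotenv  # pip install python-dotenv
from promptlayer import PromptLayer

# .env might contain, for example:
#   PROMPTLAYER_API_KEY=pl_...
#   OPENAI_API_KEY=sk-...
#   ANTHROPIC_API_KEY=sk-ant-...   # add the key for any new model provider
load_dotenv()

promptlayer_client = PromptLayer(api_key=os.environ["PROMPTLAYER_API_KEY"])
```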
Updating the Base URL
To use your own self-hosted models or those from providers like HuggingFace, add a custom base URL to your workspace. In settings, scroll to “Provider Base URLs”. Models must conform to one of the listed provider model-families.
Fine-tuning
PromptLayer makes it incredibly easy to build fine-tuned models. It’s specifically useful for fine-tuning a cheaper gpt-3.5-turbo model on more expensive gpt-4 historical request data.
Be warned, fine-tuning is hard to get right. We wrote a blog post on why most teams should not rely on fine-tuning.