JavaScript SDK
Official JavaScript/TypeScript SDK for interacting with the PromptLayer API from server-side runtimes.
Installation
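Install the SDK from npm (yarn and pnpm work equivalently):

```bash
npm install promptlayer
```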
Using the run Method (Recommended)
The easiest way to use PromptLayer is with the run() method. It fetches a prompt template from the Prompt Registry, executes it against your configured LLM provider, and logs the result — all in one call.
Your LLM API keys (OpenAI, Anthropic, etc.) are never sent to our servers. All LLM requests are made locally from your machine; PromptLayer just logs the request.
The run() method works with any provider configured in your prompt template (OpenAI, Anthropic, Google, and more). See the Run documentation for full details.
After making your first few requests, you should be able to see them in the PromptLayer dashboard!

Basic Usage
For any LLM provider you plan to use, you must set its corresponding API key as an environment variable (for example, OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY, etc.). The PromptLayer client does not support passing these keys directly in code; if the relevant environment variables are not set, requests to those LLM providers will fail.
Provider-Specific Configuration
Using Gemini models through Vertex AI
JavaScript SDK: Set these environment variables:
VERTEX_AI_PROJECT_ID="<google_cloud_project_id>"
VERTEX_AI_PROJECT_LOCATION="region"
GOOGLE_APPLICATION_CREDENTIALS="path/to/google_service_account_file.json"
Using Claude models through Vertex AI
JavaScript SDK: Set these environment variables:
GOOGLE_APPLICATION_CREDENTIALS="path/to/google_service_account_file.json"
CLOUD_ML_REGION="region"
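A minimal run() sketch (the prompt name "welcome-email" and its input variables are placeholders):

```javascript
import { PromptLayer } from "promptlayer";

// PROMPTLAYER_API_KEY authenticates with PromptLayer; provider keys such as
// OPENAI_API_KEY must also be set in the environment (see above).
const promptLayerClient = new PromptLayer({
  apiKey: process.env.PROMPTLAYER_API_KEY,
});

const response = await promptLayerClient.run({
  promptName: "welcome-email", // a template in your Prompt Registry
  inputVariables: { userName: "Ada" },
});

// The shape of raw_response depends on the provider behind the template.
console.log(response.request_id);
console.log(response.raw_response);
```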
Parameters
- prompt_name / promptName (str, required): The name of the prompt to run.
- prompt_version / promptVersion (int, optional): Specific version of the prompt to use.
- prompt_release_label / promptReleaseLabel (str, optional): Release label of the prompt (e.g., “prod”, “staging”).
- input_variables / inputVariables (Dict[str, Any], optional): Variables to be inserted into the prompt template.
- tags (List[str], optional): Tags to associate with this run.
- metadata (Dict[str, str], optional): Additional metadata for the run.
- model_parameter_overrides / modelParameterOverrides (Union[Dict[str, Any], None], optional): Model-specific parameter overrides.
- stream (bool, default=False): Whether to stream the response.
- provider (str, optional): The LLM provider to use (e.g., “openai”, “anthropic”, “google”). This is useful if you want to override the provider specified in the prompt template.
- model (str, optional): The model to use (e.g., “gpt-4o”, “claude-3-7-sonnet-latest”, “gemini-2.5-flash”). This is useful if you want to override the model specified in the prompt template.
Return Value
The method returns a dictionary (Python) or object (JavaScript) with the following keys:
- request_id: Unique identifier for the request.
- raw_response: The raw response from the LLM provider.
- prompt_blueprint: The prompt blueprint used for the request.
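For example, destructuring those keys from a run() call:

```javascript
const { request_id, raw_response, prompt_blueprint } = await promptLayerClient.run({
  promptName: "welcome-email",
  inputVariables: { userName: "Ada" },
});
```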
Advanced Usage
Streaming
To stream the response, set stream: true.
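A sketch, assuming the client from Basic Usage and that run() resolves to an async iterable of chunks when stream: true:

```javascript
const stream = await promptLayerClient.run({
  promptName: "welcome-email",
  inputVariables: { userName: "Ada" },
  stream: true,
});

for await (const chunk of stream) {
  // Each chunk carries the progressively built prompt_blueprint.
  console.log(chunk.prompt_blueprint);
  // request_id is only present on the final chunk.
  if (chunk.request_id) console.log("request:", chunk.request_id);
}
```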
Each streamed chunk includes the progressively built prompt_blueprint, allowing you to track how the response is constructed in real time. The request_id is only included in the final chunk.
Using Different Versions or Release Labels
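For example (the version number and label are placeholders):

```javascript
// Pin an exact version:
const pinned = await promptLayerClient.run({
  promptName: "welcome-email",
  promptVersion: 3,
});

// Or follow a release label:
const prod = await promptLayerClient.run({
  promptName: "welcome-email",
  promptReleaseLabel: "prod",
});
```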
Adding Tags and Metadata
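For example (the tag and metadata values are placeholders; metadata values must be strings):

```javascript
const response = await promptLayerClient.run({
  promptName: "welcome-email",
  inputVariables: { userName: "Ada" },
  tags: ["onboarding", "experiment-42"],
  metadata: { userId: "user_123", region: "us-east-1" },
});
```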
Overriding Model Parameters
You can also override provider and model at runtime to choose a different LLM provider or model than the one specified in the prompt template. PromptLayer will automatically build the correct llm_kwargs for the specified provider and model, filling in default values for that provider's parameters.
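A sketch (the parameter names accepted inside modelParameterOverrides depend on the provider):

```javascript
const response = await promptLayerClient.run({
  promptName: "welcome-email",
  inputVariables: { userName: "Ada" },
  // Swap the provider/model configured in the template:
  provider: "anthropic",
  model: "claude-3-7-sonnet-latest",
  // Override individual model parameters:
  modelParameterOverrides: { temperature: 0.2 },
});
```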
Running Workflows
Use runWorkflow() to execute a PromptLayer Workflow from the JavaScript SDK. Workflows are multi-step pipelines that can combine prompt, tool, code, and conditional nodes.
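A minimal sketch (the workflow name and inputs are placeholders):

```javascript
const result = await promptLayerClient.runWorkflow({
  workflowName: "content-pipeline",
  inputVariables: { topic: "dogs" },
  metadata: { source: "docs-example" },
});

// By default this resolves to the final output node's value.
console.log(result);
```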
Workflow Parameters
- workflowName (string, required): The Workflow name to run.
- inputVariables (object, optional): Variables to pass into the Workflow.
- metadata (object, optional): Metadata to attach to the Workflow run.
- workflowLabelName (string, optional): Label name for the Workflow version, such as “production”.
- workflowVersion (number, optional): Specific Workflow version number to run.
- returnAllOutputs (boolean, default=false): Whether to return outputs for every Workflow node.
Workflow Return Value
By default, runWorkflow() returns the final output node’s value. When returnAllOutputs is true, it returns an object keyed by node name, including each node’s status, value, errors, and whether the node is an output node.
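For example, with returnAllOutputs: true (the commented result shape is an assumption based on the description above):

```javascript
const outputs = await promptLayerClient.runWorkflow({
  workflowName: "content-pipeline",
  inputVariables: { topic: "dogs" },
  returnAllOutputs: true,
});

// Illustrative shape only; exact field names may differ:
// {
//   "Generate Draft": { status: "SUCCESS", value: "...", error: null, is_output_node: true },
//   "Fetch Context":  { status: "SUCCESS", value: "...", error: null, is_output_node: false }
// }
```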
SDK Cache
The PromptLayer JavaScript SDK supports an in-memory template cache to reduce fetch latency and improve resilience when the PromptLayer API has transient failures. Enable the cache when you want to:
- Reduce repeated template fetch latency
- Lower dependency on real-time PromptLayer API availability
- Continue serving recently known-good templates during temporary API issues
Enable it by setting cacheTtlSeconds when creating a client:
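For example (assuming the standard client constructor):

```javascript
import { PromptLayer } from "promptlayer";

const promptLayerClient = new PromptLayer({
  apiKey: process.env.PROMPTLAYER_API_KEY,
  cacheTtlSeconds: 120, // serve cached templates for up to 2 minutes
});
```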
How It Works
When the cache is enabled, templates.get() and run() use this flow:
- Return a fresh cached template if available.
- If cache is stale or missing, fetch from API and refresh cache.
- If API fetch fails with a transient error and a stale template exists, serve the stale template.
Stale fallback applies to transient API failures such as retryable HTTP errors (including 429 and 5xx) and network-level issues.
Important Behavior
- Cache is in-memory and process-local (not shared across machines/containers).
- Requests with metadataFilters or modelParameterOverrides bypass the cache.
- Publishing via templates.publish() invalidates the cache for that prompt name.
- Call promptLayerClient.invalidate("prompt-name") to clear one prompt from the cache.
- Call promptLayerClient.invalidate() to clear the full SDK cache.
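A quick sketch of manual invalidation (the prompt name is a placeholder):

```javascript
// After changing "welcome-email" outside this process:
promptLayerClient.invalidate("welcome-email");

// Or drop every cached template:
promptLayerClient.invalidate();
```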
Practical Guidance
- Start with cacheTtlSeconds between 60 and 300.
- Use a shorter TTL if your prompts change frequently.
- Use a longer TTL if your prompts are stable and lower latency matters most.
- Keep throwOnError: true if you want hard failures when no cache entry is available.
Custom Logging with logRequest
If you need more control — for example, using your own LLM client, a custom provider, or background processing — you can use logRequest to manually log requests to PromptLayer.
OpenAI Example
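A sketch using the official openai package. The logRequest field names and timestamp format below are assumptions modeled on PromptLayer's log-request endpoint; check the logRequest reference for the exact schema:

```javascript
import OpenAI from "openai";
import { PromptLayer } from "promptlayer";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const promptLayerClient = new PromptLayer({
  apiKey: process.env.PROMPTLAYER_API_KEY,
});

const requestStartTime = Date.now(); // assumed epoch-ms; verify the expected unit
const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Write a haiku about logging." }],
});
const requestEndTime = Date.now();

// Assumed field names; see the logRequest reference for the exact schema.
await promptLayerClient.logRequest({
  provider: "openai",
  model: "gpt-4o",
  input: {
    type: "chat",
    messages: [
      { role: "user", content: [{ type: "text", text: "Write a haiku about logging." }] },
    ],
  },
  output: {
    type: "chat",
    messages: [
      {
        role: "assistant",
        content: [{ type: "text", text: completion.choices[0].message.content ?? "" }],
      },
    ],
  },
  requestStartTime,
  requestEndTime,
});
```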
Anthropic Example
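The same pattern with the @anthropic-ai/sdk package (again, the logRequest field names are assumptions):

```javascript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const requestStartTime = Date.now();
const message = await anthropic.messages.create({
  model: "claude-3-7-sonnet-latest",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Write a haiku about logging." }],
});
const requestEndTime = Date.now();

// Anthropic returns an array of content blocks; pull the first text block.
const outputText =
  message.content[0].type === "text" ? message.content[0].text : "";

// Assumed field names; see the logRequest reference for the exact schema.
await promptLayerClient.logRequest({
  provider: "anthropic",
  model: "claude-3-7-sonnet-latest",
  input: {
    type: "chat",
    messages: [
      { role: "user", content: [{ type: "text", text: "Write a haiku about logging." }] },
    ],
  },
  output: {
    type: "chat",
    messages: [
      { role: "assistant", content: [{ type: "text", text: outputText }] },
    ],
  },
  requestStartTime,
  requestEndTime,
});
```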
Error Handling
PromptLayer provides robust error handling with configurable error behavior for JavaScript/TypeScript applications.
Using throwOnError
By default, PromptLayer throws errors when API requests fail. You can control this behavior using the throwOnError parameter:
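A sketch; that throwOnError is a client-constructor option is an assumption based on the SDK Cache guidance above:

```javascript
// Default behavior: failed API requests throw, so wrap calls in try/catch.
try {
  const response = await promptLayerClient.run({ promptName: "welcome-email" });
} catch (err) {
  console.error("PromptLayer request failed:", err);
}

// Assumption: disable throwing at client construction to handle errors softly.
const lenientClient = new PromptLayer({
  apiKey: process.env.PROMPTLAYER_API_KEY,
  throwOnError: false,
});
```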
Automatic Retry Mechanism
PromptLayer includes a built-in retry mechanism using the industry-standard p-retry library to handle transient failures gracefully. This ensures your application remains resilient when temporary issues occur.

Retry Behavior:
- Total Attempts: 4 (1 initial + 3 retries)
- Exponential Backoff: retries wait progressively longer between attempts (2s, 4s, 8s)
- Max Wait Time: 15 seconds maximum wait between retries

Errors that trigger a retry:
- 5xx Server Errors: internal server errors, service unavailable, etc.
- 429 Rate Limit Errors: API rate-limit responses.
- Network Errors: connection failures (ENOTFOUND, ECONNREFUSED, ETIMEDOUT, etc.)

Errors that do not trigger a retry:
- 4xx Client Errors (other than 429): bad requests, authentication errors, not found, validation errors, etc.
The retry mechanism operates transparently in the background; you don’t need to implement retry logic yourself. PromptLayer handles it automatically for recoverable errors.
Logging
PromptLayer logs info to the console before each retry attempt. To capture these messages, watch console.info output or use a logging library that intercepts console methods.
Edge
PromptLayer can be used with Edge functions. Use the run() method, logRequest, or our REST API directly.

