OpenAI and compatible APIs

Installation

You need to install the openai library to be able to use the OpenAI API in Outlines. Or alternatively:

pip install "outlines[openai]"

OpenAI models

Outlines supports models available via the OpenAI Chat API, e.g. GPT-4o, ChatGPT and GPT-4. You can initialize the model by passing the model name to outlines.models.openai:

from outlines import models


model = models.openai("gpt-4o-mini")
model = models.openai("gpt-4o")

Check the OpenAI documentation for an up-to-date list of available models. You can pass any parameter you would pass to openai.AsyncOpenAI as keyword arguments:

import os
from outlines import models


model = models.openai(
    "gpt-4o-mini",
    api_key=os.environ["OPENAI_API_KEY"]
)

The following table enumerates the possible parameters. Refer to the OpenAI SDK's code for an up-to-date list.

Parameters:

Parameters	Type	Description	Default
`api_key`	`str`	OpenAI API key. Infered from `OPENAI_API_KEY` if not specified	`None`
`organization`	`str`	OpenAI organization id. Infered from `OPENAI_ORG_ID` if not specified	`None`
`project`	`str`	OpenAI project id. Infered from `OPENAI_PROJECT_ID` if not specified.	`None`
`base_url`	`str \| https.URL`	Base URL for the endpoint. Infered from `OPENAI_BASE_URL` if no specified.	`None`
`timeout`	`float`	Request timeout.	`NOT_GIVEN`
`max_retries`	`int`	Maximum number of retries for failing requests	`2`
`default_headers`	`Mapping[str, str]`	Default HTTP headers	`None`
`default_query`	`Mapping[str, str]`	Custom parameters added to the HTTP queries	`None`
`http_client`	`https.AsyncClient`	User-specified `httpx` client	`None`

Azure OpenAI models

Outlines also supports Azure OpenAI models:

from outlines import models


model = models.azure_openai(
    "azure-deployment-name",
    "gpt-4o-mini",
    api_version="2024-07-18",
    azure_endpoint="https://example-endpoint.openai.azure.com",
)

Why do I need to specify model and deployment name?

The model name is needed to load the correct tokenizer for the model. The tokenizer is necessary for structured generation.

You can pass any parameter you would pass to openai.AsyncAzureOpenAI. You can consult the OpenAI SDK's code for an up-to-date list.

Parameters:

Parameters	Type	Description	Default
`azure_endpoint`	`str`	Azure endpoint, including the resource. Infered from `AZURE_OPENAI_ENDPOINT` if not specified	`None`
`api_version`	`str`	API version. Infered from `AZURE_OPENAI_API_KEY` if not specified	`None`
`api_key`	`str`	OpenAI API key. Infered from `OPENAI_API_KEY` if not specified	`None`
`azure_ad_token`	`str`	Azure active directory token. Inference from `AZURE_OPENAI_AD_TOKEN` if not specified	`None`
`azure_ad_token_provider`	`AzureADTokenProvider`	A function that returns an Azure Active Directory token	`None`
`organization`	`str`	OpenAI organization id. Infered from `OPENAI_ORG_ID` if not specified	`None`
`project`	`str`	OpenAI project id. Infered from `OPENAI_PROJECT_ID` if not specified.	`None`
`base_url`	`str \| https.URL`	Base URL for the endpoint. Infered from `OPENAI_BASE_URL` if not specified.	`None`
`timeout`	`float`	Request timeout.	`NOT_GIVEN`
`max_retries`	`int`	Maximum number of retries for failing requests	`2`
`default_headers`	`Mapping[str, str]`	Default HTTP headers	`None`
`default_query`	`Mapping[str, str]`	Custom parameters added to the HTTP queries	`None`
`http_client`	`https.AsyncClient`	User-specified `httpx` client	`None`

Models that follow the OpenAI standard

Outlines supports models that follow the OpenAI standard. You will need to initialize the OpenAI client properly configured and pass it to outlines.models.openai

import os
from openai import AsyncOpenAI
from outlines import models
from outlines.models.openai import OpenAIConfig


client = AsyncOpenAI(
    api_key=os.environ.get("PROVIDER_KEY"),
    base_url="http://other.provider.server.com"
)
config = OpenAIConfig("model_name")
model = models.openai(client, config)

Warning

You need to pass the async client to be able to do batch inference.

Structured Generation Support

Outlines provides support for OpenAI Structured Outputs via outlines.generate.json, outlines.generate.choice

from pydantic import BaseModel, ConfigDict
import outlines.models as models
from outlines import generate

model = models.openai("gpt-4o-mini")

class Person(BaseModel):
    model_config = ConfigDict(extra='forbid')  # required for openai
    first_name: str
    last_name: str
    age: int

generator = generate.json(model, Person)
generator("current indian prime minister on january 1st 2023")
# Person(first_name='Narendra', last_name='Modi', age=72)

generator = generate.choice(model, ["Chicken", "Egg"])
print(generator("Which came first?"))
# Chicken

Warning

Structured generation support only provided to OpenAI-compatible endpoints which conform to OpenAI's standard. Additionally, generate.regex and generate.cfg are not supported.

Advanced configuration

For more advanced configuration option, such as support proxy, please consult the OpenAI SDK's documentation:

from openai import AsyncOpenAI, DefaultHttpxClient
from outlines import models
from outlines.models.openai import OpenAIConfig


client = AsyncOpenAI(
    base_url="http://my.test.server.example.com:8083",
    http_client=DefaultHttpxClient(
        proxies="http://my.test.proxy.example.com",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
)
config = OpenAIConfig("model_name")
model = models.openai(client, config)

It is possible to specify the values for seed, presence_penalty, frequence_penalty, top_p by passing an instance of OpenAIConfig when initializing the model:

from outlines.models.openai import OpenAIConfig
from outlines import models


config = OpenAIConfig(
    presence_penalty=1.,
    frequency_penalty=1.,
    top_p=.95,
    seed=0,
)
model = models.openai("gpt-4o-mini", config)

Monitoring API use

It is important to be able to track your API usage when working with OpenAI's API. The number of prompt tokens and completion tokens is directly accessible via the model instance:

from openai import AsyncOpenAI
import outlines.models


model = models.openai("gpt-4o")

print(model.prompt_tokens)
# 0

print(model.completion_tokens)
# 0

These numbers are updated every time you call the model.