Regex

regex(model, regex_str, sampler=multinomial())

Generate structured text in the language of a regular expression.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | | An instance of `Transformer` that represents a model from the `transformers` library. | required |
| regex_str | str | The regular expression that the output must follow. | required |
| sampler | Sampler | The sampling algorithm to use to generate token ids from the logits distribution. | multinomial() |

Returns:

| Type | Description |
| --- | --- |
| `SequenceGeneratorAdapter` | A `SequenceGeneratorAdapter` instance that generates text constrained by the regular expression. |
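
A minimal usage sketch follows, assuming an Outlines release that still exposes `outlines.generate.regex` together with `outlines.models.transformers` and `outlines.samplers.greedy`. The date pattern and the helper names `generate_date` and `matches_pattern` are illustrative and not part of the library. The model name is left as a parameter because this page does not prescribe one.

```python
import re

# Illustrative pattern: an ISO-style date such as 2024-01-31.
DATE_PATTERN = r"\d{4}-\d{2}-\d{2}"


def generate_date(model_name: str, prompt: str) -> str:
    """Hypothetical helper: generate text constrained to DATE_PATTERN."""
    # Imported lazily so the rest of this snippet loads without outlines.
    from outlines import generate, models, samplers

    model = models.transformers(model_name)
    # Override the default multinomial() sampler with greedy decoding.
    generator = generate.regex(model, DATE_PATTERN, sampler=samplers.greedy())
    return generator(prompt)


def matches_pattern(text: str) -> bool:
    """Check a generated string against the same pattern."""
    return re.fullmatch(DATE_PATTERN, text) is not None
```

`matches_pattern` is a convenient sanity check in tests: a completed generation is expected to satisfy it, while a generation cut short by a token limit may not.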
Source code in outlines/generate/regex.py
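from functools import singledispatch

from outlines.generate.api import SequenceGeneratorAdapter
from outlines.samplers import Sampler, multinomial


@singledispatch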
def regex(model, regex_str: str, sampler: Sampler = multinomial()):
    """Generate structured text in the language of a regular expression.

    Parameters
    ----------
    model:
        An instance of `Transformer` that represents a model from the
        `transformers` library.
    regex_str:
        The regular expression that the output must follow.
    sampler:
        The sampling algorithm to use to generate token ids from the logits
        distribution.

    Returns
    -------
    A `SequenceGeneratorAdapter` instance that generates text constrained by the
    regular expression.

    """
    from outlines.processors import RegexLogitsProcessor

    logits_processor = RegexLogitsProcessor(regex_str, tokenizer=model.tokenizer)
    return SequenceGeneratorAdapter(model, logits_processor, sampler)
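
The general idea behind constraining generation with a logits processor can be sketched with a toy example. This is a conceptual illustration only and does not reproduce the internals of `RegexLogitsProcessor`: the hand-written automaton for `[0-9]{2}`, the six-token vocabulary, and the fixed scores standing in for model logits are all assumptions made for the sketch.

```python
DIGITS = "0123456789"

# Hand-built DFA for the pattern [0-9]{2}: state -> {char: next state}.
DFA = {
    0: {d: 1 for d in DIGITS},
    1: {d: 2 for d in DIGITS},
    2: {},  # accepting state, no outgoing transitions
}
ACCEPTING = {2}

# Toy vocabulary; real tokenizers have tens of thousands of entries.
VOCAB = ["4", "42", "a", "7x", "123", "9"]


def walk(state, token):
    """Return the state reached after consuming token, or None if it leaves the DFA."""
    for ch in token:
        state = DFA[state].get(ch)
        if state is None:
            return None
    return state


def allowed_tokens(state):
    """Indices of tokens that keep the output inside the pattern's language."""
    return [i for i, tok in enumerate(VOCAB) if walk(state, tok) is not None]


def constrained_greedy(scores):
    """Pick the highest-scoring allowed token until an accepting state is reached."""
    state, out = 0, ""
    while state not in ACCEPTING:
        best = max(allowed_tokens(state), key=lambda i: scores[i])
        out += VOCAB[best]
        state = walk(state, VOCAB[best])
    return out


# "a" has the highest raw score but is never allowed by the pattern.
print(constrained_greedy([0.1, 0.2, 5.0, 3.0, 2.0, 0.4]))
```

Tokens that would break the pattern are excluded before a token is chosen, so the sampler only ever sees valid continuations. The sketch omits dead-end handling, which a real implementation has to address.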