Generate structured text in the language of a regular expression.
Parameters
model:
An instance of Transformer that represents a model from the transformers library.
regex_str:
The regular expression that the output must follow.
sampler:
The sampling algorithm to use to generate token ids from the logits distribution. Defaults to multinomial().
Returns
A SequenceGeneratorAdapter instance that generates text constrained by the regular expression.
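A minimal usage sketch, assuming the version of outlines documented here, where `outlines.models.transformers` loads a model and the returned generator is called with a prompt. The model name, the IPv4 pattern, and the helpers `generate_ipv4` and `matches_pattern` are illustrative assumptions, not part of the library.

```python
import re

# Pattern for a dotted IPv4 address; an illustrative choice, not part of the API.
IPV4_PATTERN = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"


def generate_ipv4(prompt: str, model_name: str = "microsoft/Phi-3-mini-4k-instruct") -> str:
    """Hypothetical helper: load a model and generate text matching IPV4_PATTERN."""
    # Imported lazily so the rest of this sketch runs without outlines installed.
    from outlines import generate, models, samplers

    # Assumed model name; loading it downloads weights on first use.
    model = models.transformers(model_name)
    # Greedy decoding replaces the default multinomial sampler here.
    generator = generate.regex(model, IPV4_PATTERN, sampler=samplers.greedy())
    return generator(prompt)


def matches_pattern(text: str) -> bool:
    """Check a candidate string against the same pattern with the standard library."""
    return re.fullmatch(IPV4_PATTERN, text) is not None
```

Calling `generate_ipv4("What is the IP address of the Google DNS servers? ")` should return a string for which `matches_pattern` is true, since generation is constrained to the language of the pattern. The `sampler` argument can be omitted to keep the default.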
Source code in outlines/generate/regex.py
@singledispatch
def regex(model, regex_str: str, sampler: Sampler = multinomial()):
    """Generate structured text in the language of a regular expression.

    Parameters
    ----------
    model:
        An instance of `Transformer` that represents a model from the
        `transformers` library.
    regex_str:
        The regular expression that the output must follow.
    sampler:
        The sampling algorithm to use to generate token ids from the logits
        distribution.

    Returns
    -------
    A `SequenceGeneratorAdapter` instance that generates text constrained by the
    regular expression.

    """
    from outlines.processors import RegexLogitsProcessor

    logits_processor = RegexLogitsProcessor(regex_str, tokenizer=model.tokenizer)
    return SequenceGeneratorAdapter(model, logits_processor, sampler)
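The function is decorated with `@singledispatch`, so the body shown is the default implementation, and other implementations can be registered for specific model types. The sketch below illustrates that mechanism with the standard library only. The classes `LocalModel` and `RemoteModel` and the returned strings are hypothetical stand-ins and do not reflect which model types outlines actually registers.

```python
from functools import singledispatch


class LocalModel:
    """Hypothetical stand-in for a model that exposes a tokenizer and logits."""


class RemoteModel:
    """Hypothetical stand-in for a backend that needs its own code path."""


@singledispatch
def regex(model, regex_str: str) -> str:
    # Default path: used for any model type without a registered override.
    return f"default generator for {regex_str}"


@regex.register
def _(model: RemoteModel, regex_str: str) -> str:
    # Specialised path: selected when the first argument is a RemoteModel.
    return f"remote generator for {regex_str}"


default_result = regex(LocalModel(), "[0-9]+")
remote_result = regex(RemoteModel(), "[0-9]+")
```

Dispatch depends only on the type of the first argument, which is why `model` comes first in the signature.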