Regular expressions
Outlines can guarantee that the text generated by the LLM matches a given regular expression:
from outlines import models, generate
model = models.transformers("microsoft/Phi-3-mini-4k-instruct")
generator = generate.regex(
    model,
    r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)",
)
prompt = "What is the IP address of the Google DNS servers? "
answer = generator(prompt, max_tokens=30)
print(answer)
# What is the IP address of the Google DNS servers?
# 2.2.6.1
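Before handing a pattern to generate.regex, it can help to check which strings it accepts. The following is a minimal sketch that uses only Python's standard re module, assuming the pattern behaves the same way there as it does in Outlines. The names IPV4_PATTERN and is_ipv4 are hypothetical and not part of Outlines.

```python
import re

# Same pattern as in the example above: four dot-separated octets, each 0-255.
IPV4_PATTERN = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"


def is_ipv4(text: str) -> bool:
    """Return True if the whole string matches the IPv4 pattern."""
    # fullmatch requires the entire string to match, not just a prefix.
    return re.fullmatch(IPV4_PATTERN, text) is not None


print(is_ipv4("8.8.8.8"))    # accepted by the pattern
print(is_ipv4("2.2.6.1"))    # also accepted, even though it is not Google's DNS
print(is_ipv4("256.1.1.1"))  # rejected: 256 is not a valid octet
```

As the second call suggests, the pattern only constrains the shape of the output. It does not make the answer factually correct.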
If you find yourself using generate.regex only to restrict the type of the answer, consider type-structured generation instead.
Performance
generate.regex computes an index that helps Outlines guide generation. This can take some time, but it only needs to be done once. If you want to generate several times with the same regular expression, make sure you call generate.regex only once and reuse the generator it returns.
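One way to enforce this in a larger program is to cache generators by pattern. The following is a rough sketch, not an Outlines feature: cached_builder is a hypothetical helper, and the commented usage assumes a model has already been loaded as in the example above.

```python
from typing import Callable, Dict, TypeVar

G = TypeVar("G")


def cached_builder(build: Callable[[str], G]) -> Callable[[str], G]:
    """Wrap `build` so that each distinct pattern is built only once."""
    cache: Dict[str, G] = {}

    def get(pattern: str) -> G:
        if pattern not in cache:
            # The expensive step (index computation) happens only here.
            cache[pattern] = build(pattern)
        return cache[pattern]

    return get


# Possible use with Outlines (assumes `model` and `generate` from above):
#   get_generator = cached_builder(lambda p: generate.regex(model, p))
#   generator = get_generator(r"\d{4}")   # builds the index
#   generator = get_generator(r"\d{4}")   # reuses the cached generator
```

For a single pattern, simply keeping the generator in a variable and calling it repeatedly achieves the same result.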