Samplers
BeamSearchSampler
Beam Search sampling algorithm.
Attributes
samples The number of samples taken for each input sequence. Equivalent to the number of beams.
Source code in outlines/samplers.py
__call__(next_token_logits, sequence_weights, _)
Call the beam search sampler.
Parameters
next_token_logits
A tensor of shape (n_seqs, vocab_size,) that represents the probability distribution of the next token over the vocabulary.
sequence_weights
A tensor of shape (n_seqs,) that represents the cumulative weight of each sequence.
rng
A random number generator.
Returns
A tuple with an array that contains the ids of the sampled tokens of shape (n_seqs, 1), an array that contains the ancestors of each sampled id of shape (n_seqs,), and an array that contains the updated cumulative weights of each sequence of shape (n_seqs,).
Source code in outlines/samplers.py
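For reference, a minimal sketch of the call contract documented above. It assumes a PyTorch-backed implementation, a constructor that takes the number of beams, and that the n_seqs dimension holds the beams of a single input sequence; these details are assumptions, only the shapes come from the documentation.

```python
import torch

from outlines.samplers import BeamSearchSampler

# Toy dimensions: four beams kept for a single input sequence, so n_seqs = 4
# (an assumption about how the beams are laid out along the batch dimension).
n_seqs, vocab_size = 4, 32
next_token_logits = torch.randn(n_seqs, vocab_size)
sequence_weights = torch.zeros(n_seqs)

sampler = BeamSearchSampler(4)  # assumption: the constructor takes the number of beams

# Beam search is deterministic, so the rng argument is unused (hence `_` in the signature).
token_ids, ancestors, weights = sampler(next_token_logits, sequence_weights, None)

assert token_ids.shape == (n_seqs, 1)  # ids of the sampled tokens
assert ancestors.shape == (n_seqs,)    # which sequence each sampled id extends
assert weights.shape == (n_seqs,)      # updated cumulative weights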
GreedySampler
Greedy sampling algorithm.
Greedy sampling consists of choosing the token with the largest likelihood at every step.
We don't allow more than one sample. We could give multiple samples a meaning, for instance that the k-th sample represents the k-th most likely token, in which case greedy sampling would be equivalent to beam search without the sequence weights.
Attributes
samples The number of samples taken for each input sequence.
Source code in outlines/samplers.py
__call__(next_token_logits, sequence_weights, _)
Call the greedy sampler.
Parameters
next_token_logits
A tensor of shape (n_seqs, vocab_size,) that represents the probability distribution of the next token over the vocabulary.
sequence_weights
A tensor of shape (n_seqs,) that represents the cumulative weight of each sequence.
rng
A random number generator.
Returns
A tuple with an array that contains the ids of the sampled tokens of shape (n_seqs, 1), an array that contains the ancestors of each sampled id of shape (n_seqs,), and an array that contains the updated cumulative weights of each sequence of shape (n_seqs,).
Source code in outlines/samplers.py
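A minimal sketch of the same call for the greedy sampler, assuming a PyTorch-backed implementation and a no-argument constructor (greedy sampling allows only one sample per sequence, per the note above):

```python
import torch

from outlines.samplers import GreedySampler

n_seqs, vocab_size = 2, 10  # toy dimensions
next_token_logits = torch.randn(n_seqs, vocab_size)
sequence_weights = torch.zeros(n_seqs)

sampler = GreedySampler()  # assumption: no constructor arguments, since samples is fixed to 1

# The rng argument is unused by greedy sampling (hence `_` in the signature).
token_ids, ancestors, weights = sampler(next_token_logits, sequence_weights, None)

# Greedy sampling picks the highest-logit token for every sequence.
assert torch.equal(token_ids.squeeze(1), next_token_logits.argmax(dim=-1))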
MultinomialSampler
Multinomial sampling algorithm.
Multinomial sampling consists of randomly sampling the next token from a categorical distribution parametrized by the next-token logits.
Attributes
samples The number of samples taken for each input sequence.
Source code in outlines/samplers.py
__call__(next_token_logits, sequence_weights, rng)
Call the multinomial sampler.
Parameters
next_token_logits
A tensor of shape (n_seqs, vocab_size,) that represents the probability distribution of the next token over the vocabulary.
sequence_weights
A tensor of shape (n_seqs,) that represents the cumulative weight of each sequence.
rng
A random number generator.
Returns
A tuple with an array that contains the ids of the sampled tokens of shape (n_seqs, 1), an array that contains the ancestors of each sampled id of shape (n_seqs,), and an array that contains the updated cumulative weights of each sequence of shape (n_seqs,).
Source code in outlines/samplers.py
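A minimal sketch of calling the multinomial sampler, assuming a PyTorch-backed implementation, a default constructor that draws one sample per sequence, and a torch.Generator as the random number generator (these details are assumptions; the shapes come from the documentation):

```python
import torch

from outlines.samplers import MultinomialSampler

n_seqs, vocab_size = 2, 10  # toy dimensions
next_token_logits = torch.randn(n_seqs, vocab_size)
sequence_weights = torch.zeros(n_seqs)

rng = torch.Generator()
rng.manual_seed(0)

sampler = MultinomialSampler()  # assumption: defaults to one sample per sequence
token_ids, ancestors, weights = sampler(next_token_logits, sequence_weights, rng)

assert token_ids.shape == (n_seqs, 1)  # one randomly drawn token id per sequence
assert weights.shape == (n_seqs,)      # cumulative weights updated with the drawn tokens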
keep_top_k_logits(k)
Build a function that masks logit values smaller than the top k ones.
Parameters
k
The ranking below which logit values are replaced by -math.inf.
Source code in outlines/samplers.py
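To illustrate, a short sketch assuming keep_top_k_logits returns a callable that maps a logits tensor to a masked logits tensor (PyTorch assumed for the tensors):

```python
import math

import torch

from outlines.samplers import keep_top_k_logits

logits = torch.tensor([[1.0, 3.0, 0.5, 2.0]])
processor = keep_top_k_logits(2)  # keep only the two largest logits per sequence
masked = processor(logits)

# The logits outside the top 2 (here 1.0 and 0.5) are replaced by -inf,
# so they receive zero probability after a softmax.
assert masked[0, 0] == -math.inf
assert masked[0, 2] == -math.inf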
keep_top_p_logits(p)
Build a function that masks the lowest probability tokens whose cumulative probability is below a certain threshold.
Parameters
p
The value of the threshold. We keep the highest-probability tokens whose cumulative probability is greater than or equal to p and mask the others. Its value must be between 0 (excluded) and 1 (included).
Source code in outlines/samplers.py
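Similarly, a short sketch assuming the returned callable maps a logits tensor to a masked logits tensor (the printed values illustrate the intent, not an exact library output):

```python
import torch

from outlines.samplers import keep_top_p_logits

logits = torch.tensor([[3.0, 2.0, 1.0, 0.0]])
probs = torch.softmax(logits, dim=-1)
print(probs)  # roughly [0.64, 0.24, 0.09, 0.03]

processor = keep_top_p_logits(0.9)
masked = processor(logits)

# The lowest-probability tokens are masked with -inf so that the remaining
# (highest-probability) tokens hold at least 90% of the probability mass.
print(masked)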
rescale_logits(temperature)
Build a function that rescales the token probabilities exponentially.
Parameters
temperature
The value by which we rescale the logits.
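A short sketch assuming standard temperature scaling, i.e. that the returned callable divides the logits by the temperature (an assumption; only the exponential-rescaling behaviour is stated above):

```python
import torch

from outlines.samplers import rescale_logits

logits = torch.tensor([[2.0, 1.0, 0.0]])
processor = rescale_logits(0.5)  # temperatures below 1 sharpen the distribution
rescaled = processor(logits)

# Dividing the logits by the temperature is equivalent to raising the
# probabilities to the power 1/temperature before re-normalising.
print(torch.softmax(logits, dim=-1))    # the original distribution
print(torch.softmax(rescaled, dim=-1))  # more peaked around the largest logit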