ls_mlkit.util.llm module

ls_mlkit.util.llm.add_maybe_special_tokens(model, tokenizer)[source]

Add special tokens to the tokenizer and model if they are missing, resizing the model’s token embeddings when the vocabulary grows

Parameters:
  • model (torch.nn.Module) – the model to add the special tokens to

  • tokenizer (Tokenizer) – the tokenizer to add the special tokens to

Returns:

the model and the tokenizer

Return type:

tuple
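A minimal sketch of what a function like this might do, using toy stand-ins for the model and tokenizer (the real function operates on a `torch.nn.Module` and a Hugging Face-style tokenizer; the pad-token logic below is an assumption, not `ls_mlkit`'s verified implementation):

```python
class ToyTokenizer:
    """Stand-in for a Hugging Face tokenizer."""

    def __init__(self):
        self.vocab = {"<bos>": 0, "<eos>": 1, "hello": 2}
        self.pad_token = None

    def add_special_tokens(self, mapping):
        for name, token in mapping.items():
            if token not in self.vocab:
                self.vocab[token] = len(self.vocab)
            setattr(self, name, token)

    def __len__(self):
        return len(self.vocab)


class ToyModel:
    """Stand-in for a torch.nn.Module with a token embedding table."""

    def __init__(self, vocab_size):
        self.vocab_size = vocab_size

    def resize_token_embeddings(self, new_size):
        self.vocab_size = new_size


def add_maybe_special_tokens(model, tokenizer):
    # Add a pad token only if the tokenizer lacks one, then resize the
    # model's embedding matrix to match the (possibly larger) vocabulary.
    if tokenizer.pad_token is None:
        tokenizer.add_special_tokens({"pad_token": "<pad>"})
    if model.vocab_size != len(tokenizer):
        model.resize_token_embeddings(len(tokenizer))
    return model, tokenizer


model, tokenizer = add_maybe_special_tokens(ToyModel(3), ToyTokenizer())
print(model.vocab_size)  # 4: "<pad>" was appended to the 3-token vocab
```

The resize step matters because a token id outside the embedding table would cause an index error during training.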

ls_mlkit.util.llm.compute_metrics(eval_prediction: EvalPrediction)[source]

Compute evaluation metrics from a model’s predictions

Parameters:

eval_prediction (EvalPrediction) – the evaluation prediction

Returns:

the metrics

Return type:

dict
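A hedged sketch of a `compute_metrics`-style function. The real function receives a `transformers.EvalPrediction`; here a simple namedtuple stands in, and the metric shown (token accuracy that skips the `-100` ignore index) is an assumption about what is computed, not confirmed behavior:

```python
from collections import namedtuple

# Minimal stand-in for transformers.EvalPrediction.
EvalPrediction = namedtuple("EvalPrediction", ["predictions", "label_ids"])


def compute_metrics(eval_prediction):
    correct = total = 0
    for pred_row, label_row in zip(eval_prediction.predictions,
                                   eval_prediction.label_ids):
        for pred, label in zip(pred_row, label_row):
            if label == -100:  # ignored positions don't count
                continue
            total += 1
            correct += int(pred == label)
    return {"accuracy": correct / total if total else 0.0}


ep = EvalPrediction(predictions=[[1, 2, 3]], label_ids=[[1, 2, -100]])
print(compute_metrics(ep))  # {'accuracy': 1.0}
```

In a `Trainer` setup this function is passed via the `compute_metrics` argument and receives prediction ids rather than raw logits when paired with a `preprocess_logits_for_metrics` hook.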

ls_mlkit.util.llm.get_data_collator(tokenizer, max_length=-1, ignore_masked_token=True, model_type='causal')[source]

Get a data collator for a model. Shifting the inputs and labels to align them happens inside the model, so the data collator simply copies the inputs to create the labels.

Parameters:
  • tokenizer (Tokenizer) – the tokenizer to use

  • max_length (int, optional) – the maximum length of the input text. Defaults to -1.

  • ignore_masked_token (bool, optional) – whether to ignore the masked tokens. Defaults to True.

  • model_type (str, optional) – the type of the model. Defaults to “causal”.

Returns:

the data collator

Return type:

callable
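An illustrative sketch along the lines the docstring describes for the causal case: labels are a copy of `input_ids` (the model shifts them internally), and when `ignore_masked_token` is set, masked positions are replaced by `-100` so the loss ignores them. The real function takes a tokenizer and returns batched tensors; this sketch drops the tokenizer and uses plain lists, and its parameter handling is an assumption about `ls_mlkit`'s behavior:

```python
def get_data_collator(max_length=-1, ignore_masked_token=True):
    def collate(features):
        batch = {"input_ids": [], "attention_mask": [], "labels": []}
        for f in features:
            ids = list(f["input_ids"])
            if max_length > 0:          # -1 means no truncation
                ids = ids[:max_length]
            mask = list(f.get("attention_mask", [1] * len(ids)))[:len(ids)]
            # Labels are a copy of the inputs; masked positions become
            # -100 so the cross-entropy loss skips them.
            labels = [
                -100 if (ignore_masked_token and m == 0) else t
                for t, m in zip(ids, mask)
            ]
            batch["input_ids"].append(ids)
            batch["attention_mask"].append(mask)
            batch["labels"].append(labels)
        return batch
    return collate


collate = get_data_collator()
out = collate([{"input_ids": [5, 6, 0], "attention_mask": [1, 1, 0]}])
print(out["labels"])  # [[5, 6, -100]]
```

Copy-then-mask is the standard recipe for causal language modeling; the one-position shift between predictions and targets is applied inside the model's loss computation, as the docstring notes.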

ls_mlkit.util.llm.preprocess_logits_for_metrics(logits, labels)[source]

Preprocess the logits for metric computation by converting them to prediction ids

Parameters:
  • logits (torch.Tensor) – the logits

  • labels (torch.Tensor) – the ground truth labels

Returns:

the prediction ids and the ground truth labels

Return type:

tuple
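A plausible sketch of this hook: such preprocessing typically reduces full logits to prediction ids with an argmax over the vocabulary axis, so the Trainer does not accumulate huge logit tensors during evaluation. Whether `ls_mlkit` does exactly this is an assumption; plain nested lists stand in for `torch.Tensor` here:

```python
def preprocess_logits_for_metrics(logits, labels):
    # logits: [seq_len][vocab_size]; take the argmax over the vocab axis
    # (the list-based equivalent of logits.argmax(dim=-1)).
    pred_ids = [row.index(max(row)) for row in logits]
    return pred_ids, labels


logits = [[0.1, 0.7, 0.2],    # argmax -> token id 1
          [0.9, 0.05, 0.05]]  # argmax -> token id 0
preds, labels = preprocess_logits_for_metrics(logits, [1, 0])
print(preds)  # [1, 0]
```

The returned prediction ids are what a paired `compute_metrics` function then compares against the ground-truth labels.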