ls_mlkit.util.offload.split module

ls_mlkit.util.offload.split.get_model_memory(model: Module, forward_factor: float = 1.3)

Estimates the memory usage of a model in gigabytes.

Parameters:
  • model (torch.nn.Module) – The model whose memory usage is to be calculated.

  • forward_factor (float, optional) – A factor to account for additional memory usage during the forward pass. Defaults to 1.3.

Returns:
  The estimated memory usage of the model in gigabytes.

Return type:
  float
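
The docstring does not spell out the formula. Below is a minimal sketch of how such an estimate is typically computed (summing parameter and buffer bytes, then padding by forward_factor); the helper name estimate_model_memory_gb is illustrative, and the actual ls_mlkit implementation may differ:

    import torch
    from torch.nn import Module

    def estimate_model_memory_gb(model: Module, forward_factor: float = 1.3) -> float:
        # Bytes occupied by parameters and persistent buffers
        param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
        buffer_bytes = sum(b.numel() * b.element_size() for b in model.buffers())
        # forward_factor pads the estimate for extra memory (e.g. activations)
        # allocated during the forward pass; 1024**3 assumes binary gigabytes (GiB)
        return (param_bytes + buffer_bytes) * forward_factor / 1024**3

    model = torch.nn.Linear(4096, 4096)      # ~16.8M fp32 params, ~0.0625 GiB
    print(estimate_model_memory_gb(model))   # ~0.081 with the 1.3 factor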

ls_mlkit.util.offload.split.get_split_num(origin_type: str = 'bf16', quant_type: str = 'int8')

Calculates the ratio of the original data type's size to the quantized data type's size.

Parameters:
  • origin_type (str, optional) – The data type of the original tensor. Defaults to “bf16”. Options are “fp32” and “bf16”.

  • quant_type (str, optional) – The data type of the quantized tensor. Defaults to “int8”. Options are “int8” and “nf4”.

Raises:
  • ValueError – If the origin_type is not “fp32” or “bf16”.

  • ValueError – If the quant_type is not “int8” or “nf4”.

Returns:
  The ratio of the original type size to the quantized type size.

Return type:
  int
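
The return value follows from standard bit widths: fp32 is 32 bits, bf16 is 16, int8 is 8, and nf4 is 4, so, for example, the bf16/int8 default gives 16 / 8 = 2. A sketch of equivalent logic under those assumed widths (not necessarily the library's exact code):

    def split_num(origin_type: str = "bf16", quant_type: str = "int8") -> int:
        # Assumed bit widths for the supported data types
        origin_bits = {"fp32": 32, "bf16": 16}
        quant_bits = {"int8": 8, "nf4": 4}
        if origin_type not in origin_bits:
            raise ValueError(f'origin_type must be "fp32" or "bf16", got {origin_type!r}')
        if quant_type not in quant_bits:
            raise ValueError(f'quant_type must be "int8" or "nf4", got {quant_type!r}')
        return origin_bits[origin_type] // quant_bits[quant_type]

    print(split_num())               # 2: bf16 (16 bits) / int8 (8 bits)
    print(split_num("fp32", "nf4"))  # 8: fp32 (32 bits) / nf4 (4 bits)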