ls_mlkit.util.offload.split module
- ls_mlkit.util.offload.split.get_model_memory(model: Module, forward_factor: float = 1.3)
Estimates the memory usage of a model in gigabytes.
- Parameters:
model (torch.nn.Module) – The model whose memory usage is to be calculated.
forward_factor (float, optional) – A factor to account for additional memory usage during the forward pass. Defaults to 1.3.
- Returns:
The estimated memory usage of the model in gigabytes.
- Return type:
float
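A minimal usage sketch, assuming the package is importable as documented; the toy Linear model and the back-of-envelope check are illustrative assumptions, not the library's exact accounting:

```python
import torch

from ls_mlkit.util.offload.split import get_model_memory

# Any torch.nn.Module works; one large linear layer keeps the example small.
model = torch.nn.Linear(4096, 4096)  # ~16.8M fp32 parameters

# Estimate with the default forward-pass headroom factor of 1.3.
mem_gb = get_model_memory(model, forward_factor=1.3)

# Back-of-envelope check (an assumed accounting, not necessarily the
# library's formula): raw parameter bytes scaled by forward_factor.
raw_gb = sum(p.numel() * p.element_size() for p in model.parameters()) / 1e9
print(f"get_model_memory: {mem_gb:.3f} GB, rough check: {raw_gb * 1.3:.3f} GB")
```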
- ls_mlkit.util.offload.split.get_split_num(origin_type: str = 'bf16', quant_type: str = 'int8')
Calculates the ratio of the original data type's size to the quantized data type's size.
- Parameters:
origin_type (str, optional) – The data type of the original tensor. Defaults to “bf16”. Options are “fp32” and “bf16”.
quant_type (str, optional) – The data type of the quantized tensor. Defaults to “int8”. Options are “int8” and “nf4”.
- Raises:
ValueError – If the origin_type is not “fp32” or “bf16”.
ValueError – If the quant_type is not “int8” or “nf4”.
- Returns:
The ratio of the original type size to the quantized type size.
- Return type:
int
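A short sketch of the documented behavior, assuming the standard element sizes (fp32 = 32 bits, bf16 = 16, int8 = 8, nf4 = 4); the expected values follow from the size-ratio semantics above:

```python
from ls_mlkit.util.offload.split import get_split_num

# Documented defaults: bf16 (16 bits) over int8 (8 bits) -> ratio of 2.
assert get_split_num() == 2

# fp32 (32 bits) over nf4 (4 bits) -> ratio of 8.
assert get_split_num(origin_type="fp32", quant_type="nf4") == 8

# Unsupported dtypes raise ValueError, as documented.
try:
    get_split_num(origin_type="fp16")
except ValueError as exc:
    print(exc)
```

The name suggests the ratio serves as a split count when laying out quantized shards against the original tensor's footprint, though that reading is an inference from the name rather than documented behavior.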