ls_mlkit.dataset.lda_dataset module
- class ls_mlkit.dataset.lda_dataset.LDADataset(n_samples: int = 1000, n_local_topics: int = 1, n_total_topics: int = 10, n_words_per_topic: int = 7, seq_len: int = 100, fix_seq_len: bool = True, seed: int = 31, per_topic_strategy: str = 'cyclic', fix_local_topics_num: bool = True, topic_distribution: str = 'uniform')
Bases: Dataset
Dataset for LDA. Samples do not need to be tokenized.
- ls_mlkit.dataset.lda_dataset.get_lda_dataset(seed: int = 31, n_samples: int = 100, n_local_topics: int = 1, n_total_topics: int = 10, n_words_per_topic: int = 7, seq_len: int = 100, eval_ratio: float = 0.1, fix_seq_len: bool = True, fix_local_topics_num: bool = True, per_topic_strategy: str = 'cyclic', topic_distribution: str = 'uniform')
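The signatures above suggest a synthetic topic-model dataset in which each sample draws words from a few "local" topics out of a larger pool. The sketch below is only an illustration of that general idea and does not reproduce ls_mlkit's behavior: the function `make_toy_lda_samples`, the disjoint token-id layout, and the uniform topic choice are all assumptions, and options such as `per_topic_strategy`, `fix_seq_len`, and `eval_ratio` are not modeled.

```python
import random


def make_toy_lda_samples(n_samples=5, n_local_topics=1, n_total_topics=10,
                         n_words_per_topic=7, seq_len=100, seed=31):
    """Hypothetical generator, not the ls_mlkit implementation.

    Assumption: topic t owns the disjoint token-id block
    [t * n_words_per_topic, (t + 1) * n_words_per_topic).
    """
    rng = random.Random(seed)  # seeded for reproducible samples
    samples = []
    for _ in range(n_samples):
        # Choose the topics active in this sample, uniformly without replacement.
        topics = rng.sample(range(n_total_topics), n_local_topics)
        tokens = []
        for _ in range(seq_len):
            topic = rng.choice(topics)
            # Draw one word uniformly from the chosen topic's block of ids.
            tokens.append(topic * n_words_per_topic
                          + rng.randrange(n_words_per_topic))
        samples.append({"topics": sorted(topics), "tokens": tokens})
    return samples


samples = make_toy_lda_samples()
```

Because the samples are already integer token ids, nothing here needs a tokenizer, which is one plausible reading of the "do not need to be tokenized" note in the class description.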