Open
Labels: enhancement (New feature or request)
Description
I'm proposing a language identification helper module that can:
- Build language ID models using any of the available rule-based or learning algorithms.
- Identify the language of a given text.
Proposed usage, along these lines:
from iranlowo.language import LanguageIdentifier

# Language codes and the corpus path for each language
lang_a = 'eng'
lang_b = 'yor'
lang_a_corpus = 'path_to_corpus'
lang_b_corpus = 'path_to_corpus'

# algo, epoch, batch and **kwargs are placeholders for the chosen algorithm and its training options
lang_model = LanguageIdentifier(langs=[lang_a, lang_b], corpus=[lang_a_corpus, lang_b_corpus], **kwargs)
lang_model.build(algo='', epoch=epoch, batch=batch, **kwargs)
lang_model.save('save_path')

The saved model can then be loaded and used to identify languages:
from iranlowo.language import identify_language, load_model
language_id_model = load_model('save_path')
language_id = identify_language(language_id_model, 'text')
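
To make the proposal more concrete, here is a rough, self-contained sketch of how the build / save / load / identify flow could work. It is only an illustration and not existing iranlowo code: the character trigram counts with add-one smoothing are one possible algorithm among many, `char_ngrams` is a hypothetical helper, the model is stored as plain JSON, and `corpus` holds raw text here rather than file paths to keep the example short.

```python
import json
import math
import os
import tempfile
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased, padded text."""
    padded = " " * (n - 1) + text.lower() + " "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class LanguageIdentifier:
    """Hypothetical sketch: one character n-gram count table per language."""

    def __init__(self, langs, corpus, n=3):
        # Assumption: `corpus` is a list of raw texts, one per language code.
        self.n = n
        self.texts = dict(zip(langs, corpus))
        self.counts = {}

    def build(self):
        # Count the n-grams of each language's text.
        for lang, text in self.texts.items():
            self.counts[lang] = dict(Counter(char_ngrams(text, self.n)))
        return self

    def save(self, path):
        # Persist the model as JSON so it can be reloaded without retraining.
        with open(path, "w", encoding="utf-8") as fh:
            json.dump({"n": self.n, "counts": self.counts}, fh, ensure_ascii=False)


def load_model(path):
    """Load a model saved by LanguageIdentifier.save as a plain dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def identify_language(model, text):
    """Return the language whose smoothed n-gram model scores the text highest."""
    grams = char_ngrams(text, model["n"])
    vocab = set()
    for counts in model["counts"].values():
        vocab.update(counts)
    scores = {}
    for lang, counts in model["counts"].items():
        total = sum(counts.values())
        # Add-one smoothing so unseen n-grams get a small non-zero probability.
        scores[lang] = sum(
            math.log((counts.get(g, 0) + 1) / (total + len(vocab))) for g in grams
        )
    return max(scores, key=scores.get)


# Tiny made-up training texts, for illustration only.
ENG_TEXT = "the quick brown fox jumps over the lazy dog and the cat sat on the mat"
YOR_TEXT = "bawo ni o se wa omo mi mo fe lo si oja lati ra ounje fun awon omo ile wa"

with tempfile.TemporaryDirectory() as tmp_dir:
    save_path = os.path.join(tmp_dir, "lang_id.json")
    LanguageIdentifier(langs=["eng", "yor"], corpus=[ENG_TEXT, YOR_TEXT]).build().save(save_path)
    language_id_model = load_model(save_path)

print(identify_language(language_id_model, "the lazy dog"))
print(identify_language(language_id_model, "lati ra ounje"))
```

A real implementation would presumably read the corpora from disk, train on far more text, and let `build` dispatch on `algo` so that rule-based and learned models share the same interface.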