Language Identification Helper #8

@Olamyy

I'm proposing a language identification helper module that can:

  1. Build language identification models using any available rule-based or learning-based algorithm.
  2. Identify the language of a given text.

The proposed usage would look something like this:

from iranlowo.language import LanguageIdentifier

lang_a = 'eng'
lang_b = 'yor'
lang_a_corpus = 'path_to_corpus'
lang_b_corpus = 'path_to_corpus'

lang_model = LanguageIdentifier(langs=[lang_a, lang_b], corpus=[lang_a_corpus, lang_b_corpus], **kwargs)
lang_model.build(algo='', epoch=epoch, batch=batch, **kwargs)
lang_model.save('save_path')
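To make the build-and-save step more concrete, here is a minimal sketch of what such a class could look like. Everything in it is an assumption for discussion, not existing `iranlowo` code: it uses character trigram counts as the model, treats each corpus entry as a path to a plain UTF-8 text file, and saves to JSON. The `algo`, `epoch`, `batch` and `**kwargs` parameters from the proposal are left out because the issue does not define them yet, and `char_ngrams` is a hypothetical helper.

```python
import json
from collections import Counter


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of lowercased, space-padded text."""
    padded = " " * (n - 1) + text.lower() + " "
    return (padded[i:i + n] for i in range(len(padded) - n + 1))


class LanguageIdentifier:
    """Hypothetical sketch: one character n-gram count table per language."""

    def __init__(self, langs, corpus, n=3):
        if len(langs) != len(corpus):
            raise ValueError("langs and corpus must have the same length")
        self.langs = list(langs)
        self.corpus = list(corpus)  # assumed: one text file path per language
        self.n = n
        self.counts = {}  # language code -> {ngram: count}

    def build(self):
        """Count character n-grams in each language's corpus file."""
        for lang, path in zip(self.langs, self.corpus):
            with open(path, encoding="utf-8") as handle:
                grams = char_ngrams(handle.read(), self.n)
                self.counts[lang] = dict(Counter(grams))
        return self

    def save(self, save_path):
        """Write the n-gram size and count tables to a JSON file."""
        payload = {"n": self.n, "counts": self.counts}
        with open(save_path, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, ensure_ascii=False)
```

A learning-based backend could sit behind the same `build` and `save` methods; the count table is just the simplest thing that fits the proposed interface.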

The saved model can then be loaded and used to identify the language of a text:

from iranlowo.language import identify_language, load_model

language_id_model = load_model('save_path')

language_id = identify_language(language_id_model, 'text')
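As a rough illustration of the loading and identification side, the sketch below assumes the saved model is a JSON file of the form `{"n": 3, "counts": {lang: {ngram: count}}}` and scores a text with add-one-smoothed character n-gram log-likelihoods. The file format, the scoring rule and the `char_ngrams` helper are all assumptions made for this example; the real `load_model` and `identify_language` could work quite differently.

```python
import json
import math


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of lowercased, space-padded text."""
    padded = " " * (n - 1) + text.lower() + " "
    return (padded[i:i + n] for i in range(len(padded) - n + 1))


def load_model(save_path):
    """Load the assumed JSON model: {"n": int, "counts": {lang: {ngram: count}}}."""
    with open(save_path, encoding="utf-8") as handle:
        return json.load(handle)


def identify_language(model, text):
    """Return the language whose n-gram counts give the text the highest score."""
    grams = list(char_ngrams(text, model["n"]))

    # Shared vocabulary across languages, used for add-one smoothing.
    vocab = set()
    for counts in model["counts"].values():
        vocab.update(counts)

    best_lang, best_score = None, -math.inf
    for lang, counts in model["counts"].items():
        # The extra +1 reserves probability mass for n-grams unseen in any corpus.
        denom = sum(counts.values()) + len(vocab) + 1
        score = sum(math.log((counts.get(g, 0) + 1) / denom) for g in grams)
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang
```

Returning only the best language code keeps the call as simple as in the proposal; returning per-language scores as well would be an easy extension if callers need a confidence value.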

Labels: enhancement (new feature or request)
