Pre-processing #3

@bittlingmayer

fastText pre-processed its text before training, so it's safest to apply the same pre-processing to input before calling predict, especially lowercasing:

> (async () => { console.log(await lid.predict("Ótimo")) })()
> [ { lang: 'hu', prob: 0.3540060520172119 } ]
> (async () => { console.log(await lid.predict("ótimo")) })()
> [ { lang: 'pt', prob: 0.9964836835861206 } ]
> (async () => { console.log(await lid.predict("Uma boa experiencia")) })()
> [ { lang: 'en', prob: 0.5653783679008484 } ]
> (async () => { console.log(await lid.predict("uma boa experiencia")) })()
> [ { lang: 'pt', prob: 0.9949126839637756 } ]
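
One possible way to handle this is to lowercase the input before it reaches the model. The sketch below is only an illustration: `preprocess` and `withLowercasing` are hypothetical names, not part of this library, and the only thing assumed about `lid` is the async `predict(text)` call shown above. A stub stands in for the real model so the snippet runs on its own.

```javascript
// Hypothetical helper: the only pre-processing step applied here is
// lowercasing, since that is the step the examples above show to matter.
function preprocess(text) {
  return String(text).toLowerCase();
}

// Hypothetical wrapper: takes any object with an async predict(text)
// method and returns one that lowercases the input first.
function withLowercasing(predictor) {
  return {
    predict: async (text) => predictor.predict(preprocess(text)),
  };
}

// Stand-in for the real model (assumption, for illustration only).
// It records what it receives and does no language identification.
const seen = [];
const stubLid = {
  predict: async (text) => {
    seen.push(text);
    return [];
  },
};

const lowercasedLid = withLowercasing(stubLid);
```

With the real model, `withLowercasing(lid).predict("Ótimo")` would pass "ótimo" to `lid.predict`. Whether this should live inside the library or be left to callers is a design choice for the maintainers.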
