N-gram Language Model for Autocomplete (Python)

Description:

This Python script implements a basic N-gram language model for text autocompletion. It allows users to enter a sequence of words, and the model suggests the most likely next word based on the frequency of word n-grams observed in the training data.
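The core idea can be illustrated with a minimal bigram sketch. This is not the script's actual code: the function names `train_bigram_counts` and `most_likely_next` and the toy corpus are hypothetical, chosen only to show how next-word frequencies drive a suggestion.

```python
from collections import Counter, defaultdict


def train_bigram_counts(sentences):
    """Count how often each word follows each preceding word."""
    following = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            following[prev][nxt] += 1
    return following


def most_likely_next(following, prev_word):
    """Return the most frequent follower of prev_word, or None if it was never seen."""
    candidates = following.get(prev_word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Toy corpus (illustrative only, not the Twitter data the script expects).
corpus = ["i like green tea", "i like black coffee", "i like green apples"]
counts = train_bigram_counts(corpus)
print(most_likely_next(counts, "like"))  # prints: green
```

Here "green" follows "like" twice and "black" only once, so "green" is suggested. Longer n-grams work the same way, with the context being the previous n-1 words instead of one.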

Features:

Processes and cleans Twitter data using regular expressions and tokenization.
Handles out-of-vocabulary (OOV) words by mapping them to a special unknown token.
Builds N-gram models (unigrams, bigrams, trigrams, etc.) up to a configurable maximum N.
Calculates probabilities for next words using Laplace smoothing.
Provides suggestions based on multiple N-gram models, allowing the user to choose the best fit.
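The features above can be sketched end to end as follows. This is a hedged illustration under stated assumptions, not the script's implementation: every function name is hypothetical, the token spellings `<unk>`, `<s>`, and `<e>` are assumed, the cleaning regex is a simple stand-in for whatever tokenizer the script uses, and the smoothing is the standard add-k (Laplace for k = 1) estimate (count(context, word) + k) / (count(context) + k * V).

```python
import re
from collections import Counter

UNK = "<unk>"               # assumed spelling of the unknown token
START, END = "<s>", "<e>"   # assumed sentence-boundary markers


def tokenize(text):
    """Lowercase, drop URLs and @mentions, and split into word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z0-9']+", text)


def build_vocab(tokenized, min_count=2):
    """Keep words seen at least min_count times; everything else is OOV."""
    freq = Counter(w for sent in tokenized for w in sent)
    return {w for w, c in freq.items() if c >= min_count}


def replace_oov(tokenized, vocab):
    """Replace every out-of-vocabulary word with UNK."""
    return [[w if w in vocab else UNK for w in sent] for sent in tokenized]


def train(tokenized, n):
    """Return (ngram_counts, context_counts) for an order-n model."""
    ngrams, contexts = Counter(), Counter()
    for sent in tokenized:
        padded = [START] * (n - 1) + sent + [END]
        for i in range(len(padded) - n + 1):
            gram = tuple(padded[i:i + n])
            ngrams[gram] += 1
            contexts[gram[:-1]] += 1
    return ngrams, contexts


def laplace_prob(word, context, ngrams, contexts, vocab_size, k=1.0):
    """Add-k estimate of P(word | context)."""
    return (ngrams[context + (word,)] + k) / (contexts[context] + k * vocab_size)


def suggest(prefix_tokens, models, vocab):
    """For each model order n, return the most probable in-vocabulary next word."""
    candidates = sorted(vocab)
    vocab_size = len(vocab) + 2  # vocabulary plus UNK and END
    suggestions = {}
    for n, (ngrams, contexts) in models.items():
        history = [START] * (n - 1) + [w if w in vocab else UNK for w in prefix_tokens]
        context = tuple(history[len(history) - (n - 1):])
        suggestions[n] = max(
            candidates,
            key=lambda w: laplace_prob(w, context, ngrams, contexts, vocab_size),
        )
    return suggestions


# Toy data standing in for the Twitter file.
tweets = [
    "I like green tea @bob https://t.co/x",
    "I like green apples",
    "I like black tea",
]
tokenized = [tokenize(t) for t in tweets]
vocab = build_vocab(tokenized, min_count=2)
models = {n: train(replace_oov(tokenized, vocab), n) for n in (1, 2, 3)}
print(suggest(["i", "like"], models, vocab))
```

The result is a dictionary keyed by n-gram order, so the caller can compare the unigram, bigram, and trigram suggestions and pick the one that fits best. Smoothing keeps unseen continuations from getting zero probability, at the cost of shifting some mass away from observed ones.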

Usage:

Save the script as n_gram_autocomplete.py.
Modify the file_path variable in the script to point to your Twitter data file (.txt format).
Run the script: python n_gram_autocomplete.py.
Enter a sequence of words and press Enter. The script will suggest the most likely next word(s).
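The interactive step might look roughly like the loop below. This is a sketch only: `run_prompt` and its parameters are hypothetical, and the `suggest` argument stands in for whatever suggestion function the script actually exposes. The `read` and `write` hooks default to `input` and `print`; the demo swaps in scripted input so it runs without a terminal.

```python
def run_prompt(suggest, read=input, write=print):
    """Read word sequences until a blank line (or EOF) and write a suggestion for each."""
    while True:
        try:
            line = read("Enter words (blank line to quit): ").strip()
        except EOFError:
            break
        if not line:
            break
        write(suggest(line.split()))


# Demo with scripted input and a placeholder suggester (no model involved).
scripted = iter(["i like", ""])
run_prompt(
    lambda words: f"suggestion requested after {words[-1]!r}",
    read=lambda _prompt: next(scripted),
)
```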

Repository: MINAMOREED/Auto-Complete-Bot
