Course Link: Introduction to NLP
---
This Python package was created by Uditya Narayan Tiwari. It provides text preprocessing utilities for natural language processing (NLP) tasks.
You can install this package using pip as follows:
pip install nlp_text_preprocessing
You can install this package from GitHub as follows:
pip install git+https://github.com/udityamerit/Text-Processing-Package-For-Natural-Language-Processing.git --upgrade --force-reinstall
To uninstall the package, use the following command:
pip uninstall nlp_text_preprocessing

You need to install these Python packages:
spacy
textblob
beautifulsoup4
nltk
openpyxl
SpeechRecognition==3.10.4
pyaudio==0.2.14
PrettyTable
scikit-learn
wordcloud
lxml
pandas
numpy
matplotlib

After installing spaCy, download its English model:
python -m spacy download en_core_web_sm

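Before importing the package, it can help to confirm that the dependencies listed above are actually importable. The following is a minimal sketch, not part of the package: the helper name `missing_packages` and the `REQUIRED` mapping are hypothetical, and the mapping pairs each pip name with the module name it is usually imported under.

```python
from importlib.util import find_spec

# Hypothetical mapping: pip distribution name -> top-level import name.
REQUIRED = {
    "spacy": "spacy",
    "textblob": "textblob",
    "beautifulsoup4": "bs4",
    "nltk": "nltk",
    "openpyxl": "openpyxl",
    "SpeechRecognition": "speech_recognition",
    "pyaudio": "pyaudio",
    "PrettyTable": "prettytable",
    "scikit-learn": "sklearn",
    "wordcloud": "wordcloud",
    "lxml": "lxml",
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be located."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None  # None means the module is not installed
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```

The check only looks modules up and does not import them, so it stays fast even when heavy libraries such as spaCy are installed. It does not verify pinned versions.
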
import nlp_text_preprocessing as tp
text = "HELLO WORLD!"
processed_text = tp.to_lower_case(text)
print(processed_text) # Output: hello world!

import nlp_text_preprocessing as tp
text = "I'm learning NLP."
processed_text = tp.contraction_to_expansion(text)
print(processed_text) # Output: I am learning NLP.

import nlp_text_preprocessing as tp
text = "Contact me at example@example.com"
processed_text = tp.remove_emails(text)
print(processed_text) # Output: Contact me at

import nlp_text_preprocessing as tp
text = "Check out https://example.com"
processed_text = tp.remove_urls(text)
print(processed_text) # Output: Check out

import nlp_text_preprocessing as tp
text = "<p>Hello World!</p>"
processed_text = tp.remove_html_tags(text)
print(processed_text) # Output: Hello World!

import nlp_text_preprocessing as tp
text = "Hello @World! #NLP"
processed_text = tp.remove_special_chars(text)
print(processed_text) # Output: Hello World NLP

import nlp_text_preprocessing as tp
text = "running runs"
processed_text = tp.lemmatize(text)
print(processed_text) # Output: run run

import nlp_text_preprocessing as tp
text = "I love programming!"
sentiment = tp.sentiment_analysis(text)
print(sentiment) # Output: Sentiment(polarity=0.5, subjectivity=0.6)

import nlp_text_preprocessing as tp
from googletrans import Translator
translator = Translator()
text = "Bonjour tout le monde"
lang = tp.detect_language(text, translator)
translated_text = tp.translate(text, 'en', translator)
print(f"Language: {lang}, Translated: {translated_text}")
# Output: Language: fr, Translated: Hello everyone

import nlp_text_preprocessing as tp
text = "I love NLP."
count = tp.word_count(text)
print(count) # Output: 3

import nlp_text_preprocessing as tp
text = "I love NLP."
count = tp.char_count(text)
print(count) # Output: 9

import nlp_text_preprocessing as tp
text = "I love NLP"
ngrams = tp.n_grams(text, n=2)
print(ngrams) # Output: [('I', 'love'), ('love', 'NLP')]

Here’s an example of how you might use several functions together to clean text data:
import nlp_text_preprocessing as tp
text = "I'm loving this NLP tutorial! Contact me at https://www.linkedin.com/in/uditya-narayan-tiwari-562332289/ Visit https://udityanarayantiwari.netlify.app/"
cleaned_text = tp.clean_text(text)
print(cleaned_text)
# Output: i am loving this nlp tutorial contact me at visit

import nlp_text_preprocessing as tp
tp.extract_features("I love NLP")

- Be cautious when using heavy operations like `lemmatize` and `spelling_correction` on very large datasets, as they can be time-consuming.
- The package supports custom cleaning and preprocessing pipelines by using these modular functions together.
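
The two notes above can be illustrated with a small sketch: steps are composed into one pipeline, and results are cached so that repeated texts are not reprocessed. The step functions below (`lower`, `strip_urls`, `squeeze_spaces`) and the helper `make_pipeline` are hypothetical stand-ins written with the standard library so the example runs on its own. They do not reproduce the package's behaviour. In practice, package functions such as `tp.to_lower_case` or `tp.remove_urls` could be passed instead, assuming each takes a string and returns a string as in the examples above.

```python
import re
from functools import lru_cache
from typing import Callable


# Hypothetical stand-in steps (not the package's implementations).
def lower(text: str) -> str:
    return text.lower()


def strip_urls(text: str) -> str:
    return re.sub(r"https?://\S+", "", text)


def squeeze_spaces(text: str) -> str:
    return " ".join(text.split())


def make_pipeline(*steps: Callable[[str], str],
                  cache_size: int = 10_000) -> Callable[[str], str]:
    """Compose steps left to right and memoize results for repeated inputs."""

    @lru_cache(maxsize=cache_size)
    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text

    return run


clean = make_pipeline(lower, strip_urls, squeeze_spaces)

docs = [
    "Check out https://example.com NOW",
    "Check out https://example.com NOW",  # duplicate: served from the cache
]
cleaned = [clean(doc) for doc in docs]
print(cleaned)                   # ['check out now', 'check out now']
print(clean.cache_info().hits)   # 1
```

Caching only helps when the dataset contains repeated texts. For mostly unique inputs, slow steps such as lemmatization are better run on a sample first to estimate the total cost.
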
