Arabic Stopwords JSON Repository

This repository contains a JSON file of Arabic stopwords, organized into named groups for easy use and integration. The groups and their sizes are listed below.

Total stopwords (unique across all groups): 1492

Per-group counts:

  • additional_common_stopwords: 110
  • auxiliary_and_modal_verbs: 29
  • core_prepositions_conjunctions: 55
  • extra_stop_words: 500
  • misc_function_words: 47
  • negation_and_question: 33
  • news_boilerplate_verbs: 32
  • pronouns_determiners: 60
  • punctuation_symbols: 22
  • quantities_numbers_units: 54
  • temporal_and_date_words: 54
  • user_custom_expanded: 917
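
The per-group counts above sum to 1913, more than the unique total of 1492, so some words appear in more than one group. Below is a minimal sketch of how the two kinds of count relate. It assumes the JSON is a top-level object mapping each group name to a list of words. That layout is an assumption, not confirmed by this README, and `SAMPLE_JSON` is a tiny made-up stand-in for the real file.

```python
import json

# Hypothetical stand-in for the repository's JSON file. The assumed layout is
# {group_name: [word, ...]}; the real file may be structured differently.
SAMPLE_JSON = """
{
  "core_prepositions_conjunctions": ["في", "من", "على", "و"],
  "negation_and_question": ["لا", "لم", "هل"],
  "extra_stop_words": ["في", "لا", "هذا"]
}
"""


def summarize(groups):
    """Return (per-group counts, number of unique words across all groups)."""
    # Count distinct words inside each group.
    per_group = {name: len(set(words)) for name, words in groups.items()}
    # A word shared by several groups is counted once in the union.
    unique_total = len(set().union(*groups.values()))
    return per_group, unique_total


groups = json.loads(SAMPLE_JSON)
per_group, unique_total = summarize(groups)

for name, count in sorted(per_group.items()):
    print(f"{name}: {count}")
print(f"unique across all groups: {unique_total}")
```

In the sample data two words are shared between groups, so the unique total is lower than the sum of the group sizes, the same effect seen in the real counts.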

The file is intended for natural language processing tasks such as text cleaning, token filtering, and linguistic analysis. Because the stopwords are grouped by function, a workflow can apply every group or only the ones it needs, for example in machine learning preprocessing or search indexing.
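
As an illustration of token filtering, here is a minimal sketch that merges selected groups into one set and drops matching tokens. The `{group_name: [word, ...]}` layout, the helper names `load_stopwords` and `remove_stopwords`, and the inline sample data are all assumptions made for this example. In practice the data would be read from the repository's JSON file.

```python
import json

# Hypothetical stand-in for the repository's JSON file. The assumed layout is
# {group_name: [word, ...]}; the real file may be structured differently.
SAMPLE_JSON = """
{
  "core_prepositions_conjunctions": ["في", "من", "على", "إلى"],
  "punctuation_symbols": ["،", "."]
}
"""


def load_stopwords(groups, include=None):
    """Merge the selected groups (all groups if include is None) into one set."""
    names = groups.keys() if include is None else include
    return {word for name in names for word in groups[name]}


def remove_stopwords(tokens, stopwords):
    """Keep only the tokens that are not in the stopword set."""
    return [token for token in tokens if token not in stopwords]


groups = json.loads(SAMPLE_JSON)
stopwords = load_stopwords(groups)

# Whitespace tokenization is used here only to keep the sketch short.
tokens = "ذهب الولد إلى المدرسة في الصباح".split()
filtered = remove_stopwords(tokens, stopwords)
print(filtered)
```

A set gives constant-time membership checks, which matters when filtering large corpora. Exact string matching will miss variants that differ in diacritics or letter forms, so real pipelines may want to normalize the text before filtering.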