Beta · v1.2.0 · Built by Fahad Mirza · Co-developed with Claude (Anthropic)
wordcloud2 is a Stata command that reads a string variable containing free text, counts word frequencies across all observations, and renders the results as a fully interactive, self-contained HTML/SVG word cloud. No external software, no Python, no R. Open the output in any modern browser and it just works.
Example 1: Default output
Example 2: Interactive tooltip on hover
Screenshots generated from the example do-file using the built-in synthetic survey dataset.
- Collision-free layout: Archimedean spiral + AABB collision detection ensures words never overlap, mirroring the algorithm used by Python's
amueller/word_cloud - Interactive HTML output: hover any word to see its exact frequency count in a tooltip
- In-browser static export: embed a one-click download button for PNG, JPG, or SVG without leaving the browser
- Full color control: set background color and title color using any CSS value (hex, named,
rgb(),rgba(),hsl()) - Palette integration: plug in any palette from Ben Jann's colorpalette package by name
- Font size scaling: sizes mapped on the square root of frequency so rare words stay legible
- Built-in English stopwords: based on the
amueller/word_clouddefault list, fully suppressible - Safe by design: uses
preserve/restoreinternally; your dataset is never modified
Run this single command in Stata to download and install wordcloud2 immediately:
net install wordcloud2, from("https://raw.githubusercontent.com/fahad-mirza/wordcloud2_stata/main/") replaceIf you'd like to review the package description first, run net describe and then confirm the install:
net describe wordcloud2, from("https://raw.githubusercontent.com/fahad-mirza/wordcloud2_stata/main/")This displays the package name, version, author, and the list of files that will be installed. When you're ready, install with:
net install wordcloud2, replaceSSL note: If you receive an SSL error, run
set sslrelax onbefore either command.
which wordcloud2
help wordcloud2To use the palette() option, install Ben Jann's packages from SSC:
ssc install palettes, replace
ssc install colrspace, replace
ssc install moremata, replacewordcloud2, textvar(varname) [options]| Group | Option | Description | Default |
|---|---|---|---|
| Required | textvar(varname) |
String variable containing free text | — |
| Text processing | maxwords(#) |
Maximum words to display | 100 |
minfreq(#) |
Minimum word frequency to include | 2 |
|
minlength(#) |
Minimum word character length | 3 |
|
stopwords(string) |
Extra stopwords (space-delimited) | — | |
nostopwords |
Suppress built-in English stopword list | — | |
noclean |
Suppress lowercasing and punctuation removal | — | |
| Appearance | title("text"[, color() size()]) |
Chart title with optional color and size | "Word Cloud" |
width(#) |
Canvas width in pixels | 900 |
|
height(#) |
Canvas height in pixels | 500 |
|
margin(#) |
Minimum gap between words in pixels | 6 |
|
maxfontsize(#) |
Font size of the most frequent word | 80 |
|
minfontsize(#) |
Font size of the least frequent word | 10 |
|
bgcolor(color) |
Background color (any CSS value) | #16213e |
|
palette(name) |
Color palette name from colorpalette |
built-in tab10 | |
| Output | savefile(filename) |
Output HTML file path | wordcloud2.html |
export(format) |
Embed download button: png, jpg, or svg |
— |
Both bgcolor() and the color() sub-option of title() accept any CSS color. No extra quotation marks needed around the value.
bgcolor(#16213e) * hex
bgcolor(white) * named color
bgcolor(rgb(0, 30, 60)) * RGB
bgcolor(rgba(0,30,60,0.9)) * RGBA with alpha
bgcolor(hsl(220,60%,15%)) * HSLThe title() option takes the text first, then optional sub-options after a comma:
title("My Title")
title("My Title", color(#ff6b6b))
title("My Title", size(2em))
title("My Title", color(rgb(0,128,255)) size(2em))
title("My Title", size(28px) color(navy))size() accepts any CSS font-size value. A plain number is treated as em.
Any palette name accepted by Ben Jann's colorpalette package can be used. wordcloud2 calls colorpalette name, n(nplaced) nograph internally, so the palette is automatically interpolated or cycled to exactly as many colors as words were placed.
Some commonly used palettes:
| Name | Description |
|---|---|
Set1 |
ColorBrewer qualitative - high contrast |
Set2 |
ColorBrewer qualitative - pastel |
Dark2 |
ColorBrewer qualitative - darker tones |
Accent |
ColorBrewer with accents |
tableau |
Tableau 10 (Tableau's default palette) |
viridis |
Perceptually uniform, blue-green-yellow |
plasma |
Perceptually uniform, purple-orange |
magma |
Perceptually uniform, black-purple-yellow |
If colorpalette is not installed or the palette name is unrecognised, wordcloud2 falls back to its built-in Tableau-inspired ten-color palette and prints a warning.
All examples below use this synthetic dataset of open-ended survey responses:
clear
input str200 response
"The product quality is excellent and the service was outstanding"
"Great quality but delivery was slow and customer service needs improvement"
"Amazing experience with the product and very helpful support team"
"Poor quality product arrived damaged and customer service was unhelpful"
"Outstanding quality and excellent delivery speed highly recommend"
"The service team was helpful but the product quality could be better"
"Excellent customer experience product quality exceeded my expectations"
"Good quality product but packaging was damaged during delivery"
"Highly recommend excellent quality and very fast delivery service"
"The support team was amazing and the product quality is outstanding"
"Product quality is good but customer service response was very slow"
"Great delivery service but product quality did not meet expectations"
"Excellent quality product and outstanding customer service experience"
"The product arrived damaged but the customer service team was helpful"
"Amazing support team excellent delivery and outstanding product quality"
endwordcloud2, textvar(response)Outputs wordcloud2.html in the current directory. Built-in English stopwords are applied, words appearing fewer than twice are dropped, and the built-in ten-color palette is used on a dark navy background.
wordcloud2, ///
textvar(response) ///
maxwords(80) ///
minfreq(1) ///
minlength(4) ///
title("Customer Feedback Word Cloud", ///
color(rgb(100,100,100)) size(4)) ///
palette(tableau) ///
width(1000) ///
height(550) ///
margin(1) ///
maxfontsize(100) ///
minfontsize(10) ///
nostopwords ///
bgcolor(rgb(255,255,255)) ///
savefile("feedback_cloud.html") ///
export(png)To open the output in your default browser after running:
* Windows
shell start feedback_cloud.html
* Mac
shell open feedback_cloud.html
* Linux
shell xdg-open feedback_cloud.html* Default title styling
wordcloud2, textvar(response) title("Customer Feedback")
* Colored title on dark background
wordcloud2, textvar(response) ///
title("Customer Feedback", color(#ff6b6b) size(2em)) ///
bgcolor(#1a1a2e)
* RGB title color on white background
wordcloud2, textvar(response) ///
title("Customer Feedback", color(rgb(0,128,255)) size(2em)) ///
bgcolor(white)wordcloud2, textvar(response) palette(Set1)
wordcloud2, textvar(response) palette(Dark2) bgcolor(white)
wordcloud2, textvar(response) palette(viridis) bgcolor(#16213e)* Raise thresholds to reduce noise
wordcloud2, textvar(response) minfreq(3) minlength(5)
* Suppress built-in list entirely (e.g. for non-English text)
wordcloud2, textvar(response) minfreq(1) nostopwords
* Add domain-specific stopwords on top of the built-in list
wordcloud2, textvar(response) stopwords(product service) minfreq(2)wordcloud2, textvar(response) savefile(feedback_cloud.html) export(png)
wordcloud2, textvar(response) savefile(feedback_cloud.html) export(jpg)
wordcloud2, textvar(response) savefile(feedback_cloud.html) export(svg)use "your_survey_data.dta", clear
wordcloud2, ///
textvar(open_ended_comments) ///
maxwords(120) ///
minfreq(3) ///
title("Open-Ended Survey Responses") ///
savefile("survey_wordcloud.html")wordcloud2 processes text in ten sequential steps. Your dataset is never modified — the command uses preserve/restore throughout.
- Concatenation - all observations are joined into a single string via a Mata routine (
wc_concat_rows), avoiding Stata macro length limits - Cleaning - text is lowercased and non-alphabetic characters are replaced with spaces (suppressed by
noclean) - Tokenisation - the master string is split on whitespace using Mata's
tokens(), giving one word per observation - Stopword filtering - the built-in English stopword list is applied (suppressible via
nostopwords), plus any words instopwords(); words belowminlengthare dropped - Frequency counting - unique words are counted with
contract, filtered byminfreq, sorted descending, and capped atmaxwords - Font sizing - sizes are linearly mapped on the square root of frequency onto [
minfontsize,maxfontsize], so rare words stay legible - Bounding boxes - each word's pixel box is estimated from font size and character count, padded by
margin - Layout - a Mata routine (
wc_place_words) places words largest-first along an Archimedean spiral from the canvas centre, with AABB collision detection; unplaceable words are silently omitted - Color assignment - colors come from
colorpaletteif installed andpalette()is set, otherwise from the built-in ten-color palette - HTML output - a self-contained HTML file is written with the SVG cloud, hover tooltip JavaScript, and (if
export()is set) the in-browser raster or SVG export function
| Requirement | Version / Notes |
|---|---|
| Stata | 15 or later (ustrregexra and strL support required) |
| colorpalette | Optional - for palette() option; install via ssc install palettes |
| colrspace | Optional - required by colorpalette; install via ssc install colrspace |
wordcloud2_stata/
├── wordcloud2.ado # Main command
├── wordcloud2.sthlp # Stata helpfile (type: help wordcloud2)
├── wordcloud2.pkg # Package manifest (required for net install)
├── stata.toc # Repository index (required for net install)
├── wordcloud2_example.do # Example do-file with synthetic data
├── screenshots/
│ ├── Wordcloud_Example_1.png # Default dark-background output
│ └── Wordcloud_Example_2.png # Interactive tooltip on hover
└── README.md
Fahad Mirza (Author and Developer)
Claude (Anthropic) | Editor and co-developer.
wordcloud2 is currently in Beta. The core functionality is stable but improvements, new options, and refinements are planned over time. Feedback and bug reports are welcome - please open an issue on this repository.
This program was built collaboratively between the author and Claude, which assisted with algorithm implementation, debugging, and feature development throughout.
See LICENSE for details.
Also see: colorpalette documentation by Ben Jann

