Animated word clouds are a dynamic visualization that displays word frequencies over time. They add a time dimension to the classic word cloud and can be generated in Python. The AnimatedWordCloud library offers n-gram frequency visualization and text cleaning. Once installed, it generates frames that can then be combined into a video. It is useful for presentations, teaching, and exploring text data. The library has some dependencies and limitations, but future releases may address them.
Introducing Animated Word Clouds for Data Storytelling
An animated word cloud is a dynamic visualization that displays the absolute frequencies of n-grams (sequences of n consecutive words) over time. It adds a time dimension to the classic word cloud, which gives greater prominence to words that appear more frequently in a source text.
Visualizing changes in text data over time can be challenging, especially when dealing with large datasets. Instead of creating multiple summary tables or graphs, you can now generate an MP4 video that tells a story and captivates your audience.
Key Features of AnimatedWordCloud Library:
- Provides n-gram frequency visualization for all Latin-alphabet languages
- Cleans text data from punctuation, numbers, and stopwords
- Generates yearly or monthly n-gram frequencies
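To make the cleaning and aggregation steps concrete, here is a minimal sketch of how yearly n-gram counts could be computed with pandas and the standard library. It is not the library's own implementation: the stopword set, the helper names (clean_tokens, ngrams, yearly_ngram_counts) and the sample data are all hypothetical, and the column names 'contents' and 'date' are assumed only to match the usage example later in this article.

```python
import re
from collections import Counter
from typing import Dict, List

import pandas as pd

# Tiny hypothetical stopword set, for illustration only.
STOPWORDS = {"the", "a", "of", "and", "in", "is"}


def clean_tokens(text: str) -> List[str]:
    """Lowercase, drop punctuation and digits, then remove stopwords."""
    # Keep only Latin letters (including common accented ones) and whitespace.
    letters_only = re.sub(r"[^a-zà-öø-ÿ\s]", " ", text.lower())
    return [tok for tok in letters_only.split() if tok not in STOPWORDS]


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return every run of n consecutive tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def yearly_ngram_counts(df: pd.DataFrame, n: int = 1) -> Dict[int, Counter]:
    """Count n-grams per calendar year from 'contents' and 'date' columns."""
    counts: Dict[int, Counter] = {}
    years = pd.to_datetime(df["date"]).dt.year
    for year, text in zip(years, df["contents"]):
        counts.setdefault(int(year), Counter()).update(ngrams(clean_tokens(text), n))
    return counts


# Hypothetical sample data.
sample = pd.DataFrame({
    "date": ["2020-01-15", "2020-06-01", "2021-03-10"],
    "contents": [
        "The virus spreads in 2020.",
        "Virus and vaccine news",
        "Vaccine rollout, vaccine supply!",
    ],
})
result = yearly_ngram_counts(sample, n=1)
print(result[2020].most_common(3))
```

Each year's Counter corresponds to one frame of the animation: the larger a word's count in that period, the larger it would be drawn.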
How to Use AnimatedWordCloud
Step 1: Installation
Create a virtual environment for your project to avoid dependency conflicts. Install AnimatedWordCloud using pip:
pip install AnimatedWordCloud
Note: This library requires Python 3.8 and is best used with PyCharm Community Edition.
Step 2: Generate Frames
Import the necessary libraries and load your text data:
import pandas as pd
data = pd.read_csv('dataset.csv')
Next, import the animated_word_cloud function and pass in the required parameters:
from AnimatedWordCloud import animated_word_cloud
animated_word_cloud(text=data['contents'], time=data['date'], date_format='us', ngram=1, freq='Y', stopwords=['english', 'french', 'german', 'spanish'])
This code generates PNG frames for each period and stores them in the “postprocessing/frames” folder.
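Before generating frames, it can help to check that the dataset has the shape the call above expects. The sketch below builds a small in-memory stand-in for dataset.csv and inspects its dates. The sample rows are invented, and the reading of date_format='us' as month/day/year is an assumption rather than something confirmed here.

```python
import io

import pandas as pd

# Hypothetical stand-in for dataset.csv with the two columns used above.
csv_text = io.StringIO(
    "date,contents\n"
    "01/15/2020,First headline about vaccines\n"
    "12/03/2021,Second headline about markets\n"
    ",Row with a missing date\n"
)
data = pd.read_csv(csv_text)

# Assumption: US-style dates are written month/day/year.
parsed = pd.to_datetime(data["date"], format="%m/%d/%Y", errors="coerce")

# Rows whose date is missing or unparseable become NaT.
n_bad = int(parsed.isna().sum())
print(f"{n_bad} row(s) without a usable date")

# Number of texts per year, i.e. how much data each yearly frame would draw on.
print(parsed.dt.year.value_counts().sort_index())
```

A quick look at the per-year counts shows whether some periods are too sparse to produce a meaningful frame.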
Step 3: Create a Video from Images
Download the ffmpeg folder and frames2video.bat file from the provided link and place them in the “postprocessing” folder. Run the frames2video.bat file to generate the desired output video file.
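For readers who are not on Windows or prefer to stay in Python, the same step can be approximated by calling ffmpeg directly. This is a sketch, not the contents of frames2video.bat: the frame naming pattern (00001.png, 00002.png, ...), the output file name and the frame rate are all assumptions, and ffmpeg must already be available on the system PATH.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(frames_dir: Path, output: Path, fps: int = 30) -> List[str]:
    """Build an ffmpeg call that turns numbered PNG frames into an MP4."""
    # Assumption: frames are named 00001.png, 00002.png, ...
    pattern = frames_dir / "%05d.png"
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-framerate", str(fps),  # input frames per second
        "-i", str(pattern),
        "-c:v", "libx264",       # H.264 video codec
        "-pix_fmt", "yuv420p",   # pixel format most players can read
        str(output),
    ]


frames_dir = Path("postprocessing/frames")
command = build_ffmpeg_command(frames_dir, Path("postprocessing/output.mp4"))
print(" ".join(command))

# Run only when ffmpeg is installed and at least one frame exists.
if shutil.which("ffmpeg") and any(frames_dir.glob("*.png")):
    subprocess.run(command, check=True)
```

If the generated frames use a different naming scheme, adjust the pattern to match before running the command.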
Practical Applications
Animated word clouds are particularly useful for presentations and teaching. They offer a more engaging way to present facts and analyze text data. Here are some practical applications:
- Teaching the history of science by analyzing article headlines or journal abstracts
- Understanding customer sentiment by analyzing product reviews
- Summarizing and exploring text datasets collected over time
- Modeling COVID-19-related discussions or analyzing US presidential debates
Technical Details
AnimatedWordCloud is built on the WordsSwarm project and uses the Arabica library for text processing and word frequency aggregation. It also uses PyBox2D for physics and collision detection, and Pyglet and PyGame for creating animations.
The library handles absolute word frequencies and scales the data to display word clouds for datasets of different sizes. It can handle missing values and mojibake errors. Future releases will include bigram frequencies and improved compatibility with different IDEs.
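One common way to make absolute frequencies comparable across datasets of different sizes is min-max scaling onto a fixed range of font sizes. The sketch below illustrates that idea only. It is not taken from the library's source, and the function name and size bounds are hypothetical.

```python
from typing import Dict


def scale_to_font_sizes(
    freqs: Dict[str, int], min_size: float = 10.0, max_size: float = 80.0
) -> Dict[str, float]:
    """Map absolute counts linearly onto the range [min_size, max_size]."""
    lo, hi = min(freqs.values()), max(freqs.values())
    if hi == lo:
        # All words are equally frequent: draw them at the midpoint size.
        mid = (min_size + max_size) / 2
        return {word: mid for word in freqs}
    span = max_size - min_size
    return {
        word: min_size + span * (count - lo) / (hi - lo)
        for word, count in freqs.items()
    }


# A small and a large dataset with the same relative frequencies.
small = scale_to_font_sizes({"virus": 2, "vaccine": 1})
large = scale_to_font_sizes({"virus": 2000, "vaccine": 1000})
print(small)
print(large)
```

Both calls yield the same sizes, so a corpus a thousand times larger still fits in the same frame. The function expects at least one word and would raise an error on an empty dictionary.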