Friday, February 7, 2014

Post #3 - Exploring Text Visualization Tools


I selected a book from Project Gutenberg: "The Evolution of Modern Medicine: A Series of Lectures Delivered at Yale University" by William Osler, based on lectures given in April 1913.

The above word cloud was created using Wordle based on the whole text of the book.

I then created a visualization of the same text using Voyant, which gave me the following summary:

  • There is 1 document in this corpus with a total of 66,108 words and 9,270 unique words.
  • Most frequent words in the corpus: the (5,782), of (3,853), and (2,154), in (1,776), to (1,349). 

And a visualized graph was generated as shown below:
Since the book is about modern medicine, it is no surprise that "medicine", "disease", "Galen", "anatomy", and "physician" are among the most frequent content words in the text.
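
For anyone curious about what these tools are doing underneath, here is a minimal Python sketch of counting word frequencies while skipping common function words. It is only an illustration, not how Wordle or Voyant is actually implemented: the function name `top_words`, the tiny stop-word set, and the sample sentence are all my own assumptions, and real tools ship much longer stop-word lists.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word set for illustration only.
STOP_WORDS = {"the", "of", "and", "in", "to", "a", "an", "is", "was"}

def top_words(text, n=5, stop_words=STOP_WORDS):
    """Return the n most common words in text, skipping stop words."""
    # Lowercase the text and keep runs of letters as words.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)

# Made-up sample sentence, not a quotation from the book.
sample = "The medicine of the physician and the medicine of Galen"
print(top_words(sample, n=3))
```

Passing an empty set as `stop_words` shows the unfiltered picture, where words like "the" and "of" dominate, much as in the Voyant summary above.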

Experimenting with these text analysis tools was fun, and the tools were pretty easy to use. I believe students will love this.


  1. Just an observation: I found that if you apply the stop-word list in Voyant, it eliminates words like "the," "of," "an," and "in." The resulting visualization will therefore have more content words and probably be much more interesting!

  2. Your post highlights that a word cloud is a good conversation starter for a classroom exercise. (Obviously, students wouldn't have read a whole book the first day, etc.) But this 'quick read' can give them a sense of what's going to be important in a class module or activity. (If you try out the stop-word list, it will give you more specific nouns that will reveal even more about this text's content.)

  3. You highlighted the most important words using a word cloud, which is exciting, and I believe students will enjoy it. It would be great if you would provide more details about how you intend to use these tools in your class.
