Visualizing text analytics using jsfiddle.net (and friends!)

Jul 26, 2013
Alex Piggott

In previous blog posts we’ve talked about ways of visualizing the results of various simple analytics generated using Infinit.e plugins: geographic analyses in Pentaho and R, and more complex temporal analytics in R. In this post, we’ll show one of our favorite ways of quickly creating and sharing valuable visualizations using the excellent jsfiddle web-site.

The jsfiddle web-site, based on open source technologies, is designed for developing simple HTML/javascript constructs and web page fragments. It is a great fit for creating free and quick views of structured data pulled from the results of Infinit.e analytics.

All “fiddles” are public and can be shared with anyone, but the JSONP requests to the Infinit.e server carry the browser’s authorization cookie. This provides a built-in security layer: only users with access to the data in question can see the visualization.

Jsfiddle provides 3 input windows, and 1 output window that can be made full screen once the code is working:

  • HTML: hosts the “view layer” – in our case this is normally just a “<div>” (or similar) containing the component provided by an external library.
  • Javascript: this is where all the work happens – obtaining the results of the Infinit.e analytics in JSON, doing any minor re-formatting required, and loading into the chosen visualization library.
  • CSS: this can be used to style the final display – we normally leave it alone.

External libraries can be loaded easily using the sidebar, both for the asynchronous requests to the Infinit.e API and for the ensuing visualizations. For the following examples, we used Mootools for the requests, and Google Charts and protovis for the visualizations.

The javascript portion for both examples is also straightforward, consisting of three sections:

  1. Load the data using Mootools
  2. Format the data according to the specifications of the visualization you want to use
  3. Launch the graph or visualization
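
The three sections above can be sketched roughly as follows. This is a minimal outline rather than the code from our fiddles: API_URL is a hypothetical placeholder, the 'callback' parameter name is an assumption about the server, and runFiddle and loadWithMootools are names made up for illustration. Request.JSONP comes from the MooTools More library.

```javascript
// Hypothetical placeholder: substitute the URL of your own Infinit.e API call.
var API_URL = 'https://infinite.example.com/path/to/api/call';

// Step 1: load the data. Request.JSONP is part of MooTools More, so MooTools
// must be selected in the jsfiddle sidebar before this function is called.
function loadWithMootools(onLoaded) {
    new Request.JSONP({
        url: API_URL,
        callbackKey: 'callback', // assumed name of the server's JSONP parameter
        onComplete: onLoaded
    }).send();
}

// Wire the three steps together. Each step is passed in as a function so the
// same skeleton can be reused with different data and display libraries.
function runFiddle(load, format, launch) {
    load(function (json) {
        launch(format(json)); // steps 2 and 3: format, then launch
    });
}
```

In a fiddle this would be started with something like runFiddle(loadWithMootools, formatForChart, drawChart), where the last two functions (hypothetical names) depend on the visualization library chosen.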

Visualizing Daily Sentiment from E-Mails

In this example we start off with the “standard” Enron email data, enriched with keyword extraction and sentiment from the Salience NLP engine. We have written a simple javascript map/reduce job that aggregates all the keywords’ sentiment per day to obtain a single timeline (for a more sophisticated breakdown, check out the “Making the most of sentiment scores using IKANOW and R” blog post mentioned earlier). The map/reduce job can be seen here: http://jsfiddle.net/AlexAtIkanow/qaNwk/. Since it was written in javascript, we can handily store it in a “non-functional” fiddle for easy sharing along with the visualization.
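
For readers who have not written a map/reduce job before, the idea of aggregating keyword sentiment per day can be sketched as below. This is not the job linked above: the field names (sent, keywords, sentiment) are hypothetical, keeping a total and a count is just one possible aggregation, and runMapReduce is a tiny in-memory stand-in for the real engine, which would supply its own emit function.

```javascript
// Map step: emit one (day, partial result) pair per keyword in an email.
// The document fields used here are hypothetical.
function map(doc, emit) {
    var day = doc.sent.slice(0, 10); // "2001-05-14T09:30:00Z" -> "2001-05-14"
    doc.keywords.forEach(function (kw) {
        emit(day, { total: kw.sentiment, count: 1 });
    });
}

// Reduce step: combine all the partial results emitted for one day.
function reduce(day, values) {
    return values.reduce(function (acc, v) {
        return { total: acc.total + v.total, count: acc.count + v.count };
    }, { total: 0, count: 0 });
}

// In-memory runner for trying the two functions out locally.
function runMapReduce(docs) {
    var groups = {};
    docs.forEach(function (doc) {
        map(doc, function (key, value) {
            (groups[key] = groups[key] || []).push(value);
        });
    });
    // One output record per day, in date order.
    return Object.keys(groups).sort().map(function (day) {
        return { day: day, value: reduce(day, groups[day]) };
    });
}
```

Dividing total by count would give an average sentiment per day instead of a sum.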

We used Mootools to load the data and then formatted it to the specifications for the Google Charts timeline (just like the finance.google.com graphs!). Finally, we launched the graph. Each of these sections is outlined below:

[Image: jsfiddle2]
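
As a rough illustration of the format and launch steps, the sketch below converts per-day results into rows for a Google Charts DataTable and draws them. It assumes the annotated timeline chart (google.visualization.AnnotatedTimeLine) is the one being used, and the input shape ({ day, sentiment }) and the function names are hypothetical.

```javascript
// Step 2: convert [{ day: 'YYYY-MM-DD', sentiment: number }, ...] (a
// hypothetical shape) into [Date, number] rows, sorted by date.
function toTimelineRows(results) {
    return results
        .map(function (r) {
            var p = r.day.split('-').map(Number);
            // Months are zero-based in the javascript Date constructor.
            return [new Date(p[0], p[1] - 1, p[2]), r.sentiment];
        })
        .sort(function (a, b) { return a[0] - b[0]; });
}

// Step 3: launch the graph. `google` is the global object provided by the
// Google Charts loader once the annotated timeline package has loaded.
function drawTimeline(google, rows, element) {
    var table = new google.visualization.DataTable();
    table.addColumn('date', 'Day');
    table.addColumn('number', 'Sentiment');
    table.addRows(rows);
    new google.visualization.AnnotatedTimeLine(element)
        .draw(table, { displayAnnotations: false });
}
```

The element passed to drawTimeline would be the “<div>” from the HTML window, which needs an explicit width and height for this chart type.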

If you want to share the visualization publicly (and have confirmed that the data has no access restrictions!) then you can simply save the results of the API call to disk and then share them on a file hosting site such as Dropbox or box.net, or from your web hosting account. If the resulting data is small enough then it can also be pasted directly into another jsfiddle. The following link provides the full fiddle but with the API call commented out and replaced with a static version of the data requiring no authentication: http://jsfiddle.net/AlexAtIkanow/xusqA/.
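
The swap from live call to static data can be as small as the sketch below. The values in STATIC_DATA are placeholders for illustration only, not results from the Enron analysis, and API_URL and loadData are hypothetical names.

```javascript
// Placeholder values for illustration only; paste the saved API response here.
var STATIC_DATA = [
    { day: '2001-05-14', sentiment: 0.25 },
    { day: '2001-05-15', sentiment: -0.5 }
];

function loadData(onLoaded) {
    // Live, authenticated version (needs MooTools More and a hypothetical API_URL):
    // new Request.JSONP({ url: API_URL, onComplete: onLoaded }).send();

    // Public version: hand back the saved copy, no authentication required.
    onLoaded(STATIC_DATA);
}
```

Because the rest of the fiddle only sees the callback, the format and launch steps do not need to change when switching between the two versions.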

Visualizing Clusters of Similar Tweets

In this slightly more complex example, we used Mahout to cluster, using SVD, 5,000 tweets discussing Superstorm Sandy. Both our community and enterprise editions provide some level of built-in support for this powerful Hadoop-based machine learning library. This particular analytic matches up shared keywords across the tweets to convert each one into a point in 3-dimensional space, where two points being close together indicates that the corresponding tweets have similar content.

Then we used protovis to show a scatter plot of the projected tweets, with color used as the third dimension. Protovis has since evolved into the more sophisticated d3.js – but we still often prefer the earlier version to get up and running quickly, as we’ve discussed in a previous post.

The jsfiddle javascript code has the same format as the simpler example above:

  1. Load the data using Mootools
  2. Format the data according to the protovis requirements
  3. Launch the graph

Only the third step is more complicated, because the functionality provided is more sophisticated and requires more complex configuration.
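
To give a feel for what that configuration involves, here is a minimal protovis scatter plot sketch with color as the third dimension. It is not the code from our fiddle: the input shape ({ text, coords }), the function names, and the blue-to-red color range are all assumptions made for illustration.

```javascript
// Step 2: flatten [{ text: '...', coords: [x, y, z] }, ...] (a hypothetical
// shape) into objects that are convenient for protovis accessors.
function toDots(tweets) {
    return tweets.map(function (t) {
        return { x: t.coords[0], y: t.coords[1], z: t.coords[2], text: t.text };
    });
}

// Smallest and largest value of one dimension, used to set up the scales.
function extent(dots, key) {
    var values = dots.map(function (d) { return d[key]; });
    return { min: Math.min.apply(null, values), max: Math.max.apply(null, values) };
}

// Step 3: launch the graph. `pv` is the protovis global; canvasId is the id
// of the container element in the HTML window.
function drawScatter(pv, dots, canvasId, width, height) {
    var ex = extent(dots, 'x'), ey = extent(dots, 'y'), ez = extent(dots, 'z');
    var x = pv.Scale.linear(ex.min, ex.max).range(0, width);
    var y = pv.Scale.linear(ey.min, ey.max).range(0, height);
    var c = pv.Scale.linear(ez.min, ez.max).range('blue', 'red'); // third dimension

    var vis = new pv.Panel().canvas(canvasId).width(width).height(height);
    vis.add(pv.Dot)
        .data(dots)
        .left(function (d) { return x(d.x); })
        .bottom(function (d) { return y(d.y); })
        .fillStyle(function (d) { return c(d.z); })
        .title(function (d) { return d.text; }); // tweet text as a tooltip
    vis.render();
}
```

Most of the extra length compared with the timeline example is in the scales and per-dot accessors, which protovis leaves to the caller.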

Conclusions

Jsfiddle is a powerful platform for data visualization when used in conjunction with openly available javascript display libraries and with analytics platforms like Infinit.e that provide a JSON-based API.

The generated web pages can be shared in two ways: openly, by saving the results of the API call from the browser and uploading them to a site like Dropbox, or securely, by connecting to the API each time so that its access controls apply.

A forthcoming blog post will showcase some new widgets that provide configurable views of custom analytics within the existing Infinit.e GUI framework.

