Are you seeking tools for data visualisation and analysis that won't bust your budget? Here are dozens of ideas.
Free and low-cost tools for data visualization
Are you looking for new tools to analyze data? This slide presentation, originally given at the National Institute for Computer-Assisted Reporting conference this month, features dozens of possibilities. Surely you'll find at least one application, service or learning resource to improve your data work.
30+ free tools for data visualization and analysis
You can see tools mentioned here -- and more -- in chart form as well. Computerworld's sortable chart of 30+ free tools for data visualization and analysis includes information on what each tool does, the skill level needed to learn and use it, and links to Computerworld reviews with more info.
Bookmark the chart at http://cwrld.us/DataVizToolsChart
Data cleaning with OpenRefine
Before you start analyzing data or putting it in visual form, you sometimes need to clean it. There could be data entry mistakes, such as numbers with missing or extra zeroes, or salary information where hourly and annual rates are mixed.
Or there could be multiple versions of the same item, making it tough to properly group data -- for example, many ways of entering the same company name, as Computerworld encountered when reporting on the top users of H-1B visas.
OpenRefine, formerly Google Refine, is a desktop application with algorithms to seek similarities in data records. Other features make it easier to do various types of data editing and clustering.
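To give a feel for how similarity-based clustering can group variant spellings of one company name, here is a minimal Python sketch of a "fingerprint" key approach. It is an illustration only, not OpenRefine's actual code: the function names `fingerprint` and `cluster` and the sample company names are made up for this example.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Build a normalized key so that variant spellings collide."""
    # Fold accented characters to plain ASCII where possible.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Trim, lowercase and drop punctuation.
    cleaned = ascii_name.strip().lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    # Sort the unique tokens so word order does not matter.
    return " ".join(sorted(set(cleaned.split())))


def cluster(names):
    """Group names that share a fingerprint; keep groups with 2+ members."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample entries, invented for illustration.
company_names = ["Acme Corp.", "ACME Corp", "acme corp", "Corp Acme", "Globex Inc."]
clusters = cluster(company_names)
print(clusters)
```

In this sketch the four Acme variants end up in one group, while the single Globex entry is left alone. A person would still review each suggested group before merging records.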
You can also compare entries in your data with items in another database to standardize records, as long as the reference database has a Reconciliation Service API. Possible reconciliation databases include OpenCorporates and Freebase.
Find it at: http://openrefine.org
Whether you use Excel, Google Docs or Open Office, your spreadsheet has visualization capabilities. Some data geeks dismiss Excel as lightweight, and some graphics professionals don't think much of spreadsheet visualization.
Nate Silver uses Excel
... you may have heard of this guy. New York Times blogger Nate Silver used data analysis to accurately predict 2012 election results. In a Reddit Ask Me Anything session, Silver said he uses Stata (which is definitely not analysis "on a shoestring") for his complex work and Excel for the rest, including quick charts for his blogs.
"It isn't that hard to make Excel charts look unExcellish if you take a few minutes and get away from the awful default settings," Silver wrote.
Better Excel charts
If you're looking for ideas on improving default Excel templates, try Storytelling With Data, a blog by Cole Nussbaumer, who works on Google's People Analytics team and teaches data visualization. If you search for Excel on her blog, you'll get some great tips and downloadable templates you can use as a starting point.
IBM's Many Eyes was a pioneer in the Web-based visualization space. It's well-known and very versatile, with 20 or so visualization types such as plots, bar charts, maps and network diagrams.
The Many Eyes project goal is to encourage people to share and analyze each other's data. That makes it easy to use, but it also has limits for sensitive projects: In order to use it, you must upload data to the project's site and make it publicly available.
Find it at: http://www-958.ibm.com/software/analytics/manyeyes/
Tableau Public is a general-purpose tool with a lot of capabilities. It's more powerful than Many Eyes but also has a significantly steeper learning curve. You can use it to put together single visualizations or dashboards with multiple visualizations that can be filtered in unison.
If you're using the free Tableau Public service, everything is hosted on its servers. Once your visualization is published, anyone can download your data and workbook structure. There is a paid version that offers more features and the ability to keep your workbook and data private.
Windows-only Tableau Public desktop software is required to create visualizations; visualizations can be viewed on any OS with a modern browser.
Find it at: http://www.tableausoftware.com/public
Infogr.am is a Latvian startup that features an easy-to-use interface for creating data visualizations – bar charts, column charts, treemaps and more. The company says the site was "built with newsrooms in mind" and is aimed at letting journalists create visualizations quickly, but anyone can use it.
There is minimal documentation and it's still somewhat a beta product, so this might not be the best choice for a major project that you hope will have a long life online -- but it might be useful for some quick one-off charts.
Find it at: http://infogr.am
Datawrapper version 1.0, also aimed at journalists who need to make quick and easy online charts, launched last November. This is an open-source project from ABZV, a German institute that trains print journalists. You can host your own version -- it's written in PHP -- or opt to use Datawrapper's hosting infrastructure (currently on Amazon) and embed charts on your website.
There are fewer than a dozen chart types, and even those offer fairly limited customization. But if you're short on time and design expertise, you may want to check this out.
Find it at: http://datawrapper.de
Why use a paid library? Highcharts has professional-looking interactive graphics and is fairly easy to code -- there are a lot of samples to tweak and use.
Find it at: http://www.highcharts.com
D3 is extremely full-featured and robust, but along with being much more flexible than a library like Highcharts, it also has a steeper learning curve.
Find it at: http://d3js.org
Resources for learning D3
There are a lot of resources available if you want to learn D3 -- Mike Bostock has a large collection of tutorials on GitHub -- and O'Reilly Media is publishing a book based on Web tutorials by Scott Murray.
• InfoVis, which has some fairly polished graphics
• MIT's Exhibit for presenting data, timelines and maps
Cascading Tree Sheets
Find it at: http://www.treesheets.org
So, to create a bubble chart that looks like this
Google Chart Tools
Google used to have two different charting platforms. One created static JPG images, but that is being phased out. What's left is the chart tools API for making interactive charts to embed on a website.
Interestingly, Google's chart tools include what Google calls a "Chart tools data source" protocol. That lets you do SQL-like queries on your data, and that protocol is implemented by Google Spreadsheets and Fusion Tables.
However, keep in mind that Google cleans house and kills off services from time to time -- something to consider when choosing to rely on Google APIs for long-life projects.
Find it at: https://developers.google.com/chart/
Google Spreadsheets has some charts that can be published within the sheet and also embedded in an external website.
In addition, there are some more interesting things going on with Google Docs as a back end for data projects on the Web.
Find it at: http://www.google.com/drive/start/apps.html#product=sheets
Find it at: http://builtbybalance.com/Tabletop
Dataset is part of the Miso Project created by the Guardian and Bocoup, and it's funded by, among others, the Bill and Melinda Gates Foundation. Dataset is very well documented with a number of examples and tutorials.
Find it at: http://misoproject.com/dataset
Google Fusion Tables
If you want to use data stored in a Google app for visualizations, Google's own Fusion Tables makes it fairly easy to do several types of dataviz, but especially maps. You can join a table with data about a region with a table that defines those areas' geographies and then map it.
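The join described above is the same idea as matching rows from two tables on a shared key. Here is a minimal Python sketch of that step. It does not use the Fusion Tables API: the function `join_on_key`, the column names and the placeholder rows are all assumptions made up for illustration.

```python
def join_on_key(data_rows, shape_rows, key):
    """Attach each data row to the boundary row that shares its key value."""
    shapes_by_key = {row[key]: row for row in shape_rows}
    joined = []
    for row in data_rows:
        shape = shapes_by_key.get(row[key])
        if shape is not None:  # skip regions that have no matching boundary
            joined.append({**shape, **row})
    return joined


# Hypothetical table of data about each region (values are invented).
population_rows = [
    {"district": "District A", "pop_change_pct": 3.5},
    {"district": "District B", "pop_change_pct": -1.2},
    {"district": "District C", "pop_change_pct": 0.8},
]

# Hypothetical table defining each region's geography (placeholder shapes).
boundary_rows = [
    {"district": "District A", "geometry": "placeholder shape A"},
    {"district": "District B", "geometry": "placeholder shape B"},
]

joined = join_on_key(population_rows, boundary_rows, "district")
for row in joined:
    print(row["district"], row["pop_change_pct"], row["geometry"])
```

District C drops out in this sketch because it has no boundary row, which is a reminder to check that key values match exactly in both tables before mapping.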
Find it at: http://www.google.com/drive/start/apps.html#fusiontables
Fusion Tables map example
This example maps U.S. Census data showing population changes in Massachusetts legislative districts.
For more on using the tool, see Computerworld's How to make a map in Google Fusion Tables. In addition, Google has posted a number of Fusion Tables tutorials.
ESRI Mapping for Everyone
ESRI's Mapping for Everyone is a free tool that lets you create and embed a few different types of maps on a Web page. The maps can only use data sets ESRI has included; you can't upload your own. If available data doesn't meet your needs, the Mapping for Everyone page also has links to other free ESRI tools, including mapping APIs.
Quantum GIS (QGIS)
QGIS is an open-source alternative to ArcGIS software -- perhaps not as polished or with as many labor-saving features, but highly capable and robust. The QGIS community has created a number of plugins that further extend its capabilities.
You can load multiple data tables into QGIS for multi-layered maps, join tables on common fields, and do full-fledged geospatial visualization and analysis.
Get it: http://www.qgis.org
R Project for Statistical Computing
The R Project for Statistical Computing is used heavily in the research and academic communities for data analysis, and it is also well suited for visualization. As with QGIS, there are a number of plug-ins as well as tools that extend its capabilities considerably.
Basic R is a free command-line tool that runs on Windows, Mac and various Unix platforms, but there's a whole ecosystem of tools around the platform as well. For example, RStudio is a free IDE designed for use with R. There's even a plug-in (free for non-commercial use only) for use within Excel.
Get it: http://www.r-project.org
As with many command-line environments, there's a learning curve for R. Some resources from the National Institute for Computer Assisted Reporting:
• R for Statistics: First Steps, PDF by Peter Aldhous
• Hands-on R, a step-by-step tutorial, PDF by Jacob Fenton
• Hadley Wickham's NICAR13 slides and code
R Chart Chooser
If you want to use R for data visualization, Chart Chooser in R has some good samples of graphics you can create -- and it includes downloadable code.
Get it: http://www.yaksis.com/posts/r-chart-chooser.html
Statwing aims to offer 1-click data analysis: You upload some data and then select different variables to be analyzed for frequencies, visualizations, correlations and such.
Naturally, you need to be careful about letting a cloud service auto-analyze your data. However, Statwing does explain in an advanced tab how it comes to conclusions like statistical significance and effect size.
It offers a free account that lets you upload up to 1MB of data that's stored for 24 hours. (If you want more storage and to have your data remain on the site, plans start at $25/month.)
Get it: https://www.statwing.com
For more on affordable data tools, see our chart of 30+ free tools for data visualization and analysis and the original accompanying story from an earlier NICAR conference (updated as tools change).