15 Feb 2017, 20:53

Effective Cartograms

Earlier today I tweeted something that I had been contemplating for a little while about cartograms. For those of you unfamiliar, a cartogram is a type of map that uses a attribute (population, income etc) of a geographic feature to influence it’s area in some capacity. The cartogram that is most familar to most people is the ‘continuous irregular cartogram’, which can be seen as figure C in this image:

A and B are square cartogram and a continuous regular cartogram respectively. Cartograms have a long history spanning back as far as 1973, and there are over 25 noted algorithms for producing them. In addition there are many different types and variations. Today they are often popular in (but not limited to) academic journals, media companies and interest websites.

What are the potential pitfalls of cartograms?

I knew that the tweet might be mildly controversial in the mapping/data visualsiation community but I was interested to see others opinions on the matter:

Although in hindsight I wish I had worded it slightly differently, my underlying premise remains the same upon writing this post. My reasoning is as follows:

  • Cartograms explicitly distort the shape of a geographic area.
  • In many cases distortions away from the socially normalised Mercator are so extreme it makes it difficult for the viewer to interpret which geographic region is which.
  • There is no obvious way of determining what the attribute scale is, as such there can be no legend to aid the viewer.
  • As such it is very difficult to determine middle values. Implicitly it is the geographic region that is least distorted, which may be hard to perceive for the user.

The primary reason for using a map to visualise geographic data is to convey meaning to its viewer, to tell a story about what the underlying geographic trends are. If the user can’t decipher what geographic regions they are actually seeing the medium begins to undermine itself. Cartograms make the interpreter have to constantly compare the distorted image they are seeing to their perceived mental image of what that area should look like. Let me give an example from one the top results on Google Images:

As you can see the world is highly distorted on top of the distortion provided by the Mercator projection. In some ways the map fulfills it’s purpose. We can see the US has low distortion, Australia is relatively large. However, we can’t determine what that hectares value is for any nation. It becomes a lot harder to determine anything of substance about the location or value of low ranking countries other than ‘It must be very low and it’s somewhere in this array of squashed lines’. Through the critical lens, it is a logical conclusion that cartograms primary purpose is to leave the user with an abstract idea of the relative values of some factor between geographic regions.

At a higher level, legibility and simplicity are two of the Ordnance Survey’s key cartographic design principles, both of which I would contend extreme cartograms tend not to conform too. Indeed, my main reasoning for seeing cartograms as unfavourable is not that I think that they are inherently bad, but rather there are more suitable ways to represent said data. I would put forward that a simple choropleth map would often be more suitable than most cartogram representations.

How can cartograms be used effectively?

Having explored why cartograms can be easily misapplied, there are some potential ways that we can make cartograms more effective for users. My goal for this post was to find some constructive ways to help make better cartograms. The main starting point is to consider why we do geographic representations. Sometimes it is very easy to fall into the pitfall of assuming a key factor is ‘does this look cool?’, but arguably we do not often ask ‘does this help users understand the point we’re trying to make?’. To put it plainly, simple and clear tends to beat cool and technical in most use cases.

The use case for cartograms is often to explore relative values of some data set. We forfeit geographic fidelity for (arguably) a more interesting visualisation. So, if we are to utilize this method, how can we use them effectively? Here are some ideas around that:

  • Consider, does the data suit a cartogram? Highly skewed or anomalous data may not work exceptionally well in conjunction with a cartogram.
  • Avoiding algorithms that produce extreme results that may remove any meaningful geographic shape from the visualisation will make the map more readable and digestible.
  • Not mixing the method with other methods like choropleths (i.e. using colours to visualise another variable) can improve readability.
  • Would it work well as an animation? An animation from a normal map to a cartogram might help the user deduce which region is which. This reduces the mental mapping of undistorted regions to the cartogram.
  • Using labels might be another way to help the user determine what each distorted shape represents, preventing them having to mental gymnastics to figure it out. This may be difficult if distortions are extreme.
  • Would the cartogram work well if it was interactive? This would allow users to interogate data using mouse/touch, revealing values and region names.
  • Another point is considering using Dorling cartograms, these use uniform shapes to represent regions. These might provide a more palatable user friendly visualisation. These visualisation may bring there own host of problems however.

Feedback & Credits

Do you have an alternative opinion? Something to add? Feel free to comment here or reach out to me on Twitter!

Credits

22 Jan 2017, 18:33

Dense Spatial Data and User Experience

Across many disciplines we are increasingly seeing the inflation in size of datasets. From megabytes to gigabytes and beyond, with certainly no exception in the geospatial industry. This is in turn posing an ongoing challenges for geospatial developers, web mappers, cartographers and various other spatial data wrangling professionals.

As with web pages, increased density of data on our focus point (our map) can subsequently lead to a poor user experience. Visual clutter and poor cartography detracts from the ability to extract meaning, and also makes exploring data more difficult. As a hyperbolic example see the following image:

This has been a problem that I have faced multiple times at my work helping startups at the Geovation Hub, alongside my own personal projects. As such I thought I would share my thoughts on the subject. The main purpose of this article is to visit a selection of ways to help tackle the problem of visualising dense datasets alongside improving usability of maps along the way.

Disclaimers: The article focuses predominantly on point data from a web mapping perspective; I make no claims at being an expert cartographer!. The Carto team wrote a great blog post covering a fair percentage of the approaches previously, so kudos for that! Lastly this list is not exhaustive and I make no claim any of these methods are particularly novel. With all that out the way, let’s take a look!

Clustering

Clustering; perhaps the most classic method of reducing point data overcrowding. The purpose of clustering is to appropriately pull together (‘cluster’) data points that are close together between different zoom levels to avoid map over crowding. Some libraries have nice declustering effects as you zoom in and out. Many web mapping providers provide built in or plugin based marker clustering to account for this. Here’s an example using MapBox:

Opacity Based Rendering

When I was building the GitHub map, I faced a noticable problem when zoomed out. The density of the cities at these zoom levels made the whole map very cluttered and hard to digest the location of cities with high (normalised!) number of Github users. Due to the power of functions and the expressiveness they provide it was possible to create an effect which would essentially allow all city points to be seen at full opacity when zoomed in, and reduce the opacity of less significant points at lower zoom levels.

Binning

The perhaps unfortunately named binning approach allows you to pull together points in a geographic region and ‘bin’ them together, taking some average value of an attribute data (or even just density) to give a value to the bin shape. The most classic bin is most likely the square, but variations on this include hexagons and triangles. Here’s an example from Mike Bostock (the author of D3.js) using bivarate coloured hexagon bins for data on median age of shoppers of a popular supermarket chain:

Heatmaps

The classic heatmap. Heatmaps allow you to create a continuous surface of some geometry attribute (or again just point density). Heatmaps can often be misused and certainly have their downfalls (this blog post by Kenneth Field gives a good explanation as to how and why) but for certain data and expressed correctly I believe they can be a useful approach. I think this is especially true when the blur effect distance is a function of some real world distance relating to the data points themselves. Here is an example of a heatmap depicting Earthquakes using Esri’s ArcGIS JavaScript API:

Collision Detection

For your users finding a useful data point amongst hundreds can sometimes be like trying to find Wally (Waldo) in a crowd, only an order of magnitude less entertaining. One technique we have been looking into at my current work is using collision detection. Collison detection allows you to determine when markers or labels collide with each other and hide the ones you are less interested in (this can be implemented via some sort of weighting system). For or specific example we wanted to prevent label overcrowding. Thanks to Vladimir Agafonkin’s rbush we were able to determine collisions and weight appropriately for our use case. This allows us to only show the most pertient labels at each zoom level. It turns out that MazeMap already worked on a similar idea as an opensource project for Leaflet, so shout out to them too! Although in our case we tackled map labels the same idea could be used with markers or other map items too.

The Geovation Hub along with our friends at Podaris (a super cool startup building a real-time infrastructure planning platform) helped us release Labelgun to combat the label overcrowding problem. Labelgun works across mapping libraries as it depends on label coordinates rather than their implemented objects. Here’s a demonstration of how that works in Leaflet:

Chips

Chips (think poker, as opposed to what Americans call crisps), are arguably one of the more more recent approaches on the list. They offer a unique way to visualise multiple data points that share same location. Chips stack markers as small chips vertically, allowing them to be more clearly exposed to end users. The Carto team gave some great coverage to chips as a visualisation technique in the aforementioned blog post. From my perspective they lose there advantage on highly dense visualisations as they begin to overlap and expand very far north. If you want to try the approach out Ivan Sanchez has implemented this in Leaflet alongside the explanation given in the Carto blog post.

Spidering

Spidering (arguably another unbefitting nomenclature) is another way of compressing overlapping information points. The approach fans out markers away from their central overlapping point. This works well for locations that lots of markers sharing the same point, but are sparsely placed. This might be appropriate for say highly populated global cities but fall down for more continuous data sets. There are examples of this in Leaflet and also in Google Maps. Here is in Leaflet:

Pie Chart Markers

Pie and chips, both classic English cuisine, but also arguably two sensible ways to express multiple points with different attributes at the same location. Chips and pie charts have a similar set of properties, but pie chart markers have the advantage of showing ratios of attributes to one another slighty better. A good example use case for pie chart markers might be showing the percent of voters for various political parties in a city/ward.

Pie charts also produce a drastically different visual effect, which might be more fitting to your purposes. Furthermore they can be combined with clustering to inherit the its benefits of better visual aggregation and simplification at lower zoom levels. I would posit that their sweet spot is at around 2-5 different catagories displayed but they probably begin to lose focus at a higher number. For a full break down of the approach check out this superb blog post from the breakdown take a look at Ordnance Survey’s cartographic design time regarding their implementation of this method.

Layer Switching

One thing I always ask folks who want to visualise lots of data at once is why? Most of the time it isn’t actually necessary to visualise that much data at one time, and is an active blow to the end user experience. Assuming that you are visualising data of different types/categories (for example, say bars, cafes, restaurants etc) you could add a layer switcher which would allow users to pick the specific type they’re interested in. This should reduce the amount of data you actually show at any given time, allow users to find what they want faster, whilst also boosting performance. Here’s an example from Carto:

Bonus: Stateful Markers

Although not explicitly about reducing density, one feature I’ve seen recently that I think adds a nice effect is to add some change of state to a marker once it has been viewed. This is similar to how hyperlinks might turn purple after you’ve viewed them on a webpage. I first noticed this approach when using Airbnb for the first time. Here, the way in which the markers changed provided utility as it allowed me to see which of the available properties I’d already explored.

Bonus: Data Capping

For performance reasons you may want to set a hard limit on the number of markers that actually get rendered to the screen (I’d put 500-1000 as a decent limit for most maps, potentially less). In a similar manner to collision detection, you may want to refresh this at higher zoom levels so that you can actually see detail as you zoom into the map. Ideally you might want to do this in a way that allows for good dispersion of markers over a geographic area for a better visual effect and reduce.

Conclusion

Hopefully that’s been a useful breakdown of potential approaches to reducing data clutter on your maps and subsequently improving user experience. I really welcome any feedback and comments, or even potential additions! Lastly on a more general note I would advise examining the Ordnance Survey’s cartographic design principles for more in depth look at how to effectively show data on map.