Revisiting flow maps: a classification and a 3D alternative to visual clutter

Flow maps have long helped people explore movement by representing origin-destination data (OD data). Due to recent developments in data collection techniques, the amount of movement data is increasing dramatically. With such large amounts of data, visual clutter in flow maps is becoming a challenge. This paper revisits flow maps, provides an overview of the characteristics of OD data, and proposes a classification system for flow maps. To deal with visual clutter, 3D flow maps are proposed as a potential alternative to 2D flow maps.


Introduction
Movement exists everywhere. One commonly distinguishes between two kinds: continuous and discrete movement. The first, such as the movement of wind or sea currents, does not have a clearly defined origin or destination. The second, such as airplane flights or commuting trips, always has an origin and a destination. This paper concentrates on discrete movements. For many societal problems, an understanding of this type of movement data is essential. Policy makers will be interested in the effects of migration, transport managers have to predict demand and prepare the infrastructure to deliver online sales, and city planners have to provide enough public transport. Data representing discrete movements is collected in massive amounts due to, for instance, the incorporation of GPS in mobile phones and other tracking devices. This type of data is called origin-destination data (OD data) and is often stored in a matrix.
OD data are commonly visualized in flow maps. Flow maps represent characteristics of movement between origins and destinations (nodes). The paths between origins and destinations (links) can express the qualitative or quantitative nature of the movement or flow. Depending on the characteristics of the OD data, the nodes and links are displayed using the visual variables.
The earliest known flow map was described by Robinson (1955) and dates back to 1837. It shows the average number of passengers carried by public transportation in the area around Dublin, Ireland, and was drawn by Harness. However, the flow maps produced by Minard a few years later are better known. In particular, Minard's map of Napoleon's Russian Campaign (Minard, 1869) has become a classic. It combines both qualitative and quantitative data and has been redrawn and discussed many times (Funkhouser, 1937; Kraak, 2003a, 2014; Robinson, 1982; Tobler, 1987; Tufte, 1983). Tufte's remark that "it may well be the best statistical graphic ever drawn" has contributed to this status.
For a long time, the production of flow maps was a tedious manual job. Tobler (1987) was one of the first cartographers to produce a flow map with the help of a computer program. It was a map based on an OD matrix of migration among the fifty US states. Others followed with different algorithms (Jong & Floor, 1993). As producing flow maps becomes simpler and data becomes abundant, flow maps easily become too crowded to be useful. This phenomenon is defined as visual clutter (Ellis & Dix, 2007).
Therefore, researchers have looked into different solutions to avoid visual clutter. They developed algorithms that optimize the layout of the flows and worked on design alternatives such as animations and linked diagrams, interaction, and combinations of the above. For example, edge bundling (Phan, Xiao, Yeh, Hanrahan, & Winograd, 2005) and aggregation (G. Andrienko & Andrienko, 2010) are applied to minimize intersections of links. Small multiples (Tufte, 1983), animation (Schich et al., 2014) and the Space Time Cube (Kraak, 2003b) are applied to show the time dimension. Interaction is used to show multiple scales (Hurter, Tissoires, & Conversy, 2009; Rae, 2009) and multiple variables (van den Elzen & van Wijk, 2014). Non-geographic diagrams and other types of visualization have also been applied to flow data (J. Wood, Slingsby, & Dykes, 2011).
From the literature, it can be seen that over the past 100 years, interest in flow maps has increased dramatically. However, despite all the intriguing visualizations discussed above, data are being collected much faster than new representations are being developed to deal with them: visual clutter remains a key problem. From this perspective, we revisit the flow map and suggest yet another possible direction to avoid clutter: the third dimension. Before we explain our ideas, however, it is useful to take a closer look at the characteristics of OD data.

OD Data
Origin-destination data represent movement and its characteristics between two locations. The data can be organized in an OD matrix, whose rows and columns represent origins and destinations respectively. The cells represent flows (see Fig. 1). The characteristics of origins, destinations and flows can be described by a spatial (location), an attribute (qualitative, quantitative) and a time component. The existence of a flow means that time is inherently present.
- Attribute: For both the origin/destination and the flow, one can distinguish between qualitative and quantitative characteristics. Take the airline network as an example. Airports have different airlines arriving and departing (quality) and a number of passengers arriving and departing (quantity). For flows, examples are the type of airplane and the capacity of the airplanes (quality and quantity respectively). In practice, they are often combined.
- Time: Time is inherent in flow maps, and its granularity should be taken into consideration. For instance, migration can be permanent or temporary. In the latter case one can distinguish a temporal cycle (daily, weekly or seasonal), often linked to a spatial cycle (home-work, home-second home, etc.).
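As a minimal sketch of the OD matrix described above, the following Python fragment aggregates trip records into a matrix with origins as rows and destinations as columns. The function name build_od_matrix, the place labels and the counts are hypothetical and serve only as illustration; they are not taken from any data set discussed in this paper.

```python
def build_od_matrix(trips):
    """Aggregate (origin, destination, count) records into matrix[origin][destination]."""
    # Every place that occurs as an origin or a destination gets a row and a column.
    places = sorted({p for o, d, _ in trips for p in (o, d)})
    matrix = {o: {d: 0 for d in places} for o in places}
    for origin, destination, count in trips:
        matrix[origin][destination] += count
    return places, matrix


# Hypothetical trip records: (origin, destination, number of travellers).
trips = [("A", "B", 120), ("A", "C", 30), ("B", "A", 80), ("C", "B", 45), ("A", "B", 10)]
places, od = build_od_matrix(trips)

# Row sums give the total outflow per origin, column sums the inflow per destination.
outflow = {o: sum(od[o].values()) for o in places}
inflow = {d: sum(od[o][d] for o in places) for d in places}

print(od["A"]["B"])  # 130
```

Each cell holds a quantitative flow attribute. A qualitative attribute, such as the type of airplane, could be stored alongside it in the same cell.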
In this paper, we introduce a classification based on whether and how the OD data are visualized. The center of Fig. 2 shows the structure of this classification, and around the edges are classified flow maps found in the literature. Flows can be split into five visualization options: not shown (NS), shown (S), shown with qualitative characteristics (QL), shown with quantitative characteristics (QT), and shown with both qualitative and quantitative characteristics (QQ). For the origins and destinations, the same five options apply, with the additional option that only one of the two is shown (NS-S).
As indicated in the previous section, time is inherent in OD data, because movement always "flows" from an origin to a destination. However, time is not always explicitly shown in flow maps. Time can be expressed by directionality and/or by showing the progress of time. Time as such is not included in the classification of flow maps in Fig. 2; Fig. 3 therefore gives options for representing directionality and the progress of time.
Directionality can be visualized with arrows, visual variables, and animation (Fig. 3). Arrows are the most intuitive way to show the direction of a flow line. Visual variables can distinguish direction through width, color or transparency. Animation here refers to movement within a symbol, such as particles moving along lines, which also gives a sense of direction between two locations. To represent the progress of time along a timeline, one has several options: small multiples, the Space Time Cube (STC) or animation. Small multiples are composed of a set of snapshots of the movement process, each representing a moment in time. The notion of the progress of time is obtained by reading the snapshots as a story. In a Space Time Cube, geography is represented in the x-y plane while the z-axis represents time. Movement can be visualized along a Space Time Path, and directionality can also be derived from the third dimension. Animation uses physical time to express the progress of time. Combined with a timeline and interaction techniques, it gives a good impression of time passing.
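The classification codes above can be sketched as a small lookup in Python. The function names, and the simplifying assumption that origins and destinations share one encoding whenever both are shown, are ours for illustration and are not part of the classification in Fig. 2.

```python
FLOW_OPTIONS = ("NS", "S", "QL", "QT", "QQ")
NODE_OPTIONS = FLOW_OPTIONS + ("NS-S",)


def classify_element(shown, qualitative=False, quantitative=False):
    """Return the option code for one element (the flows or the nodes)."""
    if not shown:
        return "NS"
    if qualitative and quantitative:
        return "QQ"
    if qualitative:
        return "QL"
    if quantitative:
        return "QT"
    return "S"


def classify_nodes(origin_shown, destination_shown, qualitative=False, quantitative=False):
    """Return the option code for origins and destinations."""
    # Assumption: when both ends are shown, they use the same encoding.
    if origin_shown != destination_shown:
        return "NS-S"
    return classify_element(origin_shown, qualitative, quantitative)


def classify_flow_map(nodes, flows):
    """nodes and flows are dicts of keyword arguments for the functions above."""
    return classify_nodes(**nodes), classify_element(**flows)


# Hypothetical map: plain point symbols, lines whose width encodes volume.
example = classify_flow_map(
    nodes={"origin_shown": True, "destination_shown": True},
    flows={"shown": True, "quantitative": True},
)
print(example)  # ('S', 'QT')
```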
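The Space Time Cube mapping described above can be sketched as follows: x and y are kept as they are, and each timestamp is rescaled linearly onto the z-axis. The function name to_space_time_path, the cube_height parameter and the sample track are hypothetical; a real implementation would also need a map projection and a rendering step.

```python
def to_space_time_path(track, cube_height=1.0):
    """Convert (x, y, t) samples into (x, y, z) vertices of a Space Time Path."""
    times = [t for _, _, t in track]
    t0, t1 = min(times), max(times)
    span = (t1 - t0) or 1.0  # avoid division by zero for a single timestamp
    # Sorting by time means z never decreases along the path, so the
    # vertical direction also conveys the direction of movement.
    ordered = sorted(track, key=lambda p: p[2])
    return [(x, y, cube_height * (t - t0) / span) for x, y, t in ordered]


# Hypothetical track: positions in map units, time in seconds.
track = [(0, 0, 0), (1, 0, 600), (1, 2, 1800)]
path = to_space_time_path(track, cube_height=10)
```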

Feasible Solutions for Visual Clutter
Visual clutter in a flow map occurs when multiple flows overlap, and the flow patterns sometimes get obscured by the flows themselves when the OD data set is too large. The experience of clutter will depend on the spatial and temporal distribution of the flows, as well as on their qualitative and quantitative characteristics. Ellis and Dix (2007) made an inventory of clutter reduction techniques. For flow maps, researchers have experimented with clustering, aggregation, interaction, animation and multiples.
Interaction compensates for the limitations of the two-dimensional computer display and helps to discover non-obvious patterns in data by altering representation type and symbolization, posing queries and transforming data. Shneiderman (1996) classified users' tasks with visualization as "overview, zoom, filter, details-on-demand, relate, history and extract", which can be fully supported by interaction (Roth, 2012). Multiscale interactive flow maps provide users with an overview + detail context (van den Elzen & van Wijk, 2014), and users can explore voluminous and multidimensional databases by toggling visibility, brushing, linking and filtering (Edsall, Andrienko, Andrienko, & Buttenfield, 2008).
Animation uses physical time to represent the progress of time of a movement, which helps users understand change. Animation should be combined with interaction to pause, replay, and zoom in and out, and aggregation over time should also be applied in order to find patterns or more general information.
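This paper does not define a clutter metric. As one possible and deliberately simple proxy for overlapping flows, the sketch below counts how many pairs of straight flow lines cross each other. The function names and coordinates are hypothetical, and flows that merely share a node are not counted as crossing.

```python
from itertools import combinations


def _orient(p, q, r):
    """Sign of the turn p -> q -> r (positive: left, negative: right, 0: collinear)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a point interior to both."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def count_crossings(flows):
    """Count crossing pairs among flows given as (origin_xy, destination_xy)."""
    return sum(segments_cross(a, b, c, d)
               for (a, b), (c, d) in combinations(flows, 2))


# Hypothetical flows between points in map units.
flows = [((0, 0), (2, 2)), ((0, 2), (2, 0)), ((0, 0), (2, 0)), ((3, 0), (3, 2))]
print(count_crossings(flows))  # 1
```

The pairwise test is quadratic in the number of flows, which is acceptable for a sketch but would need a sweep-line or spatial index for large OD data sets.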
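Aggregation over time, as mentioned above, can be sketched by binning trips into fixed intervals before they are animated or shown as small multiples. The function name aggregate_by_interval, the one-hour default and the sample trips are assumptions made for illustration.

```python
from collections import Counter


def aggregate_by_interval(trips, interval_seconds=3600):
    """Count trips per (origin, destination, interval start).

    trips: iterable of (origin, destination, departure_time_in_seconds).
    """
    counts = Counter()
    for origin, destination, t in trips:
        # Round the departure time down to the start of its interval.
        bin_start = int(t // interval_seconds) * interval_seconds
        counts[(origin, destination, bin_start)] += 1
    return counts


# Hypothetical trips with departure times in seconds since midnight.
trips = [("A", "B", 100), ("A", "B", 3500), ("A", "B", 3700), ("B", "A", 7300)]
hourly = aggregate_by_interval(trips)
print(hourly[("A", "B", 0)])  # 2
```

A coarser interval yields fewer and thicker flows per frame, at the cost of temporal detail.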

The Third Dimension
Most flow maps are displayed in two dimensions, but according to Tobler (1987) "drawing flows in the third dimension as well creates more space", and as Tufte (1990) argues, escaping flatland is the essential task of envisioning information. A few years after Tobler proposed the idea, Cox and Eick (1995) drew a 3D display of internet traffic on a flat map, and Cox, Eick, and He (1996) then developed the SeeNet3D network visualization system to show internet traffic on a globe. Meanwhile, Munzner, Hoffman, Claffy, and Fenner (1996) visualized the global topology of the Internet MBone. Hägerstrand (1970) proposed the "Space-Time-Path", which inspired researchers to explore movement with the "Space-time aquarium" (Kwan, 2000) and the Space Time Cube (Kraak, 2003b), both of which are 3D visualizations. Itoh et al. (2013) represent passenger numbers on flows, encoding them as the height of the flows. Nagel and Pietsch (2016) visualized bicycle movement in a plane using 3D symbols in what they call a "firefly style". Buschmann, Trapp, and Döllner (2016) designed a dynamic flow map for air traffic in which the z-axis shows the height of planes.
The third dimension in flow maps can be applied based on four principles (Fig. 4):
- Height: when the spatial component of OD data has x-, y- and z-coordinates, the third dimension can be used to represent physical height. GPS tracks are an example: paths follow the terrain, and planes are shown at their actual flying heights.
- Attribute: the third dimension can also be used to encode qualitative or quantitative values. Drawing "walls" on a flow map in three-dimensional space is an example. Symbols with volume in three-dimensional space can represent quantitative attributes and display more variables.
- Time: in the Space Time Cube, the z-axis represents the time dimension (Kraak, 2003b).
- 3D arcs: the third dimension in flow maps can be used to represent the connection only.
Combinations: the third dimension can also express combinations of the height, time and attribute aspects. For example, by using lines with volume in a Space Time Cube, both a quantitative attribute and time can be expressed.
The third dimension offers more design options for flow maps, while at the same time it may lead to an increase in cognitive load. This is because the third dimension can cause more visual clutter and also introduces occlusion. However, these potential problems could be mitigated by using existing methods of occlusion management, such as multiple viewports, virtual x-ray, tour planner, volumetric probe and projection distorter (Elmqvist & Tsigas, 2008). In addition, we expect that virtual reality can help users perceive the 3D visualization environment and explore movement data.
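The 3D-arc principle can be sketched with a parabolic arc that lifts the link between origin and destination out of the map plane. The parabola, the function name arc_3d and its parameters are assumptions made for illustration; the paper does not prescribe a particular arc geometry.

```python
def arc_3d(origin, destination, peak_height, steps=8):
    """Sample a parabolic arc from origin to destination, both given as (x, y).

    z is 0 at both ends and reaches peak_height halfway along the link.
    """
    (x0, y0), (x1, y1) = origin, destination
    points = []
    for i in range(steps + 1):
        s = i / steps  # position along the link, from 0 to 1
        x = x0 + s * (x1 - x0)
        y = y0 + s * (y1 - y0)
        z = 4.0 * peak_height * s * (1.0 - s)
        points.append((x, y, z))
    return points


# Hypothetical link in map units with a peak height of 3.
arc = arc_3d((0, 0), (4, 2), peak_height=3.0, steps=4)
print(arc[2])  # (2.0, 1.0, 3.0)
```

If peak_height is made proportional to a flow attribute, the same arc also follows the attribute principle. Giving each link a different height is one way to separate links that would coincide in a 2D map.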
The usability of 3D flow maps will be systematically tested to find out under which circumstances they are an alternative to 2D flow maps. The effectiveness, efficiency and satisfaction of users for both 2D and 3D flow maps will be studied in a realistic user context, using both qualitative and quantitative usability methods. Sample methods include thinking aloud, questionnaires, screenshots, logging, eye movement recording, and task completion rate and time.

Conclusion
Visualization of OD data is challenging because of the huge amounts of available data and the varied requirements of users. Multiple visualization techniques can be applied to flow maps to reduce visual clutter, such as clustering, aggregation, interaction, animation, multiple views and three-dimensional graphics. Most of the time, flow maps visualize OD data with more than one technique.
We proposed a classification of flow maps based on how the underlying OD data can be visualized. Based on this knowledge, we looked at an alternative to avoid clutter: using the third dimension.
Developing 3D flow maps is promising since it provides more possibilities for mapping movement and takes advantage of the human visual system. The third dimension can encode height, time and attributes, and provides additional mapping space through 3D symbols. However, 3D can also bring more visual clutter and occlusion. The usability of 3D flow maps needs further study, and whether they are an alternative to 2D flow maps requires design practice and comparison.

Fig. 1. OD Data Structure.
- Location: The location of origin and destination can be a point or an area. Examples of point locations are the airports served by an airline or the start and end of a taxi or bicycle ride. The location of flows can be direct or indirect. The (curved) lines between airports are an example of the former, and GPS logs of taxi trips or bicycle tracks are an example of the latter. Examples of area locations are administrative units, such as the countries or municipalities between which people migrate. In that case the location of flows is often represented by a direct line, since one deals with aggregate data.

Fig. 3. Time and flow maps: visualization options for directionality and progress of time.