Improving the Gephi User Experience

This is an effort to rethink the design of Gephi authored by Donato Ricci, co-founder of Density Design and senior designer at the Sciences Po médialab in Paris, and me, creator of Gephi and an engineer at the same lab (note that I am not Mathieu Bastian, our lead developer and actual powerhouse of the project for the past 10 years).

While this text presents possible improvements and practical solutions, it does not address practical considerations of available labor. Also, be aware that this is not a formal roadmap for future releases but rather a way to open the current state of our reflections for brainstorming. So feel free to share your ideas and comments with us.

There are five main categories that structure the improvements that we currently envision:

  • Design strategy: Ensure that a coherent design philosophy is applied across the entirety of the project
  • User interface: Identify and correct user-facing errors
  • Network-focus: Re-focus design and architecture around the network’s position of primary importance
  • Filling in the Gaps: Providing expected functionality
  • Miscellany: Other minor issues

In addition, we have drafted a UI mockup illustrating some of our propositions.

Design strategy

Gephi was built by engineers without a comprehensive design strategy. This situation is fairly common: engineers approach design in an ad-hoc fashion learning by trial and errors and through casual user observation but without a formal user-testing protocol. Should the tool succeed, it is mostly because the utility triumphs over the pain of use. Gephi is an embodiment of this phenomenon in its current state. Some computer scientists may find it simple, partly because using terrible interfaces is a part of their job, but for many users Gephi is confusing. Geeks of a masochistic tendency may love the tool as a result of digital Stockholm Syndrome, but the bulk of users that could benefit from Gephi find it to be confusing and opaque. In our defense, developing desktop applications is heavily constrained and the Java technology was not helping us to overcome this difficulty. What could a designer do to alleviate this situation? Apply a strategy.

A designer does more than treating the symptoms of poor usability; he or she approaches user experience from its fundamentals. Improving Gephi requires rethinking some of its longest standing features from a new standpoint. A design strategy is the solid foundation upon which we build both a satisfying user experience and underlying software architecture.

Our design strategy fits in five basic points: obtaining substantial and organized user feedback, giving Gephi a clear workflow, implementing a facet-oriented interface layout, reordering panels from the user standpoint, and removing unnecessary features. Each point is explained in further detail along with practical guidelines for implementing potential solutions in the future.

User feedback

We cannot build a sustainable user interface without a quantitative measure of user activity. These data are necessary to support and validate design choices. One approach to obtain this information is to log and collect feedback about interface usage.

An optional logger could be implemented in Gephi to allow users to opt-in to the collection of logs in order to improve the software. Data harvesting can be done as a campaign: for example, we may ask some users to activate it for one month to evaluate the usage of a new interface paradigm that we are testing.

A clear workflow

Users need a clear and visible path to start with Gephi, in particular when opening a new file. We need to remove information to allow users focusing on what is important.

Gephi involves not only the software itself, but also the installer, website and documentation. Our ultimate goal is to make the entry process as simple as possible by coordinating these different elements. We begin by focusing here on just the software itself. We propose to consider that there are only two proper ways to enter in Gephi:

  • Opening a file (constructing a network from a pre-generated file)
  • Connecting to a data source (embedded scraper or API connection)

We also need to clarify the roles of the “open” and “import” functions. We have to clarify that:

  • If the user has a file and needs to see it in Gephi, then “Open” is the right answer
  • If the user has an external data source he or she wishes to connect to, there is distinct menu option for this function

Use case: we have observed that some users try to get in Gephi with a table of nodes and a table of links, and do not succeed in finding the right path. The problem there is that it is not explicit that it is necessary to create an empty document, go to the data table, and then import the tables. Since the pattern is “I have files and I want to see them in Gephi”, then the answer should be under the “Open” menu item.

Facet-oriented interface layout

Rethinking overall design has the virtue of allowing for the reorganization of the interface from a user-centric perspective. The current interface relies on the panels system provided by the Netbeans Platform, which provides some beneficial properties for design. We were inspired by Ben Shneiderman’s motto of “overview first, zoom and filter, then details-on-demand” and it has been quite successful. However the different views are not articulated in a coherent way and the features sometimes struggle to find the right place in terms of visibility.

We propose three simple guidelines for a better organization of the panels:

  • The global hierarchy of containers should reflect the generality of the features
  • Some panels are not facet-dependent: they should not change with the facet
  • The network should occupy a single place whatever its facet, since it is always the same object

Panels guidelines

These guidelines have two consequences. First: facet-dependent panels should be contained inside facet-independent panels; which is to say that there is a single container for all facet-dependent features. Second: the facet selector (currently the three tabs on top) has to be inside this container.

We illustrate this with a comparison between a representation of the current layout and a new simplified structure.

Web

Reordering panels from the user standpoint

A part of our design strategy is to reduce visual clutter by grouping panels that are not used simultaneously. Though it is not intuitive, we prioritize separating panels that work well together over grouping similar features. For instance the following panels should be placed in different groups in order to be used combined:

  • Filters + Layout
  • Filters + Partition or Ranking
  • Timeline + almost anything else

With the current panels there are at least three obvious groups: one with filters, another with layout, and the third with the timeline. Generic contextual information is a fourth possible group, but could be placed in a non-intrusive location like the footer. Some panels, like statistics, could theoretically be at home in any group or even as a separate window that could be invoked from a menu.

Collapsing panels concept

Panels and groups of panels create two levels, so the “window” menu should have two levels too.

Removing unnecessary features

Reducing complexity can also be accomplished by removing features. We have detected at least one clear candidate for removal, but we may find more unnecessary complications to remove.

The “preview” panel of Gephi has been increasingly simplified over iterative updates. The goal of this feature is to provide a quick way to export cartographies. Users with competence in design tend to rely on third-party tools that provide finer-grained control over the visualization, like Illustrator and Scriptographer. Thus, the focus in Gephi is to provide a quick way to export images that can be manipulated in other tools.

We propose to further streamline preview functionality by removing some advanced label features: they infrequently used, complicated, and at times internally inconsistent with other preview settings. Furthermore, it is not necessary to facilitate changes to features like label and node size and color when such adjustments can be made much more easily using other tools.

User Interface

Donato Ricci has identified various flaws in the Gephi UI. Fixing them is a priority for the future.

User-centric features: reordering workflow

Users think in terms of results they want to obtain. They have an action in mind and they search how to do it. By displaying features according to their result, we can both improve user orientation and reduce the tool’s learning curve. A few examples of follow.

We propose to aggregate Partition and Ranking under the more accessible term “Appearance”, and to reverse the order of what is asked to the user. The current interface is organized in the following way: if the user has metadata that can rank the nodes then the user can visualize it using different attributes like color and size. The new interface inverts this approach: if the user wants to color nodes then he or she chooses which metadata to use. The panel may progress like a wizard to reduce cognitive load by drawing attention only to information that is necessary for a given step.

Unified panel appearance

Collapsing advanced layout settings

The current design of Gephi does not respect the general principle of drawing attention to information that is commonly used while obscuring information that is infrequently used.

Tools of the Overview: many problems to fix

The small tool buttons on the left side of the overview panel have a number of problems including:

  • Confusing icons that do not easily communicate the use of tools
  • Indistinct icons that do not sufficiently distinguish between different tools
  • Missing tools that are commonly expected

These issues are compounded by evidence that the tools themselves do not provide sufficient utility for common use.

We propose to alleviate some of these shortcomings by putting most of these tools in a collapsed panel and to have a normal panel dedicated to the settings of each tool. We also propose to implement a default tool cursor that draws on common mouse usage paradigms to provide intuitive functionality to users:

  • A click-drag starting on the background (or an edge) of the view makes a rectangle selection
  • A click-drag starting on a node moves this node
  • A click on a node selects the node, the shift key is used for multiple selection
  • A click on the background deselects
  • A set of meta-keys changes the click function, for instance the spacebar to switch to the view-panning tool (i.e. the hand in Photoshop)
  • The secondary-click works the same

Fixing highlight colors

Highlighting works by tweaking colors so that some nodes get more contrast than others. The contrast should depend on the background color, but this is not current implementation. At the very leastthe following should be done:

  • White background: highlighted nodes darker and other nodes lighter
  • Black background: highlighted nodes lighter and other nodes darker

Network-focus

As a network analysis tool, the network itself plays an obvious central role in Gephi. We have explored different ways to incorporate the network into the software’s presentation and have developed some suggestions for modifications that would increase interface coherency.

A different layout for the panels: network as background

In this approach, the network is contained by a “background sheet” and floating panels support functions. Such a philosophy has been successfully implemented in other systems like Photoshop or Google Maps. Using the root panel for the network, like grouping facet-dependent panels, fosters the feeling that we always deal with the same object, the network. This operates on the metaphor that we are always manipulating a primary canvas that consists of the network to be analyzed.

Network as background

Statistics as an invoked panel or window

Statistics tend to be used on demand, and thus do not need to be displayed permanently. Rather, a discrete menu or button could invoke the statistics panel when needed. Removing this visual information leaves more room to focus on what is important, i.e. the network.

Workspaces: more visible, on top

The workspaces need more attention. We propose to show them as tabs on the top of Gephi. It is more natural to have the workspaces above the facet selector in the hierarchy of panels. This is consistent with the prevalence of the “tab” paradigm in the browser space.

Workspaces-on-top

Filling in the Gaps: Providing Expected Functionality

In addition to the different aspects listed above, users need some well-known common features such as an “undo” function, even if they are complex to implement.

History and undo: feasible if limited to network structure

A visible trace of previous steps, like a proverbial breadcrumb trail, provides users with a sense of orientation and confidence when exploring and manipulating data in a speculative fashion. This also accelerates the learning process by alleviating cognitive load by not forcing users to have to remember a series of unfamiliar steps. This works in tandem with an “undo” feature, which facilitates experimentation without fear of permanently corrupting data.

History and Undo are complex to implement and burden the development of plugins and modules as these functions tend to be deeply embedded in a piece of software’s architecture. This partly explains why they are not currently available in Gephi. However a prudent approach in Gephi would be to focus recording and reversal of changes to the structure of the network: Nodes and edges, their attributes (including color, size and coordinates), but not the state of panels such as filters, statistics…

An initial approach would be to cover only a minimal set of modifications of the network structure. The history would then contain information about the type of the modification, but not its exact content nor the way it was done (manually, filter, data table…). For instance:

  • Modifying attribute X for node N / n nodes / all nodes
  • Modifying color / size / position for node N / n nodes / all nodes
  • Adding / removing: node N / n nodes / all nodes
  • Adding / removing: node attribute X / n attributes / all attributes
  • …and the same for links

The history would not include operations such as exporting files, taking screenshots, modification of views, changes to settings, or other changes that did not directly affect the structure of the network

Protecting irreversible operations

Some operations are irreversible: removing nodes, edges and attributes (and possibly more). Because these operations are definitive and may cause the loss of a certain quantity of work, they should be protected. A classical solution is to ask confirmation for any definitive operation. This is a simple guideline but the result is quite user hostile. We propose a better solution, as implemented in Photoshop: when an irreversible operation has been done, when the user tries to save the network the “Save as…” window appears instead and proposes a different name (with a suffix number or “Copy of X”).

Miscellany

A few additional points deserve to be listed, and are done so in no particular order.

Manual versioning

A basic versioning feature would be appreciated: just the opportunity to save with incrementing / adding a number suffix:

  • “My Network.gexf” is saved as “My Network 01.gexf”
  • “My Network 11.gexf” is saved as “My Network 12.gexf”

A common shortcut for this is Ctrl+Alt+S.

Generalizing zoom options (more internal consistency)

We can currently set how the zoom impacts text labels. The same feature would be useful for edges, for instance to keep 1 pixel lines whatever the zoom, as well as for nodes, for instance to keep small points whatever the zoom.

Generalized zoom options

Size nodes according to area

Our eyes perceive areas. Setting a ranking to the diameter of nodes is less intuitive than to apply it to the area of nodes. We propose to offer an option to customize ranking by either diameter or area, but set the default to area.

Removing unnecessary settings about labels

Node labels and edge labels should help the user identifying nodes. However, using the color or size of labels to visualize attributes is confusing. Gephi presently contains settings to manipulate labels in this way, these settings should be removed and replaced with a simpler interface.

UI Mockup sample (work in progress)

We present here a possible approach to integrate some of the different suggestions made in this document. Consider the following image as a way to help imagine the future of the Gephi user experience.

MockupA-01

As we stated earlier, the purpose of this document is to open up the floor to brainstorming ideas about improving the Gephi UX. Please share your ideas in the comments!

PS: Thanks to Niranjan Sivakumar for his excellent proof-reading 🙂

Gephi initiator interview: how “Semiotics matter”

Today I have the honnor to interview a special member of Gephi Team: Mathieu Jacomy.

Mathieu is an engineer, a founder of the WebAtlas NGO, teacher in Sciences Po Paris, and leads R&D in the TIC Migrations program in the Fondation Maison des Sciences de l’Homme and Telecom ParisTech school.
He is the main developer of the “Navicrawler” software. He also created the first Gephi prototype.

 

heymann2_8080
Sebastien Heymann: Hi Mathieu Jacomy, you are the creator of Graphiltre, the first Gephi prototype that you developed in 2006. What was the purpose of making a yet-another-graph-software?
jacomy8080
Mathieu Jacomy: Hi ! I’m glad to answer your questions, and I hope our readers will be pleased to know more about Gephi.

At this time I was analyzing a lot of graphs and I wasn’t satisfied by the existing free tools. That’s why I started to build my own tools.

I had no money to use professional tools, and I needed to understand precisely what the software was doing : the open source, free softwares perfectly fit these constrains.
I was using the amazing software Guess proposed by Eytan Adar, that himself built for his own needs. I was doing quite the same thing as him, and I couldn’t start to explore graphs without this tool.
But I wasn’t satisfied because the software didn’t allow so much manipulations. I couldn’t look at the substructures as easily as I wanted, and it was difficult to make nice cartographies.
I was dreaming of a “graph-dedicated Photoshop“, a visualization-oriented software rather than a script-oriented tool.

A good way to figure out what I mean is to look at the spatialization process. In famous softwares such as Pajek or Guess, you have algorithms called “layout”, “force-vectors” or “energy model”. These algorithms give its shape to the graph, and it is probably the most critical part of the process to build a clear visualization. Because the substructures or “patterns” that one may see in the image strongly depend on the algorithm and the settings chosen. But in the same time, most of users also want to quickly look at the global shape of the graph, and may not be aware that it’s important to search for the best algorithm to use depending on the time you have, the quality you want, the size of the graph, its degree distribution, the substructure that you expect to recognize… I was careful with these algorithms but even if I understood their principles and specificities, I couldn’t figure out how they were transforming the graph, and I couldn’t evaluate their differences.

Why? Because in these softwares you can’t :
– Manipulate the graph while the algorithm is running
– Modify the settings while the algorithm is running
– And sometimes, you can’t event see the graph while the algorithm is running
How can you just understand what’s happening there? Of course I started to work on a software that allowed this. But the same kind of problems appears again in other parts of the process, like filtering, image exporting… Pajek is clearly built in a mathematical perspective. Guess is more user-friendly, but not enough. I didn’t want a tool for mathematics experts, but a tool for people that actually have to explore and understand graphs. A professional tool for a job that didn’t exist at this time.

This was the starting point of “Graphiltre“. Building a graph exploration system so that you can understand what you are doing by looking at what happens on the screen, and do anything (including filtering) without typing a single script line.

Continue reading →

Diseasome at Online Information'09

Diseasome.eu, the human disease network map, was presented at Online Information’09, the international symposium of information systems. Our conference shows how interactive exploration of complex networks can lead to innovative interfaces for document discovery:

Performance and scalability

With this article and some following I’ll focus on the application design and explain technical points I think relevant to understand our approach.

Today’s subject is performance and scalability in the visualization. Although other modules need high-quality performances, the visualization of thousands nodes and edges remains the major challenge. For a visualization-centered software like Gephi it is a key feature we attach great important.

What you can find in other network visualization software is either a poor visualization module or a stunning aspect but not efficient. For instance Pajek has a very efficient core and you can achieve a lot with it but problems starts when you want to visualize your network. With GUESS you are able to produce nice maps but the render engine starts suffering seriously over 2000 nodes. Gephi tries to combine an efficient render engine with looking good results.

In 2007 when we started designing the current version of Gephi we had in mind we want to create a new generation of network visualization software and hence we made some choices I will try to explain here.

Use multi-core
Already in 2007 and even more now multi-core processors impose new rules in software development. It brings appealing features but also some risks. However technology starts to be mature in this, all current Top 10 video games has been thought multi-thread from the beginning. Multi-core brings performance but does it bring scalability as well? I would say YES for Gephi because no matter how many processor you have, what can be parallelized will be parallelized. Graphics card are not able to parallelize yet but we count this would be the case in the future.

Use GPU
You may notice we got some inspiration from video games development. When using the graphic card features, Gephi’s render engine let the processor free for other computing and allows using GPU acceleration to speed up rendering. Apart allowing 3D graphs, many drawings are speed up by the GPU in Gephi. I would say the only problem is compatibility, due to the high number of different graphic cards on the market.

Architecture
The visualization package architecture is a compromise between flexibility and performances. In 3D engine design it is quite impossible to have both in the same time. Hence our engine has flexibility where it doesn’t harm efficiency.

These choices allow good performance for visualizing, and I would say it is only the beginning. Currently, up to 50,000 nodes can be visualized and even more but this depends on edges number and how your graph is spatialized. Indeed we use techniques to avoid computing of parts of the graphs out of the screen:

Octree cubes partition Octree cubes on a 3D graph

The graph is cut in fixed volumes in a structure called Octree. It is easy for the render engine to determine which cubes are hidden and which are visible. Only 3D objects in visible cubes are computed. As a consequence performances don’t depend on how much nodes you have in your network but how many you are currently visualizing. So even with huge graphs, zooming in and exploring parts of it remains fast.

Besides the current 3D engine, which is intended to work on all configurations a new one will be developed in 2009. Using the last features of graphic card, networks size limit around 200,000 nodes may be reached.