Plugin development gets new tools and opens up to the community

Since the introduction of the Gephi Marketplace and tools such as the Plugins Bootcamp we’ve seen more and more plugins being developed. Even developers with little experience with Java give it a try and succeed in creating their first plugin. We want developers to be productive and make it as easy as possible to get started with plugin development and find help along the way. As the release of the 0.9 version is near, it’s time to review our plan on that matter and upcoming improvements. Here’s the summary:

  • The gephi-plugins base repository (i.e. the repository plugin developers fork) now uses Maven for building and is simpler. It contains only 4 files versus 890 for the Ant-based system.
  • All Gephi modules are published on Maven Central, making them very easy to inspect and extend.
  • Introduction of a custom Maven plugin designed to facilitate plugin development.
  • The submission and review of plugins will be entirely based on GitHub, making the process more scalable and transparent.
  • A new online portal for plugins is coming up with an easier editing experience and new features.

From Ant to Maven

Before diving into plugins, let’s first review what has changed in how Gephi is compiled, built and packaged, as this directly affects plugins as well. Since the Gephi 0.8.2 version we have migrated our build system from Ant to Maven. This is in line with what the NetBeans Platform community recommends (Gephi is based on the NetBeans Platform), and it has already increased the level of automation we’re capable of. Compared to Ant, the main benefits are:

  • Maven is great at dependency management. It’s now very clear which version of which library Gephi depends on, making it simpler to integrate. Dependencies are also downloaded automatically instead of being checked into the codebase.
  • Unlike the Ant-based system, it’s independent from Netbeans. This allows developers not using Netbeans to develop Gephi and produce a build entirely from the command-line.
  • Gephi modules can now be placed on Maven Central (i.e. global repository where Maven finds its dependencies). This allows plugins to automatically find the Gephi dependencies online, reducing the manual steps at each Gephi upgrade.

Build assistant

There are a few critical steps we want to help plugin developers with, and as a result we started the development of a custom Maven plugin. This new tool works behind the scenes when developers build their plugin. No installation or configuration is needed, as it already comes as a dependency of the gephi-plugins module. It already addresses common pain points, and we hope to automate more and more of the steps in the future. This is what it can do as of today:

  • Plugin validation: The assistant reviews the plugin configuration and metadata at each build. For instance, it can check whether the plugin depends on the correct Gephi version, or remind the developer to define an author or license in the plugin’s configuration.
  • Run Gephi with plugins: A single command runs Gephi with the plugins pre-installed. This makes testing faster than ever when developing plugins.
  • New plugin generator: A step-by-step command-line tool that creates the correct folder structure and configuration to get started.

In the future, we want to rely on this build assistant to further automate the process, for example with easy migration or code generation. You could, for instance, ask it to generate the code and configuration for a Layout plugin. Afterwards, all that would be needed is to fill in the blanks in the code.

A new way to review and submit plugins

As the number of plugins grows, it’s important to have a clear process for how plugins are reviewed and updated. We also want this process to be transparent and open to the community. So far, the process was based on the submission of the plugin binaries with a manual review done by the team. This helped us get where we are today, but we want to take it to the next level and propose to move this process entirely to GitHub, using the pull-request mechanism. This has multiple advantages:

  • Reviewing new/updated plugins can scale because any developer can read the code and contribute to the pull requests.
  • Developers are already asked to fork the gephi-plugins repository so submitting the plugin via GitHub is a natural extension to it.
  • There’s a clear history of each version, comment and what code has changed from one version to another.
  • It makes it easier to test plugins and detect issues before the plugin is approved.

As part of this migration, we’ll no longer add plugins with closed source code but all existing plugins for Gephi 0.8.2 will remain available. For security and stability reasons, it’s essential that each plugin’s code can be inspected before approval. In order for this to work, all existing plugins not already on GitHub or not forking the gephi-plugins repository will need to migrate. For those already set up, the migration will be easier but Ant-based plugins will still need to migrate to Maven.

To summarize, this is what the new 4-step process looks like for developers:

[Figure: the new Gephi plugin development process]

In the current submission process we ask for additional information such as description, author or license, and allow the upload of images. Going forward with GitHub, all of this data will be defined directly in the plugin’s configuration, making it easier to update.

A new home for plugins (again)

Plugins are currently available online from the Gephi Marketplace, where users can also reach people providing training and support. We have ideas on how to improve these community services and will be migrating them to a new architecture, starting with the plugins. We will tell you more about these changes in an upcoming post but for now our focus is on developing a new lightweight plugin portal that can be connected directly to the data source on GitHub.

Here is a preview of what it will look like for plugin pages:

[Figure: preview of a plugin page on the new portal]

The content of this website will be automatically updated when plugins are published or updated. This works by having Travis CI (a continuous integration platform) refresh the JSON file after changes to the plugin repository on GitHub. Developers can even embed images and write the description in Markdown. This will entirely remove the need for plugin developers to log in to the Marketplace to update NBMs and metadata.

Migrating plugins

This new Maven-based repository along with the new submission process will be introduced with the Gephi 0.9 release. Let’s review what plugin developers need to know to bring their plugin to this new major version.

As with every major Gephi release, plugin compatibility needs to be evaluated as APIs may have changed. In fact, given that this new version is based on an entirely redeveloped core, it’s very likely that code changes will be required. Hopefully, these changes will often be minor and actually simplify things (i.e. less code that is more efficient). Documentation will be published on these API changes and core developers will be available to answer questions as well.

Plugin developers will also be contacted about moving their code to GitHub, with a step-by-step guide. We’re considering adding a migrate command to the new Gephi Maven plugin to facilitate the transition from Ant, but that’s an unfunded project at the moment (if you’re interested in contributing to that, please let us know). Stay tuned for details on the path to migration right after the release.

And again, thanks for all your hard work on bringing your ideas to life through new Gephi plugins!

Continuous Integration at Gephi

We recently finished deploying a continuous integration environment at Gephi and I’m excited to share some of the highlights.

The Gephi developer team has been hard at work to change the way we iterate and create releases at Gephi. Developer productivity has been an important theme for this year’s focus and we already made several improvements. At the end of last year we migrated our code to GitHub and improved the documentation. We then focused on plugin developers and made it really easy to create new plugins with the Plugin Bootcamp and the new gephi-plugins repository. Finally, we’re now introducing a completely automated build and release production system.

Our objective was to automate the way releases are created and tested. Previously, creating a release was a manual process and included error-prone tasks like updating configuration files, unzipping translations into the right folder or creating installers. Open-source tools like Maven, Jenkins and Nexus can help make this process seamless and keep the latest deliverables always available.

Maven migration

We migrated our code base from Ant to Maven. Gephi is based on the NetBeans Platform and has more than 80 different modules with dependencies and third-party libraries. Maven makes it easy to manage a large number of dependencies and put all configuration parameters in one place. Maven also has a large number of plugins and is very well integrated into the NetBeans and Eclipse IDEs.

Highlights:

  • A full application package, all Javadocs and sources are now produced and uploaded online with a single command.
  • Dependencies are all defined in one place. It is also much easier to update to the latest version of the Netbeans Platform.
  • All library JARs are dependencies to Maven Central or 3rd party repositories. No library JARs are directly included in the sources anymore.
  • The Gephi project is now a standard multi-module Maven project. It can therefore be opened and built in Eclipse or IntelliJ, as well as NetBeans, out of the box.
  • It facilitates module reuse in other projects like the Gephi Toolkit. Any other project can easily depend on any (or all) Gephi modules.

Jenkins server

Jenkins is the continuous integration server we chose to automate building and testing Gephi. It is configured to build and test Gephi every night based on the latest version of the code on GitHub. If the build fails, developers are informed something needs to be fixed.

Highlights:

  • Fully automated build in a stable environment. If something is wrong, it must be the code.
  • In addition to Gephi itself, we’re also building the Gephi Toolkit every night. Eventually, we’ll be able to build and test plugins as well.
  • Artifacts produced are uploaded to Nexus.

Nexus

Nexus is a repository for artifacts, which can be either libraries Gephi is using or release binaries like the latest release. At any time, beta testers can download the nightly build and test new features. We just announced a new beta testing program, which wouldn’t be possible without the availability of the nightly build.

Highlights:

  • All 3rd party libraries have been uploaded to Nexus. Maven uses Nexus as a source for libraries.
  • The nightly build packages are available for download.
  • It also hosts the latest set of NBMs and Javadocs.


We learnt a lot during this project and will continue to strengthen the developer and beta-tester environment to scale up Gephi development. So far, we’ve done the Maven migration on a separate GitHub repository but we’ll soon convert the main repository and soon after release a 0.8.2 Gephi version. We’ve created a new Continuous Integration section on the Dev Portal and documented this project.

Plugin development remains the same for now and all plugins should be compatible with the new code base. In the next few months we would like to bring continuous integration to plugin developers as well. Testing a large number of plugins at scale for each new Gephi version remains a challenge and we would like to improve that. Also, we’ve seen issues where different plugins use different versions of the same library and eventually cause crashes. Stay tuned for some news on that.

In the next few weeks we’ll update the documentation in various places on how to build Gephi and work with the code. Developers interested in trying this new system out should follow the instructions on GitHub or reach out to us on the developer mailing list.

Last but not least, we would like to say kudos to Maven, Jenkins and Nexus contributors for their huge and excellent work!

Introducing the Gephi Plugins Bootcamp

We’re happy to announce a new tool for the community today: the Gephi Plugins Bootcamp. The bootcamp is a large set of plug-in examples to help developers create Gephi plugins easily.

Gephi’s vision focuses on the platform and we want developers to be creative and successful. Gephi is built in a way that lets it be easily extended with plug-ins (layout, filters, io, preview, …) but it’s not always easy to know where to start. The bootcamp addresses this need and provides the environment and the examples to get started.

Want to create a new layout? The Grid Layout example shows how to read the graph and change coordinates. A new filter? A new exporter? Check out the JPGExporter or the SQLite exporter examples. Below is the complete list of examples, and we’ll add more soon upon request.

Check out the code on GitHub.

The README file contains the instructions to get the code and to run the examples.

Layout

Grid Layout
Place all nodes in a simple grid. Users can configure the size of the area and the speed.
Sorted Grid Layout
Same example as Grid Layout but users can sort nodes using an attribute column.
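
To give a rough idea of what a grid layout computes, here is a minimal sketch in plain Java. It is not the bootcamp code and does not use the Gephi Layout API: the class name GridLayoutSketch and the method positions are hypothetical, nodes are reduced to an index, and the speed setting is left out.

```java
// Hypothetical sketch: plain Java, not the bootcamp code and not the Gephi Layout API.
final class GridLayoutSketch {

    // Returns one {x, y} pair per node, filling a square grid row by row.
    // areaSize is the side length of the square area the grid spans.
    static double[][] positions(int nodeCount, double areaSize) {
        double[][] result = new double[nodeCount][2];
        if (nodeCount == 0) {
            return result;
        }
        // Smallest column count such that cols * cols >= nodeCount.
        int cols = (int) Math.ceil(Math.sqrt(nodeCount));
        // Distance between neighbouring cells; a single cell needs no spacing.
        double step = cols > 1 ? areaSize / (cols - 1) : 0.0;
        // Half the width of the full grid, used to center it on the origin.
        double half = step * (cols - 1) / 2.0;
        for (int i = 0; i < nodeCount; i++) {
            int row = i / cols;
            int col = i % cols;
            result[i][0] = col * step - half;
            result[i][1] = row * step - half;
        }
        return result;
    }

    public static void main(String[] args) {
        // Nine nodes in a 100 x 100 area form a 3 x 3 grid.
        for (double[] p : positions(9, 100.0)) {
            System.out.println(p[0] + ", " + p[1]);
        }
    }
}
```

A sorted variant would only need to order the nodes by the chosen attribute column before assigning the grid cells.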

Filter

Transform to Undirected
Edge filter to remove mutual edges in a directed graph.
Top nodes
Keep the top K nodes using an attribute column.
Remove Edge Crossing
Example of a complex filter implementation which removes edges until no crossing occurs.
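
The idea behind a "top K nodes" filter can be sketched as follows. This is an illustration only, assuming the attribute column is numeric: a plain Map stands in for the graph and its attribute column, and the names TopNodesSketch and topK are hypothetical, not part of the Gephi Filter API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain Java collections stand in for the Gephi graph and Filter API.
final class TopNodesSketch {

    // Keeps the ids of the k nodes with the highest value in one attribute column.
    // columnValues maps a node id to that node's value in the chosen column.
    static List<String> topK(Map<String, Double> columnValues, int k) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(columnValues.entrySet());
        // Sort by value, highest first; ties are broken by node id so the result is deterministic.
        entries.sort((a, b) -> {
            int byValue = Double.compare(b.getValue(), a.getValue());
            return byValue != 0 ? byValue : a.getKey().compareTo(b.getKey());
        });
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            kept.add(entries.get(i).getKey());
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Double> degree = new HashMap<>();
        degree.put("a", 1.0);
        degree.put("b", 5.0);
        degree.put("c", 3.0);
        System.out.println(topK(degree, 2));
    }
}
```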

Tool

Find with autocomplete
Tool with an autocomplete text field to find any node based on its label and zoom to it.
Add Nodes
Listens to mouse clicks and adds nodes. Also adds edges if selecting other nodes.

Export

JPG Export
Vectorial export to the JPG image format. Contains a settings panel to set the width and height.
SQLite Database Export
Current graph export to a SQLite Database file. A new sub-menu is added in the Export menu and an example of a custom exporter is shown.

Preview

Highlight Mutual Edges
Colors mutual edges differently. Overwrites and extends the default edge renderer.
Glow Renderer
Adds a new renderer for node items which draws a glow effect around nodes.
Node Z-ordering
Extends the default node builder by reordering the node items by size or any numeric column. Also shows how to create a complex Preview UI.

Importer

Matrix Market File Importer
File importer for the Matrix Market format. A large set of matrix file examples is available in Yifan Hu’s gallery.

Statistics

Count Self-Loop
Example of a statistics result at the global level. Simply counts the number of self-loop edges in the graph.
Average Euclidean Distance
Example of a per-node calculation. For a given node it calculates the average distance to others.
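
The two kinds of statistics (one global value, one value per node) can be illustrated with a small sketch. It assumes, as the example's name suggests, that the distance is the Euclidean distance between node positions; arrays stand in for the graph, and StatisticsSketch with its methods is a hypothetical name, not the Gephi Statistics API.

```java
// Hypothetical sketch: arrays stand in for the Gephi graph; this is not the Statistics API.
final class StatisticsSketch {

    // Global statistic: counts edges whose source and target are the same node.
    // Each edge is a {source, target} pair of node indices.
    static int countSelfLoops(int[][] edges) {
        int count = 0;
        for (int[] edge : edges) {
            if (edge[0] == edge[1]) {
                count++;
            }
        }
        return count;
    }

    // Per-node statistic: average straight-line distance from one node to all others.
    // positions holds one {x, y} pair per node (an assumption of this sketch).
    static double averageDistance(double[][] positions, int node) {
        if (positions.length < 2) {
            return 0.0;
        }
        double sum = 0.0;
        for (int other = 0; other < positions.length; other++) {
            if (other == node) {
                continue;
            }
            sum += Math.hypot(positions[other][0] - positions[node][0],
                              positions[other][1] - positions[node][1]);
        }
        return sum / (positions.length - 1);
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 1}, {2, 0}};
        double[][] positions = {{0, 0}, {3, 4}, {6, 8}};
        System.out.println(countSelfLoops(edges));
        System.out.println(averageDistance(positions, 0));
    }
}
```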

Plugins sub-menu

Test action
Simple action which displays a message and a dialog.
Remove self loops
Action which accesses the graph and removes self-loops, if any.
Using Progress and Cancel
Action which creates a long task and executes it with progress and cancel support.
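
As an illustration of what "progress and cancel support" means for a long task, here is a minimal sketch in plain Java. It deliberately avoids Gephi's own task and progress classes; CancellableTaskSketch, run and cancel are hypothetical names, and the progress callback is a simple IntConsumer.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntConsumer;

// Hypothetical sketch: shows the general pattern only, not Gephi's task API.
final class CancellableTaskSketch {

    // Set from another thread (or a callback) to ask the task to stop.
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    void cancel() {
        cancelled.set(true);
    }

    // Runs up to totalSteps units of work and reports the number of finished
    // steps after each one. Returns how many steps were actually completed.
    int run(int totalSteps, IntConsumer progress) {
        int done = 0;
        // The cancel flag is checked before every step so the task stops promptly.
        while (done < totalSteps && !cancelled.get()) {
            // One unit of real work would go here.
            done++;
            progress.accept(done);
        }
        return done;
    }

    public static void main(String[] args) {
        CancellableTaskSketch task = new CancellableTaskSketch();
        int done = task.run(5, step -> System.out.println("step " + step + " of 5"));
        System.out.println("completed " + done);
    }
}
```

In a real user interface the task would run on a background thread and the cancel button would call cancel().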

Execute at startup

When UI is ready
Do something when the UI has finished loading.
Workspace select events
Do something when a workspace is selected.

Processor

Initial Position
Sets the nodes’ initial positions so that they are always the same. It computes a hash over all nodes so that the X/Y positions are randomized in the same way every time.
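
One way to get positions that look random but are identical on every import is to seed the random generator with a hash of the nodes. The sketch below shows that idea in plain Java; it is an assumption about the approach, not the bootcamp code, and InitialPositionSketch and positions are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: not the bootcamp code and not the Gephi Processor API.
final class InitialPositionSketch {

    // Derives a seed from all node ids, then draws positions from a generator
    // seeded with it, so the same list of nodes always gets the same layout.
    // Note that the hash depends on the order of the ids in the list.
    static double[][] positions(List<String> nodeIds, double size) {
        long seed = nodeIds.hashCode();
        Random random = new Random(seed);
        double[][] result = new double[nodeIds.size()][2];
        for (int i = 0; i < result.length; i++) {
            // Coordinates fall in the range [-size / 2, size / 2).
            result[i][0] = (random.nextDouble() - 0.5) * size;
            result[i][1] = (random.nextDouble() - 0.5) * size;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("n1", "n2", "n3");
        // Both calls print the same coordinates.
        System.out.println(Arrays.deepToString(positions(ids, 200.0)));
        System.out.println(Arrays.deepToString(positions(ids, 200.0)));
    }
}
```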

Panels

New panel
Example of a new panel plugin set up at the layout position.

If you have any questions please send an email to the gephi-plugins [at] lists.gephi.org mailing list or stop by on the forum.

Gephi migrates to GitHub

We are happy to announce that we have finished the migration of our code from Launchpad to GitHub. All the code and bugs have been successfully transferred with the complete history. We can now profit from the best platform out there and use Git, the fastest revision control system.

We hope you’ll find GitHub faster and easier to use than Launchpad. The team is already appreciating how easy it is to report issues and work together on the code. GitHub has more than a million users and will make the project more visible and ease external contributions.

Technically, we migrated our branches from Bazaar to Git (thanks to git-bzr) so the history is entirely kept. We also moved all our bugs with a simple script. We are still working on the details. If you see something wrong or missing on GitHub, please contact us or create an issue on GitHub. If you had some branches on Launchpad, you can find them on the GitHub repository. Let us know if you have questions. Contributors simply fork the repository and get started. We updated the documentation on the wiki. Consult the Developer Handbook.

Checkout code

Run
git clone git://github.com/gephi/gephi.git

Report issues

Simply go to the Issues tab.

Build in one step

Simply run ant at the root of the repository to build Gephi. The executables are located in the dist folder.

We made some improvements to the build process. Previously, NetBeans was required to build Gephi. We now integrate the platform directly in the source code so it’s not necessary anymore. It’s literally a one-step process.

Please let us know your feedback and questions as usual on the forum.

First Gephi Plugin Developers Workshop on October 6

This is an announcement for the first Gephi Plugins Developers Workshop, to be held on October 6, 2011 in Mountain View, California. Come and learn how to write your first Gephi plugin and ask questions. The workshop is organized by Mathieu Bastian, Gephi Architect, and will be graciously hosted by IMVU.

Gephi is modular software and can be extended with plug-ins. Plug-ins can add new features like layout, filters, metrics, data sources, etc. or modify existing features. Gephi is written in Java so anything that can be used in Java can be packaged as a Gephi plug-in! Visit the Plugins Portal on the wiki and follow the tutorials to get started.

The workshop will start with a presentation of Gephi’s architecture and the different types of plugins that can be written, with examples. Details about Gephi’s APIs, code examples and best practices will be presented in an interactive “live coding” way. The Gephi Toolkit will also be covered in detail. The second part of the workshop will be dedicated to helping individuals with their projects and answering questions.

Some of the best projects using or extending Gephi are developed in Silicon Valley and we are looking forward to helping the developer community. Please don’t hesitate to send us your ideas to maximize efficiency.

RSVP here

GSoC mid-term: Automated build & Maven

My name is Keheliya Gallaba and during this Google Summer of Code I am working on the automated build system for Gephi. The goal of this project is to add Maven build support to Gephi and set up a continuous integration system to speed up the release process. The NetBeans Platform, which Gephi is built upon, natively uses Apache Ant to compile, build and package the application. But now there is also a variant of NetBeans which uses Apache Maven as the build system. There are several reasons that make moving to a Maven-based system worthwhile.

Maven vs Ant

The existing Ant build system for building NetBeans Platform-based applications, called the Ant Build Harness, is very intuitive and needs almost no initial setup. The set of standard Ant scripts and tasks can be easily triggered from the IDE or from the command line. But there are reasons why Ant might not suit a rapidly growing, multi-module project like Gephi. The Gephi project consists of a team of developers who work on interdependent modules and plugins. These modules have to be assembled into the application regularly. With a large number of modules, many small packages, and multiple projects with many inter-dependencies and external dependencies, it’s essential to manage different versions and branches along with their dependencies. And reusing modules with the Ant build harness is not that intuitive.

[Figure: Gephi modules]

Apache Maven, in contrast, is a standard, well-defined build system that can be customized. It uses a construct known as a Project Object Model (POM) to describe the software project being built, its dependencies on other external modules and components, and the build order. It comes with pre-defined targets for performing certain well-defined tasks such as compilation of code and its packaging. It makes dependency management very easy and efficient with the concept of repositories. Most importantly, in Maven a set of unique coordinates (groupId, artifactId, version, packaging, classifier) identifies an artifact, which can be uploaded to or retrieved from a repository. This makes it easy to build modules which depend on other modules.
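
To make the role of coordinates more concrete, the sketch below shows how a set of coordinates maps to a file path in a repository that uses the standard Maven layout. The class MavenCoordinates is hypothetical and written for illustration only; it assumes the file extension equals the packaging, which holds for common cases such as jar, and the coordinates used (org.example, my-plugin) are made up.

```java
// Hypothetical sketch: an illustration of Maven coordinates, not part of Maven or Gephi.
final class MavenCoordinates {
    final String groupId;
    final String artifactId;
    final String version;
    final String packaging;
    final String classifier; // optional, may be null

    MavenCoordinates(String groupId, String artifactId, String version,
                     String packaging, String classifier) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.version = version;
        this.packaging = packaging;
        this.classifier = classifier;
    }

    // Path of the artifact file in the standard repository layout:
    // group/as/folders/artifactId/version/artifactId-version[-classifier].extension
    // Assumption: the extension is the same as the packaging.
    String repositoryPath() {
        String file = artifactId + "-" + version
                + (classifier == null ? "" : "-" + classifier)
                + "." + packaging;
        return groupId.replace('.', '/') + "/" + artifactId + "/" + version + "/" + file;
    }

    public static void main(String[] args) {
        MavenCoordinates c = new MavenCoordinates("org.example", "my-plugin", "1.0.0", "jar", null);
        System.out.println(c.repositoryPath());
    }
}
```

Because every module is addressed this way, a module that depends on another only needs to name its coordinates, and Maven resolves the file from the repository.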

Work completed so far

This project involves digging deeper into Gephi’s architecture and understanding dependencies, building and packaging. Gephi includes 100+ submodules categorized into Core, UI, Libraries and Plugins sections. NBM, which stands for “NetBeans module”, is the deployment format of modules in NetBeans. It is a ZIP archive, with the extension .nbm, containing the JARs in the module, and their configuration files. NBM files can be manually installed using the Update Center and choosing the option for installing manually downloaded modules, or they can be downloaded and installed directly from netbeans.org or another update server.

I’m happy to say that I was able to successfully mavenize 75 modules and am continuing to complete the rest. I primarily used the NetBeans Module Maven Plugin for this, which now comes built in with the NetBeans 6.9 and 7.0 IDEs. Currently the plugin handles tasks such as registering a new packaging type “nbm”, so that any project with this packaging is automatically turned into a NetBeans module project, creating NBM artifacts and managing branding. It is also capable of populating the local Maven repository with module JARs and NBM files from a given NetBeans installation.

[Figure: screenshot of NetBeans IDE 7.0]

Some third-party libraries used in Gephi are not maintained in any public Maven repository. So I set up a local Sonatype Nexus repository to store and serve these dependencies. The basic functions of a repository manager like Sonatype Nexus are:

  • managing project dependencies,
  • artifacts & metadata,
  • proxying external repositories
  • and deployment of packaged binaries and JARs to share those artifacts with other developers and end-users.

We are in the process of setting up a Sonatype Nexus repository on the official Gephi server as well, so that not only these third-party JARs but also Gephi releases such as the Gephi Toolkit can be served as Maven dependencies to Maven-based projects all over the world.

[Figure: screenshot of the Sonatype Nexus Maven Repository Manager]

Challenges faced during the process

  • Researching existing large-scale applications using NetBeans RCP and Maven
  • Finding documentation on handling NetBeans-specific Ant tasks, now in Maven
  • Managing transitive dependencies and versioning (especially given the slight differences between Maven and NetBeans definitions)
  • Compilation and test failures

Continuous Integration

[Figure: screenshot of a project in Apache Continuum]

While the Maven migration is going on, I also looked into the other aspect of the project: setting up a continuous integration server. The main tasks such a system automates are:

  • checking out source from source control,
  • running clean build,
  • deploying the artifacts in a repository
  • and running unit tests.

Furthermore, it can notify developers via email, IM or IRC on success, failure, error and warning in a build, or simply on a source code management failure. This means that when a project gets updated during development, the continuous integration system will try to build the project and will notify the developers if it runs into any issues. This is very useful when working on a multi-module project with many developers, like Gephi, since a developer may unintentionally break the build: developers work concurrently on the code and may have configurations unique to their development environment that aren’t shared by other developers. I looked at Apache Continuum, Hudson and Jenkins (a fork of Hudson), considering the following criteria: being open source, supporting Ant and Maven, and good integration with Java-based projects.

Hudson is an extensible continuous integration server built by Kohsuke Kawaguchi of Sun Microsystems. Since the design of Hudson includes well thought-out extension points, developers have written plugins to support all of the major version control systems, many different notifiers, and many other options to customize the build process, for example the Amazon EC2 plugin, which uses the Amazon “cloud” as the build cluster.

Continuum is described as a fast, lightweight, and undemanding continuous integration system built by the Apache Maven team. Like Maven, it is built on the Plexus component framework, and it comes bundled with its own Jetty application server. It uses Apache Derby, a 100% Java, fully embedded database, for its persistence needs. All of this makes Continuum self-reliant, and also particularly easy to install in almost any environment.

After considering all of these reasons I settled on Apache Continuum because of its ease of setup and configuration and its out-of-the-box support for Bazaar. Bazaar is the distributed version control system used in Launchpad for managing source code when a lot of developers work together on software projects like Gephi. I have set up a local instance of Apache Continuum to check out and build the Ant-based Gephi hourly. In the future we can host this on the Gephi server to notify the developers and administrators.

Future Work

Since the initial foundation has been laid, it will be quite convenient to complete the rest of the planned work. This will include mavenizing the rest of the modules, creating the .zip distribution, properly running the final project being developed and setting up the infrastructure on the Gephi server.

I would like to thank my mentors Julian Bilcke, Mathieu Bastian and Sébastien Heymann for providing all the guidelines and support for making this project a success. You can find my ongoing work at this repository: https://code.launchpad.net/~keheliya-gallaba/Gephi/maven-build

New Gephi Toolkit release, based on 0.8alpha

A new release of the Gephi Toolkit has arrived, based on the 0.8alpha version. Download the latest package, including Javadoc and demos by clicking on the link below.

It includes all features and bugfixes the 0.8alpha version has, notably:

  • GEXF 1.2 support (partial)
  • Add Neighbour Filter
  • Improve support of meta-edges in Statistics and Filters
  • Edge weight option in PageRank, which can now be used by the algorithm
  • VNA Import (Thanks to Vojtech Bardiovsky)
  • Label Adjust algorithm 3 times faster
  • Saving/Loading projects is faster and uses less memory

Demos available on the Toolkit Portal have been adapted when necessary and tested. If you are interested in using plug-ins from the Toolkit, check out How to use plug-ins with the Toolkit.

This summer, the student Luiz Ribeiro is working on the GSoC Scripting Plugin, a project to bring advanced scripting features to Gephi, using Python. This project will work with the Gephi Toolkit, and greatly facilitate its usage.

Gephi Toolkit released, based on 0.7beta

The 0.7beta version of Gephi was released last week. Today it is the turn of the Gephi Toolkit, with a release based on the latest codebase. Download the latest package, including Javadoc and demos by clicking on the link below.

It includes all features and bugfixes the 0.7beta version has. Therefore it is possible to use dynamic networks and new Data Laboratory features from the Gephi Toolkit.

Two new demos are available from the Toolkit Portal:

  • Import Dynamic – How to import several static files and transform them into a longitudinal network
  • Dynamic Metric – How to execute a metric (ex: Average Degree) for each slice of a dynamic network

Since its launch in July, the Gephi Toolkit has been used in various use cases for graph visualization. Recently, Indiana University launched Truthy, a system to analyze and visualize the diffusion of information on Twitter. Truthy uses the Gephi Toolkit for layout.

Scientist Christian Tominski about Gephi

Guest blog post from Dr. Tominski, who agreed to review Gephi 0.7alpha4 for us.

Christian Tominski received his diploma (MCS) from the University of Rostock in 2002. In 2006 he received his doctoral degree (Dr.-Ing.) from the same university. Currently, Christian is working as a lecturer and researcher at the Institute for Computer Science at the University of Rostock. Christian has authored and co-authored several articles in the field of information visualization. His main interests concern visualization of multivariate data in time and space, visualization of graph structures, and visualization on mobile devices. In his research, a special focus is set on interactivity, including novel interaction methods and implications for software engineering.

Recently, I stumbled upon the Gephi Project – an open source graph visualization system. As I’ve done some research in the area of interactive graph visualization, I was eager to see how Gephi works and if it brings some new concepts or if it’s yet another graph visualization system. I’ll share my thoughts on Gephi from three perspectives. The first one is the user perspective. I’ll take the role of a user who is interested in getting a visual depiction of some graphs. Secondly, I’ll take the role of a developer and shed some light on the aspect of software engineering. And finally, I’ll be a scientist and try to foresee if and in which regard Gephi might have some impact on visualization research.

The User’s Perspective

Gephi has been designed with the users and their needs in mind. The system welcomes its users with a familiar look and feel. It is quite easy to load graph data into the system. Many of the known file formats for graphs are supported, as for instance, DOT, GML, GraphML, or Tulip’s file format TLP. A nice thing about the data import is that an import report provides essential information about the import process (e.g., number of nodes and edges, edge-directedness, potential problems, etc.). Once imported, the graph is shown as nodes and links in a main view, and several complementary views provide additional information.

The main view is the core for visual graph exploration. It allows users to zoom in, to select nodes, to adjust node size and color, to find shortest paths, and to access attributes of nodes and edges. In addition to letting users set sizes and colors manually, the system can also set these automatically based on attributes associated with nodes and edges. What is called “Partition” in Gephi is used to assign unique colors to nodes and edges based on qualitative data attributes (e.g., class affiliation). Quantitative data values can be mapped to size and color of nodes, edges, and labels using the “Ranking” tool. All these tools are customizable. It is worth mentioning, that Gephi provides some nice user controls to parameterize the color coding.

Gephi also supports graph editing, i.e., insertion and deletion of nodes and edges as well as manipulation of attribute values. What is missing in terms of editing the data is the possibility to add (and delete) attributes, for instance to generate some derived data values using simple formulas.

A key aspect in graph exploration is the layout of nodes and edges. As it is usually unclear what will be the best layout for a given graph, Gephi offers various layout algorithms to choose from. While a layout is being computed, the main view constantly updates itself to provide feedback on the progress made. A big plus is that users can interrupt the layout algorithm once they deem the result to be OK or if they find that it might be more suitable to use the current result as the initial setup for another algorithm. This way users can easily tune the layout to fit the graph and their particular needs. Users may put the finishing touches to the layout by moving nodes manually in the main view.

Once a suitable visual representation has been created, the final step is to export nice pictures of the graph. To this end, Gephi provides a dedicated export interface with many options to create high-quality printouts.

People who have worked with larger graphs might know that some computations on graphs (including layout computation) are quite complex and take some time. While other systems are blocked during computation and in the best case provide a progress bar, Gephi is different. Long-running calculations run concurrently with the main application. From my point of view, this is one of the strongest points of Gephi: the system does not block during costly computations. The benefit for the users is that they can always interact, for instance to initiate some other computations or to cancel running ones when they recognize that a re-parameterization would yield better results.
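
The general idea of a computation that neither blocks the caller nor ignores a cancel request can be sketched in plain Java. This is only an illustration under my own assumptions, not Gephi’s actual mechanism: `CancellableTask` is a hypothetical class, and the loop body stands in for one step of real work such as a layout iteration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical task, not Gephi's API: a long-running computation that can be
// observed and cancelled from another thread while it is running.
final class CancellableTask implements Runnable {
    private final AtomicBoolean cancelled = new AtomicBoolean(false);
    private final AtomicLong iterations = new AtomicLong(0);
    private final long maxIterations;

    CancellableTask(long maxIterations) {
        this.maxIterations = maxIterations;
    }

    @Override
    public void run() {
        // Check the flag before every step so a cancel request takes effect quickly.
        while (!cancelled.get() && iterations.get() < maxIterations) {
            iterations.incrementAndGet(); // stands in for one step of real work
        }
    }

    // Safe to call from any thread, e.g. from a "Stop" button handler.
    void cancel() {
        cancelled.set(true);
    }

    // Number of completed steps; can drive a progress display.
    long progress() {
        return iterations.get();
    }
}
```

A caller would start the task with `new Thread(task).start()` and remain free to react to user input, reading `progress()` for feedback and calling `cancel()` when the intermediate result is good enough.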

Concurrency is also how Gephi offers the computation of statistics about the graph. Currently, Gephi supports a variety of classic graph statistics including degree distribution, number of connected components, and others. Based on data attributes and computed statistics, the graph can be filtered to reduce nodes and edges to those that fulfill the filter criteria. In a dynamic filtering UI, several filters can be combined using drag and drop, and thresholds can be manipulated easily, for instance via sliders. Besides using filtering for data reduction, Gephi also provides basic support for graph clustering. However, the currently implemented MCL algorithm is still experimental. It is also possible to manually group nodes to build a hierarchical structure on top of the visualized graph, yet this is quite cumbersome for larger graphs. Additional tools are needed to support the user in creating a navigable hierarchy on top of a graph. Configurable clustering pipelines that combine several strategies for clustering (e.g., based on attributes or based on bi-connected components), together with a clustering wizard user interface, would be helpful.
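
To illustrate how a computed statistic can feed a filter, here is a small Java sketch. It does not use Gephi’s filter API; `GraphFilterSketch` and its methods are hypothetical, and the graph is simply assumed to be an undirected list of `{source, target}` pairs. Combining filters is expressed with the standard `Predicate.and`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch, not Gephi's filter API.
final class GraphFilterSketch {

    // Degree of every node in an undirected edge list of {source, target} pairs.
    static Map<String, Integer> degrees(List<String[]> edges) {
        Map<String, Integer> degree = new HashMap<>();
        for (String[] edge : edges) {
            degree.merge(edge[0], 1, Integer::sum);
            degree.merge(edge[1], 1, Integer::sum);
        }
        return degree;
    }

    // Keep only edges whose two endpoints both satisfy the node predicate.
    static List<String[]> filterEdges(List<String[]> edges, Predicate<String> keepNode) {
        List<String[]> kept = new ArrayList<>();
        for (String[] edge : edges) {
            if (keepNode.test(edge[0]) && keepNode.test(edge[1])) {
                kept.add(edge);
            }
        }
        return kept;
    }
}
```

A threshold slider would then simply rebuild a predicate such as `node -> degree.get(node) >= threshold` and re-run `filterEdges`.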

In summary, I see much potential in Gephi; the overall shape of the system impressed me – me as a user. I personally found it easy to work with Gephi and explore some of my own data sets and some provided on Gephi’s website. Given the fact that the version I’ve worked with is 0.7 alpha, there is also much room for improvement. In the first place I would like to mention the navigation of the graph. The main view provides just basic zoom and pan navigation, which is even imprecise in some situations. Navigation tools like those provided in Google Earth and navigation based on paths through a graph would be really helpful. Moreover, I was missing the concept of linking between views. Selecting an element (node or edge) in one view should highlight that element in all other views. Right now this is not really an issue as the number of views seen in parallel is quite low. But once additional views are needed, for instance to focus on data attributes in a Parallel Coordinates Plot or to visualize the cluster hierarchy in a dedicated view, or when one and the same graph is shown in parallel in two or more main views for comparing different analytic results, linking will be crucial for user experience. But these things are not too complex and should be easy to integrate in future versions of Gephi. Another aspect regards highlighting in the main view: instead of marking the selected node, all non-selected nodes are faded out to focus on the selected node. This implies rather big visual changes because all nodes but one change their appearance when a single node gets selected and deselected.
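
What I mean by linking can be sketched with a shared selection model that notifies every registered view. This is a generic observer-style illustration in Java, not something taken from Gephi; `SharedSelection` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch, not Gephi's API: one shared selection that notifies every view.
final class SharedSelection {
    private final List<Consumer<String>> listeners = new ArrayList<>();
    private String selectedNode; // null means nothing is selected

    // A view registers a callback that highlights the given node.
    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    // Called by whichever view the user clicked in; all views are notified.
    void select(String nodeId) {
        selectedNode = nodeId;
        for (Consumer<String> listener : listeners) {
            listener.accept(nodeId);
        }
    }

    String selected() {
        return selectedNode;
    }
}
```

Because the views only talk to the shared model, adding a further view later does not require changes to the existing ones.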

Pros:
  • Easy graph import and export
  • Many options for visual encoding
  • Various layout algorithms to choose from
  • Support for dynamic filtering
  • Computation of graph statistics
  • Basic support for graph clustering
  • System does not block during long running computations

Cons:
  • Graph navigation can be improved
  • No linking among views
  • Few visual glitches
  • Still an alpha version with bugs here and there

The Developer’s Perspective

Now let me switch to the developer’s view. Gephi is open source software, so everybody can participate in improving the system or can adapt the system to personal or business needs. Gephi seems to be very well designed on the back end. The project is based on the Netbeans platform and the Java language. It is subdivided into a number of modules that define several APIs and SPIs and that provide implementations of these interfaces. Thanks to the modular structure, Gephi can be extended quite easily. The best way to do so is to implement plugins. Plugins can be used, for instance, to add further layout or clustering algorithms, statistical computations, filter components, or export methods. The modular structure also allows for using only specific parts of the Gephi project in one’s own projects. The Gephi Toolkit is a good example. It is not an end-user desktop application, but a class library that provides Gephi’s functionality and data structures to those who want to reuse them in different ways.
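
The API/SPI split can be illustrated with a small, self-contained Java sketch. None of the names below come from Gephi: `StatisticPlugin`, `DensityPlugin`, and `PluginRegistry` are hypothetical, and the registry is only a stand-in for the service discovery that the Netbeans platform provides through its Lookup mechanism.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical service provider interface (SPI): the contract a plugin implements.
interface StatisticPlugin {
    String name();

    double compute(int nodeCount, int edgeCount);
}

// A plugin implementation: density of an undirected graph without self-loops.
final class DensityPlugin implements StatisticPlugin {
    @Override
    public String name() {
        return "density";
    }

    @Override
    public double compute(int nodeCount, int edgeCount) {
        if (nodeCount < 2) {
            return 0.0; // density is not meaningful for fewer than two nodes
        }
        return 2.0 * edgeCount / ((double) nodeCount * (nodeCount - 1));
    }
}

// Hypothetical registry standing in for the platform's service discovery.
final class PluginRegistry {
    private final Map<String, StatisticPlugin> plugins = new LinkedHashMap<>();

    void register(StatisticPlugin plugin) {
        plugins.put(plugin.name(), plugin);
    }

    // Returns null if no plugin with that name is registered.
    StatisticPlugin find(String name) {
        return plugins.get(name);
    }
}
```

The host application depends only on the interface, so a new statistic can be added by shipping another implementation without touching existing code.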

As I’ve mentioned in the user perspective, the way Gephi deals with long-running computations is a big plus. Given the fact that aspects of multithreading are inherent in the system from the very beginning and are manifested at the system’s core, I sincerely hope – no, I’m quite sure – that Gephi will not run into all the problems that are likely to occur when multithreading is integrated into an existing single-threaded system, as I have experienced myself. I also conjecture that others will find it much easier to implement concurrent non-blocking extensions of the system simply by following the way existing code handles things in Gephi.

As Gephi is split up into many different modules, it took me a while to get accustomed to the system and to learn which functionality can be found in which module. But I have to add that I had no prior experience in Netbeans platform development and the module concept that is used there. I also found that the code documentation could be improved in several parts of Gephi’s sources. On the other hand, the Gephi website provides informative wiki pages with various examples and tutorials.

My view from the developer’s perspective can be summarized as the following pros and cons:

Pros:
  • Open source
  • Modular structure
  • Well defined interfaces
  • Extensible via plugins
  • Inherently multithreaded

Cons:
  • In-code documentation can be improved

The Scientist’s Perspective

As a scientist I’m not so much interested in developing fully fledged end-user software, but in developing solutions to scientific questions and in publishing the results. A difficulty in interactive visualization is that one usually needs a broad basis of fundamental functionality to be able to develop such solutions. Previous attempts at establishing a common infrastructure for interactive data exploration made notable progress, but eventually did not fully succeed or are no longer actively maintained. This is due to the fact that a single researcher usually does not have the time to do decent research and at the same time maintain a larger software project.

I personally feel that Gephi can become such a fundamental infrastructure. Maintained by an active community, the system allows researchers to focus on solutions in the form of plugins, while they can utilize the functionality that the system provides. Visualization researchers will be happy if they can simply plug in new visualization techniques as additional views, test new layout algorithms, and experiment with new clustering methods. Moreover, new solutions can be easily disseminated to real users in the community. This might prove beneficial when it comes to acquiring early user feedback or when more formal user evaluation is needed prior to publishing new techniques and concepts.

A big issue in visualization research is visual analytics, that is, the combination of analytical, interactive, and visual means to facilitate making sense of large volumes of data. In terms of analytic means, a goal is to break analytic black boxes and make analysis algorithms interactively steerable. With the architecture of Gephi, where parameterizable algorithms run concurrently and provide feedback in the form of intermediate results, I believe this goal can be reached in the future. A thing that I’m curious about is whether it is also possible to come up with concepts that allow for plugging in new interaction techniques. As interaction is usually quite tightly bound to a view, I wonder if interaction could be implemented as independent plugins as well, and if novel interaction concepts (e.g., touch interaction) will be supported in the future. Furthermore, aspects of interactive collaboration of multiple users working to solve a common analysis problem could be of interest. A question related to the visual side is whether it is possible to use Gephi with different displays and display environments such as tabletop displays, display walls, smartphones, or multi-display environments.

A facet of graph visualization that I did not mention in the user’s perspective, as I felt it more suited to be mentioned here, is dealing with dynamically changing graphs. Visualization of time-varying graphs is a hot research topic and Gephi is about to face this challenge. There is preliminary support for exploring time-dependent graphs via a time slider. But there is more to this than just browsing in time. Concepts have to be integrated to support easy comparison of multiple snapshots of a graph and to highlight significant changes in the development of a graph’s history.
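
The simplest form of snapshot comparison is a set difference over the edges of two time steps. The following Java sketch shows that idea only; `SnapshotDiff` is a hypothetical class, and representing each edge as a label such as "a-b" is an assumption made to keep the example short.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: compares two snapshots of a graph given as sets of edge labels.
final class SnapshotDiff {

    // Edges present in the later snapshot but not in the earlier one.
    static Set<String> added(Set<String> earlier, Set<String> later) {
        Set<String> result = new TreeSet<>(later);
        result.removeAll(earlier);
        return result;
    }

    // Edges present in the earlier snapshot but not in the later one.
    static Set<String> removed(Set<String> earlier, Set<String> later) {
        return added(later, earlier);
    }
}
```

A view could then draw added edges in one color and removed edges in another, so that changes stand out instead of having to be spotted while dragging a time slider.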

Let me try to put my thoughts into a pros and cons list:

Pros:
  • Potential infrastructure for visualization research
  • Researchers can focus on solutions in form of plugins
  • Potential to use community for user feedback and evaluation
  • Partial results for current research questions (graph clustering, steerable algorithms, dynamic graphs)
  • Nice playground for experimentation and testing new ideas

Cons:
  • Unclear if new and alternative technologies will be supported

Summary

Since I got my hands on Gephi, I’m infected. Maybe I’m dazzled by the beautiful demo video or the nice pictures that have been generated using Gephi, but in my opinion Gephi has the potential to become a big player in interactive visual graph exploration and analysis. From all perspectives that I’ve taken I see many positive things – and plenty of room for improvements or additional features. I do hope that the people behind Gephi will continue their work to the benefit of all users, developers, and researchers.

Related Stuff

There are many other systems and frameworks out there that do a great job in interactive graph visualization or in supporting it as a toolkit. I would like to give credit to these systems, because they can be the source of many ideas and much inspiration:

To go further about Gephi design, see also this article about semiotics.

Announcing the Gephi Toolkit

We are announcing today the first release of the Gephi Toolkit. The Toolkit project packages essential modules (Graph, Layout, Filters, IO…) in a standard Java library, which any Java project can use for getting things done. The toolkit is just a single JAR that anyone could reuse in a Java program and script Gephi features.

The toolkit is the counterpart of the desktop application. Gephi’s user interface aims to be simple and intuitive, with no command line or scripting needed. The toolkit is made for people who want to:

  • Script, automate features & reproduce the same procedure over and over
  • Reuse Gephi features and algorithms in other projects and software
  • Develop all types of mashups or web services that deal with networks

A lot of new content is coming with the release of the Toolkit. A new portal appeared on the wiki, with documentation. Above all we provide demos and examples and a tutorial for newcomers. The cool thing is that it is very easy to use and this is all compatible with Gephi plugins. What is done for Gephi desktop can be reused in the toolkit.

Gephi is designed in a modular way and split into different modules. All features are wrapped into separate modules, for instance a module for the graph structure, a module for the layout algorithms and so on. Moreover, business modules are separated from user interface modules. That makes it possible to keep only the business modules and remove the UI without any problems. That is the purpose of the toolkit, which wraps only core modules and removes the whole UI layer. So the toolkit just takes what already exists in Gephi and packages it.

That is all thanks to the power of Java and the Netbeans Platform. The way modular development is encouraged and the ability to manually extract modules from the Netbeans Platform are both thanks to the way they designed the architecture and use standards like ‘ant’ and plain Java. It’s a good occasion to say kudos to them!

With the release of the toolkit, we are also moving to the AGPL license, as announced earlier. The GNU Affero General Public License is a modified version of the ordinary GNU GPL version 3. It has one added requirement: if you run the program on a server and let other users communicate with it there, your server must also allow them to download the source code corresponding to the program that it’s running. If what’s running there is your modified version of the program, the server’s users must get the source code as you modified it.