Software Tool Article

Enrichment Map – a Cytoscape app to visualize and explore OMICs pathway enrichment results

[version 1; peer review: 2 approved]
PUBLISHED 01 Jul 2014

This article is included in the Bioinformatics gateway.

This article is included in the Cytoscape gateway.

Abstract

High-throughput OMICs experiments generate signals for millions of entities (i.e. genes, proteins, metabolites or any measurable biological entity) in the cell. To summarize and explore these signals, expression results are examined in the context of known pathways and processes through enrichment analysis, which generates a set of pathways and processes that are significantly enriched. Because annotation resources are highly redundant, this often results in hundreds of sets. To facilitate the analysis of these results, we have developed the Enrichment Map app to visualize enrichments as a network. We have updated Enrichment Map to support Cytoscape 3 and have added features including new data formats and command line access.

Introduction

With the expansion and accessibility of a wide range of experimental techniques that accurately identify and measure, in a high-throughput manner, any known genomic feature (including proteins, transcripts, genes, microRNAs, copy number variations, and DNA methylation), an individual OMICs experiment often generates signals for thousands of entities. To interpret these results in the context of perturbed cellular mechanisms, the entities are often scored and examined for enrichment in known pathways and processes.

Pathway enrichment analysis helps to uncover general trends or themes present in the data, instead of focusing on one or a few favorite differential genes. Available tools are abundant, designed for varying data types and implemented using a range of statistical tests: given a set of biological entities, they translate OMICs signals into a set of significant pathways and processes (reviewed in Khatri et al. [1] and Huang et al. [2]). Because of the high redundancy between pathway databases, which arises from multiple functional annotations of gene products, pathway enrichment often results in a long list of potentially interesting pathways. To help analyze the set of differential pathways, we created the Enrichment Map app to display enrichment results as a network. Pathways are nodes in the network, edges represent known pathway cross-talk defined by the number of genes shared between a pair of pathways, and the network layout organizes the map into functional modules [3].

In this paper, we present the recent implementation of the Enrichment Map app for Cytoscape 3 as well as new features.

Implementation

Although originally designed to support Gene Set Enrichment Analysis (GSEA) [4], the current Enrichment Map app supports enrichment results from multiple tools such as DAVID [5], BiNGO [6], and GREAT [7], as well as simplified generic input files that users can easily create from their own enrichment results. Tools like g:Profiler [8] allow users to download results in an Enrichment Map compatible generic format.

With the ongoing effort to populate gene annotation and pathway databases, it is difficult for standalone enrichment tools to keep their databases up to date. For convenience, we compile gene set files every month from a comprehensive set of annotation and pathway databases (http://download.baderlab.org/EM_Genesets/), including standard sources like MSigDB [4]. These files use the GMT format, created for the GSEA software, which lists all the genes contained in each gene set. Although GMT files were originally specific to GSEA, with the expansion of R and Bioconductor it is now straightforward to load them into R data structures using packages like GSA (http://statweb.stanford.edu/~tibs/ftp/GSA.pdf) and to analyze OMICs expression data with one of the many gene set enrichment algorithms, such as geneSetTest in the Limma package [9], global test [10], or Camera [11]. The resulting enrichments can then be visualized by exporting them to our generic format, which minimally consists of the gene set name, a description and the associated enrichment p-value. Through this mechanism, whether the dataset of interest measures gene, protein or metabolite expression, the resulting enrichment analysis can be displayed as an enrichment map.
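As a rough illustration of the two file types discussed above, the following Python sketch reads GMT-style text (one gene set per line, tab-separated as name, description, then genes) and writes a minimal tab-delimited table of gene set name, description and p-value. The function names, the header row and the toy gene sets are assumptions made for this example; the exact column layout expected by the app should be checked against its documentation.

```python
import csv
import io


def parse_gmt(handle):
    """Parse GMT text: each line is name <TAB> description <TAB> gene1 <TAB> gene2 ..."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, description, genes = fields[0], fields[1], fields[2:]
        gene_sets[name] = (description, {g for g in genes if g})
    return gene_sets


def write_generic_enrichments(rows, handle):
    """Write (name, description, p-value) rows as tab-delimited text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # Hypothetical header row; the real generic format may name columns differently.
    writer.writerow(["name", "description", "pvalue"])
    for name, description, pvalue in rows:
        writer.writerow([name, description, f"{pvalue:.3g}"])


# Toy data for illustration only (not real pathway annotations).
gmt_text = (
    "SET_A\tExample set A\tTBX20\tGATA4\n"
    "SET_B\tExample set B\tGATA4\tHAND2\tNKX2-5\n"
)
gene_sets = parse_gmt(io.StringIO(gmt_text))

out = io.StringIO()
write_generic_enrichments([("SET_A", gene_sets["SET_A"][0], 0.0012)], out)
print(out.getvalue())
```

In practice the p-values would come from whichever enrichment method was run; only the three minimal fields named in the text are written here.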

There are two main ways to input data into Enrichment Map: through the user interface (Figure 1) or through the command tool (Table 1). The user interface is an interactive way to specify all the required files and parameters based on the analysis type chosen. The command tool allows users to create maps automatically from the command line, from other Cytoscape apps, or from other programs, which can include in-house enrichment tools.


Figure 1. Enrichment Map app user interface

Illustration of the Enrichment Map user interface, which consists of four main parts: analysis type, file specifications, node filtering and edge filtering. For each analysis type there is a different set of required files. For added functionality there is a set of optional files that can be included to help annotate and explore results. Tuning parameters such as p-value and q-value helps control the number of nodes, while tuning the similarity coefficient helps control the number of edges.

Table 1. Command tool specification outlined for each of the analysis types.

There is an additional command optimized for GSEA inputs only.

Command: enrichmentmap build (analysistype="GSEA")
Required arguments:
  analysistype="GSEA"
  gmtFile=filepath to geneset file
  enrichmentsDataset1=filepath to enrichments
  enrichments2Dataset1=filepath to enrichments
Optional arguments:
  pvalue=numerical cutoff, {default : 0.05}
  qvalue=numerical cutoff, {default : 0.1}
  coefficients=one of the following [OVERLAP, JACCARD, COMBINED], {default : OVERLAP}
  similaritycutoff=numerical cutoff, {default : 0.5}
  expressionDataset1=filepath to expression file
  ranksDataset1=filepath to rank file
  classDataset1=filepath to class file
  phenotype1Dataset1=Text representing Phenotype
  phenotype2Dataset1=Text representing Phenotype2
  enrichmentsDataset2=filepath to enrichments
  enrichments2Dataset2=filepath to enrichments
  (Replace 1 with 2 in an argument name to specify which dataset the file belongs to)

Command: enrichmentmap build (analysistype="generic")
Required arguments:
  analysistype="generic"
  gmtFile=filepath to geneset file
  enrichmentsDataset1=filepath to enrichments
Optional arguments:
  pvalue=numerical cutoff, {default : 0.05}
  qvalue=numerical cutoff, {default : 0.1}
  coefficients=one of the following [OVERLAP, JACCARD, COMBINED], {default : OVERLAP}
  similaritycutoff=numerical cutoff, {default : 0.5}
  expressionDataset1=filepath to expression file
  ranksDataset1=filepath to rank file
  classDataset1=filepath to class file
  phenotype1Dataset1=Text representing Phenotype
  phenotype2Dataset1=Text representing Phenotype2
  enrichmentsDataset2=filepath to enrichments
  (Replace 1 with 2 in an argument name to specify which dataset the file belongs to)

Command: enrichmentmap build (analysistype="David/BiNGO/Great")
Required arguments:
  analysistype="David/BiNGO/Great"
  enrichmentsDataset1=filepath to enrichments
Optional arguments:
  pvalue=numerical cutoff, {default : 0.05}
  qvalue=numerical cutoff, {default : 0.1}
  coefficients=one of the following [OVERLAP, JACCARD, COMBINED], {default : OVERLAP}
  similaritycutoff=numerical cutoff, {default : 0.5}
  expressionDataset1=filepath to expression file
  enrichmentsDataset2=filepath to enrichments
  (Replace 1 with 2 in an argument name to specify which dataset the file belongs to)

Command: enrichmentmap gseabuild
Required arguments:
  edb=filepath to GSEA results edb directory
Optional arguments:
  pvalue=numerical cutoff, {default : 0.05}
  qvalue=numerical cutoff, {default : 0.1}
  coefficients=one of the following [OVERLAP, JACCARD, COMBINED], {default : OVERLAP}
  similaritycutoff=numerical cutoff, {default : 0.5}
  expression=filepath to expression file
  expression2=filepath to expression file
  edbdir2=filepath to edb directory
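
As an illustration of how another program might assemble one of the commands in Table 1, the sketch below builds a command string in Python. The argument names come from the table; the helper function, the placeholder file paths and the choice to quote only text values are assumptions for this example, and the string would still need to be passed to Cytoscape's command interface by whatever mechanism is in use.

```python
def build_command(command, **arguments):
    """Join a command name and key=value arguments into one command string.

    Text values are wrapped in double quotes and numbers are left bare.
    This quoting convention is an assumption, not taken from the article.
    """
    parts = [command]
    for key, value in arguments.items():
        if isinstance(value, str):
            parts.append(f'{key}="{value}"')
        else:
            parts.append(f"{key}={value}")
    return " ".join(parts)


# Placeholder paths; replace with real files before use.
command = build_command(
    "enrichmentmap build",
    analysistype="generic",
    gmtFile="/path/to/genesets.gmt",
    enrichmentsDataset1="/path/to/enrichments.txt",
    pvalue=0.005,
    qvalue=0.05,
    coefficients="OVERLAP",
    similaritycutoff=0.3,
)
print(command)
```

Omitting the optional arguments leaves the defaults listed in Table 1 in effect.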

Once files and parameters have been specified, the Enrichment Map can be created. Unlike in a traditional biological network, each node in an Enrichment Map represents a set of genes (e.g. a pathway) and each edge represents the set of genes that two nodes have in common (e.g. pathway cross-talk). Every Enrichment Map is associated with a set of files, parameters, and a number of datasets (currently limited to two) (Figure 2). Datasets contain gene sets, enrichments, and expression data, all of which are needed to interactively update the map through the cutoff adjustment sliders in the legend panel, or to display the genes contained in a given node or edge selection as a heat map.
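
To make the edge definition concrete, here is a minimal Python sketch that scores the gene overlap between pairs of gene sets and keeps pairs above a cutoff. The Jaccard and overlap coefficients follow their standard set-theoretic definitions; the equal-weight formula used for COMBINED, the strict ">" comparison and the toy gene sets are assumptions for illustration and may not match the app's implementation.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient: shared genes divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap(a, b):
    """Overlap coefficient: shared genes divided by the size of the smaller set."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def combined(a, b, weight=0.5):
    """Assumed combination: a weighted mean of the two coefficients."""
    return weight * overlap(a, b) + (1 - weight) * jaccard(a, b)


def build_edges(gene_sets, cutoff=0.5, coefficient=overlap):
    """Return (set1, set2, score, shared genes) for pairs scoring above the cutoff."""
    edges = []
    for name1, name2 in combinations(sorted(gene_sets), 2):
        genes1, genes2 = gene_sets[name1], gene_sets[name2]
        score = coefficient(genes1, genes2)
        if score > cutoff:
            edges.append((name1, name2, score, genes1 & genes2))
    return edges


# Toy gene sets with placeholder gene names.
toy_sets = {
    "A": {"G1", "G2", "G3", "G4"},
    "B": {"G3", "G4"},
    "C": {"G4", "G5", "G6", "G7", "G8", "G9"},
}
for edge in build_edges(toy_sets, cutoff=0.3):
    print(edge)
```

Raising the cutoff removes weaker edges, which is the effect the similarity cutoff parameter has on the number of edges in the map.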


Figure 2. Enrichment Map build process overview.

The Enrichment Map app was ported to Cytoscape 3 as a bundle app using Open Services Gateway initiative (OSGi) services provided through the extensive Cytoscape API (version 3.1). The look and feel of the app remains similar to the original implementation for Cytoscape 2: the user input interfaces and view panels, including the expression heat map and legend, are a direct port from the original source. In the new framework, each panel implements the CytoPanelComponent and is a registered service associated with the Enrichment Map app. The main enrichment map input panel is registered only once a user opens the app, and the remaining view panels are registered only once an enrichment map is created. Enrichment Map consists of one main taskFactory that, given an Enrichment Map object populated with a set of input files, constructs the appropriate task iterator. Depending on the files specified, different parsing tasks are added to the iterator. Multiple files of the same type can also be added to the queue as distinct instances of a parsing task (with a different file specified on task creation). All parsed files populate fields of the Enrichment Map object, which is then passed to and updated by each subsequent task (Figure 2).
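
The task-iterator pattern described above can be sketched in a language-neutral way. The Python below is not the app's Java code and does not use the Cytoscape API; every class and function name is hypothetical, and file parsing is replaced by a placeholder that only records which file filled which field of the shared object.

```python
from dataclasses import dataclass, field


@dataclass
class EnrichmentMapData:
    """Hypothetical stand-in for the shared Enrichment Map object."""
    files: dict                                   # file type -> list of file paths
    parsed: dict = field(default_factory=dict)    # file type -> parsed content
    network_built: bool = False


class ParseFileTask:
    """One parsing task per input file; several may exist for one file type."""

    def __init__(self, file_type, path):
        self.file_type = file_type
        self.path = path

    def run(self, em):
        # Placeholder for real parsing: record the file under its field.
        em.parsed.setdefault(self.file_type, []).append(self.path)


class BuildNetworkTask:
    """Final task: runs after all parsing tasks have updated the shared object."""

    def run(self, em):
        em.network_built = bool(em.parsed.get("enrichments"))


def create_task_iterator(em):
    """Queue one parsing task per specified file, then the network-building task."""
    tasks = [ParseFileTask(t, p) for t, paths in em.files.items() for p in paths]
    tasks.append(BuildNetworkTask())
    return iter(tasks)


em = EnrichmentMapData(
    files={"gmt": ["genesets.gmt"], "enrichments": ["dataset1.txt", "dataset2.txt"]}
)
for task in create_task_iterator(em):
    task.run(em)
print(em.parsed, em.network_built)
```

The point of the design is that each task receives and updates the same object, so the set of tasks in the queue can vary with the files supplied without changing the final build step.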

The BuildEnrichmentMapTaskFactory is used by both the user interface and the command tool to construct an enrichment map. Command tool functionality requires the task to define its variables as tunables, which are user-supplied values needed by the task. User interfaces can be generated automatically for such tasks from the set of tunable definitions. When implementing the Enrichment Map tunable task, we intended to replace our existing user interface with the automatically generated one. However, given the varied data required from the user and the interactive nature of our existing interface, the generated tunable interface, although functional, lacked features that our users are accustomed to. For instance, to specify the analysis type or similarity coefficient, our interface has two sets of radio buttons in which all options are visible and only one is selectable. In the tunable interface, the same choice can only be represented as a single selection list: a drop-down list from which the user chooses one option. Both representations are functional, but we preferred the radio buttons. We therefore kept our original interface and added the tunable task solely for the command tool functionality.

Results

To illustrate the functionality of Enrichment Map, we analyzed and visualized an expression dataset from the Gene Expression Omnibus (GEO) [12] for mouse fibroblast cells. The experiment was designed to compare gene expression in fibroblasts from the heart with that in fibroblasts from the tail, to highlight genes that are uniquely expressed in heart fibroblasts [13] (GSE50531). Raw expression data were scored using the GEO2R tool available on the GEO website. The scored expression data were input to GSEA along with a recent compilation of mouse pathway gene sets (May 14, 2014; http://download.baderlab.org/EM_Genesets/May_14_2014/) to calculate enrichments. GSEA output files were given to the app with the cutoffs p-value < 0.005, q-value < 0.05 and overlap similarity coefficient > 0.3. The resulting Enrichment Map had roughly the same number of enriched gene sets specific to heart as to tail, with cardiac-specific sets associated only with the heart phenotype (Figure 3, red nodes).
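
The node filtering step used here can be sketched as follows. The cutoffs match those quoted above and the defaults match Table 1, but the function, the data layout and the example values are assumptions for illustration and are not results from the dataset.

```python
def filter_gene_sets(enrichments, pvalue_cutoff=0.05, qvalue_cutoff=0.1):
    """Keep gene sets whose p-value and q-value are both below the cutoffs.

    `enrichments` maps a gene set name to a (p-value, q-value) pair.
    """
    return {
        name: (p, q)
        for name, (p, q) in enrichments.items()
        if p < pvalue_cutoff and q < qvalue_cutoff
    }


# Made-up enrichment values, for illustration only.
toy_enrichments = {
    "SET_A": (0.001, 0.01),
    "SET_B": (0.004, 0.08),   # passes the p-value cutoff but fails the q-value cutoff
    "SET_C": (0.02, 0.04),    # fails the p-value cutoff
}
kept = filter_gene_sets(toy_enrichments, pvalue_cutoff=0.005, qvalue_cutoff=0.05)
print(sorted(kept))
```

Stricter cutoffs such as these reduce the number of nodes, which keeps a map built from hundreds of enriched sets readable.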


Figure 3. Enrichment Map of heart fibroblast versus tail fibroblast expression.

Using the search field, you can enter any text to search all attributes of the given network. Highlighted nodes (shown as yellow nodes with red edges, just left of center) are gene sets that contain the gene TBX20.

One of the main genes discussed in the paper associated with this dataset is TBX20, described as a specific cardiogenic fibroblast gene that is important for both normal cardiac development and postinfarct repair [13]. In Enrichment Map, all gene sets that contain it can be found by entering the term TBX20 into the search box (Figure 3); this also highlights any gene sets that have TBX20 in the name or any other attribute. The built-in search functionality in Cytoscape 3 has improved over Cytoscape 2: all attributes associated with a given network are indexed, so there is no longer any need to specify which attribute to search. Selecting individual nodes or edges, or sets of them, displays the genes contained within the selection as a heat map (Figure 4).
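
A search across all node attributes, as described above, can be imitated with a few lines of Python. This is only a conceptual sketch: it uses case-insensitive substring matching over a plain dictionary, whereas Cytoscape's own search relies on its internal index, and the node names and attributes below are invented for the example.

```python
def search_nodes(nodes, term):
    """Return the names of nodes for which any attribute value contains the term."""
    term = term.lower()
    hits = []
    for name, attributes in nodes.items():
        for value in attributes.values():
            # Treat list-like attributes (e.g. gene lists) element by element.
            values = value if isinstance(value, (list, set, tuple)) else [value]
            if any(term in str(v).lower() for v in values):
                hits.append(name)
                break
    return sorted(hits)


# Invented nodes; attribute names are hypothetical.
toy_nodes = {
    "SET_A": {"description": "Example set A", "genes": ["TBX20", "GATA4"]},
    "SET_B": {"description": "Example set B", "genes": ["HAND2"]},
    "SET_C": {"description": "Targets of TBX20", "genes": ["NKX2-5"]},
}
print(search_nodes(toy_nodes, "tbx20"))
```

Note that SET_C matches through its description rather than its gene list, mirroring the remark that the search also highlights gene sets with the term in any other attribute.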


Figure 4. Node Heat Map Panel (contained in the Cytoscape table panel) displayed on selection of “Pericardium development (GO:0060039)” gene set.

If GSEA results are loaded into Enrichment Map, GSEA leading edge genes, defined as the set of genes that contribute most to the enrichment, are highlighted in yellow.

Often, one of the main challenges after creating an Enrichment Map is going from a network in Cytoscape to a publication-quality figure. We format the labels so that they are more readable and do not extend across the whole screen, but as a result modules often contain overlapping labels that are difficult to read and can require hours of manual formatting before the network is usable in a figure. The Cytoscape 3 built-in scaling feature (Layout>Scale) can improve the visualization of clusters and networks.

Conclusions

The Enrichment Map app allows users to translate large sets of enrichment results to a network where highly similar terms cluster together to better highlight overall trends and themes of the underlying data. The details behind the enrichment can be further investigated within the Enrichment Map app using the built-in expression viewer to see all the entities associated with a selected pathway.

Software availability

Software available from: http://apps.cytoscape.org/apps/enrichmentmap

Latest source code: https://github.com/BaderLab/EnrichmentMapApp

Source code as at the time of publication: https://github.com/F1000Research/EnrichmentMapApp/releases/tag/V1.0

Archived source code as at the time of publication: http://dx.doi.org/10.5281/zenodo.1054214

License: GNU Lesser General Public License, version 2.1: https://www.gnu.org/licenses/old-licenses/lgpl-2.1.html

Tutorials: http://baderlab.org/Software/EnrichmentMap#Tutorials_and_Examples

How to cite this article

Isserlin R, Merico D, Voisin V and Bader GD. Enrichment Map – a Cytoscape app to visualize and explore OMICs pathway enrichment results [version 1; peer review: 2 approved]. F1000Research 2014, 3:141 (https://doi.org/10.12688/f1000research.4536.1)
Open Peer Review

Reviewer Report 15 Jul 2014
Florian Markowetz, Department of Oncology, University of Cambridge, Cambridge, UK 
Approved
This paper is an update on the Enrichment Map introduced in 2010 by the same authors. They have extended the methodology and made it available in the newest version of Cytoscape.

Gene set enrichment methods of various forms are one of ...
Markowetz F. Reviewer Report For: Enrichment Map – a Cytoscape app to visualize and explore OMICs pathway enrichment results [version 1; peer review: 2 approved]. F1000Research 2014, 3:141 (https://doi.org/10.5256/f1000research.4852.r5298)
Reviewer Report 14 Jul 2014
Nathan Salomonis, Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center Research Foundation, Cincinnati, OH, USA 
Approved
The authors present an already highly used and very useful approach for making sense of highly redundant biological enrichment results that arise from the analysis of transcriptome and genomics datasets. The added additional utility of accessing the software by command ...
Salomonis N. Reviewer Report For: Enrichment Map – a Cytoscape app to visualize and explore OMICs pathway enrichment results [version 1; peer review: 2 approved]. F1000Research 2014, 3:141 (https://doi.org/10.5256/f1000research.4852.r5301)