About INSIdEnano

INSIdEnano is a graphical tool that highlights connections between phenotypic entities based on their effects on genes. The database behind the tool is a network whose nodes are grouped into four categories:

  • Nanomaterials
  • Drugs
  • Chemicals
  • Diseases

For each element, its effects on genes are known. Edge weights in the network indicate how similar the effects on genes are for each pair of nodes.

Data

INSIdEnano integrates four different types of phenotypic entities:
  • Nanomaterials: Gene expression data of different human cell types exposed to different engineered nanomaterials (ENMs), coming from the NanoMiner project. NanoMiner is a collection of 634 samples derived from human primary cells and cell lines exposed to ENMs.
  • Drugs: Gene expression data for drug treatments were downloaded from the Connectivity Map (CMap) web page.
  • Diseases and Chemicals: Manually curated information about chemical-gene and disease-gene interactions was retrieved from the Comparative Toxicogenomics Database (CTD) website.

For each phenotypic entity, a list of associated genes is given. In particular, a set of genes is associated with each disease and chemical, while an ordered list of genes resulting from differential expression analysis is built for each drug and ENM in the data set. To construct a network of similarity between the phenotypic entities, the pairwise similarity between each possible pair of entities was evaluated. Because the information differs in nature (sets or ordered lists of genes), different measures were applied to evaluate the pairwise similarity:

  • The Jaccard Index was used to compute the similarity between sets of genes (e.g. two diseases, two chemicals, or a disease and a chemical)
  • The Kendall Tau distance was used to compute the similarity between ordered lists of genes (e.g. two nanomaterials, two drugs, or a nanomaterial and a drug)
  • The Gene Set Enrichment Analysis was used to compute the similarity between sets and ordered lists of genes (e.g. a chemical/disease and a nanomaterial/drug)
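The first two measures can be illustrated with a minimal sketch. It assumes placeholder gene identifiers (G1, G2, ...) and uses the textbook definitions of the Jaccard index and the Kendall tau distance; how INSIdEnano turns the distance into a similarity is not described here, and the Gene Set Enrichment Analysis step is not shown.

```python
from itertools import combinations


def jaccard_index(genes_a, genes_b):
    """Size of the intersection divided by the size of the union of two gene sets."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def kendall_tau_distance(ranking_a, ranking_b):
    """Number of gene pairs that the two rankings place in opposite order.

    Both rankings must contain exactly the same genes, each once.
    """
    if sorted(ranking_a) != sorted(ranking_b):
        raise ValueError("rankings must contain the same genes")
    position_b = {gene: i for i, gene in enumerate(ranking_b)}
    discordant = 0
    for first, second in combinations(ranking_a, 2):
        # 'first' precedes 'second' in ranking_a; count the pair if ranking_b flips it.
        if position_b[first] > position_b[second]:
            discordant += 1
    return discordant


# Hypothetical inputs: two gene sets (e.g. two diseases) and two ordered lists (e.g. two drugs).
set_similarity = jaccard_index({"G1", "G2", "G3"}, {"G2", "G3", "G4"})
list_distance = kendall_tau_distance(["G1", "G2", "G3", "G4"], ["G2", "G1", "G3", "G4"])
print(set_similarity, list_distance)
```

The two gene sets share two of four distinct genes, and the two rankings differ only in the order of one pair.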

The pairwise similarity matrix was used as an adjacency matrix to construct a weighted undirected network in which the nodes are the entities and the similarity between two entities is the weight of the edge connecting them. Each similarity measure has a different range of values, both positive and negative. To make them comparable, the values were scaled into the uniform range 0-1 by means of the cumulative function. The signs, unlike the similarity values, were not altered; each edge in the network therefore carries a sign that indicates whether the correlation between a pair of nodes is positive or negative.
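One way to picture this scaling is sketched below. It assumes that "the cumulative function" is the empirical cumulative distribution of the absolute similarity values of one measure; that reading, the function name and the toy values are assumptions, not a description of the actual pipeline.

```python
from bisect import bisect_right


def scale_preserving_sign(values):
    """Map each raw similarity to a (weight, sign) pair.

    weight: fraction of absolute values that are <= |value|, so it lies in (0, 1].
    sign:   +1 or -1, copied from the raw value (zero is treated as positive).
    """
    magnitudes = sorted(abs(v) for v in values)
    total = len(magnitudes)
    scaled = []
    for value in values:
        weight = bisect_right(magnitudes, abs(value)) / total
        sign = 1 if value >= 0 else -1
        scaled.append((weight, sign))
    return scaled


# Made-up raw similarities from a single measure.
raw = [0.2, -0.5, 0.1, -0.05]
print(scale_preserving_sign(raw))
```

Applying the same transformation separately to each measure would put all edge weights on a common 0-1 scale while leaving the sign of every edge untouched.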

INSIdEnano data organization

Tool description and tutorials

Exploratory Analysis. The system provides two major functions for the exploratory analysis of the data set. The first is the visualization of the phenotypic network and the second is the visualization of the clustering of the phenotypic entities.

Query Analysis. The tool provides two different types of queries. The former, called simple query, allows the user to investigate the connections of a specific element in the network. Given a node and a threshold, the tool shows all its neighbours divided into four categories: nanomaterials, diseases, drugs and chemicals. For each phenotypic entity in the query output, the tool reports its position in the ranking of the neighbours and whether the connection is already known in the literature. Moreover, the tool displays the connection distributions for the query input.
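The logic of a simple query can be sketched as follows. This is an illustration only: the edge dictionary, the node names, the interpretation of the threshold as a minimum edge weight, and the choice to rank within each category are assumptions, and the "known in the literature" flag is left out.

```python
def simple_query(edges, categories, node, threshold):
    """Return the neighbours of `node` with weight >= threshold, grouped by category.

    edges:      {(node_a, node_b): (weight, sign)} for an undirected network
    categories: {node: category name}
    Each neighbour is reported as (rank, name, weight, sign), strongest first.
    """
    neighbours = []
    for (u, v), (weight, sign) in edges.items():
        if node in (u, v) and weight >= threshold:
            other = v if u == node else u
            neighbours.append((other, weight, sign))
    neighbours.sort(key=lambda item: item[1], reverse=True)

    grouped = {}
    for name, weight, sign in neighbours:
        bucket = grouped.setdefault(categories[name], [])
        bucket.append((len(bucket) + 1, name, weight, sign))
    return grouped


# Made-up network with placeholder names.
edges = {
    ("disease_X", "drug_A"): (0.9, 1),
    ("nano_1", "disease_X"): (0.6, -1),
    ("disease_X", "drug_B"): (0.7, 1),
    ("disease_X", "chem_Q"): (0.3, 1),
    ("drug_A", "chem_Q"): (0.8, 1),
}
categories = {
    "disease_X": "disease", "drug_A": "drug", "drug_B": "drug",
    "nano_1": "nanomaterial", "chem_Q": "chemical",
}
print(simple_query(edges, categories, "disease_X", 0.5))
```

With a threshold of 0.5, the weak link to chem_Q is dropped and the remaining neighbours are listed per category in decreasing order of weight.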

The latter, called conditional query analysis, allows the user to query the network by applying different filters. The user can specify more than one item for each data type, the level of similarity required to report a connection between two selected items, the number of query items that must appear in each resulting clique, and the number of query items that must be connected to the other nodes in the sub-network. The tool gives two outputs. First, for each item, it creates a sub-network of all the elements connected to the input phenotypic entities with a connection stronger than the selected threshold. Second, it analyses the sub-networks and finds all the patterns (cliques) of three or four different types of phenotypic entities that are significantly interconnected with each other. The cliques are then clustered with respect to the nature of the connection between two items.
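The clique search can be pictured with a brute-force sketch. It assumes that "significantly interconnected" means every pair in the group has an edge weight at or above the threshold, and that each member must belong to a different category; the function name and the toy network are hypothetical, and the real tool may use a different algorithm.

```python
from itertools import combinations


def find_typed_cliques(edges, categories, threshold, size=3):
    """List groups of `size` nodes, all of different categories, that are fully
    connected by edges with weight >= threshold.

    edges:      {(node_a, node_b): weight} for an undirected network
    categories: {node: category name}
    """
    strong = {frozenset(pair) for pair, weight in edges.items() if weight >= threshold}
    nodes = sorted({node for pair in strong for node in pair})
    cliques = []
    for group in combinations(nodes, size):
        distinct_types = len({categories[node] for node in group}) == size
        fully_connected = all(frozenset(pair) in strong for pair in combinations(group, 2))
        if distinct_types and fully_connected:
            cliques.append(group)
    return cliques


# Made-up sub-network with placeholder names.
edges = {
    ("nano_1", "drug_A"): 0.9,
    ("nano_1", "disease_X"): 0.8,
    ("drug_A", "disease_X"): 0.85,
    ("drug_A", "chem_Q"): 0.4,
    ("nano_1", "chem_Q"): 0.95,
}
categories = {
    "nano_1": "nanomaterial", "drug_A": "drug",
    "disease_X": "disease", "chem_Q": "chemical",
}
print(find_typed_cliques(edges, categories, threshold=0.7))
```

Enumerating all combinations is only practical for small sub-networks, which is one reason to filter the network by threshold before searching for cliques.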

Network Browser Tutorial

The network browser tool allows the user to display and interact with the network. Because the network is very large, the tool displays only part of it at a time.

How do I use the network browser?

Default Interface

  1. Click on Browse in the navigation bar
  2. The tool, by default, displays the subnetwork of all the elements associated with WCCo with a threshold of 1%
  3. The network browser panel is available on the left side of the screen. It can be used to modify the network layout or to filter out edges and nodes of the network
  4. Network statistics are shown on the right side of the screen
  5. Nodes can be clicked for more details

INSIdEnano network browser

Querying the network

In order to visualize the information related to a specific node the user can follow these steps:

  1. Click on the search button in the network browser pane
  2. Insert the name of the node to investigate (TIO2T20 in the example)
  3. Specify the percentage of top interactions to visualize
  4. Press the "Update network" button to visualize the result

Performing a query within the network
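The "percentage of top interactions" setting can be read as keeping only the strongest share of a node's connections. The sketch below assumes that reading and rounds the number of kept edges up; the function name, the rounding rule and the toy weights are assumptions.

```python
import math


def top_percent_edges(neighbour_weights, percent):
    """Keep the strongest `percent` % of a node's connections.

    neighbour_weights: {neighbour name: edge weight}
    Returns (name, weight) pairs, strongest first.
    """
    ranked = sorted(neighbour_weights.items(), key=lambda item: item[1], reverse=True)
    keep = math.ceil(len(ranked) * percent / 100)
    return ranked[:keep]


# Made-up connections of a single node.
weights = {"drug_A": 0.9, "disease_X": 0.7, "chem_Q": 0.4, "nano_2": 0.2, "drug_B": 0.1}
print(top_percent_edges(weights, 40))
```

A small percentage, such as the default 1%, keeps the displayed subnetwork readable even for highly connected nodes.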

Displaying complete subnetworks

The user can also display the subnetwork of a specific class of nodes, for example nanomaterials. To do so, follow these steps:

  1. Open the search tab in the network browser panel
  2. Select the category of objects to display
  3. Click on the "Update network" button

Subnetwork of all nanomaterials

Simple query tutorial

How do I perform the simple query analysis?

Input

  1. Click on Simple Query in the navigation bar
  2. Choose the name of the element from the element list
  3. Insert the connection strength threshold
  4. Click on Search

Query Example: Parkinson Disease with threshold 50%

Simple query form

Results

Results will be displayed on three tabs:
  1. The first one reports all the entities connected to the given element, listed by category (nanomaterials, drugs, chemicals and diseases)
  2. The second one gives information about the connection distributions. In the following example, Parkinson Disease has 75% positive and 25% negative connections with nanomaterials.
  3. The third one gives information about the distribution of the connection weights.

Simple query results
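The connection distribution shown in the second tab amounts to counting positive and negative edges per category. A minimal sketch, assuming each neighbour is available as a (category, sign) pair; the input below is made up so that it yields a 75% / 25% split like the one described above.

```python
from collections import Counter


def connection_distribution(neighbours):
    """Percentage of positive and negative connections for each category.

    neighbours: iterable of (category, sign) pairs, with sign +1 or -1
    """
    neighbours = list(neighbours)
    totals = Counter(category for category, _ in neighbours)
    positives = Counter(category for category, sign in neighbours if sign > 0)
    return {
        category: {
            "positive": 100 * positives[category] / total,
            "negative": 100 * (total - positives[category]) / total,
        }
        for category, total in totals.items()
    }


# Made-up neighbours of a query element: four nanomaterials and two drugs.
neighbours = [
    ("nanomaterial", 1), ("nanomaterial", 1), ("nanomaterial", 1), ("nanomaterial", -1),
    ("drug", -1), ("drug", 1),
]
print(connection_distribution(neighbours))
```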

The user can investigate the result section by:

  1. Sorting the elements by their position in the ranking, or by whether the connection is already known in the literature. For example, sorting drugs by known connection shows that 'levodopa' is connected to Parkinson Disease with rank 9.
  2. Clicking on the name of an element to display more information. For drugs, the tool displays the ATC code and unique IDs and gives links to external resources.
  3. Following the external links. For example, the levodopa Wikipedia link shows that levodopa is used to increase dopamine concentrations in the treatment of Parkinson's disease.
  4. Reading the colours of the element names, which indicate whether the connection is positive (red) or negative (green).

Simple query results

Conditional query tutorial

How do I perform the conditional query analysis?

Input

  1. Click on Conditional Query in the navigation bar
  2. Insert the subset of elements to investigate, divided into the four categories: nanomaterials, drugs, diseases and chemicals.
    • N.B.: insert elements from at least two categories
    • Insert one, more than one, or all the elements of a specific category
    • To insert all the elements of a category, click on the "include all" button and switch it from OFF to ON
  3. Insert the connection strength threshold
  4. Insert the "minimum connected elements" threshold. This threshold controls which neighbours of the elements of interest are selected: it indicates how many of the elements of interest must be connected to a neighbour for that neighbour to be included in the analysis.
  5. Insert the "minimum elements in cliques" threshold. This threshold controls how many of the elements of interest must be in each final clique.
  6. Click on Run conditional query

Conditional query form
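The effect of the "minimum connected elements" threshold can be sketched as a filter over the neighbours of the query items. The example assumes that a connection counts when its weight is at or above the connection strength threshold; the function name and the toy network are hypothetical.

```python
def select_neighbours(edges, query_items, threshold, min_connected):
    """Return the neighbours linked to at least `min_connected` query items.

    edges:       {(node_a, node_b): weight} for an undirected network
    query_items: set of elements of interest entered in the form
    """
    linked_queries = {}
    for (u, v), weight in edges.items():
        if weight < threshold:
            continue
        # Check the edge in both directions, since the network is undirected.
        for item, other in ((u, v), (v, u)):
            if item in query_items and other not in query_items:
                linked_queries.setdefault(other, set()).add(item)
    return sorted(
        node for node, items in linked_queries.items() if len(items) >= min_connected
    )


# Made-up network: two query items and two candidate neighbours.
edges = {
    ("nano_1", "disease_X"): 0.8,
    ("drug_A", "disease_X"): 0.9,
    ("nano_1", "chem_Q"): 0.75,
    ("drug_A", "chem_Q"): 0.2,
}
query_items = {"nano_1", "drug_A"}
print(select_neighbours(edges, query_items, threshold=0.7, min_connected=2))
print(select_neighbours(edges, query_items, threshold=0.7, min_connected=1))
```

Raising the threshold from 1 to 2 removes chem_Q, which is strongly connected to only one of the two query items.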

Conditional query output

The output is displayed in three different tabs:

Clique List

The tab named "Results" contains the list of cliques. The panel is organized in the following manner:

  1. The left panel shows the types of cliques. Cliques are categorized based on the classes of the objects that they contain. The user can choose the group to visualize by clicking on its button
  2. The "Matching cliques" panel reports a table with the list of cliques.
  3. Cliques can be filtered according to whether at least one of the interactions they represent is already known in the literature
  4. Moreover, the list can be filtered based on the sign of the edge between two objects of different classes
  5. Elements can be searched in the table by typing their name in the search field
  6. Clicking on a row representing a clique opens another panel, named clique information, which describes the kind of connection between the nodes.
  7. In this tab each entity name is clickable in order to show its details
  8. Tables can be exported by using the data export panel

Conditional query results

Clique Investigation

The clique information panel has four buttons that can be used to further investigate the selected clique.

  1. The first one highlights the clique in the subnetwork obtained from the analysis
  2. The second one shows a table containing information about the common genes affected by all elements in the clique. A green arrow means that the gene is upregulated by the element; a red arrow means that it is downregulated by the element; two black arrows mean that the gene is affected, but it is not known whether it is up- or downregulated.
  3. The third one shows the genes affected by each element of the clique, divided into groups. The groupings come from the gene set collections of MSigDB. The Jaccard index has been evaluated between each gene set of the MSigDB database and the set of genes affected by each element. A threshold on this index can be used for visualization purposes: only gene sets with a Jaccard index higher than the specified value are displayed. Each set name can be expanded to inspect how the genes it contains behave with respect to the phenotypic entity.
  4. The fourth one performs a PubMed search using the elements in the clique. The default query is displayed; the user can either use it as is or change it before submitting it to PubMed.
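The common-genes table opened by the second button can be thought of as an intersection of the gene lists of the clique members, annotated with the direction of the effect. A minimal sketch, assuming each element's effects are available as a gene-to-direction mapping with the labels "up", "down" and "unknown" (matching the green, red and black arrows); names and data are placeholders.

```python
def common_genes(effects):
    """Genes affected by every element of a clique, with the direction per element.

    effects: {element: {gene: "up" | "down" | "unknown"}}
    Returns  {gene: {element: direction}} for the shared genes only.
    """
    if not effects:
        return {}
    shared = set.intersection(*(set(genes) for genes in effects.values()))
    return {
        gene: {element: effects[element][gene] for element in effects}
        for gene in sorted(shared)
    }


# Made-up effects for a clique of three elements.
effects = {
    "nano_1": {"G1": "up", "G2": "down", "G3": "unknown"},
    "drug_A": {"G1": "up", "G2": "up"},
    "disease_X": {"G1": "unknown", "G2": "down", "G4": "up"},
}
for gene, directions in common_genes(effects).items():
    print(gene, directions)
```

Only G1 and G2 appear in the result, because G3 and G4 are not affected by all three elements.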

1. Highlighting a clique in the network

2. Gene enrichment information

3. Pathway enrichment information

4. PubMed query interface

Subnetwork
This tab shows the subnetwork obtained from the conditional query analysis. The user can:
  1. control the network from the control panel. It is divided into six main parts:
    1. zoom in and out of the network
    2. network layout algorithms
    3. node size: the weighted size of a node is its degree
    4. edge thickness: the weighted thickness of an edge is the strength of the correlation between the connected nodes
    5. edge filter
    6. node filter
  2. visualize the network and interact with it. One can for example click and drag the nodes or display their names by hovering the mouse pointer
  3. display detailed information about a node by clicking it
  4. inspect the network statistics
Conditional query subnetwork tab

Conditional query subnetwork tab

Statistics
This tab allows the user to visualize various statistics about the resulting cliques. In particular, it counts how many times a pair of objects is connected in the cliques. The user specifies the two classes of objects to investigate and the tool returns a bubble plot in which the radius of each bubble indicates the number of times the two objects are connected.

Conditional query results statistics
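The counts behind the bubble plot can be sketched as a co-occurrence tally over the cliques. The example assumes that two objects are "connected" whenever they appear in the same clique; the function name, the category labels and the clique list are made up for illustration.

```python
from collections import Counter
from itertools import product


def count_pairs(cliques, categories, class_a, class_b):
    """Count how often each (class_a object, class_b object) pair shares a clique.

    cliques:    iterable of tuples of node names
    categories: {node: category name}
    """
    counts = Counter()
    for clique in cliques:
        members_a = [node for node in clique if categories[node] == class_a]
        members_b = [node for node in clique if categories[node] == class_b]
        counts.update(product(members_a, members_b))
    return counts


# Made-up cliques with placeholder names.
cliques = [
    ("nano_1", "drug_A", "disease_X"),
    ("nano_1", "drug_A", "chem_Q"),
    ("nano_2", "drug_A", "disease_X"),
]
categories = {
    "nano_1": "nanomaterial", "nano_2": "nanomaterial", "drug_A": "drug",
    "disease_X": "disease", "chem_Q": "chemical",
}
print(count_pairs(cliques, categories, "nanomaterial", "drug"))
```

Each count would then be mapped to the radius of one bubble, so that pairs recurring in many cliques stand out.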

Clustering analysis tutorial

The Cluster Analysis panel makes it possible to investigate how nanomaterials, drugs, diseases and chemicals are grouped with each other and which genes are most important to each of them.

How do I perform the clustering analysis?

  1. Click on Cluster Analysis in the navigation bar
  2. Choose the kind of items to investigate, for example nanomaterials. The tool displays the list of all nanomaterials, grouped based on their gene expression values
  3. The user can click on a nanomaterial name to obtain more information in the pane on the right
  4. The information panel contains information about the nanomaterial's gene expression experiments. Moreover, it displays the list of up- and downregulated genes resulting from the differential expression analysis
  5. Clicking on the name of a gene displays its GeneCards webpage.

Clustering analysis

Support

The development of INSIdEnano was supported by the European Commission (FP7-NANOSOLUTIONS, Grant Agreement No. 309329).