DISRUPT

Version 1.01

introduction

DISRUPT is a software tool that predicts disrupted interactions in gene regulatory or protein-protein interaction networks. It requires two inputs: a list of interactions describing a normal network, and expression data for both the normal network and the network with disruptions (e.g. matched normal and cancer expression data). Based on these inputs, DISRUPT ranks the interactions of the normal network according to their likelihood of being disrupted in the disease-related network.

installation

DISRUPT is implemented in Python and requires Python 2.7. The numpy and scipy libraries are also required and can be downloaded from here.

DISRUPT itself can be downloaded from here. Unpack the zip-file to a folder of your choice and run the command example.bat to ensure that everything is working properly. Compare the program output to the example output provided here.

usage

DISRUPT is invoked from the command line using the following parameters:

disrupt.bat network_normal expression_normal expression_disrupted ex_type predictions
network_normal is a file that contains a list of interactions within the normal network. Interactions are given as tab-separated pairs of gene names. See the following example:
YJL206C YNL004W
YJL206C YHL008C
YJL206C YDR166C
YJL206C YDR167W
YJL206C YGL262W
YJL206C YBR235W
YJL206C YNR071C
YJL206C YMR259C
YJL206C YNR072W
Note that DISRUPT also allows you to list "non-interactions" in the network_normal file. They are not marked in any special way, and it is the responsibility of the user to keep track of which gene pairs are interactions and which are non-interactions. This makes it possible to infer both losses and gains of interactions, since the predictions file (see below) will contain all gene pairs listed in network_normal.
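Because network_normal carries no marker for non-interactions, the bookkeeping has to happen outside DISRUPT. The following is a minimal sketch of one way to do it, not part of DISRUPT itself. The function names and the non-interaction pair used in the usage note are hypothetical and serve only as illustration.

```python
def build_network_lines(interactions, non_interactions):
    """Return the lines for a network_normal file and a lookup of pair kinds.

    Both arguments are lists of (gene_a, gene_b) tuples. The lookup is kept
    by the user; DISRUPT itself only sees the tab-separated pairs.
    """
    kind = {}
    lines = []
    for gene_a, gene_b in interactions:
        kind[(gene_a, gene_b)] = "interaction"
        lines.append(gene_a + "\t" + gene_b)
    for gene_a, gene_b in non_interactions:
        kind[(gene_a, gene_b)] = "non-interaction"
        lines.append(gene_a + "\t" + gene_b)
    return lines, kind


def interpret(pair, kind):
    """Say what a high score would mean for the given pair."""
    if kind[pair] == "interaction":
        return "possible loss"
    return "possible gain"
```

For example, build_network_lines([("YJL206C", "YNL004W")], [("YNL004W", "YHL008C")]) returns two tab-separated lines plus a lookup that can later be used to read a high score for the second (hypothetical) pair as a possible gain.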

expression_normal is a file that contains expression data for the network in its normal state. This can be knock-out, knock-down or multi-factorial expression data, but the values should be normalized to the interval [0,1]. The software also works with other normalization intervals, but in that case the ranking scores are more difficult to interpret for multi-factorial data. Here is an example of an expression file. All values are tab-separated, the first row contains the header, and the gene names are listed in the first column:
GENE  sample_0  sample_1  sample_2  sample_3  sample_4  sample_5  sample_6  sample_7  sample_8  sample_9
YNL004W 0.0000 0.7181569 0.0829788 0.0939352 0.0921688 0.1362978 0.0848851 0.1432751 0.0812507 0.0671806
YJL206C 0.6693 0.0000000 0.7539569 0.6497819 0.7073736 0.7908985 0.7002137 0.6431129 0.7935542 0.6487457
YHL008C 0.0033 0.6617605 0.0000000 0.0040327 0.0186448 0.0195510 0.0180854 0.0074746 0.0023201 0.0053902
YDR166C 0.2626 0.6669589 0.4270926 0.0000000 0.3185876 0.2416608 0.3402738 0.2608482 0.2611949 0.2991349
YDR167W 0.5505 0.0035929 0.4914464 0.5676906 0.0000000 0.4568083 0.4026544 0.5978579 0.5653026 0.5356704
YGL262W 0.7360 0.0638349 0.7396615 0.7035809 0.7824290 0.0000000 0.7011108 0.7824157 0.5624576 0.7891701
YBR235W 0.7189 0.0427199 0.6966874 0.5924989 0.8905074 0.7029194 0.0000000 0.8780427 0.6040577 0.6762094
YNR071C 0.0713 0.6311608 0.0653176 0.1145613 0.0952396 0.0871489 0.1102488 0.0000000 0.0793149 0.1488523
YMR259C 0.7328 0.0090925 0.7027191 0.5981570 0.7818790 0.6853868 0.7193278 0.6575753 0.0000000 0.7506075
YNR072W 0.3369 0.0366842 0.3642304 0.2860091 0.3850221 0.3693415 0.3950807 0.3554340 0.4025031 0.0000000
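The [0,1] normalization recommended for the expression data can be sketched as follows. This is an assumption about one reasonable approach: it rescales all values with the global minimum and maximum of the matrix. The documentation does not state which normalization scheme the authors used, and the function name is hypothetical.

```python
def min_max_normalize(rows):
    """Scale all values in a list of rows to [0, 1].

    Assumption: a single global minimum and maximum is used for the
    whole matrix, not one per gene or per sample.
    """
    values = [value for row in rows for value in row]
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant matrix carries no information; map it to zeros.
        return [[0.0 for _ in row] for row in rows]
    span = float(highest - lowest)
    return [[(value - lowest) / span for value in row] for row in rows]
```

For example, min_max_normalize([[2, 4], [6, 10]]) maps the smallest value to 0.0 and the largest to 1.0.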
expression_disrupted is a file that contains expression data for the network in the disrupted state. expression_normal and expression_disrupted need to be of the same experimental type (e.g. knock-out or multi-factorial), and both files need to contain expression data for all genes in network_normal. The number of samples in expression_normal and expression_disrupted, however, can differ.
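The requirement that both expression files cover every gene in network_normal can be checked before running DISRUPT. The following is a minimal sketch with hypothetical helper names. It assumes the file layout described above (a header row, gene names in the first column) and is not part of DISRUPT.

```python
def genes_in_expression(lines):
    """Return the gene names from the first column, skipping the header row."""
    genes = []
    for line in lines[1:]:
        fields = line.split()
        if fields:
            genes.append(fields[0])
    return genes


def missing_genes(pairs, expression_genes):
    """Return network genes that have no row in the expression data."""
    needed = set()
    for gene_a, gene_b in pairs:
        needed.update((gene_a, gene_b))
    return sorted(needed - set(expression_genes))
```

An empty result from missing_genes for both expression files means the gene coverage requirement is met.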

ex_type is a flag that specifies the experimental type of the expression data described above. It is either knockout for knock-out data or multifactorial in all other cases (including knock-down data).

predictions is the output file that the scored and ranked interactions are written to. Here is an example of the contents of a predictions file after running DISRUPT:
YJL206C YNL004W 0.368
YJL206C YGL262W 0.084
YJL206C YDR166C 0.046
YJL206C YMR259C 0.043
YJL206C YNR071C 0.023
YJL206C YDR167W 0.013
YJL206C YHL008C 0.008
YJL206C YBR235W 0.007
YJL206C YNR072W 0.002

The predictions file contains all gene pairs provided in network_normal, now ranked according to how strongly they differ from the normal network (e.g. are disrupted). The number after each gene pair is that difference score. For expression data normalized to [0,1], the score will range between 0 and 1 as well. The higher the score, the higher the likelihood that the corresponding interaction is lost (or gained, for non-interactions).

Note that the reported score is not a probability: a score of 0.368 cannot be interpreted as a 36.8% chance that an interaction is disrupted. In the given example, the highest-ranking interaction, with a score of 0.368, actually has been disrupted, and all lower-ranking interactions, with much lower scores, have not been disrupted. Generally, the differences between scores are more indicative of disruptions than the scores themselves. Here all interactions apart from the first one have very low scores.
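Since the differences between scores matter more than the scores themselves, one simple way to inspect a ranking is to look for the largest drop between neighbouring scores. The following sketch is a heuristic offered for illustration only. It is not part of DISRUPT, the function name is hypothetical, and a single cut at the largest gap is an assumption that will not suit every data set.

```python
def split_at_largest_gap(scored_pairs):
    """Split (gene_a, gene_b, score) tuples at the largest score drop.

    Returns two lists: the pairs ranked above the largest gap and the
    pairs ranked below it.
    """
    ranked = sorted(scored_pairs, key=lambda item: item[2], reverse=True)
    if len(ranked) < 2:
        return ranked, []
    gaps = [ranked[i][2] - ranked[i + 1][2] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ranked[:cut], ranked[cut:]
```

Applied to the example predictions above, the largest drop lies between 0.368 and 0.084, so only the pair YJL206C YNL004W ends up in the first list.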

DISRUPT also reports the node-to-edge ratio of the normal network. Generally, predictions for networks with node-to-edge ratios close to 1.0 are trustworthy, while predictions for node-to-edge ratios close to zero are not.
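The node-to-edge ratio can be estimated before running DISRUPT. The sketch below assumes the ratio is the number of distinct genes divided by the number of listed gene pairs. This matches the example network (10 genes, 9 pairs, reported as Node/Edge = 1.111), but how DISRUPT computes the value internally is not documented here, and the function name is hypothetical.

```python
def node_edge_ratio(pairs):
    """Return (number of distinct genes) / (number of gene pairs).

    Assumption: every listed pair counts as one edge, including any
    non-interactions present in the list.
    """
    nodes = set()
    for gene_a, gene_b in pairs:
        nodes.update((gene_a, gene_b))
    return len(nodes) / float(len(pairs))
```

A densely connected network with many more pairs than genes yields a ratio well below 1.0, which is the case the paragraph above warns about.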

example

Here is an example of the output DISRUPT produces when running example.bat. The computation should finish within seconds, and the data folder should then contain a new predictions file: predictions.tsv

DISRUPT
Detection of disruptions in gene regulatory networks.
Version 1.0

parameters:
data/network_normal.tsv
data/expression_normal.tsv
data/expression_disrupted.tsv
knockout
data/predictions.tsv

running ...
Node/Edge = 1.111

YJL206C YNL004W 0.368
YJL206C YGL262W 0.084
YJL206C YDR166C 0.046
YJL206C YMR259C 0.043
YJL206C YNR071C 0.023
YJL206C YDR167W 0.013
YJL206C YHL008C 0.008
YJL206C YBR235W 0.007
YJL206C YNR072W 0.002
finished.

To run your own experiment, call disrupt.bat or python disrupt.py with the parameters described above.
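When many experiments need to be run, the call to python disrupt.py can be scripted. The following is a minimal sketch that assumes disrupt.py is in the current working directory and that the python command starts a suitable interpreter. The function names are hypothetical. The parameter order and the two ex_type values are taken from the usage section.

```python
import subprocess

EX_TYPES = ("knockout", "multifactorial")


def build_command(network, expr_normal, expr_disrupted, ex_type, predictions):
    """Assemble the argument list for one DISRUPT run."""
    if ex_type not in EX_TYPES:
        raise ValueError("ex_type must be 'knockout' or 'multifactorial'")
    return ["python", "disrupt.py", network, expr_normal,
            expr_disrupted, ex_type, predictions]


def run_disrupt(network, expr_normal, expr_disrupted, ex_type, predictions):
    """Run DISRUPT and return the exit code of the process."""
    command = build_command(network, expr_normal, expr_disrupted,
                            ex_type, predictions)
    return subprocess.call(command)
```

Checking ex_type before the process is started catches typos in the flag early, without relying on how DISRUPT itself reacts to an unknown value.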

known problems

None so far.

history


version  date      description
1.01     13.11.12  Minor changes to doc string comments
1.00     12.11.12  First public version

contact

name              email
Stefan Maetschke  s.maetschke@uq.edu.au
Mark Ragan        m.ragan@uq.edu.au