Open
63 commits
282b3ba
Update README.md
evanbiederstedt Aug 12, 2018
a4106f6
revised requirements.txt, README
evanbiederstedt Aug 12, 2018
8effca7
added travis config
evanbiederstedt Aug 12, 2018
4ff1858
updated to python3.x in "examples", "experiments", "viz"
evanbiederstedt Aug 14, 2018
f4c339c
updated certain scripts to python3.x
evanbiederstedt Aug 14, 2018
98086fc
updated certain scripts to python3.x
evanbiederstedt Aug 14, 2018
e3a0a71
ported scripts from "wext" to python3.x
evanbiederstedt Aug 15, 2018
f613b30
update travis config, compile 'wext' C/Fortran code
evanbiederstedt Aug 15, 2018
e45829c
revise travis config
evanbiederstedt Aug 15, 2018
4612a70
revise using python3.x syntax for explicit relative imports
evanbiederstedt Aug 15, 2018
844ec70
fixed relative import syntax for all scripts in "wext"
evanbiederstedt Aug 15, 2018
300f1d8
attempt to fix issue with wext_exact_test
evanbiederstedt Aug 15, 2018
4a87caa
2nd attempt to fix issue with wext_exact_test
evanbiederstedt Aug 15, 2018
36a22bc
3rd attempt to fix issue with wext_exact_test
evanbiederstedt Aug 15, 2018
7c6427e
removed python3 shebangs
evanbiederstedt Aug 21, 2018
02e5586
corrected structures for PyMethodDef
evanbiederstedt Sep 9, 2018
934904a
check compiles under Python3.x
evanbiederstedt Sep 9, 2018
602e301
revise poibinmodule header file, defin py_pmf as static
evanbiederstedt Sep 9, 2018
49750b7
try renaming to comet_exact_tests
evanbiederstedt Sep 9, 2018
6b2cdda
revised setup.py, module should be cpoibin
evanbiederstedt Sep 9, 2018
053e680
corrected typo with "from wext_exact_test import triple_exact_test"
evanbiederstedt Sep 9, 2018
29de50b
cannot find module, but it does install...
evanbiederstedt Sep 10, 2018
61fc045
revised __init__.py to correclty import modules
evanbiederstedt Sep 10, 2018
da5a5a9
run nosetests in different subdirectory
evanbiederstedt Sep 10, 2018
2beb71f
try nosetests in upper subdirectory
evanbiederstedt Sep 10, 2018
8f646d4
revised __init__.py
evanbiederstedt Sep 10, 2018
c04fdc8
changed __init__.py again, try explicit imports
evanbiederstedt Sep 10, 2018
8254756
now change exact.py, from ..c import wext_exact_test
evanbiederstedt Sep 10, 2018
7fa5cbb
revised relative imports
evanbiederstedt Sep 10, 2018
c43aa88
try "from .wext_exact_test import * "
evanbiederstedt Sep 10, 2018
fcea393
revise import
evanbiederstedt Sep 10, 2018
3a3e8d1
try from .src.c.wext_exact_test import *
evanbiederstedt Sep 10, 2018
a08fa52
added __init__.py files
evanbiederstedt Sep 10, 2018
38acd11
these should be global modules
evanbiederstedt Sep 10, 2018
885059b
install instead
evanbiederstedt Sep 10, 2018
2717e4a
revised module name
evanbiederstedt Sep 10, 2018
f0498a1
check if fortran extension module can be imported
evanbiederstedt Sep 10, 2018
096a412
revised travis config
evanbiederstedt Sep 10, 2018
3d4972c
check the FORTRAN code installs via setup.py
evanbiederstedt Sep 10, 2018
81ebf43
missing )
evanbiederstedt Sep 10, 2018
a8ff8d2
revise setup.py
evanbiederstedt Sep 10, 2018
9e5d982
revise c extensions
evanbiederstedt Sep 11, 2018
d9a03b7
allow 2.7 builds with python
evanbiederstedt Sep 11, 2018
48873ff
revise how module named
evanbiederstedt Sep 11, 2018
3d74ad2
removed comments
Sep 11, 2018
79f697b
revised string handling, outside scripts
evanbiederstedt Sep 11, 2018
e799a4f
revised string handling, outside scripts
evanbiederstedt Sep 11, 2018
ec674fc
first commit
evanbiederstedt Sep 11, 2018
bb0d151
revised external scripts
evanbiederstedt Sep 11, 2018
4158cbd
fixed performance issue, added future dependency
evanbiederstedt Sep 11, 2018
a60809a
fixed performance issue, added future dependency
evanbiederstedt Sep 11, 2018
e3a4355
revised experiments/eccb2016/scripts
evanbiederstedt Sep 12, 2018
a508dac
revised experiments/eccb2016/scripts, permutation_helper
evanbiederstedt Sep 12, 2018
82ff47c
revised travis config
evanbiederstedt Sep 12, 2018
3ddc71f
revised README
evanbiederstedt Sep 30, 2018
f5ef9a2
Merge pull request #1 from evanbiederstedt/speed_issue
evanbiederstedt Sep 30, 2018
31a8a28
revised source
evanbiederstedt Sep 30, 2018
3abf0f6
remove debugging files
evanbiederstedt Sep 30, 2018
990fdb2
source code revisions for py23 compatibility
evanbiederstedt Sep 30, 2018
aad1b60
Merge branch 'speed_issue' of https://github.com/evanbiederstedt/wext…
evanbiederstedt Sep 30, 2018
866d32e
revised source for py23 compatibility
evanbiederstedt Sep 30, 2018
3ee93a6
revise source, use generator instead of converting to list()
evanbiederstedt Sep 30, 2018
411303d
Merge pull request #2 from evanbiederstedt/speed_issue
evanbiederstedt Sep 30, 2018
21 changes: 21 additions & 0 deletions .travis.yml
@@ -0,0 +1,21 @@
language: python
python:
- 2.7
- 3.4
- 3.5
- 3.6
install:
- sudo apt-get -y update
- sudo apt-get -y install r-base
- sudo apt-get -y install python-matplotlib
- pip install codecov
- pip install -r requirements.txt
- cd wext
- python setup.py install
- cd ../
- pwd
- ls
script:
- nosetests
after_success:
- codecov
16 changes: 4 additions & 12 deletions README.md
@@ -1,18 +1,11 @@
# Weighted Exclusivity Test (WExT) #

The Weighted Exclusivity Test (WExT) was developed by the [Raphael research group](http://compbio.cs.brown.edu/) at Brown University.

### Requirements ###

Latest tested version in parentheses.
[![Build Status](https://api.travis-ci.org/raphael-group/wext.svg?branch=master)](https://travis-ci.org/raphael-group/wext?branch=master)

1. Python (2.7.9)
Contributor

future is also currently needed.

Author

See comments above on the README; I think I was thinking the requirements.txt addressed all Python dependencies.

Something like pip install -r requirements.txt should address concerns of all users, I think. (Could be wrong, happy to change.)

Contributor

@matthewreyna, Jan 8, 2019

Yes, but not everyone will think to check `requirements.txt` or run `pip install -r requirements.txt`. Others may think, justifiably, that running `python setup.py install` successfully means that everything is ready to go. For a real-life example, it crashed for me because I tried it in a new virtual machine with the usual dependencies but hadn't needed `future` yet.

If we add it to the README, then we'll save a few emails, GitHub issues, and StackOverflow searches, which is worth it for everyone.

Author

> If we add it to the README, then we'll save a few emails, GitHub issues, and StackOverflow searches, which is worth it for everyone.

Absolutely, and I agree with this as a philosophy whole-heartedly. If there are any GitHub issues, it's 99% the fault of the developer, even if the problem is one of presentation.

> Yes, but not everyone will think to check requirements.txt or run pip install -r requirements.txt.

So, the motivation above wasn't to remove this information. Rather, I'm trying to cater to the lazy user (myself included), which reduces GitHub issues, e-mails, SO questions, etc.

When I look at new tools in bioinformatics, I want to see a succinct description of the algorithm/code within seconds in the README. This is needed for WExT, at the top of the README. In the current README, requirements are given first. The requirements should be made more comprehensive (e.g. list the Python libraries detailed in requirements.txt), and I think they shouldn't be the first thing seen in a README. We need to aim for both succinct and comprehensive :)

The rationale behind this is presentation for new users (as an invitation to use the code), as well as preventing unnecessary GitHub issues/e-mails.

Contributor

Good idea. Let's move requirements lower in the README if they are too long.
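
One way to act on the concern raised in this thread, that a successful `python setup.py install` does not guarantee `future` is present, is a fail-fast check before the heavy imports. This is a minimal sketch and not code from the PR: `missing_requirements` and `REQUIRED` are hypothetical names, the package list is assumed from this discussion rather than read from `requirements.txt`, and `importlib.util.find_spec` assumes Python 3.4 or later.

```python
import importlib.util

# Hypothetical list of import names, assumed from the review discussion
# (NumPy, SciPy, future); not read from the repository's requirements.txt.
REQUIRED = ("numpy", "scipy", "future")


def missing_requirements(names=REQUIRED):
    """Return the import names in `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def requirements_message(missing):
    """Build a short, actionable message for the user (empty if nothing is missing)."""
    if not missing:
        return ""
    return ("Missing Python packages: {}. "
            "Try: pip install -r requirements.txt".format(", ".join(missing)))


# Example with standard-library modules, which are always importable.
print(missing_requirements(("json", "os")))
```

A script could call `requirements_message(missing_requirements())` at startup and exit with that message when it is non-empty, which gives the "lazy user" a hint even if the README was skipped.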


a. NumPy (1.11.0)

b. SciPy (0.17.0)
The Weighted Exclusivity Test (WExT) was developed by the [Raphael research group](http://compbio.cs.brown.edu/) at Brown University.

2. gcc (4.9.2)
### Requirements ###

We recommend using [`virtualenv`](https://virtualenv.pypa.io/en/latest/) to install the Python requirements. After installing `virtualenv`, you can install the Python requirements for the weighted exclusivity test as follows:

@@ -27,8 +20,7 @@ See the wiki for additional instructions on [Setup and installation](https://git
The C and Fortran extensions must be compiled before running the weighted exclusivity test:

cd wext
python setup.py build
f2py -c src/fortran/bipartite_edge_swap_module.f95 -m bipartite_edge_swap_module
python setup.py install

### Usage ###

42 changes: 23 additions & 19 deletions compute_mutation_probabilities.py
@@ -4,10 +4,12 @@
import sys, os, argparse, json, numpy as np, multiprocessing as mp, random
from collections import defaultdict


# Load the weighted exclusivity test
this_dir = os.path.dirname(os.path.realpath(__file__))
sys.path.append(this_dir)
from wext import *
from past.builtins import xrange
Contributor

I don't think we need xrange.

Author

I think my sole motivation here was consistency :)

Above I detail the memory/performance issue associated with range() vs xrange() between Python 2.x and 3.x. So, I may have simply started using this everywhere to avoid any potential issues. (A downside to not taking the time to unit test this is that I didn't think deeply about each use of range(), unless it became obvious it was a problem.)

Contributor

That's fine, and the use of xrange in

seeds = random.sample(xrange(1, 2*10**9), args.num_permutations)

is the only really important one.

Author

Yes, I agree with this.

Contributor

We can keep xrange here and other places -- not a big issue.
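
The memory concern discussed in this thread can be seen in a small sketch. On Python 2, `range(1, 2*10**9)` builds a list of roughly two billion integers before `random.sample` ever sees it, whereas `xrange` on Python 2 and `range` on Python 3 are lazy sequences. The fallback below is a standard-library alternative to `from past.builtins import xrange`; it is an illustration rather than the code used in the PR, and `draw_seeds` is a hypothetical helper name.

```python
import random

try:
    xrange  # Python 2: lazy built-in
except NameError:
    xrange = range  # Python 3: range is already lazy


def draw_seeds(num_permutations, seed=None):
    """Draw distinct seeds from [1, 2*10**9) without materialising the range."""
    rng = random.Random(seed)
    return rng.sample(xrange(1, 2 * 10**9), num_permutations)


seeds = draw_seeds(5, seed=0)
print(len(seeds))
```

Small loops such as `for j in range(n)` are unaffected either way, which matches the conclusion above that the `random.sample` call is the only use where the choice really matters.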


# Argument parser
def get_parser():
@@ -20,12 +22,13 @@ def get_parser():
parser.add_argument('-q', '--swap_multiplier', type=int, required=False, default=100)
parser.add_argument('-nc', '--num_cores', type=int, required=False, default=1)
parser.add_argument('-s', '--seed', type=int, required=False, default=None)
parser.add_argument('-v', '--verbose', type=int, required=False, default=1, choices=range(5))
parser.add_argument('-v', '--verbose', type=int, required=False, default=1, choices=list(range(5)))
return parser

def permute_matrices_wrapper(args): return permute_matrices(*args)
def permute_matrices(edge_list, max_swaps, max_tries, seeds, verbose,
m, n, num_edges, indexToGene, indexToPatient):
def permute_matrices_wrapper(args):
return permute_matrices(*args)

def permute_matrices(edge_list, max_swaps, max_tries, seeds, verbose, m, n, num_edges, indexToGene, indexToPatient):
# Initialize our output
observed = np.zeros((m, n))
permutations = []
@@ -43,8 +46,8 @@ def permute_matrices(edge_list, max_swaps, max_tries, seeds, verbose,
indices.append( (edge[0]-1, edge[1]-1) )

# Record the permutation
observed[zip(*indices)] += 1.
geneToCases = dict( (g, list(cases)) for g, cases in geneToCases.iteritems() )
observed[tuple(zip(*indices))] += 1.
geneToCases = dict( (g, list(cases)) for g, cases in geneToCases.items())
permutations.append( dict(geneToCases=geneToCases, permutation_number=seed) )

return observed/float(len(seeds)), permutations
@@ -76,28 +79,28 @@ def run( args ):

# Load mutation data
if args.verbose > 0:
print '* Loading mutation data...'
print('* Loading mutation data...')

mutation_data = load_mutation_data( args.mutation_file )
genes, all_genes, patients, geneToCases, patientToMutations, params, hypermutators = mutation_data

geneToObserved = dict( (g, len(cases)) for g, cases in geneToCases.iteritems() )
patientToObserved = dict( (p, len(muts)) for p, muts in patientToMutations.iteritems() )
geneToObserved = dict( (g, len(cases)) for g, cases in geneToCases.items())
patientToObserved = dict( (p, len(muts)) for p, muts in patientToMutations.items())
geneToIndex = dict( (g, i+1) for i, g in enumerate(all_genes) )
indexToGene = dict( (i+1, g) for i, g in enumerate(all_genes) )
patientToIndex = dict( (p, j+1) for j, p in enumerate(patients) )
indexToPatient = dict( (j+1, p) for j, p in enumerate(patients) )

edges = set()
for gene, cases in geneToCases.iteritems():
for gene, cases in geneToCases.items():
for patient in cases:
edges.add( (geneToIndex[gene], patientToIndex[patient]) )

edge_list = np.array(sorted(edges), dtype=np.int)

# Run the bipartite edge swaps
if args.verbose > 0:
print '* Permuting matrices...'
print('* Permuting matrices...')

m = len(all_genes)
n = len(patients)
@@ -127,7 +130,7 @@ def run( args ):
# Create the weights file
if args.weights_file:
if args.verbose > 0:
print '* Saving weights file...'
print('* Saving weights file...')

# Allow for small accumulated numerical errors
tol = 1e3*max(m, n)*args.num_permutations*np.finfo(np.float64).eps
@@ -137,10 +140,10 @@ def run( args ):
P = np.add.reduce(observeds) / float(len(observeds))

# Verify the weights
for g, obs in geneToObserved.iteritems():
for g, obs in geneToObserved.items():
assert( np.abs(P[geneToIndex[g]-1].sum() - obs) < tol)

for p, obs in patientToObserved.iteritems():
for p, obs in patientToObserved.items():
assert( np.abs(P[:, patientToIndex[p]-1].sum() - obs) < tol)

# Construct mutation matrix to compute marginals
@@ -154,12 +157,12 @@ def run( args ):
P = postprocess_weight_matrix(P, r, s)

# Verify the weights again
for g, obs in geneToObserved.iteritems():
for g, obs in geneToObserved.items():
assert( np.abs(P[geneToIndex[g]-1].sum() - obs) < tol)

for p, obs in patientToObserved.iteritems():
for p, obs in patientToObserved.items():
assert( np.abs(P[:, patientToIndex[p]-1].sum() - obs) < tol)

# Add pseudocounts to entries with no mutations observed; unlikely or impossible after post-processing step
P[P == 0] = 1./(2. * args.num_permutations)

@@ -171,7 +174,7 @@ def run( args ):
if args.permutation_directory:
output_prefix = args.permutation_directory + '/permuted-mutations-{}.json'
if args.verbose > 0:
print '* Saving permuted mutation data...'
print('* Saving permuted mutation data...')

for _, permutation_list in results:
for permutation in permutation_list:
@@ -180,4 +183,5 @@ def run( args ):
permutation['params'] = params
json.dump( permutation, OUT )

if __name__ == '__main__': run( get_parser().parse_args(sys.argv[1:]) )
if __name__ == '__main__':
run( get_parser().parse_args(sys.argv[1:]) )
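
Two of the Python 3 changes in the diff above are easy to misread once the `+`/`-` markers are lost. `zip` returns an iterator on Python 3, so it is wrapped in `tuple(...)` to hand NumPy one index sequence per axis, and `dict.iteritems()` no longer exists, so `.items()` is used instead. The sketch below uses made-up data; the array shape, the index pairs, and the names `rows_cols`, `gene_to_cases`, and `gene_to_observed` are illustrative and do not come from the repository.

```python
import numpy as np

# (row, column) pairs, analogous to the zero-based `indices` list in the diff.
indices = [(0, 1), (2, 0), (1, 1)]
observed = np.zeros((3, 2))

# One tuple of row indices and one of column indices: ((0, 2, 1), (1, 0, 1)).
rows_cols = tuple(zip(*indices))
observed[rows_cols] += 1.

# .items() works on both Python 2 and Python 3; .iteritems() is Python 2 only.
gene_to_cases = {'A': {'p1', 'p2'}, 'B': {'p3'}}
gene_to_observed = dict((g, len(cases)) for g, cases in gene_to_cases.items())

print(observed.sum(), gene_to_observed['A'])
```

On Python 2, `.items()` builds a list rather than an iterator, which costs some memory for large dictionaries but does not change the result.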
3 changes: 2 additions & 1 deletion examples/generate_data.py
@@ -81,4 +81,5 @@ def run(args):
raise NotImplementedError('Data generation mode "%s" is not implemented.' % args.mode)
return

if __name__ == '__main__': run( get_parser().parse_args(sys.argv[1:]) )
if __name__ == '__main__':
run( get_parser().parse_args(sys.argv[1:]) )
4 changes: 4 additions & 0 deletions experiments/eccb2016/scripts/helper.py
@@ -1,5 +1,7 @@
#!/usr/bin/env python

import numpy as np
from past.builtins import xrange
Contributor

Not needed.

Author

You may be entirely correct, and I'm happy to accept this.

See previous comments on range() vs. xrange() and consistency.

Contributor

Not a big issue. We can leave as-is.


# Add a y=x line to the given matplotlib axis
def add_y_equals_x(ax, c='k', line_style='--', alpha=0.75):
@@ -15,6 +17,7 @@ def add_y_equals_x(ax, c='k', line_style='--', alpha=0.75):
ax.set_xlim(lims)
ax.set_ylim(lims)


def aligned_plaintext_table(table, sep='\t', spaces=2):
"""
Create and return an aligned plaintext table.
@@ -41,6 +44,7 @@ def aligned_plaintext_table(table, sep='\t', spaces=2):
# Return results.
return '\n'.join([''.join([entries[i][j].rjust(sizes[j]+spaces) for j in range(n)]).rstrip() for i in range(m)])


def rank(a, reverse=False, ties=2):
"""
Find the ranks of the elements of a.
12 changes: 6 additions & 6 deletions experiments/eccb2016/scripts/pairs_summary.py
@@ -86,7 +86,7 @@
"Cancer": cancer})
df = pd.DataFrame(items)

print 'Testing {} pairs...'.format(len(weighted_exact_pvals))
print('Testing {} pairs...'.format(len(weighted_exact_pvals)))

# Set up the figure
fig, ((ax1, ax2, ax3, ax4)) = plt.subplots(1, 4)
@@ -138,15 +138,15 @@
# Output the correlation between
all_correlation = spearmanr(weighted_exact_pvals, weighted_saddlepoint_pvals)
tail_correlation = spearmanr(weighted_exact_tail_pvals, weighted_saddlepoint_tail_pvals)
print '-' * 14, 'Correlation: WRE (Saddlepoint) and WRE (Recursive)', '-' * 14
print 'All: \\rho={:.5}, P={:.5}'.format(*all_correlation)
print '\Phi_WR < 10^-4: \\rho={:.5}, P={:.5}'.format(*tail_correlation)
print('-' * 14, 'Correlation: WRE (Saddlepoint) and WRE (Recursive)', '-' * 14)
Contributor

In Python 2, print(a, b, c) prints (a, b, c). Replace commas with plus signs?

Author

> Replace commas with plus signs?

Good catch. Yes, let's use plus signs.
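
The fix agreed on here can be sketched as follows. Without `from __future__ import print_function`, Python 2 parses `print(a, b, c)` as the print statement applied to a tuple, so building a single string first gives the same output on both interpreters. The header text is taken from the diff; the names `rule`, `title`, `header`, and `summary`, and the runtime value, are made up for illustration.

```python
rule = '-' * 14
title = 'Correlation: WRE (Saddlepoint) and WRE (Recursive)'

# A single string argument prints identically on Python 2 and Python 3.
header = rule + ' ' + title + ' ' + rule
print(header)

# Plus signs only join strings, so non-string values need str() or format().
method, total_runtime = 'WRE (Exact)', 12.5  # made-up values for illustration
summary = '{} {}'.format(method, total_runtime)
print(summary)
```

Adding `from __future__ import print_function` at the top of each script is the other common option, and it keeps the comma-separated calls unchanged.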

print('All: \\rho={:.5}, P={:.5}'.format(*all_correlation))
print('\Phi_WR < 10^-4: \\rho={:.5}, P={:.5}'.format(*tail_correlation))

# Output a table summarizing the runtimes (Table 3)
print '-' * 35, 'Runtimes', '-' * 35
print('-' * 35, 'Runtimes', '-' * 35)
tbl = ['#Method\tMinimum\tMedian\tMaximum\tTotal']
for method in ["WRE (Exact)", "WRE (Saddlepoint)"]:
print method, sum(list(df.loc[df['Method'] == method]['Runtime (seconds)']))
print(method, sum(list(df.loc[df['Method'] == method]['Runtime (seconds)'])))

# Output to file
plt.tight_layout()
14 changes: 9 additions & 5 deletions experiments/eccb2016/scripts/permutation_test_helper.py
@@ -16,7 +16,7 @@
parser.add_argument('-o', '--output_prefix', type=str, required=True)
parser.add_argument('-w', '--wext_directory', type=str, required=True)
parser.add_argument('-j', '--job_id', type=int, required=job_id is None, default=job_id)
parser.add_argument('-v', '--verbose', type=int, required=False, default=0, choices=range(5))
parser.add_argument('-v', '--verbose', type=int, required=False, default=0, choices=list(range(5)))
args = parser.parse_args( sys.argv[1:] )

# Load weighted exclusivity test
@@ -25,19 +25,23 @@
from wext import rce_permutation_test, load_mutation_data, output_enumeration_table

# Load the mutation data
if args.verbose > 0: print '* Loading mutation data..'
if args.verbose > 0:
print('* Loading mutation data..')
mutation_data = load_mutation_data( args.mutation_file, args.min_freq )
genes, all_genes, patients, geneToCases, _, params, _ = mutation_data
num_patients = len(patients)
sets = list( frozenset(t) for t in combinations(genes, args.gene_set_size) )

if args.verbose > 0: print '\t- Testing {} sets of size k={}'.format(len(sets), args.gene_set_size)
if args.verbose > 0:
print('\t- Testing {} sets of size k={}'.format(len(sets), args.gene_set_size))

# Run the permutational test
if args.verbose > 0: print '* Running permutation test...'
if args.verbose > 0:
print('* Running permutation test...')
start_index = (args.job_id-1) * args.batch_size
permuted_files = get_permuted_files([args.input_directory], args.num_permutations)[start_index:start_index + args.batch_size]
if args.verbose > 0: print '\t- Testing {} files'.format(len(permuted_files))
if args.verbose > 0:
print('\t- Testing {} files'.format(len(permuted_files)))

setToPval, setToRuntime, setToFDR, setToObs = rce_permutation_test( sets, geneToCases, num_patients, permuted_files, 1, 0 )

9 changes: 5 additions & 4 deletions experiments/eccb2016/scripts/permute_single_matrix.py
@@ -17,6 +17,7 @@ def get_parser():
default=os.environ.get('SGE_TASK_ID', 0))
return parser


def run( args ):
# Load WExT
sys.path.append(args.wext_dir)
@@ -33,7 +34,7 @@ def run( args ):
indexToPatient = dict( (j+1, p) for j, p in enumerate(patients) )

edges = set()
for gene, cases in geneToCases.iteritems():
for gene, cases in geneToCases.items():
for patient in cases:
edges.add( (geneToIndex[gene], patientToIndex[patient]) )

@@ -57,16 +58,16 @@ def run( args ):
permutedPatientToMutations[patient].add(gene)

# Verify the number of mutations per gene/patient is preserved
for g, cases in geneToCases.iteritems():
for g, cases in geneToCases.items():
assert( len(cases) == len(permutedGeneToCases[g]) )

for p, muts in patientToMutations.iteritems():
for p, muts in patientToMutations.items():
assert( len(muts) == len(permutedPatientToMutations[p]) )

# Save edge list.
output_file = '{}-{}.json'.format(args.output_prefix, args.job_id)
permutation = dict(params=params, permutation_number=args.job_id,
geneToCases=dict( (g, list(cases)) for g, cases in permutedGeneToCases.iteritems()))
geneToCases=dict( (g, list(cases)) for g, cases in permutedGeneToCases.items()))
with open(output_file, 'w') as OUT: json.dump( permutation, OUT )

if __name__ == '__main__':
28 changes: 14 additions & 14 deletions experiments/eccb2016/scripts/pval_correlations.py
@@ -31,7 +31,7 @@
# Compute the correlations with permutational
# permutational_pvals_with_zeros = list(df.loc[df['Method'] == 'Permutational']['Raw P-value'])
# all_indices =
tests = ["Permutational", "Fisher's exact test", "Weighted (exact test)", "Weighted (saddlepoint)"]
tests = ["Permutational", "Fisher's exact test", "Weighted (exact test)", "Weighted (saddlepoint)"]
for val, indices in [("All", []), (0, 1./args.num_permutations), (1./args.num_permutations, 2)]:
tbl = [list(tests)]
for t1 in tests:
@@ -46,33 +46,33 @@
row.append(rho)
tbl.append(row)

print '-' * 80
print 'CORRELATIONS ({})'.format(val)
print aligned_plaintext_table('\n'.join([ '\t'.join(map(str, row)) for row in tbl ]) )
print('-' * 80)
print('CORRELATIONS ({})'.format(val))
print(aligned_plaintext_table('\n'.join([ '\t'.join(map(str, row)) for row in tbl ])))

permutational_pvals_no_zeros = [ p for p in permutational_pvals_with_zeros if p > 0 ]
for method in ["Fisher's exact test", "Weighted (exact test)", "Weighted (saddlepoint)"]:
pvals = list(df.loc[df['Method'] == method]['P-value'])
print 'Correlation:', method, 'with Permutational'
print('Correlation:', method, 'with Permutational')
rho, pval = spearmanr(permutational_pvals, pvals)
print '\tIncluding P < {}: N={}, \\rho={}, P={}'.format(1./args.num_permutations, len(pvals), rho, pval)
print('\tIncluding P < {}: N={}, \\rho={}, P={}'.format(1./args.num_permutations, len(pvals), rho, pval))
pvals_no_zeros = [ p for i, p in enumerate(pvals) if permutational_pvals_with_zeros[i] > 0 ]
rho, pval = spearmanr(permutational_pvals_no_zeros, pvals_no_zeros)
print '\tWithout P < {}: N={}, \\rho={}, P={}'.format(1./args.num_permutations, len(pvals_no_zeros), rho, pval)
print
print('\tWithout P < {}: N={}, \\rho={}, P={}'.format(1./args.num_permutations, len(pvals_no_zeros), rho, pval))

# Compute the correlations of weighted saddlepoint and exact test
weighted_exact_pvals = list(df.loc[df['Method'] == 'Weighted (exact test)']['P-value'])
weighted_saddlepoint_pvals = list(df.loc[df['Method'] == 'Weighted (saddlepoint)']['P-value'])
rho, pval = spearmanr(weighted_exact_pvals, weighted_saddlepoint_pvals)

print 'Correlation of weighted exact test and saddlepoint (all P-values)'
print '\tN={}, \\rho: {}, P={}'.format(len(weighted_exact_pvals), rho, pval)
print('Correlation of weighted exact test and saddlepoint (all P-values)')
print('\tN={}, \\rho: {}, P={}'.format(len(weighted_exact_pvals), rho, pval))

tail_weighted_exact_pvals = [ p for p in weighted_exact_pvals if p < 1e-4 ]
rho, pval = spearmanr(tail_weighted_exact_pvals, [ p for i, p in enumerate(weighted_saddlepoint_pvals) if weighted_exact_pvals[i] < 1e-4])
print 'Correlation of weighted exact test and saddlepoint (P < 0.0001)'
print '\tN={}, \\rho: {}, P={}'.format(len(tail_weighted_exact_pvals), rho, pval)
print('Correlation of weighted exact test and saddlepoint (P < 0.0001)')
print('\tN={}, \\rho: {}, P={}'.format(len(tail_weighted_exact_pvals), rho, pval))

rho, pval = spearmanr(tail_weighted_exact_pvals, [ p for i, p in enumerate(permutational_pvals) if weighted_exact_pvals[i] < 1e-4])
print 'Correlation of weighted exact test and permutational (P < 0.0001)'
print '\tN={}, \\rho: {}, P={}'.format(len(tail_weighted_exact_pvals), rho, pval)
print('Correlation of weighted exact test and permutational (P < 0.0001)')
print('\tN={}, \\rho: {}, P={}'.format(len(tail_weighted_exact_pvals), rho, pval))