-
Notifications
You must be signed in to change notification settings - Fork 5
compatibility for python2 and python3, including C API, and revisions as to how package built #29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
282b3ba
a4106f6
8effca7
4ff1858
f4c339c
98086fc
e3a0a71
f613b30
e45829c
4612a70
844ec70
300f1d8
4a87caa
36a22bc
7c6427e
02e5586
934904a
602e301
49750b7
6b2cdda
053e680
29de50b
61fc045
da5a5a9
2beb71f
8f646d4
c04fdc8
8254756
7fa5cbb
c43aa88
fcea393
3a3e8d1
a08fa52
38acd11
885059b
2717e4a
f0498a1
096a412
3d4972c
81ebf43
a8ff8d2
9e5d982
d9a03b7
48873ff
3d74ad2
79f697b
e799a4f
ec674fc
bb0d151
4158cbd
a60809a
e3a4355
a508dac
82ff47c
3ddc71f
f5ef9a2
31a8a28
3abf0f6
990fdb2
aad1b60
866d32e
3ee93a6
411303d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| language: python | ||
| python: | ||
| - 2.7 | ||
| - 3.4 | ||
| - 3.5 | ||
| - 3.6 | ||
| install: | ||
| - sudo apt-get -y update | ||
| - sudo apt-get -y install r-base | ||
| - sudo apt-get -y install python-matplotlib | ||
| - pip install codecov | ||
| - pip install -r requirements.txt | ||
| - cd wext | ||
| - python setup.py install | ||
| - cd ../ | ||
| - pwd | ||
| - ls | ||
| script: | ||
| - nosetests | ||
| after_success: | ||
| - codecov |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,10 +4,12 @@ | |
| import sys, os, argparse, json, numpy as np, multiprocessing as mp, random | ||
| from collections import defaultdict | ||
|
|
||
|
|
||
| # Load the weighted exclusivity test | ||
| this_dir = os.path.dirname(os.path.realpath(__file__)) | ||
| sys.path.append(this_dir) | ||
| from wext import * | ||
| from past.builtins import xrange | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think we need
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think my sole motivation here was consistency :) Above I detail the memory/performance issue associated with
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's fine, and the use of is the only really important one.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I agree with this.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can keep |
||
|
|
||
| # Argument parser | ||
| def get_parser(): | ||
|
|
@@ -20,12 +22,13 @@ def get_parser(): | |
| parser.add_argument('-q', '--swap_multiplier', type=int, required=False, default=100) | ||
| parser.add_argument('-nc', '--num_cores', type=int, required=False, default=1) | ||
| parser.add_argument('-s', '--seed', type=int, required=False, default=None) | ||
| parser.add_argument('-v', '--verbose', type=int, required=False, default=1, choices=range(5)) | ||
| parser.add_argument('-v', '--verbose', type=int, required=False, default=1, choices=list(range(5))) | ||
| return parser | ||
|
|
||
| def permute_matrices_wrapper(args): return permute_matrices(*args) | ||
| def permute_matrices(edge_list, max_swaps, max_tries, seeds, verbose, | ||
| m, n, num_edges, indexToGene, indexToPatient): | ||
| def permute_matrices_wrapper(args): | ||
| return permute_matrices(*args) | ||
|
|
||
| def permute_matrices(edge_list, max_swaps, max_tries, seeds, verbose, m, n, num_edges, indexToGene, indexToPatient): | ||
| # Initialize our output | ||
| observed = np.zeros((m, n)) | ||
| permutations = [] | ||
|
|
@@ -43,8 +46,8 @@ def permute_matrices(edge_list, max_swaps, max_tries, seeds, verbose, | |
| indices.append( (edge[0]-1, edge[1]-1) ) | ||
|
|
||
| # Record the permutation | ||
| observed[zip(*indices)] += 1. | ||
| geneToCases = dict( (g, list(cases)) for g, cases in geneToCases.iteritems() ) | ||
| observed[tuple(zip(*indices))] += 1. | ||
| geneToCases = dict( (g, list(cases)) for g, cases in geneToCases.items()) | ||
| permutations.append( dict(geneToCases=geneToCases, permutation_number=seed) ) | ||
|
|
||
| return observed/float(len(seeds)), permutations | ||
|
|
@@ -76,28 +79,28 @@ def run( args ): | |
|
|
||
| # Load mutation data | ||
| if args.verbose > 0: | ||
| print '* Loading mutation data...' | ||
| print('* Loading mutation data...') | ||
|
|
||
| mutation_data = load_mutation_data( args.mutation_file ) | ||
| genes, all_genes, patients, geneToCases, patientToMutations, params, hypermutators = mutation_data | ||
|
|
||
| geneToObserved = dict( (g, len(cases)) for g, cases in geneToCases.iteritems() ) | ||
| patientToObserved = dict( (p, len(muts)) for p, muts in patientToMutations.iteritems() ) | ||
| geneToObserved = dict( (g, len(cases)) for g, cases in geneToCases.items()) | ||
| patientToObserved = dict( (p, len(muts)) for p, muts in patientToMutations.items()) | ||
| geneToIndex = dict( (g, i+1) for i, g in enumerate(all_genes) ) | ||
| indexToGene = dict( (i+1, g) for i, g in enumerate(all_genes) ) | ||
| patientToIndex = dict( (p, j+1) for j, p in enumerate(patients) ) | ||
| indexToPatient = dict( (j+1, p) for j, p in enumerate(patients) ) | ||
|
|
||
| edges = set() | ||
| for gene, cases in geneToCases.iteritems(): | ||
| for gene, cases in geneToCases.items(): | ||
| for patient in cases: | ||
| edges.add( (geneToIndex[gene], patientToIndex[patient]) ) | ||
|
|
||
| edge_list = np.array(sorted(edges), dtype=np.int) | ||
|
|
||
| # Run the bipartite edge swaps | ||
| if args.verbose > 0: | ||
| print '* Permuting matrices...' | ||
| print('* Permuting matrices...') | ||
|
|
||
| m = len(all_genes) | ||
| n = len(patients) | ||
|
|
@@ -127,7 +130,7 @@ def run( args ): | |
| # Create the weights file | ||
| if args.weights_file: | ||
| if args.verbose > 0: | ||
| print '* Saving weights file...' | ||
| print('* Saving weights file...') | ||
|
|
||
| # Allow for small accumulated numerical errors | ||
| tol = 1e3*max(m, n)*args.num_permutations*np.finfo(np.float64).eps | ||
|
|
@@ -137,10 +140,10 @@ def run( args ): | |
| P = np.add.reduce(observeds) / float(len(observeds)) | ||
|
|
||
| # Verify the weights | ||
| for g, obs in geneToObserved.iteritems(): | ||
| for g, obs in geneToObserved.items(): | ||
| assert( np.abs(P[geneToIndex[g]-1].sum() - obs) < tol) | ||
|
|
||
| for p, obs in patientToObserved.iteritems(): | ||
| for p, obs in patientToObserved.items(): | ||
| assert( np.abs(P[:, patientToIndex[p]-1].sum() - obs) < tol) | ||
|
|
||
| # Construct mutation matrix to compute marginals | ||
|
|
@@ -154,12 +157,12 @@ def run( args ): | |
| P = postprocess_weight_matrix(P, r, s) | ||
|
|
||
| # Verify the weights again | ||
| for g, obs in geneToObserved.iteritems(): | ||
| for g, obs in geneToObserved.items(): | ||
| assert( np.abs(P[geneToIndex[g]-1].sum() - obs) < tol) | ||
|
|
||
| for p, obs in patientToObserved.iteritems(): | ||
| for p, obs in patientToObserved.items(): | ||
| assert( np.abs(P[:, patientToIndex[p]-1].sum() - obs) < tol) | ||
|
|
||
| # Add pseudocounts to entries with no mutations observed; unlikely or impossible after post-processing step | ||
| P[P == 0] = 1./(2. * args.num_permutations) | ||
|
|
||
|
|
@@ -171,7 +174,7 @@ def run( args ): | |
| if args.permutation_directory: | ||
| output_prefix = args.permutation_directory + '/permuted-mutations-{}.json' | ||
| if args.verbose > 0: | ||
| print '* Saving permuted mutation data...' | ||
| print('* Saving permuted mutation data...') | ||
|
|
||
| for _, permutation_list in results: | ||
| for permutation in permutation_list: | ||
|
|
@@ -180,4 +183,5 @@ def run( args ): | |
| permutation['params'] = params | ||
| json.dump( permutation, OUT ) | ||
|
|
||
| if __name__ == '__main__': run( get_parser().parse_args(sys.argv[1:]) ) | ||
| if __name__ == '__main__': | ||
| run( get_parser().parse_args(sys.argv[1:]) ) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,7 @@ | ||
| #!/usr/bin/env python | ||
|
|
||
| import numpy as np | ||
| from past.builtins import xrange | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not needed.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You may be entirely correct, and I'm happy to accept this. See previous comments on
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not a big issue. We can leave as-is. |
||
|
|
||
| # Add a y=x line to the given matplotlib axis | ||
| def add_y_equals_x(ax, c='k', line_style='--', alpha=0.75): | ||
|
|
@@ -15,6 +17,7 @@ def add_y_equals_x(ax, c='k', line_style='--', alpha=0.75): | |
| ax.set_xlim(lims) | ||
| ax.set_ylim(lims) | ||
|
|
||
|
|
||
| def aligned_plaintext_table(table, sep='\t', spaces=2): | ||
| """ | ||
| Create and return an aligned plaintext table. | ||
|
|
@@ -41,6 +44,7 @@ def aligned_plaintext_table(table, sep='\t', spaces=2): | |
| # Return results. | ||
| return '\n'.join([''.join([entries[i][j].rjust(sizes[j]+spaces) for j in range(n)]).rstrip() for i in range(m)]) | ||
|
|
||
|
|
||
| def rank(a, reverse=False, ties=2): | ||
| """ | ||
| Find the ranks of the elements of a. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -86,7 +86,7 @@ | |
| "Cancer": cancer}) | ||
| df = pd.DataFrame(items) | ||
|
|
||
| print 'Testing {} pairs...'.format(len(weighted_exact_pvals)) | ||
| print('Testing {} pairs...'.format(len(weighted_exact_pvals))) | ||
|
|
||
| # Set up the figure | ||
| fig, ((ax1, ax2, ax3, ax4)) = plt.subplots(1, 4) | ||
|
|
@@ -138,15 +138,15 @@ | |
| # Output the correlation between | ||
| all_correlation = spearmanr(weighted_exact_pvals, weighted_saddlepoint_pvals) | ||
| tail_correlation = spearmanr(weighted_exact_tail_pvals, weighted_saddlepoint_tail_pvals) | ||
| print '-' * 14, 'Correlation: WRE (Saddlepoint) and WRE (Recursive)', '-' * 14 | ||
| print 'All: \\rho={:.5}, P={:.5}'.format(*all_correlation) | ||
| print '\Phi_WR < 10^-4: \\rho={:.5}, P={:.5}'.format(*tail_correlation) | ||
| print('-' * 14, 'Correlation: WRE (Saddlepoint) and WRE (Recursive)', '-' * 14) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In Python 2,
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Good catch. Yes, let's use plus signs. |
||
| print('All: \\rho={:.5}, P={:.5}'.format(*all_correlation)) | ||
| print('\Phi_WR < 10^-4: \\rho={:.5}, P={:.5}'.format(*tail_correlation)) | ||
|
|
||
| # Output a table summarizing the runtimes (Table 3) | ||
| print '-' * 35, 'Runtimes', '-' * 35 | ||
| print('-' * 35, 'Runtimes', '-' * 35) | ||
| tbl = ['#Method\tMinimum\tMedian\tMaximum\tTotal'] | ||
| for method in ["WRE (Exact)", "WRE (Saddlepoint)"]: | ||
| print method, sum(list(df.loc[df['Method'] == method]['Runtime (seconds)'])) | ||
| print(method, sum(list(df.loc[df['Method'] == method]['Runtime (seconds)']))) | ||
|
|
||
| # Output to file | ||
| plt.tight_layout() | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
futureis also currently needed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments above on the README; I think I was thinking the
requirements.txtaddressed all Python dependencies.Something like
pip install -r requirements.txtshould address concerns of all users, I think. (Could be wrong, happy to change.)Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but not everyone will think to check
requirements.txtor runpip install -r requirements.txt. Others may think, justifiably, that runningpython setup.py installsuccessfully means that everything ready to go. For a real-life example, it crashed for me because I tried it in a new virtual machine with the usual dependencies but hadn't neededfutureyet.If we add it to the README, then we'll save a few emails, GitHub issues, and StackOverflow searches, which is worth it for everyone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely, and I agree with this as a philosophy whole-heartedly. If there are any GitHub issues, it's 99% the fault of the developer, even if the problem is one of presentation.
So, the motivation above wasn't to remove this information. Rather, I'm trying to cater to the lazy user (myself included) which reduces GitHub issues, e-mails, SO questions, etc.
When I look at new tools in bioinformatics, I want to see a succinct description of the algorithm/code within seconds in the README. This is needed for WExT, at the top of the README. In the current README, requirements are given first. The requirements should be made more comprehensive (e.g. add the python libraries necessary with libraries required as detailed in
requirements.txt) and I think it shouldn't be the first think seen in a README. We need to aim for both succinct and comprehensive :)The rationale behind this is presentation for new users (as an invitation to use the code), as well as preventing unnecessary Github issues/e-mails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Let's move requirements lower in the README if they are too long.