- Added QSEQ parsing function `parse_qseq` and iterator `QseqIterator` to `skbio.parse.sequences`.
- Added `strict` and `lookup` optional parameters to `skbio.stats.distance.mantel` for handling reordering and matching of IDs when provided `DistanceMatrix` instances as input (these parameters were previously only available in `skbio.stats.distance.pwmantel`).
- `skbio.stats.distance.pwmantel` now accepts an iterable of `array_like` objects. Previously, only `DistanceMatrix` instances were allowed.
- Added `read` and `write` methods to `DissimilarityMatrix` and `DistanceMatrix`. These methods can support multiple file formats, automatic file format detection when reading, etc. by taking advantage of scikit-bio's I/O registry system. See `skbio.io` and `skbio.io.dm` for more details. Deprecated `from_file` and `to_file` methods in favor of `read` and `write`. The deprecated methods will be removed in scikit-bio 0.3.0.
- Added `read` and `write` methods to `OrdinationResults`. These methods can support multiple file formats, automatic file format detection when reading, etc. by taking advantage of scikit-bio's I/O registry system. See `skbio.io` and `skbio.io.ordres` for more details. Deprecated `from_file` and `to_file` methods in favor of `read` and `write`. The deprecated methods will be removed in scikit-bio 0.3.0.
- Added `skbio.stats.ordination.assert_ordination_results_equal` for comparing `OrdinationResults` objects for equality in unit tests.
- `skbio.stats.distance.mantel` now returns a 3-element tuple containing the correlation coefficient, p-value, and the number of matching rows/cols in the distance matrices (`n`). The return value was previously a 2-element tuple containing only the correlation coefficient and p-value.
- `skbio.stats.distance.mantel` reorders input `DistanceMatrix` instances based on matching IDs (see optional parameters `strict` and `lookup` for controlling this behavior). In the past, `DistanceMatrix` instances were treated the same as `array_like` input and no reordering took place, regardless of ID (mis)matches. `array_like` input behavior remains the same.
- If mismatched types are provided to `skbio.stats.distance.mantel` (e.g., a `DistanceMatrix` and `array_like`), a `TypeError` will be raised.
- Added git timestamp checking to checklist.py, ensuring that when changes are made to Cython (.pyx) files, their corresponding generated C files are also updated.
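
The ID-matching behavior and 3-element return value described for `skbio.stats.distance.mantel` can be illustrated with a minimal pure-Python sketch. This is not scikit-bio's implementation: the names `match_ids`, `reorder`, `upper_triangle`, `pearson`, and `mantel_sketch` are hypothetical, the `strict` semantics shown (raise on unshared IDs, otherwise discard them) are an assumption, the `lookup` parameter is omitted, and the p-value comes from a simple two-sided permutation count.

```python
import random
from math import sqrt


def match_ids(ids_x, ids_y, strict=True):
    """Return IDs present in both inputs, in the order they appear in ids_x."""
    in_y = set(ids_y)
    shared = [id_ for id_ in ids_x if id_ in in_y]
    # Assumed behavior: strict mode refuses inputs with unshared IDs.
    if strict and (len(shared) != len(ids_x) or len(shared) != len(ids_y)):
        raise ValueError("IDs are not shared by both matrices (strict=True)")
    return shared


def reorder(matrix, ids, keep):
    """Extract the rows/columns named in `keep`, in that order."""
    position = {id_: i for i, id_ in enumerate(ids)}
    idx = [position[id_] for id_ in keep]
    return [[matrix[r][c] for c in idx] for r in idx]


def upper_triangle(matrix):
    """Flatten the strictly upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)


def mantel_sketch(x, ids_x, y, ids_y, permutations=99, strict=True, seed=0):
    """Return (correlation, p_value, n) after aligning both matrices by ID."""
    keep = match_ids(ids_x, ids_y, strict=strict)
    x_sub = reorder(x, ids_x, keep)
    y_flat = upper_triangle(reorder(y, ids_y, keep))
    observed = pearson(upper_triangle(x_sub), y_flat)

    rng = random.Random(seed)
    order = list(range(len(keep)))
    at_least_as_extreme = 0
    for _ in range(permutations):
        # Permute rows and columns of x together, then re-correlate.
        rng.shuffle(order)
        permuted = [[x_sub[r][c] for c in order] for r in order]
        if abs(pearson(upper_triangle(permuted), y_flat)) >= abs(observed):
            at_least_as_extreme += 1
    p_value = (at_least_as_extreme + 1) / (permutations + 1)
    return observed, p_value, len(keep)


# The same distances, stored under differently ordered IDs.
x = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
y = [[0, 2, 3], [2, 0, 1], [3, 1, 0]]
r, p, n = mantel_sketch(x, ["a", "b", "c"], y, ["c", "a", "b"])
```

Because the two matrices are aligned by ID before the correlation is computed, the differing ID order does not affect the coefficient; without reordering, the same inputs would be compared position by position.
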
This is an initial alpha release of scikit-bio. At this stage, major backwards-incompatible API changes can and will happen. Many backwards-incompatible API changes were made since the previous release.
- Added ability to compute distances between sequences in a `SequenceCollection` object (#509), and expanded `Alignment.distance` to allow the user to pass a function for computing distances (the default distance metric is still `scipy.spatial.distance.hamming`) (#194).
- Added functionality to not penalize terminal gaps in global alignment. This functionality results in more biologically relevant global alignments (see #537 for discussion of the issue) and is now the default behavior for global alignment.
- The Python global aligners (`global_pairwise_align`, `global_pairwise_align_nucleotide`, and `global_pairwise_align_protein`) now support aligning pairs of sequences, pairs of alignments, and a sequence and an alignment (see #550). This functionality supports progressive multiple sequence alignment, among other things such as adding a sequence to an existing alignment.
- Added `StockholmAlignment.to_file` for writing Stockholm-formatted files.
- Added `strict=True` optional parameter to `DissimilarityMatrix.filter`.
- Added `TreeNode.find_all` for finding all tree nodes that match a given name.
- Fixed bug that resulted in a `ValueError` from `local_pairwise_align_nucleotide` (see #504) under many circumstances. This bug did not generate incorrect results, but it caused the code to fail.
- Removed `skbio.math`, leaving `stats` and `diversity` to become top-level packages. For example, instead of `from skbio.math.stats.ordination import PCoA` you would now import `from skbio.stats.ordination import PCoA`.
- The module `skbio.math.gradient` as well as the contents of `skbio.math.subsample` and `skbio.math.stats.misc` are now found in `skbio.stats`. As an example, to import subsample: `from skbio.stats import subsample`; to import everything from gradient: `from skbio.stats.gradient import *`.
- The contents of `skbio.math.stats.ordination.utils` are now in `skbio.stats.ordination`.
- Removed `skbio.app` subpackage (i.e., the application controller framework) as this code has been ported to the standalone burrito Python package. This code was not specific to bioinformatics and is useful for wrapping command-line applications in general.
- Removed `skbio.core`, leaving `alignment`, `genetic_code`, `sequence`, `tree`, and `workflow` to become top-level packages. For example, instead of `from skbio.core.sequence import DNA` you would now import `from skbio.sequence import DNA`.
- Removed `skbio.util.exception` and `skbio.util.warning` (see #577 for the reasoning behind this change). The exceptions/warnings were moved to the following locations:
    - `FileFormatError`, `RecordError`, `FieldError`, and `EfficiencyWarning` have been moved to `skbio.util`
    - `BiologicalSequenceError` has been moved to `skbio.sequence`
    - `SequenceCollectionError` and `StockholmParseError` have been moved to `skbio.alignment`
    - `DissimilarityMatrixError`, `DistanceMatrixError`, `DissimilarityMatrixFormatError`, and `MissingIDError` have been moved to `skbio.stats.distance`
    - `TreeError`, `NoLengthError`, `DuplicateNodeError`, `MissingNodeError`, and `NoParentError` have been moved to `skbio.tree`
    - `FastqParseError` has been moved to `skbio.parse.sequences`
    - `GeneticCodeError`, `GeneticCodeInitError`, and `InvalidCodonError` have been moved to `skbio.genetic_code`
- The contents of `skbio.genetic_code` (formerly `skbio.core.genetic_code`) are now in `skbio.sequence`. The `GeneticCodes` dictionary is now a function `genetic_code`. The functionality is the same, except that because this is now a function rather than a dict, retrieving a genetic code is done using a function call rather than a lookup (so, for example, `GeneticCodes[2]` becomes `genetic_code(2)`).
- Many submodules have been made private with the intention of simplifying imports for users. See #562 for discussion of this change. The following list contains the previous module name and where imports from that module should now come from.
    - `skbio.alignment.ssw` to `skbio.alignment`
    - `skbio.alignment.alignment` to `skbio.alignment`
    - `skbio.alignment.pairwise` to `skbio.alignment`
    - `skbio.diversity.alpha.base` to `skbio.diversity.alpha`
    - `skbio.diversity.alpha.gini` to `skbio.diversity.alpha`
    - `skbio.diversity.alpha.lladser` to `skbio.diversity.alpha`
    - `skbio.diversity.beta.base` to `skbio.diversity.beta`
    - `skbio.draw.distributions` to `skbio.draw`
    - `skbio.stats.distance.anosim` to `skbio.stats.distance`
    - `skbio.stats.distance.base` to `skbio.stats.distance`
    - `skbio.stats.distance.permanova` to `skbio.stats.distance`
    - `skbio.distance` to `skbio.stats.distance`
    - `skbio.stats.ordination.base` to `skbio.stats.ordination`
    - `skbio.stats.ordination.canonical_correspondence_analysis` to `skbio.stats.ordination`
    - `skbio.stats.ordination.correspondence_analysis` to `skbio.stats.ordination`
    - `skbio.stats.ordination.principal_coordinate_analysis` to `skbio.stats.ordination`
    - `skbio.stats.ordination.redundancy_analysis` to `skbio.stats.ordination`
    - `skbio.tree.tree` to `skbio.tree`
    - `skbio.tree.trie` to `skbio.tree`
    - `skbio.util.misc` to `skbio.util`
    - `skbio.util.testing` to `skbio.util`
    - `skbio.util.exception` to `skbio.util`
    - `skbio.util.warning` to `skbio.util`
- Moved `skbio.distance` contents into `skbio.stats.distance`.
- Relaxed requirement in `BiologicalSequence.distance` that sequences being compared are of equal length. This is relevant for Hamming distance, so the check is still performed in that case, but other distance metrics may not have that requirement (see #504).
- Renamed `powertrip.py` repo-checking script to `checklist.py` for clarity.
- `checklist.py` now ensures that all unit tests import from a minimally deep API. For example, it will produce an error if `skbio.core.distance.DistanceMatrix` is used over `skbio.DistanceMatrix`.
- Extra dimension is no longer calculated in `skbio.stats.spatial.procrustes`.
- Expanded documentation in various subpackages.
- Added new scikit-bio logo. Thanks Alina Prassas!
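
The notes above on pluggable distance functions and the relaxed equal-length requirement can be sketched as follows. This is a simplified, hypothetical illustration rather than the `BiologicalSequence.distance` implementation: `hamming`, `kmer_distance`, and `distance` are names invented here, plain strings stand in for sequence objects, and the k-mer metric is just one example of a metric that does not need equal lengths.

```python
def hamming(seq1, seq2):
    """Proportion of mismatched positions; only defined for equal lengths."""
    if len(seq1) != len(seq2):
        raise ValueError("Hamming distance requires sequences of equal length")
    if not seq1:
        return 0.0
    return sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)


def kmer_distance(seq1, seq2, k=3):
    """Hypothetical metric: Jaccard distance between the sets of k-mers."""
    kmers1 = {seq1[i:i + k] for i in range(len(seq1) - k + 1)}
    kmers2 = {seq2[i:i + k] for i in range(len(seq2) - k + 1)}
    union = kmers1 | kmers2
    if not union:
        return 0.0
    return 1.0 - len(kmers1 & kmers2) / len(union)


def distance(seq1, seq2, distance_fn=hamming):
    """Delegate to the supplied metric; any length check lives in the metric."""
    return distance_fn(seq1, seq2)


same_length = distance("ACGT", "ACGA")                  # Hamming by default
diff_length = distance("ACGT", "ACG", kmer_distance)    # no length requirement
```

Keeping the length check inside the Hamming metric, rather than in the dispatching function, is what lets other metrics accept sequences of different lengths.
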
This is a pre-alpha release. At this stage, major backwards-incompatible API changes can and will happen.
- Added Python implementations of Smith-Waterman and Needleman-Wunsch alignment as `skbio.core.alignment.pairwise.local_pairwise_align` and `skbio.core.alignment.pairwise.global_pairwise_align`. These are much slower than native C implementations (e.g., `skbio.core.alignment.local_pairwise_align_ssw`) and as a result raise an `EfficiencyWarning` when called, but are included because they are simple to experiment with and serve as useful educational examples.
- Added `skbio.core.diversity.beta.pw_distances` and `skbio.core.diversity.beta.pw_distances_from_table`. These provide convenient access to the `scipy.spatial.distance.pdist` beta diversity metrics from within scikit-bio. The `skbio.core.diversity.beta.pw_distances_from_table` function will only be available temporarily, until the `biom.table.Table` object is merged into scikit-bio (see #489), at which point `skbio.core.diversity.beta.pw_distances` will be updated to use that.
- Added `skbio.core.alignment.StockholmAlignment`, which provides support for parsing Stockholm-formatted alignment files and working with those alignments in the context of RNA secondary structure information.
- Added `skbio.core.tree.majority_rule` function for computing consensus trees from a list of trees.
- Function `skbio.core.alignment.align_striped_smith_waterman` renamed to `local_pairwise_align_ssw` and now returns an `Alignment` object instead of an `AlignmentStructure`.
- The following keyword arguments for `StripedSmithWaterman` and `local_pairwise_align_ssw` have been renamed:
    - `gap_open` -> `gap_open_penalty`
    - `gap_extend` -> `gap_extend_penalty`
    - `match` -> `match_score`
    - `mismatch` -> `mismatch_score`
- Removed `skbio.util.sort` module in favor of natsort package.
- Added powertrip.py script to perform basic sanity-checking of the repo based on recurring issues that weren't being caught until release time; added to Travis build.
- Added RELEASE.md with release instructions.
- Added intersphinx mappings to docs so that "See Also" references to numpy, scipy, matplotlib, and pandas are hyperlinks.
- The following classes are no longer `namedtuple` subclasses (see #359 for the rationale):
    - `skbio.math.stats.ordination.OrdinationResults`
    - `skbio.math.gradient.GroupResults`
    - `skbio.math.gradient.CategoryResults`
    - `skbio.math.gradient.GradientANOVAResults`
- Added coding guidelines draft.
- Added new alpha diversity formulas to the `skbio.math.diversity.alpha` documentation.
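
For code that still passes the old keyword names to `StripedSmithWaterman` or `local_pairwise_align_ssw`, the renames listed above can be applied mechanically. The helper below is a hypothetical migration aid, not part of scikit-bio: `RENAMED_KWARGS` and `translate_kwargs` are names invented for this sketch, and only the four old-to-new pairs come from the release notes.

```python
# Old keyword name -> new keyword name, as listed in the release notes.
RENAMED_KWARGS = {
    "gap_open": "gap_open_penalty",
    "gap_extend": "gap_extend_penalty",
    "match": "match_score",
    "mismatch": "mismatch_score",
}


def translate_kwargs(kwargs):
    """Return a copy of `kwargs` with old keyword names replaced by new ones."""
    translated = {}
    for name, value in kwargs.items():
        new_name = RENAMED_KWARGS.get(name, name)
        if new_name in translated:
            # Both the old and the new spelling were supplied.
            raise TypeError("duplicate keyword argument: %r" % new_name)
        translated[new_name] = value
    return translated


old_style = {"gap_open": 5, "gap_extend": 2, "match": 2, "mismatch": -3}
new_style = translate_kwargs(old_style)
```

The resulting dictionary could then be unpacked with `**new_style` into a call that uses the renamed parameters; keywords that were not renamed pass through unchanged.
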
This is a pre-alpha release. At this stage, major backwards-incompatible API changes can and will happen.
- Added `enforce_qual_range` parameter to `parse_fastq` (on by default, maintaining backward compatibility). This allows disabling of the quality score range-checking.
- Added `skbio.core.tree.nj`, which applies neighbor-joining for phylogenetic reconstruction.
- Added `bioenv`, `mantel`, and `pwmantel` distance-based statistics to `skbio.math.stats.distance` subpackage.
- Added `skbio.math.stats.misc` module for miscellaneous stats utility functions.
- IDs are now optional when constructing a `DissimilarityMatrix` or `DistanceMatrix` (monotonically-increasing integers cast as strings are automatically used).
- Added `DistanceMatrix.permute` method for randomly permuting rows and columns of a distance matrix.
- Added the following methods to `DissimilarityMatrix`: `filter`, `index`, and `__contains__` for ID-based filtering, index lookup, and membership testing, respectively.
- Added `ignore_comment` parameter to `parse_fasta` (off by default, maintaining backward compatibility). This handles stripping the comment field from the header line (i.e., all characters beginning with the first space) before returning the label.
- Added imports of `BiologicalSequence`, `NucleotideSequence`, `DNA`, `DNASequence`, `RNA`, `RNASequence`, `Protein`, `ProteinSequence`, `DistanceMatrix`, `align_striped_smith_waterman`, `SequenceCollection`, `Alignment`, `TreeNode`, `nj`, `parse_fasta`, `parse_fastq`, `parse_qual`, `FastaIterator`, `FastqIterator`, `SequenceIterator` in `skbio/__init__.py` for convenient importing. For example, it's now possible to `from skbio import Alignment`, rather than `from skbio.core.alignment import Alignment`.
- Fixed a couple of unit tests that could fail stochastically.
- Added missing `__init__.py` files to a couple of test directories so that these tests won't be skipped.
- `parse_fastq` now raises an error on dangling records.
- Fixed several warnings that were raised while running the test suite with Python 3.4.
- Functionality imported from `skbio.core.ssw` must now be imported from `skbio.core.alignment` instead.
- Code is now flake8-compliant; added flake8 checking to Travis build.
- Various additions and improvements to documentation (API, installation instructions, developer instructions, etc.).
- `__future__` imports are now standardized across the codebase.
- New website front page and styling changes throughout. Moved docs site to its own versioned subdirectories.
- Reorganized alignment data structures and algorithms (e.g., SSW code, `Alignment` class, etc.) into an `skbio.core.alignment` subpackage.
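
The effect of the `ignore_comment` parameter described for `parse_fasta` in this release can be sketched on a single header line. This is an illustrative stand-in, not the parser itself: `fasta_label` is a hypothetical helper, and it assumes the comment field is everything from the first space onward, as the note states.

```python
def fasta_label(header_line, ignore_comment=False):
    """Extract the label from a FASTA header line such as '>seq1 some comment'."""
    label = header_line.rstrip("\r\n")
    if label.startswith(">"):
        label = label[1:]
    if ignore_comment:
        # Drop the comment field: all characters from the first space onward.
        label = label.split(" ", 1)[0]
    return label


full_label = fasta_label(">seq1 Homo sapiens\n")
short_label = fasta_label(">seq1 Homo sapiens\n", ignore_comment=True)
```

With the default `ignore_comment=False` the whole header text is kept as the label, which matches the backward-compatible behavior noted in the entry.
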
Fixes to setup.py. This is a pre-alpha release. At this stage, major backwards-incompatible API changes can and will happen.
Initial pre-alpha release. At this stage, major backwards-incompatible API changes can and will happen.