500 commits

Author SHA1 Message Date
bacon
36901a2a68 Add hisat2 2019-01-15 01:27:42 +00:00
bacon
1a344b31e4 biology/hisat2: import hisat2-2.1.0.23
HISAT2 is a fast and sensitive alignment program for mapping next-generation
sequencing reads (both DNA and RNA) to a population of human genomes (as well
as to a single reference genome).
2019-01-15 01:26:29 +00:00
bacon
12f5843131 Multiple packages: Replace obsolete maintainer email
jwbacon@tds.net ==> bacon@NetBSD.org
2019-01-13 22:06:42 +00:00
bacon
385a79042f ncbi-blast+: Upgrade to 2.8.1
Support for new BLAST database format
Increased makeblastdb output file size limit to 4GB
Other minor bug fixes and enhancements

OK wiz@
2019-01-07 15:00:10 +00:00
bacon
aefc835978 Add canu 2019-01-07 02:34:21 +00:00
bacon
94e269c682 biology/canu: import canu-1.8
Canu is a fork of the Celera Assembler, designed for high-noise single-molecule
sequencing (such as the PacBio RS II/Sequel or Oxford Nanopore MinION).

Canu is a hierarchical assembly pipeline which runs in four steps:

    Detect overlaps in high-noise sequences using MHAP

    Generate corrected sequence consensus

    Trim corrected sequences

    Assemble trimmed corrected sequences
2019-01-07 02:33:17 +00:00
bacon
136f8fcc3b Add stacks 2018-12-22 21:53:23 +00:00
bacon
bce1e613e1 biology/stacks: import stacks-2.2
Stacks is a software pipeline for building loci from short-read sequences, such
as those generated on the Illumina platform. Stacks was developed to work with
restriction enzyme-based data, such as RAD-seq, for the purpose of building
genetic maps and conducting population genomics and phylogeography.
2018-12-22 21:52:06 +00:00
bacon
669fe112b9 Add kallisto 2018-12-21 19:03:46 +00:00
bacon
4b7c925a95 biology/kallisto: import kallisto-0.45.0
Kallisto is a program for quantifying abundances of transcripts from RNA-Seq
data, or more generally of target sequences using high-throughput sequencing
reads. It is based on the novel idea of pseudoalignment for rapidly determining
the compatibility of reads with targets, without the need for alignment.
2018-12-21 19:00:56 +00:00
adam
5b12b7b592 revbump for boost 1.69.0 2018-12-13 19:51:31 +00:00
adam
16dd5de231 revbump after updating textproc/icu 2018-12-09 18:51:58 +00:00
bsiegert
e3848b96fc py-pydicom: Update to 1.2.1.
From Eric A. Borisch in pull request NetBSD/pkgsrc#40.

-   Do not derive Dataset from dict (#767)
    -   fixes side effects from initializing with another dataset
-   Added missing dict methods that are passed to the tags dict
-   Adapted documentation to Dataset changes
-   Make sure that the retry order config is reset in the test (#772)
2018-12-05 10:09:09 +00:00
adam
1d7a39df12 bcftools: added version 1.9
BCFtools is a program for variant calling and manipulating files in the Variant
Call Format (VCF) and its binary counterpart BCF. All commands work
transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.
In order to avoid tedious repetition, throughout this document we will use "VCF"
and "BCF" interchangeably, unless specifically noted.

Most commands accept VCF, bgzipped VCF and BCF with filetype detected
automatically even when streaming from a pipe. Indexed VCF and BCF work in all
situations. Unindexed VCF and BCF and streams work in most, but not all
situations. In general, whenever multiple VCFs are read simultaneously, they
must be indexed and therefore also compressed.
2018-11-15 09:21:24 +00:00
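The automatic filetype detection described in the bcftools entry can be illustrated with a small sketch. This is not how bcftools or HTSlib implement it; the function name `sniff_variant_format` and its return strings are hypothetical, and the sketch assumes the whole input is available in memory. It relies only on well-known leading bytes: gzip (and therefore BGZF) data starts with 0x1f 0x8b, BCF2 data starts with "BCF" followed by byte 2, and VCF text starts with "##fileformat=VCF".

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"           # also the first two bytes of every BGZF block
BCF_MAGIC = b"BCF\x02"             # BCF2 streams begin with "BCF" and byte 2
VCF_PREFIX = b"##fileformat=VCF"   # first header line of a VCF file


def sniff_variant_format(data: bytes) -> str:
    """Guess whether data is VCF or BCF, compressed or not (illustrative only)."""
    compressed = data.startswith(GZIP_MAGIC)
    if compressed:
        # Decompress just enough bytes to inspect the payload's leading bytes.
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as handle:
            head = handle.read(len(VCF_PREFIX))
    else:
        head = data[:len(VCF_PREFIX)]

    if head.startswith(BCF_MAGIC):
        kind = "BCF"
    elif head.startswith(VCF_PREFIX):
        kind = "VCF"
    else:
        return "unknown"
    return f"compressed {kind}" if compressed else kind
```

The sketch does not distinguish plain gzip from BGZF, which matters in practice because only BGZF-compressed files can be indexed.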
kleink
f1a683c990 Revbump after cairo 1.16.0 update. 2018-11-14 22:20:58 +00:00
ryoon
b86dfe6873 Recursive revbump from harfbuzz-2.1.1 2018-11-12 03:51:07 +00:00
adam
56ad8e469f samtools: updated to 1.9
Release 1.9:

 * Samtools mpileup VCF and BCF output is now deprecated.  It is still
   functional, but will warn.  Please use bcftools mpileup instead.

 * Samtools mpileup now handles the '-d' max_depth option differently.  There
   is no longer an enforced minimum, and '-d 0' is interpreted as limitless
   (no maximum - warning this may be slow).  The default per-file depth is
   now 8000, which matches the value mpileup used to use when processing
   a single sample.  To get the previous default behaviour use the higher
   of 8000 divided by the number of samples across all input files, or 250.

 * Samtools stats new features:

   - The '--remove-overlaps' option discounts overlapping portions of
     templates when computing coverage and mapped base counting.

   - When a target file is in use, the number of bases inside the
     target is printed and the percentage of target bases with coverage
     above a given threshold specified by the '--cov-threshold' option.

   - Split base composition and length statistics by first and last reads.

 * Samtools faidx new features:

   - Now takes long options.

   - Now warns about zero-length and truncated sequences due to the
     requested range being beyond the end of the sequence.

   - Gets a new option (--continue) that allows it to carry on
     when a requested sequence was not in the index.

   - It is now possible to supply the list of regions to output in a text
     file using the new '--region-file' option.

   - New '-i' option to make faidx return the reverse complement of
     the regions requested.

   - faidx now works on FASTQ (returning FASTA) and added a new
     fqidx command to index and return FASTQ.

 * Samtools collate now has a fast option '-f' that only operates on
   primary pairs, dropping secondary and supplementary.  It tries to write
   pairs to the final output file as soon as both reads have been found.

 * Samtools bedcov gets a new '-j' option to make it ignore deletions (D) and
   reference skips (N) when computing coverage.

 * Small speed up to samtools coordinate sort, by converting it to use
   radix sort.

 * Samtools idxstats now works on SAM and CRAM files, however this
   isn't fast due to some information lacking from indices.

 * Compression levels may now be specified with the level=N
   output-fmt-option.  E.g. with -O bam,level=3.

 * Various documentation improvements.

 * Bug-fixes:

   - Improved error reporting in several places.

   - Various test improvements.

   - Fixed failures in the multi-region iterator (view -M) when regions
     provided via BED files include overlaps.

   - Samtools stats now counts '=' and 'X' CIGAR operators when
     counting mapped bases.

   - Samtools stats has fixes for insert size filtering (-m, -i).

   - Samtools stats -F no longer negates an earlier -d option.

   - Fix samtools stats crash when using a target region.

   - Samtools sort now keeps to a single thread when the -@ option is absent.
     Previously it would spawn a writer thread, which could cause the CPU
     usage to go slightly over 100%.

   - Fixed samtools phase '-A' option which was incorrectly defined to take
     a parameter.

   - Fixed compilation problems when using C_INCLUDE_PATH.

   - Fixed --version when built from a Git repository.

   - Use noenhanced mode for title in plot-bamstats.  Prevents unwanted
     interpretation of characters like underscore in gnuplot version 5.

   - blast2sam.pl now reports perfect match hits (no indels or mismatches).

   - Fixed bug in fasta and fastq subcommands where stdout would not be flushed
     correctly if the -0 option was used.

   - Fixed invalid memory access in mpileup and depth on alignment records
     where the sequence is absent.
2018-11-06 10:49:41 +00:00
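The samtools 1.9 notes above say the previous mpileup default depth can be recovered as the higher of 8000 divided by the number of samples, or 250. A minimal worked sketch of that arithmetic follows. The function name `legacy_mpileup_depth` is hypothetical, and the use of integer division is an assumption, since the notes do not say how fractions are handled.

```python
def legacy_mpileup_depth(n_samples: int) -> int:
    """Per-file depth matching the pre-1.9 default, per the release notes.

    Assumption: the division is truncated to an integer.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return max(8000 // n_samples, 250)


# One sample keeps the full 8000; many samples bottom out at 250.
for n in (1, 4, 100):
    print(n, legacy_mpileup_depth(n))
```

The resulting value would be passed to mpileup through the '-d' option mentioned in the notes.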
adam
7520162475 htslib: updated to 1.9
1.9:
If ./configure fails, make will stop working until either configure is re-run successfully, or make distclean is used. This makes configuration failures more obvious.

The default SAM version has been changed to 1.6. This is in line with the latest version specification and indicates that HTSlib supports the CG tag used to store long CIGAR data in BAM format.

bgzip integrity check option '--test'

Faidx can now index fastq files as well as fasta. The fastq index adds an extra column to the .fai index which gives the offset to the quality values. New interfaces have been added to htslib/faidx.h to read the fastq index and retrieve the quality values. It is possible to open a fastq index as if fasta (only sequences will be returned), but not the other way round.

New API interfaces to add or update integer, float and array aux tags.

Add level=<number> option to hts_set_opt() to allow the compression level to be set. Setting level=0 enables uncompressed output.

Improved bgzip error reporting.

Better error reporting when CRAM reference files can't be opened.

Fixes to make tests work properly on Windows/MinGW - mainly to handle line ending differences.

Efficiency improvements:

Small speed-up for CRAM indexing.

Reduce the number of unnecessary wake-ups in the thread pool.

Avoid some memory copies when writing data, notably for uncompressed BGZF output.

Bug fixes:

Fix multi-region iterator bugs on CRAM files.

Fixed multi-region iterator bug that caused some reads to be skipped incorrectly when reading BAM files.

Fixed synced_bcf_reader() bug when reading contigs multiple times.

Fixed bug where bcf_hdr_set_samples() did not update the sample dictionary when removing samples.

Fixed bug where the VCF record ref length was calculated incorrectly if an INFO END tag was present. (71b00a)

Fixed warnings found when compiling with gcc 8.1.0.

sam_hdr_read() and sam_hdr_write() will now return an error code if passed a NULL file pointer, instead of crashing.

Fixed possible negative array look-up in sam_parse1() that somehow escaped previous fuzz testing.

Fixed bug where cram range queries could incorrectly report an error when using multiple threads.

Fixed very rare rANS normalisation bug that could cause an assertion failure when writing CRAM files.
2018-11-06 10:24:14 +00:00
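The HTSlib 1.9 notes above mention that a fastq index adds an extra column to the .fai file holding the offset to the quality values. The sketch below reads one index line in either layout. It assumes the tab-separated column order documented for faidx (NAME, LENGTH, OFFSET, LINEBASES, LINEWIDTH, plus QUALOFFSET for fastq); the `FaiRecord` type and `parse_fai_line` function are hypothetical names, not part of the htslib/faidx.h API.

```python
from typing import NamedTuple, Optional


class FaiRecord(NamedTuple):
    name: str
    length: int                 # number of bases in the sequence
    offset: int                 # byte offset of the first base
    linebases: int              # bases per line
    linewidth: int              # bytes per line, including the newline
    qualoffset: Optional[int]   # byte offset of the qualities (fastq only)


def parse_fai_line(line: str) -> FaiRecord:
    """Parse one .fai line with 5 (fasta) or 6 (fastq) columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 columns, got {len(fields)}")
    numbers = [int(value) for value in fields[1:]]
    qualoffset = numbers[4] if len(numbers) == 5 else None
    return FaiRecord(fields[0], numbers[0], numbers[1], numbers[2],
                     numbers[3], qualoffset)
```

A record whose `qualoffset` is `None` came from a fasta index. Reading a fastq index "as if fasta", as the notes put it, corresponds to ignoring the sixth column.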
leot
ff020a3410 biology: +py-pydicom 2018-10-31 20:16:30 +00:00
leot
5041cb8415 py-pydicom: Import py-pydicom-1.2.0 as biology/py-pydicom
Pydicom is a pure Python package for working with DICOM files such as medical
images, reports, and radiotherapy objects.

Pydicom makes it easy to read these complex files into natural pythonic
structures for easy manipulation. Modified datasets can be written again to
DICOM format files.

Packaged by Eric A. Borisch via NetBSD/pkgsrc#37, thank you Eric!
2018-10-31 20:15:40 +00:00
tnn
73a403b056 xylem: build fix 2018-09-29 12:49:55 +00:00
wiz
9bd737fe76 Recursive bump for perl5-5.28.0 2018-08-22 09:42:51 +00:00
minskim
321ee00dec biology/plinkseq: Requires c++11 to build with protobuf-3.6.0 2018-08-04 21:42:28 +00:00
bacon
32ab611e22 Add trimmomatic 2018-07-25 15:15:50 +00:00
bacon
cb1a7273b2 biology/trimmomatic: import Trimmomatic-0.38
Trimmomatic performs a variety of useful trimming tasks for Illumina
paired-end and single-ended data. The selection of trimming steps and their
associated parameters are supplied on the command line. It works with FASTQ
(using phred + 33 or phred + 64 quality scores, depending on the Illumina
pipeline used), either uncompressed or gzipped FASTQ.
2018-07-25 15:14:32 +00:00
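The phred + 33 and phred + 64 encodings mentioned in the Trimmomatic description differ only in the offset added to each quality score before it is stored as an ASCII character. A minimal sketch of decoding follows. The function name `decode_phred` is hypothetical and this is not Trimmomatic's code.

```python
from typing import List


def decode_phred(quality: str, offset: int = 33) -> List[int]:
    """Convert a FASTQ quality string to integer scores.

    Each score is the character's ASCII code minus the encoding offset.
    """
    if offset not in (33, 64):
        raise ValueError("offset must be 33 or 64")
    scores = [ord(char) - offset for char in quality]
    if any(score < 0 for score in scores):
        # A negative score means the string cannot use this offset.
        raise ValueError("quality string is not valid for this offset")
    return scores
```

For example, the character "I" decodes to 40 under phred + 33, and "h" decodes to 40 under phred + 64.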
ryoon
b9c1e1d533 Recursive revbump from textproc/icu-62.1 2018-07-20 03:33:47 +00:00
joerg
a19083df44 Mark packages that require C++03 (or the GNU variants) if they fail with
C++14 default language.
2018-07-18 00:06:10 +00:00
jperkin
5393242c73 *: Move SUBST_STAGE from post-patch to pre-configure
Performing substitutions during post-patch breaks tools such as mkpatches,
making it very difficult to regenerate correct patches after making changes,
and often leading to substituted string replacements being committed.
2018-07-04 13:40:07 +00:00
adam
a31bce9748 extend PYTHON_VERSIONS_ for Python 3.7 2018-07-03 05:03:01 +00:00
bacon
4cdd11c350 biology/ncbi-blast+: Respect env to support PKGSRC_USE_RELRO
Fix a previous patch that hard-coded relro support by patching in pkgsrc
CFLAGS, CXXFLAGS, and LDFLAGS instead.

OK wiz@
2018-05-22 21:37:29 +00:00
bacon
6136f9678e Add samtools 2018-05-07 18:40:10 +00:00
bacon
c5c6ca3a78 biology/samtools: import samtools-1.8
Samtools implements various utilities for post-processing alignments in the
SAM, BAM, and CRAM formats, including indexing, variant calling (in conjunction
with bcftools), and a simple alignment viewer.

OK wiz@
2018-05-07 18:37:31 +00:00
bacon
b91165d053 biology/htslib: Fix category in bl3 2018-05-01 13:20:44 +00:00
bacon
584191f36d Add htslib 2018-04-30 16:53:07 +00:00
bacon
bee3c5dd67 biology/htslib: import htslib-1.8
HTSlib is an implementation of a unified C library for accessing common file
formats, such as SAM, CRAM, VCF, and BCF, used for high-throughput sequencing
data. It is the core library used by samtools and bcftools.
2018-04-30 16:51:54 +00:00
adam
8f438fb8df ncbi-blast+: removed references to wip 2018-04-29 21:00:04 +00:00
bacon
6801a45e0e Add ncbi-blast+ 2018-04-27 20:30:24 +00:00
bacon
6e68fec7bb biology/ncbi-blast+: import ncbi-blast+-2.7.1
The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity
between sequences. The program compares nucleotide or protein sequences to
sequence databases and calculates the statistical significance of matches.
BLAST can be used to infer functional and evolutionary relationships between
sequences as well as help identify members of gene families.

OK wiz@
2018-04-27 20:28:28 +00:00
wiz
8ee21bdcf0 Recursive bump for new fribidi dependency in pango. 2018-04-16 14:33:44 +00:00
wiz
c57215a7b2 Recursive bumps for fontconfig and libzip dependency changes. 2018-03-12 11:15:24 +00:00
wiz
73dc221a12 plinkseq: use https 2018-02-11 15:50:59 +00:00
jperkin
e366662af4 Belated PKGREVISION bump for devel/protobuf update.
Fixes at least joyent/pkgsrc#60.
2018-01-17 12:10:37 +00:00
rillig
17e39f419d Fix indentation in buildlink3.mk files.
The actual fix has been done by "pkglint -F */*/buildlink3.mk", and was
reviewed manually.

There are some .include lines that still are indented with zero spaces
although the surrounding .if is indented. This is existing practice.
2018-01-07 13:03:53 +00:00
rillig
b381c6e2f3 Sort PLIST files.
Unsorted entries in PLIST files have generated a pkglint warning for at
least 12 years. Somewhat more recently, pkglint has learned to sort
PLIST files automatically. Since pkglint 5.4.23, the sorting is only
done in obvious, simple cases. These have been applied by running:

  pkglint -Cnone,PLIST -Wnone,plist-sort -r -F
2018-01-01 22:29:15 +00:00
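The idea of sorting PLIST files only in obvious, simple cases can be sketched as follows. The commit message does not define what pkglint treats as simple, so the rule used here is purely an assumption: a file qualifies only if, after its leading "@comment" lines, it contains no "@" directives and no "${...}" variable references. The function name `sort_plist_if_simple` is hypothetical.

```python
from typing import List


def sort_plist_if_simple(lines: List[str]) -> List[str]:
    """Sort PLIST entries, leaving anything non-trivial untouched.

    Assumed notion of "simple": only plain paths after the leading comments.
    """
    # Keep leading @comment lines (such as the RCS id) in place.
    split = 0
    while split < len(lines) and lines[split].startswith("@comment"):
        split += 1
    header, body = lines[:split], lines[split:]

    # Directives and variable references may be order-sensitive, so bail out.
    if any(line.startswith("@") or "${" in line for line in body):
        return list(lines)
    return header + sorted(body)
```

Returning the input unchanged in the non-simple case mirrors the conservative behaviour the commit message describes, where automatic fixes are applied only when they are clearly safe.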
rillig
4760eca917 Replaced $(ROUND) with ${CURLY} variable references.
This has been a pkglint warning for several years now, and pkglint can even
fix it automatically. And it did for this commit.

Only in lang/mercury, two passes of autofixing were necessary because there
were nested variables.
2018-01-01 18:16:35 +00:00
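The need for two autofix passes on nested variables can be reproduced with a small sketch. This is not pkglint's implementation; the regular expression and the names `convert_once` and `convert_all` are assumptions chosen for illustration. The pattern rewrites only references whose content holds no parentheses, so an outer reference becomes convertible only after its inner reference has been rewritten.

```python
import re
from typing import Tuple

# Matches $(...) with no parentheses inside, i.e. an innermost reference.
# The lookbehind skips "$$(", which make hands to the shell unchanged.
ROUND_REF = re.compile(r"(?<!\$)\$\(([^()]*)\)")


def convert_once(text: str) -> str:
    """Rewrite every currently convertible $(NAME) to ${NAME}."""
    return ROUND_REF.sub(r"${\1}", text)


def convert_all(text: str, max_passes: int = 10) -> Tuple[str, int]:
    """Repeat convert_once until nothing changes; return text and pass count."""
    passes = 0
    while passes < max_passes:
        converted = convert_once(text)
        if converted == text:
            break
        text = converted
        passes += 1
    return text, passes
```

With a hypothetical input such as `$(FOO.$(BAR))`, the first pass yields `$(FOO.${BAR})` and the second yields `${FOO.${BAR}}`, which matches the two passes reported for lang/mercury.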
rillig
4078fbf571 Cleanup: replace curly braces with parentheses. 2018-01-01 01:10:13 +00:00
he
a81a9abae7 Re-add patch to bring timeval definition and select prototype into scope.
Bump PKGREVISION.
2017-12-27 23:44:01 +00:00
wiz
b102d7d5b1 chemical-mime-data: comment out dead sites 2017-12-24 09:42:12 +00:00
bacon
f78486b904 biology/bwa: Update to 0.7.17
Numerous bug fixes, enhancements, and new command-line options

ok wiz@
2017-12-17 14:30:36 +00:00
wiz
1011d2f380 transfig: remove, replaced by print/fig2dev 2017-10-03 15:12:42 +00:00