354 commits

Author SHA1 Message Date
Jason W. Bacon
68c12d4ad4 biology/bcf-score: Bcftools plugins for GWAS-VCF summary statistics
Score is a set of tools in the form of a bcftools plugin, for handling
and converting summary statistics files following the GWAS-VCF
specification.
2023-02-25 08:27:46 -06:00
Jason W. Bacon
595d0dabcf biology/wfa2-lib: Exact gap-affine algorithm using homology
The wavefront alignment (WFA) algorithm is an exact gap-affine
algorithm that takes advantage of homologous regions between the
sequences to accelerate the alignment process.  Unlike traditional
dynamic programming algorithms that run in quadratic time, the WFA runs
in time O(ns+s^2), proportional to the sequence length n and the
alignment score s, using O(s^2) memory (or O(s) using the
ultralow/BiWFA mode). Moreover, the WFA algorithm exhibits simple
computational patterns that modern compilers can automatically
vectorize for different architectures without adapting the code.
2023-01-30 13:09:02 -06:00
Jason W. Bacon
d2c13cc6fd biology/atac-seq: Metaport for ATAC-Seq analysis
Metaport to install tools for typical ATAC-Seq analysis, including QC,
adapter trimming, alignment, and differential analysis.
2023-01-11 17:47:02 -06:00
Yuri Victorovich
8ce2f72427 biology/py-biosig: New port: Library for reading and writing different biosignal data formats 2022-12-30 00:32:02 -08:00
Jason W. Bacon
8e926be18e biology/rna-seq: Metaport for RNA-Seq analysis
Metaport to install tools for typical RNA-Seq analysis, including QC,
adapter trimming, alignment, and differential analysis.
2022-12-12 11:30:00 -06:00
Jason W. Bacon
455ea913be biology/fasda: Fast and simple differential analysis
FASDA aims to provide a fast and simple differential analysis tool
that just works and does not require any knowledge beyond basic Unix
command-line skills. The code is written entirely in C to maximize
efficiency and portability, and to provide a simple command-line user
interface.
2022-12-12 11:17:22 -06:00
Yuri Victorovich
29c79a3b6a biology/py-pyrodigal: New port: Python binding for Prodigal, an ORF finder for genomes and metagenomes 2022-12-06 07:41:19 -08:00
Yuri Victorovich
affc6502d5 biology/metaeuk: New port: Gene discovery and annotation for large-scale eukaryotic metagenomics 2022-11-26 03:04:46 -08:00
Yuri Victorovich
8b43ef57fd biology/augustus: New port: Genome annotation tool 2022-11-25 15:55:09 -08:00
Yuri Victorovich
a7eec55e9c biology/barrnap: New port: BAsic Rapid Ribosomal RNA Predictor 2022-11-24 19:08:42 -08:00
Jason W. Bacon
acb3a0de64 biology/megahit: Ultra-fast single-node metagenomics assembly
MEGAHIT is a single node assembler for large and complex metagenomics
NGS reads, such as soil. It makes use of a succinct de Bruijn graph
(SdBG) to achieve low-memory assembly. MEGAHIT can optionally utilize
a CUDA-enabled GPU to accelerate its SdBG construction.
2022-11-22 10:39:22 -06:00
Yuri Victorovich
47f90fecad biology/libcombine: New port: C++ library for working with the COMBINE archive format 2022-10-21 00:05:07 -07:00
Yuri Victorovich
f83007151d biology/py-valerius: New port: Python bioinformatics tools 2022-10-03 21:06:22 -07:00
Yuri Victorovich
d17e67375d biology/kmcp: New port: Accurate metagenomic profiling & fast large-scale genome searching 2022-08-07 13:30:47 -07:00
Yuri Victorovich
d3e9586369 biology/ngs-sdk: Revert: Un-remove, ngs was moved into sra-tools
This reverts commit 1cb8897f61.

sra-tools needs to be updated first.

Reported by:	Tomoaki AOKI <junchoon@dec.sakura.ne.jp>, perciva@
2022-07-06 18:33:58 -07:00
Yuri Victorovich
1cb8897f61 biology/ngs-sdk: Remove, ngs was moved into sra-tools 2022-07-06 12:03:11 -07:00
Yuri Victorovich
db283c3063 biology/mopac: Move to science/mopac; Add TIMESTAMP to distinfo; Take maintainership 2022-06-19 00:52:28 -07:00
Yuri Victorovich
8db0b60850 biology/py-mrcfile: New port: MRC file I/O library which is used in structural biology 2022-05-01 10:37:04 -07:00
Jason W. Bacon
a47115b9ca biology/fastq-trim: Lightning fast sequence read trimmer
Fastq-trim is a lightning fast read trimming tool for QA of
DNA and RNA reads prior to analyses such as RNA-Seq.
2022-03-19 07:25:58 -05:00
Jason W. Bacon
d936c94c64 biology/py-bcbio-gff: Read and write Generic Feature Format (GFF)
Read and write Generic Feature Format (GFF) with Biopython integration.

Also adding py-dna-features-viewer to Makefile, missed on last commit
2022-02-03 13:38:42 -06:00
Jason W. Bacon
864d4fdd18 biology/libgff: GFF/GTF parsing library based on GCLib
This is an attempt to perform a simple "libraryfication" of the GFF/GTF
parsing code that is used in GFFRead codebase. There are not many
(any?) relatively lightweight GTF/GFF parsers exposing a C++ interface,
and the goal of this library is to provide this functionality without
the necessity of drawing in a heavy-weight dependency like SeqAn. Note:
This library draws directly from the code in GFFRead and GCLib, and
exists primarily to remove functionality (and hence code) that is
unnecessary for our downstream purposes. In the future, it may be
appropriate to just replace this library wholesale with GCLib.
2021-12-05 14:50:42 -06:00
Jason W. Bacon
9152cfc775 biology/bio-mocha: bcftools plugin for mosaic chromosomal alterations
MoChA is a bcftools plugin released under the MIT license for mosaic
chromosomal alteration detection and analysis from DNA microarray or
whole genome sequence data. It can be used both with Illumina and
Affymetrix data. It can also be used for detection of germline copy
number variants. Data can be prepared in usable file formats using the
gtc2vcf plugin.
2021-12-03 07:41:26 -06:00
Jason W. Bacon
8d2198bb6d biology/gffread: GFF/GTF format conversions, filtering, FASTA extraction, etc
GFF/GTF utility providing format conversions, filtering, FASTA sequence
extraction and more.
2021-11-10 14:49:10 -06:00
Po-Chuan Hsieh
70b81836d9 */Makefile: Sort SUBDIRs 2021-10-25 23:57:04 +08:00
Jason W. Bacon
799d4cc2cf biology/py-deeptools: User-friendly tools for exploring deep-sequencing data
deepTools contains useful modules to process the mapped reads data for
multiple quality checks, creating normalized coverage files in standard
bedGraph and bigWig file formats, that allow comparison between
different files (for example, treatment and control). Finally, using
such normalized and standardized files, deepTools can create many
publication-ready visualizations to identify enrichments and for
functional annotations of the genome.
2021-10-14 06:51:12 -05:00
Jason W. Bacon
ad5a0604c4 biology/py-bigwig: Rename to biology/py-pybigwig
Fully match upstream name
2021-10-13 16:21:32 -05:00
Antoine Brodin
6a226290f3 biology/py-pybigwig: remove, duplicate of biology/py-bigwig 2021-10-13 18:57:58 +00:00
Jason W. Bacon
c69e97e63a biology/py-pybigwig: Fix poudriere build
Typo in post-install
2021-10-13 11:02:12 -05:00
Jason W. Bacon
904a889fd1 biology/py-pybigwig: Python access to bigWig files using libBigWig
py-bigwig is a python extension, written in C, for quick access to
bigBed files and access to and creation of bigWig files. This extension
uses libBigWig for local and remote file access.
2021-10-13 08:03:21 -05:00
Jason W. Bacon
d92344039b biology/py-py2bit: Python interface for 2bit packed nucleotide files
py2bit is a python extension, written in C, for quick access to 2bit
files for randomly accessible, packed nucleotide sequences. The
extension uses lib2bit for file access.
2021-10-13 07:53:53 -05:00
Jason W. Bacon
cdf2ff2f68 biology/bamutil: Utilities for working with SAM/BAM files
Utilities for working on SAM/BAM files from The Center for Statistical
Genetics at the University of Michigan School of Public Health.  It
includes numerous functions such as splitting, merging, trimming reads,
filtering, validation, diff, etc.
2021-10-12 11:40:35 -05:00
Yuri Victorovich
c38052dfd6 biology/libneurosim: New port: Common interfaces for neuronal simulators 2021-10-08 10:20:27 -07:00
Po-Chuan Hsieh
6e0aa01726 */Makefile: Sort SUBDIRs 2021-09-21 11:35:06 +08:00
Yuri Victorovich
d8c851df59 biology/py-libsedml: New port: SED-ML library for Python 2021-09-19 11:15:36 -07:00
Yuri Victorovich
dd804e9871 biology/libsedml: New port: C++ SED-ML library 2021-09-19 11:15:35 -07:00
Yuri Victorovich
74c998634a biology/py-libnuml: New port: Numerical Markup Language for Python 2021-09-18 23:47:03 -07:00
Yuri Victorovich
ad2bdb6508 biology/libnuml: New port: C++ library for Numerical Markup Language 2021-09-18 23:47:02 -07:00
Jason W. Bacon
e61df5e464 biology/sam2pairwise: Show pairwise alignment for each read in a SAM file
sam2pairwise takes a SAM file and uses the CIGAR and MD tag to
reconstruct the pairwise alignment of each read.
2021-09-07 15:32:29 -05:00
Jason W. Bacon
c085af1e81 biology/py-pywgsim: Modified wgsim genomic data simulator
pywgsim is a modified version of the wgsim short read simulator.  The
code for wgsim has been modified to allow visualizing the simulated
mutations as a GFF file.
2021-09-06 07:55:49 -05:00
Jason W. Bacon
904aa60469 biology/biolibc-tools: High-performance tools based on biolibc
Biolibc-tools is a collection of simple, fast, memory-efficient
programs for processing biological data.  These are simple programs
built on biolibc that are not complex enough to warrant a separate
project.
2021-08-30 08:09:23 -05:00
Jason W. Bacon
2e559ca48c biology/bfc: Correct sequencing errors from Illumina sequencing data
BFC is a standalone high-performance tool for correcting sequencing
errors from Illumina sequencing data. It is specifically designed for
high-coverage whole-genome human data, though it also performs well for
small genomes.
2021-08-23 12:28:41 -05:00
Jason W. Bacon
615c521bb3 biology/flash: Fast Length Adjustment of SHort reads
FLASH (Fast Length Adjustment of SHort reads) is a very fast and
accurate software tool to merge paired-end reads from next-generation
sequencing experiments. FLASH is designed to merge pairs of reads when
the original DNA fragments are shorter than twice the length of reads.
The resulting longer reads can significantly improve genome assemblies.
They can also improve transcriptome assembly when FLASH is used to
merge RNA-seq data.
2021-08-23 12:26:47 -05:00
Yuri Victorovich
1c0e78eb4b biology/py-PySCeS: New port: Python Simulator for Cellular Systems 2021-08-17 14:30:08 -07:00
Yuri Victorovich
10875d93bb biology/py-python-libsbml: New port: LibSBML Python API 2021-08-17 14:30:07 -07:00
Yuri Victorovich
aa9a35be04 biology/sigviewer: New port: Viewing application for biosignals 2021-08-17 00:05:37 -07:00
Yuri Victorovich
df70b7efed biology/biosig: New port: Library for reading and writing different biosignal data formats 2021-08-16 19:51:36 -07:00
Jason W. Bacon
5797d6a329 biology/py-ont-fast5-api: Interface to Oxford Nanopore .fast5 files
The ont_fast5_api is a simple interface to HDF5 files of the Oxford
Nanopore .fast5 file format. It provides:

    o Implementation of the fast5 file schema using h5py library
    o Methods to interact with and reflect the fast5 file schema
    o Tools to convert between multi_read and single_read formats
    o Tools to compress/decompress raw data in files
2021-08-13 08:34:09 -05:00
Jason W. Bacon
e44f917e29 biology/erminej: Analyses of gene sets, e.g. gene expression profiling
ErmineJ performs analyses of gene sets in high-throughput genomics data
such as gene expression profiling studies. A typical goal is to
determine whether particular biological pathways are "doing something
interesting" in an experiment that generates long lists of candidates.
The software is designed to be used by biologists with little or no
informatics background (but if you do, you might be interested in the
CLI or the R support).
2021-07-09 07:26:58 -05:00
Jason W. Bacon
7fd4d4420f biology/py-goatools: Tools for processing and visualizing Gene Ontology terms
Goatools is a python library for processing Gene Ontology (GO) terms.  It
includes routines for processing, filtering, and visualizing GO data.
2021-07-02 12:00:29 -05:00
Jason W. Bacon
cf14bbb325 biology/mmseqs2: Ultra fast and sensitive sequence search and clustering suite
MMseqs2 (Many-against-Many sequence searching) is a software suite to search
and cluster huge protein and nucleotide sequence sets. MMseqs2 is open source
GPL-licensed software implemented in C++ for FreeBSD, Linux, MacOS, and (via
cygwin) Windows. The software is designed to run on multiple cores and
servers and exhibits very good scalability. MMseqs2 can run 10000 times
faster than BLAST. At 100 times its speed it achieves almost the same
sensitivity. It can perform profile searches with the same sensitivity as
PSI-BLAST at over 400 times its speed.
2021-06-24 12:31:42 -05:00