RMKL: The first R package for multiple kernel learning 

Provides R and C++ functions that enable the user to conduct multiple kernel learning (MKL) and cross-validation for support vector machine (SVM) models. Cross-validation can be used to identify kernel shapes and hyperparameter combinations that serve as candidate kernels for MKL. The package provides three implementations: SimpleMKL, Simple and Efficient MKL, and Dual Augmented Lagrangian MKL. These methods identify the convex combination of candidate kernels used to construct an optimal hyperplane.
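
To make "convex combination of candidate kernels" concrete, here is a minimal Python sketch (not RMKL code and not its API). It builds two candidate Gram matrices and mixes them with non-negative weights that sum to one. The function names, the sample points, and the weights 0.3/0.7 are illustrative assumptions; in MKL the weights are learned by the optimizer rather than fixed by hand.

```python
import math

def linear_kernel(x, y):
    """Dot product between two points."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (radial basis function) kernel with bandwidth sigma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(points, kernel):
    """Evaluate a kernel on every pair of points."""
    return [[kernel(p, q) for q in points] for p in points]

def combine_kernels(gram_matrices, weights):
    """Convex combination: weights must be non-negative and sum to 1."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(gram_matrices[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, gram_matrices)) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical data and weights, chosen only for illustration.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
K_lin = gram_matrix(points, linear_kernel)
K_rbf = gram_matrix(points, rbf_kernel)
K_mix = combine_kernels([K_lin, K_rbf], weights=[0.3, 0.7])
```

Because each candidate Gram matrix is positive semidefinite and the weights are non-negative, the mixture is again a valid kernel matrix that an SVM solver can use.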

CRAN: https://cran.r-project.org/web/packages/RMKL/index.html

Reference: Wilson, C. M., et al. (2019). "Multiple-kernel learning for genomic data mining and prediction." BMC Bioinformatics 20(1): 426.


 

lncDIFF: a novel quasi-likelihood method for DE analysis of ncRNA

lncDIFF is a powerful differential analysis tool for low-abundance non-coding RNA expression data. The method is compatible with various existing RNA-Seq quantification and normalization tools. lncDIFF is implemented as an R package, available on GitHub.

Download/Install: https://github.com/qianli10000/lncDIFF

References: Li Q, ... Wang X. lncDIFF: a novel quasi-likelihood method for differential expression analysis of non-coding RNA. BMC Genomics 20: 539 (2019).


 

glmaag: Adaptive LASSO and Network Regularized Generalized Linear Models

Efficient procedures for adaptive LASSO and network-regularized regression for Gaussian, logistic, and Cox models. Provides a network estimation procedure (a combination of the methods proposed by Ucar et al. (2007) and Meinshausen and Bühlmann (2006)), cross-validation, and stability selection based on the methods of Meinshausen and Bühlmann (2010) and Liu, Roeder and Wasserman (2010) (arXiv:1006.3316). An interactive R app is available.
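
As a rough illustration of how an adaptive LASSO term and a network term can be combined, the Python sketch below evaluates a penalty of the assumed form lam1 * sum_j w_j |beta_j| + lam2 * beta' L beta, where L is the Laplacian of the feature network. This form, the function names, and the toy numbers are assumptions made for illustration; they are not taken from the glmaag source, whose exact penalty and scaling may differ.

```python
def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A of an undirected feature network."""
    n = len(adjacency)
    L = [[-adjacency[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        L[i][i] = sum(adjacency[i][j] for j in range(n) if j != i)
    return L

def adaptive_weights(initial_beta, gamma=1.0, eps=1e-8):
    """Adaptive LASSO weights: large initial estimates are penalized less."""
    return [1.0 / (abs(b) + eps) ** gamma for b in initial_beta]

def network_quadratic(beta, L):
    """beta' L beta, i.e. the sum over edges of (beta_i - beta_j)^2."""
    n = len(beta)
    return sum(beta[i] * L[i][j] * beta[j] for i in range(n) for j in range(n))

def combined_penalty(beta, weights, L, lam1, lam2):
    """Assumed penalty form: weighted L1 term plus Laplacian quadratic term."""
    l1_term = sum(w * abs(b) for w, b in zip(weights, beta))
    return lam1 * l1_term + lam2 * network_quadratic(beta, L)

# Toy path network 0 - 1 - 2 and hypothetical coefficients.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
L = graph_laplacian(adjacency)
beta = [1.0, 1.0, 0.0]
quad = network_quadratic(beta, L)
total = combined_penalty(beta, [1.0, 1.0, 1.0], L, lam1=0.5, lam2=2.0)
```

The quadratic term is small when connected features have similar coefficients, which is the sense in which the network "regularizes" the fit.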

CRAN link/download: https://cran.r-project.org/web/packages/glmaag/index.html

Reference: https://www.biorxiv.org/content/10.1101/678029v1


 

gskat: GEE Kernel Machine Score test for Family based Association Tests

Performs family-based association tests via the GEE kernel machine score test.

CRAN link: https://cran.r-project.org/web/packages/gskat/index.html

Latest github link/install: https://github.com/xfwang/gskat

Reference:

Wang X. et al. GEE-Based SNP Set Association Test for Continuous and Discrete Traits in Family-Based Association Studies. Genetic Epidemiology (2013) 37: 778-786

Wang X. et al. Rare variant association test in family based sequencing studies. Briefings in Bioinformatics (2017) 18:954-961

 


 

SCNV: CNV analysis with Single-cell DNA sequencing

SCNV provides functions for performing CNV analysis with single-cell DNA sequencing. The current pipeline mainly supports binless segmentation of single-cell sequencing data based on a nonhomogeneous Poisson process (NHPP). The resulting CNV breakpoints may be used as surrogates for SNVs.
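
The idea of segmenting reads without bins can be sketched in Python as follows. The example treats read start positions as events of a Poisson process whose rate is constant within a segment, and searches for the single breakpoint that most improves the log-likelihood. This is a simplified single change-point illustration under those assumptions, not the SCNV algorithm; the function names and the toy read positions are hypothetical.

```python
import bisect
import math

def segment_loglik(n_events, length):
    """Maximized log-likelihood of a constant-rate Poisson segment.

    With rate estimate n/length, the log-likelihood n*log(rate) - rate*length
    reduces to n*log(n/length) - n. An empty segment contributes 0.
    """
    if n_events == 0:
        return 0.0
    return n_events * math.log(n_events / length) - n_events

def best_breakpoint(positions, start, end):
    """Return (breakpoint, log-likelihood gain) for the best single split.

    Candidate breakpoints are the observed event positions, so no binning
    of the genome is needed. Returns (None, 0.0) if no split helps.
    """
    positions = sorted(positions)
    no_split = segment_loglik(len(positions), end - start)
    best_pos, best_gain = None, 0.0
    for c in positions:
        if c <= start or c >= end:
            continue
        n_left = bisect.bisect_left(positions, c)  # events strictly before c
        n_right = len(positions) - n_left
        split = segment_loglik(n_left, c - start) + segment_loglik(n_right, end - c)
        gain = split - no_split
        if gain > best_gain:
            best_pos, best_gain = c, gain
    return best_pos, best_gain

# Toy data: sparse read starts on [0, 100), dense read starts on [100, 200).
read_starts = [25, 75] + list(range(100, 200, 5))
breakpoint_pos, gain = best_breakpoint(read_starts, 0, 200)
```

A full segmentation would apply such a search recursively and use a model selection criterion to decide when to stop splitting.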

 

Link/download: https://github.com/xfwang/SCNV

Reference: 

Wang X. et al. DNA copy number profiling using single cell sequencing. Briefings in Bioinformatics (2018) 19:731-6


 

CLOSE/CLOSE-R: a toolkit for CNA and LOH analysis (CLOnality analysis) with SEquencing data

CLOSE-R is a toolkit for CNA and LOH analysis (as well as CLOnality analysis) with SEquencing data, implemented in R. The current pipeline mainly supports the analysis of paired tumor and normal samples. It consists of three major components: (1) ASCN (allele-specific copy number) estimation using a model-free approach (distance-based Chinese restaurant process) or a model-based approach (MAP, maximum a posteriori); (2) global purity and ploidy estimation; (3) genome-wide ASCN visualization.
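
To show how allele-specific copy numbers, purity, and ploidy relate to the observed signals, the Python sketch below uses the standard tumor/normal mixture model: a sample with tumor purity p mixes tumor cells (allele copies n_a and n_b) with diploid normal cells. This is a generic textbook relation given for orientation, not the estimator implemented in CLOSE-R; the function name and the example values are assumptions.

```python
import math

def expected_signals(purity, n_a, n_b, ploidy):
    """Expected log2 ratio and B-allele frequency at a heterozygous locus.

    purity : fraction of tumor cells in the sample (0 to 1)
    n_a, n_b : tumor copy numbers of the two alleles at this locus
    ploidy : average tumor copy number across the genome
    Normal cells are assumed diploid with one copy of each allele.
    """
    # Average number of copies per cell at this locus in the mixed sample.
    locus_copies = purity * (n_a + n_b) + (1 - purity) * 2
    # Average number of copies per cell across the genome in the mixed sample.
    sample_copies = purity * ploidy + (1 - purity) * 2
    log_ratio = math.log2(locus_copies / sample_copies)
    baf = (purity * n_b + (1 - purity) * 1) / locus_copies
    return log_ratio, baf

# Copy-neutral LOH (two copies of allele A, none of B) at 50% purity.
log_ratio, baf = expected_signals(purity=0.5, n_a=2, n_b=0, ploidy=2.0)
```

In this example the total copy number is unchanged, so the log ratio is 0, but the allele frequency shifts away from 0.5, which is why allele-specific signals are needed to detect LOH.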

Download/Install: https://github.com/xfwang/CLOSE

Reference: Wang X et al. Global copy number profiling of cancer genomes. Bioinformatics 2016 32:926-928


 

BStools: trimming-and-retrieving alignment for bisulfite sequencing

Currently available bisulfite sequencing tools frequently suffer from low mapping rates and few methylation calls, especially for data generated on the Illumina NextSeq sequencer. We introduce a sequential trimming-and-retrieving alignment approach for investigating DNA methylation patterns, which significantly improves the number of mapped reads and covered CpG sites.
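
The general shape of a sequential trimming-and-retrieving loop can be sketched in Python as below: reads that fail to align are trimmed and sent through alignment again, and this repeats until they map or become too short. This is a toy illustration only. The aligner here is an exact substring search, bisulfite conversion is ignored, and the trimming end, step size, minimum length, and all names are assumptions rather than BStools settings.

```python
def toy_align(read, reference):
    """Stand-in aligner: exact match position in the reference, or None."""
    pos = reference.find(read)
    return pos if pos >= 0 else None

def trim_and_retrieve(reads, reference, trim_step=2, min_length=6):
    """Align reads in rounds, trimming the 3' end of unmapped reads each round.

    Returns a dict mapping read name to (position, aligned length).
    Reads that fall below min_length without mapping are discarded.
    """
    if trim_step < 1:
        raise ValueError("trim_step must be at least 1")
    mapped = {}
    pending = dict(reads)
    while pending:
        still_unmapped = {}
        for name, seq in pending.items():
            pos = toy_align(seq, reference)
            if pos is not None:
                mapped[name] = (pos, len(seq))
                continue
            trimmed = seq[:-trim_step]
            if len(trimmed) >= min_length:
                still_unmapped[name] = trimmed  # retry in the next round
        pending = still_unmapped
    return mapped

# Hypothetical reference and reads; r2 carries a low-quality 3' tail.
reference = "ACGTTGCATGCCGATAGCTA"
reads = {
    "r1": "TGCATGCCGA",   # maps as is
    "r2": "CCGATAGCNN",   # maps after one trimming round
    "r3": "GGGGGGGGGG",   # never maps
}
mapped = trim_and_retrieve(reads, reference)
```

Here r2 is recovered after trimming, which mirrors how such a scheme can raise the number of mapped reads compared with a single alignment pass.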

Download/Install: https://github.com/xfwang/BStools

Reference: Wang X, et al. A trimming-and-retrieving alignment scheme for bisulfite sequencing data. Bioinformatics (2015) 31(12):2040-2042.