Tiltmod Manual


Package ‘tiltmod’

June 17, 2018

Type Package

Title Exponential Tilting Method for Reproducible Screening of Large Scale Testing Problem

Version 0.0.1

Author Chong Ma

Maintainer Chong Ma <chongm@email.sc.edu>

Description Develop an exponential tilting method by conditioning on the false discovery rate to tilt a two-component mixture of Beta distributions, yielding a tilted mixture model. Use a Boosted EM algorithm to fit a two-component mixture of Beta distributions for p-values or left tail areas of test statistics. The Boosted EM algorithm for the mixture model fitting is built in C++, which is quite fast and stable. The package also includes two utility functions to generate tilted false discovery rates and a frequency network.

Depends R (>= 3.3.1)

License GPL (>= 2)

LinkingTo Rcpp, BH

Imports Rcpp (>= 0.12.15), stats, foreach, doParallel, plyr, utils, igraph, edgeR, limma

Encoding UTF-8

LazyData true

RoxygenNote 6.0.1

URL https://github.com/chongma1989/tiltmod

NeedsCompilation yes

Archs i386, x64

R topics documented:

etilt

fnet

Marfan

MBM

pickrell

power

tqvalue

UBMM



etilt The exponential tilting function

Description

This function tilts the mixture model fitted from the training tail-areas (or p-values) by conditioning on the average of local fdr's from the testing tail-areas (or p-values).

Usage

etilt(xl, xt, f = NULL, h = NULL, m = NULL, interval = NULL,
  rel.tol = .Machine$double.eps^0.25, ...)

Arguments

xl The training left-tail areas (or p-values).

xt The testing left-tail areas (or p-values).

f The objective function to be tilted. If either xl or f is NULL, f is fitted by UBMM. Default is NULL.

h The conditioning function. By default, h = (1-p)/f(x), where f(x) = (1 − p) × duniform(x) + p × dbeta(x, α, β).

m The constant used to find the optimal theta such that E(h(x)) = m. If m is NULL, m = mean(h(xt)). Default is NULL.

interval The interval used to search for the optimal theta. Default is (-100L,100L).

rel.tol The accuracy used in integrate.

... Arguments to be passed to uniroot.

Value

A list containing theta, tau, tilt_tau, tilt_f, tilt_f0, and tilt_f1.

Examples

xl=c(rbeta(100,0.5,0.5),runif(900))

xt=c(rbeta(300,2,3),runif(700))

## Not run:

etilt(xl,xt)

## End(Not run)
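To make the tilting step concrete, here is a minimal base-R sketch of the idea described above. It is not the package's implementation. The mixture parameters p, a, and b are hypothetical values standing in for a fitted model, and the search interval c(-20, 20) is chosen only for this example. It solves E(h(x)) = m under the tilted density with integrate and uniroot, using the default h = (1-p)/f(x) and m = mean(h(xt)).

```r
# Illustrative sketch only, not the package's implementation.
# p, a, b are hypothetical mixture parameters; in practice they would come
# from a fit such as UBMM on the training values xl.
p <- 0.2; a <- 2; b <- 5

f <- function(x) (1 - p) * dunif(x) + p * dbeta(x, a, b)  # mixture density
h <- function(x) (1 - p) / f(x)                            # default h

# Normalising constant E(exp(theta * h(x))) and the tilted mean of h
M <- function(theta) {
  integrate(function(x) exp(theta * h(x)) * f(x), 0, 1)$value
}
Eh <- function(theta) {
  integrate(function(x) h(x) * exp(theta * h(x)) * f(x), 0, 1)$value / M(theta)
}

set.seed(1)
xt <- c(rbeta(300, 2, 5), runif(700))  # simulated testing values
m  <- mean(h(xt))                      # target, mirroring the default for m

# Solve E_theta(h(x)) = m for theta, then build the tilted density
theta  <- uniroot(function(th) Eh(th) - m,
                  interval = c(-20, 20), tol = 1e-8)$root
tilt_f <- function(x) exp(theta * h(x)) * f(x) / M(theta)
```

The tilted density tilt_f integrates to one by construction, and the tilted mean Eh(theta) matches m up to the tolerance passed to uniroot.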


fnet Frequency network

Description

This function displays the frequency network of discovered differentially expressed genes.

Usage

fnet(x, Simplify = FALSE, threshold = 0.05, max.ew = 2, max.size = 18,
  node = TRUE, directed = FALSE, print.graph = TRUE, ...)

Arguments

x A list of discovered differentially expressed genes.

Simplify A logical value indicating whether to discard genes whose relative frequency is lower than the threshold. Default is FALSE.

threshold A numeric value giving the cutoff point; genes with a relative frequency below it are discarded. Default is 0.05.

max.ew A numeric value. The maximum edge width in the network plot.

directed A logical value indicating whether the edges are drawn with directions. Default is FALSE.

print.graph A logical value. If FALSE, the frequency network is not printed. Default is TRUE.

... Arguments to be passed to plot.igraph.

Value

A network plot.

Examples

x=list(c(1,3,4),c(2,4,5),c(3,5,1),c(4,1,2),c(5,2,3))

## Not run:

fnet(x)

fnet(x,layout=layout.fruchterman.reingold,vertex.color="grey60")

## End(Not run)
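The quantities behind such a network can be sketched in base R. The following is only an illustration, under the assumption that "relative frequency" means the share of lists containing a gene and that edges reflect how often two genes appear in the same list; fnet's internal definitions may differ. The names freq, keep, inc, and co are hypothetical.

```r
# Illustrative sketch only; fnet's internal definitions may differ.
x <- list(c(1, 3, 4), c(2, 4, 5), c(3, 5, 1), c(4, 1, 2), c(5, 2, 3))

# Relative frequency: share of lists in which each gene appears
# (assumes no gene is repeated within a single list)
freq <- table(unlist(x)) / length(x)

# Genes that would survive a cutoff such as threshold = 0.05
keep <- names(freq)[freq >= 0.05]

# Co-occurrence counts: how many lists contain both genes of a pair
genes <- sort(unique(unlist(x)))
inc <- sapply(x, function(g) as.numeric(genes %in% g))  # genes x lists
rownames(inc) <- genes
co <- inc %*% t(inc)
diag(co) <- 0
```

A matrix like co could then be passed to igraph (for example as a weighted adjacency matrix) to draw a network with edge widths proportional to the counts.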


Marfan Marfan

Description

Gene expression data comprising the treatment group and 4132 gene expression measurements.

Usage

Marfan

Format

A data frame for 101 samples with 4133 variables: treatment, X1, ..., X4132. treatment contains 41 samples from the control group and 60 samples from the Marfan group.

Source

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2174953/.

MBM Fit a two-point mixture of Beta distributions

Description

Fit a two-point mixture of Beta distributions

Usage

MBM(x, w = as.numeric(c()), a0 = as.numeric(c()), a1 = as.numeric(c()),
  precision = 1e-06, MaxIter = 10000L)

Arguments

x A vector of numeric values.

w A vector of two numeric values representing the weights of the two Beta distributions. Default is 0.5 for each weight.

a0 Initial values of alpha and beta for the Beta distribution f0. Default values are 1 and 1, respectively.

a1 Initial values of alpha and beta for the Beta distribution f1. Default values are 0.5 and 0.5, respectively.

precision The tolerance for convergence. Default value is 1e-6.

MaxIter The maximum number of iterations for the EM algorithm. Default value is 10000L.

Value

A list of four components: the converged weight, the parameters of Beta distribution f0, the parameters of Beta distribution f1, and the convergence iteration, respectively.


Examples

x0=rbeta(900,0.8,0.8)

x1=rbeta(100,0.2,0.2)

## Not run:

MBM(c(x0,x1),w=c(0.8,0.2),a0=c(1,1),a1=c(0.5,0.5))

## End(Not run)
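For orientation, one textbook EM update of the mixture weights for a two-component Beta mixture can be written in a few lines of base R. This is a generic sketch started from the documented default initial values. It is not the package's Boosted EM, it leaves the Beta parameters untouched, and the clamping of x away from 0 and 1 is an assumption added here to keep the densities finite.

```r
# Illustrative sketch only: a single plain EM weight update,
# not the Boosted EM algorithm implemented in the package.
set.seed(1)
x <- c(rbeta(900, 0.8, 0.8), rbeta(100, 0.2, 0.2))
x <- pmin(pmax(x, 1e-10), 1 - 1e-10)  # keep densities finite at the edges

w  <- c(0.5, 0.5)   # initial weights
a0 <- c(1, 1)       # initial alpha, beta for f0
a1 <- c(0.5, 0.5)   # initial alpha, beta for f1

# E-step: posterior probability that each observation comes from f1
d0 <- w[1] * dbeta(x, a0[1], a0[2])
d1 <- w[2] * dbeta(x, a1[1], a1[2])
z1 <- d1 / (d0 + d1)

# M-step for the weights only (Beta parameters left at their initial values)
w_new <- c(1 - mean(z1), mean(z1))
```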

pickrell pickrell

Description

The RNA-Seq profiles were obtained from lymphoblastoid cell lines derived from 69 different Yoruba individuals from Ibadan, Nigeria. The Pickrell data consist of 40 females and 29 males with count data for 17310 genes, which are well annotated and have at least 1 count-per-million (cpm) in at least 20 samples. The raw RNA-Seq data for pickrell are available in the R package tweeDEseqCountData.

Usage

pickrell

Format

A DGEList S4 object containing the gene count data, sample information, and gene annotation data.

Source

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3089435/.

power Empirical power analysis

Description

This function performs power analysis by calculating the Type I error, power, and probability of being significant for a given global false discovery rate.

Usage

power(q, x, w = NULL, a = NULL, precision = 1e-08, MaxIter = 10000L,
  theta = NULL, alpha = 0.9, type = c("left tail area", "pvalue"),
  rel.tol = .Machine$double.eps^0.25, tol = .Machine$double.eps^0.5)


Arguments

q A numeric value or a vector of numeric values representing the global false discovery rates used for power analysis.

x A vector of numeric values representing p-values or left-tail areas of test statistics from a differential gene expression study.

w A vector of two numeric values representing the weights of the uniform and Beta distributions. See UBMM.

a A vector of two initial parameter values for the Beta distribution. See UBMM.

precision The precision for convergence. Default value is 1e-8.

MaxIter The maximum number of iterations for the EM algorithm.

theta A numeric value representing the exponential tilting parameter for the mixture model fitted from x. Default is NULL.

alpha A numeric value used to determine the probable null region in method "m1" (see tqvalue). Default is 0.9.

type A character value, chosen from "left tail area" and "pvalue". Default is "left tail area".

rel.tol The accuracy used in integrate.

tol The accuracy used in uniroot.

Value

A data frame consisting of q, TypeI, Power, and ProbS. TypeI, Power, and ProbS are calculated based on the rejection region R(q) and the empirical mixture model for x.

q The global false discovery rates provided in the arguments.

TypeI P(R(q)|H0)

Power P(R(q)|H1)

ProbS P(R(q))

If theta is provided, the result contains two data frames as above: one calculated from the non-tilted mixture model and the other from the exponentially tilted mixture model.

Examples

x=c(rbeta(50,0.5,0.5),runif(950))

q=seq(0.05,0.95,0.05)

## Not run:

power(q,x,alpha=0.9,type="left tail area")

power(q,x,theta=2,alpha=0.9,type="left tail area")

## End(Not run)
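The three reported probabilities can be illustrated with a small base-R sketch. It assumes a hypothetical uniform/Beta mixture with known parameters p, a, and b, and a one-sided rejection region R = [0, cut] whose global FDR equals q; the package may construct R(q) differently, and the helper names summarise_region and cut_for_q are made up for this example.

```r
# Illustrative sketch only. Hypothetical mixture (1 - p) * U(0, 1) + p * Beta(a, b)
# with a one-sided rejection region R = [0, cut]; the package may define
# R(q) differently.
p <- 0.05; a <- 0.5; b <- 2

summarise_region <- function(cut) {
  TypeI <- punif(cut)                    # P(R | H0)
  Power <- pbeta(cut, a, b)              # P(R | H1)
  ProbS <- (1 - p) * TypeI + p * Power   # P(R)
  c(TypeI = TypeI, Power = Power, ProbS = ProbS,
    FDR = (1 - p) * TypeI / ProbS)
}

# Cutoff whose global FDR equals q
cut_for_q <- function(q) {
  uniroot(function(u) summarise_region(u)[["FDR"]] - q,
          interval = c(1e-12, 1), tol = 1e-10)$root
}

q    <- c(0.05, 0.1, 0.2)
cuts <- sapply(q, cut_for_q)
res  <- data.frame(q = q, t(sapply(cuts, summarise_region)))
```

Each row of res gives TypeI, Power, and ProbS for the region whose FDR matches the requested q under these assumptions.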


tqvalue The exponential tilting mixture model

Description

This function tilts the mixture model fitted from the training tail-areas (or p-values) by conditioning on the average of local fdr's from the testing tail-areas (or p-values).

Usage

tqvalue(xl, xt, w = NULL, a = NULL, precision = 1e-08, MaxIter = 10000L,
  interval = NULL, adjust = TRUE, method = c("m1", "m2"),
  type = c("left tail area", "pvalue"), alpha = 0.9, q = 0.1,
  ncores = 1, rel.tol = .Machine$double.eps^0.25,
  tol = .Machine$double.eps^0.5)

Arguments

xl The training left-tail areas (or p-values).

xt The testing left-tail areas (or p-values).

w A vector of two numeric values representing the weights of the uniform and Beta distributions. See UBMM.

a A vector of two initial parameter values for the Beta distribution. See UBMM.

precision The precision for convergence. Default value is 1e-8.

MaxIter The maximum number of iterations for the EM algorithm.

interval A vector of two numeric values that determines the range searched for the optimal theta. Default is c(-1000L,1000L).

adjust Whether or not to do the model adjustment. Default is TRUE.

method A character value chosen from "m1" and "m2". Default is "m1".

type A character value, chosen from "left tail area" and "pvalue". Default is "left tail area".

alpha A numeric value. Used in method "m1" to determine the probable null region. Default is 0.9.

q A numeric value. The global false discovery rate used in method "m2" to determine the probable null region. Default is 0.1.

ncores The number of CPUs used to run this function.

rel.tol The accuracy used in integrate.

tol The accuracy used in uniroot.

Value

A data frame including xl, xt, fdr, FDR, tfdr, and tFDR. fdr and FDR are the local and global false discovery rates for each value of xt. tfdr and tFDR are the corresponding tilted local and global false discovery rates.

The optimal theta is calculated by solving log(E(exp(theta * h(x)))) - c * theta, where c = mean(h(xt)).
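One way to read the expression for theta, consistent with the condition E(h(x)) = m documented for etilt, is as a convex objective minimised over theta. The following is an interpretation offered as a sketch, not a statement of what the code does; the notation f for the fitted mixture density and f_theta for the tilted density is introduced here for illustration.

```latex
\hat{\theta}
  = \arg\min_{\theta}\;
    \Bigl\{ \log \mathrm{E}_{f}\bigl[e^{\theta h(X)}\bigr] - c\,\theta \Bigr\},
\qquad
c = \frac{1}{n}\sum_{i=1}^{n} h(x_{t,i}).

% Setting the derivative with respect to theta to zero gives
\frac{\mathrm{E}_{f}\bigl[h(X)\,e^{\theta h(X)}\bigr]}
     {\mathrm{E}_{f}\bigl[e^{\theta h(X)}\bigr]} = c,
\qquad\text{i.e.}\qquad
\mathrm{E}_{f_{\theta}}\bigl[h(X)\bigr] = c,
\quad
f_{\theta}(x) = \frac{e^{\theta h(x)}\, f(x)}
                     {\mathrm{E}_{f}\bigl[e^{\theta h(X)}\bigr]}.
```

Under this reading, the optimal theta is the value at which the mean of h under the tilted density equals the average of h over the testing values.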


Examples

xl=c(rbeta(50,0.2,0.2),runif(950))

xt=c(rbeta(50,0.1,0.1),runif(950))

## Not run:

tqvalue(xl,xt,ncores=4,adjust=FALSE,type="left tail area")

## End(Not run)

UBMM Fit a mixture of uniform and Beta distribution

Description

Fit a mixture of uniform and Beta distribution

Usage

UBMM(x, w = as.numeric(c()), a = as.numeric(c()), precision = 1e-08,
  MaxIter = 10000L)

Arguments

x A vector of numeric values.

w A vector of two numeric values representing the weights of the uniform and Beta distributions. Default is 0.5 for each weight.

a Initial values of alpha and beta for the Beta distribution. Defaults are obtained from method-of-moments (MOM) estimators.

precision The tolerance for convergence. Default value is 1e-8.

MaxIter The maximum number of iterations for the EM algorithm. Default value is 10000L.

Value

A list of three components: the converged weight, the parameters of the Beta distribution, and the convergence iteration, respectively.

Examples

x0=runif(900)

x1=rbeta(100,0.5,0.5)

UBMM(c(x0,x1),w=c(0.8,0.2),a=c(0.7,0.8))
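The argument a defaults to method-of-moments estimates. As a reminder of what such estimates look like, the standard MOM formulas for a Beta distribution are sketched below. The helper name mom_beta is hypothetical, and how UBMM applies the estimators internally (for example, to which subset of x) is not stated in this manual.

```r
# Illustrative sketch only: standard method-of-moments estimators for Beta(alpha, beta).
# mom_beta is a hypothetical helper, not a function exported by the package.
mom_beta <- function(x) {
  m <- mean(x)
  v <- var(x)
  k <- m * (1 - m) / v - 1   # common factor; requires v < m * (1 - m)
  c(alpha = m * k, beta = (1 - m) * k)
}

set.seed(1)
est <- mom_beta(rbeta(1e4, 2, 5))  # should be close to alpha = 2, beta = 5
```

Values obtained this way could be passed as a starting point, for example a = mom_beta(x).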
