Manual

User Manual:

Open the PDF directly: View PDF .
Page Count: 10

data_generation
estimation_fun
MAP_test
Summary_MAP
Index

MAPTest

June 17, 2019

Title MAP Test for detection DE genes

Version 0.0.0.9000

Description MAP test for analyzing temporal gene expression data.

Depends R (>= 3.4.0)

License GPL-2

Encoding UTF-8

LazyData true

RoxygenNote 6.1.1

Imports EQL,MASS,foreach,mclust,mvtnorm,parallel,randtoolbox,stats,matlib

Rtopics documented:

data_generation....................................... 1

estimation_fun ....................................... 3

MAP_test .......................................... 5

Summary_MAP....................................... 7

Index 10

data_generation Simulate MAP data set:

Description

Simulate MAP data set:

Usage

data_generation(G = 100, n_control = 10, n_treat = 10, n_rep = 3,

k_real = 4, sigma2_r = rep(1, 2), sigma1_2_r = 1,

sigma2_2_r = c(3, 2), mu1_r = 4, phi_g_r = rep(1, 100),

p_k_real = c(0.7, 0.1, 0.1, 0.1), x = x)

2data_generation

Arguments

GNumber of genes to simulate

n_control Number of time points in control group

n_treat Number of time points in treatment group

n_rep Number of replicates in both group

k_real Always be 4

sigma2_r Variance parameter for τg

sigma1_2_r Variance parameter for ηg1

sigma2_2_r Variance parameter for ηg2

mu1_r Mean parameter for ηg1

phi_g_r Dispersion parameter

p_k_real True proportion for each mixture component

xTime structured design for the simulated data

Details

The vector of read counts for gene g, treatment group i, replicate j, at time point t, Ygij (t), follows a

Negative Binomial distribution parameterized mean λgi and φg, where E[Ygij (t)] = λgi(t).λgi(t)

is further modeled as λgi(t) = Sij exp[ηg1Ii=2 +B0(t)ηg2Ii=2 +B0(t)τg]. We have B0(t)are

design matrix, which is constructed by 2 orthorgonal polynomial bases.

• t = 1,..., n_treat (or n_control if control group);

• j = 1,..., n_rep;

• g = 1,...,G; and

•[ηg1, ηg2, τg]~ 4-component gausssian mixture model

Value

• Y1 Simulated data

Examples

library(matlib)

n_basis = 2

n_control = 10

n_treat = 10

n_rep = 3

tt_treat = c(1:n_treat)/n_treat

nt = length(tt_treat)

ind_t = sort(sample(c(1:nt), n_control))

tt = tt_treat[ind_t]

tttt = c(rep(tt, n_rep), rep(tt_treat, n_rep))

z = x = matrix(0, length(tttt), n_basis)

z[,1] = 1.224745*tttt

z[,2] = -0.7905694 + 2.371708*tttt^2

x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt)))

x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1])

Y1 = data_generation(G = 100,

n_control = n_control,

n_treat = n_treat,

estimation_fun 3

n_rep = n_rep,

k_real = 4,

sigma2_r = rep(1, 2),

sigma1_2_r = 1,

sigma2_2_r = c(3,2),

mu1_r = 4,

phi_g_r = rep(1, 100),

p_k_real = c(0.7, 0.1, 0.1, 0.1),

x = x)

estimation_fun Estimation for MAP data set:

Description

Estimation for MAP data set:

Usage

estimation_fun(n_control = 10, n_treat = 10, n_rep = 3, x, Y1,

k = 4, nn = 300, phi = NULL, type = NULL, tttt)

Arguments

n_control Number of time points in control group

n_treat Number of time points in treatment group

n_rep Number of replicates in both group

xTime structured design for the simulated data

Y1 RNA-seq data set: G by (n_control + n_treat) * n_rep matrix

kAlways 4

nn Number of QMC nodes used in likelihood estimation

phi the dispersion parameter

type dispersion parameter estimation type

tttt Time points used in the design

Details

The vector of read counts for gene g, treatment group i, replicate j, at time point t, Ygij (t), follows a

Negative Binomial distribution parameterized mean λgi and φg, where E[Ygij (t)] = λgi(t).λgi(t)

is further modeled as λgi(t) = Sij exp[ηg1Ii=2 +B0(t)ηg2Ii=2 +B0(t)τg].We have B0(t)are

design matrix, which is constructed by 2 orthorgonal polynomial bases.

• t = 1,..., n_treat (or n_control if control group);

• j = 1,..., n_rep;

• g = 1,...,G; and

•[ηg1, ηg2, τg]~ 4-component gausssian mixture model. We used latented negative binomial

model with EM algorithm to estimate the paramters of mixture model.

4estimation_fun

Value

A list of parameters that we need to use for DE analysis.

data_use List includes the data information

• Y1 RNA-seq data set: G by (n_control + n_treat) * n_rep matrix

• start Starting point of the EM algorithm

• k always 4

• n_basis Number of basis function have been used to construct the design matrix

• X1 Vector indicate control data (0) or treatment data (1)

• x Design matrix been used for estimation

• tttt Time points used in the design

result1 List includes the Parameters estimations

• mu1 Mean parameter for ηg1

• sigma1_2 Variance parameter for ηg1

• sigma2 Variance parameter for τg

• sigma2_2 Variance parameter for ηg2

• p_k Proportion for each mixture component

• aa Dispertion parameter

Examples

library(matlib)

n_basis = 2

n_control = 10

n_treat = 10

n_rep = 3

tt_treat = c(1:n_treat)/n_treat

nt = length(tt_treat)

ind_t = sort(sample(c(1:nt), n_control))

tt = tt_treat[ind_t]

tttt = c(rep(tt, n_rep), rep(tt_treat, n_rep))

z = x = matrix(0, length(tttt), n_basis)

z[,1] = 1.224745*tttt

z[,2] = -0.7905694 + 2.371708*tttt^2

x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt)))

x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1])

Y1 = data_generation(G = 100,

n_control = n_control,

n_treat = n_treat,

n_rep = n_rep,

k_real = 4,

sigma2_r = rep(1, 2),

sigma1_2_r = 1,

sigma2_2_r = c(3,2),

mu1_r = 4,

phi_g_r = rep(1,100),

p_k_real = c(0.7, 0.1, 0.1, 0.1),

x = x)

MAP_test 5

aaa <- proc.time()

est_result <- estimation_fun(n_control = n_control,

n_treat = n_treat,

n_rep = n_rep,

x = x,

Y1 = Y1,

nn = 300,

k = 4,

phi = NULL,

type = 2,

tttt = tttt)

aaa1 <- proc.time()

aaa1 - aaa

#------------gaussian basis construction------------------

# method <- "gaussian"

# n_basis <- 2

# a = 1/4

#b=2

# c = sqrt(a^2 + 2 * a * b)

#A=a+b+c

# B = b/A

# phi_cal = function(k, x){

# Lambda_k =sqrt(2 * a / A) * B^k

# 1/(sqrt(sqrt(a/c) * 2^k * gamma(k+1))) *

# exp(-(c-a) * x^2) * hermite(sqrt(2 * c) * x, k, prob = F)

# }

# z = do.call(cbind, lapply(c(1:n_basis), phi_cal, x = (tttt - mean(tttt))))

# x = matrix(0, length(tttt), n_basis)

# if(n_basis == 2){

# x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt)))

# x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1])

# }else{

# x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt)))

# x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1])

# x[,3] = z[,3] - Proj(z[,3], rep(1, length(tttt))) - Proj(z[,3], x[,1]) - Proj(z[,3], x[,2])

# }

MAP_test MAP test

Description

MAP test

Usage

MAP_test(est_result, dd = NULL, Type = c(1:6), nn = 6000)

Arguments

est_result estimation_fun result:

A List contains data_use (the data information) and result1 (Parameters estima-

tions)

6MAP_test

dd True index to check the true FDR

Type Type of hypothesis

• Type = 1 general DE detection

• Type = 2 PDE with no time-by-treatment and NPDE with both treatment

and time-by-treatment

• Type = 3 NPDE with or without time-by-treatment

• Type = 4 PDE with no time-by-treatment

• Type = 5 NPDE with only time-by-treatment

• Type = 6 NPDE with both treatment and time-by-treatment

nn Number of QMC nodes used to estimate the test statistics

Details

The vector of read counts for gene g, treatment group i, replicate j, at time point t, Ygij (t), follows a

Negative Binomial distribution parameterized mean λgi and φg, where E[Ygij (t)] = λgi(t).λgi(t)

is further modeled as λgi(t) = Sij exp[ηg1Ii=2 +B0(t)ηg2Ii=2 +B0(t)τg].We have B0(t)are

design matrix, which is constructed by 2 orthorgonal polynomial bases.

• t = 1,..., n_treat (or n_control if control group);

• j = 1,..., n_rep;

• g = 1,...,G; and

•[ηg1, ηg2, τg]~ 4-component gausssian mixture model. After the parameter estimation we run

the MAP test to detect the DE genes.

Value

A list of results for DE analysis.

• Type Type of hypothesis of interest

• FDR_ﬁnal FDR estimation of each hypothesis

• FDR_real The true FDR if dd is known

• power Power if dd is known

• T_x Likelihood result for each component

• test_stat Test statistics for each hypothesis

• ct_ﬁnal_all Test statistics cut-off value at each estimated FDR level

Examples

library(matlib)

n_basis = 2

n_control = 10

n_treat = 10

n_rep = 3

tt_treat = c(1:n_treat)/n_treat

nt = length(tt_treat)

ind_t = sort(sample(c(1:nt), n_control))

tt = tt_treat[ind_t]

tttt = c(rep(tt, n_rep), rep(tt_treat, n_rep))

z = x = matrix(0, length(tttt), n_basis)

Summary_MAP 7

z[,1] = 1.224745*tttt

z[,2] = -0.7905694 + 2.371708*tttt^2

x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt)))

x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1])

Y1 = data_generation(G = 100,

n_control = n_control,

n_treat = n_treat,

n_rep = n_rep,

k_real = 4,

sigma2_r = rep(1, 2),

sigma1_2_r = 1,

sigma2_2_r = c(3,2),

mu1_r = 4,

phi_g_r = rep(1,100),

p_k_real = c(0.7, 0.1, 0.1, 0.1),

x = x)

aaa <- proc.time()

est_result <- estimation_fun(n_control = n_control,

n_treat = n_treat,

n_rep = n_rep,

x = x,

Y1 = Y1,

nn = 300,

k = 4,

phi = NULL,

type = 2,

tttt = tttt)

aaa1 <- proc.time()

aaa1 - aaa

G <- 100

k_real <- 4

p_k_real <- c(0.7, 0.1, 0.1, 0.1)

dd = rep(c(0:(k_real-1)), p_k_real * G)

result = MAP_test(est_result = est_result, Type = c(1:6), dd = dd, nn = 300)

Summary_MAP Summary for MAP Test

Description

Summary for MAP Test

Usage

Summary_MAP(test_result, alpha = 0.05)

Arguments

test_result MAP_test result:

A list of results for DE analysis

alpha Nominal level of FDR

8Summary_MAP

Details

The vector of read counts for gene g, treatment group i, replicate j, at time point t, Ygij (t), follows a

Negative Binomial distribution parameterized mean λgi and φg, where E[Ygij (t)] = lambdagi(t).

λgi(t)is further modeled as λgi(t) = Sij exp[ηg1Ii=2 +B0(t)ηg2Ii=2 +B0(t)τg]We have B’(t) are

design matrix, which is constructed by 2 orthorgonal polynomial bases.

• t = 1,..., n_treat (or n_control if control group);

• j = 1,..., n_rep;

• g = 1,...,G; and

•[ηg1, ηg2, τg]~ 4-component gausssian mixture model. After the parameter estimation we run

the MAP test to detect the DE genes. At given nominal level, a list of differential genes are

returned at a given hypothesis.

Value

A list of results for DE analysis.

• Type Type of hypothesis

• Reject_index DE genes list

• FDR_hat Estimated FDR

Examples

library(matlib)

set.seed(1)

n_basis = 2

n_control = 10

n_treat = 10

n_rep = 3

tt_treat = c(1:n_treat)/n_treat

nt = length(tt_treat)

ind_t = sort(sample(c(1:nt), n_control))

tt = tt_treat[ind_t]

tttt = c(rep(tt, n_rep), rep(tt_treat, n_rep))

z = x = matrix(0, length(tttt), n_basis)

z[,1] = 1.224745*tttt

z[,2] = -0.7905694 + 2.371708*tttt^2

x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt)))

x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1])

Y1 = data_generation(G = 100,

n_control = n_control,

n_treat = n_treat,

n_rep = n_rep,

k_real = 4,

sigma2_r = rep(1, 2),

sigma1_2_r = 1,

sigma2_2_r = c(3,2),

mu1_r = 4,

phi_g_r = rep(1,100),

p_k_real = c(0.7, 0.1, 0.1, 0.1),

x = x)

aaa <- proc.time()

est_result <- estimation_fun(n_control = n_control,

Summary_MAP 9

n_treat = n_treat,

n_rep = n_rep,

x = x,

Y1 = Y1,

nn = 300,

k = 4,

phi = NULL,

type = 2,

tttt = tttt)

aaa1 <- proc.time()

aaa1 - aaa

G <- 100

k_real <- 4

p_k_real <- c(0.7, 0.1, 0.1, 0.1)

dd = rep(c(0:(k_real-1)), p_k_real * G)

result = MAP_test(est_result = est_result, Type = c(1:6), dd = dd, nn = 300)

ss <- Summary_MAP(result)

Index

data_generation,1

estimation_fun,3

MAP_test,5

Summary_MAP,7

Manual

Navigation menu

Versions of this User Manual:

Views

Navigation