Manual
User Manual:
Open the PDF directly: View PDF .
Page Count: 10
Download | |
Open PDF In Browser | View PDF |
MAPTest June 17, 2019 Title MAP Test for detection DE genes Version 0.0.0.9000 Description MAP test for analyzing temporal gene expression data. Depends R (>= 3.4.0) License GPL-2 Encoding UTF-8 LazyData true RoxygenNote 6.1.1 Imports EQL,MASS,foreach,mclust,mvtnorm,parallel,randtoolbox,stats,matlib R topics documented: data_generation estimation_fun MAP_test . . . Summary_MAP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Index data_generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 3 5 7 10 Simulate MAP data set: Description Simulate MAP data set: Usage data_generation(G = 100, n_control = 10, n_treat = 10, n_rep = 3, k_real = 4, sigma2_r = rep(1, 2), sigma1_2_r = 1, sigma2_2_r = c(3, 2), mu1_r = 4, phi_g_r = rep(1, 100), p_k_real = c(0.7, 0.1, 0.1, 0.1), x = x) 1 2 data_generation Arguments G Number of genes to simulate n_control Number of time points in control group n_treat Number of time points in treatment group n_rep Number of replicates in both group k_real Always be 4 sigma2_r Variance parameter for τg sigma1_2_r Variance parameter for ηg1 sigma2_2_r Variance parameter for ηg2 mu1_r Mean parameter for ηg1 phi_g_r Dispersion parameter p_k_real True proportion for each mixture component x Time structured design for the simulated data Details The vector of read counts for gene g, treatment group i, replicate j, at time point t, Ygij (t), follows a Negative Binomial distribution parameterized mean λgi and φg , where E[Ygij (t)] = λgi (t). λgi (t) is further modeled as λgi (t) = Sij exp[ηg1 Ii=2 + B 0 (t)ηg2 Ii=2 + B 0 (t)τg ]. We have B 0 (t) are design matrix, which is constructed by 2 orthorgonal polynomial bases. • t = 1,..., n_treat (or n_control if control group); • j = 1,..., n_rep; • g = 1,...,G; and • [ηg1 , ηg2 , τg ] ~ 4-component gausssian mixture model Value • Y1 Simulated data Examples library(matlib) n_basis = 2 n_control = 10 n_treat = 10 n_rep = 3 tt_treat = c(1:n_treat)/n_treat nt = length(tt_treat) ind_t = sort(sample(c(1:nt), n_control)) tt = tt_treat[ind_t] tttt = c(rep(tt, n_rep), rep(tt_treat, n_rep)) z = x = matrix(0, length(tttt), n_basis) z[,1] = 1.224745*tttt z[,2] = -0.7905694 + 2.371708*tttt^2 x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt))) x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1]) Y1 = data_generation(G = 100, n_control = n_control, n_treat = n_treat, estimation_fun 3 n_rep = n_rep, k_real = 4, sigma2_r = rep(1, 2), sigma1_2_r = 1, sigma2_2_r = c(3,2), mu1_r = 4, phi_g_r = rep(1, 100), p_k_real = c(0.7, 0.1, 0.1, 0.1), x = x) Estimation for MAP data set: estimation_fun Description Estimation for MAP data set: Usage estimation_fun(n_control = 10, n_treat = 10, n_rep = 3, x, Y1, k = 4, nn = 300, phi = NULL, type = NULL, tttt) Arguments n_control Number of time points in control group n_treat Number of time points in treatment group n_rep Number of replicates in both group x Time structured design for the simulated data Y1 RNA-seq data set: G by (n_control + n_treat) * n_rep matrix k Always 4 nn Number of QMC nodes used in likelihood estimation phi the dispersion parameter type dispersion parameter estimation type tttt Time points used in the design Details The vector of read counts for gene g, treatment group i, replicate j, at time point t, Ygij (t), follows a Negative Binomial distribution parameterized mean λgi and φg , where E[Ygij (t)] = λgi (t). λgi (t) is further modeled as λgi (t) = Sij exp[ηg1 Ii=2 + B 0 (t)ηg2 Ii=2 + B 0 (t)τg ]. We have B 0 (t) are design matrix, which is constructed by 2 orthorgonal polynomial bases. • t = 1,..., n_treat (or n_control if control group); • j = 1,..., n_rep; • g = 1,...,G; and • [ηg1 , ηg2 , τg ] ~ 4-component gausssian mixture model. We used latented negative binomial model with EM algorithm to estimate the paramters of mixture model. 4 estimation_fun Value A list of parameters that we need to use for DE analysis. data_use List includes the data information • Y1 RNA-seq data set: G by (n_control + n_treat) * n_rep matrix • start Starting point of the EM algorithm • k always 4 • n_basis Number of basis function have been used to construct the design matrix • X1 Vector indicate control data (0) or treatment data (1) • x Design matrix been used for estimation • tttt Time points used in the design result1 List includes the Parameters estimations • mu1 Mean parameter for ηg1 • sigma1_2 Variance parameter for ηg1 • sigma2 Variance parameter for τg • sigma2_2 Variance parameter for ηg2 • p_k Proportion for each mixture component • aa Dispertion parameter Examples library(matlib) n_basis = 2 n_control = 10 n_treat = 10 n_rep = 3 tt_treat = c(1:n_treat)/n_treat nt = length(tt_treat) ind_t = sort(sample(c(1:nt), n_control)) tt = tt_treat[ind_t] tttt = c(rep(tt, n_rep), rep(tt_treat, n_rep)) z = x = matrix(0, length(tttt), n_basis) z[,1] = 1.224745*tttt z[,2] = -0.7905694 + 2.371708*tttt^2 x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt))) x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1]) Y1 = data_generation(G = 100, n_control = n_control, n_treat = n_treat, n_rep = n_rep, k_real = 4, sigma2_r = rep(1, 2), sigma1_2_r = 1, sigma2_2_r = c(3,2), mu1_r = 4, phi_g_r = rep(1,100), p_k_real = c(0.7, 0.1, 0.1, 0.1), x = x) MAP_test 5 aaa <- proc.time() est_result <- estimation_fun(n_control = n_control, n_treat = n_treat, n_rep = n_rep, x = x, Y1 = Y1, nn = 300, k = 4, phi = NULL, type = 2, tttt = tttt) aaa1 <- proc.time() aaa1 - aaa #------------gaussian basis construction-----------------# method <- "gaussian" # n_basis <- 2 # a = 1/4 # b = 2 # c = sqrt(a^2 + 2 * a * b) # A = a + b + c # B = b/A # phi_cal = function(k, x){ # Lambda_k =sqrt(2 * a / A) * B^k # 1/(sqrt(sqrt(a/c) * 2^k * gamma(k+1))) * # exp(-(c-a) * x^2) * hermite(sqrt(2 * c) * x, k, prob = F) # # } # z = do.call(cbind, lapply(c(1:n_basis), phi_cal, x = (tttt - mean(tttt)))) # x = matrix(0, length(tttt), n_basis) # if(n_basis == 2){ # x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt))) # x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1]) # # }else{ # x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt))) # x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1]) # x[,3] = z[,3] - Proj(z[,3], rep(1, length(tttt))) - Proj(z[,3], x[,1]) - Proj(z[,3], x[,2]) # } MAP_test MAP test Description MAP test Usage MAP_test(est_result, dd = NULL, Type = c(1:6), nn = 6000) Arguments est_result estimation_fun result: A List contains data_use (the data information) and result1 (Parameters estimations) 6 MAP_test dd True index to check the true FDR Type Type of hypothesis • Type = 1 general DE detection • Type = 2 PDE with no time-by-treatment and NPDE with both treatment and time-by-treatment • Type = 3 NPDE with or without time-by-treatment • Type = 4 PDE with no time-by-treatment • Type = 5 NPDE with only time-by-treatment • Type = 6 NPDE with both treatment and time-by-treatment Number of QMC nodes used to estimate the test statistics nn Details The vector of read counts for gene g, treatment group i, replicate j, at time point t, Ygij (t), follows a Negative Binomial distribution parameterized mean λgi and φg , where E[Ygij (t)] = λgi (t). λgi (t) is further modeled as λgi (t) = Sij exp[ηg1 Ii=2 + B 0 (t)ηg2 Ii=2 + B 0 (t)τg ]. We have B 0 (t) are design matrix, which is constructed by 2 orthorgonal polynomial bases. • t = 1,..., n_treat (or n_control if control group); • j = 1,..., n_rep; • g = 1,...,G; and • [ηg1 , ηg2 , τg ] ~ 4-component gausssian mixture model. After the parameter estimation we run the MAP test to detect the DE genes. Value A list of results for DE analysis. • Type Type of hypothesis of interest • FDR_final FDR estimation of each hypothesis • FDR_real The true FDR if dd is known • power Power if dd is known • T_x Likelihood result for each component • test_stat Test statistics for each hypothesis • ct_final_all Test statistics cut-off value at each estimated FDR level Examples library(matlib) n_basis = 2 n_control = 10 n_treat = 10 n_rep = 3 tt_treat = c(1:n_treat)/n_treat nt = length(tt_treat) ind_t = sort(sample(c(1:nt), n_control)) tt = tt_treat[ind_t] tttt = c(rep(tt, n_rep), rep(tt_treat, n_rep)) z = x = matrix(0, length(tttt), n_basis) Summary_MAP 7 z[,1] = 1.224745*tttt z[,2] = -0.7905694 + 2.371708*tttt^2 x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt))) x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1]) Y1 = data_generation(G = 100, n_control = n_control, n_treat = n_treat, n_rep = n_rep, k_real = 4, sigma2_r = rep(1, 2), sigma1_2_r = 1, sigma2_2_r = c(3,2), mu1_r = 4, phi_g_r = rep(1,100), p_k_real = c(0.7, 0.1, 0.1, 0.1), x = x) aaa <- proc.time() est_result <- estimation_fun(n_control = n_control, n_treat = n_treat, n_rep = n_rep, x = x, Y1 = Y1, nn = 300, k = 4, phi = NULL, type = 2, tttt = tttt) aaa1 <- proc.time() aaa1 - aaa G <- 100 k_real <- 4 p_k_real <- c(0.7, 0.1, 0.1, 0.1) dd = rep(c(0:(k_real-1)), p_k_real * G) result = MAP_test(est_result = est_result, Type = c(1:6), dd = dd, nn = 300) Summary for MAP Test Summary_MAP Description Summary for MAP Test Usage Summary_MAP(test_result, alpha = 0.05) Arguments test_result MAP_test result: A list of results for DE analysis alpha Nominal level of FDR 8 Summary_MAP Details The vector of read counts for gene g, treatment group i, replicate j, at time point t, Ygij (t), follows a Negative Binomial distribution parameterized mean λgi and φg , where E[Ygij (t)] = lambdagi (t). λgi (t) is further modeled as λgi (t) = Sij exp[ηg1 Ii=2 + B 0 (t)ηg2 Ii=2 + B 0 (t)τg ] We have B’(t) are design matrix, which is constructed by 2 orthorgonal polynomial bases. • t = 1,..., n_treat (or n_control if control group); • j = 1,..., n_rep; • g = 1,...,G; and • [ηg1 , ηg2 , τg ] ~ 4-component gausssian mixture model. After the parameter estimation we run the MAP test to detect the DE genes. At given nominal level, a list of differential genes are returned at a given hypothesis. Value A list of results for DE analysis. • Type Type of hypothesis • Reject_index DE genes list • FDR_hat Estimated FDR Examples library(matlib) set.seed(1) n_basis = 2 n_control = 10 n_treat = 10 n_rep = 3 tt_treat = c(1:n_treat)/n_treat nt = length(tt_treat) ind_t = sort(sample(c(1:nt), n_control)) tt = tt_treat[ind_t] tttt = c(rep(tt, n_rep), rep(tt_treat, n_rep)) z = x = matrix(0, length(tttt), n_basis) z[,1] = 1.224745*tttt z[,2] = -0.7905694 + 2.371708*tttt^2 x[,1] = z[,1] - Proj(z[,1], rep(1, length(tttt))) x[,2] = z[,2] - Proj(z[,2], rep(1, length(tttt))) - Proj(z[,2], x[,1]) Y1 = data_generation(G = 100, n_control = n_control, n_treat = n_treat, n_rep = n_rep, k_real = 4, sigma2_r = rep(1, 2), sigma1_2_r = 1, sigma2_2_r = c(3,2), mu1_r = 4, phi_g_r = rep(1,100), p_k_real = c(0.7, 0.1, 0.1, 0.1), x = x) aaa <- proc.time() est_result <- estimation_fun(n_control = n_control, Summary_MAP 9 n_treat = n_treat, n_rep = n_rep, x = x, Y1 = Y1, nn = 300, k = 4, phi = NULL, type = 2, tttt = tttt) aaa1 <- proc.time() aaa1 - aaa G <- 100 k_real <- 4 p_k_real <- c(0.7, 0.1, 0.1, 0.1) dd = rep(c(0:(k_real-1)), p_k_real * G) result = MAP_test(est_result = est_result, Type = c(1:6), dd = dd, nn = 300) ss <- Summary_MAP(result) Index data_generation, 1 estimation_fun, 3 MAP_test, 5 Summary_MAP, 7 10
Source Exif Data:
File Type : PDF File Type Extension : pdf MIME Type : application/pdf PDF Version : 1.5 Linearized : No Page Count : 10 Page Mode : UseOutlines Author : Title : Subject : Creator : LaTeX with hyperref package Producer : pdfTeX-1.40.16 Create Date : 2019:06:17 13:05:04-06:00 Modify Date : 2019:06:17 13:05:04-06:00 Trapped : False PTEX Fullbanner : This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015) kpathsea version 6.2.1EXIF Metadata provided by EXIF.tools