Guide For Users User

User Manual:

Open the PDF directly: View PDF PDF.
Page Count: 4

DownloadGuide For Users User
Open PDF In BrowserView PDF
Guide for Users
Jingwen Yang
2018-11-07
Pacakge Installation
AnceTran is an R package that performs analyses of transcriptome evolution based on RNA-seq expression
data or ChIP-seq TF binding data. Here, we use HNF4A-binding data for 4 mice species as an example to
show how AnceTran works. A convenient way to install package from github is through devtools package:
install.packages('devtools')
devtools::install_github("jingwyang/AnceTran")
After installation, AnceTran can be loaded in the usual way:
library('AnceTran')

Input Format:
AnceTran package takes binding score data in certain format:
• Binding score file should be a text file in the matrix shape, Rows correspond to orthologous. Columns
correspond to sample names. Sample names are in format of “TaxaName_SubtaxaName_ReplicatesName”.
The example files are included in the AnceTran package, which can be found in extdata folder in the
package. One can load them in to take a look:

BindingScore.table =read.table(system.file('extdata','HNF4A_meanIntensity_4Mouse.txt',package = 'AnceTra
head(BindingScore.table[,1:5])
##
##
##
##
##
##
##

1
2
3
4
5
6

GeneID BL6_HNF4A CAST_HNF4A SPRET_HNF4A CAR_HNF4A
ENSMUSG00000000001 244.6250
338.4167
159.0
96.5000
ENSMUSG00000000003
0.0000
41.0000
0.0
0.0000
ENSMUSG00000000028 184.5000
199.6875
289.4 107.0000
ENSMUSG00000000037
0.0000
0.0000
41.0
20.0000
ENSMUSG00000000049 224.2632
179.7917
191.5 120.1875
ENSMUSG00000000056 266.2500
317.0769
141.4 204.8333

Construction:
The construction function TFconstruct loads in the BindingScore data file, and wraps them in a list of
taxonTF objects (one taxaTF object).

library('AnceTran')
taxa.objects = tTFConstruct(BSFile=system.file('extdata','HNF4A_meanIntensity_4Mouse.txt',package = 'Anc

1

The construction process takes several minutes on a desktop computer depending on data size and hardware
performance. Specify “taxa” and “subtaxa” options in the function when using partial of your data. The
construction process will be faster. If you are hesitated to test the AnceTran, the package has already
bundled a constructed object and you can load the object through:
data(TF.objects)

Data filtering and normalization
We excluded genes whose TF binding score equals to 0 in all species. To account for differences in sequencing depths between species, we quantile-normalized these binding score values across species and also
log-transformed the values for the further analysis.

library('limma')
TF_table = TFtab(objects = TF.objects, taxa = "all", tf = "all",rowindex = NULL, filtering = FALSE, norm
keep<-rowSums((TF_table == 0)) < ncol(TF_table)
TF_table<-TF_table[keep,]
TF_table<-data.frame(log2(normalizeQuantiles(TF_table[,])+1))

Distance matrix
First, we generate an TF-binding distance matrix of these mice species using sOU method:
library('ape')
dismat <- TFdist.sou(bsMat = TF_table)
colnames(dismat)=colnames(TF_table)
rownames(dismat)=colnames(dismat)
dismat
##
##
##
##
##

BL6_HNF4A
CAST_HNF4A
SPRET_HNF4A
CAR_HNF4A

BL6_HNF4A CAST_HNF4A SPRET_HNF4A CAR_HNF4A
0.0000000 0.0000000
0.0000000
0
0.3588558 0.0000000
0.0000000
0
0.4497102 0.4869901
0.0000000
0
0.6862219 0.7693106
0.6649105
0

TF-binding tree building
After the TF-binding distance matrix is created, you can construct character tree by Neighbor-Joining
method, and bootstrap values based on re-sampling orthologous genes with replacements can also be generated by boot.phylo function:
tf_tree <- NJ(dismat)
tf_tree <- root(tf_tree, outgroup = "CAR_HNF4A", resolve.root = T)
tf_tree <- no0br(tf_tree)
f <- function(xx) {
mat <- TFdist.sou(t(xx))
# the distance metrics here should be the same as you specified
# when you created the TF-binding distance matrix
2

colnames(mat) <- rownames(xx)
rownames(mat) <- colnames(mat)
root(NJ(mat), "CAR_HNF4A", resolve.root = T)
}
bs <-

boot.phylo(tf_tree, t(TF_table), f, B = 100)

##
Running bootstraps:
100 / 100
## Calculating bootstrap values... done.
tf_tree$node.label = bs
plot(tf_tree, show.node.label = TRUE)

CAR HNF4A

CAST HNF4A

100
100

BL6 HNF4A
100
SPRET HNF4A
By now, an TF-binding character tree is successfully constructed.
Creating variance co-variance matrix
var_mat <- varMatInv(dismat,TF_table,phy = tf_tree)

Ancestral TF-binding state estimation
Here, we extract the TF-binding values of gene MUP20 as an example:
mup20_binding <- TF_table[which(rownames(TF_table) == "ENSMUSG00000078672"),]
Then we infer the TF-binding scores at ancestral nodes of the TF-binding tree:
mup20_anc <- aee(mup20_binding, tf_tree, var_mat, select = "all")
Finally, we map these estimations on the 4 mice species tree to give a direct presentation of these values:

3

tf_tree$node.label <- sprintf("%.4f",mup20_anc$est)
tf_tree$tip.label <- paste0(tf_tree$tip.label, " ", sprintf("%.4f", mup20_binding))
plot(tf_tree, edge.color = "grey80", edge.width = 4,show.node.label = T,align.tip.label = T,main="Ancest

Ancestial HNF4A−Binding Estimation of Gene MUP20
CAR HNF4A 7.9461

CAST HNF4A 7.6619

7.6058
7.4973

BL6 HNF4A 7.3259
7.4697
SPRET HNF4A 7.1899

4



Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.5
Linearized                      : No
Page Count                      : 4
Page Mode                       : UseOutlines
Author                          : Jingwen Yang
Title                           : Guide for Users
Subject                         : 
Creator                         : LaTeX with hyperref
Producer                        : pdfTeX-1.40.19
Create Date                     : 2018:11:07 20:06:41+08:00
Modify Date                     : 2018:11:07 20:06:41+08:00
Trapped                         : False
PTEX Fullbanner                 : This is pdfTeX, Version 3.14159265-2.6-1.40.19 (TeX Live 2018) kpathsea version 6.3.0
EXIF Metadata provided by EXIF.tools

Navigation menu