SKFlatAnalyzer User Guide
Jae Sung Kim
Seoul National University
jae.sung.kim@cern.ch
January 30, 2019
Contents
1 Introduction
1.1 SKFlat
1.2 SKFlatAnalyzer
2 Directories
2.1 DataFormats/
2.2 Analyzers/
2.3 include/
2.4 src/
2.5 data/$SKFlatV
2.6 python/
2.7 script/
2.8 lib/
3 Structure
3.1 Analyzer inheritance
3.2 Physics object inheritance
4 Analyzer class and submission command
4.1 SKFlatNtuple
4.2 AnalyzerCore
4.3 MyAnalyzer
4.4 Job macro
4.5 SKFlat.py
4.6 Simple way for debugging
5 Macro run order
5.1 Example of a macro with comments inline
6 Migration from CATAnalyzer
7 Rules for developers
7.1 File/Function/Variable names are important
7.2 Equality operator between float or double
7.3 std::map is good, but be careful
7.4 When using random variables
1 Introduction
1.1 SKFlat
A flat ntuple
Use MiniAOD as an input
GitHub link : https://github.com/CMSSNU/SKFlatMaker
1.2 SKFlatAnalyzer
ROOT6 based analyzer
SNU (tamsa1), KISTI and KNU batch systems are supported by the same
submission commands (as of 2019.01.22)
Use SKFlat as an input
Run over each event, and do the analysis!!
Construct physics objects using branch elements:
Muon mu;
double rc = muon_roch_sf->at(i);
double rc_err = muon_roch_sf_up->at(i);
mu.SetMiniAODPt(muon_pt->at(i));
mu.SetPtEtaPhiM(muon_pt->at(i)*rc, muon_eta->at(i), muon_phi->at(i), muon_mass->at(i));
GitHub link : https://github.com/CMSSNU/SKFlatAnalyzer
2 Directories
2.1 DataFormats/
Physics objects
2.2 Analyzers/
Ntuple handler (SKFlatNtuple) and Analyzers
2.3 include/
Header files
2.4 src/
Source files (define class, functions, ...)
2.5 data/$SKFlatV
Various data files including .root, .txt (e.g., fake rates, scale factors, ...).
Defined as an environment variable, $DATA_DIR.
2.6 python/
Python scripts for job submission
2.7 script/
Any useful scripts
2.8 lib/
Compiled shared libraries are moved here
3 Structure
3.1 Analyzer inheritance
Figure 1: Diagram of analyzer inheritance.
3.2 Physics object inheritance
Figure 2: Diagram of physics object inheritance.
4 Analyzer class and submission command
Every analyzer inherits AnalyzerCore
AnalyzerCore inherits SKFlatNtuple
4.1 SKFlatNtuple
Almost the same as the output of TTree::MakeClass()
SKFlatNtuple::Loop() loops over each event
4.2 AnalyzerCore
Inherits SKFlatNtuple
Includes header files of physics object classes
Physics analysis functions
std::vector<Muon> AnalyzerCore::GetAllMuons(); // return all muons
std::vector<Muon> AnalyzerCore::GetMuons(TString id, double ptmin, double fetamax); // return muons passing ID selection
std::vector<Muon> AnalyzerCore::SelectMuons(std::vector<Muon> muons, TString id, double ptmin, double fetamax); // select muons passing id out of pre-collected muon collections
Histogram related functions
FillHist(TString histname, double value, double weight, int n_bin, double x_min, double x_max); // histogram is saved in the default directory of the output root file
JSFillHist(TString suffix, TString histname, double value, double weight, int n_bin, double x_min, double x_max); // histogram is saved in the directory named suffix of the output root file
(Example)
vector<Electron> electrons = GetElectrons(param.Electron_Tight_ID, 10., 2.5);
for(unsigned int i=0; i<electrons.size(); i++){
  Electron el = electrons.at(i);
  FillHist("RelIso", el.RelIso(), 1, 100, 0., 1.);
  JSFillHist(param.Electron_Tight_ID, "RelIso_"+param.Electron_Tight_ID, el.RelIso(), 1, 100, 0., 1.);
}
Even if two histograms are in different directories, ROOT prints a warning
when their names are the same: “Warning in <TFile::Append>:
Replacing existing TH1: RelIso (Potential memory leak).”
So I recommend adding the directory name as a prefix/suffix of the
histogram name:
Instead of “RelIso” alone, use “RelIso_”+<Directory Name>
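One way to follow this recommendation consistently is to build the name in a small helper. The sketch below is only an illustration in plain C++ (std::string instead of TString); the function MakeHistName is hypothetical and is not part of SKFlatAnalyzer.

```cpp
#include <string>

// Hypothetical helper (not part of SKFlatAnalyzer): append the directory
// name to the base histogram name so names stay unique across directories.
std::string MakeHistName(const std::string& base, const std::string& dirname) {
  if (dirname.empty()) return base;  // default directory: keep the plain name
  return base + "_" + dirname;
}
```

With this, MakeHistName("RelIso", "TightID") gives "RelIso_TightID", which could be passed as the histname argument.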
4.3 MyAnalyzer
Inherits AnalyzerCore
Run by the job macro
4.4 Job macro
The macro is created automatically by the SKFlat.py command
MyAnalyzer object is declared
Input sample information ([DATA] DataStream /[MC] Sample name,
input files, xsec, sumW) is set
Output file path is set
SKFlatNtuple::Init() is run : Initializing branch element variables
AnalyzerCore::initializeAnalyzer() is run
This function is virtual, and can be redefined in ExampleRun
Anything you want to do before the event loop can be done here
User flags are supported by python/SKFlat.py, via the option
“--userflags flag1,flag2,flag3”
The existence of a flag can be checked by using
AnalyzerCore::HasFlag(TString flag)
SKFlatNtuple::Loop() is run : loop over events
AnalyzerCore::WriteHist() is run : write histograms to the output file
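The user-flag mechanism can be pictured as follows. This is a minimal sketch, assuming the comma-separated string given to the userflags option is split into a list and that a flag check is a simple membership test. SplitFlags and the free function HasFlag below are illustrative stand-ins, not the actual AnalyzerCore implementation.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Split a comma-separated flag string such as "flag1,flag2,flag3".
std::vector<std::string> SplitFlags(const std::string& csv) {
  std::vector<std::string> flags;
  std::stringstream ss(csv);
  std::string item;
  while (std::getline(ss, item, ',')) {
    if (!item.empty()) flags.push_back(item);  // ignore empty entries
  }
  return flags;
}

// Stand-in for a flag check: true if the flag is in the list.
bool HasFlag(const std::vector<std::string>& flags, const std::string& flag) {
  return std::find(flags.begin(), flags.end(), flag) != flags.end();
}
```

A typical use is to evaluate such a check once in initializeAnalyzer() and store the result in a bool member, so the string search is not repeated for every event.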
4.5 SKFlat.py
Script for batch job submission
4.6 Simple way for debugging
When debugging, it is better to use the master node rather than the batch
system. Here is a quick instruction to create a debugging macro script.
Let’s say you want to debug ‘MyAnalyzer’. Then, run (-n 10 can be any
number) :
SKFlat.py -a MyAnalyzer -n 10 -i <sample> -y <year> --no_exec
Go to the job directory. It should be :
$SKFlatRunlogDir/<MyAnalyzer>_<TIMESTAMP>Year<year>_<sample>_<machine>
If in KISTI, you will have run_XYZ.C’s. If in SNU or KNU, you will have
job_XYZ/run.C’s. Copy one of them (let’s call it “run.C”) to $SKFlat_WD.
If in SNU or KNU, edit the paths for libDataFormats.so and libAnalyzers.so
as follows:
R__LOAD_LIBRARY(./lib/libDataFormats.so)
R__LOAD_LIBRARY(./lib/libAnalyzers.so)
Now, “run.C” uses the libraries in $SKFlat_WD/lib, which are updated when
you run “make”. So, edit your code, compile, and then run “root -l -b -q
run.C” in $SKFlat_WD.
Here are some useful lines you can add to “run.C” :
“m.MaxEvent = 1000;” : run 1000 events only
“m.NSkipEvent = 10;” : skip the first 10 events, and then run “m.MaxEvent”
events. If “m.MaxEvent” is not set, run to the end.
“m.LogEvery = 2;” : print the current event number every 2 events.
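How the three settings combine can be sketched as below. This is not the actual SKFlatNtuple::Loop() code; EventRange and ShouldLog are hypothetical helpers, and the sketch assumes that a non-positive value means the setting was not given.

```cpp
#include <algorithm>
#include <utility>

// Hypothetical helper: entries [first, last) to process, given the total
// number of entries, the number to skip, and the maximum number to run.
std::pair<long, long> EventRange(long n_entries, long n_skip, long max_event) {
  long first = std::min(std::max(n_skip, 0L), n_entries);
  long last = n_entries;  // MaxEvent not set: run to the end
  if (max_event > 0) last = std::min(first + max_event, n_entries);
  return {first, last};
}

// Hypothetical helper: print the event number every log_every events.
bool ShouldLog(long i_event, long log_every) {
  return log_every > 0 && i_event % log_every == 0;
}
```

For a 5000-entry file with NSkipEvent = 10 and MaxEvent = 1000, this gives entries 10 to 1009; if the file is shorter than that, the loop simply stops at the last entry.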
5 Macro run order
5.1 Example of a macro with comments inline
R__LOAD_LIBRARY(libPhysics.so)
R__LOAD_LIBRARY(libTree.so)
R__LOAD_LIBRARY(libHist.so)
R__LOAD_LIBRARY(./lib/libDataFormats.so)
R__LOAD_LIBRARY(./lib/libAnalyzers.so)
void run(){
  //==== Declaring an analyzer class immediately runs the following, in order:
  //==== 1) Constructor of SKFlatNtuple is called
  //==== 2) Constructor of AnalyzerCore is called
  //==== 3) Constructor of ExampleRun is called
  ExampleRun m;
  //==== SKFlat ntuple directory structure
  m.SetTreeName("recoTree/SKFlat");
  //==== DATA or MC?
  m.IsDATA = true;
  //==== If DATA, PD name
  m.DataStream = "SingleMuon";
  //==== DATA year
  m.DataYear = 2016;
  //==== Files to be run with this macro
  m.AddFile("SKFlatNtuple_2016_DATA_100.root");
  //==== Output root file path
  m.SetOutfilePath("hists.root");
  //==== SKFlatNtuple::Init(), which does SetBranchAddress()
  m.Init();
  //==== AnalyzerCore::initializeAnalyzerTools(): read histograms, or
  //==== initialize MCCorrection helpers or data-driven estimators
  m.initializeAnalyzerTools();
  //==== Any initialization just before running the event loop. This is
  //==== only run once within a macro. For example, you should run
  //==== AnalyzerCore::HasFlag() here. More examples can be found HERE
  m.initializeAnalyzer();
  //==== Finally, run the event loop
  m.Loop();
  //==== All events have been run. Now write histograms to the output
  //==== root file
  m.WriteHist();
}
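The constructor order and the virtual initializeAnalyzer() call mentioned in the macro comments follow from ordinary C++ inheritance rules. The sketch below reproduces them with toy classes; ToyNtuple, ToyCore and ToyRun are stand-ins invented for this illustration (playing the roles of SKFlatNtuple, AnalyzerCore and ExampleRun) and contain none of the real code.

```cpp
#include <string>
#include <vector>

std::vector<std::string> call_log;  // records the order of calls

struct ToyNtuple {  // plays the role of SKFlatNtuple
  ToyNtuple() { call_log.push_back("ToyNtuple()"); }
  virtual ~ToyNtuple() {}
};

struct ToyCore : ToyNtuple {  // plays the role of AnalyzerCore
  ToyCore() { call_log.push_back("ToyCore()"); }
  virtual void initializeAnalyzer() {
    call_log.push_back("ToyCore::initializeAnalyzer");
  }
};

struct ToyRun : ToyCore {  // plays the role of ExampleRun
  ToyRun() { call_log.push_back("ToyRun()"); }
  void initializeAnalyzer() override {
    call_log.push_back("ToyRun::initializeAnalyzer");
  }
};
```

Declaring "ToyRun m;" fills call_log with the base-class constructors first and the derived one last, and calling initializeAnalyzer() through a ToyCore reference still reaches the ToyRun version because the function is virtual.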
6 Migration from CATAnalyzer
Direct copy from CATAnalyzer codes to SKFlatAnalyzer won’t work, but
here are some tips.
CATAnalyzer: FillHist(histname, variable, weight, x_min, x_max, n_bin)
SKFlatAnalyzer: FillHist(histname, variable, weight, n_bin, x_min, x_max)
: follows the argument order of the TH1 constructor in ROOT
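One reason to check this carefully when migrating: because int and double convert implicitly in C++, a call written in the old argument order can still compile and silently produce wrong binning. The sketch below shows the effect with a toy function; Binning and ToyBinning are hypothetical names used only here, not SKFlatAnalyzer code.

```cpp
// Toy stand-in taking the binning arguments in the SKFlatAnalyzer order.
struct Binning {
  int n_bin;
  double x_min;
  double x_max;
};

Binning ToyBinning(int n_bin, double x_min, double x_max) {
  return {n_bin, x_min, x_max};
}
```

ToyBinning(100, 0., 1.) gives 100 bins in [0, 1], while the old-order call ToyBinning(0., 1., 100) also compiles but yields 0 bins in [1, 100]. Searching migrated code for FillHist calls and checking each one by eye is therefore safer than relying on the compiler.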
7 Rules for developers
Some rules you should follow, if you want to make a pull request to the
master branch.
7.1 File/Function/Variable names are important
Let’s spend enough time on naming new files, functions, and variables. Good
naming makes programming efficient.
7.2 Equality operator between float or double
Guess what you would get from “root -l -b -q test.C” with the code below.
float GetFatJetSF(float tau21cut){
  if(tau21cut == 0.45){
    return 0.45;
  }
  if(tau21cut == 0.6){
    return 0.6;
  }
  else{
    return 1.;
  }
}
void test(){
  cout << "Value : " << GetFatJetSF(0.45) << endl;
}
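The result is “Value : 1”. The literal 0.45 is a double, so the float argument
is promoted to double before the comparison, and the float nearest to 0.45 is
not the same number as the double nearest to 0.45. It works properly if you
change float GetFatJetSF(float tau21cut) to float GetFatJetSF(double tau21cut).
However, it is NOT recommended to apply the equality operator between
floating-point values. If you really need it, you can test |A − B| < ε with a
very small ε (e.g., 0.001).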
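A tolerance-based comparison of this kind can be written as a small helper. This is a minimal sketch; IsClose and GetFatJetSFSafe are hypothetical names chosen for the illustration, and the tolerance 0.001 is the example value quoted in the text, to be adapted to the spacing of the cut values actually used.

```cpp
#include <cmath>

// True if a and b differ by less than eps.
bool IsClose(double a, double b, double eps = 0.001) {
  return std::fabs(a - b) < eps;
}

// Same logic as the example in this section, but comparing with a tolerance.
float GetFatJetSFSafe(float tau21cut) {
  if (IsClose(tau21cut, 0.45)) return 0.45f;
  if (IsClose(tau21cut, 0.6)) return 0.6f;
  return 1.f;
}
```

With this version the float argument 0.45f selects the first branch even though it is not bit-for-bit equal to the double literal 0.45.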
7.3 std::map is good, but be careful
We use a lot of std::map in the analyzer; histograms from the ROOT files for
MCCorrection are stored as “std::map<TString, TH1D> histmap”, and a histogram
can be accessed by “histmap[key]”. But if you store very many histograms in
the map, obtaining “histmap[mykey]” becomes slow, because every lookup has to
compare “mykey” against stored keys (std::map is a sorted tree, so the number
of string comparisons grows with the logarithm of the map size rather than
linearly, but the cost is paid on every call). If you have saved thousands of
fake-rate histograms into a map and run a fake estimation, it will take a very
long time. If you are applying muon scale factors, “map_hist_Muon[YOUR_ID]” is
run for each event and each muon. If you wrote too many IDs in
ID/Muon/histmap.txt, you will waste time searching through unnecessary keys.
To save time, you can add a “#” at the beginning of a line in
“ID/Muon/histmap.txt” (i.e., deactivating it) :
ID SF NUM_MediumID_DEN_genTracks RunAveraged_SF_ID.root NUM_MediumID_DEN_genTracks_eta_pt
#ID SF NUM_MediumID_DEN_genTracks RunAveraged_SF_ID.root NUM_MediumID_DEN_genTracks_eta_pt
Then the histogram for the Medium ID will not be saved in the histmap.
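The effect of the “#” prefix can be sketched as a line filter. This is only an illustration of the idea, assuming that a line is skipped when its first character is “#”; ActiveLines is a hypothetical function, not the parser used in MCCorrection.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Collect the active (non-commented, non-empty) lines of a histmap-style
// text stream. Lines starting with '#' are treated as deactivated.
std::vector<std::string> ActiveLines(std::istream& in) {
  std::vector<std::string> lines;
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty() || line[0] == '#') continue;
    lines.push_back(line);
  }
  return lines;
}
```

Only the lines returned here would be turned into map entries, so every deactivated line means one key fewer to search through in each lookup.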
7.4 When using random variables..
Some functions use random variables (e.g., smearing from a distribution).
If you use the default random seed, your results can change every time you
run the analyzer. The easiest way to avoid this issue is to use a combination
of RunNumber and EventNumber as a seed, e.g., seed = RunNumber ×
1000000000 + EventNumber.
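The seeding recipe can be sketched as follows. The example uses the standard library generator std::mt19937_64 as a stand-in for whatever generator the analyzer actually uses; EventSeed and SmearingValue are hypothetical names, and the unit Gaussian is only a placeholder for a real smearing distribution.

```cpp
#include <cstdint>
#include <random>

// Seed built from the run and event numbers, as suggested in the text.
std::uint64_t EventSeed(std::uint64_t run_number, std::uint64_t event_number) {
  return run_number * 1000000000ULL + event_number;
}

// Draw one Gaussian random number that depends only on (run, event),
// so repeated runs over the same event give the same value.
double SmearingValue(std::uint64_t run_number, std::uint64_t event_number) {
  std::mt19937_64 rng(EventSeed(run_number, event_number));
  std::normal_distribution<double> gaus(0., 1.);
  return gaus(rng);
}
```

Because the generator is re-seeded per event, the value no longer depends on how many events were processed before it or on how the input files were split across jobs.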