
MATLAB Audio Database Toolbox
User Manual
Version 1.00, July 2008
MatlabADT was developed at the Signal and Image Processing Lab (SIPL),
Department of Electrical Engineering, Technion - IIT. All rights reserved.
(c) 2008, Technion - IIT.
Table of Contents
Introduction
Quick Start
Function Overview
Function Reference Guide
Usage Examples
Appendix - TIMIT/NTIMIT Fields

Introduction
MatlabADT (Audio Database Toolbox) enables easy access to audio databases such
as TIMIT and YOHO, and filtering of their entries by metadata. The toolbox
replaces the manual filtering and custom coding usually required for accessing
such databases. It saves you the time of learning each database's structure and
lets you focus on the algorithmic aspects of your code.
The following databases are supported:
1. TIMIT - Acoustic Phonetic Continuous Speech Corpus (American English).
   Supported search criteria: word, phoneme, usage, sex, dialect, speaker and
   sentence. For more information on TIMIT see the appendix at the end of this
   document.
2. NTIMIT - Telephone Network Acoustic Phonetic Continuous Speech Corpus.
   Supported search criteria: word, phoneme, usage, sex, dialect, speaker and
   sentence. For more information on NTIMIT see the appendix at the end of this
   document.
3. CTIMIT - Cellular Telephone Acoustic Phonetic Continuous Speech Corpus.
4. YOHO - Speaker Verification Corpus. Supported search criteria: usage,
   speaker, session, numbers.
5. TI-Digits - Speaker-independent recognition of connected digit sequences.
   Supported search criteria: usage, group, type, speaker and digit.
6. Children Voices - Hebrew speech.
7. Hebrew BGU - Hebrew word samples.
8. Gutenberg Books - MP3 format books.
For more information on database structures, see the database documentation
available on the SIPL site.
For any problem in using MatlabADT, please contact: matlab_adt@sipl.technion.ac.il

Quick Start
Installation outside of SIPL:
a. Extract the files and add the "MatlabADT" directory to your MATLAB path.
b. In MATLAB, execute the command:
   db = ADT('timit', 'C:\timit_path_on_your_computer', 'setup');
   This loads TIMIT and sets the default path to the directory entered.
   For NTIMIT or CTIMIT, run the corresponding commands.
c. To use other databases: in the MatlabADT\@gendb\instance directory, open the
   relevant subdirectory and change defalt_path.txt to the directory of that
   database on your computer.
Installation at SIPL workstations:
a. Add the "sipl_matlab_utils" directory to the MATLAB path by executing the
   command: addpath('\\piano\Data\sipl_matlab_utils'); or by using the MATLAB
   menus.
b. Execute the command timitdemo(); to check the installation.
Usage:
1. Load the desired database
Loading the database object:
Database name      Loading command
TIMIT              db = ADT('timit');
NTIMIT             db = ADT('ntimit');
CTIMIT             db = ADT('ctimit');
YOHO               db = gendb('yoho');
TI-Digits          db = gendb('tidigits');
Children Voices    db = gendb('childrenvoices');
Hebrew BGU         db = gendb('hebrewbgu');
Gutenberg Books    db = gendb('gutenbergbooks');
All operations on the database are performed through the database object, which
is passed to each function as its first parameter.
2. Query for the data you want
[wavdata] = query(db, 'dialect', 'dr1', 'word', {'she', 'it'}, 30);
This call to the query function returns, as a cell array, the wave data of the
first 30 occurrences of the words 'she' or 'it' from dialect 'dr1'.
3. Access the wave data
oneWord = wavdata{1};
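Because query returns one waveform per cell, per-entry statistics can be computed over the whole result with cellfun. The sketch below is only an illustration: it uses placeholder waveforms instead of a real query result, and the 16 kHz sampling rate is an assumption (in practice, take fs from the second output of query).

```matlab
% Placeholder for the output of query(); real calls return recorded speech.
fs = 16000;                                    % assumed sampling rate in Hz
wavdata = {zeros(4000, 1), zeros(8000, 1)};    % two stand-in waveforms

% Duration of every returned waveform, in seconds.
durations = cellfun(@numel, wavdata) / fs;

% First waveform, accessed as in step 3.
oneWord = wavdata{1};
```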

Function Overview
Function name    Description
ADT/gendb        Creates a database object
query            Returns file data from the database by filtering criteria
filterdb         Creates a subset of the database by filtering criteria
read             Returns file data from the database
play             Plays a database entry
getpath          Returns the path of a specific entry
length           Returns the number of entries in the object

Function Reference Guide
Note: the following usage examples use the TIMIT database.
query
Usage          [wave fs metadata] = query(db, [criterion1, value1,
               [criterion2, value2, …]], [max_returns]);
Arguments      Criteria list describing the query
Return value   wave       cell array of waveforms
               fs         sampling frequency
               metadata   structure array describing waveforms
Remarks
This function queries the database for entries matching the "criterion - value"
pairs. Criteria are the fields of the metadata defined by the database
structure. A short description of these fields for TIMIT/NTIMIT is given in the
appendix. Passing a maximum number of returned entries is optional.
The value parameter may be one of the following:
1. Criterion content: the query will return only entries matching this value.
   Example:
   query(db, 'sentence', 'SA1')
   returns all instances of sentence 'SA1'.
2. Criterion content negation: if the criterion value is preceded by ~ (tilde),
   the query will return only those entries that do not match the specified
   value.
   Example:
   query(db, 'dialect', '~dr2')
   returns all sentences of dialect regions other than dr2.
3. Cell array of criterion contents or criterion content negations: the query
   will return entries matching any one of the listed values.
   Example:
   wav = query(db, 'phoneme', {'d', 'p'});
   returns all instances of phonemes 'd' or 'p'.
4. Criterion content followed by * (asterisk): the query will return entries
   whose value starts with the given content.
   Example:
   wav = query(db, 'word', 'sh*');
   returns all instances of words starting with sh.
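The four value forms can be modelled in a few lines of plain MATLAB. This is an illustrative sketch only, not MatlabADT's own matching code: the handles match_plain, match_one and match_value are hypothetical names, and applying ~ to a pattern that also ends in * is an assumption the manual does not confirm.

```matlab
% Illustrative model of the value forms accepted by query; not toolbox code.

% Plain value, or prefix match when the pattern ends with '*'.
match_plain = @(v, p) (p(end) == '*' && strncmp(v, p, numel(p) - 1)) || ...
                      (p(end) ~= '*' && strcmp(v, p));

% A leading '~' inverts the result of the remaining pattern.
match_one = @(v, p) xor(p(1) == '~', ...
                        match_plain(v, p(1 + (p(1) == '~'):end)));

% A cell array of patterns matches when any single pattern matches;
% cellstr wraps a single character pattern in a cell.
match_value = @(v, patterns) ...
    any(cellfun(@(p) match_one(v, p), cellstr(patterns)));

is_sa1   = match_value('SA1', 'SA1');        % form 1: exact value
not_dr2  = match_value('dr1', '~dr2');       % form 2: negation
d_or_p   = match_value('p', {'d', 'p'});     % form 3: cell array
sh_start = match_value('she', 'sh*');        % form 4: prefix
```

Each of the four results above is true; the same handles return false for 'dr2' against '~dr2' and for 'it' against 'sh*'.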
filterdb
Usage          filtered_db = filterdb(db, [criterion1, value1,
               [criterion2, value2, …]], [max_returns]);
Arguments      Criteria list describing the query
Return value   Subset of the database passed as an argument
Remarks
Unlike the query function, filterdb does not load all waveforms into memory but
returns a filtered database object. This is useful when the result set of the
query is too big to fit in memory. Subsequent calls to the read function can be
made to read the content of this filtered database object.
See the Remarks section of the query function for a description of the criteria
arguments.
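One way to use a filtered database object is to read and process one entry at a time, so that only a single waveform is held in memory. The sketch below assumes that read(db, index) returns a one-element cell array, which the manual does not state explicitly; rms_of is a hypothetical helper, and the loop runs only when MatlabADT is found on the path.

```matlab
% Hypothetical per-entry statistic: root-mean-square amplitude of a waveform.
rms_of = @(x) sqrt(mean(double(x(:)) .^ 2));

if exist('ADT') ~= 0                  % skip the loop if the toolbox is absent
    db = ADT('timit');
    subset = filterdb(db, 'dialect', 'dr1', 'sex', 'f');
    energies = zeros(1, length(subset));
    for k = 1:length(subset)
        wave = read(subset, k);       % load a single entry
        energies(k) = rms_of(wave{1});
    end
end
```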
read
Usage          [wave fs metadata] = read(db[, index]);
Arguments      Database object and (optionally) index of the entry to be
               returned
Return value   wave       cell array of waveforms
               fs         sampling frequency
               metadata   structure array describing waveforms
Remarks
Returns the entry at position index in the database. If the index argument is
not specified, returns all entries from the database object passed as an
argument.
play
Usage          play(db, index);
Arguments      Database object and index of the entry to play
Return value   None
getpath
Usage          path = getpath(db, index);
Arguments      Database object and index of an entry
Return value   The path to the file containing the entry
length
Usage          n = length(db);
Arguments      Database object
Return value   Number of entries in the database
Usage Examples
Example 1
%TIMITDEMO - a TIMIT database interface demonstration.
%Reads text aloud by searching the TIMIT database for each word.
db = ADT('timit');              %load the TIMIT database interface
db = filterdb(db, 'sex', 'm');  %keep male speakers only
text = ['have a good time using this program and enjoy yourself ' ...
        'i hope this program is useful'];
volume = 1;                     %target peak amplitude
text = strread(text, '%s');     %convert the text from a string to a cell array
for ii = 1:length(text)
    %query for word ii, returning the first match only
    [word smpr] = query(db, 'word', text{ii}, 1);
    if ~isempty(word)
        max_v = max(abs(word{1}));              %peak amplitude of the word
        word{1} = word{1} .* (volume / max_v);  %normalize the volume
        sound(word{1}, smpr);   %play the word (query returns a cell array)
        pause(0.2);             %pause for 0.2 s
    end
end
Example 2
%GENDBDEMO - a demonstration of gendb
%load the YOHO database
db = gendb('yoho');
%filter the YOHO database, keeping the first 3 entries matching the query
db = filterdb(db, 'usage', 'verify', 'speaker', '10*', 3);
%load the Hebrew BGU database
db2 = gendb('hebrewbgu');
%select a random index
index = fix(rand(1) * length(db2)) + 1;
%play the entry at that index
play(db2, index);
%read wave data from the filtered YOHO database
wavedata = read(db);
%plot entry number 2
plot(wavedata{2});

Appendix - TIMIT/NTIMIT Fields
Field      Content                        Description
usage      train, test                    Train/test set defined by TIMIT
dialect    dr1, dr2, dr3, dr4, dr5,       Dialect regions as defined in TIMIT
           dr6, dr7, dr8                  documentation
sex        m, f                           Gender of the speaker
speaker    Four alphanumeric characters   The id of the speaker as given by TIMIT
sentence   Up to 5 characters. Format     Sentence id as defined by TIMIT
           defined in TIMIT
           documentation
word       Word                           Any word from a sentence present in
                                          TIMIT
phoneme    Phoneme                        Any phoneme from a word present in
                                          TIMIT. Phoneme codes can be found in
                                          TIMIT documentation
TIMIT/NTIMIT database entries can take the form of sentences, words or
phonemes.
The query and read functions return a cell array of waveforms. Each cell
contains the waveform of an entire sentence, word or phoneme, depending on
whether the query result is a sentence, word or phoneme. The query result is
determined by the "finest" term in the query.
Query example                        Finest term in      Content of each cell in
                                     a query             a returned cell array
query(db, 'sentence', 'SA1',         'sentence',         Sentence
  'dialect', 'dr1', 'sex', 'm')      'dialect', 'sex'
query(db, 'sentence', 'SA1',         'word'              Word
  'dialect', 'dr1', 'sex', 'm',
  'word', 'she')
query(db, 'sentence', 'SA1',         'phoneme'           Phoneme
  'dialect', 'dr1', 'sex', 'm',
  'phoneme', 's')
query(db, 'sentence', 'SA1',         'word'              Word
  'dialect', 'dr1', 'sex', 'm',
  'word', '#all')
query(db, 'sentence', 'SA1',         'phoneme'           Phoneme
  'dialect', 'dr1', 'sex', 'm',
  'phoneme', '#all')
The special value '#all' can be used to force the finest term in a query
without actually filtering by that criterion. For example,
query(db, 'sentence', 'SA1', 'dialect', 'dr1', 'sex', 'm', 'phoneme', '#all')
will return all phonemes in the requested sentence.
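The "finest term" rule can be expressed as a small lookup over the criterion names of a query. The sketch below is an illustration of the rule as described in this appendix, not code from the toolbox; finest_term and levels are hypothetical names.

```matlab
% Hypothetical helper: name the finest term implied by a list of criterion
% names. Any criterion other than 'word' or 'phoneme' is sentence level.
levels = {'sentence', 'word', 'phoneme'};      % coarse to fine
finest_term = @(criteria) levels{max([1, ...
    2 * any(strcmp(criteria, 'word')), ...
    3 * any(strcmp(criteria, 'phoneme'))])};

% Criterion names sit at the odd positions of a query argument list.
args = {'sentence', 'SA1', 'dialect', 'dr1', 'sex', 'm', 'phoneme', '#all'};
finest = finest_term(args(1:2:end));
```

For the argument list above, finest is 'phoneme', so each returned cell would hold one phoneme of the requested sentence.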











