MATLAB_ADTx Matlab ADT User Manual
User Manual:
Open the PDF directly: View PDF
.
Page Count: 12
| Download | |
| Open PDF In Browser | View PDF |
MATLAB Audio Database Toolbox
User Manual
Version 1.00, July 2008
MatlabADT was developed at the Signal and Image Processing Lab (SIPL),
Department of Electrical Engineering, Technion ‐ IIT, all rights resevered.
(c) 2008, Technion – IIT.
1
Table of Contents
Introduction ...................................................................................................................3
Quick Start .....................................................................................................................4
Function Overview ..........................................................................................................6
Function Reference Guide ...............................................................................................7
Usage Examples............................................................................................................ 10
Appendix ‐ TIMIT/NTIMIT Fields ..................................................................................... 11
2
Introduction
MatlabADT (Audio Database Toolbox) enables easy access and filtering of audio
databases such as TIMIT and YOHO by their metadata. The database toolbox comes
to replace the manual filtering and custom coding usually required for accessing such
databases. This toolbox will save you the learning time of the database structure and
will enable you to focus on algorithmic aspects of your code.
The following databases are supported:
1. TIMIT ‐ Acoustic Phonetic Continuous Speech Corpus (American‐English).
Supported search criteria: word, phoneme, usage, sex, dialect, speaker and
sentence. For more information on TIMIT see the appendix at the end of this
document.
2. NTIMIT ‐ Telephone Network Acoustic Phonetic Continuous Speech Corpus.
Supported search criteria: word, phoneme, usage, sex, dialect, speaker and
sentence. For more information on NTIMIT see the appendix at the end of this
document.
3. CTIMIT ‐ Cellular Telephone Acoustic Phonetic Continuous Speech Corpus.
4. YOHO ‐ Speaker Verification Corpus. Supported search criteria: usage, speaker,
session, numbers.
5. TI‐Digits ‐ Speaker‐Independent recognition of connected digit sequences.
Supported search criteria: usage, group, type, speaker and digit.
6. Children Voices ‐ Hebrew Speech.
7. Hebrew BGU ‐ Hebrew word samples were.
8. Gutenberg Books ‐ MP3 format books.
For more information on database structures see the database documentation
available on the SIPL site.
For any problem in using MatlabADT please contact: matlab_adt@sipl.technion.ac.il
3
Quick Start
Installation outside of SIPL:
a. Extract the files and add "MatlabADT" directory to your MATLAB path.
b. In MATLAB, execute the command:
db = ADT('timit','C:\timit_path_on_your_computer','setup');
This will load TIMIT and set the default path to the directory entered.
For NTIMIT or CTIMIT run corresponding commands.
c. In case you want to use other databases: in MatlabADT\@gendb\instance
directory enter the sub directories and change defalt_path.txt to the
directory of the specific database on your computer.
Installation at SIPL workstations:
a. Add the "sipl_matlab_utils" directory to the MATLAB path by executing the
command: addpath('\\piano\Data\sipl_matlab_utils'); or by using MATLAB
menus.
b. Execute the command timitdemo(); to check installation.
Usage:
1. Load the desired database
Loading the database object:
Database name
TIMIT
NTIMIT
CTIMIT
YOHO
Ti‐Digits
Children Voices
Hebrew BGU
Gutenberg Books
Loading command
db = ADT('timit');
db = ADT('ntimit');
db = ADT('ctimit');
db = gendb('yoho');
db = gendb('tidigits');
db = gendb(' children voices');
db = gendb(' hebrew bgu');
db = gendb(' gutenberg books');
ll operations on the database will be performed using the database object which is
passed to them as the first parameter.
2. Make a query for your wanted data
[wavdata] = query(db,'dialect','dr1','word',{'she','it'},30);
This call to the query function will return the wave data of the first 30 words 'she'
or 'it' form dialect 'dr1' in the form of a cell array
4
3. Accessing the wave data
oneWord = wavdata{1};
5
Function Overview
Function name
Description
ADT/gendb
Creates database object
query
Returns file data form database by filtering criteria
filterdb
Creates a subset of the database by filtering criteria
read
Returns file data form database
play
Plays a database entire
getpath
Returns the path of a specific entire
length
Returns the number of entries in the object
6
Function Reference Guide
Note the following usage examples are using the TIMIT database.
query
Usage
[wave fs metadata] = query(db, [criterion1,
value1, [criterion2,value2,…]],[max_returns]);
Arguments
Criteria list describing the query
Return value
wave
cell array of waveforms
metadata
structure array describing waveforms
fs
sampling frequency
Remarks
This function queries the database for entries matching the "criterion – value" pairs.
Criteria are the fields of the metadata defined by database structure. Following is a
short description of these fields. It is optional to pass maximum amount of returned
entries.
value parameter may be one of the following:
1. criterion content: the query will return only entries matching this value
Example:
query(db, 'sentence', 'SA1')
returns all instances of sentence 'SA1'.
2. criterion content negation: if criterion value is preceded by ~ (tilde) then the
query will return only those entries that does not match the specified value:
Example:
query (db, 'dialect', '~dr2')
returns all sentences of dialect region other than dr2.
3. Cell array of criteria content or criteria content negation: the query will
return entries matching either one of criteria.
7
Example:
wav = query(db, 'phoneme', {'d', 'p'});
returns all instances of phonemes 'd' or 'p'
4. criterion content *: either one of criteria.
Example:
wav = query(db, 'word', ‘sh*’);
returns all instances of words starting with sh
filterdb
Usage
filtered_db = filterdb(db, [criterion1,
value1, [criterion2, value2,…]], max_returns);
Arguments
Criteria list describing the query
Return value
Subset of timit database passed as an argument
Remarks
Unlike query function, filterdb does not load all waveforms to memory but
returns filtered database object. This is useful when the resulting set of the query is
too big to fit in memory. Consequent calls to read function can be made to read
content of this filtered database object.
See Remarks section of query function for the description of criteria arguments.
read
Usage
[wave fs metadata] = read(db[, index]);
Arguments
timit database object and (optionally) index of the entry to be
returned
Return value
wave
cell array of waveforms
metadata
structure array describing waveforms
fs
sampling frequency
8
Remarks
Returns index entry in the database. If index argument is not specified, returns all
entries from the database object passed as an argument.
play
Usage
play(db,index);
Arguments
database object, and index of the entire to play
Return value
null
getpath
Usage
path = getpath(db,index);
Arguments
database object, and index to an entire
Return value
The path to the file containing the entire
length
Usage
n = length(db);
Arguments
database object
Return value
Number of entries in the database
9
Usage Examples
Example 1
%TIMITDEMO - a timit database interface demonstration,
%Reads text by searching TIMIT database for the words.
db =ADT('timit');
%loads TIMIT database interface
db = filterdb(db,'sex','m'); %filters database for male speakers
text = 'have a good time using this program and enjoy yourself i hope
this program is useful';
volume = 1; %assigning the volume
text =strread(text,'%s');%converting Text form string to cell array
for ii=1:length(text)
[word smpr] = query(db,'word',text{ii},1);%querying for the ii
word returnig the first match
if ~isempty(word)
max_v = max(word{1});
%normalizing the volume
word{1}=word{1}.*(volume/max_v); %normalizing the volume
sound(word{1},smpr);%play word (words return form query in
the form of cell array
pause(0.2);
%pauses for 0.2s
end
end
Example 2
%GENDBDEMO a demonstration of gendb
%loading YOHO database
db = gendb('yoho');
%filtering YOHO database returning 3 first eateries matching the
query
db = filterdb(db,'usage','verify','speaker','10*',3);
%loading Hebrew BGU database
db2 = gendb('Hebrew BGU');
%selecting a random index
index = fix(rand(1)*length(db2))+1;
%playing index entire
play(db2,index);
%reading wave data from filtered YOHO database
wavedata = read(db);
%plotting entire number 2
plot(wavedata{2});
10
Appendix - TIMIT/NTIMIT Fields
Filed
Content
Description
Usage
train, test
Train/test set defined by TIMIT
dialect
dr1, dr2, dr3, dr4, dr5, dr6, dr7, dr8
Dialect regions as defined in TIMIT
documentation
Sex
m,f
Gender of the speaker
speaker
Four alphanumeric characters
The id of the speaker as given by TIMIT
sentence
Up to 5 characters. Format defined in
TIMIT documentation
Sentence id as defined by TIMIT
word
Word
Any word from a sentence present in
TIMIT
phoneme
Phoneme
Any phoneme from a word present in
TIMIT. Phoneme codes can be found in
TIMIT documentation
The TIMIT/NTIMIT database eateries can take the form of sentences words or
phonemes.
The query or read functions will return a cell array, The returned cell array of
waveforms will contain waveforms of entire sentences, words or phonemes, depends
whether the query result is sentence, word or phoneme. The query result is determined by
the "finest" query term in a query.
Query example
Finest term
in a query
query(db, 'sentence', 'SA1',
Sentence
'dialect', 'dr1', 'sex', 'm')
'sentence',
'dialect', 'sex'
query(db, 'sentence', 'SA1',
'word'
Word
'dialect', 'dr1', 'sex', 'm',
'word', 'she')
11
Content of each cell in a
returned cell array
query(db, 'sentence', 'SA1',
'phoneme'
Phoneme
'word'
Word
'phoneme'
Phoneme
'dialect', 'dr1', 'sex', 'm',
'phoneme', 's')
query(db, 'sentence', 'SA1',
'dialect', 'dr1', 'sex', 'm',
'word', '#all')
query(db, 'sentence', 'SA1',
'dialect', 'dr1', 'sex', 'm',
'phoneme', '#all')
Special value '#all' can be used to force finest term in a query without actually filtering by
that criterion. For example query(db, 'sentence', 'SA1', 'dialect', 'dr1', 'sex',
'm', 'phoneme', '#all') will return all phonemes in the requested sentence.
12
Source Exif Data:
File Type : PDF File Type Extension : pdf MIME Type : application/pdf PDF Version : 1.5 Linearized : Yes Author : Yair & Cheli Create Date : 2009:02:09 22:10:48+02:00 Modify Date : 2009:02:11 09:45:26+02:00 XMP Toolkit : Adobe XMP Core 4.2.1-c041 52.342996, 2008/05/07-20:48:00 Creator Tool : PScript5.dll Version 5.2.2 Metadata Date : 2009:02:11 09:45:26+02:00 Format : application/pdf Title : Microsoft Word - MATLAB_ADT.docx Creator : Yair & Cheli Producer : Acrobat Distiller 9.0.0 (Windows) Document ID : uuid:9a670ebb-392e-4837-9586-1e6ef71cc90e Instance ID : uuid:c1e96b48-3e83-4c9b-aab4-8ddce6b6edc7 Page Count : 12EXIF Metadata provided by EXIF.tools