MATLAB_ADTx Matlab ADT User Manual

User Manual:

Open the PDF directly: View PDF PDF.
Page Count: 12

DownloadMATLAB_ADTx Matlab ADT - User Manual
Open PDF In BrowserView PDF
MATLAB Audio Database Toolbox
User Manual
Version 1.00, July 2008

MatlabADT was developed at the Signal and Image Processing Lab (SIPL),
Department of Electrical Engineering, Technion ‐ IIT, all rights resevered.
(c) 2008, Technion – IIT.

1

Table of Contents
Introduction ...................................................................................................................3
Quick Start .....................................................................................................................4
Function Overview ..........................................................................................................6
Function Reference Guide ...............................................................................................7
Usage Examples............................................................................................................ 10
Appendix ‐ TIMIT/NTIMIT Fields ..................................................................................... 11

2

Introduction
MatlabADT (Audio Database Toolbox) enables easy access and filtering of audio
databases such as TIMIT and YOHO by their metadata. The database toolbox comes
to replace the manual filtering and custom coding usually required for accessing such
databases. This toolbox will save you the learning time of the database structure and
will enable you to focus on algorithmic aspects of your code.

The following databases are supported:
1. TIMIT ‐ Acoustic Phonetic Continuous Speech Corpus (American‐English).
Supported search criteria: word, phoneme, usage, sex, dialect, speaker and
sentence. For more information on TIMIT see the appendix at the end of this
document.
2. NTIMIT ‐ Telephone Network Acoustic Phonetic Continuous Speech Corpus.
Supported search criteria: word, phoneme, usage, sex, dialect, speaker and
sentence. For more information on NTIMIT see the appendix at the end of this
document.
3. CTIMIT ‐ Cellular Telephone Acoustic Phonetic Continuous Speech Corpus.
4. YOHO ‐ Speaker Verification Corpus. Supported search criteria: usage, speaker,
session, numbers.
5. TI‐Digits ‐ Speaker‐Independent recognition of connected digit sequences.
Supported search criteria: usage, group, type, speaker and digit.
6. Children Voices ‐ Hebrew Speech.
7. Hebrew BGU ‐ Hebrew word samples were.
8. Gutenberg Books ‐ MP3 format books.

For more information on database structures see the database documentation
available on the SIPL site.

For any problem in using MatlabADT please contact: matlab_adt@sipl.technion.ac.il

3

Quick Start
Installation outside of SIPL:
a. Extract the files and add "MatlabADT" directory to your MATLAB path.
b. In MATLAB, execute the command:
db = ADT('timit','C:\timit_path_on_your_computer','setup');
This will load TIMIT and set the default path to the directory entered.
For NTIMIT or CTIMIT run corresponding commands.
c. In case you want to use other databases: in MatlabADT\@gendb\instance
directory enter the sub directories and change defalt_path.txt to the
directory of the specific database on your computer.
Installation at SIPL workstations:
a. Add the "sipl_matlab_utils" directory to the MATLAB path by executing the
command: addpath('\\piano\Data\sipl_matlab_utils'); or by using MATLAB
menus.
b. Execute the command timitdemo(); to check installation.

Usage:
1. Load the desired database
Loading the database object:
Database name
TIMIT
NTIMIT
CTIMIT
YOHO
Ti‐Digits
Children Voices
Hebrew BGU
Gutenberg Books

Loading command
db = ADT('timit');
db = ADT('ntimit');
db = ADT('ctimit');
db = gendb('yoho');
db = gendb('tidigits');
db = gendb(' children voices');
db = gendb(' hebrew bgu');
db = gendb(' gutenberg books');

ll operations on the database will be performed using the database object which is
passed to them as the first parameter.
2. Make a query for your wanted data
[wavdata] = query(db,'dialect','dr1','word',{'she','it'},30);
This call to the query function will return the wave data of the first 30 words 'she'
or 'it' form dialect 'dr1' in the form of a cell array
4

3. Accessing the wave data
oneWord = wavdata{1};

5

Function Overview

Function name

Description

ADT/gendb

Creates database object

query

Returns file data form database by filtering criteria

filterdb

Creates a subset of the database by filtering criteria

read

Returns file data form database

play

Plays a database entire

getpath

Returns the path of a specific entire

length

Returns the number of entries in the object

6

Function Reference Guide
Note the following usage examples are using the TIMIT database.

query
Usage

[wave fs metadata] = query(db, [criterion1,
value1, [criterion2,value2,…]],[max_returns]);

Arguments

Criteria list describing the query

Return value

wave

cell array of waveforms

metadata

structure array describing waveforms

fs

sampling frequency

Remarks
This function queries the database for entries matching the "criterion – value" pairs.
Criteria are the fields of the metadata defined by database structure. Following is a
short description of these fields. It is optional to pass maximum amount of returned
entries.
value parameter may be one of the following:
1. criterion content: the query will return only entries matching this value
Example:
query(db, 'sentence', 'SA1')
returns all instances of sentence 'SA1'.
2. criterion content negation: if criterion value is preceded by ~ (tilde) then the
query will return only those entries that does not match the specified value:
Example:
query (db, 'dialect', '~dr2')
returns all sentences of dialect region other than dr2.
3. Cell array of criteria content or criteria content negation: the query will
return entries matching either one of criteria.

7

Example:
wav = query(db, 'phoneme', {'d', 'p'});
returns all instances of phonemes 'd' or 'p'
4. criterion content *: either one of criteria.
Example:
wav = query(db, 'word', ‘sh*’);
returns all instances of words starting with sh

filterdb
Usage

filtered_db = filterdb(db, [criterion1,
value1, [criterion2, value2,…]], max_returns);

Arguments

Criteria list describing the query

Return value

Subset of timit database passed as an argument

Remarks
Unlike query function, filterdb does not load all waveforms to memory but
returns filtered database object. This is useful when the resulting set of the query is
too big to fit in memory. Consequent calls to read function can be made to read
content of this filtered database object.
See Remarks section of query function for the description of criteria arguments.

read
Usage

[wave fs metadata] = read(db[, index]);

Arguments

timit database object and (optionally) index of the entry to be
returned

Return value

wave

cell array of waveforms

metadata

structure array describing waveforms

fs

sampling frequency

8

Remarks
Returns index entry in the database. If index argument is not specified, returns all
entries from the database object passed as an argument.

play
Usage

play(db,index);

Arguments

database object, and index of the entire to play

Return value

null

getpath
Usage

path = getpath(db,index);

Arguments

database object, and index to an entire

Return value

The path to the file containing the entire

length
Usage

n = length(db);

Arguments

database object

Return value

Number of entries in the database

9

Usage Examples
Example 1
%TIMITDEMO - a timit database interface demonstration,
%Reads text by searching TIMIT database for the words.
db =ADT('timit');
%loads TIMIT database interface
db = filterdb(db,'sex','m'); %filters database for male speakers
text = 'have a good time using this program and enjoy yourself i hope
this program is useful';
volume = 1; %assigning the volume
text =strread(text,'%s');%converting Text form string to cell array
for ii=1:length(text)
[word smpr] = query(db,'word',text{ii},1);%querying for the ii
word returnig the first match
if ~isempty(word)
max_v = max(word{1});
%normalizing the volume
word{1}=word{1}.*(volume/max_v); %normalizing the volume
sound(word{1},smpr);%play word (words return form query in
the form of cell array
pause(0.2);
%pauses for 0.2s
end
end

Example 2
%GENDBDEMO a demonstration of gendb
%loading YOHO database
db = gendb('yoho');
%filtering YOHO database returning 3 first eateries matching the
query
db = filterdb(db,'usage','verify','speaker','10*',3);
%loading Hebrew BGU database
db2 = gendb('Hebrew BGU');
%selecting a random index
index = fix(rand(1)*length(db2))+1;
%playing index entire
play(db2,index);
%reading wave data from filtered YOHO database
wavedata = read(db);
%plotting entire number 2
plot(wavedata{2});

10

Appendix - TIMIT/NTIMIT Fields

Filed

Content

Description

Usage

train, test

Train/test set defined by TIMIT

dialect

dr1, dr2, dr3, dr4, dr5, dr6, dr7, dr8

Dialect regions as defined in TIMIT
documentation

Sex

m,f

Gender of the speaker

speaker

Four alphanumeric characters

The id of the speaker as given by TIMIT

sentence

Up to 5 characters. Format defined in
TIMIT documentation

Sentence id as defined by TIMIT

word

Word

Any word from a sentence present in
TIMIT

phoneme

Phoneme

Any phoneme from a word present in
TIMIT. Phoneme codes can be found in
TIMIT documentation

The TIMIT/NTIMIT database eateries can take the form of sentences words or
phonemes.
The query or read functions will return a cell array, The returned cell array of
waveforms will contain waveforms of entire sentences, words or phonemes, depends
whether the query result is sentence, word or phoneme. The query result is determined by
the "finest" query term in a query.

Query example

Finest term
in a query

query(db, 'sentence', 'SA1',

Sentence

'dialect', 'dr1', 'sex', 'm')

'sentence',
'dialect', 'sex'

query(db, 'sentence', 'SA1',

'word'

Word

'dialect', 'dr1', 'sex', 'm',
'word', 'she')

11

Content of each cell in a
returned cell array

query(db, 'sentence', 'SA1',

'phoneme'

Phoneme

'word'

Word

'phoneme'

Phoneme

'dialect', 'dr1', 'sex', 'm',
'phoneme', 's')
query(db, 'sentence', 'SA1',
'dialect', 'dr1', 'sex', 'm',
'word', '#all')
query(db, 'sentence', 'SA1',
'dialect', 'dr1', 'sex', 'm',
'phoneme', '#all')

Special value '#all' can be used to force finest term in a query without actually filtering by
that criterion. For example query(db, 'sentence', 'SA1', 'dialect', 'dr1', 'sex',
'm', 'phoneme', '#all') will return all phonemes in the requested sentence.

12



Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.5
Linearized                      : Yes
Author                          : Yair & Cheli
Create Date                     : 2009:02:09 22:10:48+02:00
Modify Date                     : 2009:02:11 09:45:26+02:00
XMP Toolkit                     : Adobe XMP Core 4.2.1-c041 52.342996, 2008/05/07-20:48:00
Creator Tool                    : PScript5.dll Version 5.2.2
Metadata Date                   : 2009:02:11 09:45:26+02:00
Format                          : application/pdf
Title                           : Microsoft Word - MATLAB_ADT.docx
Creator                         : Yair & Cheli
Producer                        : Acrobat Distiller 9.0.0 (Windows)
Document ID                     : uuid:9a670ebb-392e-4837-9586-1e6ef71cc90e
Instance ID                     : uuid:c1e96b48-3e83-4c9b-aab4-8ddce6b6edc7
Page Count                      : 12
EXIF Metadata provided by EXIF.tools

Navigation menu