TMN_ResearchIdeas
This is a place where I’m putting sketches of ideas we are currently developing.
Level 1 is just an idea; no real implementation yet.
Level 2 means we’ve done at least some small-scale work around the lab.
Level 3 means we’re definitely working on it.
Level 4 means it’s been publicly presented: there’s at least a poster or PowerPoint available (on request) or an abstract published.
Level 5 means we should have published it by now, but haven’t.
Level 6 means something has been submitted.
Level 7 means there is at least something published.
Any substantial ideas suggested before Level 7 will be acknowledged.
Level | First posted | Short title
1     | 11-Mar-07    | Bootstrap ASR alignment for semi-automated annotation
2     | 11-Mar-07    | Cross-language vowel polyhedron normalization
Bootstrap ASR alignment for semi-automated annotation
(Level 1)
This is a very rough idea. A colleague asked on behalf of another colleague about the possibility of using ASR (e.g., Dragon Dictate (TM), NaturallySpeaking (TM)) to help in automatically annotating kids’ speech. Given the not very encouraging experience of Tracey Derwing and colleagues several years ago with the then-available commercial ASR dictation packages with ESL students, I thought this would be tough because adapting the codebooks would be onerous.
But how would this work? (If anybody’s tried it, let me know and I’ll blog it here.)
Have a trained operator (probably a grad student) who has also trained a dictation engine well (and knows how to talk to it):
1) Listen to the kid’s speech one sentence at a time.
2) Repeat a version of the kid’s utterance with the same word content, but in the operator’s own more recognizable pronunciation.
3) Given an acceptable word-level transcription, use forced alignment (at the phone level) to get a rough cut of an aligned word- and phone-level transcription.
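The forced-alignment step above can be sketched, very loosely, as a monotonic dynamic-programming alignment of acoustic frames to an ordered phone sequence. Everything below is a hypothetical toy for illustration only: `forced_align`, the single template vector per phone, and the squared-distance cost are my own simplifications; a real system would use HMM acoustic models rather than anything this crude.

```python
# Toy sketch of phone-level forced alignment by dynamic programming.
# Hypothetical illustration, not a real ASR component: each phone is
# represented by a single template feature vector, and each frame must be
# assigned to one phone, in order, minimizing total squared distance.
import numpy as np

def forced_align(frames, phone_templates):
    """Return, for each frame, the index of the phone it aligns to."""
    n, p = len(frames), len(phone_templates)
    cost = np.full((n, p), np.inf)          # best cumulative cost
    back = np.zeros((n, p), dtype=int)      # 0 = stayed in phone, 1 = advanced
    d = lambda f, t: float(np.sum((np.asarray(f) - np.asarray(t)) ** 2))

    cost[0, 0] = d(frames[0], phone_templates[0])  # must start in phone 0
    for i in range(1, n):
        for j in range(p):
            stay = cost[i - 1, j]
            adv = cost[i - 1, j - 1] if j > 0 else np.inf
            if adv < stay:
                cost[i, j], back[i, j] = adv + d(frames[i], phone_templates[j]), 1
            else:
                cost[i, j], back[i, j] = stay + d(frames[i], phone_templates[j]), 0

    # Backtrace from the last frame / last phone to recover the alignment.
    path, j = [], p - 1
    for i in range(n - 1, -1, -1):
        path.append(j)
        j -= back[i, j]
    return [int(k) for k in path[::-1]]
```

For example, with two phone templates `[0.0]` and `[5.0]` and frames `[[0.1], [-0.2], [4.9], [5.2]]`, the alignment comes out `[0, 0, 1, 1]`: the first two frames belong to the first phone, the last two to the second.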
Perhaps a method like this is already in routine use, but I haven’t come across it. Comments welcome; I’ll post them here unless instructed to do otherwise.
From Geoff Morrison (http://cns.bu.edu/~gsm2/), 4 March 2007:
Research idea number 1 for getting the word-level transcription was exactly what I used to do to transcribe monologues when I was writing ESL textbooks. An outstanding problem was that the system wasn’t good at transcribing false starts and “ungrammatical” phrases; it appeared to be trying to match to phrases which would be grammatical under the rules of written language.
Cross-language vowel polyhedron normalization
(Level 2)
The rough idea here is to use putative constraints on the form of the ‘containing vowel polyhedron’ to estimate a log scale factor (see Nearey and Assmann, in press). This is an extension of recent work with Geoff Morrison on cross-language normalization.
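For concreteness, a single log scale factor between two talkers’ vowel spaces can be estimated as the difference of their mean log formant frequencies, in the spirit of log-mean normalization. This sketch does not implement the ‘containing polyhedron’ constraints of the actual proposal; the function name and inputs are my own illustrative assumptions.

```python
# Hedged sketch: estimate a log scale factor between two talkers as the
# difference of mean log formant frequencies (log-mean style). The
# polyhedron-based constraints of the actual research idea are NOT
# implemented here; this only illustrates the simpler scale-factor notion.
import math

def log_scale_factor(formants_a, formants_b):
    """Mean log(F) for talker A minus mean log(F) for talker B.

    Each argument is a flat list of formant frequencies in Hz pooled
    over that talker's vowels. If talker B's formants are uniformly k
    times talker A's, the result is approximately -log(k).
    """
    mean_log = lambda fs: sum(math.log(f) for f in fs) / len(fs)
    return mean_log(formants_a) - mean_log(formants_b)
```

For example, if talker B’s formants are uniformly 20% higher than talker A’s (say `[600, 1800]` vs. `[500, 1500]` Hz), the estimated factor is -log(1.2), i.e., about -0.18 log units.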