TMN_ResearchIdeas

This is a place where I’m putting sketches of ideas we are currently developing.

 

Level 1 is just an idea; no real implementation.

Level 2 means we’ve done at least some small-scale work around the lab.

Level 3 means we’re definitely working on it.

Level 4 means it’s been publicly presented: there’s at least a poster or PowerPoint available (on request) or an abstract published.

Level 5 means we should have published it by now, but haven’t.

Level 6 means there’s something submitted.

Level 7 means there is at least something published.

Any substantial ideas suggested before Level 7 will be acknowledged.

 

Level   First posted   Short title

1       11-Mar-07      Bootstrap ASR alignment for semi-automated annotation

2       11-Mar-07      Cross-language vowel polyhedron normalization


Bootstrap ASR alignment for semi-automated annotation (Level 1)

This is a very rough idea. A colleague asked on behalf of another colleague about the possibility of using ASR (e.g. Dragon Dictate (TM), NaturallySpeaking (TM)) to help in automatically annotating kids’ speech. Given the not very encouraging experience of Tracey Derwing and colleagues several years ago with the then-available commercial ASR dictation packages with ESL students, I thought this would be tough because adapting the codebooks would be onerous.

But how would this work? (If anybody’s tried it, let me know and I’ll blog it here.)

 

Have a trained operator (probably a grad student) who has also trained a dictation engine well (and knows how to talk to it):

1)    Listen to the kid’s speech one sentence at a time.

2)    Repeat a version of the kid’s utterance with the same word content but with the operator’s own more recognizable pronunciation.

Given an acceptable word-level transcription, use forced alignment (at the phone level) to get a rough cut of an aligned word- and phone-level transcription, as sketched below.
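
Here is a minimal orchestration sketch of that loop, in Python. It assumes, purely hypothetically, that a command-line forced aligner is available as "forced-align <wav> <words.txt> <out_dir>"; the tool name, its arguments, and the file-naming scheme are placeholders of mine, not any real package’s interface.

    import subprocess
    from pathlib import Path

    def align_child_utterances(session_dir, aligner_cmd="forced-align"):
        """For each child utterance NNN.wav, expect a matching NNN.txt holding
        the operator's re-spoken, dictation-recognized word string, then run
        the (hypothetical) forced aligner to get a rough word/phone cut."""
        session = Path(session_dir)
        out_dir = session / "alignments"
        out_dir.mkdir(exist_ok=True)
        for wav in sorted(session.glob("*.wav")):
            words = wav.with_suffix(".txt")   # word-level transcription from the
            if not words.exists():            # operator + dictation engine
                print(f"skipping {wav.name}: no word transcription yet")
                continue
            # Align the CHILD's audio against the operator-derived word string;
            # the result is only a rough cut and still needs hand correction.
            subprocess.run([aligner_cmd, str(wav), str(words), str(out_dir)],
                           check=True)

    # Usage (hypothetical paths):
    # align_child_utterances("sessions/child01")

The point of the sketch is just that once the operator has produced the word strings, everything downstream is scriptable.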

 

Perhaps a method like this is already in routine use, but I haven’t come across it. Comments welcome; I’ll post them here unless instructed to do otherwise.

From Geoff Morrison (http://cns.bu.edu/~gsm2/), 4 March 2007:

Research idea number 1 for getting the word-level transcription was exactly what I used to do to transcribe monologues when I was writing ESL textbooks. An outstanding problem was that the system wasn't good at transcribing false starts and "ungrammatical" phrases; it appeared to be trying to match to phrases which would be grammatical under the rules of written language.


Cross-language vowel polyhedron normalization (Level 2)

The rough idea here is to use putative constraints in the form of a ‘containing vowel polyhedron’ to estimate a log scale factor (see Nearey and Assmann, in press).

This is an extension of recent work with Geoff Morrison on cross-language normalization. A rough sketch of the polyhedron-fitting step follows.
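
Here is a minimal toy sketch, in Python, of one way the polyhedron constraint might be used to estimate the log scale factor. It is my own illustration, not the method of Nearey and Assmann: the reference polyhedron is taken as the convex hull of a reference speaker’s log-formant points, and a single additive offset in log frequency (equivalently, a uniform scale factor in Hz) is grid-searched so that as many of a new speaker’s vowel points as possible fall inside that hull. All names and numbers are made up for illustration.

    import numpy as np
    from scipy.spatial import Delaunay

    def estimate_log_scale_factor(ref_formants_hz, new_formants_hz,
                                  c_grid=np.linspace(-0.5, 0.5, 201)):
        """Grid-search an additive log-frequency offset c (a uniform Hz scale
        factor exp(c)) that puts as many of the new speaker's vowel points as
        possible inside the reference 'containing vowel polyhedron'."""
        ref_log = np.log(np.asarray(ref_formants_hz, dtype=float))
        new_log = np.log(np.asarray(new_formants_hz, dtype=float))
        hull = Delaunay(ref_log)      # reference polyhedron (convex hull)
        counts = np.array([np.sum(hull.find_simplex(new_log + c) >= 0)
                           for c in c_grid])
        # Many offsets may tie for the best containment count; take the
        # midpoint of the best-scoring range as a crude point estimate.
        best_cs = c_grid[counts == counts.max()]
        return float(best_cs.mean()), int(counts.max())

    # Toy usage with made-up (F1, F2) values in Hz; the "new" tokens were
    # generated with a uniform 30% upward formant scaling, so the recovered
    # offset should come out negative, in the rough vicinity of -log(1.3).
    ref = [(300, 2300), (700, 1800), (750, 1100), (350, 800), (500, 1500)]
    new = [(585, 2080), (715, 1820), (650, 1560)]
    c, n_inside = estimate_log_scale_factor(ref, new)
    print(f"estimated log scale factor c = {c:.3f}; points inside hull = {n_inside}")

With more vowel tokens per speaker, a continuous criterion (e.g. distance to the hull surface) would behave better than the raw containment count, but the grid search keeps the toy transparent.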