This is a place where I'm putting sketches of ideas we are currently developing.


Level 1 is just an idea; no real implementation.

Level 2 means we've done at least some small-scale work around the lab.

Level 3 means we're definitely working on it.

Level 4 means it's been publicly presented: there's at least a poster or PowerPoint available (on request), or an abstract published.

Level 5 means we should have published it by now, but haven't.

Level 6 means there's something submitted.

Level 7 means there is at least something published.

Any substantial ideas suggested before Level 7 will be acknowledged.



First posted / Short title

Bootstrap ASR alignment for semi-automated annotation

Cross-language vowel polyhedron normalization

Bootstrap ASR alignment for semi-automated annotation (Level 1)

This is a very rough idea. A colleague asked on behalf of another colleague about the possibility of using ASR (e.g., Dragon Dictate (TM), NaturallySpeaking (TM)) to help in automatically annotating kids' speech. Given the not-very-encouraging experience of Tracey Derwing and colleagues several years ago, using the then-available commercial ASR dictation packages with ESL students, I thought this would be tough because adapting the codebooks would be onerous.

But how would this work? (If anybody's tried it, let me know and I'll blog it here.)


Have a trained operator (probably a grad student) who has also trained a dictation engine well (and knows how to talk to it):

1)    Listen to the kid's speech one sentence at a time.

2)    Repeat a version of the kid's utterance with the same word content but with the operator's own more recognizable pronunciation.

3)    Given an acceptable word-level transcription, use forced alignment (at the phone level) to get a rough cut of an aligned word- and phone-level transcription.
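The steps above could be wired together roughly as follows. This is a toy, runnable sketch, not a real implementation: the dictation engine's output is just a hard-coded word list, the forced aligner is replaced by a naive uniform segmentation, and all function names and the little pronunciation dictionary are invented for illustration.

```python
# Toy sketch of the pipeline: operator re-speaks the child's utterance,
# a dictation engine yields a word string, and an aligner produces rough
# word- and phone-level tiers. Here alignment is faked by splitting each
# interval evenly; a real system would use an HMM-based forced aligner.

def uniform_phone_alignment(phones, start, end):
    """Stand-in for phone-level forced alignment: split the span evenly."""
    step = (end - start) / len(phones)
    return [(p, start + i * step, start + (i + 1) * step)
            for i, p in enumerate(phones)]

def align_utterance(words, pron_dict, utt_start, utt_end):
    """Rough word- and phone-level tiers for one re-spoken utterance."""
    word_dur = (utt_end - utt_start) / len(words)
    tiers = []
    for i, w in enumerate(words):
        w_start = utt_start + i * word_dur
        w_end = w_start + word_dur
        phones = uniform_phone_alignment(pron_dict[w], w_start, w_end)
        tiers.append((w, w_start, w_end, phones))
    return tiers

# Pretend the dictation engine recognized the operator's re-spoken
# version of the child's utterance as this word string:
words = ["see", "the", "dog"]
pron = {"see": ["s", "iy"], "the": ["dh", "ah"], "dog": ["d", "ao", "g"]}

tiers = align_utterance(words, pron, 0.0, 1.5)
for w, s, e, phones in tiers:
    print(w, round(s, 2), round(e, 2), phones)
```

A real version would swap `uniform_phone_alignment` for an actual forced aligner run against the operator's re-spoken audio, and the rough boundaries would then be hand-corrected against the child's recording.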


Perhaps a method like this is already in routine use, but I haven't come across it. Comments welcome; I'll post them here unless instructed to do otherwise.

From Geoff Morrison (4 March 2007):

Research idea number 1 for getting the word-level transcription was exactly what I used to do to transcribe monologues when I was writing ESL textbooks. An outstanding problem was that the system wasn't good at transcribing false starts and "ungrammatical" phrases; it appeared to be trying to match phrases that would be grammatical under the rules of written language.




Cross-language vowel polyhedron normalization

The rough idea here is to use putative constraints in the form of a 'containing vowel polyhedron' to estimate a log scale factor (see Nearey and Assmann, in press).

This is an extension of recent work with Geoff Morrison on cross-language normalization.
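To make the single log scale factor concrete, here is a toy numerical sketch in the spirit of Nearey's log-mean normalization. It does not implement the containing-polyhedron constraint from the idea above; it only shows what removing one per-speaker log scale factor does. The function names and the formant values are invented for illustration.

```python
# Toy illustration of a single per-speaker log scale factor (uniform
# scaling hypothesis): subtract each speaker's mean log formant value.
# Data are invented; the "child" values are the "adult" values times 1.3,
# i.e. a purely multiplicative difference in vowel-space size.
import math

def log_scale_factor(formants):
    """Mean of log F1 and log F2 over a speaker's vowel tokens."""
    logs = [math.log(f) for token in formants for f in token]
    return sum(logs) / len(logs)

def normalize(formants):
    """Remove the speaker's log scale factor from every log formant."""
    k = log_scale_factor(formants)
    return [[math.log(f) - k for f in token] for token in formants]

# Invented F1/F2 values (Hz) for three vowel tokens per speaker:
adult = [(300, 2300), (700, 1200), (400, 800)]
child = [(390, 2990), (910, 1560), (520, 1040)]  # adult values x 1.3

norm_adult = normalize(adult)
norm_child = normalize(child)

# A constant multiplicative factor becomes an additive log constant, so
# normalization should make the two vowel spaces coincide:
diff = max(abs(a - c) for ta, tc in zip(norm_adult, norm_child)
           for a, c in zip(ta, tc))
print(round(diff, 6))
```

The polyhedron idea would replace this simple log-mean estimate with a scale factor chosen so that a speaker's tokens fit inside a reference 'containing vowel polyhedron', which matters when the available tokens don't sample the whole vowel space evenly.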