[rfc] a library for gesture recognition, speech recognition, and synthesis
Hello,
For the past few years I've been working on the AME Patterns library - a generic library for modeling, recognition, and synthesis of sequential patterns. So far it has been applied mostly to gesture recognition (e.g., recognition of mouse gestures, full-body gestures from video or motion-capture data, and accelerometer gestures), but recently it has also shown good results on speech datasets. It can also be used to synthesize the same kinds of patterns it can recognize (e.g., you could use it to synthesize a gesture, speech, or purely numerical patterns). The library provides support for hidden Markov models and other related models. The design is concept-based, and the library uses the Boost Graph Library, Boost.Range (including the RangeEx extension), Boost.Fusion, Boost.Math, Boost.Random, and other Boost libraries.
The documentation for the library is available here: http://ame4.hc.asu.edu/amelia/patterns/
However, the documentation is lacking, so I'm curious to know which parts of the functionality (if any) are of interest to the Boost community, so that I know which parts of the documentation to focus on. I am also considering proposing a presentation of this library to BoostCon.
The library is currently released under the GPL, which I realize is not acceptable to some.
Please let me know your thoughts. If anyone is interested in using this library, please let me know what you'd like to use it for and I'd be happy to help you get started.
Kind regards,
Stjepan
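[Editor's note: since the post above describes hidden-Markov-model recognition only in prose, here is a minimal sketch of the underlying idea - score an observation sequence under one HMM per pattern class with the forward algorithm, then pick the best-scoring class. This is a toy Python illustration with made-up states, symbols, and probabilities; it is not the AME Patterns API.]

```python
import math

# Toy sketch of HMM-based pattern recognition (NOT the AME Patterns API):
# score an observation sequence under one model per pattern class with the
# forward algorithm, then pick the highest-scoring class.

def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM."""
    # alpha[s] = P(obs[0..t], state_t = s)
    alpha = {s: start[s] * emit[s][obs[0]] for s in start}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * trans[r][s] for r in alpha) * emit[s][o]
                 for s in start}
    return math.log(sum(alpha.values()))

# Two hypothetical single-state gesture models over quantized mouse
# directions "L"/"R"; real gesture models would have several states.
models = {
    "swipe_left":  dict(start={"s": 1.0}, trans={"s": {"s": 1.0}},
                        emit={"s": {"L": 0.9, "R": 0.1}}),
    "swipe_right": dict(start={"s": 1.0}, trans={"s": {"s": 1.0}},
                        emit={"s": {"L": 0.1, "R": 0.9}}),
}

def classify(obs):
    return max(models, key=lambda m: forward_log_likelihood(obs, **models[m]))

print(classify(["L", "L", "R", "L"]))  # a mostly-leftward sequence
```

[Synthesis, as mentioned above, would run the same models generatively: sample a state path from `start`/`trans` and draw observations from `emit`.]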
Stjepan Rajko wrote:
Hello,
For the past few years I've been working on the AME Patterns library - a generic library for modeling, recognition, and synthesis of sequential patterns. So far it has been applied mostly to gesture recognition (e.g., recognition of mouse gestures, full-body gestures from video or motion-capture data, and accelerometer gestures), but recently it has also shown good results on speech datasets. It can also be used to synthesize the same kinds of patterns it can recognize (e.g., you could use it to synthesize a gesture, speech, or purely numerical patterns). The library provides support for hidden Markov models and other related models. The design is concept-based, and the library uses the Boost Graph Library, Boost.Range (including the RangeEx extension), Boost.Fusion, Boost.Math, Boost.Random, and other Boost libraries.
The documentation for the library is available here: http://ame4.hc.asu.edu/amelia/patterns/
However, the documentation is lacking, so I'm curious to know which parts of the functionality (if any) are of interest to the Boost community, so that I know which parts of the documentation to focus on. I am also considering proposing a presentation of this library to BoostCon.
The library is currently released under the GPL, which I realize is not acceptable to some.
Please let me know your thoughts. If anyone is interested in using this library, please let me know what you'd like to use it for and I'd be happy to help you get started.
Kind regards,
Stjepan
Support for hidden Markov models would be of interest to me. I hope to start some part-of-speech tagging and similar analysis in about half a year.
Regards,
Roland
On Thu, Oct 22, 2009 at 2:04 AM, Roland Bock <rbock@eudoxos.de> wrote:
Stjepan Rajko wrote:
Hello,
For the past few years I've been working on the AME Patterns library - a generic library for modeling, recognition and synthesis of sequential patterns. ...
Support for hidden Markov models would be of interest to me. I hope to start some part-of-speech tagging and similar analysis in about half a year.
That's a neat problem that I haven't tried yet. I downloaded the Brown corpus and will try to get some results on part-of-speech tagging.
Thanks for your interest,
Stjepan
Stjepan Rajko wrote:
On Thu, Oct 22, 2009 at 2:04 AM, Roland Bock <rbock@eudoxos.de> wrote:
Stjepan Rajko wrote:
Hello,
For the past few years I've been working on the AME Patterns library - a generic library for modeling, recognition and synthesis of sequential patterns. ...
Support for hidden Markov models would be of interest to me. I hope to start some part-of-speech tagging and similar analysis in about half a year.
That's a neat problem that I haven't tried yet. I downloaded the Brown corpus and will try to get some results on part-of-speech tagging.
Thanks for your interest,
Stjepan
Wow! Keep me posted, please :-)
I hope to join in a few months...
Regards,
Roland
On Fri, Oct 23, 2009 at 10:18 AM, Roland Bock <rbock@eudoxos.de> wrote:
Stjepan Rajko wrote:
On Thu, Oct 22, 2009 at 2:04 AM, Roland Bock <rbock@eudoxos.de> wrote:
Stjepan Rajko wrote:
Hello,
For the past few years I've been working on the AME Patterns library - a generic library for modeling, recognition and synthesis of sequential patterns. ...
Support for hidden Markov models would be of interest to me. I hope to start some part-of-speech tagging and similar analysis in about half a year.
That's a neat problem that I haven't tried yet. I downloaded the Brown corpus and will try to get some results on part-of-speech tagging.
Thanks for your interest,
Stjepan
Wow! Keep me posted, please :-)
OK, I just completed a small experiment on the 9 texts of the Brown Corpus categorized as "humor". I used 6 of the texts for training and 3 for testing.
I created one submodel per tag (http://kh.aksis.uib.no/icame/manuals/brown/INDEX.HTM#bc6), trained each from the training data, and then connected the submodels into a larger model with transitions also trained on the training data.
Here are the results. Out of 7159 tagged parts of speech (words, symbols, etc.) present in the 3 test texts:
- 5190 were tagged correctly
- 300 were tagged incorrectly
- 1669 were not tagged, because the word or symbol was not present (at least not in a verbatim form) in the training data.
So, if you only consider the 7159 - 1669 = 5490 parts that could possibly be tagged based on what the training data covers, you get a 94.5% success rate.
By using a larger training set, the number of non-tagged parts should go down. Also, I'm sure there are domain-specific tricks for improving the results.
BTW, 95% of the work to get this done was putting together the code that reads the corpus, since I already have generic code that runs this kind of experiment.
I hope to join in a few months...
Great! I hope to have things cleaned up and better documented by then.
Best,
Stjepan
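[Editor's note: for anyone curious what the experiment above looks like in sketch form, here is a toy version of the same pipeline - one emission model per tag, tag-to-tag transitions estimated from training data, Viterbi decoding, and an evaluation that separates words never seen in training. The data and every name below are made up for illustration; this is not the actual AME Patterns code.]

```python
from collections import Counter, defaultdict

# Toy version of the tagging experiment described above (made-up data, not
# the actual AME Patterns code): per-tag emission models, tag-to-tag
# transitions counted from training data, Viterbi decoding, and a score
# that separates words never seen in training.

train = [[("the", "AT"), ("man", "NN"), ("runs", "VBZ")],
         [("the", "AT"), ("dog", "NN"), ("runs", "VBZ")]]
test = [[("the", "AT"), ("dog", "NN"), ("jumps", "VBZ")]]

emit = defaultdict(Counter)   # tag -> word counts (one submodel per tag)
trans = defaultdict(Counter)  # previous tag -> next tag counts
starts = Counter()            # sentence-initial tag counts
vocab = set()
for sent in train:
    starts[sent[0][1]] += 1
    for w, t in sent:
        emit[t][w] += 1
        vocab.add(w)
    for (_, a), (_, b) in zip(sent, sent[1:]):
        trans[a][b] += 1

def p(counter, key):
    total = sum(counter.values())
    return counter[key] / total if total else 0.0

def emit_p(tag, word):
    # A tiny uniform probability for unknown words keeps decoding from
    # collapsing to zero on them.
    return p(emit[tag], word) if word in vocab else 1e-6

def viterbi(words):
    tags = list(emit)
    # best[t] = (probability, path) of the best tag sequence ending in t
    best = {t: (p(starts, t) * emit_p(t, words[0]), [t]) for t in tags}
    for w in words[1:]:
        best = {t: max(((best[r][0] * p(trans[r], t) * emit_p(t, w),
                         best[r][1] + [t]) for r in tags),
                       key=lambda c: c[0])
                for t in tags}
    return max(best.values(), key=lambda c: c[0])[1]

correct = wrong = unseen = 0
for sent in test:
    guesses = viterbi([w for w, _ in sent])
    for (w, gold), guess in zip(sent, guesses):
        if w not in vocab:
            unseen += 1  # not counted, as in the experiment above
        elif guess == gold:
            correct += 1
        else:
            wrong += 1
print(correct, wrong, unseen)  # counts analogous to the 5190/300/1669 split
```

[Scaling up the training set shrinks the unseen-word bucket, which matches the observation above that a larger training set should reduce the number of non-tagged parts.]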
Stjepan Rajko wrote: [...]
OK, I just completed a small experiment on the 9 texts of the Brown Corpus categorized as "humor". I used 6 of the texts for training, and 3 for testing.
I created one submodel per tag (http://kh.aksis.uib.no/icame/manuals/brown/INDEX.HTM#bc6), trained each from the training data, and then connected the submodels into a larger model with transitions also trained on the training data.
Here are the results:
Out of 7159 tagged parts of speech (words, symbols, etc.) present in the 3 test texts:
- 5190 were tagged correctly
- 300 were tagged incorrectly
- 1669 were not tagged, because the word or symbol was not present (at least not in a verbatim form) in the training data.
So, if you only consider the 7159-1669=5490 parts that could possibly be tagged based on what the training data covers, you get a 94.5% success rate.
By using a larger training set, the number of non-tagged parts should go down. Also, I'm sure there are domain-specific tricks for improving the results.
BTW, 95% of the work to get this done was putting together the code that reads the corpus, since I already have generic code that runs this kind of experiment.
That is most impressive! I am looking forward to analysing the work you did and using it for German, too (which will be more complex, if I am not mistaken). Alas, as I wrote earlier, I have to patiently complete some other things before diving in :-)
Great! I hope to have things cleaned up and better documented by then.
Thanks for your efforts! I really appreciate it!
Regards,
Roland
participants (2)
- Roland Bock
- Stjepan Rajko