Show simple item record

dc.contributor.advisor: McDonnell, Rachel
dc.contributor.author: Ferstl, Ylva
dc.date.accessioned: 2021-08-03T12:57:51Z
dc.date.available: 2021-08-03T12:57:51Z
dc.date.issued: 2021
dc.date.submitted: 2021
dc.identifier.citation: Ferstl, Ylva, Machine Learning For Plausible Gesture Generation From Speech For Virtual Humans, Trinity College Dublin. School of Computer Science & Statistics, 2021
dc.identifier.other: Y
dc.identifier.uri: http://hdl.handle.net/2262/96795
dc.description: APPROVED
dc.description.abstract: The growing use of virtual humans in an array of applications such as games, human-computer interfaces, and virtual reality demands the design of appealing and engaging characters, while minimizing the cost and time of creation. Nonverbal behavior is an integral part of human communication and important for believable embodied virtual agents. Co-speech gesture represents a key aspect of nonverbal communication, and virtual agents are more engaging when exhibiting gesture behavior. Hand-animation of gesture is costly and does not scale to applications where agents may produce new utterances after deployment. Automated gesture generation is therefore attractive, enabling any new utterance to be animated on the go. A major body of research has been dedicated to methods of automatic gesture generation, but generating expressive and defined gesture motion has commonly relied on explicit formulation of if-then rules or probabilistic modelling of annotated features. Machine learning approaches, which can work on unlabelled data, are catching up; however, they often still produce averaged motion that fails to capture the speech-gesture relationship adequately. The results from machine-learned models point to the high complexity of the speech-to-motion learning task. In this work, we explore a number of machine learning methods for improving the speech-to-motion learning outcome, including the use of transfer learning from speech and motion models, adversarial training, and modelling explicit expressive gesture parameters from speech. We develop a method for automatically segmenting individual gestures from a motion stream, enabling detailed analysis of the speech-gesture relationship. We present two large multimodal datasets of conversational speech and motion, designed specifically for this modelling problem. We finally present and evaluate a novel speech-to-gesture system, merging methods of machine learning and database sampling.
dc.publisher: Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science
dc.rights: Y
dc.subject: gesture generation
dc.subject: computer animation
dc.subject: motion modelling
dc.subject: machine learning
dc.subject: conversational agents
dc.title: Machine Learning For Plausible Gesture Generation From Speech For Virtual Humans
dc.type: Thesis
dc.type.supercollection: thesis_dissertations
dc.type.supercollection: refereed_publications
dc.type.qualificationlevel: Doctoral
dc.identifier.peoplefinderurl: https://tcdlocalportal.tcd.ie/pls/EnterApex/f?p=800:71:0::::P71_USERNAME:YFERSTL
dc.identifier.rssinternalid: 232435
dc.rights.ecaccessrights: openAccess
dc.contributor.sponsor: Science Foundation Ireland (SFI)

