Exploration server on haptic devices

Please note: this site is under development!
Please note: this site is generated by computational means from raw corpora.
The information presented here has therefore not been validated.

A multimodal dataset of spontaneous speech and movement production on object affordances

Internal identifier: 000611 (Pmc/Curation); previous: 000610; next: 000612

A multimodal dataset of spontaneous speech and movement production on object affordances

Authors: Argiro Vatakis [Greece]; Katerina Pastra [Greece]

Source:

RBID: PMC:4718047

Abstract

In the longstanding effort of defining object affordances, a number of resources have been developed on objects and associated knowledge. These resources, however, have limited potential for modeling and generalization mainly due to the restricted, stimulus-bound data collection methodologies adopted. To-date, therefore, there exists no resource that truly captures object affordances in a direct, multimodal, and naturalistic way. Here, we present the first such resource of ‘thinking aloud’, spontaneously-generated verbal and motoric data on object affordances. This resource was developed from the reports of 124 participants divided into three behavioural experiments with visuo-tactile stimulation, which were captured audiovisually from two camera-views (frontal/profile). This methodology allowed the acquisition of approximately 95 hours of video, audio, and text data covering: object-feature-action data (e.g., perceptual features, namings, functions), Exploratory Acts (haptic manipulation for feature acquisition/verification), gestures and demonstrations for object/feature/action description, and reasoning patterns (e.g., justifications, analogies) for attributing a given characterization. The wealth and content of the data make this corpus a one-of-a-kind resource for the study and modeling of object affordances.


URL:
DOI: 10.1038/sdata.2015.78
PubMed: 26784391
PubMed Central: 4718047

Links to previous steps (curation, corpus...)


Links to Exploration step

PMC:4718047

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A multimodal dataset of spontaneous speech and movement production on object affordances</title>
<author>
<name sortKey="Vatakis, Argiro" sort="Vatakis, Argiro" uniqKey="Vatakis A" first="Argiro" last="Vatakis">Argiro Vatakis</name>
<affiliation wicri:level="1">
<nlm:aff id="a1">
<institution>Cognitive Systems Research Institute (CSRI)</institution>
, 11525 Athens,
<country>Greece</country>
</nlm:aff>
<country xml:lang="fr">Grèce</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Pastra, Katerina" sort="Pastra, Katerina" uniqKey="Pastra K" first="Katerina" last="Pastra">Katerina Pastra</name>
<affiliation wicri:level="1">
<nlm:aff id="a1">
<institution>Cognitive Systems Research Institute (CSRI)</institution>
, 11525 Athens,
<country>Greece</country>
</nlm:aff>
<country xml:lang="fr">Grèce</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="a2">
<institution>Institute for Language and Speech Processing (ILSP), ‘Athena’ Research Center</institution>
, 15125 Athens,
<country>Greece</country>
</nlm:aff>
<country xml:lang="fr">Grèce</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">26784391</idno>
<idno type="pmc">4718047</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4718047</idno>
<idno type="RBID">PMC:4718047</idno>
<idno type="doi">10.1038/sdata.2015.78</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">000611</idno>
<idno type="wicri:Area/Pmc/Curation">000611</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">A multimodal dataset of spontaneous speech and movement production on object affordances</title>
<author>
<name sortKey="Vatakis, Argiro" sort="Vatakis, Argiro" uniqKey="Vatakis A" first="Argiro" last="Vatakis">Argiro Vatakis</name>
<affiliation wicri:level="1">
<nlm:aff id="a1">
<institution>Cognitive Systems Research Institute (CSRI)</institution>
, 11525 Athens,
<country>Greece</country>
</nlm:aff>
<country xml:lang="fr">Grèce</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Pastra, Katerina" sort="Pastra, Katerina" uniqKey="Pastra K" first="Katerina" last="Pastra">Katerina Pastra</name>
<affiliation wicri:level="1">
<nlm:aff id="a1">
<institution>Cognitive Systems Research Institute (CSRI)</institution>
, 11525 Athens,
<country>Greece</country>
</nlm:aff>
<country xml:lang="fr">Grèce</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="a2">
<institution>Institute for Language and Speech Processing (ILSP), ‘Athena’ Research Center</institution>
, 15125 Athens,
<country>Greece</country>
</nlm:aff>
<country xml:lang="fr">Grèce</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Scientific Data</title>
<idno type="eISSN">2052-4463</idno>
<imprint>
<date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>In the longstanding effort of defining object affordances, a number of resources have been developed on objects and associated knowledge. These resources, however, have limited potential for modeling and generalization mainly due to the restricted, stimulus-bound data collection methodologies adopted. To-date, therefore, there exists no resource that truly captures object affordances in a direct, multimodal, and naturalistic way. Here, we present the first such resource of ‘thinking aloud’, spontaneously-generated verbal and motoric data on object affordances. This resource was developed from the reports of 124 participants divided into three behavioural experiments with visuo-tactile stimulation, which were captured audiovisually from two camera-views (frontal/profile). This methodology allowed the acquisition of approximately 95 hours of video, audio, and text data covering: object-feature-action data (e.g., perceptual features, namings, functions), Exploratory Acts (haptic manipulation for feature acquisition/verification), gestures and demonstrations for object/feature/action description, and reasoning patterns (e.g., justifications, analogies) for attributing a given characterization. The wealth and content of the data make this corpus a one-of-a-kind resource for the study and modeling of object affordances.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Vatakis, A" uniqKey="Vatakis A">A. Vatakis</name>
</author>
<author>
<name sortKey="Pastra, K" uniqKey="Pastra K">K. Pastra</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="data-paper">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Sci Data</journal-id>
<journal-id journal-id-type="iso-abbrev">Sci Data</journal-id>
<journal-title-group>
<journal-title>Scientific Data</journal-title>
</journal-title-group>
<issn pub-type="epub">2052-4463</issn>
<publisher>
<publisher-name>Nature Publishing Group</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">26784391</article-id>
<article-id pub-id-type="pmc">4718047</article-id>
<article-id pub-id-type="pii">sdata201578</article-id>
<article-id pub-id-type="doi">10.1038/sdata.2015.78</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Data Descriptor</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>A multimodal dataset of spontaneous speech and movement production on object affordances</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Vatakis</surname>
<given-names>Argiro</given-names>
</name>
<xref ref-type="corresp" rid="c1">a</xref>
<xref ref-type="aff" rid="a1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Pastra</surname>
<given-names>Katerina</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
</contrib>
<aff id="a1">
<label>1</label>
<institution>Cognitive Systems Research Institute (CSRI)</institution>
, 11525 Athens,
<country>Greece</country>
</aff>
<aff id="a2">
<label>2</label>
<institution>Institute for Language and Speech Processing (ILSP), ‘Athena’ Research Center</institution>
, 15125 Athens,
<country>Greece</country>
</aff>
</contrib-group>
<author-notes>
<corresp id="c1">
<label>a</label>
A.V. (email:
<email>argiro.vatakis@gmail.com</email>
).</corresp>
<fn id="con">
<label></label>
<p>A.V. conceived and implemented the experiments, contributed to data validation and transcription and wrote the manuscript. K.P. conceived and provided conceptual discussions on the experiments, validated the data and contributed to the manuscript.</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>19</day>
<month>01</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="collection">
<year>2016</year>
</pub-date>
<volume>3</volume>
<elocation-id>150078</elocation-id>
<history>
<date date-type="received">
<day>26</day>
<month>06</month>
<year>2015</year>
</date>
<date date-type="accepted">
<day>15</day>
<month>12</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2016, Macmillan Publishers Limited</copyright-statement>
<copyright-year>2016</copyright-year>
<copyright-holder>Macmillan Publishers Limited</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by-nc/4.0/">
<pmc-comment>author-paid</pmc-comment>
<license-p>This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/4.0/">http://creativecommons.org/licenses/by-nc/4.0/</ext-link>
Metadata associated with this Data Descriptor is available at
<ext-link ext-link-type="uri" xlink:href="http://www.nature.com/sdata/">http://www.nature.com/sdata/</ext-link>
and is released under the CC0 waiver to maximize reuse. </license-p>
</license>
</permissions>
<abstract>
<p>In the longstanding effort of defining object affordances, a number of resources have been developed on objects and associated knowledge. These resources, however, have limited potential for modeling and generalization mainly due to the restricted, stimulus-bound data collection methodologies adopted. To-date, therefore, there exists no resource that truly captures object affordances in a direct, multimodal, and naturalistic way. Here, we present the first such resource of ‘thinking aloud’, spontaneously-generated verbal and motoric data on object affordances. This resource was developed from the reports of 124 participants divided into three behavioural experiments with visuo-tactile stimulation, which were captured audiovisually from two camera-views (frontal/profile). This methodology allowed the acquisition of approximately 95 hours of video, audio, and text data covering: object-feature-action data (e.g., perceptual features, namings, functions), Exploratory Acts (haptic manipulation for feature acquisition/verification), gestures and demonstrations for object/feature/action description, and reasoning patterns (e.g., justifications, analogies) for attributing a given characterization. The wealth and content of the data make this corpus a one-of-a-kind resource for the study and modeling of object affordances.</p>
</abstract>
<kwd-group kwd-group-type="npg.subject">
<title>Subject terms</title>
<kwd>Human behaviour</kwd>
<kwd>Language</kwd>
<kwd>Perception</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec disp-level="1">
<title>Background & Summary</title>
<p>Our everyday interaction with objects is quite natural, where we somehow ‘know’ which object is most suitable for a given goal. Pounding, for example, can be prototypically accomplished with a hammer. However, any object that is rigid and heavy enough has the potential to serve as a hammer (e.g., a stone). Thus, object affordances and object feature knowledge is necessary for goal attainment. In the quest of understanding how people perceive objects and their affordances, researchers from both the cognitive and computational sciences have collected data on objects and object features or function and intended use (e.g., refs
<xref ref-type="bibr" rid="b1">1–4</xref>
).</p>
<p>Data on object categories have originated from naming studies
<sup>
<xref ref-type="bibr" rid="b5">5–8</xref>
</sup>
, however these do not provide any data on object affordances. Data on general object knowledge (e.g., featural, taxonomic, encyclopaedic) originate mainly from studies on semantic feature production norms for lexical concepts of familiar objects through questionnaires (e.g., refs
<xref ref-type="bibr" rid="b1">1</xref>
,
<xref ref-type="bibr" rid="b3">3</xref>
). For instance, in McRae
<italic>et al.</italic>
<sup>
<xref ref-type="bibr" rid="b1">1</xref>
</sup>
, participants reported a total of 2526 distinct semantic production norms for a total of 541 living and nonliving entities. These data allow for a wealth of cognitive and linguistic measures, but they are bound to the specific stimulus presented and the restricted and directed responding (i.e., written responses following specific examples leading to generic and unimodal responses). This limits the possibility for modeling and generalization of object affordances.</p>
<p>Currently, there exists no data resource that captures object affordances in a direct, multimodal, and naturalistic way. Additionally, there is no resource that collectively encompasses data on: a) feature distinctiveness for action/goal-related decision making (e.g., ‘heavy enough for hammering a nail’), b) feature distinctiveness for object category identification (for the stimuli presented experimentally, but more importantly for others not presented during experimentation; e.g., ‘it is sharp like a knife’), c) means of acquiring object/function-related information (e.g., ‘sharp enough [acquired haptically by rubbing] for cutting’), and d) reasoning patterns for assigning object name/function (e.g., ‘could also be a ball, if it was bigger’). Development of such a resource requires the acquisition of information in a way that resembles everyday human-object interaction, which includes: multisensory access to an object, unrestricted and undirected interaction with it, and multimodal ways of responding. Furthermore, it requires a set of unfamiliar stimuli so as to elicit data beyond the expected information one may get from known/familiar everyday objects.</p>
<p>Here, we describe the first such multimodal resource of ‘thinking aloud’ verbal and spontaneously-generated motoric data on object affordances. The data were elicited by the use of unfamiliar visual and tactile stimuli and an undirected and unrestricted manipulation and response task. Specifically, we utilized man-made lithic tools with a particular use unknown to the modern man (cf.
<sup>
<xref ref-type="bibr" rid="b9">9</xref>
</sup>
) and asked participants to freely describe the objects and their potential function(s). Their responses were captured audiovisually in three different behavioural experiments (see
<xref ref-type="fig" rid="f1">Fig. 1</xref>
). In Experiments 1–2, the stimulation was photographs of 22 lithic tools in a fixed (Exp. 1) or participant-controlled viewing orientation (Exp. 2), while in Exp. 3, 9 lithic tools were freely viewed and touched/manipulated (see Methods). In all three experiments, the stimuli were presented either in isolation or hand-held, so as to indirectly elicit more movement-related information.</p>
<p>The above-mentioned methodology resulted in approximately 45 gigabytes of video, audio, and text data, categorized in the following data types: A) Object-feature-action: verbally expressed perceptual features (e.g., shape), namings, and actions/functions, B) Exploratory Acts (EAs): haptic manipulation for acquisition/verification of features (see also ref.
<xref ref-type="bibr" rid="b10">10</xref>
on Exploratory Procedures), C) Gestures-demonstrations: production of pantomime gestures for object/feature/action description (e.g., ‘writing with a pen’-[hand configured as if holding a pen]) and actual demonstrations of uses, and D) Reasoning patterns: linguistic patterns to:
<italic>justify</italic>
a specific characterization, describe an object/feature’s
<italic>intended use</italic>
and the
<italic>effects</italic>
of an action,
<italic>compare</italic>
objects/features, and specify
<italic>conditions</italic>
to be met for a given characterization. The large set of data provided directly and the potential modeling of these data, make this dataset a one-of-a-kind source for the study of
<italic>how</italic>
and
<italic>why</italic>
people ‘know’ how to accomplish an unlimited number of goals.</p>
</sec>
<sec disp-level="1">
<title>Methods</title>
<sec disp-level="2">
<title>Participants</title>
<p>124 Greek participants (93 females) aged between 17 and 52 years (Mean age=23 years) were given course credit (i.e., students attended the courses: Cognitive Psychology I, Cognitive Psychology II, or Current topics in Cognitive Science), in return for taking part in the experiment. Specifically, 43 (32 females, M=24.6 years of age), 42 (33 females, M=20.7 years of age), and 39 (28 females, M=23.7 years of age) students participated in Experiments 1, 2, and 3, respectively, with no participants partaking in more than one experiment. All of the participants were naïve as to the purpose of the study and all reported excellent knowledge of the Greek language. Upon completion of the experiment, the participants were asked about their knowledge of archaeology and none of them reported any such knowledge. The experiments took approximately 2–5 h each to complete.</p>
<p>The participants were asked to provide their consent for the publication of their data. Audio-only data are available for those participants who preferred their video recordings not to be publicly available (33 out of the 124 participants denied public release of their video recordings; see Data Records).</p>
</sec>
<sec disp-level="2">
<title>Apparatus and materials</title>
<p>The experiments were conducted in a sound attenuated audiovisual recording studio. During the experiments, the participants were seated comfortably at a small table facing straight ahead (see
<xref ref-type="fig" rid="f1">Fig. 1</xref>
).</p>
<p>In Experiment 1, the visual stimuli were presented on a 19-in. TFT colour LCD monitor (WXGA+ 1440×900 pixel resolution; 60-Hz refresh rate) placed approximately 70 cm in front of the participant. The visual stimuli (size: 456×456) consisted of 22 images of lithic tools that were presented either in isolation or with a hand holding them in the correct configuration as defined by their actual use (see
<xref ref-type="fig" rid="f1">Fig. 1</xref>
). The visual stimuli used in this experiment were taken from the online museum image database: ‘The world museum of man’. The images were presented on a white background using the MATLAB programming software (Version 6.5) with the Psychophysics Toolbox extensions
<sup>
<xref ref-type="bibr" rid="b11">11</xref>
,
<xref ref-type="bibr" rid="b12">12</xref>
</sup>
. Before each image presentation, a fixation followed by a mask were presented for 200 and 24 ms, respectively. The mask was used in order to avoid interference effects between the visual stimuli presented.</p>
<p>In Experiment 2, the set-up and stimuli were identical to that of Exp. 1 with the sole difference that the stimuli were presented in printed cards instead of the computer screen. The visual images were scaled on 10×12 laminated cards. At the back of each card an alphanumeric labelling (1A, 2A etc.) was used in order to facilitate identification of a given stimulus.</p>
<p>In Exp. 3, the experimental set-up was identical to that of Exp. 2 with the sole difference that a new set of stimuli were used and participants could see, touch, and manipulate this new set. The stimuli consisted of 9 different lithic tools presented: a) in isolation on a printed card (participant-controlled orientation), b) the actual tool, and c) the image of a hand holding the tool in the correct configuration. The lithic tools used in this experiment were custom-made imitations of lithic tools.</p>
<p>The experimental sessions were recorded using two Sony Digital Video Cameras. The cameras recorded simultaneously a frontal- and profile-view of the participants (see
<xref ref-type="fig" rid="f1">Fig. 1</xref>
). The profile-view was used for capturing participants’ movements. The two views were synchronized by a clap, which was produced by the participants or the experimenter before the start of the experiment.</p>
</sec>
<sec disp-level="2">
<title>Procedures</title>
<p>Before the start of the experiment, the participants were informed that they would be presented with a series of images of objects (and the actual objects in Exp. 3) and the same objects held by an agent. They were asked to provide a detailed verbal description of each object and its possible uses. They were also informed that defining a potential use for a given object may sometimes be difficult, in which case they could continue with the next object without reporting a use. The task was self-paced and the participants were free to spend as much time as they wished talking about a given object before advancing to the next one. For Exps. 2 and 3, participants were also asked to create object categories based on any information they wanted and report the criterion for category creation.</p>
<p>The participants were informed that they will be recorded and were asked to complete an informed consent form. The experimenter monitored the participants through a monitor placed behind a curtain out of the participant’s sight. This was done in order to provide the participants with some privacy and allow them to complete the task without the intervention of the experimenter.</p>
</sec>
<sec disp-level="2">
<title>Movie processing</title>
<p>The audiovisual recordings were captured and processed using the video processing software Vegas Pro 8.0 (Sony Creative Software Inc.). The initial recordings were captured at: video of 25 fps interlaced, 720×576, DV and audio of 48 Hz, 16-bit, stereo, uncompressed. The videos were further processed to: video of H.264, 25 fps, 720×576 and audio of ACC, 48 Hz. The latter processing was done in order to decrease the size of each video file and allow compatibility with most media players currently available.</p>
</sec>
</sec>
<sec disp-level="1">
<title>Data Records</title>
<p>The data is freely available and stored at Figshare (Data Citation 1). This resource contains an excel file (Experimental_Information.xls; Data Citation 1) with information on: a) the participant’s assigned number and experiment (e.g., PN#_E#, where PN corresponds to the participant number and E to the experiment), which serves as a guide to the corresponding video, audio, and transcription files, b) basic demographic information (e.g., gender, age), and c) the available data files for each participant, details regarding their size (in mb) and duration (in secs), and potential problems with these files. These problems were mostly due to dropped frames in one of the two cameras and in rare cases missing files. The excel file is composed of three different sheets that correspond to the three different experiments conducted (refer to Methods).</p>
<p>The audiovisual videos (.mp4), audio files (.aac), and transcription files (.trs) are organized by experiment and participant (Note: Audiovisual and audio/transcribed files are not equal in number given that some participants did not allow public release of their video but only their audio recordings). Each participant file contains the frontal (F) and profile (P) video recordings (e.g., PN1_E1_F that refers to participant 1, experiment 1, frontal view) and the transcribed file along with the audio file. Also, the videos are labelled according to the experimental condition: where ‘NH’ denotes that the object is in isolation, ‘H’ that the object is held by an agent, and ‘T’ that the actual, physical object is presented (e.g., PN1_E1_F_H that refers to participant 1, experiment 1, frontal view, object held by an agent). These files are compressed in a .rar format per participant and per experiment (see
<xref ref-type="table" rid="t1">Table 1</xref>
for an overview of the data).</p>
</sec>
<sec disp-level="1">
<title>Technical Validation</title>
<p>In the three experiments conducted, we implemented a ‘thinking aloud’
<sup>
<xref ref-type="bibr" rid="b13">13</xref>
</sup>
approach in order to create a data resource with a rich body of linguistic and motoric information on objects and object affordances. Such resource should include information not only related to object namings and uses but also to object features, actions related to object/uses, and potential associations of all these elements (i.e., reasoning patterns). We validated whether or not this resource satisfied the initial goal posed by measuring the breadth of linguistic information collected.</p>
<p>All participant reports were transcribed manually using the speech-to-text transcription environment Transcriber
<sup>
<xref ref-type="bibr" rid="b14">14</xref>
</sup>
. Segmentation of speech into utterances was determined by the experimenter guided by pauses and intonation patterns. This was necessary so that the information reported was categorized correctly in terms of their object referent. A total of 287 files were transcribed (approximately 95 h) with a 30-minute file requiring approximately 3–4 h of transcription. Acoustic events (e.g., sneezing, clapping), non-speech segments (e.g., prolonged periods of silence), and speech phenomena (e.g., corrections, fillers) were also transcribed.</p>
<p>The transcribed verbal data were then semantically annotated in the Anvil annotation environment
<sup>
<xref ref-type="bibr" rid="b15">15</xref>
</sup>
using a very basic specification scheme covering: object features, object namings, object uses, and reasoning patterns. The latter comprised: a) justifications of the naming or use of an object, b) comparisons of a feature or object that were present during experimentation or were absent but participant reported, c) conditionals: conditions that had to be met in order to attribute a feature, name, or use for a given object, and d) analogies.</p>
<p>This annotation indicated 2942 unique object categories for which feature and affordance categories have been captured, going beyond the limited set of the 10 lithic tool categories to a large number of modern objects. For these object categories, 2090 unique feature and affordance categories have been captured, as well as 5567 reasoning pattern instances.
<xref ref-type="table" rid="t2">Table 2</xref>
shows the exact numbers of these data per type and related examples. It must be noted here that we only report unique counts rather than frequency of occurrence of a given category, as we consider this a more objective measure of the wealth of information obtained, given also that the information obtained went way beyond the stimuli presented to the participants.</p>
<p>Furthermore, annotation of motoric elements in the audiovisual data took place in the ELAN annotation environment
<sup>
<xref ref-type="bibr" rid="b16">16</xref>
</sup>
and comprised two broad categories: Exploratory Acts (EAs) and gestures/movements. The EAs identified are an extended set of exploratory actions on objects than previously reported (e.g., see Exploratory Procedures
<sup>
<xref ref-type="bibr" rid="b10">10</xref>
</sup>
) and were characterized by movements that allowed for feature discovery and/or verification. They totaled 11.209 instances. The gestures/movements noted were: a) emblems, b) deictic, c) metaphoric: pictorial gestures for abstract concepts, d) iconic-pantomime: gestures for the enactment of actions and object features, e) pantomime metaphoric: gestures for the enactment of actions with the hand mimicking the tool, f) demonstrations: the actual enactment of the use of an object with no goal attained, and g) body movements (see
<xref ref-type="table" rid="t2">Table 2</xref>
).</p>
<p>Together, the general data (verbal and motoric) categories briefly described here demonstrate that this resource is indeed a one-of-a-kind reference of how people talk about objects, how they perceive them, and discover their affordances. This data set can provide valuable information on the object parts and/or features that are salient to the observer for a given action and/or use and on the modality-dependent information needed to infer an object’s identity and/or function
<sup>
<xref ref-type="bibr" rid="b10">10</xref>
</sup>
. Finally, this is the first resource that allows for modeling object affordances from data on objects that were never presented during experimentation, thus opening the path for the discovery of object affordances.</p>
</sec>
<sec disp-level="1">
<title>Additional Information</title>
<p>
<bold>How to cite this article:</bold>
Vatakis, A. & Pastra, K. A multimodal dataset of spontaneous speech and movement production on object affordances.
<italic>Sci. Data</italic>
3:150078 doi: 10.1038/sdata.2015.78 (2016).</p>
</sec>
<sec sec-type="supplementary-material" id="S1">
<title>Supplementary Material</title>
<supplementary-material id="d33e21" content-type="local-data">
<media xlink:href="sdata201578-isa1.zip"></media>
</supplementary-material>
</sec>
</body>
<back>
<ref-list content-type="data-citations">
<title>Data Citations</title>
<ref id="d1">
<mixed-citation publication-type="data">
<source>Figshare</source>
<name>
<surname>Vatakis</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Pastra</surname>
<given-names>K.</given-names>
</name>
<year>2015</year>
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.6084/m9.figshare.1457788">http://dx.doi.org/10.6084/m9.figshare.1457788</ext-link>
</mixed-citation>
</ref>
</ref-list>
<ack>
<p>This work was funded by the European Commission Framework Program 7 project POETICON (ICT-215843) and POETICON++ (ICT-288382). We would like to thank Elissavet Bakou, Stamatis Paraskevas, and Ifigenia Pasiou for assistance during the audiovisual recordings, Paraskevi Botini for assistance with the transcription process, Maria Giagkou for assistance with the annotation process, Dimitris Mavroeidis for assistance in video compression/processing, Panagiotis Dimitrakis for assistance in data processing, and Guendalina Mantovani for providing the lithic tools used in Experiment 3.</p>
</ack>
<ref-list>
<ref id="b1">
<mixed-citation publication-type="journal">
<name>
<surname>McRae</surname>
<given-names>K.</given-names>
</name>
,
<name>
<surname>Cree</surname>
<given-names>G. S.</given-names>
</name>
,
<name>
<surname>Seidenberg</surname>
<given-names>M. S.</given-names>
</name>
&
<name>
<surname>McNorgan</surname>
<given-names>C.</given-names>
</name>
<article-title>Semantic feature production norms for a large set of living and nonliving things</article-title>
.
<source>Behav. Res. Methods Instrum. Comput.</source>
<volume>37</volume>
,
<fpage>547</fpage>
<lpage>559</lpage>
(
<year>2005</year>
).</mixed-citation>
</ref>
<ref id="b2">
<mixed-citation publication-type="journal">
<name>
<surname>Snodgrass</surname>
<given-names>J. C.</given-names>
</name>
&
<name>
<surname>Vanderwart</surname>
<given-names>M.</given-names>
</name>
<article-title>A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity, and visual complexity</article-title>
.
<source>J. Exp. Psychol. Hum. Learn</source>
<volume>6</volume>
,
<fpage>174</fpage>
<lpage>215</lpage>
(
<year>1980</year>
).
<pub-id pub-id-type="pmid">7373248</pub-id>
</mixed-citation>
</ref>
<ref id="b3">
<mixed-citation publication-type="journal">
<name>
<surname>Wu</surname>
<given-names>L.-l.</given-names>
</name>
&
<name>
<surname>Barsalou</surname>
<given-names>L. W.</given-names>
</name>
<article-title>Perceptual simulation in conceptual combination: Evidence from property generation</article-title>
.
<source>Acta Psychol.</source>
<volume>132</volume>
,
<fpage>173</fpage>
<lpage>189</lpage>
(
<year>2009</year>
).</mixed-citation>
</ref>
<ref id="b4">
<mixed-citation publication-type="journal">
<name>
<surname>Froimovich</surname>
<given-names>G.</given-names>
</name>
,
<name>
<surname>Rivlin</surname>
<given-names>E.</given-names>
</name>
,
<name>
<surname>Shimshoni</surname>
<given-names>I.</given-names>
</name>
&
<name>
<surname>Soldea</surname>
<given-names>O.</given-names>
</name>
<article-title>Efficient search and verification for function based classification from real range images</article-title>
.
<source>Comput. Vis. Image Underst.</source>
<volume>105</volume>
,
<fpage>200</fpage>
<lpage>217</lpage>
(
<year>2007</year>
).</mixed-citation>
</ref>
<ref id="b5">
<mixed-citation publication-type="journal">
<name>
<surname>Johnson</surname>
<given-names>C. J.</given-names>
</name>
,
<name>
<surname>Paivio</surname>
<given-names>A.</given-names>
</name>
&
<name>
<surname>Clark</surname>
<given-names>J. M.</given-names>
</name>
<article-title>Cognitive components of picture naming</article-title>
.
<source>Psychol. Bull.</source>
<volume>120</volume>
,
<fpage>113</fpage>
<lpage>139</lpage>
(
<year>1996</year>
).
<pub-id pub-id-type="pmid">8711012</pub-id>
</mixed-citation>
</ref>
<ref id="b6">
<mixed-citation publication-type="journal">
<name>
<surname>Vinson</surname>
<given-names>D. P.</given-names>
</name>
&
<name>
<surname>Vigliocco</surname>
<given-names>G.</given-names>
</name>
<article-title>Semantic feature production norms for a large set of objects and events</article-title>
.
<source>Behav. Res. Methods</source>
<volume>40</volume>
,
<fpage>183</fpage>
<lpage>190</lpage>
(
<year>2008</year>
).
<pub-id pub-id-type="pmid">18411541</pub-id>
</mixed-citation>
</ref>
<ref id="b7">
<mixed-citation publication-type="journal">
<name>
<surname>Snodgrass</surname>
<given-names>J. G.</given-names>
</name>
&
<name>
<surname>Yuditsky</surname>
<given-names>T.</given-names>
</name>
<article-title>Naming times for the Snodgrass and Vanderwart pictures</article-title>
.
<source>Behav. Res. Methods Instrum. Comput.</source>
<volume>28</volume>
,
<fpage>516</fpage>
<lpage>536</lpage>
(
<year>1996</year>
).</mixed-citation>
</ref>
<ref id="b8">
<mixed-citation publication-type="journal">
<name>
<surname>Szekely</surname>
<given-names>A.</given-names>
</name>
<italic>et al.</italic>
<article-title>Timed picture naming: Extended norms and validation against previous studies</article-title>
.
<source>Behav. Res. Methods Instrum. Comput.</source>
<volume>35</volume>
,
<fpage>621</fpage>
<lpage>633</lpage>
(
<year>2003</year>
).
<pub-id pub-id-type="pmid">14748507</pub-id>
</mixed-citation>
</ref>
<ref id="b9">
<mixed-citation publication-type="journal">
<name>
<surname>Vingerhoets</surname>
<given-names>G.</given-names>
</name>
,
<name>
<surname>Vandamme</surname>
<given-names>K.</given-names>
</name>
&
<name>
<surname>Vercammen</surname>
<given-names>A.</given-names>
</name>
<article-title>Conceptual and physical object qualities contribute differently to motor affordances</article-title>
.
<source>Brain Cogn.</source>
<volume>69</volume>
,
<fpage>481</fpage>
<lpage>489</lpage>
(
<year>2009</year>
).
<pub-id pub-id-type="pmid">19046798</pub-id>
</mixed-citation>
</ref>
<ref id="b10">
<mixed-citation publication-type="journal">
<name>
<surname>Klatzky</surname>
<given-names>R. L.</given-names>
</name>
,
<name>
<surname>Lederman</surname>
<given-names>S. J.</given-names>
</name>
&
<name>
<surname>Metzger</surname>
<given-names>V. A.</given-names>
</name>
<article-title>Identifying objects by touch: An ‘expert system’</article-title>
.
<source>Percept. Psychophys.</source>
<volume>37</volume>
,
<fpage>299</fpage>
<lpage>302</lpage>
(
<year>1985</year>
).
<pub-id pub-id-type="pmid">4034346</pub-id>
</mixed-citation>
</ref>
<ref id="b11">
<mixed-citation publication-type="journal">
<name>
<surname>Brainard</surname>
<given-names>D. H.</given-names>
</name>
<article-title>The Psychophysics Toolbox</article-title>
.
<source>Spat. Vis.</source>
<volume>10</volume>
,
<fpage>433</fpage>
<lpage>436</lpage>
(
<year>1997</year>
).
<pub-id pub-id-type="pmid">9176952</pub-id>
</mixed-citation>
</ref>
<ref id="b12">
<mixed-citation publication-type="journal">
<name>
<surname>Pelli</surname>
<given-names>D. G.</given-names>
</name>
<article-title>The VideoToolbox software for visual psychophysics: Transforming numbers into movies</article-title>
.
<source>Spat. Vis.</source>
<volume>10</volume>
,
<fpage>437</fpage>
<lpage>442</lpage>
(
<year>1997</year>
).
<pub-id pub-id-type="pmid">9176953</pub-id>
</mixed-citation>
</ref>
<ref id="b13">
<mixed-citation publication-type="journal">
<name>
<surname>Ericsson</surname>
<given-names>K.</given-names>
</name>
&
<name>
<surname>Simon</surname>
<given-names>H.</given-names>
</name>
<article-title>Verbal reports as data</article-title>
.
<source>Psychol. Rev.</source>
<volume>87</volume>
,
<fpage>215</fpage>
<lpage>251</lpage>
(
<year>1980</year>
).</mixed-citation>
</ref>
<ref id="b14">
<mixed-citation publication-type="journal">
<name>
<surname>Barras</surname>
<given-names>C.</given-names>
</name>
,
<name>
<surname>Geoffrois</surname>
<given-names>E.</given-names>
</name>
,
<name>
<surname>Wu</surname>
<given-names>Z.</given-names>
</name>
&
<name>
<surname>Liberman</surname>
<given-names>M.</given-names>
</name>
<article-title>Transcriber: Development and use of a tool for assisting speech corpora production</article-title>
.
<source>Speech Commun.</source>
<volume>33</volume>
,
<fpage>1</fpage>
<lpage>2</lpage>
(
<year>2000</year>
).</mixed-citation>
</ref>
<ref id="b15">
<mixed-citation publication-type="book">
<name>
<surname>Kipp</surname>
<given-names>M.</given-names>
</name>
<source>Gesture generation by imitation—From human behavior to computer character animation</source>
(Boca Raton, Florida: Dissertation.com,
<year>2004</year>
).</mixed-citation>
</ref>
<ref id="b16">
<mixed-citation publication-type="journal">
<name>
<surname>Lausberg</surname>
<given-names>H.</given-names>
</name>
&
<name>
<surname>Sloetjes</surname>
<given-names>H.</given-names>
</name>
<article-title>Coding gestural behavior with the NEUROGES-ELAN system</article-title>
.
<source>Behav. Res. Methods Instrum. Comput.</source>
<volume>41</volume>
,
<fpage>841</fpage>
<lpage>849</lpage>
(
<year>2009</year>
).</mixed-citation>
</ref>
</ref-list>
<fn-group>
<fn fn-type="conflict">
<p>The authors declare no competing financial interests.</p>
</fn>
</fn-group>
</back>
<floats-group>
<fig id="f1">
<label>Figure 1</label>
<caption>
<title>A schematic overview of the development and content of the multimodal resource of ‘thinking aloud’ verbal and spontaneously-generated motoric data on object affordances.</title>
<p></p>
</caption>
<graphic xlink:href="sdata201578-f1"></graphic>
</fig>
<table-wrap position="float" id="t1">
<label>Table 1</label>
<caption>
<title>An overview of the data captured by a frontal- and profile-view camera for each of the three experiments conducted.</title>
</caption>
<table frame="hsides" rules="groups" border="1">
<colgroup>
<col align="left"></col>
<col align="center" char="."></col>
<col align="center"></col>
<col align="center" char="."></col>
<col align="center" char="."></col>
</colgroup>
<thead valign="bottom">
<tr>
<th align="left" valign="top" charoff="50">
<bold>Experiment</bold>
</th>
<th align="center" valign="top" char="." charoff="50">
<bold>Participants</bold>
</th>
<th align="center" valign="top" charoff="50">
<bold>Stimulus input</bold>
</th>
<th align="center" valign="top" char="." charoff="50">
<bold>Total duration (in hours)</bold>
</th>
<th align="center" valign="top" char="." charoff="50">
<bold>Total size (in gigabytes)</bold>
</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left" valign="top" charoff="50">1</td>
<td align="center" valign="top" char="." charoff="50">43</td>
<td align="center" valign="top" charoff="50">Visual</td>
<td align="center" valign="top" char="." charoff="50">35.64</td>
<td align="center" valign="top" char="." charoff="50">14.04</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">2</td>
<td align="center" valign="top" char="." charoff="50">42</td>
<td align="center" valign="top" charoff="50">Visual</td>
<td align="center" valign="top" char="." charoff="50">30.89</td>
<td align="center" valign="top" char="." charoff="50">10.91</td>
</tr>
<tr>
<td align="left" valign="top" charoff="50">3</td>
<td align="center" valign="top" char="." charoff="50">39</td>
<td align="center" valign="top" charoff="50">Visual; Tactile</td>
<td align="center" valign="top" char="." charoff="50">28.54</td>
<td align="center" valign="top" char="." charoff="50">13.27</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap position="float" id="t2">
<label>Table 2</label>
<caption>
<title>Verbal and motoric elements annotated in the audiovisual data of the three experiments conducted, along with counts of unique categories or instances of those elements and representative examples.</title>
</caption>
<graphic xlink:href="sdata201578-t1"></graphic>
</table-wrap>
</floats-group>
</pmc>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/HapticV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000611 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000611 | SxmlIndent | more
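
To save the indented record to a file instead of paging it, the same selection can simply be redirected. This is a minimal sketch, assuming the Dilib tools shown above (HfdSelect, SxmlIndent) are on the PATH and $EXPLOR_AREA is set as in the examples:

# write the indented XML record 000611 to a local file (sketch, same flags as above)
HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000611 | SxmlIndent > 000611.xml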

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    HapticV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:4718047
   |texte=   A multimodal dataset of spontaneous speech and movement production on object affordances
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:26784391" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a HapticV1 
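
The same pipeline can be wrapped in a small shell helper to regenerate the page for any record indexed in RBID.i. This is only a sketch, under the assumption that the tools accept identical arguments for other PubMed identifiers; the helper name wicri_page is illustrative and not part of Dilib:

# sketch: parameterize the pipeline above by PubMed identifier
wicri_page() {
    pmid="$1"                      # e.g. 26784391 for this record
    HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i -Sk "pubmed:$pmid" \
        | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd \
        | NlmPubMed2Wicri -a HapticV1
}

wicri_page 26784391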

Wicri

This area was generated with Dilib version V0.6.23.
Data generation: Mon Jun 13 01:09:46 2016. Site generation: Wed Mar 6 09:54:07 2024