On 12 December 2005 the TALK and AMI projects jointly sponsored a workshop in Edinburgh entitled "Standards for Multimodal Dialogue Context." The workshop's keynote speaker was Henry Thompson (a member of the W3C with a background in linguistics, and the first academic to be appointed to the W3C's Technical Architecture Group). A key participant was Dave Raggett (a member of the W3C with expertise in multimodal interaction and standards such as EMMA). There were over 30 participants, from 7 countries and 14 research institutes. The meeting included representatives of the W3C's Multimodal Interaction Working Group and of the ISO Technical Committee 37 SubCommittee 4, which is concerned with language resources management. There were 9 research presentations and 3 discussion periods. See http://homepages.inf.ed.ac.uk/olemon/standcon-agenda.html for the workshop agenda and a link to the online proceedings.
The goal of the workshop was to develop an understanding of which areas relating to context representations in multimodal dialogue systems were ready for standardization. It began with words of advice from the W3C about how to start a standardization effort. The advice stressed the need to start by establishing common terminology, to attempt standardization only where a large enough community could reach a consensus that the standard would be adopted, and to leave openings in the standard for extension to cover the areas where no consensus would be possible.
Within this context, the workshop interspersed discussion of possibilities for standardization with talks about annotation tools, efforts to annotate dialogue resources, representations of dialogue context, and systems for categorizing the intentions behind dialogue utterances using sets of "dialogue acts".
Although many in the dialogue community would wish for one standard set of dialogue acts, the participants felt that it would be difficult to agree on a set that fit all dialogue processing needs and all dialogue genres. Similarly, although there is some commonality in the concept of tracking dialogue context in a system as a dialogue progresses towards its goal, the kinds of contextual information that are tracked differ widely, and the mechanisms themselves can be quite varied. Parts of the community would like to work on the same basic data in order to compare approaches and results better. This is not always practical, but simply sharing data more effectively would help, particularly for emerging statistical approaches such as Reinforcement Learning. The best opportunity for standardization was seen as a metadata standard that catalogues the types of entities that could be tracked as dialogue context, so that released data sets could describe what they contain against this standard. One possible mechanism for taking this idea forward is the W3C's pre-standard "incubator", which helps groups create a public note of an idea for a standard, which in turn can build the momentum for a standardization working group.
The mailing list for further discussion of these issues is dialogue-standards@inf.ed.ac.uk.