Paper at SSCS 2008 — Using Term Clouds to Represent Segment-Level Semantic Content of Podcasts

Marguerite Fuller1, Manos Tsagkias2, Eamonn Newman1, Jana Besser2, Martha Larson2, Gareth J.F. Jones1, and Maarten de Rijke2

1Dublin City University, 2University of Amsterdam
20 July 2008
Keywords: paper, sigir, sscs, speech, information retrieval

Abstract

Spoken audio, like any time-continuous medium, is notoriously difficult to browse or skim without the support of an interface that provides semantically annotated jump points signaling the user where to listen in. Creating time-aligned metadata with human annotators is prohibitively expensive, motivating the investigation of representations of segment-level semantic content based on transcripts generated by automatic speech recognition (ASR). This paper examines the feasibility of using term clouds to provide users with a structured representation of the semantic content of podcast episodes. Podcast episodes are visualized as a series of sub-episode segments, each represented by a term cloud derived from its ASR transcript. The quality of the segment-level term clouds is measured quantitatively, and their utility is investigated in a small-scale user study based on human-labeled segment boundaries. Since the segment-level clouds generated from ASR transcripts prove useful, we examine an adaptation of text tiling techniques to speech in order to generate segments as part of a completely automated indexing and structuring system for browsing spoken audio. Results demonstrate that the generated segments are comparable with human-selected segment boundaries.
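The paper does not include implementation details here, but the core idea of deriving a term cloud per transcript segment can be sketched with standard tf-idf weighting, treating each segment as a document. Everything below (the tokenizer, function names, and the choice of tf-idf as the term-weighting scheme) is an illustrative assumption, not the authors' actual method:

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and split on letter runs; a real system would also
    # remove stopwords and perhaps stem (assumption: toy tokenizer).
    return re.findall(r"[a-z]+", text.lower())

def segment_term_clouds(segments, top_k=5):
    """Rank terms per segment by tf-idf, treating each segment as a document.

    segments: list of transcript strings, one per sub-episode segment.
    Returns a list of top-k term lists, one cloud per segment.
    """
    token_lists = [tokenize(s) for s in segments]
    n = len(token_lists)
    # Document frequency: in how many segments does each term occur?
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    clouds = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # Terms concentrated in one segment score high; terms spread
        # across all segments score near zero.
        scored = {t: tf[t] * math.log(n / df[t]) for t in tf}
        clouds.append(sorted(scored, key=scored.get, reverse=True)[:top_k])
    return clouds
```

In a full pipeline along the lines the abstract describes, the segment boundaries themselves would come from a text-tiling-style segmenter run over the ASR transcript, and ASR word-confidence scores could additionally down-weight likely misrecognitions.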

References

[1] Marguerite Fuller, Manos Tsagkias, Eamonn Newman, Jana Besser, Martha Larson, Gareth J.F. Jones, and Maarten de Rijke. 2008. Using Term Clouds to Represent Segment-Level Semantic Content of Podcasts. In Proceedings of the 2nd SIGIR Workshop on Searching Spontaneous Conversational Speech (SSCS 2008).