Thursday, April 23, 15:30 - 18:00, Location: Show and Tell Area A
Sheng-yi Kong, Miao-ru Wu, Che-kuang Lin, Yi-sheng Fu, Yungyu Chung, Yu Huang, Yun-Nung Chen, Lin-shan Lee
This proposal presents a spoken language system that automatically organizes course lectures (video/audio/slides) for efficient, on-demand learning. By matching the video/audio with the slides used, we divide the course lectures into hierarchical "major segments" of variable length based on the topics discussed. Key term extraction, hierarchical summarization, and semantic structuring are then performed over these major segments. A key term graph is constructed and used to link the major segments of the course. The user can then ask the system questions and, guided by the semantic structure the system provides, develop a personal road map for learning the needed material, taking into account available time and background knowledge. The system is called "NTU Virtual Instructor"; its technical content is described in a paper (paper number 4520) accepted for oral presentation at ICASSP 2009.
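As an illustration of how a key term graph can link major segments, the following is a minimal Python sketch. It is not the system's actual method: it assumes that key terms have already been extracted for each major segment, that two key terms are connected when they occur in the same segment, and that two segments are linked when they share a key term. The function names and the sample segments are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_key_term_graph(segment_terms):
    """Connect two key terms when they occur in the same major segment.

    segment_terms maps a segment id to the set of key terms found in it.
    Co-occurrence is an assumed linking rule, used here for illustration.
    """
    graph = defaultdict(set)
    for terms in segment_terms.values():
        for a, b in combinations(sorted(terms), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def link_segments(segment_terms):
    """Link two major segments when they share at least one key term."""
    term_to_segments = defaultdict(set)
    for seg_id, terms in segment_terms.items():
        for term in terms:
            term_to_segments[term].add(seg_id)
    links = defaultdict(set)
    for seg_ids in term_to_segments.values():
        for a, b in combinations(sorted(seg_ids), 2):
            links[a].add(b)
            links[b].add(a)
    return dict(links)


# Hypothetical key terms for three major segments of a speech course.
segments = {
    "seg1": {"hidden Markov model", "Viterbi algorithm"},
    "seg2": {"hidden Markov model", "acoustic model"},
    "seg3": {"language model"},
}
term_graph = build_key_term_graph(segments)
segment_links = link_segments(segments)
```

In this toy data, "seg1" and "seg2" are linked because both mention the hidden Markov model, while "seg3" stays unlinked until some other segment shares one of its key terms.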
With rapid advances in spoken language technologies, it is now possible to manage huge quantities of multimedia information based on its audio content, because the speech associated with multimedia is usually the key to extracting the information. One application in this area is course lectures, not only because life-long learning has become necessary in the era of knowledge explosion, but also because it is now very easy to distribute huge quantities of complete course lectures worldwide.
A major difficulty in using the many complete course lectures available is that listening to a complete course takes a long time (e.g. a complete course may run 45 hours), and leaders or researchers working in industry may not be able to spend that much time on it. On the other hand, the content of a course is usually well structured; the learner cannot understand an advanced subject without knowing the related background. As a result, directly retrieving course content on an advanced subject is usually not helpful to the learner, simply because the retrieved results are difficult to understand. Also, after learning a subject, the learner usually does not know which related subjects should be learned next.
In "NTU Virtual Instructor", the above problems are addressed by properly understanding and organizing the audio part of the lectures. When the user enters a query, the system not only retrieves the relevant short and major segments of the course lectures, but also offers summaries of the major segments (to help the user decide whether to view a major segment), the key terms involved, and all related major segments (including those on fundamental or more advanced subjects) based on the relationships in the key term graph. The user can thus learn on demand, either starting with a fundamental query and learning forward, or starting with an advanced subject and learning backward.
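The query flow described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the system's implementation: retrieval is reduced to matching key terms that appear in the query text, "fundamental" versus "advanced" is approximated by a segment's position in the course, and the names `MajorSegment` and `answer_query`, together with the sample course and graph, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MajorSegment:
    seg_id: str
    order: int            # position of the segment within the course
    summary: str          # assumed to be produced by a summarizer
    key_terms: frozenset  # assumed to be produced by key term extraction


def answer_query(query, segments, term_graph):
    """Return matching segments plus related earlier and later segments."""
    q = query.lower()
    # Assumption: a key term "matches" if it appears verbatim in the query.
    matched = {t for s in segments for t in s.key_terms if t in q}
    hits = [s for s in segments if s.key_terms & matched]
    if not hits:
        return {"hits": [], "key_terms": set(), "background": [], "next": []}

    # Follow key term graph edges from the matched terms.
    neighbours = set()
    for term in matched:
        neighbours |= term_graph.get(term, set())

    hit_ids = {s.seg_id for s in hits}
    related = sorted(
        (s for s in segments
         if s.seg_id not in hit_ids and s.key_terms & neighbours),
        key=lambda s: s.order,
    )
    first = min(s.order for s in hits)
    return {
        "hits": [(s.seg_id, s.summary) for s in hits],
        "key_terms": matched,
        # Earlier related segments: candidates for learning backward.
        "background": [s.seg_id for s in related if s.order < first],
        # Later related segments: candidates for learning forward.
        "next": [s.seg_id for s in related if s.order > first],
    }


# Hypothetical course and key term graph.
course = [
    MajorSegment("seg1", 1, "Basics of probability.",
                 frozenset({"probability"})),
    MajorSegment("seg2", 2, "Hidden Markov models.",
                 frozenset({"hidden markov model", "probability"})),
    MajorSegment("seg3", 3, "Decoding with the Viterbi algorithm.",
                 frozenset({"viterbi algorithm", "hidden markov model"})),
    MajorSegment("seg4", 4, "Beam search for large vocabularies.",
                 frozenset({"beam search"})),
]
term_graph = {
    "probability": {"hidden markov model"},
    "hidden markov model": {"probability", "viterbi algorithm"},
    "viterbi algorithm": {"hidden markov model", "beam search"},
    "beam search": {"viterbi algorithm"},
}
result = answer_query("How does the Viterbi algorithm work?", course, term_graph)
```

Here the query matches "seg3"; the graph neighbours of its key term lead back to "seg2" as background and forward to "seg4" as a possible next step, which mirrors the learning backward and learning forward paths described above.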