Malcolm Slaney, David V. Anderson
Psychoacoustics is the study of how humans perceive sound. In this tutorial we explain the basics of sound perception using many audio examples, and we show how knowledge of psychoacoustics is applied in areas including speech and audio compression, signal enhancement, speech recognition, virtual reality, auditory scene analysis, quality assessment, and hearing-loss compensation.
Psychoacoustics is of growing importance to signal processors: interfacing to human perception is a major challenge for the engineering world. One early success is MP3 music compression and its underlying psychoacoustic masking model. Speech recognizers based on auditory models (MFCC and RASTA) are successful, but better recognizers will require an understanding of the human "cocktail-party effect." Better compression and sound-understanding systems will benefit from compressed-domain sampling and independent component analysis. We will show how the human auditory system is modeled with signal processing, and how signal processing can better address the needs of human perception.
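As one concrete illustration of how auditory perception enters a signal-processing front end, the sketch below computes mel-frequency cepstral coefficients (MFCCs) for a single windowed frame of audio. It is a minimal example, not code from the tutorial: the frame length, sample rate, and filter counts are assumed values, and the mel-scale formula, log compression, and DCT step follow the standard textbook recipe.

# Minimal, illustrative MFCC front end (hypothetical parameter choices),
# showing how the mel scale, a psychoacoustic frequency warping,
# enters a speech-recognition feature pipeline. NumPy only.
import numpy as np

def hz_to_mel(f):
    # Standard mel-scale approximation of perceived pitch spacing.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(frame, sample_rate=16000, n_filters=26, n_ceps=13):
    """Compute MFCCs for one windowed frame of audio samples."""
    n_fft = len(frame)
    spectrum = np.abs(np.fft.rfft(frame)) ** 2          # power spectrum

    # Triangular filters spaced uniformly on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, len(spectrum)))
    for i in range(n_filters):
        lo, center, hi = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, lo:center] = (np.arange(lo, center) - lo) / max(center - lo, 1)
        fbank[i, center:hi] = (hi - np.arange(center, hi)) / max(hi - center, 1)

    # Log compression (a rough loudness model), then a DCT to decorrelate.
    log_energies = np.log(fbank @ spectrum + 1e-10)
    n = np.arange(n_filters)
    ceps = np.array([np.sum(log_energies * np.cos(np.pi * k * (n + 0.5) / n_filters))
                     for k in range(n_ceps)])
    return ceps

# Example: MFCCs of a 25 ms Hann-windowed frame containing a 440 Hz tone.
t = np.arange(400) / 16000.0
frame = np.hanning(400) * np.sin(2 * np.pi * 440 * t)
print(mfcc(frame))

The mel-spaced filter bank and the log compression are the points where psychoacoustic knowledge (roughly critical-band frequency spacing and compressive loudness scaling) shapes the feature representation.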
This tutorial will be illustrated with many audio and video examples, classroom demonstrations, and tests that are hard to appreciate simply by reading a paper.
Malcolm Slaney is a principal scientist at Yahoo! Research Laboratory. He received his PhD from Purdue University for his work on computed imaging. He is a coauthor, with A. C. Kak, of the IEEE book "Principles of Computerized Tomographic Imaging," which was recently republished by SIAM in its "Classics in Applied Mathematics" series. He is coeditor, with Steven Greenberg, of the book "Computational Models of Auditory Function."
Before Yahoo!, Dr. Slaney worked at Bell Laboratories, Schlumberger Palo Alto Research, Apple Computer, Interval Research, and IBM's Almaden Research Center. He is also a Consulting Professor at Stanford's CCRMA, where he organizes and teaches the Hearing Seminar. His research interests include auditory modeling and perception, multimedia analysis and synthesis, compressed-domain processing, music similarity and audio search, and machine learning.
David V. Anderson is an associate professor in the School of Electrical and Computer Engineering at Georgia Tech and director of ACES, the Advanced Center for Embedded Systems. He received his PhD from Georgia Tech for work on audio signal representations. Dr. Anderson's research interests include audio and psychoacoustics, signal processing in the context of human auditory characteristics, and the real-time application of such techniques using both analog and digital hardware. His research has included the development of a digital hearing-aid algorithm that has since become a successful commercial product.
Dr. Anderson was awarded the National Science Foundation CAREER Award for excellence as a young educator and researcher in 2004, and the Presidential Early Career Award for Scientists and Engineers in the same year. He has over 100 technical publications and 5 patents. Dr. Anderson is a Senior Member of the IEEE and a member of the Acoustical Society of America and the American Society for Engineering Education.