SoundStudioMFC.zip - 229K SoundStudioCS.zip - 176K
windowsmedia.zip - 919K

Introduction

SoundStudio is my latest demo application; it shows how to use the Peak Meter control with real sound. The download includes source code to read and decode multimedia audio streams.
Most of the code is part of the Multimedia Library that I've been using for the last couple of years. These classes are very easy to use, but I should document them in more detail once I have some time. If you have questions, let me know.

Description

Real-time Peak Meter processing is composed of three blocks.
Figure: Mic processing
  • The source: the source of audio data. This can be a microphone, a digital sound file (CD audio) or even a video, if you grab its audio part. Whatever your source, the principle is the same: you have to convert the data to a format that is suitable for signal processing.
  • DSP processing: what I'm calling the DSP processing block here is the component responsible for converting the audio data to peak amplitude format.
  • Peak Meter rendering: this is the rendering control (just like the Peak Meter control of this article). The rendering engine doesn't have to be as fancy as this control; a progress control can be used to show peak audio. In fact, if you look around the Windows Control Panel, this is exactly what is being used.
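
As a rough illustration of how the three blocks fit together, here is a minimal Python sketch. It is not the SoundStudio code: every function name is hypothetical, the "source" is a synthetic sine tone instead of a microphone, the DSP stand-in simply takes the largest absolute sample rather than doing the frequency analysis described later, and the "rendering" is a text bar.

```python
import math
import struct

def synth_pcm16(freq_hz, sample_rate_hz, n_samples, amplitude=0.5):
    """Source stand-in: a sine tone packed as 16-bit little-endian mono PCM."""
    values = (int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz))
              for i in range(n_samples))
    return b"".join(struct.pack("<h", v) for v in values)

def pcm16_to_floats(data):
    """Convert 16-bit little-endian PCM bytes to floats in [-1.0, 1.0)."""
    count = len(data) // 2
    return [s / 32768.0 for s in struct.unpack("<%dh" % count, data[:count * 2])]

def peak_level(samples):
    """DSP stand-in: largest absolute sample value in the buffer."""
    return max((abs(s) for s in samples), default=0.0)

def render_bar(level, width=20):
    """Rendering stand-in: a text progress bar for a level in [0.0, 1.0]."""
    filled = round(min(max(level, 0.0), 1.0) * width)
    return "[" + "#" * filled + "-" * (width - filled) + "]"

# 100 ms of a 440 Hz tone at 22050 Hz, pushed through the three blocks.
pcm = synth_pcm16(440, 22050, 2205)
level = peak_level(pcm16_to_floats(pcm))
print(render_bar(level))
```

Keeping the three stages as separate functions mirrors the block diagram: any stage can be swapped (a file decoder for the source, a real control for the bar) without touching the others.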

Audio Source

The microphone is the most basic audio source that you can think of.
The previous figure illustrates how your computer receives digital samples of the sound picked up by the microphone. You specify the format in which you wish to receive the data, usually PCM (Pulse Code Modulation), and the audio capture device collects samples at the sampling frequency. PCM data and WAVE chunks are common terms for the data obtained from the ADC (Analog to Digital Converter).
For example, to digitize audio from a microphone at 22050 Hz, single channel (mono), with 16-bit resolution, and to receive samples every 100 ms, you have to provide a buffer of 4410 bytes (22050 samples/s * (16/8) bytes per sample * 0.1 s).
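
The same arithmetic can be written as a small helper. This is only a sketch; the function name and parameters are my own and do not come from the Multimedia Library.

```python
def capture_buffer_bytes(sample_rate_hz, bits_per_sample, channels, interval_ms):
    """Bytes of PCM data produced during one capture interval."""
    bytes_per_frame = channels * bits_per_sample // 8   # one sample for every channel
    frames = sample_rate_hz * interval_ms // 1000       # frames captured in the interval
    return frames * bytes_per_frame

# The example from the text: 22050 Hz, 16 bits, mono, 100 ms.
print(capture_buffer_bytes(22050, 16, 1, 100))  # 4410
```

For stereo CD-quality audio (44100 Hz, 16 bits, 2 channels) the same 100 ms interval needs 17640 bytes.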

I mentioned file sources earlier. A file source is simply audio data storage that gives you quick access to your samples. Compressed audio formats such as MP3, WMA and Ogg Vorbis shrink the data while trying to maintain the audio quality, so they have to be decoded back to PCM before processing.

DSP Processing

DSP processing is a very interesting subject to learn and work with. This block receives digital samples from the source, approximates the original waveform and finds its peak magnitudes.
I cannot go into detail about how the FFT (Fast Fourier Transform) works in this article, so I recommend that interested readers visit the links in the References section to learn more about it.

The FFT plays an important role in signal processing and is probably one of the most written-about subjects in software engineering. In a digital system, the ADC (Analog to Digital Converter) gives us a set of digital audio samples (a discrete signal). The theory tells us that when we perform a DFT (Discrete Fourier Transform) on a discrete signal, we find its component frequencies, including their phase and amplitude. Advanced readers should note that real audio is not a pure sine wave, where only the fundamental is visible: performing a DFT (or FFT) on an audio signal generally produces some amplitude in nearby frequency bins as well.
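
To make the bin behaviour concrete, here is a minimal sketch using a naive O(N^2) DFT in pure Python. It is not the FFT used by SoundStudio, and the sample rate, block size and test frequencies are values I picked for illustration. A tone that falls exactly on a bin shows up in that bin only, while a tone between two bins spreads into its neighbours.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive DFT; returns amplitude estimates for bins 0..N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        # DC and Nyquist bins appear once; all others are mirrored, hence the factor 2.
        scale = 1.0 if k == 0 or 2 * k == n else 2.0
        mags.append(scale * abs(acc) / n)
    return mags

def sine(freq_hz, sample_rate_hz, n):
    """n samples of a unit-amplitude sine tone."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]

SAMPLE_RATE = 8000   # assumed values: bins are SAMPLE_RATE / N = 125 Hz apart
N = 64

on_bin = dft_magnitudes(sine(1000, SAMPLE_RATE, N))    # exactly bin 8
off_bin = dft_magnitudes(sine(1030, SAMPLE_RATE, N))   # between bins 8 and 9

for k in (7, 8, 9):
    print(k, k * SAMPLE_RATE / N, round(on_bin[k], 3), round(off_bin[k], 3))
```

The second column of results shows the spreading: the 1030 Hz tone still peaks in bin 8, but bins 7 and 9 are clearly non-zero.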

Now all we have left to do is to find the range of frequencies for our analysis. The sampling theorem tells us the maximum audio frequency that our sampled signal can represent. The Nyquist theorem states that Fs > 2B: the sampling frequency Fs (digital) must be more than twice the highest frequency B in the analog signal in order to reconstruct the original signal and prevent aliasing. Aliasing is the effect that causes different signals to become indistinguishable (aliases of one another) when sampled. Thus, if we sample at 44.1 kHz (audio CD quality), we can capture the entire range of audible sound (roughly 20 kHz).

Note that the sampling frequency only needs to exceed twice our maximum frequency; using 44.1 kHz to digitize a 20 kHz sine wave gives us a few more samples than strictly necessary, which improves our spectrum analysis.
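
Aliasing is easy to demonstrate numerically. In the sketch below (the helper names and the 20 kHz sampling rate are assumptions chosen to keep the numbers round), a 15 kHz cosine sampled at 20 kHz produces exactly the same samples as a 5 kHz cosine, so the two cannot be told apart after sampling.

```python
import math

def sample_cosine(freq_hz, sample_rate_hz, n):
    """n samples of a unit-amplitude cosine tone."""
    return [math.cos(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling (folded into 0..Fs/2)."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

FS = 20000
high = sample_cosine(15000, FS, 32)   # above Fs/2, violates Fs > 2B
low = sample_cosine(5000, FS, 32)     # its alias

print(max(abs(a - b) for a, b in zip(high, low)))  # effectively zero
print(alias_frequency(15000, FS))
print(alias_frequency(20000, 44100))  # below Fs/2, so it is preserved
```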

Figure: Ideal FFT


Peak Meter Rendering

Peak Meter rendering is a piece of cake, since the DSP block carries the whole burden of the work. All that is left is to select a group of frequencies and display the amplitude of the bin closest to each one. A Peak Meter control can be as simple as a progress control, but I guess we like to be fancy from time to time! The PeakMeter control presented here does just that.
It is best to choose frequencies in the range below because they are typically found in normal conversation and music.
Figure: Audio Frequency map
The Peak Meter control is described in more detail here
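
Selecting the closest bin for each displayed frequency can be sketched as follows. The function names, the 0 to 100 meter scale and the band frequencies are illustrative assumptions; they are not the values from the frequency map above, nor the PeakMeter control's API.

```python
def nearest_bin(freq_hz, sample_rate_hz, fft_size):
    """Index of the bin whose centre frequency is closest to freq_hz."""
    bin_width = sample_rate_hz / fft_size
    index = round(freq_hz / bin_width)
    return min(index, fft_size // 2)   # clamp to the Nyquist bin

def meter_levels(magnitudes, band_freqs_hz, sample_rate_hz, fft_size, max_level=100):
    """Scale the magnitude of each band's closest bin to 0..max_level."""
    levels = []
    for freq in band_freqs_hz:
        mag = magnitudes[nearest_bin(freq, sample_rate_hz, fft_size)]
        levels.append(round(min(max(mag, 0.0), 1.0) * max_level))
    return levels

# Hypothetical spectrum: 33 bins (fft_size 64 at 8000 Hz) with energy at 1000 Hz only.
spectrum = [0.0] * 33
spectrum[8] = 0.5
print(meter_levels(spectrum, [500, 1000, 2000], 8000, 64))
```

Each value in the resulting list would then drive one band of whatever meter control is in use, whether that is a plain progress control or something fancier.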

SoundStudio Demo

SoundStudio is a simple sound player application capable of playing various audio files (.wav, .mp3 and .wma). It uses the WindowsMedia .NET library to parse the audio. The WindowsMedia .NET version included with this article doesn't have all the features; a full-featured release will be available soon.



References

Discrete Fourier Transform
Discrete Fourier Transform (Math)
FFT
Nyquist Theorem

History

07/10/2008: SoundStudio code release (MFC version)
09/02/2008: SoundStudioCS and WindowsMedia 1.0 code update
11/09/2008: SoundStudioCS and WindowsMedia 1.3 code update