Noise Meter - Introduction to Noise Suppression
After understanding the basic difference between noise suppression (removing the speaker's environmental noise so that remote listeners can hear clearly) and active noise cancellation (cancelling out the listener's own environmental noise), let's focus on how noise suppression is achieved.
One method is to use multiple microphones. Collecting sound at multiple locations gives the device similar (but still distinct) signals: the microphone near the speaker's mouth receives a much stronger voice signal than the secondary microphone, while both receive non-voice background sound at similar strength. Subtracting the secondary microphone's signal from the primary microphone's signal therefore leaves mostly voice information. The greater the distance between the microphones, the larger the difference between the near and far signals, and the easier it is for this simple algorithm to suppress noise. However, the method becomes less effective when you are not speaking, or when the geometry changes over time (for example, when you walk or run and your phone keeps moving). Multi-microphone noise suppression is certainly viable, but it comes with the drawbacks of extra hardware and processing.
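The subtraction idea above can be sketched with synthetic signals. This is a minimal illustration, not a real implementation: a sine tone stands in for the voice, white noise for the background, and the mixing coefficients (0.2 for the secondary mic's voice pickup, a small independent noise term) are assumptions chosen purely for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16000  # one second of audio at 16 kHz

# Hypothetical signals: a sine tone stands in for the voice,
# white noise for the background.
voice = 0.8 * np.sin(2 * np.pi * 440 * np.arange(n) / 16000)
noise = 0.3 * rng.standard_normal(n)

# The primary mic (near the mouth) picks up a strong voice signal;
# the secondary mic picks up the voice much more weakly. Both hear
# the background noise at similar strength, plus a little noise of
# their own.
primary = voice + noise
secondary = 0.2 * voice + noise + 0.05 * rng.standard_normal(n)

# Subtracting the two channels cancels the shared background noise
# and keeps most of the voice (scaled by 1 - 0.2 = 0.8).
recovered = primary - secondary

def snr_db(signal, residual):
    """Signal-to-noise ratio in decibels."""
    return 10 * np.log10(np.mean(signal ** 2) / np.mean(residual ** 2))

print(snr_db(voice, noise))                          # SNR at the primary mic
print(snr_db(0.8 * voice, recovered - 0.8 * voice))  # SNR after subtraction
```

Note how the improvement depends entirely on the two microphones hearing nearly identical background noise but very different voice levels; when the device moves, those assumptions break and the subtraction degrades, as described above.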
So, what if there is only one microphone? Without an additional sound source for verification and comparison, a single-microphone solution must rely on understanding the characteristics of the received noise and filtering it out. This ties back to the earlier definitions of stationary (steady-state) and non-stationary noise: stationary noise can be filtered out effectively with DSP algorithms, while non-stationary noise is a harder problem that deep neural networks (DNNs) can help solve.
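One classic DSP approach for stationary noise is spectral subtraction: estimate the noise's average magnitude spectrum, then subtract it from each frame of the noisy signal. Below is a minimal sketch assuming white, stationary noise and a separate noise-only recording for the estimate; the frame length and test signals are illustrative choices, not part of any particular product's algorithm.

```python
import numpy as np

def spectral_subtract(noisy, noise_sample, frame=512):
    """Single-microphone spectral subtraction: estimate the average
    magnitude spectrum of the (assumed stationary) noise, then
    subtract it from each frame of the noisy signal."""
    usable = len(noise_sample) // frame * frame
    noise_mag = np.abs(
        np.fft.rfft(noise_sample[:usable].reshape(-1, frame), axis=1)
    ).mean(axis=0)

    out = np.zeros(len(noisy) // frame * frame)
    for i in range(0, len(out), frame):
        spec = np.fft.rfft(noisy[i:i + frame])
        # Subtract the noise magnitude, floor negative results at
        # zero, and keep the noisy signal's phase.
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
        out[i:i + frame] = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), frame)
    return out

rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
clean = np.sin(2 * np.pi * 440 * t)           # stand-in for speech
noise = 0.3 * rng.standard_normal(16000)      # stationary white noise
noisy = clean + noise
denoised = spectral_subtract(noisy, 0.3 * rng.standard_normal(16000))
```

This works precisely because stationary noise has a stable spectrum that one estimate can describe; a siren or a dropped plate changes its spectrum faster than any fixed estimate can track, which is why non-stationary noise calls for a learned model.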
This method requires a dataset to train the network. The dataset is built by mixing various (stationary and non-stationary) noise recordings with clean speech, producing synthesized noisy speech samples. The noisy speech is fed to the DNN as input, with the corresponding clean speech as the target output. Training on such pairs yields a model that removes the noise and outputs only clean speech.
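The mixing step above is usually done at a controlled signal-to-noise ratio so the training set covers a range of noise levels. Here is a small sketch of that synthesis; the `mix_at_snr` helper and the stand-in signals are hypothetical, for illustration only.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the mixture has the requested SNR, then add
    it to the clean speech to create one noisy/clean training pair."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise

rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * 300 * np.arange(8000) / 8000)  # stand-in for speech
noise = rng.standard_normal(8000)                          # stand-in for noise
noisy = mix_at_snr(clean, noise, snr_db=5.0)
# (noisy, clean) becomes one input/target pair for the DNN
```

Repeating this over many speech clips, noise types, and SNR levels produces the paired dataset the network trains on.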
Even with a trained DNN, there are still challenges and trade-offs to consider. Running in real time with low latency requires either substantial processing power or a smaller DNN: the more parameters the network has, the slower it runs. The audio sampling rate has a similar effect on noise suppression. A higher sampling rate means the DNN must process more data per frame, but in return it can produce higher-quality output. Narrowband voice communication is therefore a good fit for real-time noise suppression.
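The sampling-rate trade-off is easy to quantify. Assuming an illustrative 20 ms processing frame (a common choice, though not mandated by anything above), the per-frame workload grows linearly with the sampling rate:

```python
# Illustrative numbers: samples and rfft spectral bins per 20 ms
# processing frame at different sampling rates. Narrowband (8 kHz)
# gives the model the least data to process per frame.
for rate in (8000, 16000, 48000):
    frame = int(rate * 0.020)   # samples in one 20 ms frame
    bins = frame // 2 + 1       # spectral bins a frequency-domain model sees
    print(f"{rate} Hz: {frame} samples/frame, {bins} bins")
```

A 48 kHz stream hands the model six times the data of an 8 kHz narrowband stream for the same 20 ms of audio, which is why narrowband is attractive when latency and compute are tight.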
