Subjects:
- 3D Positional Audio.
- Virtual Reality Audio.
- Fast FFT Convolution.
- Hearing Loss Corrections.
- Noise Reduction.
- Beamforming Microphone Array.
- Web Audio, WebRTC.
vlrPhone, an audio communications project, is under study and development.
For more information on this project, see the following pages:
- WhmSoft
- VoIP Apps
vlrPhone uses a high-quality audio codec (a compression and decompression method) based entirely on the FFT (Fast Fourier Transform).
At this stage of development, starting with version 7 of the codecs, the project provides three Windows utilities for testing the audio codecs (named VLC and VLR) and other features:
- test.exe: reads an input Wave file, encodes it, decodes it and writes the result to an output Wave file.
- fshift.exe: in addition, tests frequency shifting for hearing loss correction, and noise reduction.
- pshift.exe: in addition, tests pitch shifting for hearing loss correction, pitch shifting in general, and noise reduction.
For more information on these utilities, see the following pages:
- Frequency Shifting
- Pitch Shifting
- Noise Reduction
The vlrFilter software will take over and extend the features of these utilities, adding dynamic control of the parameters via a remote control.
The software can be launched without any parameters, or with one or more parameters. The value of a missing parameter will be read from a configuration file if it is present there; otherwise a default value will be used.
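The lookup order described above (command line, then configuration file, then default) can be sketched as follows. This is only an illustration: the parameter names, the default values and the JSON configuration format are assumptions, not the actual vlrFilter interface.

```python
import json

# Hypothetical parameter names and defaults, for illustration only.
DEFAULTS = {"sampling_rate": 48000, "azimuth": 0, "elevation": 0}

def resolve_parameters(cli_args, config_text=None, defaults=DEFAULTS):
    """Pick each value from the command line first, then from the
    configuration file, then from the built-in defaults."""
    # Assumption: the configuration file is a flat JSON object.
    config = json.loads(config_text) if config_text else {}
    resolved = {}
    for name, default in defaults.items():
        if name in cli_args:
            resolved[name] = cli_args[name]
        elif name in config:
            resolved[name] = config[name]
        else:
            resolved[name] = default
    return resolved

# Example: azimuth given on the command line, elevation only in the file.
params = resolve_parameters({"azimuth": 45}, '{"azimuth": 10, "elevation": 20}')
```

Here the command-line azimuth overrides the one in the file, the elevation comes from the file, and the sampling rate falls back to the default.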
The values of these parameters can be tested and set in real time with the remote control.
Note that to date, these utilities work with the VLC and VLR codecs.
These utilities also support the uncompressed PCM codecs (pcm8, pcm16, pcm32 and pcm48, for the 8, 16, 32 and 48 kHz sampling rates). For processing, the FFT and inverse FFT are performed on the decoder side. The vlrFilter software will also handle uncompressed audio data.
To date, the vlrPhone Project and the utilities use the open source PJSIP library.
For the vlrFilter project, we will use the Chromium library. This choice makes the Web Audio and WebRTC interfaces available.
The VLC and VLR codecs will be implemented properly in the Chromium library. All the features listed here for these codecs will be available via command-line options, configuration files or default values.
It will be possible to build custom software with APIs based on HTML5 and JavaScript, or on C/C++, directly from the modified Chromium library.
The software will support the Windows and Linux (Raspberry Pi) operating systems.
Other operating systems (iOS, Android, Mac, Chrome OS, ...) will be considered later.
The input will be mono or stereo, with various possibilities:
- An uncompressed audio input (microphone or line).
- An uncompressed Wave file.
- A compressed audio input (with the VLC and VLR codecs).
- A compressed Wave file (with the VLC and VLR codecs).
The output will be mono or stereo, with various possibilities:
- An uncompressed audio output.
- An uncompressed Wave file.
- A compressed Wave file (with the VLC and VLR codecs).
The output will be connected to speakers, hearing aids, earphones or headphones via wires, Bluetooth or Wi-Fi. Refer to the documentation of your system for this setup.
The sampling rate will be taken, in order of priority, from the input Wave file, from the VLC or VLR codec, from the parameters given when the software starts, from the global configuration file, or from a default value.
Optionally, the input and the output will support more than two channels (multichannel).
The GPU acceleration will be available through the Chromium library.
The infrared remote control will be supplied with a USB infrared receiver.
The features that the remote control will support, and its interactions with the vlrFilter software, are described below.
Choose a function, then a number representing a sub-function (keys 0 to 9).
The [+] and [-] keys are used to change the values.
The [++] and [--] keys are used to change the values faster.
The [OK] key validates the choices.
The [Cancel] key cancels all the choices since the last validation.
The [Close] key closes the software.
The [Reset] key resets the values for a function (if you choose a function and if you press [Reset]) or for a sub-function (if you choose a sub-function and if you press [Reset]).
The [V+] and [V-] keys are used to change the global output volume.
The [i] key shows or hides the configuration information on the computer screen.
The software must dynamically respond to all the remote control commands.
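The key behaviour described above can be modelled as a small state machine. The sketch below is hypothetical: the class names are invented, only the sub-function form of [Reset] is shown, and it assumes that changed values stay pending until [OK] is pressed.

```python
from dataclasses import dataclass

@dataclass
class Parameter:
    minimum: float
    maximum: float
    step: float        # used by [+] and [-]
    fast_step: float   # used by [++] and [--]
    default: float

class RemoteState:
    """Hypothetical model of the remote control keys."""

    def __init__(self, parameters):
        self.parameters = parameters
        self.committed = {name: p.default for name, p in parameters.items()}
        self.pending = dict(self.committed)

    def press(self, key, selected=None):
        if key in ("+", "-", "++", "--"):
            p = self.parameters[selected]
            amount = p.fast_step if len(key) == 2 else p.step
            if key[0] == "-":
                amount = -amount
            value = self.pending[selected] + amount
            # Clamp to the allowed range of the parameter.
            self.pending[selected] = min(p.maximum, max(p.minimum, value))
        elif key == "OK":
            self.committed = dict(self.pending)
        elif key == "Cancel":
            # Drop every change made since the last validation.
            self.pending = dict(self.committed)
        elif key == "Reset":
            self.pending[selected] = self.parameters[selected].default

# Example with the ITD range described for the [3D] function.
state = RemoteState({"itd": Parameter(-10000, 10000, 100, 1000, 0)})
state.press("++", "itd")
state.press("+", "itd")
state.press("OK")
```

Keeping a separate pending copy is one simple way to make [Cancel] restore the last validated values; a real implementation could equally apply pending values to the audio immediately and only use the committed copy for undo.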
1)
[3D]
For 3D positional audio.
For more information on 3D positional audio, see the following page:
- 3D Positional Audio
- Key [1]: to change the azimuth (the angle in the horizontal plane). The azimuth varies from -180 to 180 degrees in steps of 5 degrees.
- Key [2]: to change the elevation (the angle in the vertical plane). The elevation varies from -90 to 90 degrees in steps of 5 degrees.
3D audio will be available through the Chromium library.
To better customize the generic HRTF (Head-Related Transfer Function) filters, especially in the horizontal plane, we add two parameters controlled by the keys [3] and [4]:
- The ITD (Interaural Time Difference).
- The ILD (Interaural Level Difference).
- Key [3]: to change the value of the ITD (in microseconds) from -10000 to 10000 in steps of 100 (keys [+] or [-]) or 1000 (keys [++] or [--]). The default value for this parameter is 0 microseconds.
- Key [4]: to change the ILD value (in dB) from -40 to 40 in steps of 0.1 (keys [+] or [-]) or 1 (keys [++] or [--]). The default value of this parameter is 0 dB.
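As a rough illustration of what the ITD and ILD settings do to a stereo signal, the following sketch delays one channel and rebalances the levels. The function name and the sign conventions are assumptions (positive ITD delays the right channel, positive ILD makes the left channel louder); the source does not specify them.

```python
def apply_itd_ild(left, right, itd_us=0.0, ild_db=0.0, sample_rate=48000):
    """Delay one channel by the ITD and split the ILD between both channels."""
    left, right = list(left), list(right)
    # Convert the ITD from microseconds to a whole number of samples.
    delay = round(abs(itd_us) * 1e-6 * sample_rate)
    pad = [0.0] * delay
    if itd_us > 0:
        # Assumed convention: positive ITD delays the right channel.
        left, right = left + pad, pad + right
    elif itd_us < 0:
        left, right = pad + left, right + pad
    # Half of the ILD (in dB) is applied to each side: 10^((ild/2)/20).
    half_gain = 10.0 ** (ild_db / 40.0)
    left = [s * half_gain for s in left]
    right = [s / half_gain for s in right]
    return left, right

# 1000 microseconds at 48 kHz is a delay of 48 samples.
out_left, out_right = apply_itd_ild([1.0], [1.0], itd_us=1000)
```

Rounding to whole samples limits the ITD resolution to about 21 microseconds at 48 kHz; a finer resolution would need a fractional delay, for example a phase ramp in the frequency domain.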
2)
[Reverb]
For the reverberation effect using fast FFT convolution.
The effect is applied to a mono or a stereo output.
The name of the Impulse Response (IR) file is given when the software starts or is read in a global configuration file.
The format (stereo Wave file) and the conditions to be met for this file are the same as for the 3D positional audio.
The keys [1] and [2] are used to control the additional gains or attenuations to apply to the output channels.
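For readers unfamiliar with the technique, here is a minimal pure-Python sketch of FFT-based convolution of one channel with an impulse response. It is a textbook radix-2 implementation for illustration only, not the code used in vlrFilter, and the function names are hypothetical.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two.
    The inverse transform is left unnormalized."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_convolve(signal, impulse_response):
    """Linear convolution computed by multiplying zero-padded spectra."""
    out_len = len(signal) + len(impulse_response) - 1
    n = 1
    while n < out_len:
        n *= 2
    a = fft([complex(v) for v in signal] + [0j] * (n - len(signal)))
    b = fft([complex(v) for v in impulse_response] + [0j] * (n - len(impulse_response)))
    y = fft([p * q for p, q in zip(a, b)], inverse=True)
    # Normalize the inverse transform and keep the valid samples.
    return [v.real / n for v in y[:out_len]]

# A two-tap "impulse response" adds a one-sample echo to the signal.
wet = fft_convolve([1.0, 2.0, 3.0], [1.0, 1.0])
```

For long impulse responses, a real-time system would process the audio in blocks (overlap-add or overlap-save) rather than transform the whole signal at once.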
3)
[IR360]
To select the multichannel surround effect using fast FFT convolution.
The number of input channels and the number of output channels are given when the software starts or are read in a global configuration file.
The name of the Impulse Response (IR) file is given when the software starts or is read in a global configuration file.
The format (multichannel Wave file) and the conditions to be met for this file are the same as for the 3D positional audio.
The keys [1], [2], etc. are used to control the additional gains or attenuations to apply to the output channels.
Note that when using multichannel files compressed with the VLC and VLR codecs, there is no need to compute an FFT before performing the fast convolution.
There is no theoretical limit to the number of channels. The actual limit is imposed by the hardware and the device drivers.
The sampling rate is 44.1 kHz or 48 kHz, the software working internally at 48 kHz.
This feature is available as an option.
4)
[Multi]
To select multichannel mode only, without fast FFT convolution.
The number of input channels and the number of output channels are given when the software starts or are read in a global configuration file.
The keys [1], [2], [3], ..., are used to control the additional gains or attenuations to apply to the output channels.
The [Multi] option can be used to apply a simple but efficient algorithm to a beamforming microphone array.
To activate this algorithm instead of simple multichannel audio, you must use the sub-functions * and #:
* [1], [2], [3], ... to give the additional gains or attenuations to apply to the magnitudes.
# [1], [2], [3], ... to give the additional shifts to apply to the phases.
An intermediate output channel will be created; all output channels will have the same content as this intermediate channel.
See the ITD and ILD keys in the [3D] section for examples of values.
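One possible reading of this algorithm, sketched in the frequency domain: each channel's spectrum is scaled by its gain, rotated by its phase shift, and the results are averaged into the intermediate channel. The function name, the use of radians for the phase shifts and the averaging are all assumptions; the source does not give these details.

```python
import cmath

def beamform_bins(channel_spectra, gains_db, phase_shifts_rad):
    """Combine per-channel FFT bins into one intermediate channel."""
    n_channels = len(channel_spectra)
    n_bins = len(channel_spectra[0])
    # One complex weight per channel: linear gain and phase rotation.
    weights = [cmath.rect(10.0 ** (g / 20.0), p)
               for g, p in zip(gains_db, phase_shifts_rad)]
    mixed = []
    for k in range(n_bins):
        total = sum(w * spectrum[k] for w, spectrum in zip(weights, channel_spectra))
        # Assumption: average so that identical channels keep their level.
        mixed.append(total / n_channels)
    return mixed

# Two microphones receiving the same spectrum.
spectra = [[1 + 0j, 1j], [1 + 0j, 1j]]
in_phase = beamform_bins(spectra, [0.0, 0.0], [0.0, 0.0])
opposed = beamform_bins(spectra, [0.0, 0.0], [0.0, cmath.pi])
```

With equal gains and no phase shift the two channels reinforce each other, while a phase shift of pi on one channel cancels them. A classical delay-and-sum beamformer would use a phase that grows with frequency; the single phase value per channel here simply mirrors the one value per key offered by the remote control.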
Notes:
Our codec can efficiently encode data from the beamforming microphone arrays, with or without pre-processing on the transmitter side (compression), and with the possibility of rapid processing on the receiver side (decompression).
The quality of the desired result (noise reduction, spatial filtering, ...) depends on the number of microphones; the most efficient algorithms use the FFT.
For more information on beamforming, see the following page:
- Beamforming
There is no theoretical limit to the number of channels. The actual limit is imposed by the hardware and the device drivers.
The sampling rate is 44.1 kHz or 48 kHz, the software working internally at 48 kHz.
This feature is available as an option.
5)
[EQ7]
For the 7-band graphic equalizer or audiometry, for the left output (left ear) and the right output (right ear).
The 7 standard frequencies are controlled by the keys [1] to [7].
An FIR filter is automatically generated and applied to the audio output.
The 7 standard frequencies are:
- 125 Hz, 250 Hz, 500 Hz, 1000 Hz, 2000 Hz, 4000 Hz and 8000 Hz.
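To show how an FIR filter can be generated from seven band gains, here is a sketch using the frequency-sampling method with a Hann window. The method, the number of taps and the interpolation of the gains between bands are assumptions chosen for illustration; the source does not say how vlrFilter builds its filter.

```python
import math

BANDS_HZ = [125, 250, 500, 1000, 2000, 4000, 8000]

def gain_db_at(freq_hz, band_gains_db, bands_hz=BANDS_HZ):
    """Interpolate the band gains linearly in dB on a log-frequency axis."""
    if freq_hz <= bands_hz[0]:
        return band_gains_db[0]
    if freq_hz >= bands_hz[-1]:
        return band_gains_db[-1]
    for i in range(len(bands_hz) - 1):
        lo, hi = bands_hz[i], bands_hz[i + 1]
        if lo <= freq_hz <= hi:
            t = math.log(freq_hz / lo) / math.log(hi / lo)
            return band_gains_db[i] + t * (band_gains_db[i + 1] - band_gains_db[i])

def design_eq_fir(band_gains_db, num_taps=255, sample_rate=48000):
    """Linear-phase FIR by frequency sampling; num_taps must be odd and >= 3."""
    mid = (num_taps - 1) // 2
    # Desired linear magnitude at each DFT frequency (symmetric spectrum).
    mags = []
    for k in range(num_taps):
        freq = min(k, num_taps - k) * sample_rate / num_taps
        mags.append(10.0 ** (gain_db_at(freq, band_gains_db) / 20.0))
    taps = []
    for n in range(num_taps):
        # Inverse DFT of a real, symmetric spectrum, centred on tap `mid`.
        acc = sum(m * math.cos(2.0 * math.pi * k * (n - mid) / num_taps)
                  for k, m in enumerate(mags))
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(window * acc / num_taps)
    return taps

# All gains at 0 dB: the filter reduces to a pure delay.
flat = design_eq_fir([0.0] * 7, num_taps=63)
```

The resulting taps are symmetric, so the filter has linear phase and delays every frequency by the same amount. The same approach extends to the 10-band equalizer by changing the list of band frequencies.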
6)
[EQ10]
For the 10-band graphic equalizer, for the left output (left ear) and the right output (right ear).
The 10 standard frequencies are controlled by the keys [0] to [9].
An FIR filter is automatically generated and applied to the audio output.
The 10 standard frequencies are:
- 32 Hz, 63 Hz, 125 Hz, 250 Hz, 500 Hz, 1000 Hz, 2000 Hz, 4000 Hz, 8000 Hz and 16000 Hz.
7)
[F-Sh]
For frequency shifting.
For more information on frequency shifting, see the following page:
- Frequency Shifting
- Key [1]: to set the value of the compression ratio between 1 and 100, in steps of 1 (keys [+] or [-]) or 10 (keys [++] or [--]). The default value of the compression ratio is 1 (no frequency shifting).
- Key [2]: to set the value of the compression threshold (in Hz), in steps of 10 or 100. The default value of the compression threshold is 1500 Hz.
- Key [3]: to set the value of the shift offset (in Hz), in steps of 10 or 100. The default value of the shift offset is 0 Hz.
- Key [4]: to set the value of the width for the composition (in Hz), in steps of 10 or 100. The default value of the width is 0 Hz (no composition).
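To give an idea of how a compression ratio and a compression threshold can interact, here is a sketch of linear frequency compression above a threshold, a technique commonly used for hearing loss correction. It is an assumption that vlrFilter works this way; the function names are hypothetical, and the shift offset and the composition width are deliberately not modelled.

```python
def compress_frequency(freq_hz, ratio=1.0, threshold_hz=1500.0):
    """Leave frequencies below the threshold alone; squeeze those above it."""
    if freq_hz <= threshold_hz:
        return freq_hz
    return threshold_hz + (freq_hz - threshold_hz) / ratio

def remap_bins(magnitudes, sample_rate, ratio=1.0, threshold_hz=1500.0):
    """Move each magnitude bin to the bin nearest its compressed frequency.
    Bins are assumed to span 0 Hz to sample_rate / 2 inclusive."""
    n_bins = len(magnitudes)
    bin_hz = (sample_rate / 2.0) / (n_bins - 1)
    out = [0.0] * n_bins
    for k, mag in enumerate(magnitudes):
        target = compress_frequency(k * bin_hz, ratio, threshold_hz)
        # Several source bins may land on the same target bin.
        out[round(target / bin_hz)] += mag
    return out

# Nine bins spaced 1000 Hz apart, compressed 3:1 above 2000 Hz.
squeezed = remap_bins([1.0] * 9, 16000, ratio=3.0, threshold_hz=2000.0)
```

With a ratio of 1 the mapping is the identity, which matches the stated default of no frequency shifting. With a higher ratio, the content above the threshold is packed into a narrower band just above it, leaving the top bins empty.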
The output assumes a single channel.
If the input has several channels, they can be treated as coming from a beamforming microphone array and combined by a specific algorithm before the methods of this section are applied. To do so, enter the following values:
* [1], [2], [3], ... to give the additional gains or attenuations to apply to the magnitudes.
# [1], [2], [3], ... to give the additional shifts to apply to the phases.
See the [Multi] section.
Multichannel support is available as an option.
8)
[P-Sh]
For pitch shifting.
For more information on pitch shifting, see the following page:
- Pitch Shifting
- Key [1]: to set the value of the shift ratio between -100 and 100, in steps of 1 (keys [+] or [-]) or 10 (keys [++] or [--]). The default value of the shift ratio is 0 (no pitch shifting).
- Key [2]: to set the value of the shift threshold (in Hz), in steps of 10 or 100. The default value of the shift threshold is 1500 Hz.
- Key [3]: to set the value of the shift offset (in Hz), in steps of 10 or 100. The default value of the shift offset is 0 Hz.
- Key [4]: to set the value of the width for the composition (in Hz), in steps of 10 or 100. The default value of the width is 0 Hz (no composition).
The output assumes a single channel.
If the input has several channels, they can be treated as coming from a beamforming microphone array and combined by a specific algorithm before the methods of this section are applied. To do so, enter the following values:
* [1], [2], [3], ... to give the additional gains or attenuations to apply to the magnitudes.
# [1], [2], [3], ... to give the additional shifts to apply to the phases.
See the [Multi] section.
Multichannel support is available as an option.
9)
[G-Sh]
Special case of the pitch shift with shift threshold = 0 Hz, shift offset = 0 Hz and no composition.
This effect corresponds to a guitar pitch shifting effect.
- Key [1]: to set the value of the shift ratio between -100 and 100, in steps of 1 (keys [+] or [-]) or 10 (keys [++] or [--]).
The output assumes a single channel.
If the input has several channels, they can be treated as coming from a beamforming microphone array and combined by a specific algorithm before the methods of this section are applied. To do so, enter the following values:
* [1], [2], [3], ... to give the additional gains or attenuations to apply to the magnitudes.
# [1], [2], [3], ... to give the additional shifts to apply to the phases.
See the [Multi] section.
Multichannel support is available as an option.
10)
[Noise]
For noise reduction.
For more information on noise reduction, see the following page:
- Noise Reduction
- Key [1]: to set the value of the foreground floor (in dB) between -180 and 0, in steps of 1 (keys [+] or [-]) or 10 (keys [++] or [--]). The default value of the foreground floor is -80 dB.
- Key [2]: to set the value of the voiced level (in dB) between -180 and 0, in steps of 1 or 10. The default value of the voiced level is -40 dB.
- Key [3]: to set the value of the energy ratio between 0 and 1000, in steps of 0.1 or 10. The default value of the energy ratio is 1.
- Key [4]: to set the value of the noise gain or of the noise attenuation (in dB), between -180 and 180, in steps of 1 or 10. The default value of this parameter is 0 dB (no gain or attenuation).
The output assumes a single channel.
If the input has several channels, they can be treated as coming from a beamforming microphone array and combined by a specific algorithm before the methods of this section are applied. To do so, enter the following values:
* [1], [2], [3], ... to give the additional gains or attenuations to apply to the magnitudes.
# [1], [2], [3], ... to give the additional shifts to apply to the phases.
See the [Multi] section.
Multichannel support is available as an option.
11)
[i1],[i2],[i3],[i4]
To display the sponsor information, from level 1 to level 4, on the computer screen.
Risks
The risks associated with this project are minimal, since most of the features already exist. Multimedia remote controls are widely used today (XBMC Media Center, Windows Media Center, ...). We simply offer a remote control adapted to our project.
There are some interesting features to develop or to finish. A slight delay cannot be ruled out.
Rewards
This project cannot succeed without the support of interested users and sponsors.
The first funds raised will be used to complete the coding and to start manufacturing the remote controls and the infrared receivers.
1)
Software I
vlrFilter, basic version
For Windows and Linux (Raspberry Pi)
June 30, 2017
5 euros
2)
Software II
vlrFilter, basic version and options
For Windows and Linux (Raspberry Pi)
September 30, 2017
50 euros
3)
Infrared remote control and receiver
July 31, 2017
50 euros
Shipping:
USA, Canada, France: 15 euros
EU: 20 euros
Other countries: 25 euros
+ vlrFilter, basic version and options
4)
Sponsor I
July 31, 2017
Name and URL of a sponsor's website.
The list of these sponsors is visible with the key [i1] on the computer screen.
Name and link in a page linked to the software web page, in the Sponsors I section.
100 euros
+ vlrFilter, basic version and options
+ Infrared remote control and receiver and free shipping
5)
Sponsor II
July 31, 2017
Name, logo and URL of a sponsor's website.
The list of these sponsors is visible with the key [i2] on the computer screen.
Name, logo and link in a page linked to the software web page, in the Sponsors II section.
250 euros
+ vlrFilter, basic version and options
+ Infrared remote control and receiver and free shipping
6)
Sponsor III
July 31, 2017
Name, logo and URL of a sponsor's website.
The list of these sponsors is visible with the key [i3] on the computer screen.
Name, logo and link in a page linked to the software web page, in the Sponsors III section.
500 euros
+ vlrFilter, basic version and options
+ Infrared remote control and receiver and free shipping
7)
Sponsor IV
July 31, 2017
Name, logo and URL of a sponsor's website.
The list of these sponsors is visible with the key [i4] on the computer screen.
Name, logo and link in a page linked to the software web page, in the Sponsors IV section.
1000 euros
+ vlrFilter, basic version and options
+ Infrared remote control and receiver and free shipping
Infographics