Softphone project with very low power consumption, radiation and bitrate, supporting spatial audio, frequency shifts and ultrasonic communications. For computers, tablets, smartphones and connected objects.
vlrFilter
Multifunction Audio Filter with Remote Control.
3D positional audio, fast FFT convolution, hearing loss corrections, noise reduction. For Windows and Linux (Raspberry PI).
vlrMemos
Audio Memos, Voice Quality and 3D Positional Sound.
Apps to record audio memos, measure the voice quality and provide a 3D positional sound. For computers, tablets, smartphones and connected objects.
New:
The VLC and VLR codecs are ready on Windows. See below for the executables and the source code.
- vlrPhone.exe: to communicate via SIP (VLC and VLR codecs or G711).
- test.exe: to test the codecs with Wave files.
- play.exe: to play Wave files.
- fshift.exe: to test the frequency shifts and the noise reduction (V7). More Information
- pshift.exe: to test the pitch shifts and the noise reduction (V7). More Information
Click Here for more information about the noise reduction with the VLC and VLR codecs.
The utilities support the uncompressed PCM codecs.
Next Releases:
- Additional processing to remove all noise (taking the phases into account for the first or the greatest local peaks, and implementing a 50% frame overlap).
Done.
- High Quality Audio (the greatest points and the most energetic bands).
- VLC 8, VLC 16, VLC 32 and VLC 48 (the greatest local peaks).
Done.
- VLC and VLR Codecs: 8 kHz sampling rate instead of 16 kHz.
Done.
- Adding the codec VLC HQ 16: high quality and 16 kHz sampling rate.
Done.
- VLC 16, 32 and 48: use of the greatest points and the most energetic bands, instead of the local peaks only. The VLC, VLR and VLC 8 codecs always use the local peaks only.
Done.
- VLC HQ 16 and 48: transmission and use of the points energy.
These codecs have become quasi-lossless in energy, compared to uncompressed frames.
Done.
- Partial frame overlap of less than 50%, for higher quality and fewer computations:
- VLC, VLR: no change (50%)
- VLC 8, 16, 32 and 48: 33%
- VLC HQ 16 and 48: less than 10%
Done.
- Possibility to use IIR (Infinite Impulse Response) filters or FIR (Finite Impulse Response) filters, for the VLC, VLR and VLC 8 codecs.
The other codecs (VLC 16, 32, 48, HQ 16 and HQ 48) either no longer use filters or never used them.
Done.
- The FIR filters (VLC, VLR and VLC 8 codecs) are applied in the Fourier domain (fast convolutions).
Done.
- VLC 3D 48 and VLC HQ 3D 48 Codecs (3D audio codecs):
Codecs compatible with the 3D audio. 48 kHz sampling rate.
During the sound renderings, before the decompressions, HRTF (Head-Related Transfer Function) filters are applied to the channels in the Fourier domain (very fast operations), for high quality 3D positional audio outputs.
The HRTF filters are loaded at startup and can be generic or personalized.
The VLC codecs support only one channel for now.
Done. More Information
- VLC 8 Codec: FIR filter used by default. Greatest points, most energetic bands and almost same quality as VLC 16.
Done. To publish (V7).
VLC and VLR codecs:
- Increased quality.
- 16 kHz external sampling rate and 8 kHz internal sampling rate. Internal partial overlapping: 33%.
- Possibility to use a tone generator instead of IFFT (inverse FFT) during the decompression.
Done. To publish (V7).
- Tone generator:
The use of a tone generator during the decompression gives very fine control over the frequencies (double precision), as well as over the magnitudes and the phases.
The tones can be generated in parallel, so the generation is well suited to GPU programming.
This very fine control of the frequencies makes it possible to perform frequency shifts or pitch shifts in real time with very few additional operations.
This powerful feature is especially useful during voice communications in cases of profound hearing loss in some frequency region.
Done. To publish (V7).
- Uncompressed PCM codecs:
The utilities support the uncompressed PCM codecs (pcm8, pcm16, pcm32 and pcm48, for the 8, 16, 32 and 48 kHz sampling rates).
For processing, the FFT and inverse FFT are performed on the decoder side.
Done. To publish (V7).
- Default bitrate for the VLC HQ 48 codec:
The default bitrate for the VLC HQ 48 codec increases from 64 kbps to 96 kbps.
This significantly improves the quality of this codec by increasing the accuracy of the magnitudes and the phases.
Done. To publish (V7).
- Frequency shifts:
Custom control of frequencies (all codecs except the 3D audio codecs):
XML text file to define custom settings for each frequency range that you want to control. More Information
To finish.
- Pitch shifts:
Custom control of frequencies (all codecs except the 3D audio codecs):
XML text file to define custom settings for each frequency range that you want to control. More Information
To finish.
- All codecs except the codecs with frequency shifts enabled:
Possibility to load custom FIR filters for all the codecs and all the sampling rates.
This is useful for personalized audio output and hearing corrections. The filters are generated from XML text files containing the relative sensitivity of each ear at different frequencies (as the audiogram data).
For the developers, the FIR filters will be changed dynamically by calling a PJSIP function (pjmedia_codec_modify). More Information
To do.
- Stereo, multichannel and dynamic VLC 3D 48 and VLC HQ 3D 48 Codecs:
The HRTF filters will be applied to all the output channels, in particular to the left and right channels with the stereo outputs.
For the developers, the HRTF filters will be changed dynamically by calling a PJSIP function (pjmedia_codec_modify).
To do.
- Support of GPU (Graphics Processing Unit) for more efficiency and less energy consumption (FFT, FIR filters, sorts, bands of the background, tone generators).
We will use OpenGL and the Compute Shaders (OpenGL 4.3 and OpenGL ES 3.1) and/or an equivalent SDK.
To do.
- Apps for connected watches or smartwatches and connected bracelets.
- Apps for Smartphones (Android, iOS, Windows Phone).
To do.
- Asterisk module (VoIP Server).
To do.
- Implement an additional lossless compression: VLR++ codec, at less than 1200 bits per second (bps). We will use the LZW compression, with a dictionary built with thousands of previously transmitted frames, and a complete flush of data to transmit at the end of each frame. More Information
To finish.
- The codebook version of the VLC and VLR codecs, at less than 1000 bps, with the vector quantization and the nearest neighbor search (kNN). More Information
To finish.
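The codebook idea above (vector quantization with a nearest neighbor search) can be sketched as follows. This is only an illustration under stated assumptions: the codebook contents, the vector length and the function names are hypothetical, and the source does not describe how the actual codebook is built or searched. Each frame is replaced by the index of its closest codebook entry, so only the index needs to be transmitted.

```python
import math

def nearest_codeword(frame, codebook):
    """Return (index, distance) of the codebook entry closest to `frame`."""
    best_index, best_sq = -1, math.inf
    for index, codeword in enumerate(codebook):
        # Squared Euclidean distance between the frame and this codeword.
        sq = sum((a - b) ** 2 for a, b in zip(frame, codeword))
        if sq < best_sq:
            best_index, best_sq = index, sq
    return best_index, math.sqrt(best_sq)

def encode(frames, codebook):
    """Replace every frame by the index of its nearest codeword."""
    return [nearest_codeword(frame, codebook)[0] for frame in frames]

def decode(indices, codebook):
    """Look the codewords up again on the receiver side."""
    return [codebook[i] for i in indices]

# Hypothetical 4-entry codebook of 3-value feature vectors (assumption).
CODEBOOK = [
    [0.0, 0.0, 0.0],
    [1.0, 0.5, 0.0],
    [0.0, 1.0, 0.5],
    [1.0, 1.0, 1.0],
]

frames = [[0.9, 0.6, 0.1], [0.1, 0.9, 0.4]]
indices = encode(frames, CODEBOOK)
print(indices, decode(indices, CODEBOOK))
```

With a codebook of K entries, each transmitted index costs about log2(K) bits, which is where a very low bitrate would come from; the exhaustive search shown here is the simplest form of nearest neighbor search and would normally be replaced by a faster structure for large codebooks.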
Default bitrates and quality:
Default bitrates, without taking into account the silences. The silences (null frames) are not transmitted.
With the VLR codec, the bitrate can be limited to a specific value by:
- taking the non-transmitted frames into account.
- not transmitting frames once the maximum number of bits per second is reached (maximum bitrate).
- not transmitting frames while the average bitrate is reached (moving average over the last ten seconds).
The maximum bitrate is limited to 4200 bps.
The average bitrate is limited to 4200 bps.
In practice, the average bitrate is typically between 2400 and 3600 bps.
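The two limits above (a maximum number of bits per second and a moving average over the last ten seconds) can be sketched as a simple gate that decides whether a frame may be sent. This is a minimal sketch, not the VLR implementation: the class name, the method names and the way time is passed in are assumptions; only the 4200 bps limits and the ten-second window come from the text.

```python
from collections import deque

class BitrateGate:
    """Decide whether a frame may be sent under a peak and an average bit budget."""

    def __init__(self, max_bps=4200, avg_bps=4200, window_s=10):
        self.max_bps = max_bps      # bits allowed in any one-second span
        self.avg_bps = avg_bps      # average bits per second over the window
        self.window_s = window_s    # length of the moving-average window
        self.sent = deque()         # (timestamp in seconds, bits) of sent frames

    def _bits_since(self, start):
        return sum(bits for t, bits in self.sent if t > start)

    def allow(self, now, frame_bits):
        """Return True and record the frame if both limits are respected."""
        # Forget frames that have left the averaging window.
        while self.sent and self.sent[0][0] <= now - self.window_s:
            self.sent.popleft()
        # Maximum bitrate: bits in the last second, this frame included.
        if self._bits_since(now - 1.0) + frame_bits > self.max_bps:
            return False
        # Average bitrate: bits in the whole window, this frame included.
        budget = self.avg_bps * self.window_s
        if self._bits_since(now - self.window_s) + frame_bits > budget:
            return False
        self.sent.append((now, frame_bits))
        return True

# Hypothetical load: 100-bit frames offered every 20 ms for two seconds.
gate = BitrateGate()
sent_frames = sum(gate.allow(i * 0.02, 100) for i in range(100))
print("frames sent:", sent_frames, "of 100")
```

Frames that are refused are simply not transmitted, which matches the behaviour described above; a real codec would probably prefer to drop the least significant frames rather than the most recent ones, which this sketch does not attempt.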
The VLR codec is particularly useful where a very low bitrate is required, e.g. for connected objects (including smartwatches and connected bracelets), home automation, vehicles and satellite communications. Go to the Listening Page
Why another HD Voice codec?
The reference HD Voice (High Definition, or wideband) codec is G.722.2, also called AMR-WB (Adaptive Multi-Rate Wideband).
HD Voice is the future of smartphone calls.
Our codec has some advantages compared to AMR-WB:
- It is based on FFT and can be accelerated with the GPU support.
In addition to being useful for images, videos and games, graphics cards can also be useful for audio, with a reduction in battery consumption.
The GPU support (CUDA, OpenCL, Compute Shaders or another interface) is a reality for desktop and portable computers, and is another future for smartphones.
The GPU support is already a reality for many smartphones, smartwatches and connected bracelets.
- It is not limited to 16 kHz sampling rate. It can use sampling rates from 8 kHz to 48 kHz (and much more), with very low bitrates. The bitrates depend on the number and the precision of the local peaks or the points.
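How the number of local peaks drives the amount of data can be illustrated with a minimal sketch that keeps only the strongest local maxima of a frame's magnitude spectrum. The function names, the frame length and the test signal are assumptions made for illustration, and a direct DFT stands in for the FFT so that the example needs no external library; this is not the project's implementation.

```python
import cmath
import math

def dft(frame):
    """Direct DFT of a real frame, bins 0..N/2 (an FFT would be used in practice)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]

def largest_local_peaks(frame, count):
    """Keep the `count` strongest local maxima as (bin, magnitude, phase)."""
    spectrum = dft(frame)
    mags = [abs(c) for c in spectrum]
    # A local peak is a bin that is higher than both of its neighbours.
    peaks = [
        k for k in range(1, len(mags) - 1)
        if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]
    ]
    peaks.sort(key=lambda k: mags[k], reverse=True)
    kept = sorted(peaks[:count])
    return [(k, mags[k], cmath.phase(spectrum[k])) for k in kept]

SAMPLE_RATE = 8000   # Hz (assumption for the example)
N = 64               # samples per frame (assumption for the example)
frame = [
    1.0 * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2500 * i / SAMPLE_RATE)
    for i in range(N)
]

peaks = largest_local_peaks(frame, count=2)
for k, mag, phase in peaks:
    print(f"{k * SAMPLE_RATE / N:.0f} Hz  magnitude={mag:.2f}  phase={phase:.2f}")
```

Only the kept triples would be encoded, so the size of a frame grows with `count` and with the number of bits spent on each magnitude and phase, independently of the sampling rate.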
Generally speaking, from the frames compressed with the VLC and VLR codecs, in memory or on a storage medium, one can:
- Return to the original signal (inverse FFT with the magnitudes and phases).
- Generate the cepstrum (inverse FFT of the logarithm of the spectrum), for speech recognition or for the detection of diseases or abnormalities. If only the real cepstrum is of interest and the contribution of the phases is not, the phases need not be transmitted or stored.
- Generate the spectral envelope. Since the points retained for the compression consist of the logarithm of the magnitudes and the phase values (unless the phases are ignored), they directly constitute the nodes of the spectral envelope of the signal.
- Perform a fast FFT convolution just before the decompression (a simple multiplication in the frequency domain), for custom filtering or for 3D positional sound.
- Perform a fast FFT cross-correlation just before the decompression (a simple multiplication in the frequency domain) to measure the similarity between two signals, for example the synchrony between ECG (ElectroCardioGram) or EEG (ElectroEncephaloGram) signals.
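The last two operations reduce to a multiplication of spectra, which the following sketch illustrates on plain sample lists. It is an illustration only: the small recursive FFT is there to keep the example self-contained (an optimized FFT library would be used in practice), all function names are assumptions, and the sketch works on full spectra rather than on the compressed points of the codecs.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(spectrum):
    """Inverse FFT obtained from the forward FFT by conjugation."""
    n = len(spectrum)
    return [v.conjugate() / n for v in fft([c.conjugate() for c in spectrum])]

def padded_spectra(a, b):
    """Zero-pad both signals to a common power-of-two length and transform them."""
    size = 1
    while size < len(a) + len(b) - 1:
        size *= 2
    def pad(s):
        return list(s) + [0.0] * (size - len(s))
    return fft(pad(a)), fft(pad(b))

def fast_convolve(signal, kernel):
    """Linear convolution: multiply the spectra, then transform back."""
    fa, fb = padded_spectra(signal, kernel)
    product = [p * q for p, q in zip(fa, fb)]
    return [v.real for v in ifft(product)][: len(signal) + len(kernel) - 1]

def fast_cross_correlate(a, b):
    """r[lag] = sum over n of a[n + lag] * b[n], circular on the padded length."""
    fa, fb = padded_spectra(a, b)
    product = [p * q.conjugate() for p, q in zip(fa, fb)]
    return [v.real for v in ifft(product)]

filtered = fast_convolve([1.0, 2.0, 3.0], [0.0, 1.0, 0.5])
reference = [1.0, 2.0, 3.0, 0.0]
delayed = [0.0, 0.0, 1.0, 2.0, 3.0, 0.0]   # reference delayed by two samples
corr = fast_cross_correlate(delayed, reference)
best_lag = max(range(len(corr)), key=corr.__getitem__)
print(filtered, best_lag)
```

The lag at which the cross-correlation peaks gives the delay between the two signals, which is one simple way to quantify the synchrony mentioned above.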
Our codec can efficiently encode data from the beamforming microphone arrays, with or without pre-processing on the transmitter side (compression), and with the possibility of rapid processing on the receiver side (decompression).
The quality of the desired result (noise reduction, spatial filtering, ...) depends on the number of microphones and the most efficient algorithms use FFT. More Information
With compatible data, our codec can integrate efficiently into a Big Data architecture, with a reduction in data size for storage and transmission, and a reduction in the memory used for processing when the processing is done in the frequency domain. More Information
For IoT (Internet of Things) and M2M (Machine to Machine), at the level of the machines, the VLC and VLR codecs can provide very important features in artificial intelligence and deep learning for the detection of the anomalies, the recognition of the objects and the prediction of the behaviors. More Information
HD Voice, High Definition Voice, Wideband Voice
VoIP, Voice over IP
VoLTE, Voice over LTE, 4G
SIP, Session Initiation Protocol
FFT, Fast Fourier Transform
Repeats Credit
Softphone
Low battery consumption
Low electro-magnetic radiation
Satellite communications
GPU, GPGPU, Compute Shaders
Connected Objects
Connected watches (smartwatches) and connected bracelets
Vector quantization, nearest neighbor search (kNN)
3D Audio, 3D Positional Audio
HRTF (Head-Related Transfer Function)
Virtual Reality Audio
Beamforming Microphone Array
Codec for the Ultrasounds
Big data
Artificial Intelligence
Supervised Learning, Unsupervised Learning
Reinforcement Learning, Automatic Learning
Deep Learning
Neural Network
Internet of Things (IoT)
Machine to Machine (M2M)
OFDM (Orthogonal Frequency-Division Multiplexing)
PLC (Power-Line Communication)
vlrPhone:
vlrPhone is an app based on PJSIP and Open Source products, whose goal is to promote, in the future versions, a new codec (compression and decompression method) for the audio and the ultrasounds.
This codec is based on FFT (Fast Fourier Transform), the greatest points and the most energetic bands. It can also use only the local peaks.
This codec is characterized by its efficiency, its simplicity and its robustness.
It will allow low battery consumption (VLC, Very Low Consumption, yellow button) and emissions of audio frames limited to the strict minimum, so the electro-magnetic radiations will be minimized (VLR, Very Low Radiation, green button).
It will allow real-time frequency shifts, pitch shifts and equalization to correct hearing losses.
It will also allow local or near-field ultrasonic communications, and therefore communications with no electro-magnetic radiation, with or without frequency shifts.
More details can be found Here.
If the codec is not implemented, the app behaves as a normal softphone.
If the correspondent does not use vlrPhone, the app behaves as a normal softphone.
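The real-time frequency and pitch shifts mentioned above rely on the tone generator that can replace the inverse FFT during decompression. A minimal sketch of that idea, assuming each decoded point is a (frequency, magnitude, phase) triple, is shown below; the function name and the parameters `shift_hz` and `pitch_ratio` are hypothetical and do not come from the project's code.

```python
import math

def synthesize(tones, num_samples, sample_rate, shift_hz=0.0, pitch_ratio=1.0):
    """Sum of sinusoids built from (frequency_hz, magnitude, phase) triples.

    shift_hz adds a constant offset to every frequency (frequency shift);
    pitch_ratio multiplies every frequency (pitch shift).
    """
    out = [0.0] * num_samples
    nyquist = sample_rate / 2
    for freq, mag, phase in tones:
        # Shifting costs one multiplication and one addition per tone.
        f = freq * pitch_ratio + shift_hz
        if not 0.0 < f < nyquist:
            continue  # drop tones moved outside the representable band
        step = 2.0 * math.pi * f / sample_rate
        # Each tone is independent of the others, so this loop could run in parallel.
        for n in range(num_samples):
            out[n] += mag * math.cos(step * n + phase)
    return out

# Hypothetical decoded frame: a single 1000 Hz tone.
tones = [(1000.0, 1.0, 0.0)]
original = synthesize(tones, 8, 8000)
shifted = synthesize(tones, 8, 8000, shift_hz=1000.0)      # moved to 2000 Hz
lowered = synthesize(tones, 8, 8000, pitch_ratio=0.5)      # moved to 500 Hz
print(shifted)
```

Because the frequencies are ordinary floating-point values rather than FFT bin indices, they can be moved by any amount, which is what makes shifting sounds out of a frequency region affected by hearing loss inexpensive.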
vlrFilter:
vlrPhone uses a high quality audio codec (compression and decompression method), entirely based on FFT (Fast Fourier Transform).
At this stage of the development, from the version 7 of the codecs, the project provides 3 utilities on Windows for testing the audio codecs (named VLC and VLR) and other features:
- test.exe: reads an input Wave file, encodes the file, decodes the file and creates another output Wave file.
- fshift.exe: in addition, tests the frequency shifts used to correct hearing losses, and the noise reduction.
- pshift.exe: in addition, tests the pitch shifts used to correct hearing losses, pitch shifts in general, and the noise reduction.
The vlrFilter software will take over and extend the features of these utilities, adding dynamic control of the parameters via a remote control.
The software can be launched without any parameter, or with one or more parameters. The values of missing parameters are read from a configuration file if they are present in it; otherwise default values are used.
The values of these parameters can be tested and set in real time with a remote control.
Note that to date, these utilities work with the VLC and VLR codecs, and with the PCM codecs. The vlrFilter software will also take into account the uncompressed audio data.
More details can be found Here.
vlrMemos:
vlrMemos is an app to record voice or audio memos and to measure the quality of the voice.
The app will calculate and display in real-time acoustic parameters such as the LTAS (Long-Term Average Spectrum) and the HPR (High-Frequency Power Ratio).
When playing recordings, the application will add the calculation and the display of other parameters such as the Jitter, the Shimmer and the HNR (Harmonic to Noise Ratio).
Advanced parameters such as the CPP (Cepstral Peak Prominence) and the CPPS (Smoothed Cepstral Peak Prominence), which are reliable measures of dysphonia, will also be calculated and displayed. Finally, spectral parameters measuring the slowdown of the spectrum, the loss of complexity of the signal or the similarity between several channels will be calculated and displayed.
During the sound renderings, before the decompressions, personalized FIR (Finite Impulse Response) filters, generated from normalized and non normalized audiograms, will be applied to the channels in the Fourier domain (fast convolutions), for highly optimized and tailored audio outputs.
The app will be available for computers, tablets, smartphones, smartwatches and connected objects.
It can use an audio codec (audio compression and decompression method). This very fast and high-quality audio codec is based on FFT (Fast Fourier Transform) and can be accelerated with the GPU support (Graphics Processing Unit), for a very low battery consumption.
This codec is quasi-lossless in energy: the energy of an uncompressed frame is almost the same as the energy of the compressed frame.
This codec can provide the audio in 3D. During the sound renderings, before the decompressions, generic or personalized HRTF (Head-Related Transfer Function) filters are applied to the channels in the Fourier domain (very fast operations), for high quality 3D positional audio outputs.
The app will be compatible with the body sounds, the physiological signals and the variability data.
Using this app, one can:
- Detect anomalies in the voice.
- Monitor the effectiveness of a treatment of the voice.
- Monitor the progress during a training of the voice or during a speech therapy.
- Record and analyze the heartbeat sounds and the lung sounds.
- Record and analyze the physiological signals.
- Optionally, perform the sonification of the physiological signals and the variability data.
- Optionally, send the average values of some parameters in the form of codes of intensity and / or color (light notifications) to connected bulbs or to bridges of connected bulbs.
- Optionally, display the curves of values for the selected acoustic parameters.
- Optionally, compute and display the heart rate variability (HRV).
More details can be found Here.
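As an illustration of the optional heart rate variability (HRV) computation, the sketch below derives a few time-domain values from a list of intervals between successive heartbeats. The source does not say which HRV measures the app uses; SDNN and RMSSD are chosen here only because they are common time-domain measures, and the function name and input values are hypothetical.

```python
import math
import statistics

def hrv_time_domain(rr_ms):
    """Time-domain HRV values from beat-to-beat (RR) intervals in milliseconds."""
    # Differences between successive intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return {
        # Mean heart rate in beats per minute.
        "mean_hr_bpm": 60000.0 / statistics.fmean(rr_ms),
        # SDNN: standard deviation of the intervals (sample form, n - 1).
        "sdnn_ms": statistics.stdev(rr_ms),
        # RMSSD: root mean square of the successive differences.
        "rmssd_ms": math.sqrt(statistics.fmean([d * d for d in diffs])),
    }

# Hypothetical RR intervals (assumption, not measured data).
rr = [800.0, 810.0, 790.0, 800.0]
hrv = hrv_time_domain(rr)
print(hrv)
```

The RR intervals themselves would first have to be extracted from the recorded heart sounds or ECG, a step that this sketch leaves out.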
Telemonitoring:
In the field of the telemonitoring, the VLC HQ 16 codec with the multichannel support (VLC HQM 16) can be used to:
- Record or transmit the body sounds (very low frequencies).
- Record or transmit data such as the ECG (ElectroCardioGram), the EEG (ElectroEncephaloGram) and the EMG (ElectroMyoGram).
- Record or transmit the ABP (Arterial Blood Pressure) waveforms data and the PPG (PhotoPlethysmoGram) waveforms data (from the pulse oximetry).
- Record or transmit the blood glucose waveforms data, in Continuous Glucose Monitoring (CGM).
- Optionally, compute and display the heart rate variability (HRV) and the blood pressure variability (BPV).
More details can be found Here.
Big Data:
Big Data is currently one of the hottest topics in computer science.
For the data that support lossy compression, the VLC and VLR codecs are compatible with Big Data algorithms such as MapReduce, storage systems like NoSQL, and can integrate perfectly in architectures like Hadoop or Spark. More Information
IoT / M2M:
With VLC and VLR codecs (compression and decompression methods), it is possible to make the connected objects more intelligent, more recognizable and more predictable.
The need to compress at the object level is not obvious, especially if the data flows are low. But at the level of the machines that record the data of anything from a few objects to several thousand, the problem of compression can arise, especially over long periods.
For IoT (Internet of Things) and M2M (Machine to Machine), at the level of the machines, the VLC and VLR codecs can provide very important features in artificial intelligence and deep learning for the detection of the anomalies, the recognition of the objects and the prediction of the behaviors. More Information
OFDM / PLC:
The OFDM (Orthogonal Frequency-Division Multiplexing) is a method of coding signals in the form of multiple subcarriers by distributing the data on different frequencies. It uses FFT (Fast Fourier Transform) and is commonly used in most broadband communications. These include: the Digital Video Broadcasting (DVB-T and DVB-H), the Digital Audio Broadcasting (DAB), the ADSL, the Power-Line Communications (PLC, HomePlug, G3-PLC, Prime-PLC), the wireless local area networks (for example, 802.11a, 802.11g or WiFi, 802.16 or WiMAX), and the new generation mobile networks (LTE, 4G).
With the OFDM, all types of data, including the data from the VLC codecs, are carried. Nevertheless, by using the properties of these codecs (notably the background) and the properties of the data types (in particular the number of frames per second), it is possible to significantly increase the number of subcarriers as well as the range of communications, and decrease the energy consumption. This document focuses only on the power-line communications, used in particular with the Smart Grids, the Internet of Energy, the smart cities, and the electric cars. More Information
Neural Networks:
The VLR codec has been developed not only to compress the audio data very strongly, but also to minimize how often audio frames are transmitted.
It can be used to reduce the latencies in the bidirectional voice communications.
Finally, we can consider using it with the neural networks, especially with the deep reinforcement learning.
Alert systems can be set up for monitoring the vital signs.
One can use data from the Internet of Things (IoT).
The storage requirements are reduced, the calculations are simplified, the predictions and the decision aids are facilitated. More Information