These notes are written for investors, developers and decision makers.
We recall the algorithm used to obtain a voice codec with low bitrate, low power consumption and low radiation for mobile phones (VLR):
- Choose a number C, the repeat credit.
The receiver may repeat the last received frame up to C times, until it receives another frame, which carries the remaining repeat credit to cancel.
- If the repeat credit is exhausted, the transmitter sends the current frame with the repeat credit to cancel set to zero.
- If the repeat credit is not exhausted, the transmitter compares the current frame with the previous frame:
  - If the two frames are identical or almost identical (the similarity index is to be defined by the sender), the transmitter sends nothing.
  - If the two frames are different, the transmitter sends the current frame with a number indicating the remaining repeat credit to cancel.
See:
http://www.whmsoft.com/projects/algorithms_en.html
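The repeat-credit steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the reference implementation: the names `encode` and `decode` are hypothetical, time is modelled as a list of frame slots (None means that nothing was sent in that slot), the default similarity test is plain equality, and the comparison is made against the last transmitted frame, since that is the frame the receiver repeats.

```python
def encode(frames, credit, similar=lambda a, b: a == b):
    """Return one entry per frame slot: (frame, credit_to_cancel) or None.

    Assumption: the first frame is always sent, with zero credit to cancel.
    """
    slots = []
    last_sent = None
    remaining = 0  # no credit yet, so the first frame is transmitted
    for frame in frames:
        if last_sent is not None and remaining > 0 and similar(frame, last_sent):
            slots.append(None)      # send nothing, the receiver repeats
            remaining -= 1
        else:
            # Credit exhausted (remaining == 0) or different frame:
            # send the frame with the remaining credit to cancel.
            slots.append((frame, remaining))
            last_sent = frame
            remaining = credit
    return slots


def decode(slots, credit):
    """Return (played_frames, loss_detected) for a list of frame slots."""
    played = []
    loss_detected = False
    last = None
    remaining = 0
    for slot in slots:
        if slot is not None:
            frame, to_cancel = slot
            # The credit announced by the sender must match the local count.
            if to_cancel != remaining:
                loss_detected = True
            last = frame
            remaining = credit
        elif remaining > 0:
            remaining -= 1          # repeat the last received frame
        else:
            loss_detected = True    # credit exhausted and nothing arrived
        played.append(last)
    return played, loss_detected
```

With C = 2 and the six frames a, a, a, a, b, b, this sketch transmits only three frames, and the receiver plays back the six frames unchanged. If the frame sent on credit exhaustion is dropped, the receiver sees an empty slot with no credit left and reports a loss.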
This algorithm can be used in the general case (voice and music) and with all audio compression methods that use the frequency domain fully or partially (for comparison), such as FFT or MDCT, and that solve the problem of edge effects (usually by frame overlapping).
This algorithm makes it possible to manage resynchronization at the codec level. After receiving each frame, the receiver knows whether packets were lost.
For non-successive redundancies and reliable media (files, local networks, TCP/IP, ...), you can keep the last frames in memory and compare the current frame with these earlier frames. Instead of sending the entire frame, you send an index k indicating that the current frame is similar to the earlier frame at position k. The maximum value of k can be chosen according to the cases to be treated: with a maximum of 255, only a one-byte header is sent instead of tens or hundreds of bytes.
If there is enough memory, this value can be increased significantly (up to 65535 for example, which needs a 2-byte index), and the index to send can be found quickly with accelerated search techniques (GPU programming for example, with distance calculations and sorts in parallel).
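The search over earlier frames can be sketched as follows. This is a minimal sequential illustration, not the accelerated search mentioned above. The names `encode_rear`, `decode_rear` and `distance`, the squared Euclidean distance, the `threshold` parameter and the convention that k = 1 designates the most recent frame are all assumptions made for the example.

```python
from collections import deque


def distance(a, b):
    """Squared Euclidean distance between two frames (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_rear(frames, max_k=255, threshold=0.0):
    """Replace a frame by ("ref", k) when an earlier frame is close enough."""
    history = deque(maxlen=max_k)   # last frames as the decoder will see them
    stream = []
    for frame in frames:
        best_k, best_d = 0, None
        # k = 1 is the most recent frame, k = 2 the one before, and so on.
        for k, old in enumerate(reversed(history), start=1):
            d = distance(frame, old)
            if best_d is None or d < best_d:
                best_k, best_d = k, d
        if best_d is not None and best_d <= threshold:
            stream.append(("ref", best_k))
            history.append(history[-best_k])   # keep both sides in sync
        else:
            stream.append(("raw", frame))
            history.append(frame)
    return stream


def decode_rear(stream, max_k=255):
    """Rebuild the frames from ("raw", frame) and ("ref", k) entries."""
    history = deque(maxlen=max_k)
    frames = []
    for kind, payload in stream:
        frame = history[-payload] if kind == "ref" else payload
        history.append(frame)
        frames.append(frame)
    return frames
```

The encoder stores the frame that the decoder will actually reconstruct, so both histories stay identical even when the match is only approximate. With `max_k=255` the index fits in one byte, as described above.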
In multichannel audio, one can save the last frames of a single channel (for example the left channel), and search for the nearest neighbor for all channels, in parallel.
These algorithms are chiefly useful for small FFT buffers, silences, stationary parts and redundant buffers.
This method (taking into account non-successive redundancies), added to the above (taking into account successive redundancies) and to lossless compression, will be called VLB (very low bandwidth) in this document.
The VLB method can be applied to the WHM Music codec (see the WhMic product, http://www.whmsoft.com). This codec uses the greatest points (foreground), the most energetic bands (background), the magnitudes and the phases, and a frame overlap of 50% or less.
For the foreground, one can use directly the amplitudes of the sine waves and the amplitudes of the cosine waves, instead of the phases. For the background, one uses only the magnitudes and the sign of the phases. The bands of the background, taken separately, can be encoded in parallel.
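The foreground and background split can be illustrated with a small sketch. It is not the WHM Music codec itself: the function names `dft` and `split_spectrum`, the parameter `n_foreground`, the naive DFT (used only to keep the example self-contained) and the choice of treating every remaining bin as background are assumptions. The real and imaginary parts of a bin stand for its cosine and sine components, up to the sign and scaling convention of the transform.

```python
import cmath
import math


def dft(samples):
    """Naive DFT of a real frame, bins 0 .. n/2 (for illustration only)."""
    n = len(samples)
    return [
        sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples))
        for k in range(n // 2 + 1)
    ]


def split_spectrum(spectrum, n_foreground):
    """Split bins into foreground (greatest points) and background."""
    order = sorted(range(len(spectrum)), key=lambda k: abs(spectrum[k]), reverse=True)
    fg_bins = set(order[:n_foreground])
    # Foreground: cosine and sine components instead of magnitude and phase.
    foreground = {k: (spectrum[k].real, spectrum[k].imag) for k in sorted(fg_bins)}
    # Background: magnitude and sign of the phase only.
    background = {
        k: (abs(spectrum[k]), 1 if cmath.phase(spectrum[k]) >= 0 else -1)
        for k in range(len(spectrum))
        if k not in fg_bins
    }
    return foreground, background
```

For a frame of 32 samples holding a pure cosine of frequency 3, `split_spectrum(dft(frame), 1)` places bin 3 in the foreground and the 16 other bins in the background.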
This codec uses FFT only and is very fast, especially if you take all the points of the background. You can get a very high quality by increasing the accuracy of the magnitudes, especially for the points of the foreground. This codec can be used at sampling rates of 44.1 kHz, 48 kHz, 96 kHz and 192 kHz, with 16-bit or 24-bit samples.
Note that with the VLB method, decompression is very fast and requires very little computation.
Note also that not transmitting similar successive frames reduces the header overhead added by the transmission protocols.
Finally, since this codec is based on FFT, it can be accelerated on the GPU (Graphics Processing Unit) and can handle a very large number of simultaneous channels.
The VLB method provides high quality sound with very low bandwidth consumption. It can be useful, for example, for:
- The music players.
- The private telephone switches using IP (IP PBX).
And also, with reliable media or by removing the references to the previous frames, for:
- The audio streaming.
- The voice communications servers (for games across the network).
Databases Generated in Advance
If we can analyze the data in advance (files on physical media or in streaming for example), we can combine the VLB algorithms and the algorithms of the Codebook version.
In this case, there is no need for the search over earlier frames or for additional lossless compression. Successive similar frames are still not sent.
For more information on the Codebook version, see: Codebook Version
This database may contain compressed records and/or the WAVE records used to generate the databases. In the latter case, there is no longer any need to compute the iFFT (inverse FFT); it is enough to apply the overlaps.
The records of the database are numbered and contain no duplicates; the similarity index is used to adjust the quality.
The similarity indices are calculated from the magnitudes only. Using the FFT makes strong compression possible without distortion, by taking advantage of the shift-invariance of the magnitudes (a global shift appears only in the phases).
In the case of two similar frames, to distinguish simple shifts from other cases (really different phases), secondary similarity indices based on cross-correlation can be used.
Cross-correlation measures the similarity between two signals and therefore gives very precise measurements, which helps to avoid these collisions.
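The two levels of similarity index can be illustrated as follows. This is a sketch under assumptions: the names `magnitudes`, `magnitude_distance` and `best_circular_shift` are hypothetical, the shift is taken as circular so that the magnitudes are exactly invariant, and a naive DFT is used to keep the example self-contained.

```python
import cmath
import math


def magnitudes(frame):
    """Magnitudes of the DFT of a frame (naive transform, for illustration)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n)
    ]


def magnitude_distance(a, b):
    """Primary index: Euclidean distance between the magnitude spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(magnitudes(a), magnitudes(b))))


def best_circular_shift(a, b):
    """Secondary index: (shift, normalized correlation) at the best alignment."""
    n = len(a)

    def corr(s):
        return sum(a[t] * b[(t + s) % n] for t in range(n))

    best = max(range(n), key=corr)
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return best, corr(best) / norm
```

A frame and its circularly shifted copy have a magnitude distance of zero and a normalized correlation of 1 at the right shift. A time-reversed frame also has the same magnitudes, but its best correlation stays below 1, which is the kind of collision the secondary index is meant to detect.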
Before playback begins, one loads or downloads the necessary pre-generated database.
With just an integer to read or to receive, one can obtain a quality close to the original, with very low bitrates.
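Decoding from a database of WAVE records can be sketched as follows: each received integer selects a record, and the records are simply overlapped and added, with no inverse FFT. The name `overlap_add`, the `hop` parameter and the assumption that the stored records are already windowed so that their overlapped sum reconstructs the signal are made for the example only.

```python
def overlap_add(indices, database, hop):
    """Rebuild a signal from record numbers by overlapping stored frames.

    indices  -- record numbers read from the file or received from the network
    database -- list of equal-length sample lists (assumed already windowed)
    hop      -- number of samples between the starts of successive frames
    """
    if not indices:
        return []
    frame_len = len(database[indices[0]])
    out = [0.0] * (hop * (len(indices) - 1) + frame_len)
    for i, idx in enumerate(indices):
        for t, sample in enumerate(database[idx]):
            out[i * hop + t] += sample
    return out
```

With frames of 4 samples and `hop=2`, successive records overlap by 50%, which matches the overlap mentioned earlier for the codec.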
Notes
- The algorithms of the VLB method are under examination in France (INPI).
- New: The examination of the patent applications concerning the VLB algorithms and the versions with Codebook is completed at INPI (France). Both patents will be issued in February or March 2018 at the latest.
Contacts and Comments:
support@whmsoft.com
Our codecs are based on FFT and can be accelerated with GPU support.