Taking into Account a So-Called Intermediate Space
In 2D, the k-space is obtained from the pixel space as follows:
- Step 1: FFTs (Fast Fourier Transforms) are performed on each line (or on each column).
- Step 2: FFTs are performed on each column obtained (or on each line obtained).
- Step 3: Some minor operations are performed.
In this document, the intermediate space denotes the space obtained after step 1 alone, that is, after FFTs on each line (or on each column).
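Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the codecs' implementation: the names `dft`, `to_intermediate` and `intermediate_to_kspace` are hypothetical, a naive DFT from the standard library stands in for a real FFT, and step 3 is omitted because its operations are not detailed here.

```python
import cmath

def dft(seq, inverse=False):
    """Naive O(n^2) DFT of a 1D sequence (stand-in for a real FFT)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def to_intermediate(image):
    """Step 1: transform every line independently (pixel space -> intermediate space)."""
    return [dft(row) for row in image]

def intermediate_to_kspace(intermediate):
    """Step 2: transform every column of the intermediate space."""
    columns = [dft(list(col)) for col in zip(*intermediate)]
    return [list(row) for row in zip(*columns)]  # transpose back to lines

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
intermediate = to_intermediate(image)          # same shape as the image
kspace = intermediate_to_kspace(intermediate)  # same shape as the image
```

In practice one would call an FFT library instead of `dft`; the point is only that the intermediate space is the result of stopping after the first of the two passes.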
In the k-space, the magnitudes are invariant to pure translations. Pure rotations of the image rotate the k-space around its center. When the image is enlarged (or shrunk), the k-space shrinks (or is enlarged).
The k-space is ideal for computer vision or artificial intelligence, while the human eye is sensitive to the pixel space.
The intermediate space is thus a space between the pixel space (human vision) and the k-space (computer vision).
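The invariance of the magnitudes to translations can be checked numerically. The sketch below assumes a circular (wrap-around) translation, for which the invariance is exact with a discrete transform; all function names are hypothetical and a naive DFT replaces the FFT.

```python
import cmath

def dft(seq):
    """Naive 1D DFT (stand-in for an FFT)."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def dft2(image):
    """2D transform: lines first, then columns."""
    rows = [dft(row) for row in image]
    cols = [dft(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def circular_shift(image, dy, dx):
    """Translate the image by (dy, dx) with wrap-around at the borders."""
    h, w = len(image), len(image[0])
    return [[image[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]

def magnitudes(kspace):
    return [[abs(v) for v in row] for row in kspace]

image = [
    [0, 0, 0, 0],
    [0, 5, 2, 0],
    [0, 1, 3, 0],
    [0, 0, 0, 0],
]
shifted = circular_shift(image, 1, 1)

mag_a = magnitudes(dft2(image))
mag_b = magnitudes(dft2(shifted))
# Largest difference between the two magnitude maps (rounding error only).
max_diff = max(abs(a - b)
               for ra, rb in zip(mag_a, mag_b)
               for a, b in zip(ra, rb))
```

The translation changes only the phases of the k-space, which is why the two magnitude maps agree up to rounding error.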
Advantages of the Intermediate Space
There are several advantages to using the VLC and VLR codecs in the intermediate space to compress images, sequences of images or videos.
1) Parallel processing is not only preserved but strengthened. The FFT lends itself to parallel processing. In addition, the lines are processed independently in space and in time.
2) The processing speed is very high because steps 2 and 3 are not performed.
3) When data is stored or transmitted in the intermediate space, it suffices to perform inverse FFTs (iFFTs) to obtain the pixel space, or FFTs to obtain the k-space.
4) When data is stored or transmitted in the intermediate space and only a few lines are needed, it suffices to decompress only those lines.
5) To handle color, one can consider 3 planes R, G and B, and apply the algorithms on each plane.
6) To handle color, one can also consider 3 planes Y, U and V (or Y, Cb and Cr).
In this case we apply the full algorithms (foreground and background) on the Y plane (luminance).
For the chrominance (which determines the color: U and V planes, or Cb and Cr planes), we can settle for a minimum of background bands, or even delete the background.
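Advantages 3 and 4 can be illustrated with a small sketch in which only the requested lines are brought back to the pixel space. It assumes the intermediate space is held uncompressed in memory, so the VLC and VLR coding itself is not shown, and the names `dft`, `to_intermediate` and `recover_lines` are hypothetical.

```python
import cmath

def dft(seq, inverse=False):
    """Naive 1D DFT or inverse DFT (stand-in for FFT / iFFT)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def to_intermediate(image):
    """One forward transform per line."""
    return [dft(row) for row in image]

def recover_lines(intermediate, wanted):
    """Inverse-transform only the requested lines; the others are never touched."""
    return {r: [round(v.real) for v in dft(intermediate[r], inverse=True)]
            for r in wanted}

image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
intermediate = to_intermediate(image)
subset = recover_lines(intermediate, [1, 3])  # only lines 1 and 3 are rebuilt
```

Because each line of the intermediate space depends only on the matching line of the image, recovering a subset costs one inverse transform per requested line.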
For more information on the VLC and VLR codecs, see the following pages:
- Algorithms
- Home Page
Notes
- The intermediate space is very compressible.
The phases are not as negligible as in audio. The phases of the foreground must be adequately taken into account.
- One can use the intermediate space with more than two dimensions (3D, 4D or 5D for example). In this case, the FFTs are performed along only one dimension.
- We can choose to perform the FFTs along the largest dimension in order to minimize the number of FFTs.
- We can consider decomposing vectors of the form (x, y, z, alpha, phi) into (x, alpha, phi), (y, alpha, phi) and (z, alpha, phi).
- The VLC and VLR codecs can use any number of bits per pixel, for example 16 bits or more. If a large number of bits per pixel is used in compression, the bit depth must be reduced to a suitable value at decompression and display time, for example 8 bits in SDR, 10 bits in HDR10 or 12 bits in 12-bit HDR.
- By taking the phases into account correctly, the VLC and VLR codecs can also compress k-spaces of dimension two or more (3D, 4D or 5D for example).
- For image recognition (computer vision), one generally neglects the color and considers only the edges, lines and curves.
- If high compression rates are wanted, it is better to use the k-space, since the most useful frequencies lie towards the center. In addition, the symmetry properties around the horizontal and vertical axes can be used.
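The note on choosing the largest dimension can be made concrete with a small count. The sketch below is only an arithmetic illustration: the helper names `fft_count` and `best_axis` and the example shape are assumptions, not part of the codecs.

```python
from math import prod

def fft_count(shape, axis):
    """Number of 1D FFTs needed when transforming only along `axis`."""
    return prod(shape) // shape[axis]

def best_axis(shape):
    """Axis that minimizes the number of 1D FFTs (the largest dimension)."""
    return min(range(len(shape)), key=lambda a: fft_count(shape, a))

shape = (64, 512, 128)  # hypothetical 3D volume
counts = [fft_count(shape, a) for a in range(len(shape))]
# counts == [65536, 8192, 32768]; best_axis(shape) == 1
```

Transforming along the axis of length 512 requires 8192 transforms, against 65536 along the axis of length 64; each transform is longer, but there are fewer of them to schedule.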