Hevc better, faster, stronger
Now that the market is overflowing with UHD TV, and even 4K content is starting to pop up, H.264 is no longer up to the challenge. We need a solution that can compress and distribute ultra HD video with maximum quality through channels that are not yet ready for such loads.
This solution has been called H.265 or HEVC, and was first introduced to the general public at MWC 2012 by Qualcomm. The standard proved to be incredibly efficient even at that time.
How does compression work?
To understand how most compression algorithms work, let’s take the example of packing a suitcase. When it gets too heavy, we take out any unnecessary things. The same is true for video — the fewer repeated and unimportant elements it has, the easier it is to carry.
1. Entropy coding
Any ordered sequence can be grouped like this until there are no more data blocks left that can be compressed further. Or to say it another way, until the sequence of zeros and ones becomes absolutely random. This is why this type of coding has been named Entropy Coding. It should be mentioned, though, that the information itself is not affected. We only transform how it is represented and reduce the redundancy.
2. Frequency decomposition
Anyone who is at least a little bit familiar with Computer Science knows its main principle — data received in the form of zeros and ones can be converted into any other system, whether decimal, hexadecimal or even alphabetic. A 2D image is also a particular type of data that can be converted from one coordinate system to another. For example, we are used to viewing an image in X-Y coordinates (length-width). This type of representation is called a spatial domain. The value of each pixel in it is based on its position. We can convert this image into a frequency domain, where we don’t evaluate the position of pixels, but rather how their value changes relative to the neighboring ones. The higher the contrast of the areas, the higher their coordinate is on freqX-freqY axes. This is how the same image looks when converted from the X-Y system to freqX-freqY.
• Low-frequency components are located nearer to the center of our matrix. They are responsible for homogenous areas with gradual luma and chroma transitions.
• High-frequency ones are located closer to the edges. These include all contours, razor edges, and fine details.
After this transformation, we can simply crop the edges of our matrix, or in other words, mask it. When we convert the image to the usual form, it will lose some details, but in general will remain similar to the original one.
When we select the necessary mask size and form, we can control these losses and the degree to which the final file is compressed. Below is the same car, but now circular masks are applied to it.
3. Chroma subsampling
When an image is transmitted to a TV screen, the RGB color scheme is converted into YCbCr, where Y is the luma component and Cb and Cr are the blue and red components of the color scheme, or chroma components. A human eye can perceive even the smallest fluctuations in brightness, but it isn’t as good at recognizing shades. Therefore, if we transmit luma information in full resolution, and the color component in reduced resolution, no one will notice, and bandwidth is reduced. Coding a signal in Y’CbCr reduces the data volume almost by half.
There are several chroma subsampling methods. Each of them is designated by a numeric code that describes chroma resolution (2nd and 3rd) relative to luma resolution (1st).
4:4:4 (YUV) FORMAT
The colored dot consists of luma (Y’) and chroma (Cr and Cb) components. In this case, there are four components of each color for every four luma components. This is how non-compressed RGB images are usually represented. Theoretically, the 4:4:4 formula can be used in Y’CbCr, but there is no practical need for using this format.
4:2:2 (YUY2) FORMAT
The ratio of luma resolution to chroma resolution is 4:2. This is the traditional broadcasting format used by DigiBeta, DVCpro50, and others.
4:1:1 (YV12) FORMAT
The ratio of chroma component resolution to luma is reduced by a factor of 4. This system is used in NTSC DV and PAL DVCPro.
4:2:0 (YV12) FORMAT
Component resolution depends on whether interlaced or frame scanning is used. It is often used for transmitting Н264 via the Internet, PAL DV, MPEG2, and various software solutions.
4. Motion compensation
In almost any video, each frame is similar to the previous one. They have a common, nearly static background, and only some objects move relative to others. It seems quite natural to want to code only those elements that change, but not the ones that stay the same. This example illustrates how similar all the subsequent frames are.
How does the algorithm work?
Н.265 versus H.264
Why does an Н.265 coded video of maximum quality take up to 40-50 percent less bandwidth than the same H.264 video? What’s more, the technology supports resolutions up to 8K and 10-bit color coding. Such an impressive leap in efficiency has become possible due to three key structural improvements:
1. Clean Random Access.
Decoding a randomly selected frame does not require decoding previous frames. The H.265 format does not require inserting any intermediate frames (I-frames), which reduces the bit rate of a video.
2. Change in maximum block size.
With H.264, the maximum block size is 256 pixels (16х16). But with H.265, it increases 16-fold to 4,096 pixels (64х64), and the algorithm determines the block size automatically.
3. Parallel decoding. The new format benefits from the characteristics of multi-core processors.
Н.265 can calculate different parts of the same frame simultaneously. The processing speed increases by several times.
Where is НEVC already used
HEVC is currently supported on many software and hardware encoders like Nvidia NVENC and Intel QSV. H265 can sometimes be seen on satellite television, IP cameras, and various devices for capturing and coding HDMI (this is especially popular with game streaming when you don’t want to increase the load on your computer).
2. Play Back
You can currently encounter H.265 on IP cameras. Also, there are 30-megabit channels compressed into H.265 on satellites. Little by little, attempts are being made to implement it in various OTT services, where there is device control.
The format is gaining popularity especially quickly on set-top boxes and Smart TVs. The situation with desktop browsers is less promising so far, in fact, only Microsoft Edge is currently able to play H.265. On modern smartphones, H.265 is likely to be played on the processor, meaning your battery will die before you have finished watching even a short video.
Infomir was one of the first companies to embed HEVC technology in its set-top boxes. Due to this, we can already see the advantages of the standard in real-life examples.
Is HEVC going to revolutionize IPTV/OTT services? Probably not. Formats aren’t replaced overnight. Н.264 will remain an active market player for a long time, but will slowly give way to its logical successor. However, we can say with certainty that the future belongs to Н.265. Be ready for it with Infomir!