High Efficiency Image File Format (HEIF, ISO/IEC 23008-12) specifies the storage of individual images, image sequences and their metadata into a container file conforming to the ISO Base Media File Format (ISO/IEC 14496-12). HEIF includes the storage specification of HEVC intra images and HEVC image sequences in which inter prediction is applied in a constrained manner. Use cases supported by HEIF include:
As HEVC provides support for various chroma formats and sample fidelities up to lossless coding, the format can serve the whole spectrum of use cases from today's consumer devices storing images typically at 8 bits per sample to high-end professional devices with sample fidelity and dynamic range requirements going all the way up to 16 bits per sample.
Computational photography forms a new category of use cases that can benefit from the HEIF file format. Now a set of related images can be stored in a single file with associated metadata indicating relationships between different pictures.
HEIF specifies a structural format from which codec-specific image formats can be derived. HEIF also includes the specification for encapsulating images and image sequences conforming to the High Efficiency Video Coding standard (HEVC, ISO/IEC 23008-2 | ITU-T Rec. H.265).
In ISOBMFF, a continuous or timed media or metadata stream forms a track, whereas static media or metadata is stored as items. Consequently, HEIF has the following basic design:
A file may contain both image items and image sequence tracks along with other media. For example, it is possible to create a file that includes image items or image sequence tracks conforming to HEIF, together with audio and timed text tracks conforming to any derivative format of the ISOBMFF.
Files conforming to ISOBMFF consist of a sequence of data structures called boxes, each containing a four-character code (4CC) indicating the type of the box, the size of the box in terms of bytes, and the payload of the box. Boxes may be nested, i.e. a box may contain other boxes. ISOBMFF and HEIF specify constraints on the allowed box order and hierarchy.
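The box layout described above can be walked with a few lines of code. The following is a minimal sketch rather than a complete parser. It assumes the usual ISOBMFF header layout (a 32-bit big-endian size followed by the 4CC, where a size of 1 signals that a 64-bit size follows and a size of 0 means the box runs to the end of the enclosing data). The names `iter_boxes` and `make_box`, and the box types 'outr', 'aaaa' and 'bbbb', are hypothetical and used only for illustration.

```python
import struct


def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size field (hypothetical helper for the example)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def iter_boxes(data, start=0, end=None):
    """Yield (four_cc, payload_start, payload_end) for the boxes in data[start:end]."""
    end = len(data) if end is None else end
    offset = start
    while offset + 8 <= end:
        size, four_cc = struct.unpack_from(">I4s", data, offset)
        header_size = 8
        if size == 1:
            # A size of 1 means that a 64-bit size follows the type field.
            if offset + 16 > end:
                raise ValueError("truncated 64-bit box header")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header_size = 16
        elif size == 0:
            # A size of 0 means that the box runs to the end of the enclosing data.
            size = end - offset
        if size < header_size or offset + size > end:
            raise ValueError("box size is inconsistent with the data length")
        yield four_cc.decode("latin-1"), offset + header_size, offset + size
        offset += size


# Nested example: the payload of the outer box is itself a sequence of boxes.
inner = make_box(b"aaaa", b"\x01\x02") + make_box(b"bbbb")
sample = make_box(b"ftyp", b"mif1\x00\x00\x00\x00") + make_box(b"outr", inner)

top_level = list(iter_boxes(sample))
outer_start, outer_end = top_level[1][1], top_level[1][2]
nested = [name for name, _, _ in iter_boxes(sample, outer_start, outer_end)]
```

Nesting is handled by calling `iter_boxes` again on the payload range of a container box, which mirrors the way boxes contain other boxes.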
Files conforming to HEIF start with a FileTypeBox as defined in the ISOBMFF standard, which contains a list of brands the file complies with. As the FileTypeBox is located at the start of the file, it provides easily accessible indications of the file contents to media players. Each brand is identified by its unique four-character code. The specification of a brand can include requirements and constraints for files of the brand and for file players supporting the brand. A brand included in the FileTypeBox permits a player that supports the requirements of the brand to play the file.
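As an illustration of how a player might use the FileTypeBox, the sketch below reads the brand list and checks it against the brands the player supports. It assumes the ISOBMFF layout of the box payload (a major brand, a 32-bit minor version, then a list of compatible brands, each brand being four bytes) and handles only the common 32-bit box size. The function names are hypothetical.

```python
import struct


def parse_file_type_box(data):
    """Return (major_brand, minor_version, compatible_brands) from a leading 'ftyp' box."""
    if len(data) < 16:
        raise ValueError("too short to hold a FileTypeBox")
    size, four_cc = struct.unpack_from(">I4s", data, 0)
    if four_cc != b"ftyp":
        raise ValueError("data does not start with a FileTypeBox")
    if size < 16 or size > len(data) or (size - 16) % 4:
        raise ValueError("unexpected FileTypeBox size")
    major_brand = data[8:12].decode("ascii")
    (minor_version,) = struct.unpack_from(">I", data, 12)
    # The rest of the box is a list of four-byte compatible brands.
    compatible = [data[i:i + 4].decode("ascii") for i in range(16, size, 4)]
    return major_brand, minor_version, compatible


def player_can_open(data, supported_brands):
    """True if any brand listed in the file is supported by the player."""
    major_brand, _, compatible = parse_file_type_box(data)
    return any(brand in supported_brands for brand in [major_brand, *compatible])


# Example FileTypeBox: major brand 'heic', compatible brands 'mif1' and 'heic'.
payload = b"heic" + struct.pack(">I", 0) + b"mif1" + b"heic"
sample = struct.pack(">I4s", 8 + len(payload), b"ftyp") + payload
```

With this sample, a player that supports only the structural brand 'mif1' would still accept the file, while a player that supports only 'msf1' would not.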
The brands specified in the HEIF standard are presented in Table I. The HEIF standard specifies the 'mif1' and 'msf1' structural brands. Additionally, HEVC-specific brands are specified as listed in Table I. The dedicated brand names 'heic' and 'hevc' indicate that the HEVC Main profile is used.
Table I. Brands, MIME subtypes, and file extensions for HEIF.
|Brand||Coding format||Image or sequence?||MIME Type||MIME subtype||File extension|
|heic||HEVC (Main or Main Still Picture profile)||image||image||heic||.heic|
|heix||HEVC (Main 10 or format range extensions profile)||image||image||heic||.heic|
|hevc||HEVC (Main or Main Still Picture profile)||sequence||image||heic-sequence||.heics|
|hevx||HEVC (Main 10 or format range extensions profile)||sequence||image||heic-sequence||.heics|
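The brand-to-MIME mapping in Table I can be expressed as a simple lookup. The sketch below copies only the brand, content kind and MIME subtype columns from the table. The function name and the "first known brand wins" rule are assumptions made for the example, not behaviour defined by the standard.

```python
# Brand -> (content kind, MIME subtype), copied from Table I.
HEVC_BRANDS = {
    "heic": ("image", "heic"),
    "heix": ("image", "heic"),
    "hevc": ("sequence", "heic-sequence"),
    "hevx": ("sequence", "heic-sequence"),
}


def mime_type_for(brands):
    """Return the MIME type of the first brand found in Table I, or None."""
    for brand in brands:
        if brand in HEVC_BRANDS:
            _, subtype = HEVC_BRANDS[brand]
            # The media type is always 'image' for HEIF files.
            return f"image/{subtype}"
    return None


still_image = mime_type_for(["mif1", "heic"])
burst = mime_type_for(["msf1", "hevc"])
unknown = mime_type_for(["mif1"])
```

For a multi-purpose file that lists both an image brand and a sequence brand, the order of the input list stands in for the "primary use" of the file.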
Internet media types, also known as MIME (Multipurpose Internet Mail Extensions) types, are used by various applications to identify the type of a resource or a file. MIME types consist of a media type ('image' in the case of HEIF files), a subtype, and zero or more optional parameters. For multi-purpose files, the selection of the subtype can be made on the basis of the primary use of the file.
An optional codecs MIME parameter can be present to indicate the used coding formats of the tracks and items present in the file. The codecs MIME parameter also includes the profile-tier-level value to which an HEVC-coded image item or an image sequence track conforms. More information about the MIME type registration and optional parameter can be found in Annex D and Annex E of the HEIF standard.
Table II provides a comparison of the features of HEIF with those of other selected image formats. It can be observed that HEIF is more extensible and comprehensive than the other compared file formats. In particular, the possibility to include other media types, the advanced multi-picture features, and the support for non-destructive editing make HEIF more advanced than the other formats. The rich set of features makes HEIF suitable for a broad range of devices and applications, including, for example, burst photography.
The references used to compile the information in Table II are listed in Table III.
Table II. Comparison of the features of some image file formats.
|Feature||.heic||JPEG/Exif||PNG||GIF (89a)||WebP||JPEG-XR / TIFF||JPEG-XR / JPX||BPG|
|Formats and extensibility|
|Base container file format||ISOBMFF||TIFF||-||-||RIFF||TIFF||-⁴||-|
|Lossy compression||Yes (HEVC)||Yes (JPEG)||No||No||Yes (VP8)||Yes||Yes||Yes (HEVC¹⁰)|
|Lossless compression||Yes (HEVC)||Yes (TIFF Rev 6.0)||Yes (PNG)¹||Yes (GIF)¹||Yes (VP8L)||Yes||Yes||Yes (HEVC¹⁰)|
|Extensible to other coding formats||Yes||Yes⁸||No||No||No||Yes⁸||Yes⁵||No|
|Metadata format (on top of internal)||Exif, XMP, MPEG-7||Exif||-||-||Exif, XMP||Exif, XMP||JPX, (XMP)⁶||Exif, XMP|
|Extensible to other metadata formats||Yes||No||No||No||No||No||Yes (XML-based)||Yes|
|Other media types (audio, text, etc.)||Yes||Audio²||No||No||No||No||Yes⁷||No|
|Multiple images in the same file||Yes||No¹¹||No||Yes³||Yes³||No||Yes||Yes⁹|
|Image sequences / animations||Yes||No||No||Yes||Yes||No||Yes||Yes|
|Extensible to other editing operations||Yes||No||No||No||No||No||No||No|
|Auxiliary picture information|
|Transparency (alpha plane)||Yes||No||Yes||No¹²||Yes||Yes||Yes||Yes|
¹ In GIF and indexed-color PNG encoding, lossy color quantization is applied, while the color-quantized image is losslessly compressed.
² PCM, µ-Law PCM and ADPCM encapsulated in RIFF WAV.
³ Only for animations and tiling/overlaying.
⁴ JPX is a box-structured format compatible with ISOBMFF. However, only the File Type box is common to JPX and ISOBMFF.
⁵ Encapsulations of JPEG-2000 and JPEG-XR have been specified for the JPX container. Mappings for other codecs could be similarly specified.
⁶ JPX (ITU-T T.800 and T.801) specifies its own metadata schema, but is capable of carrying XML-formatted metadata, such as XMP.
⁷ JPX can contain media complying with ISOBMFF (or derivatives thereof). No accurate synchronization between JPX animations and other media.
⁸ TIFF as a container format facilitates extensions to other coding formats.
⁹ Only for animations, thumbnails, and alpha planes. Non-timed image collections are not supported.
¹⁰ HEVC Main 4:4:4 16 Still Picture profile, Level 8.5, with additional constraints.
¹¹ Can be enabled through the MP extension.
¹² A palette index for full transparency can be specified.
It is acknowledged that a summary such as that in Table II might be somewhat incomplete when it comes to features of different formats. For example, the table does not cover some of the extensions of JPEG. We welcome feedback and corrections to the table.
Table III. References for the compared image file formats
|Image format||Version or date||Reference and/or URL|
ISO/IEC 10918-1 | ITU-T Rec. T.81
ISO/IEC 29199-2 | ITU-T Rec. T.832
ISO/IEC 15444-2 | ITU-T Rec. T.801 (for JPX)
The size of a container file is directly affected by the compression performance of the image/video codec being used. A comparison of compression performance can be found in Annex B of this document.
Table IV illustrates the coding efficiency of HEVC intra coding with respect to well-known still picture codecs. The results indicate that JPEG would require on average a 139 % higher bitrate than HEVC (i.e. 2.39 times the file size) in order to achieve the same objective picture quality. For JPEG-XR and JPEG-2000 the average bitrate increases are 66 % and 44 %, respectively.
Table IV. HEVC intra coding performance with respect to legacy formats. The bitrate increase required to achieve the objective quality provided by HEVC intra coding is reported for each test category.
|Class||Resolution||Characteristics||JPEG||JPEG XR||JPEG 2000|
|Class A||2560x1600||Cropped 4Kx2K sequences for Ultra HDTV services||87 %||44 %||48 %|
|Class B||1920x1080||High resolution sequences for streaming and broadcast services||124 %||62 %||15 %|
|Class C||832x480||Medium resolution sequences for Internet/mobile video services||122 %||53 %||50 %|
|Class D||416x240||Low resolution sequences for services to resource constrained devices||110 %||47 %||43 %|
|Class E||1280x720||720p sequences for video conferencing applications||170 %||73 %||23 %|
|Class F||1024x768, 1280x720||Computer screen content and computer generated content||223 %||118 %||87 %|
|Average||139 %||66 %||44 %|
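The averages in Table IV and the "2.39 times the file size" figure can be reproduced with a short calculation. The sketch below assumes that the reported averages are unweighted means over the six test classes, which matches the published values. The derived "HEVC saving" column is not in the table and is shown only as an alternative way to read the same numbers.

```python
# Bitrate increase (%) over HEVC intra for Classes A to F, copied from Table IV.
increase_percent = {
    "JPEG": [87, 124, 122, 110, 170, 223],
    "JPEG XR": [44, 62, 53, 47, 73, 118],
    "JPEG 2000": [48, 15, 50, 43, 23, 87],
}


def summarize(values):
    """Return (mean increase %, legacy/HEVC size ratio, HEVC saving %)."""
    mean_increase = sum(values) / len(values)
    # An increase of p % means the legacy file is (1 + p/100) times as large.
    size_ratio = 1 + mean_increase / 100
    # Seen from the other side, HEVC needs 1/size_ratio of the legacy bitrate.
    hevc_saving = 100 * (1 - 1 / size_ratio)
    return round(mean_increase), round(size_ratio, 2), round(hevc_saving)


summary = {codec: summarize(values) for codec, values in increase_percent.items()}
```

Read this way, a 139 % bitrate increase for JPEG corresponds to HEVC needing roughly 58 % fewer bits for the same objective quality.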
Table V indicates that, for natural content, restricted inter coding can typically be expected to provide two to three times better compression than intra picture coding. In special cases such as animations, where the majority of the scene is static, the compression efficiency can significantly exceed those levels and be tens of times more efficient than intra coding.
Table V. Coding efficiency improvements provided by low latency predictive coding of the HEVC Image File Format. Bitrate impact and coding gain are reported with respect to HEVC intra coding.
|Content||Type||Frames||Bitrate change||Coding gain|
|Class A||Image burst||8||-46 %||1.9|
|Class B||Image burst||8||-51 %||2.0|
|Class C||Image burst||8||-60 %||2.5|
|Class D||Image burst||8||-63 %||2.7|
|Class E||Image burst||8||-79 %||4.8|
|Class F||Image burst||8||-55 %||2.2|
|Memorial||Exposure stack||16||-29 %||1.4|
|Mersu||Focal stack||13||-25 %||1.3|
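The two numeric columns of Table V are related: the coding gain is consistent with the ratio of the intra-coded bitrate to the predictively coded bitrate, i.e. 1 / (1 + change), where the change is the (negative) bitrate change expressed as a fraction. The sketch below checks this relation against the table rows. The formula is inferred from the published values rather than quoted from the standard.

```python
def coding_gain(bitrate_change_percent):
    """Intra bitrate divided by predictive bitrate for a given bitrate change (%)."""
    return 1 / (1 + bitrate_change_percent / 100)


# Bitrate change (%) per content type, copied from Table V.
bitrate_change = {
    "Class A": -46, "Class B": -51, "Class C": -60, "Class D": -63,
    "Class E": -79, "Class F": -55, "Memorial": -29, "Mersu": -25,
}

gains = {name: round(coding_gain(change), 1) for name, change in bitrate_change.items()}
```

For example, a bitrate change of -79 % leaves 21 % of the intra bitrate, which is a gain of about 4.8.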
Multiple images can be stored in a HEIF file. It can be useful to differentiate between them by assigning them certain roles. The roles specified in HEIF are listed and described in Table VI. Note that a single image can be associated with more than one role.
Table VI. Roles of images.
|cover image||A representative image of the image items and image sequence tracks of the file. The cover image should be displayed when no other information is available on the preference to display the image items of the file. The file can have only one cover image.|
|thumbnail image||A smaller-resolution representation of a master image.|
|auxiliary image||An image that complements a master image. For example, an alpha plane or a depth map. Can assist in displaying the master image but is not typically displayed as such.|
|master image||An image that is not a thumbnail image or an auxiliary image. Typically represents a full-resolution displayable image.|
|hidden image||An image that should never be displayed. Can be present in the file for example as an input image for a derived image.|
|pre-derived coded image||A coded image that has been derived from other images. For example, a high dynamic range image derived from an exposure-bracketed set of images.|
|coded image||A coded representation of an image.|
|derived image||An image that is represented in a file by an indicated operation to indicated input images and can be obtained by performing the indicated operation to the indicated input images.|
HEIF allows image properties that are shared among different image items to be stored in a compact way. These properties are stored in the ItemPropertyContainerBox. There are two main types of properties: descriptive and transformative. Descriptive properties provide information about the image item without modifying the image itself. Transformative properties describe a transformation that needs to be applied to the image item. The order in which these properties are applied to the image items is defined in the standard. Table VII lists the currently defined properties. In addition to descriptive image properties, image items can optionally be characterized with metadata items, the format of which follows Exif, XMP, or MPEG-7 metadata.
Table VII. Image Properties
|Decoder configuration and initialization||Descriptive Property||The information needed to initialize the decoder. The structure of this information is usually defined in the related image coding format specification.|
|Image spatial Extents (‘ispe’)||Descriptive Property||Indicates the width and height of the associated image item.|
|Pixel Aspect Ratio (‘pasp’)||Descriptive Property||Has the same syntax as the PixelAspectRatioBox as defined in ISO/IEC 14496-12.|
|Color Information (‘colr’)||Descriptive Property||Has the same syntax as the ColourInformationBox as defined in ISO/IEC 14496-12.|
|Pixel Information (‘pixi’)||Descriptive Property||Indicates the number and bit depth of colour components in the reconstructed image of the associated image item.|
|Relative Location (‘rloc’)||Descriptive Property||Indicates the horizontal and vertical offset of the reconstructed image item relative to the associated image item.|
|Image Properties for Auxiliary Images (‘auxC’)||Descriptive Property||Auxiliary images must be associated with an AuxiliaryTypeProperty.|
|Content light level (‘clli’)||Descriptive Property||Has the same syntax as the ContentLightLevelBox as defined in ISO/IEC 14496-12.|
|Mastering display colour volume (‘mdcv’)||Descriptive Property||Has the same syntax as the MasteringDisplayColourVolumeBox as defined in ISO/IEC 14496-12.|
|Content colour volume (‘cclv’)||Descriptive Property||Has the same syntax as the ContentColourVolumeBox as defined in ISO/IEC 14496-12.|
|Required reference types (‘rrtp’)||Descriptive Property||Lists the item reference types that a reader shall process to display the associated image item.|
|Creation time information (‘crtt’)||Descriptive Property||Creation time of the associated item or entity group.|
|Modification time information (‘mdft’)||Descriptive Property||Modification time of the associated item or entity group.|
|User description (‘udes’)||Descriptive Property||User-defined name, description and tags.|
|Accessibility text (‘altt’)||Descriptive Property||Alternate text for an image, in case the image cannot be displayed.|
|Auto exposure information (‘aebr’)||Descriptive Property||Exposure variation information in bracketed sets.|
|White balance information (‘wbbr’)||Descriptive Property||White balance compensation information in bracketed sets.|
|Focus information (‘fobr’)||Descriptive Property||Focus variation information in bracketed sets.|
|Flash exposure information (‘afbr’)||Descriptive Property||Flash exposure variation information in bracketed sets.|
|Depth of field information (‘dobr’)||Descriptive Property||Depth of field variation information in bracketed sets.|
|Panorama information (‘pano’)||Descriptive Property||Characteristics about the associated panorama entity group.|
|Image Scaling (‘iscl’)||Transformative||Resize input image to target width and height.|
|Image Rotation (‘irot’)||Transformative||Rotation by 90, 180, or 270 degrees.|
|Clean Aperture (‘clap’)||Transformative||Cropping according to a given cropping rectangle.|
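Because transformative properties change the image, the order in which they are applied affects the result. The sketch below illustrates this with a tiny image held as a list of rows. It makes several simplifying assumptions that should be checked against the standard: properties are applied in the order in which they are listed for the item, the rotation is counter-clockwise, and the crop is given as a plain (x, y, width, height) rectangle rather than the actual 'clap' syntax. All function names are hypothetical.

```python
def crop(image, x, y, width, height):
    """Keep the width x height rectangle whose top-left corner is (x, y)."""
    return [row[x:x + width] for row in image[y:y + height]]


def rotate(image, quarter_turns):
    """Rotate by quarter_turns * 90 degrees (counter-clockwise is assumed here)."""
    for _ in range(quarter_turns % 4):
        # Transposing and then reversing the row order is one 90-degree CCW turn.
        image = [list(column) for column in zip(*image)][::-1]
    return image


# Simplified stand-ins for two of the transformative properties in Table VII.
OPERATIONS = {"clap": crop, "irot": rotate}


def apply_transformative_properties(image, properties):
    """Apply (four_cc, arguments) pairs to a reconstructed image, in list order."""
    for four_cc, arguments in properties:
        image = OPERATIONS[four_cc](image, *arguments)
    return image


reconstructed = [[1, 2, 3],
                 [4, 5, 6]]
crop_then_rotate = apply_transformative_properties(
    reconstructed, [("clap", (0, 0, 2, 2)), ("irot", (1,))])
rotate_then_crop = apply_transformative_properties(
    reconstructed, [("irot", (1,)), ("clap", (0, 0, 2, 2))])
```

The two results differ, which is why a well-defined application order matters for interoperable output.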
Derived images enable non-destructive image editing, where the original coded images are kept in the file, while new images, called derived images, can be introduced by specifying a transformation operation that is applied to one or more input images. HEIF specifies the generic structures used for storing derived images as items as well as a few specific types of derived images. Derived images can also have descriptive or transformative image properties. Item references of type 'dimg' specify the input image(s) of the derived image. The input images can be coded images or derived images. The derived image types specified in the HEIF standard are listed in Table VIII. Other types may be specified in other documents or later versions of the HEIF standard.
Table VIII. Derived Images
|Identity transformation (‘iden’)||Cropping and/or rotation by 90, 180, or 270 degrees, imposed through the respective transformative properties.|
|Image Overlay (‘iovl’)||Overlaying any number of input images in indicated order and locations onto the canvas of the output image.|
|Image Grid (‘grid’)||Reconstructing a grid of input images of the same width and height.|
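As an illustration of the image grid derivation, the sketch below stitches equally sized input images into one output image. It assumes that the input images are listed in row-major order (left to right, then top to bottom) and ignores any final cropping of the canvas. The function name and the list-of-rows image representation are hypothetical.

```python
def assemble_grid(tiles, rows, columns):
    """Stitch rows * columns equally sized tiles (lists of pixel rows) into one image."""
    if len(tiles) != rows * columns:
        raise ValueError("number of tiles does not match the grid dimensions")
    tile_height = len(tiles[0])
    tile_width = len(tiles[0][0])
    for tile in tiles:
        if len(tile) != tile_height or any(len(line) != tile_width for line in tile):
            raise ValueError("all input images must have the same width and height")
    canvas = []
    for grid_row in range(rows):
        for y in range(tile_height):
            line = []
            for grid_column in range(columns):
                # Row-major tile order is assumed for this sketch.
                line.extend(tiles[grid_row * columns + grid_column][y])
            canvas.append(line)
    return canvas


# Four 2x1 tiles arranged as a 2x2 grid.
tiles = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
grid_image = assemble_grid(tiles, rows=2, columns=2)
```

The original tiles stay untouched in this sketch, in the same way that the coded input images remain in the file when a derived image is added.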
Predictively coded image items have several benefits: any image from an image sequence can be selected as the cover image, compression efficiency is higher, and re-encoding is avoided when converting an image sequence to image items. The presence of predictively coded image items, which have one or more other coded image items as decoding dependencies, is signalled with the 'pred' brand. Decoding dependencies and order for predictively coded image items are specified by item references of type 'pred'.
HEIF files allow the storage of metadata related to images and image sequences. Such metadata can be information related to integrity checks, Exif or XMP data, or MPEG-7 metadata. For image items, such metadata can be stored as a metadata item that references the related image items with a ‘cdsc’ item reference. For image sequences, timed metadata tracks can refer to an image sequence track with a ‘cdsc’ track reference.
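To display a predictively coded image item, a reader first has to decode the items it depends on. The sketch below resolves such a decoding order with a depth-first traversal. It assumes a hypothetical in-memory form of the 'pred' item references (a dictionary from an item ID to the list of item IDs it references) and is not a parser for the actual reference box.

```python
def decoding_order(target, pred_references):
    """Return the item IDs to decode, dependencies first, ending with target."""
    order, in_progress, done = [], set(), set()

    def visit(item_id):
        if item_id in done:
            return
        if item_id in in_progress:
            raise ValueError("cyclic decoding dependency")
        in_progress.add(item_id)
        # Dependencies are visited in the order in which they are listed.
        for dependency in pred_references.get(item_id, []):
            visit(dependency)
        in_progress.discard(item_id)
        done.add(item_id)
        order.append(item_id)

    visit(target)
    return order


# Item 1 is intra coded; item 2 depends on 1; item 3 depends on 1 and 2.
pred_references = {2: [1], 3: [1, 2]}
order_for_item_3 = decoding_order(3, pred_references)
```

Each dependency is decoded only once, even when several items reference it.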
HEIF files can contain several image items representing the same image content (i.e. alternative representations). A media player may select one of the alternative representations of the same image content for displaying. A mechanism, known as entity grouping (defined in the GroupsListBox), is used for indicating alternate groups, which can contain both image items and media tracks (image sequences). The capability to define alternate groups between tracks and image items provides a unique mechanism for display initialization based on preferences of the media player and the content creator. Examples where alternate groups can be useful are:
Images that have a particular correlation (e.g. image bursts or animations like cinemagraphs) can be efficiently stored in HEIF files, thanks to the inherited media track features of ISOBMFF. Such images are called image sequences and they can reside in the same file with image items. Table IX lists the roles of the image sequence tracks which are currently defined.
Table IX. Roles of image sequence tracks.
|thumbnail image sequence||A smaller-resolution representation of a master image sequence.|
|auxiliary image sequence||An image sequence that complements a master image sequence. For example, a sequence of alpha plane or depth map images. Can assist in displaying the master image sequence but is not typically displayed as such.|
|master image sequence||An image sequence that is not a thumbnail image sequence or an auxiliary image sequence. Typically contains full-resolution displayable images.|
HEIF enables the use of inter-picture prediction for compact storage of image sequences. Moreover, image sequence use cases may require faster access to individual images and the ability to edit individual images without affecting any other images. HEIF therefore includes the following two features:
Figure 1 illustrates how a file player processes the coded images and the derived images included in a file. The file player decodes a coded image into a reconstructed image. Similarly, the file player applies the operation of the derived image to the indicated one or more input images to obtain the respective reconstructed image. The descriptive image properties generally describe the reconstructed image, with the exception of the decoder configuration and initialization information, which is associated with the coded image. The transformative image properties, if any, are applied to the reconstructed image to obtain an output image. The output image can be displayed, when the coded image or the derived image is not a hidden image. The output image can also act as an input image to derived images.
The most important features that enable controlling the playback of an HEIF file are listed in Table X. Some of these features were introduced in the ISOBMFF or ISO/IEC 14496-15 and are explicitly inherited by HEIF, while other features were specifically designed for the HEIF standard.
Table X. Features controlling image sequence playback.
|Feature||First appeared in||Description|
|non-displayable sample||ISO/IEC 14496-15||Is never displayed, but can be used as a reference for predicting other images in the track.|
|timed vs. non-timed playback||HEIF||In timed playback, the image sequence is played as video, whereas in non-timed playback the samples of the track are displayed by other means, such as an image gallery. Non-timed playback may be indicated e.g. when a track is used for achieving a better compression efficiency for an exposure stack.|
|edit list||ISOBMFF||A list of ranges of the image sequence track in their playback order. Enables modifying the playback order and pace of samples.|
|looping||HEIF||HEIF allows indicating edit list repetition e.g. for looping animations. The repetition can be indicated to last for a certain duration or be infinite.|
|cropping and rotation||ISOBMFF||Rectangular cropping and rotation by 90, 180, 270 degrees can be specified.|
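The interaction of the edit list and looping features can be sketched as follows. This is a deliberately simplified model: ranges are given as hypothetical (first sample, sample count) pairs instead of the media times and durations used by the actual edit list, and the repetition is expressed as a count or as infinite rather than as a duration.

```python
from itertools import cycle, islice


def playback_order(edit_list, repetitions=1):
    """Return an iterator over sample indices in playback order.

    edit_list holds (first_sample, sample_count) ranges in playback order.
    repetitions=None repeats the edit list forever (a looping animation).
    """
    one_pass = [
        sample
        for first_sample, sample_count in edit_list
        for sample in range(first_sample, first_sample + sample_count)
    ]
    if repetitions is None:
        return cycle(one_pass)
    return iter(one_pass * repetitions)


# Play samples 2 and 3 first, then sample 0; sample 1 is skipped.
reordered = list(playback_order([(2, 2), (0, 1)]))

# An infinitely looping two-sample animation, cut off after five samples.
looped = list(islice(playback_order([(0, 2)], repetitions=None), 5))
```

Because the edit list only describes playback, the stored samples themselves are not reordered or duplicated.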
The HEIF standard includes the specification for encapsulating HEVC-coded images and image sequences into HEIF-compliant files. The specification includes the following aspects:
The Multi-Image Application Format (MIAF) specification defines additional constraints and interoperability points to ensure higher interoperability, while fully conforming to the HEIF format. This is done by defining specific constraints, limiting the supported encoding types to a set of specific profiles and levels (see Table XI), requiring specific metadata formats, and defining a set of brands for signalling such constraints (see Table XII). This enables the industry to deploy particular uses of the HEIF specification.
Table XI. MIAF profiles.
|MIAF profile||Allowed image item coding profiles and highest levels|
|MIAF HEVC Basic profile||
|MIAF HEVC Advanced profile||
|MIAF HEVC Extended profile||
|MIAF AVC Basic profile||
Table XII. MIAF brands.
|MIAF Application brand||Features|
|Progressive application brand||Data is ordered in a way that allows the perceived quality to improve progressively while the file is loading.|
|Animation application brand||An animation application branded video file includes one video track, and may also include an associated alpha plane sequence and an audio track.|
|Burst capture application brand||Multi-image capturing, for example focal and exposure stacks.|
|Common media fragmented brand||Used when compatibility with CMAF is desired ('cmfc' brand in ISO/IEC 23000-19).|
|Fragmented alpha video brand||Files that are CMAF compatible, and contain an alpha plane sequence associated with the video sequence.|
The MIAF specification also defines normative requirements for MIAF readers and renderers. Entities participating in MIAF file playback are briefly described in Table XIII.
Table XIII. MIAF architecture.
|MIAF reader||Reads and parses MIAF files, and identifies the type of image coding and metadata. Handles decoding of bitstreams for the coding (types, profiles, levels) that are supported.|
|MIAF renderer||Renders the output of MIAF reader into a visual context, taking associated metadata (e.g. colour information) into account, and possible auxiliary image data (e.g. depth or alpha planes).|
|MIAF player||Uses a MIAF reader and a MIAF renderer to present the contents of a MIAF file.|
The MIAF specification can be found on the ISO website.