Mkvmerge

Languages in Matroska and MKVToolNix

Background

For a long time the Matroska file format has only supported track/chapter/tag languages in the form of ISO 639-2 codes (e.g. por for Portuguese) with an optional country code following (e.g. por-BR for Portuguese in Brazil). The whole of MKVToolNix has never had support for those optional country codes, though.

In 2019 the IETF's CELLAR working group (which the Matroska project is a part of) extended the format specifications with three new elements that allow storing a much more descriptive language tag for tracks, chapters and tags. These elements are called 'LanguageIETF' (for track headers), 'ChapLanguageIETF' (for chapter languages) and 'TagLanguageIETF' (for tag languages). These elements must follow the syntax laid out in the IETF's Best Current Practice 47 (referred to as 'BCP 47' in this article), which is also known as 'RFC 5646'.

The advantage of using an existing standard, especially a Best Current Practice, is that we can build on the work of a lot of very smart and knowledgeable people and that the same standard is widely used in other projects, protocols and products.

Examples for BCP 47 language tags

So how might those elements look? Here are a couple of examples:

  • de: the simplest form consists solely of an ISO 639-1 or 639-2 language code (in this case: German)
  • pt-BR: an ISO 3166 country code might be used, too, for specifying the language spoken in a specific region or country (in this case: Portuguese as spoken in Brazil)
  • sr-Cyrl-RS: an ISO 15924 script code might be used as well for specifying that a language is written in a specific script (in this case: Serbian as spoken in Serbia, written in Cyrillic instead of Latin)

There are several more possible additions that describe rarer variants of languages. One can even use custom private extensions that aren't standardized and only have meaning to a select number of people.
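To make the structure of the three example tags concrete, here is a minimal Python sketch that splits a tag into its language, script and region parts. It is an illustration only: the pattern and the function name split_tag are made up for this example, and it deliberately ignores the rarer components (variants, extensions, private use) that BCP 47 also allows.

```python
import re

# Simplified shape check: language, optional script, optional region.
# Real BCP 47 tags can carry more parts (variants, extensions, private use),
# which this sketch does not attempt to handle.
SIMPLE_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def split_tag(tag):
    """Return the language, script and region subtags of a simple tag."""
    match = SIMPLE_TAG.match(tag)
    if match is None:
        raise ValueError(f"not a simple language-script-region tag: {tag!r}")
    # Components that are absent from the tag are reported as None.
    return match.groupdict()


for example in ("de", "pt-BR", "sr-Cyrl-RS"):
    print(example, split_tag(example))
```

The components are told apart purely by their shape: two or three letters for the language, four letters for the script, and two letters or three digits for the region.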

Semantics & interaction

The rules for using those new elements in Matroska are simple: if both a new '…LanguageIETF' element and the corresponding old plain '…Language' element exist at the same level, the new '…LanguageIETF' element must be used. Otherwise whichever of the two elements exists is used.
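The precedence rule can be written down in a few lines. The following Python sketch is not taken from any Matroska library; the function name effective_language and its parameters are hypothetical, and what a reader should do when neither element exists is left out on purpose.

```python
def effective_language(legacy=None, ietf=None):
    """Pick the language a reader should use for one track, chapter or tag.

    legacy: value of the old '...Language' element, or None if absent
    ietf:   value of the new '...LanguageIETF' element, or None if absent
    """
    if ietf is not None:
        # The new element wins whenever it is present.
        return ietf
    # Otherwise fall back to the old element (which may be None as well).
    return legacy


print(effective_language(legacy="por", ietf="pt-BR"))
print(effective_language(legacy="por"))
```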

It is expected that existing programs & devices will take quite some time before they support the new elements. That is nothing MKVToolNix can help with.

Support for BCP 47 language tags

Starting with version 50 MKVToolNix has almost full support for BCP 47 language tags. The BCP 47 language tag parser is lenient in what it accepts, including but not limited to:

  1. It is case-insensitive (e.g. both en and EN are accepted to mean English).
  2. You can specify either an ISO 639-2 or an ISO 639-1 code if both exist for the same language (e.g. both eng and en are accepted).
  3. It accepts both ISO 3166 country codes and numeric UN M.49 country codes (e.g. both UG and 800 mean Uganda).

However, the parser always outputs a normalized version of the language tag as laid out in BCP 47, again including but not limited to:

  1. The language code is the ISO 639-1 code if it exists, otherwise 639-2, and it is lower-case.
  2. The script code, if given, is capitalized.
  3. The country code is always the alphabetical ISO 3166 code, even if a numeric UN M.49 code was given initially, and it is upper-case.
  4. The other components are always lower-case.
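The leniency and normalization rules listed above can be illustrated with a short Python sketch. This is not MKVToolNix's parser: the function name normalize is hypothetical, the two lookup tables are tiny hand-picked subsets of the real code lists, and components are classified by shape alone, which is only adequate for plain language-script-region tags.

```python
# Tiny illustrative lookup tables; the real code lists are far larger.
ISO_639_2_TO_1 = {"eng": "en", "por": "pt", "srp": "sr"}
UN_M49_TO_ISO_3166 = {"800": "UG", "076": "BR"}


def normalize(tag):
    """Return a normalized form of a simple language-script-region tag."""
    parts = tag.split("-")

    # Language: lower-case, and the ISO 639-1 code if the table knows one.
    language = parts[0].lower()
    out = [ISO_639_2_TO_1.get(language, language)]

    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            # Script: first letter upper-case, the rest lower-case.
            out.append(part.capitalize())
        elif len(part) == 2 and part.isalpha():
            # Alphabetical country code: upper-case.
            out.append(part.upper())
        elif len(part) == 3 and part.isdigit():
            # Numeric UN M.49 code: replaced by the alphabetical code if known.
            out.append(UN_M49_TO_ISO_3166.get(part, part))
        else:
            # Every other component: lower-case.
            out.append(part.lower())
    return "-".join(out)


for raw in ("EN", "eng", "en-800", "SR-cyrl-rs"):
    print(raw, "->", normalize(raw))
```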

mkvmerge

All of mkvmerge's options that accept a language accept a BCP 47 language tag.

When identifying a file in JSON mode, existing 'LanguageIETF' track header elements will be output as the language_ietf track property.
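As a rough sketch of how a script might pick up that property, the following Python snippet reads language_ietf from identification data. The sample document is hand-written for this example and not captured from a real mkvmerge run; its layout (a top-level tracks array whose entries carry a properties object) is an assumption about the JSON identification format and should be checked against the actual output. The function name track_languages is hypothetical.

```python
import json

# Hand-written sample shaped like JSON identification output (assumed layout);
# it was not produced by running mkvmerge.
SAMPLE = """
{
  "tracks": [
    {"id": 0, "type": "video", "properties": {"language": "und"}},
    {"id": 1, "type": "audio",
     "properties": {"language": "srp", "language_ietf": "sr-Cyrl-RS"}}
  ]
}
"""


def track_languages(identification):
    """Map each track ID to its language_ietf property (None if absent)."""
    result = {}
    for track in identification.get("tracks", []):
        properties = track.get("properties", {})
        result[track["id"]] = properties.get("language_ietf")
    return result


print(track_languages(json.loads(SAMPLE)))
```

Tracks without a 'LanguageIETF' element simply lack the property, so the sketch reports None for them instead of guessing a value.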

When writing a file mkvmerge will always write the 'LanguageIETF', 'ChapLanguageIETF' and 'TagLanguageIETF' elements (the latter two only if chapters or tags are written in general, of course). In addition to those elements the corresponding old elements will be written; they'll be set to the ISO 639-2 code portion of the BCP 47 language tag. For example, when the track language is set to sr-Cyrl-RS, 'LanguageIETF' will be set to sr-Cyrl-RS and the old 'Language' element will be set to 'srp'.
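The relationship between the two written elements can be sketched as follows. This Python snippet only models the rule described above; the function name elements_to_write is hypothetical, the input is assumed to be an already normalized tag, and the lookup table is a small hand-picked subset of the ISO 639-1 to 639-2 mapping.

```python
# Small illustrative subset of the ISO 639-1 to ISO 639-2 mapping.
ISO_639_1_TO_2 = {"sr": "srp", "zh": "chi", "pt": "por", "en": "eng"}


def elements_to_write(tag):
    """Return the values for the new and the old track language element."""
    # The language code is the first component of the tag.
    language = tag.split("-")[0]
    return {
        "LanguageIETF": tag,
        # Codes without a table entry are passed through unchanged.
        "Language": ISO_639_1_TO_2.get(language, language),
    }


print(elements_to_write("sr-Cyrl-RS"))
```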

When reading existing files (Matroska files, XML chapter or tag files etc.) that already contain those '…LanguageIETF' elements the existing elements will be kept. Otherwise '…LanguageIETF' elements will be added based on command-line options and other existing '…Language' elements.

The creation of the new elements can be disabled completely with the command-line option --disable-language-ietf which operates on all three new elements.

mkvpropedit

For mkvpropedit there's a new track header property named language-ietf that can be set or removed. Changes to this property only apply to the new 'LanguageIETF' track header element.

Changes to the old language track header property will cause mkvpropedit to apply the same change to both the new 'LanguageIETF' element as well as the old 'Language' element, similar to how mkvmerge applies the language to both elements. For example, when using mkvpropedit movie.mkv --edit track:2 --set language=zh-TW the 'LanguageIETF' element will be set to zh-TW and the old 'Language' element to chi.
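The difference between the two properties can be modelled with a small Python sketch. It does not call mkvpropedit; it merely mimics the described effect on a dictionary standing in for a track header. The function name set_track_property and the lookup table are hypothetical and exist only for this illustration.

```python
# Small illustrative subset of the ISO 639-1 to ISO 639-2 mapping.
ISO_639_1_TO_2 = {"zh": "chi", "sr": "srp", "en": "eng"}


def set_track_property(track, name, tag):
    """Return a copy of the track header after setting one language property."""
    updated = dict(track)
    if name == "language-ietf":
        # Only the new element changes.
        updated["LanguageIETF"] = tag
    elif name == "language":
        # Both elements change; the old one gets the ISO 639-2 code.
        code = tag.split("-")[0].lower()
        updated["LanguageIETF"] = tag
        updated["Language"] = ISO_639_1_TO_2.get(code, code)
    else:
        raise ValueError(f"unsupported property: {name!r}")
    return updated


before = {"Language": "eng"}
print(set_track_property(before, "language", "zh-TW"))
print(set_track_property(before, "language-ietf", "zh-TW"))
```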

When reading XML chapter or tag files mkvpropedit works like mkvmerge does (see above).

The creation of the new elements can be disabled completely with the command-line option --disable-language-ietf which operates on all three new elements.

MKVToolNix GUI

Multiplexer & chapter editor

In MKVToolNix's multiplexer and chapter editor all controls taking a single language have been changed to use a language selection dialog. That dialog offers the user the choice between a free-form input or selecting each component of the language with the help of drop-down boxes. Changes to any of the input methods cause the respective other input method to be updated immediately if the resulting input is valid. The validity of the input is shown on the bottom of the dialog including the parser's error message if the input is invalid.

The default editing mode is selecting individual components. The default mode can be changed in the preferences → 'GUI' → 'Default IETF BCP47 language editing mode'.

Header editor

The header editor shows both elements as entries in its tree. The old 'Language' element uses the old language drop-down box just like in earlier versions of MKVToolNix. The 'LanguageIETF' element uses the same language selection dialog described above.

Changes to the 'LanguageIETF' element have no effect on the old 'Language' elements and vice versa, which differs from how mkvpropedit works.

Disabling the '…LanguageIETF' elements

You might find yourself in situations where you have to disable those new elements, e.g. because your hardware device fails to play a Matroska file that contains them. Here's how to do that:

  • For mkvmerge add the command-line option --disable-language-ietf. Not only does it prevent mkvmerge from adding those elements, they'll also be removed if they exist in the source file.
  • For mkvpropedit add the command-line option --disable-language-ietf. It'll prevent mkvpropedit from writing the track header 'LanguageIETF' element when working on the language property and remove the 'ChapLanguageIETF' & 'TagLanguageIETF' elements when working on chapters or tags respectively. For removing existing track header 'LanguageIETF' elements, use --edit track:… --delete language-ietf.
  • For MKVToolNix GUI's multiplexer you can add --disable-language-ietf to the default list of additional command-line options in the preferences → 'Multiplexer' → 'Default values' → 'Default additional command-line options'.
  • For MKVToolNix GUI's chapter editor you currently cannot disable the creation of 'ChapLanguageIETF' elements. Such functionality might be added later.
  • For MKVToolNix GUI's header editor you can simply select each 'Language (IETF BCP 47)' element and check the 'remove element' checkbox if the element is currently present in the file.
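For scripted workflows, the two command-line variants from the list above could be assembled as in the following Python sketch. It only builds and prints argument lists and does not run anything. The helper names, the file names and the track selector track:2 are placeholders chosen for this example; the options themselves are the ones named above, plus mkvmerge's -o for the output file.

```python
import shlex


def mkvmerge_args(output, sources, disable_language_ietf=False):
    """Build an mkvmerge command line, optionally without the IETF elements."""
    args = ["mkvmerge", "-o", output]
    if disable_language_ietf:
        args.append("--disable-language-ietf")
    args.extend(sources)
    return args


def mkvpropedit_delete_args(path, track_selector):
    """Build an mkvpropedit command line that removes 'LanguageIETF'."""
    return ["mkvpropedit", path, "--edit", track_selector,
            "--delete", "language-ietf"]


# Placeholder file names and track selector, shown as shell-quoted commands.
print(shlex.join(mkvmerge_args("out.mkv", ["in.mkv"], disable_language_ietf=True)))
print(shlex.join(mkvpropedit_delete_args("movie.mkv", "track:2")))
```

The returned lists can be handed to subprocess.run as they are, which avoids shell quoting problems with file names that contain spaces.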

The old user interface is not coming back

Compared to the old user interface the new one requires two more mouse clicks to change the language tag (opening the dialog & clicking 'OK'). A small number of users object to this change. While I truly understand that only two clicks can amount to a lot of extra work when handling a large number of files, the old interface is not coming back. Other wishes such as 'simply show the old language combo box with the new edit button' don't have much merit considering what the new interface achieves. Here are the requirements I had before I implemented the new UI:

  1. The user should be able to use the full feature-set that BCP 47 language tags offer.
  2. For users not familiar with BCP 47: the language tag should be easy to construct with a lot of help from the program.
  3. Users intimately familiar with BCP 47 should be able to quickly input a valid BCP 47 language tag without having to hunt through multiple combo boxes.
  4. The displayed language tag should be human-readable (e.g. 'English' instead of 'en').
  5. The displayed language tag should still offer the full information about all of its components.
  6. The new controls should ideally fit into the same space the existing controls were occupying, if at all possible without making the dialog any wider.
  7. After swapping out the old for the new controls the interface should not be much more confusing than it was before.
  8. The number of mouse clicks required for the most-often executed operation (changing solely the language code) should ideally not increase, or if it has to increase, then by as little as possible.
  9. The new controls should be usable solely with the keyboard.
  10. The user interface should not accept invalid language tags.

Of course several of those requirements conflict with each other.

Having the combo box outside the dialog would be much more confusing in the presence of other components of a BCP 47 language tag. It would also pose real problems with respect to handling invalid language tags. Overall that change wouldn't be an improvement due to the number of drawbacks it would come with.

The current UI is the result of finding a compromise that implements as many of the requirements as possible. It's not perfect & I'm definitely willing to improve upon it.