mkv2sup - Auto export and determine forced subtitles


Post by mgutt

I finished this "little" bash script that automatically exports all subtitles of all MKVs and determines which of them are forced. A subtitle is treated as forced if its file size is below a limit (<3MB works for 99% of my Blu-ray rips) or if its track is already named "Forced", for example because you named it in MakeMKV:
2019-11-19 22_43_17.jpg
Every time the script is executed, it processes the next MKV file, so you need to run it repeatedly through a cron job or a task scheduler. You can also copy & paste the script directly into the command line.
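
Because each run handles only one MKV file, a small wrapper can call the script in a loop instead of waiting for the next scheduled run. This is only a minimal sketch: it assumes the script is saved as a file and that it prints the "No mkv files found" message from the listing further down when nothing is left. The names run_until_done and max_runs, and the path in the usage note, are made up for illustration.

```shell
#!/bin/bash
# Repeat a command until its output reports that nothing is left to do.
# $1 = maximum number of runs (safety limit), remaining arguments = command
run_until_done() {
    local max_runs="$1" output run
    shift
    for ((run = 1; run <= max_runs; run++)); do
        # stop on the first failing run
        output="$("$@")" || return 1
        printf '%s\n' "$output"
        # message printed by mkv2sup when every MKV has been processed
        if [[ $output == *"No mkv files found"* ]]; then
            return 0
        fi
    done
    return 2 # limit reached before the script reported completion
}
```

Usage (hypothetical path): run_until_done 500 /volume1/scripts/mkv2sup.sh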

P.S. Please vote for auto-naming of forced subtitles.

Settings:
- movies_path: The path to your movie collection
- docker_config_path: Compare with your docker configuration
- sub_langs: "all" exports all subtitle languages, "ger,eng,tur" exports only these languages
- sub_forced_max_size: The maximum file size in MB of a forced subtitle; you may want to reduce it for TV show episodes
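
To make the role of sub_forced_max_size more tangible, here is a small sketch of the decision as I understand it from the script: a track named "Forced" counts as forced, a track with any other name (Regular, SDH, ...) does not, and an unnamed track ("und") is judged by its file size. The function names mb_to_bytes and looks_forced are illustrative and do not appear in the script.

```shell
#!/bin/bash
# Convert a size in MB (may be a decimal like "2.5") to whole bytes.
mb_to_bytes() {
    awk -v mb="$1" 'BEGIN { printf "%d\n", mb * 1000000 }'
}

# Succeeds if a subtitle looks forced.
# $1 = track name, $2 = file size in bytes, $3 = size limit in MB
looks_forced() {
    local track_name="$1" filesize="$2" max_bytes
    max_bytes="$(mb_to_bytes "$3")"
    case "${track_name,,}" in
        forced) return 0 ;; # already named as forced
        und|"") ;;          # unnamed track: decide by file size below
        *) return 1 ;;      # named otherwise, e.g. "SDH"
    esac
    [[ "$filesize" -le "$max_bytes" ]]
}
```

Usage: looks_forced "und" "$(stat -c%s "movie.track3.ger.und.sup")" 3 && echo "forced"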

Requirements:
- Linux
- Docker
- MKVToolNix Container
- Movie collection in subfolders like "/volume1/movies/Ben Hur (1959)/Ben Hur (1959).mkv"
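
The folder layout matters because the script only looks one level below the movie collection and skips every MKV that already has a SUP or SRT file next to it. A simplified sketch of that search, under the assumption that a subtitle belongs to an MKV when its file name starts with the MKV's base name followed by a dot (the function name next_mkv_without_subs is made up for this example):

```shell
#!/bin/bash
# Print the first MKV below "$1/<movie folder>/" that has no .sup or .srt
# file starting with the same base name; return 1 if none is left.
next_mkv_without_subs() {
    local movies_path="${1%/}" movie_dir mkv base sub has_subs
    for movie_dir in "$movies_path"/*/; do
        for mkv in "$movie_dir"*.mkv; do
            [[ -e $mkv ]] || continue # glob did not match anything
            base="$(basename "$mkv" .mkv)"
            has_subs=false
            for sub in "$movie_dir"*.sup "$movie_dir"*.srt; do
                [[ -e $sub ]] || continue
                if [[ "$(basename "$sub")" == "$base."* ]]; then
                    has_subs=true
                    break
                fi
            done
            if [[ $has_subs == false ]]; then
                printf '%s\n' "$mkv"
                return 0
            fi
        done
    done
    return 1
}
```

Usage: next_mkv_without_subs "/volume1/movies"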


Code: Select all

#!/bin/bash
# #####################################
# mkv2sup v0.5
# Notes:
# mkv2sup automatically exports all subtitles of all MKV files in a specific folder.
# After that it determines the forced subtitles. MKVs without (compatible) subtitles will be skipped.
# It works only with MKVs in subfolders like /volume1/movies/Ben Hur/Ben Hur.mkv
# Changelog:
# 0.5
# - "#!/bin/bash" added to the head of the script to force the usage of the correct interpreter
# 0.4
# - check mkv modification filetime to ensure it's not currently being written by another app
# 0.3
# - Docker is now optional
# - No exported SUP will be deleted if the new option <preserve_sup_files> is enabled
# - Check if mkvtoolnix docker container is in use by other bash script before killing it
# 0.2
# - Bug fix: Changed some exit status codes
# - Bug fix: Some file names containing dots were cut
# - Bug fix: Now all <sub_langs> of forced subtitles are renamed and not only those with <default_lang>
# - Bug fix: Named SUP files that were only skipped, will now be deleted, too
# 0.1
# - first release
# Todo:
# - update subtitle track names in mkv file
# - set forced track as default in mkv file
# - how to solve doubles (two forced subtitle tracks in the same language, at the moment those will be tried to be renamed, but this fails as there already exists one)
# - add support for DVDs S_VOBSUB subtitles
# - determine all SRT subtitles (Regular, SDH, etc.) by using word/char matching
# - while writing SUP files check if one contains the word "Forced" and is <default_lang>
# - skip MKV files that are currently written
# #####################################
# ######### Settings ##################
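# Note: only <sub_langs> survived in this listing. The other values below are placeholders (assumptions), adjust them to your setup.
movies_path="/volume1/movies" # the path to your movie collection
docker_config_path="/volume1/docker" # compare with your docker configuration
sub_langs="ger,eng,tur" # Use "all" to preserve all subtitle languages. Note: The first language is set as default.
sub_forced_max_size="3" # maximum file size of a forced subtitle in MB
preserve_sup_files="false" # "true" keeps all exported SUP files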
# #####################################
# ######### Script ####################
# check user settings
movies_path=$([[ "${movies_path: -1}" == "/" ]] && echo "${movies_path%?}" || echo "$movies_path")
docker_config_path=$([[ "${docker_config_path: -1}" != "/" ]] && echo "${docker_config_path}/" || echo "$docker_config_path")
default_lang="${sub_langs:0:3}" # first language is used as default language
sub_forced_max_size="${sub_forced_max_size//[!0-9.]/}" # float filtering: keep only digits and dots
sub_forced_max_size=$(awk "BEGIN { print $sub_forced_max_size*1000000}") # convert MB to Bytes
function exitus() {
    exit_status="${1:-0}" # exit status passed by the caller
    # check if container exists
    if [[ -x "$(command -v docker)" ]] && [[ "$(docker ps -q -f name=mkvtoolnix_mkv2sub)" ]]; then
        # stop container only if it's not in use (by another shell script)
        mkvtoolnix_cpu_usage="$(docker stats mkvtoolnix_mkv2sub --no-stream --format "{{.CPUPerc}}")"
        # if [[ ${mkvtoolnix_cpu_usage%.*} -lt 1 ]]; then
            # we do not stop the container as our script is not race-condition safe!
            # echo "Stop mkvtoolnix container"
            # docker stop mkvtoolnix_mkv2sub
            # docker rm mkvtoolnix_mkv2sub
        # fi
    fi
    exit "$exit_status"
}
function mkv_getinfo() {
    # check if mkvtoolnix exists
    if [[ -x "$(command -v mkvmerge)" ]]; then
        echo "mkvtoolnix will be used to fetch tracks information"
        mkv_info="$(mkvmerge -J "$mkv_path")"
        return 0 # "return" only accepts a numeric status, the result is passed through $mkv_info
    # check if docker exists
    elif [[ -x "$(command -v docker)" ]]; then
        echo "Docker will be used to fetch tracks information"
        # check if mkvtoolnix container exists
        if [[ ! "$(docker ps -q -f name=mkvtoolnix_mkv2sub)" ]]; then
            # check for blocking container
            if [[ "$(docker ps -aq -f status=exited -f name=mkvtoolnix_mkv2sub)" ]]; then
                docker rm mkvtoolnix_mkv2sub
            fi
            echo "mkvtoolnix container needs to be started"
            # start mkvtoolnix container
            # Note: the container name and the image are missing from this listing. The name is taken from
            # the "docker ps" and "docker exec" calls, the image has to be supplied through <mkvtoolnix_image>.
            docker_options=(
                run -d
                --name=mkvtoolnix_mkv2sub
                -e TZ=Europe/Berlin
                -v "${docker_config_path}mkvtoolnix_mkv2sub:/config:rw"
                -v "${movies_path}:/storage:rw"
                "${mkvtoolnix_image:?set mkvtoolnix_image to the name of your MKVToolNix image}"
            )
            echo "docker ${docker_options[*]}"
            docker "${docker_options[@]}"
        fi
        mkv_info="$(docker exec mkvtoolnix_mkv2sub /usr/bin/mkvmerge -J "$docker_mkv_path")"
        return 0
    fi
    echo "mkvtoolnix and docker do not exist!"
    exitus 1
}
function mkv_extract() {
    # check if mkvtoolnix exists
    if [[ -x "$(command -v mkvmerge)" ]]; then
        echo "mkvtoolnix will be used to extract tracks"
        mkv_info="$(mkvextract "$mkv_path" "${mkvextract_options[@]}")"
        return 0
    # check if docker exists
    elif [[ -x "$(command -v docker)" ]]; then
        echo "mkvtoolnix@docker will be used to extract tracks"
        # check if mkvtoolnix container exists
        if [[ ! "$(docker ps -q -f name=mkvtoolnix_mkv2sub)" ]]; then
            # check for blocking container (an exited container with the same name has to be removed first)
            if [[ "$(docker ps -aq -f status=exited -f name=mkvtoolnix_mkv2sub)" ]]; then
                docker rm mkvtoolnix_mkv2sub
            fi
            echo "mkvtoolnix container needs to be started"
            # start mkvtoolnix container
            # Note: the container name and the image are missing from this listing. The name is taken from
            # the "docker ps" and "docker exec" calls, the image has to be supplied through <mkvtoolnix_image>.
            docker_options=(
                run -d
                --name=mkvtoolnix_mkv2sub
                -e TZ=Europe/Berlin
                -v "${docker_config_path}mkvtoolnix_mkv2sub:/config:rw"
                -v "${movies_path}:/storage:rw"
                "${mkvtoolnix_image:?set mkvtoolnix_image to the name of your MKVToolNix image}"
            )
            echo "docker ${docker_options[*]}"
            docker "${docker_options[@]}"
        fi
        echo "docker exec mkvtoolnix_mkv2sub /usr/bin/mkvextract $docker_mkv_path ${mkvextract_options[*]}"
        docker exec mkvtoolnix_mkv2sub /usr/bin/mkvextract "$docker_mkv_path" "${mkvextract_options[@]}"
        return 0
    fi
    echo "mkvtoolnix and docker do not exist!"
    exitus 1
}
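# get next mkv file
shopt -s nullglob # avoid empty directory errors
for movie_path in "$movies_path"/*; do
    mkv_folder="$(basename "$movie_path")"
    echo "Parsing '$mkv_folder'..."
    for mkv_path in "$movie_path"/*.mkv; do
        mkv_basename=$(basename "$mkv_path")
        mkv_filename="${mkv_basename%.*}" # file name without the .mkv extension
        file_time=$(stat -c %Y "$mkv_path") # file modification time
        file_time=$((file_time + 120)) # the last modification should be at least two minutes ago
        current_time=$(date +%s) # current timestamp
        if [[ $file_time -gt $current_time ]]; then
            # skip this mkv file because it may still be written by another app
            echo "'$mkv_path' has been modified recently and will be skipped."
            continue
        fi
        for sup_path in "$movie_path"/*.sup; do
            sup_basename=$(basename "$sup_path")
            if [[ $sup_basename == *"$mkv_filename."* ]]; then
                # skip this mkv file because its sup subtitle has been found
                continue 2
            fi
        done
        for srt_path in "$movie_path"/*.srt; do
            srt_basename=$(basename "$srt_path")
            if [[ $srt_basename == *"$mkv_filename."* ]]; then
                # skip this mkv file because its srt subtitle has been found
                continue 2
            fi
        done
        echo "'$mkv_path' has been found."
        # we found an mkv file without subtitle files
        docker_mkv_path="/storage/${mkv_folder}/${mkv_basename}" # path inside the container, <movies_path> is mounted as /storage
        # Assumption: the original name of the empty SUP file is missing from this listing.
        # Any SUP file whose name contains "<mkv filename>." satisfies the skip check above.
        empty_sup_filename="${movie_path}/${mkv_filename}.sup"
        break 2
    done
done
shopt -u nullglob # it's important to reset this setting
# no mkv file found
if [[ -z $docker_mkv_path ]]; then
    echo "No mkv files found or all subtitles have been exported!"
    exitus 0
fi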
mkv_getinfo # uses $mkv_path, fills $mkv_info
if [[ -z $mkv_info ]]; then
    echo "Error while fetching tracks information with mkvmerge"
    exitus 1
fi
echo "Information about all tracks has been obtained."
# parse info
sub_track_ids=(); track_langs=(); track_names=(); track_codec_ids=()
while read -r line; do
    echo "$line"
    # Note: we did not use "jq -r" to parse JSON as it needs installation
    track_codec_name=$(echo "$line" | grep -oP '^.*?(?=\")')
    track_id=$(echo "$line" | grep -oP '(?<="id": )[0-9]+')
    track_bits=$(echo "$line" | grep -oP '(?<="audio_bits_per_sample": )[0-9]+')
    track_channels=$(echo "$line" | grep -oP '(?<="audio_channels": )[0-9]+')
    track_codec_id=$(echo "$line" | grep -oP '(?<="codec_id": ").*?[^\\](?=\",)')
    track_lang=$(echo "$line" | grep -oP '(?<="language": ")[a-z]+')
    track_name=$(echo "$line" | grep -oP '(?<="track_name": ").*?[^\\](?=\",)') # most flexible way of getting a JSON value
    track_default=$(echo "$line" | grep -oP '(?<="default_track": )(true|false)')
    track_forced=$(echo "$line" | grep -oP '(?<="forced_track": )(true|false)')
    track_type=$(echo "$line" | grep -oP '(?<=")[a-z]+$')
    # collect track langs
    if [[ -n $track_lang ]]; then
        track_langs[$track_id]="$track_lang"
    else
        track_langs[$track_id]='und' # und = undetermined
    fi
    # collect track names (unnamed tracks get "und", which is what the forced detection below checks for)
    if [[ -n $track_name ]]; then
        track_names[$track_id]="$track_name"
    else
        track_names[$track_id]='und'
    fi
    # collect codec ids
    if [[ -n $track_codec_id ]]; then
        track_codec_ids[$track_id]="$track_codec_id"
    fi
    # collect subtitles in preferred languages
    if [[ $track_type == "subtitles" ]] && [[ $track_codec_id == "S_HDMV/PGS" ]]; then
        if [[ $sub_langs == "all" ]] || [[ $sub_langs == *"$track_lang"* ]]; then
            sub_track_ids+=("$track_id")
        fi
    fi
done < <(echo "$mkv_info" |
        tr -d '\n' | # we need to remove line breaks with "tr" to force grep to return one-liners
        grep -oP '(?<=codec": ").*?"type": "[a-z]+') # Regex is faster than looping through all lines
# create empty sup file if mkv file does not contain any subtitles (by that it will be skipped in next turn)
if [[ ${#sub_track_ids[@]} -eq 0 ]]; then
    echo "The empty SUP file '${empty_sup_filename}' will be created to skip MKV file '${mkv_path}' in the next turn as it does not contain any (compatible) subtitles."
    touch "$empty_sup_filename"
    exitus 0
fi
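# build mkvextract export parameter
# Note: the loop body is missing from this listing. The lines below are a reconstruction (assumption) based on
# the naming scheme "<filename>.track<id>.<lang>.<name>.sup" that the forced subtitle detection below expects.
if [[ -x "$(command -v mkvmerge)" ]]; then
    sup_export_path="$movie_path" # local mkvtoolnix writes directly into the movie folder
else
    sup_export_path="/storage/${mkv_folder}" # path inside the container, <movies_path> is mounted as /storage
fi
mkvextract_options=(tracks)
for track_id in "${sub_track_ids[@]}"; do
    # file naming scheme "Movie_Name.[Language_Code].forced.ext" adopted from Plex
    mkvextract_options+=("${track_id}:${sup_export_path}/${mkv_filename}.track${track_id}.${track_langs[$track_id]}.${track_names[$track_id]}.sup")
done
# export all subtitles
mkv_extract # uses mkv_path, docker_mkv_path, mkvextract_options, movies_path
echo "Successfully extracted all subtitles"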
# determine forced subtitle
forced_found="false"
shopt -s nullglob
shopt -s nocasematch # case-insensitive string comparison
for sup_path in "${movie_path}"/*.sup; do
    # get path parts
    sup_dirname=$(dirname "$sup_path")
    sup_basename=$(basename "$sup_path")
    sup_filename=${sup_basename%.*.*.*.*} # (filename).track[0-9].<lang>.<name>.sup
    sup_extension=${sup_basename/#"$sup_filename"./} # filename.(track[0-9].<lang>.<name>.sup)
    # fetch track data through filename
    IFS='.' read -ra track_data <<< "$sup_extension" # split at dots into an array (IFS is changed for this command only)
    track_id=${track_data[0]}
    track_lang=${track_data[1]}
    track_name=${track_data[2]}
    track_id=${track_id/track/} # remove the word "track"
    # skip SUP files with wrong naming scheme
    if [[ -n ${track_id//[0-9]/} ]]; then
        echo "'$track_id' is not a track id"
        continue
    fi
    if [[ ${#track_lang} -lt 2 ]] || [[ ${#track_lang} -gt 3 ]] || [[ -n "${track_lang//[a-zA-Z]/}" ]]; then
        echo "'$track_lang' is not a track lang"
        continue
    fi
    if [[ -n ${track_name//[a-zA-Z \']/} ]]; then
        echo "'$track_name' is not a track name"
        continue
    fi
    # set Plex compatible filename
    sup_filename_new="${sup_dirname}/${sup_filename}.${track_lang}.forced.sup"
    # determine by track name (the braces make sure that "all" alone does not mark every subtitle as forced)
    if [[ $track_name == "forced" ]] && { [[ $sub_langs == *"$track_lang"* ]] || [[ $sub_langs == "all" ]]; }; then
        mv "$sup_path" "$sup_filename_new"
        echo "'$sup_path' has been renamed to '$sup_filename_new'"
        forced_found="true"
        continue
    fi
    # skip subtitle tracks that already have names like "Regular", "SDH", etc.
    if [[ $track_name != "und" ]]; then
        if [[ $preserve_sup_files == "false" ]]; then
            rm -f "$sup_path"
            echo "'$sup_path' has been deleted"
        fi
        continue
    fi
    # determine by filesize
    filesize=$(stat -c%s "$sup_path")
    if [ "$sub_forced_max_size" -ge "$filesize" ]; then
        echo "'$sup_path' is small enough to be a forced subtitle"
        mv "$sup_path" "$sup_filename_new"
        # cp --backup "$sup_path" "$sup_filename_new"
        echo "'$sup_path' has been renamed to '$sup_filename_new'"
        forced_found="true"
        continue
    fi
    # delete all other exported subtitles
    if [[ $preserve_sup_files == "false" ]]; then
        rm -f "$sup_path"
        echo "'$sup_path' has been deleted"
    fi
done
shopt -u nocasematch
shopt -u nullglob
# create empty sup file if mkv does not contain at least one forced subtitle (by that it will be skipped in next turn)
if [[ $preserve_sup_files != "true" ]] && [[ $forced_found != "true" ]]; then
    echo "The empty SUP file '${empty_sup_filename}' has been created to skip MKV file '${mkv_path}' in the next turn as it does not contain forced subtitles."
    touch "$empty_sup_filename"
    exitus 0
fi
exitus 0
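
The script recovers the track data from the exported file names, following the scheme "<filename>.track<id>.<lang>.<name>.sup" from its comments. As an alternative to the parameter expansions used above, the same split can be sketched with a single regular expression. This is not what the script does, and the function name parse_sup_name and the sup_* variables are made up for this example.

```shell
#!/bin/bash
# Split "<movie>.track<id>.<lang>.<name>.sup" into its parts.
# Sets the global variables sup_movie, sup_track_id, sup_lang and sup_name,
# returns 1 if the file name does not follow the scheme.
parse_sup_name() {
    local basename="${1##*/}" # strip the directory part
    local pattern="^(.+)\.track([0-9]+)\.([A-Za-z]{2,3})\.([A-Za-z ']+)\.sup$"
    if [[ $basename =~ $pattern ]]; then
        sup_movie="${BASH_REMATCH[1]}"
        sup_track_id="${BASH_REMATCH[2]}"
        sup_lang="${BASH_REMATCH[3]}"
        sup_name="${BASH_REMATCH[4]}"
        return 0
    fi
    return 1
}
```

Usage: parse_sup_name "$sup_path" && echo "track $sup_track_id ($sup_lang): $sup_name"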
Re: mkv2sup - Auto export and determine forced subtitles

Post by mgutt

mkv2sup exports all subtitles:
2019-11-19 22_01_30.jpg
then determines the forced subtitles:
2019-11-19 22_01_40.jpg
Re: mkv2sup - Auto export and determine forced subtitles

Post by mgutt

Finally you can drag & drop the SUP files into Subtitle Edit and convert them to SRT:
2019-11-19 23_33_13.jpg
The SUP file naming is based on Plex: ... dia/#toc-3
Movies/Movie_Name (Release Date).[Language_Code].forced.ext
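
A short sketch of how such a Plex-style name can be derived from the path of the MKV file. The function name plex_forced_name is made up for this example, and only the last extension is stripped so that movie names containing dots stay intact:

```shell
#!/bin/bash
# Build "<dir>/<movie>.<lang>.forced.<ext>" for a movie file and a language.
# $1 = path of the MKV file, $2 = language code, $3 = subtitle extension (default: sup)
plex_forced_name() {
    local mkv_path="$1" lang="$2" ext="${3:-sup}"
    local dir base
    dir="$(dirname "$mkv_path")"
    base="$(basename "$mkv_path")"
    base="${base%.*}" # strip only the last extension, keep other dots
    printf '%s/%s.%s.forced.%s\n' "$dir" "$base" "$lang" "$ext"
}
```

Usage: plex_forced_name "/volume1/movies/Ben Hur (1959)/Ben Hur (1959).mkv" ger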
Re: mkv2sup - Auto export and determine forced subtitles

Post by mgutt

Multiple MKVs per movie are supported as well:

While exporting all subtitles:
2019-11-20 00_30_26.jpg
2019-11-20 00_32_49.jpg
Because of that, TV shows are supported as well, but only if the episodes are not located in season subfolders, as mkv2sup does not crawl sub-subfolders (not yet, maybe in a later release). So these will be processed:
"V:\Serien\Gilmore Girls (2000)\s01e01 Alles auf Anfang.mkv"
"V:\Serien\Gilmore Girls (2000)\s01e02 Ein klassischer Fehlstart.mkv"
"V:\Serien\Gilmore Girls (2000)\s01e03 Familie mit Handicap.mkv"
"V:\Serien\Gilmore Girls (2000)\s02e05 Ein schwerer Fall.mkv"
"V:\Serien\Gilmore Girls (2000)\s02e01 Der Antrag.mkv"
"V:\Serien\Gilmore Girls (2000)\s02e02 Nicht ohne meine Mutter.mkv"
But these will not:
"V:\Serien\Gilmore Girls (2000)\Staffel 1\s01e01 Alles auf Anfang.mkv"
"V:\Serien\Gilmore Girls (2000)\Staffel 1\s01e02 Ein klassischer Fehlstart.mkv"
"V:\Serien\Gilmore Girls (2000)\Staffel 1\s01e03 Familie mit Handicap.mkv"
"V:\Serien\Gilmore Girls (2000)\Staffel 2\s02e05 Ein schwerer Fall.mkv"
"V:\Serien\Gilmore Girls (2000)\Staffel 2\s02e01 Der Antrag.mkv"
"V:\Serien\Gilmore Girls (2000)\Staffel 2\s02e02 Nicht ohne meine Mutter.mkv"
Re: mkv2sup - Auto export and determine forced subtitles

Post by mgutt

Version 0.2 has been released:

Code: Select all

# 0.2
# - Bug fix: Changed some exit status codes
# - Bug fix: Some file names containing dots were cut
# - Bug fix: Now all <sub_langs> of forced subtitles are renamed and not only those with <default_lang>
# - Bug fix: Named SUP files that were only skipped, will now be deleted, too
Some new to-dos:
# - use mkvtoolnix instead of docker container (if installed)
# - add support for DVDs S_VOBSUB subtitles
# - determine all SRT subtitles (Regular, SDH, etc.) by using word/char matching
The last idea could be realized through a separate Bash script. I'm not sure at the moment.
Re: mkv2sup - Auto export and determine forced subtitles

Post by mgutt

Version 0.5 has been released. Updates since 0.2:

Code: Select all

# 0.5
# - "#!/bin/bash" added to the head of the script to force the usage of the correct interpreter
# 0.4
# - check mkv modification filetime to ensure it's not currently being written by another app
# 0.3
# - Docker is now optional
# - No exported SUP will be deleted if the new option <preserve_sup_files> is enabled
# - Check if mkvtoolnix docker container is in use by other bash script before killing it
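
The modification time check from 0.4 can be sketched as a small helper. It treats a file as finished if its last modification is at least two minutes old, which is the margin the script uses. The function name is_file_settled is made up for this example, and it relies on GNU stat like the script itself.

```shell
#!/bin/bash
# Succeeds if the file was last modified at least $2 seconds ago (default: 120).
# Returns 2 if the modification time cannot be read.
is_file_settled() {
    local file="$1" min_age="${2:-120}" mtime now
    mtime="$(stat -c %Y "$file")" || return 2
    now="$(date +%s)"
    (( now - mtime >= min_age ))
}
```

Usage: is_file_settled "$mkv_path" || echo "'$mkv_path' is probably still being written"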
