
Deobfuscate playlists "automatically" (The CompSci approach)

Posted: Sun Jan 01, 2017 8:48 pm
by ex0morph
Hi all,

Long time user, first time poster.

I was having difficulty with a couple of Blurays that utilise playlist obfuscation in an attempt to frustrate us MakeMKV users. In some cases it was simple to find the right playlist by searching the forums for the correct title found by others. However, I encountered a Bluray for which no one had posted the correct MPLS id (at least not for the particular authoring of the Bluray I had). Now I know there are several manual approaches to figuring out the correct playlist, but these require either:
  1. A Windows machine with a Bluray drive and proprietary software
  2. Potentially spoiling the film by manually working out the correct M2TS order
So I put my geek on and used some computer vision to essentially do option 2 programmatically. My assumption going into this was that each M2TS file should follow on seamlessly from the previous one (i.e. there should be minimal difference between the last frame of one M2TS file and the first frame of the correct follow-on M2TS file). One caveat to this would be if a chapter fell exactly on a segment boundary (I'll ignore this for the time being).
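
To make that assumption concrete, here is a minimal sketch (Python 3 with numpy). The tiny synthetic "frames" and the helper names mse and best_follow_on are made up for illustration and are not part of the script further down. It scores each candidate follow-on segment by the mean squared error between its first frame and the last frame of the current segment, then picks the lowest score.

```python
import numpy as np

def mse(frame_a, frame_b):
    """Mean squared error between two equally sized grayscale frames."""
    # Convert to float first so uint8 subtraction cannot wrap around.
    a = frame_a.astype("float64")
    b = frame_b.astype("float64")
    return float(np.mean((a - b) ** 2))

def best_follow_on(last_frame, candidates):
    """Return the id of the candidate whose first frame is closest to last_frame.

    candidates maps a (hypothetical) segment id to that segment's first frame.
    """
    return min(candidates, key=lambda seg_id: mse(last_frame, candidates[seg_id]))

# Synthetic 4x4 "frames": the last frame of the current segment and two candidates.
last_frame = np.full((4, 4), 100, dtype=np.uint8)
candidates = {
    "near": np.full((4, 4), 102, dtype=np.uint8),  # differs by 2 everywhere
    "far": np.full((4, 4), 200, dtype=np.uint8),   # differs by 100 everywhere
}

print(mse(last_frame, candidates["near"]))     # 4.0
print(best_follow_on(last_frame, candidates))  # near
```

Real frames are far noisier than this, which is why a hard scene cut or a chapter boundary on a segment transition can mislead a per-pixel metric.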

I have never been one for words so will let the code do the talking. What you will need to run this (if you so wish to):
  • Python (with modules: numpy, scipy and scikit-image)
  • ffmpeg
  • OpenCV3
  • MakeMKVcon
  • A Decrypted backup of a Bluray with obfuscated playlists
On a Debian system (assuming that MakeMKV is installed) just run the following to install the dependencies:

Code: Select all

sudo apt-get install libopencv-dev python-opencv
pip install scikit-image
The following code implements the deobfuscation by doing the following:
  • Captures the first and last frames of M2TS files from a given STREAM folder
  • Uses makemkvcon to extract the segment lists for each title of a Bluray
  • Loops over each playlist calculating the average difference between two frames either side of a segment transition. (It calculates both the mean squared error of pixel values and the structural similarity)
  • Displays title info for the "best" playlist for each difference metric.
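
The scoring step in that list can be sketched in isolation (Python 3, standard library only). Everything here is illustrative: average_transition_cost, toy_cost and the playlist ids are made-up names, and toy_cost merely stands in for a real frame comparison by treating consecutive segment numbers as a perfect match.

```python
def average_transition_cost(playlist, pair_cost, cache):
    """Average pair_cost over every adjacent pair of segments in a playlist."""
    transitions = len(playlist) - 1
    if transitions < 1:
        raise ValueError("a playlist needs at least two segments")
    total = 0.0
    for prev_seg, next_seg in zip(playlist, playlist[1:]):
        key = (prev_seg, next_seg)
        if key not in cache:
            # Only compute each transition once, however many playlists share it.
            cache[key] = pair_cost(prev_seg, next_seg)
        total += cache[key]
    return total / transitions

def toy_cost(prev_seg, next_seg):
    """Hypothetical stand-in for a frame difference: 0 when next follows prev."""
    return abs(next_seg - prev_seg - 1)

# Made-up playlists keyed by a made-up MPLS id.
playlists = {
    "800": [1, 2, 3, 4],
    "801": [1, 3, 2, 4],
    "802": [4, 2, 3, 1],
}

cache = {}
scores = {mpls_id: average_transition_cost(segs, toy_cost, cache)
          for mpls_id, segs in playlists.items()}
best = min(scores, key=scores.get)

print("best:", best)  # best: 800
```

For a similarity metric such as structural similarity, where higher is better, the same loop applies with max in place of min.
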
Now for the code:

Save this to a python file (e.g. deobfuscate_playlists.py) and run like so:

Code: Select all

python deobfuscate_playlists.py <Path to where the decrypted Bluray folder is - i.e. where the BDMV folder is located (not the actual BDMV folder)>
You can also add a "--test" flag to the end of the above command that will pop up a window showing the first and last frame for each found M2TS file. This is useful to check that OpenCV is correctly installed.
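
If no display is available for the pop-up windows, a rough non-interactive check is to look at OpenCV's build report. The sketch below (Python 3) assumes the report contains a line of the form "FFMPEG: YES" or "FFMPEG: NO". The layout can differ between OpenCV versions, so treat the parsing as an assumption. ffmpeg_enabled, check_installed_opencv and the SAMPLE excerpt are hypothetical names and data. cv2.getBuildInformation() is the only OpenCV call used.

```python
import re

def ffmpeg_enabled(build_info):
    """Return True/False for the FFMPEG entry in an OpenCV build report,
    or None when no such entry is found."""
    match = re.search(r"^\s*FFMPEG:\s*(YES|NO)\b", build_info, flags=re.MULTILINE)
    if match is None:
        return None
    return match.group(1) == "YES"

def check_installed_opencv():
    """Inspect the locally installed OpenCV (requires the cv2 module)."""
    import cv2  # imported lazily so the parser above works without OpenCV
    return ffmpeg_enabled(cv2.getBuildInformation())

# Illustrative excerpt only; a real build report is much longer.
SAMPLE = """
  Video I/O:
    FFMPEG:                      NO
    GStreamer:                   YES
"""

print(ffmpeg_enabled(SAMPLE))  # False
```

Calling check_installed_opencv() on the machine that will run the script returns True when OpenCV reports FFmpeg support.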

Notes:
  • No guarantees whatsoever that this will work. I have only been able to test it on the 2 obfuscated Blurays I own, and it correctly identified the playlist in both cases. I would be interested in others testing it on Blurays for which they know the correct playlist id.
  • The process takes several minutes to run (3-4 mins for the BRs I tested) but does use some caching to avoid unnecessary re-calculations of the same segment transitions.
  • The code is pretty shocking - I hacked it together in a couple of hours - please don't hate on it too much. (Note the "magic" numbers 3 and 5 in the code, if you have a keen eye.)
  • I have no experience with makemkvcon prior to this so the section "Get segment lists from MakeMKV" is likely to be fragile and a terrible way to achieve what it is trying to.
  • The Structural Similarity method is the only one that has successfully identified the correct playlist for me. I left the other one in for testing with other BRs.

Code: Select all

import argparse
import cv2
import numpy as np
import os
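try:
    # scikit-image 0.16 and later
    from skimage.metrics import structural_similarity as compare_ssim
except ImportError:
    # older scikit-image releases
    from skimage.measure import compare_ssim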
from subprocess import Popen, PIPE

parser = argparse.ArgumentParser(description="Attempts to find the correct playlist for a Bluray")
parser.add_argument("folder", help="Path to where the BDMV folder is located (Exclude BDMV from path)")
parser.add_argument("-s", "--stream", help="Path to where the STREAM folder is located (Auto as {folder}/BDMV/STREAM)")
parser.add_argument("-m", "--makemkvpath", help="The path to makemkvcon (Default is /usr/bin/makemkvcon)")
parser.add_argument("--test", action="store_true", help="Shows captured frames to test that OpenCV / FFMPEG is working correctly")
ARGS = parser.parse_args()

STREAM_PATH = ARGS.stream if ARGS.stream else os.path.join(ARGS.folder, "BDMV", "STREAM")
MAKEMKVCON = ARGS.makemkvpath if ARGS.makemkvpath else "/usr/bin/makemkvcon"

FIRST_LAST_FRAMES = {}

try:
    CV_FRAME_COUNT = cv2.cv.CV_CAP_PROP_FRAME_COUNT
    CV_PROP_POS_FRAMES = cv2.cv.CV_CAP_PROP_POS_FRAMES
except AttributeError:
    CV_FRAME_COUNT = cv2.CAP_PROP_FRAME_COUNT
    CV_PROP_POS_FRAMES = cv2.CAP_PROP_POS_FRAMES

# ================================================ #
# Grab first and last frames of all segments       #
# ================================================ #
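def preprocess_frame(frame):
    # cap.read() yields None when no video frame could be decoded
    # (e.g. an audio-only segment such as title music)
    if frame is None:
        print("Skipped - Probably title music")
        return None
    return cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)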

for file in os.listdir(STREAM_PATH):
    if file.endswith(".m2ts"):
        print("Found STREAM file: " + file)
        cap = cv2.VideoCapture(os.path.join(STREAM_PATH, file))

        total_frames = cap.get(CV_FRAME_COUNT)

        _, first_frame = cap.read()
        cap.set(CV_PROP_POS_FRAMES, total_frames-5)
        _, last_frame = cap.read()
        # Drop the extension first, then leading zeros only ("00000" becomes "0")
        segment_id = os.path.splitext(file)[0].lstrip('0') or '0'

        first_frame, last_frame = preprocess_frame(first_frame), preprocess_frame(last_frame)

        if first_frame is not None and last_frame is not None:
            FIRST_LAST_FRAMES[segment_id] = (first_frame, last_frame)

            if ARGS.test:
                first_and_last = np.concatenate((first_frame, last_frame), axis=1)
                first_and_last = cv2.resize(first_and_last, None, fx=0.3, fy=0.3, interpolation=cv2.INTER_CUBIC)
                cv2.imshow(file, first_and_last)
                while True:
                    key = cv2.waitKey(1)
                    if key & 0xFF == ord('n'):
                        cv2.destroyAllWindows()
                        break
                    if key & 0xFF == ord('q'):
                        cv2.destroyAllWindows()
                        exit()
        cap.release()

# ================================================ #
# Get segment lists from MakeMKV                   #
# ================================================ #
class TitleInfo(object):
    def __init__(self, mkv_key, mpls_id, playlist):
        self.mpls_id = mpls_id
        self.playlist = playlist
        self.mkv_key = mkv_key
        self.differences = 99999999999
        self.struct_similarity = 0

def create_titleinfo(block):
    mkv_key = block[0].split(",")[0][6:]
    mpls_file = block[5].split("\"")[1]
    mpls_id = os.path.splitext(mpls_file)[0].lstrip('0') or '0'
    playlist = block[7].split("\"")[1]
    if len(playlist.split(',')) < 3:
        print("Invalid playlist (" + playlist + ") - possibly the wrong TINFO index?")
        return None

    title_info = TitleInfo(mkv_key, mpls_id, playlist)
    return title_info

TITLE_INFO_STORE = []

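# universal_newlines=True makes communicate() return text rather than bytes
stdout, stderr = Popen([MAKEMKVCON, "-r", "info", "file:" + ARGS.folder], stdout=PIPE, stderr=PIPE, universal_newlines=True).communicate()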
mkv_info = stdout.splitlines()

linenum = 0

while linenum < len(mkv_info):
    line = mkv_info[linenum]
    if line.startswith("TINFO"):
        startnum = linenum
        # Advance to the end of this run of TINFO lines without reading past the last line
        while linenum < len(mkv_info) and mkv_info[linenum].startswith("TINFO"):
            linenum += 1
        title_info = create_titleinfo(mkv_info[startnum:linenum])
        if title_info is not None:
            TITLE_INFO_STORE.append(title_info)
    else:
        linenum += 1

# ================================================ #
# Calculate differences                            #
# ================================================ #

CACHED_DIFFERENCES = {}

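def mean_squared_error(last_frame_of_previous, first_frame_of_next):
    # Sum of squared pixel differences, normalised by the number of pixels.
    # Both frames are assumed to have the same dimensions.
    mse = np.sum((last_frame_of_previous.astype("float") - first_frame_of_next.astype("float")) ** 2)
    mse /= float(last_frame_of_previous.shape[0] * last_frame_of_previous.shape[1])
    return mse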

def calculate_differences(segment_a, segment_b):
    diff = mean_squared_error(FIRST_LAST_FRAMES[segment_a][1], FIRST_LAST_FRAMES[segment_b][0])
    struct = compare_ssim(FIRST_LAST_FRAMES[segment_a][1], FIRST_LAST_FRAMES[segment_b][0])
    return diff, struct

for title in TITLE_INFO_STORE:
    segments = title.playlist.split(",")
    diff = 0
    struct = 0
    for ind in range(0, len(segments) - 1):
        cache_key = segments[ind] + "_" + segments[ind + 1]
        if cache_key in CACHED_DIFFERENCES:
            diff += CACHED_DIFFERENCES[cache_key][0]
            struct += CACHED_DIFFERENCES[cache_key][1]
        else:
            frame_diff, frame_struct = calculate_differences(segments[ind], segments[ind + 1])
            CACHED_DIFFERENCES[cache_key] = (frame_diff, frame_struct)
            diff += frame_diff
            struct += frame_struct

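    # Average over the number of transitions (one fewer than the number of segments)
    transitions = len(segments) - 1
    title.differences = diff / transitions
    title.struct_similarity = struct / transitions

    print(",".join([str(title.differences), str(title.struct_similarity), title.mkv_key, title.mpls_id, title.playlist]))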

# ================================================ #
# Find the "best" playlists                        #
# ================================================ #
best_mse = TITLE_INFO_STORE[0]
best_ssim = TITLE_INFO_STORE[0]
for title in TITLE_INFO_STORE:
    if title.differences < best_mse.differences:
        best_mse = title
    if title.struct_similarity > best_ssim.struct_similarity:
        best_ssim = title

print("===========================================")
print("Best title (Based on pixel differences)")
print("===========================================")
print("AvgDiff: " + str(best_mse.differences))
print("AvgStru: " + str(best_mse.struct_similarity))
print("MKV Key: " + best_mse.mkv_key)
print("MPLS ID: " + best_mse.mpls_id)
print("SegList: " + best_mse.playlist)
print("===========================================")
print("Best title (Based on structural similarity)")
print("===========================================")
print("AvgDiff: " + str(best_ssim.differences))
print("AvgStru: " + str(best_ssim.struct_similarity))
print("MKV Key: " + best_ssim.mkv_key)
print("MPLS ID: " + best_ssim.mpls_id)
print("SegList: " + best_ssim.playlist)
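
Since the "Get segment lists from MakeMKV" section relies on the position of each TINFO line within a block, a less fragile alternative might be to key on the attribute id that each robot-mode line carries. The sketch below (Python 3, standard library only) assumes lines of the form TINFO:title,attribute,code,"value". It also assumes that attribute 16 is the source file name and attribute 26 is the segment map. Those ids are from memory of MakeMKV's apdefs.h and should be verified against your own makemkvcon output. parse_tinfo and the SAMPLE lines are made up for illustration.

```python
import csv

# Assumed attribute ids (verify against your MakeMKV version)
ATTR_SOURCE_FILE = 16   # e.g. "00800.mpls"
ATTR_SEGMENT_MAP = 26   # e.g. "1,2,3"

def parse_tinfo(lines):
    """Collect TINFO attributes per title: {title_index: {attr_id: value}}."""
    titles = {}
    for line in lines:
        if not line.startswith("TINFO:"):
            continue
        # csv copes with the quoted value, which may itself contain commas
        fields = next(csv.reader([line[len("TINFO:"):]]))
        if len(fields) < 4:
            continue
        title_index, attr_id, value = int(fields[0]), int(fields[1]), fields[3]
        titles.setdefault(title_index, {})[attr_id] = value
    return titles

# Hypothetical robot-mode output, not captured from a real disc.
SAMPLE = [
    'TINFO:0,16,0,"00800.mpls"',
    'TINFO:0,26,0,"1,2,3"',
    'SINFO:0,0,1,6201,"Video"',
    'TINFO:1,16,0,"00801.mpls"',
    'TINFO:1,26,0,"1,3,2"',
]

titles = parse_tinfo(SAMPLE)
for index, attrs in sorted(titles.items()):
    segments = attrs.get(ATTR_SEGMENT_MAP, "").split(",")
    print(index, attrs.get(ATTR_SOURCE_FILE), segments)
```

Looking values up by id means a title that lacks some optional attribute no longer shifts the indexes of the ones that follow.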

Re: Deobfuscate playlists "automatically" (The CompSci appro

Posted: Sat Mar 11, 2017 9:22 am
by zeroepoch
This is a very useful script! I've been doing the makemkvcon -r approach and going segment by segment and removing titles from the list as I go to the next field until I narrow it down to one. I thought about using some image processing to compare the segments but was too lazy to ever code it. I just use vlc with 2 files on the command line to check the transition. Of course sometimes you make a mistake and it's obvious later. This script sounds like it would catch that.

Re: Deobfuscate playlists "automatically" (The CompSci appro

Posted: Sat Mar 11, 2017 10:34 am
by zeroepoch
On Fedora 25, OpenCV isn't compiled against ffmpeg by default, so I had to build OpenCV myself and do a bit of Python path setting to get the script to work. Unfortunately, for Moonlight it doesn't find the right track. It's not even one of the top few on either MSE or structural similarity. That said, with BDJ support MakeMKV now labels the main track automatically, and that seems to be correct for Moonlight.

Re: Deobfuscate playlists "automatically" (The CompSci appro

Posted: Sun Dec 24, 2017 9:14 pm
by thetoad
I've made some updates to this, can I check it into github under an OSS licence?

Re: Deobfuscate playlists "automatically" (The CompSci approach)

Posted: Fri Apr 26, 2019 1:57 am
by f1d094
thetoad wrote:
Sun Dec 24, 2017 9:14 pm
I've made some updates to this, can I check it into github under an OSS licence?
Did you ever post this update? If you didn't receive a response from the author, could you post the update here the way he did?

Re: Deobfuscate playlists "automatically" (The CompSci approach)

Posted: Wed Sep 11, 2019 10:55 pm
by kod4krome
I am about to try this. This should absolutely be on GitHub. Can the original author PLEASE put this on GitHub? And thank you for all your hard work.