BU40N MT1959 firmware format: packed region at 0x158000 (compression method known?)

ibizara
Posts: 2
Joined: Mon Jun 22, 2026 12:33 pm

Post by ibizara »

While comparing the internals of the BU40N 1.00 and 1.03 firmware (as part of investigating OmniDrive behaviour), I noticed a large packed-looking region whose contents differ almost completely between the two versions even though its header structure is very similar.

Firmware images examined:

Code: Select all

BU40N_1.00_stock.bin
MD5: edb28fcd7a239281ace26a468d382a9c

BU40N_1.03_MK.bin
MD5: 74ebaf627d2aac5f899191d6caceb54c
The 1.00 image is the original LG firmware. The 1.03_MK image is the MakeMKV/LibreDrive patched firmware based on LG 1.03.

Looking at the raw firmware images, there appears to be a module beginning at offset 0x158000.

BU40N 1.00

Code: Select all

offset:       0x158000
field 1:      0x00072464 (468,068)
field 2:      0x0004E98F (321,935)
region end:   0x1A698F
BU40N 1.03_MK

Code: Select all

offset:       0x158000
field 1:      0x00071C60 (466,016)
field 2:      0x0004E3B1 (320,433)
region end:   0x1A63B1
The start of the region looks like:

Code: Select all

0x158000:
00072464
0004E98F
07060605
07070706
08070706
08080808
...
and the corresponding structure in 1.03_MK is almost identical.
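
For anyone wanting to reproduce the numbers above, here is a minimal sketch of reading the two header words. It assumes the words are stored little-endian and that field 2 is a span measured from the start of the block (which is what the "region end" values imply). `read_block_header` and the dictionary keys are names made up for illustration, and the buffer is a synthetic stand-in, not a real firmware image.

```python
import struct

BLOCK_OFFSET = 0x158000  # offset reported in this thread


def read_block_header(firmware: bytes, offset: int = BLOCK_OFFSET) -> dict:
    """Read the two 32-bit words at the start of the packed block."""
    # Assumption: both fields are little-endian unsigned 32-bit integers.
    field1, field2 = struct.unpack_from("<II", firmware, offset)
    return {
        "field1": field1,
        "field2": field2,
        # Assumption: field 2 counts bytes from the block start.
        "region_end": offset + field2,
    }


# Synthetic stand-in for an image, filled with the 1.00 header values.
fake_image = bytearray(BLOCK_OFFSET + 8)
struct.pack_into("<II", fake_image, BLOCK_OFFSET, 0x00072464, 0x0004E98F)

header = read_block_header(bytes(fake_image))
print({name: hex(value) for name, value in header.items()})
```

With the 1.00 values this gives a region end of 0x1a698f, matching the figure listed above.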

After the first 8 bytes there is approximately 0x140 bytes of highly structured low-valued data, followed by a high-entropy stream that differs almost completely between firmware versions.

The interesting part is that the 0x140-byte structure does not appear random. Treating it as code-length data produced the following result:

First 288 entries → Kraft sum = 1.0
Last 32 entries → Kraft sum = 1.0
Entire 320-byte table → Kraft sum = 2.0

This suggests the region contains two complete prefix-code (Huffman-like) tables, arranged as:

Code: Select all

0x158000  size field
0x158004  size field
0x158008  288-byte code-length table
0x158128  32-byte code-length table
0x158148  compressed bitstream
The table structure is nearly identical between 1.00 and 1.03_MK, while the compressed stream contents are almost entirely different.
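
The Kraft-sum test can be sketched as below. A set of code lengths describes a complete prefix code exactly when the sum of 2^-length over the used symbols equals 1, so a running sum that hits 1 and then 2 points at two back-to-back tables. The function names are illustrative, and because the real table bytes are not reproduced in this post, the example uses DEFLATE's fixed literal/length lengths plus a flat 32-entry table as a stand-in with the same 288 + 32 shape.

```python
from fractions import Fraction


def kraft_sum(code_lengths) -> Fraction:
    """Sum 2**-n over all non-zero code lengths (0 means 'symbol unused')."""
    return sum((Fraction(1, 1 << n) for n in code_lengths if n), Fraction(0))


def complete_table_ends(code_lengths) -> list:
    """Return the entry counts at which the running Kraft sum is a whole number."""
    ends = []
    total = Fraction(0)
    for count, n in enumerate(code_lengths, start=1):
        if n:
            total += Fraction(1, 1 << n)
            if total.denominator == 1:
                ends.append(count)
    return ends


# Stand-in data: DEFLATE's fixed literal/length code, then 32 five-bit codes.
stand_in = [8] * 144 + [9] * 112 + [7] * 24 + [8] * 8 + [5] * 32

print(kraft_sum(stand_in[:288]))      # 1
print(kraft_sum(stand_in[288:]))      # 1
print(kraft_sum(stand_in))            # 2
print(complete_table_ends(stand_in))  # [288, 320]
```

Exact fractions are used instead of floats so that "equals 1" is an exact comparison.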

I extracted the stream and tested the obvious formats:

Code: Select all

zlib
raw deflate
gzip
bzip2
lzma/xz
All failed.
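
For reference, the probing can be done with the Python standard library alone. This is only a sketch of the kind of test described above, not the exact commands that were run; `probe_standard_formats` is a made-up helper name.

```python
import bz2
import lzma
import zlib


def probe_standard_formats(blob: bytes) -> list:
    """Return the names of the standard formats that decode blob without error."""
    attempts = {
        "zlib": lambda d: zlib.decompress(d, zlib.MAX_WBITS),
        "raw deflate": lambda d: zlib.decompress(d, -zlib.MAX_WBITS),
        "gzip": lambda d: zlib.decompress(d, zlib.MAX_WBITS | 16),
        "bzip2": bz2.decompress,
        "lzma/xz": lzma.decompress,
    }
    hits = []
    for name, decode in attempts.items():
        try:
            decode(blob)
        except (zlib.error, lzma.LZMAError, OSError, ValueError, EOFError):
            continue  # not this format
        hits.append(name)
    return hits


# Sanity check on known data before trusting a negative result.
print(probe_standard_formats(zlib.compress(b"BU40N" * 100)))
print(probe_standard_formats(b"\xff" * 64))
```

A negative result only rules out those containers at that exact starting offset, so it says nothing about custom Huffman + LZ schemes.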

I also looked at MediaTek's documented ALICE firmware compression (used in some MTK products). While there are some conceptual similarities (table-driven compressed instruction streams), this BU40N format does not appear to be a standard ALICE container.

Questions:
  1. Has anyone identified the compression/packing method used for this MT1959 firmware block?
  2. Does the decompressor reside inside the firmware itself, or in MT1959 boot ROM / mask ROM?
  3. Does this region contain executable ARM code, servo/DSP microcode, or some other firmware component?
  4. Has this area ever been reverse engineered by the LibreDrive / MakeMKV developers or anyone working on MTK optical drive firmware?
I'm mainly trying to understand the firmware format and whether this packed region has ever been decoded or modified successfully.
RibShark
Posts: 19
Joined: Mon Apr 29, 2019 6:27 pm

Re: BU40N MT1959 firmware format: packed region at 0x158000 (compression method known?)

Post by RibShark »

ibizara wrote: Mon Jun 22, 2026 12:40 pm Questions:
  1. Has anyone identified the compression/packing method used for this MT1959 firmware block?
  2. Does the decompressor reside inside the firmware itself, or in MT1959 boot ROM / mask ROM?
  3. Does this region contain executable ARM code, servo/DSP microcode, or some other firmware component?
  4. Has this area ever been reverse engineered by the LibreDrive / MakeMKV developers or anyone working on MTK optical drive firmware?
  1. I failed to work out what it was when I tried.
  2. The decompression code is in the firmware it seems, as ARM code.
  3. It contains THUMB code; various areas in the firmware jump to this code via a thunk. It's always decompressed to the same place so the addresses are static.
  4. Not sure about the MakeMKV dev but I haven't tried much to reverse the compression algorithm yet.
Lemme know if you are able to work this out, would be super helpful (I'm the OmniDrive dev). Right now I'm relying on RAM dumps from the drive which have this part decompressed, but being able to do this and recompress back into the firmware could open up some doors.

Re: BU40N MT1959 firmware format: packed region at 0x158000 (compression method known?)

Post by ibizara »

Small update / WIP.

I made some progress on the 0x158000 packed block in BU40N 1.00.

The original table split still looks correct:

Code: Select all

0x158008  288-byte literal/length code-length table
0x158128   32-byte distance code-length table
0x158148  compressed bitstream
I now have an experimental Python decoder that expands the BU40N 1.00 block from:

Code: Select all

compressed:   0x72464
decompressed: 0x4e98f
The format appears to be a custom canonical-Huffman + LZ77-style scheme, but not standard DEFLATE.

Current working assumptions:

Code: Select all

bitstream:       MSB-first
Huffman:         canonical codes, bit-reversed for lookup
symbol 256:      literal zero, not EOF
symbols 257-287: length symbols
distance:        raw distance symbol + 1
This produces an output file of the advertised decompressed size. As a sanity check, the decoded output contains:
0x27b76: CAETDVD_59110933
So it is definitely producing structured data from the packed block.

Important caveat: I do not think this is 100% solved yet. The output contains plausible Thumb-looking code and strings, but it does not currently decompile cleanly as one linear ARM/Thumb image. There may still be a small semantic difference in the decoder, a relocation/fixup step, a second transform, or simply mixed code/data/microcode in the decompressed payload.
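
To make the "canonical codes, bit-reversed for lookup" assumption concrete, here is a toy illustration on a four-symbol table (the lengths are invented for the example, not taken from the firmware). It shows that accumulating stream bits LSB-first and looking up bit-reversed canonical codes decodes the same symbols as matching the canonical codes MSB-first in stream order.

```python
def canonical_codes(lengths):
    """Assign canonical Huffman codes (RFC 1951 style): {symbol: (code, length)}."""
    max_len = max(lengths)
    count = [0] * (max_len + 1)
    for n in lengths:
        if n:
            count[n] += 1
    next_code = [0] * (max_len + 1)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + count[bits - 1]) << 1
        next_code[bits] = code
    codes = {}
    for symbol, n in enumerate(lengths):
        if n:
            codes[symbol] = (next_code[n], n)
            next_code[n] += 1
    return codes


def reverse_bits(value, width):
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def decode_msb_first(bits, codes):
    """First stream bit is treated as the most significant bit of the code."""
    lookup = {pair: symbol for symbol, pair in codes.items()}
    out, code, n = [], 0, 0
    for bit in bits:
        code, n = (code << 1) | bit, n + 1
        if (code, n) in lookup:
            out.append(lookup[(code, n)])
            code, n = 0, 0
    return out


def decode_reversed_lsb(bits, codes):
    """First stream bit goes to bit 0; lookup uses bit-reversed codes."""
    lookup = {(reverse_bits(c, n), n): symbol for symbol, (c, n) in codes.items()}
    out, code, n = [], 0, 0
    for bit in bits:
        code, n = code | (bit << n), n + 1
        if (code, n) in lookup:
            out.append(lookup[(code, n)])
            code, n = 0, 0
    return out


codes = canonical_codes([2, 1, 3, 3])   # toy lengths for symbols 0..3
stream = [0, 1, 0, 1, 1, 1, 1, 1, 0]    # symbols 1, 0, 3, 2 in stream order
print(codes)
print(decode_msb_first(stream, codes), decode_reversed_lsb(stream, codes))
```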

Here is the current Python script:

Code: Select all

#!/usr/bin/env python3
from __future__ import annotations  # lets the dict[...] annotations load on older Python 3

import argparse
import struct
from pathlib import Path

LBASE = [
    3, 4, 5, 6, 7, 8, 9, 10,
    11, 13, 15, 17, 19, 23, 27, 31,
    35, 43, 51, 59, 67, 83, 99, 115,
    131, 163, 195, 227, 258, 258, 258,
]

LEXT = [
    0, 0, 0, 0, 0, 0, 0, 0,
    1, 1, 1, 1, 2, 2, 2, 2,
    3, 3, 3, 3, 4, 4, 4, 4,
    5, 5, 5, 5, 0, 0, 0,
]


class BitReader:
    def __init__(self, data: bytes):
        self.data = data
        self.bitpos = 0

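    def read(self, n: int) -> int:
        # Bits are taken MSB-first from each byte but accumulated LSB-first:
        # the first bit read becomes bit 0 of the returned value.
        value = 0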

        for i in range(n):
            if self.bitpos >= len(self.data) * 8:
                raise EOFError("ran out of compressed input")

            byte = self.data[self.bitpos >> 3]
            bit = (byte >> (7 - (self.bitpos & 7))) & 1
            value |= bit << i
            self.bitpos += 1

        return value


def reverse_bits(value: int, width: int) -> int:
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def build_canonical_table(lengths: bytes) -> dict[tuple[int, int], int]:
    counts: dict[int, int] = {}

    for length in lengths:
        if length:
            counts[length] = counts.get(length, 0) + 1

    code = 0
    next_code: dict[int, int] = {}

    for bits in range(1, max(counts.keys(), default=0) + 1):
        code = (code + counts.get(bits - 1, 0)) << 1
        next_code[bits] = code

    table: dict[tuple[int, int], int] = {}

    for symbol, length in enumerate(lengths):
        if not length:
            continue

        canonical = next_code[length]
        next_code[length] += 1

        # Required for this stream.
        stored_code = reverse_bits(canonical, length)
        table[(stored_code, length)] = symbol

    return table


def decode_symbol(br: BitReader, table: dict[tuple[int, int], int]) -> int:
    code = 0

    for length in range(1, 32):
        code |= br.read(1) << (length - 1)

        symbol = table.get((code, length))
        if symbol is not None:
            return symbol

    raise ValueError(f"bad Huffman code at bit {br.bitpos}")


def decompress_partition(firmware: bytes, offset: int = 0x158000) -> tuple[bytes, int, int, int]:
    compressed_size, output_size = struct.unpack_from("<II", firmware, offset)

    lit_table_off = offset + 8
    dist_table_off = lit_table_off + 288
    stream_off = dist_table_off + 32

    lit_lengths = firmware[lit_table_off:lit_table_off + 288]
    dist_lengths = firmware[dist_table_off:dist_table_off + 32]
    stream = firmware[stream_off:stream_off + compressed_size]

    lit_tree = build_canonical_table(lit_lengths)
    dist_tree = build_canonical_table(dist_lengths)

    br = BitReader(stream)
    out = bytearray()

    while len(out) < output_size:
        symbol = decode_symbol(br, lit_tree)

        if symbol < 256:
            out.append(symbol)
            continue

        # In this format, symbol 256 behaves as literal zero.
        if symbol == 256:
            out.append(0)
            continue

        length_index = symbol - 257

        if length_index < 0 or length_index >= len(LBASE):
            raise ValueError(
                f"bad length symbol {symbol} at output={len(out):#x}, bit={br.bitpos}"
            )

        length = LBASE[length_index]
        extra_bits = LEXT[length_index]

        if extra_bits:
            length += br.read(extra_bits)

        distance_symbol = decode_symbol(br, dist_tree)

        # Unlike DEFLATE, this currently appears to use raw distance symbols.
        distance = distance_symbol + 1

        if distance <= 0 or distance > len(out):
            raise ValueError(
                f"invalid distance {distance} at output={len(out):#x}, bit={br.bitpos}"
            )

        for _ in range(length):
            out.append(out[-distance])

            if len(out) >= output_size:
                break

    return bytes(out), compressed_size, output_size, br.bitpos


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Experimental BU40N 1.00 0x158000 partition decoder"
    )
    parser.add_argument("firmware", help="input BU40N firmware .bin")
    parser.add_argument("-o", "--output", default="decoded_158000.bin")
    parser.add_argument("--offset", default="0x158000")

    args = parser.parse_args()

    firmware = Path(args.firmware).read_bytes()
    offset = int(args.offset, 0)

    decoded, compressed_size, output_size, bits_used = decompress_partition(
        firmware, offset
    )

    Path(args.output).write_bytes(decoded)

    print(f"partition offset:   {offset:#x}")
    print(f"compressed size:    {compressed_size:#x}")
    print(f"decompressed size:  {len(decoded):#x}/{output_size:#x}")
    print(f"bits consumed:      {bits_used}")
    print(f"wrote:              {args.output}")


if __name__ == "__main__":
    main()
Run with:

Code: Select all

python3 decode_158000.py BU40N_1.00_stock.bin

partition offset:   0x158000
compressed size:    0x72464
decompressed size:  0x4e98f/0x4e98f
bits consumed:      1632522
wrote:              decoded_158000.bin

strings -a -tx decoded_158000.bin | grep CAETDVD
Expected string:

Code: Select all

27b76 CAETDVD_59110933
If anyone has a RAM dump of this region after the drive has decompressed it, comparing that against this output would probably show exactly what is still missing.
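
A small helper along these lines could make that comparison more informative than a single first-mismatch offset. It is a sketch with made-up names: it lists every run of differing bytes over the common length, which might help tell a decoder that has gone off-track (one long run to the end) from localized differences (many short runs).

```python
def mismatch_runs(decoded: bytes, oracle: bytes) -> list:
    """Return (start, end) pairs, end exclusive, for each run of differing bytes."""
    runs = []
    start = None
    common = min(len(decoded), len(oracle))
    for i in range(common):
        if decoded[i] != oracle[i]:
            if start is None:
                start = i          # a new run of differences begins here
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, common))
    return runs


# Synthetic example: two short differing runs in otherwise identical data.
a = bytes(range(16))
b = bytearray(a)
b[3] ^= 0xFF
b[4] ^= 0xFF
b[10] ^= 0xFF
print(mismatch_runs(a, bytes(b)))  # [(3, 5), (10, 11)]
```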
Last edited by ibizara on Sun Jun 28, 2026 2:35 pm, edited 1 time in total.

Re: BU40N MT1959 firmware format: packed region at 0x158000 (compression method known?)

Post by RibShark »

ibizara wrote: Sun Jun 28, 2026 2:28 pm If anyone has a RAM dump of this region after the drive has decompressed it, comparing that against this output would probably show exactly what is still missing.
RAM dump here: https://workupload.com/file/daUYdQasjXj

Please check DMs, I'm interested in working with you on this, but these forums aren't the best place due to downtime. Do you have anywhere else I can contact you?

Re: BU40N MT1959 firmware format: packed region at 0x158000 (compression method known?)

Post by ibizara »

Thanks, that RAM dump was exactly what was needed.

I agree with your correction: my original decoder had the two header fields interpreted the wrong way round, and it also had at least two semantic mistakes in the LZ/Huffman layer.

The original interpretation I was using was:

Code: Select all

field0 = compressed size
field1 = decompressed size
but after comparing against the RAM dump, the better interpretation appears to be:

Code: Select all

0x158000 field0 = nominal decompressed/output size
0x158004 field1 = packed partition span from 0x158000
For BU40N 1.00 stock:

Code: Select all

field0 = 0x72464
field1 = 0x4e98f
So the layout is now more likely:

Code: Select all

0x158000  output size / nominal decompressed size
0x158004  packed partition span
0x158008  288-byte literal/length code-length table
0x158128   32-byte distance code-length table
0x158148  compressed bitstream
The bitstream therefore runs up to:

Code: Select all

0x158000 + 0x4e98f = 0x1a698f
The RAM dump you provided is:

Code: Select all

size = 0x71c84
which is close to, but not exactly, the `0x72464` output-size field. I make the difference:

Code: Select all

0x72464 - 0x71c84 = 0x7e0
So either the dump is missing/padded/truncated slightly, or `field0` is a nominal workspace/output bound rather than a strict byte-for-byte RAM dump length.

The really useful part is that the RAM dump immediately proved where my old decoder went wrong. The previous output matched only until offset `0x1a`, then diverged.

The old decoder produced this around the first mismatch:

Code: Select all

f1 b5 04 00 c0 20 84 b0 6f f0 c6 ef 61 09 03 90
01 20 04 91 6f f0 c4 ef 04 20 20 20 20 c4 ef 04
but the RAM dump has:

Code: Select all

f1 b5 04 00 c0 20 84 b0 6f f0 c6 ef 61 09 03 90
01 20 04 91 6f f0 c4 ef 04 20 6f f0 c6 ef 04 20
That first bad copy led to two important fixes.

First, symbol `256` is not a literal zero and not EOF. It appears to be the first LZ length symbol.

So instead of:

Code: Select all

0..255   literal bytes
256      literal zero
257..287 length symbols
the corrected interpretation is:

Code: Select all

0..255   literal bytes
256..287 length symbols
with:

Code: Select all

256 -> length 3
257 -> length 4
258 -> length 5
259 -> length 6
...
Second, the distance is not simply:

Code: Select all

distance = distance_symbol + 1
The early RAM comparison suggests the distance coding is:

Code: Select all

distance_prefix = Huffman-coded distance symbol
distance_low7   = next 7 raw bits, read MSB-first
distance        = (distance_prefix << 7) | distance_low7
With that change, the first bad area is fixed.

For example, at output offset `0x1a`, the corrected decoder sees:

Code: Select all

length symbol 257 -> length 4
distance prefix 0
raw7 = 0x12
distance = 0x12
which copies:

Code: Select all

6f f0 c6 ef
from the earlier bytes, matching the RAM dump.

Then the next copy gives:

Code: Select all

04 20 6f f0 c6 ef
which also matches.
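
The first corrected copy can be replayed by hand. The sketch below takes the 0x1a bytes that both outputs agree on (from the hex dumps above), builds the distance as `(prefix << 7) | low7`, and performs a byte-by-byte LZ copy. The helper names are illustrative only.

```python
def lz_distance(prefix: int, low7: int) -> int:
    """Combine the Huffman-coded prefix with the 7 raw low bits."""
    return (prefix << 7) | low7


def lz_copy(out: bytearray, length: int, distance: int) -> None:
    """Append length bytes taken from distance bytes back in the output."""
    if not 0 < distance <= len(out):
        raise ValueError(f"invalid distance {distance} at output={len(out):#x}")
    for _ in range(length):
        # One byte at a time, so overlapping copies (distance < length) work.
        out.append(out[-distance])


# The first 0x1a output bytes, identical in the old decode and the RAM dump.
out = bytearray.fromhex(
    "f1 b5 04 00 c0 20 84 b0 6f f0 c6 ef 61 09 03 90 "
    "01 20 04 91 6f f0 c4 ef 04 20"
)

# Length symbol 257 -> length 4, distance prefix 0, raw7 = 0x12.
lz_copy(out, 4, lz_distance(0, 0x12))
print(out[0x1a:].hex())  # 6ff0c6ef
```

The copy source is offset 0x1a - 0x12 = 0x08, which is where the earlier `6f f0 c6 ef` sits.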

With these changes, my current decoder now matches the RAM dump until offset:

Code: Select all

0x28a
The first remaining mismatch I see is:

Code: Select all

offset 0x28a

decoded:
68 ff 04 00 01 00 5a 48 ...

RAM:
22 fe 04 00 01 00 5a 48 ...
That difference looks branch/immediate-like rather than random corruption, so I do not yet know whether this is still a decompression error, a firmware-version mismatch, a relocation/runtime patch, or a dump-window issue.

The current decoder output statistics are:

Code: Select all

decoded size:           0x72464
bits consumed:          2357865
bytes consumed rounded: 0x47f4e
unused stream bytes:    0x68f9
The unused stream bytes are also suspicious, so I would not call this solved yet. It is just much closer than the previous version.

Here is the revised experimental decoder:

Code: Select all

#!/usr/bin/env python3
"""
Experimental BU40N / MT1959 decoder for the packed block at 0x158000.

Current working assumptions:

  * header[0] is the nominal decompressed/output size
  * header[1] is the packed partition span, including header + tables
  * literal/length table is 288 bytes at offset + 0x08
  * distance table is 32 bytes at offset + 0x128
  * bitstream starts at offset + 0x148
  * bitstream is read physically MSB-first
  * canonical Huffman codes are bit-reversed for lookup
  * symbols 0..255 are literal bytes
  * symbol 256 is not EOF and not literal zero
  * symbols 256..287 are LZ length symbols
  * length symbol 256 maps to length 3
  * distance is encoded as:
        Huffman-coded prefix symbol, then 7 raw MSB-first bits
        distance = (distance_prefix << 7) | raw7

This is still experimental. It fixes the previous divergence at 0x1a
against RibShark's RAM dump, but still diverges later.
"""

from __future__ import annotations

import argparse
import hashlib
import struct
from pathlib import Path


# Length bases, shifted so symbol 256 maps to length 3.
#
# This is deliberately not using DEFLATE's symbol numbering directly:
#
#   DEFLATE: symbol 257 -> length 3
#   here:    symbol 256 -> length 3
#
# At least in the verified early stream, no DEFLATE-style length extra bits
# are consumed. Consuming extra bits misaligns the following distance coding.
LBASE = [
    3, 4, 5, 6, 7, 8, 9, 10,
    11, 13, 15, 17, 19, 23, 27, 31,
    35, 43, 51, 59, 67, 83, 99, 115,
    131, 163, 195, 227, 258, 258, 258, 258,
]


class BitReader:
    def __init__(self, data: bytes):
        self.data = data
        self.bitpos = 0

    def _read_physical_bit(self) -> int:
        if self.bitpos >= len(self.data) * 8:
            raise EOFError("ran out of compressed input")

        byte = self.data[self.bitpos >> 3]
        bit = (byte >> (7 - (self.bitpos & 7))) & 1
        self.bitpos += 1
        return bit

    def read_huffman_bits(self, n: int) -> int:
        """
        Read n physical MSB-first bits, accumulating into bit 0 upwards.

        This matches the reversed canonical-code lookup used by the previous
        script and still appears to be correct.
        """
        value = 0

        for i in range(n):
            value |= self._read_physical_bit() << i

        return value

    def read_raw_msb(self, n: int) -> int:
        """
        Read n raw payload bits as a normal MSB-first integer.
        """
        value = 0

        for _ in range(n):
            value = (value << 1) | self._read_physical_bit()

        return value


def reverse_bits(value: int, width: int) -> int:
    out = 0

    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1

    return out


def build_canonical_table(lengths: bytes) -> dict[tuple[int, int], int]:
    counts: dict[int, int] = {}

    for length in lengths:
        if length:
            counts[length] = counts.get(length, 0) + 1

    code = 0
    next_code: dict[int, int] = {}

    for bits in range(1, max(counts.keys(), default=0) + 1):
        code = (code + counts.get(bits - 1, 0)) << 1
        next_code[bits] = code

    table: dict[tuple[int, int], int] = {}

    for symbol, length in enumerate(lengths):
        if not length:
            continue

        canonical = next_code[length]
        next_code[length] += 1

        stored_code = reverse_bits(canonical, length)
        table[(stored_code, length)] = symbol

    return table


def decode_symbol(br: BitReader, table: dict[tuple[int, int], int]) -> int:
    code = 0

    for length in range(1, 32):
        code |= br.read_huffman_bits(1) << (length - 1)

        symbol = table.get((code, length))
        if symbol is not None:
            return symbol

    raise ValueError(f"bad Huffman code at bit {br.bitpos}")


def first_mismatch(a: bytes, b: bytes) -> int | None:
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i

    if len(a) != len(b):
        return min(len(a), len(b))

    return None


def decompress_partition(
    firmware: bytes,
    offset: int = 0x158000,
    output_limit: int | None = None,
) -> tuple[bytes, dict[str, int]]:
    output_size, packed_span = struct.unpack_from("<II", firmware, offset)

    lit_table_off = offset + 8
    dist_table_off = lit_table_off + 288
    stream_off = dist_table_off + 32
    packed_end = offset + packed_span

    if packed_end > len(firmware):
        raise ValueError(
            f"packed span ends beyond file: end={packed_end:#x}, file={len(firmware):#x}"
        )

    lit_lengths = firmware[lit_table_off:lit_table_off + 288]
    dist_lengths = firmware[dist_table_off:dist_table_off + 32]
    stream = firmware[stream_off:packed_end]

    lit_tree = build_canonical_table(lit_lengths)
    dist_tree = build_canonical_table(dist_lengths)

    if output_limit is None:
        output_limit = output_size

    br = BitReader(stream)
    out = bytearray()

    while len(out) < output_limit:
        symbol = decode_symbol(br, lit_tree)

        if symbol < 256:
            out.append(symbol)
            continue

        length_index = symbol - 256

        if length_index < 0 or length_index >= len(LBASE):
            raise ValueError(
                f"bad length symbol {symbol} at output={len(out):#x}, bit={br.bitpos}"
            )

        length = LBASE[length_index]

        distance_prefix = decode_symbol(br, dist_tree)
        distance_low7 = br.read_raw_msb(7)
        distance = (distance_prefix << 7) | distance_low7

        if distance <= 0 or distance > len(out):
            raise ValueError(
                f"invalid distance {distance} at output={len(out):#x}, "
                f"prefix={distance_prefix}, low7={distance_low7:#x}, bit={br.bitpos}"
            )

        for _ in range(length):
            out.append(out[-distance])

            if len(out) >= output_limit:
                break

    stats = {
        "output_size_header": output_size,
        "packed_span_header": packed_span,
        "stream_size": len(stream),
        "bits_consumed": br.bitpos,
        "bytes_consumed_rounded": (br.bitpos + 7) // 8,
        "unused_stream_bytes": len(stream) - ((br.bitpos + 7) // 8),
    }

    return bytes(out), stats


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Experimental BU40N/MT1959 0x158000 packed-partition decoder"
    )
    parser.add_argument("firmware", help="input firmware .bin")
    parser.add_argument("-o", "--output", default="decoded_158000_v2.bin")
    parser.add_argument("--offset", default="0x158000")
    parser.add_argument(
        "--compare",
        help="optional RAM dump / oracle to compare against",
    )
    parser.add_argument(
        "--output-limit",
        default=None,
        help="override output limit, e.g. 0x71c84. Defaults to header[0].",
    )

    args = parser.parse_args()

    firmware = Path(args.firmware).read_bytes()
    offset = int(args.offset, 0)
    output_limit = int(args.output_limit, 0) if args.output_limit else None

    decoded, stats = decompress_partition(firmware, offset, output_limit)
    Path(args.output).write_bytes(decoded)

    print(f"partition offset:       {offset:#x}")
    print(f"output size header:     {stats['output_size_header']:#x}")
    print(f"packed span header:     {stats['packed_span_header']:#x}")
    print(f"stream size:            {stats['stream_size']:#x}")
    print(f"decoded size:           {len(decoded):#x}")
    print(f"bits consumed:          {stats['bits_consumed']}")
    print(f"bytes consumed rounded: {stats['bytes_consumed_rounded']:#x}")
    print(f"unused stream bytes:    {stats['unused_stream_bytes']:#x}")
    print(f"md5(decoded):           {hashlib.md5(decoded).hexdigest()}")
    print(f"wrote:                  {args.output}")

    if args.compare:
        oracle = Path(args.compare).read_bytes()
        mismatch = first_mismatch(decoded, oracle)

        print()
        print(f"compare file:           {args.compare}")
        print(f"compare size:           {len(oracle):#x}")
        print(f"md5(compare):           {hashlib.md5(oracle).hexdigest()}")

        if mismatch is None:
            print("compare result:         exact match")
        else:
            print(f"first mismatch:         {mismatch:#x}")
            print(
                "decoded bytes:          "
                + decoded[mismatch:mismatch + 16].hex(" ")
            )
            print(
                "compare bytes:          "
                + oracle[mismatch:mismatch + 16].hex(" ")
            )


if __name__ == "__main__":
    main()
Example usage:

Code: Select all

python3 decode_158000_v2.py BU40N_1.00_stock.bin \
  -o decoded_158000_v2.bin \
  --compare "BU40N 1.00 Decompressed Code.bin"
  
partition offset:       0x158000
output size header:     0x72464
packed span header:     0x4e98f
stream size:            0x4e847
decoded size:           0x72464
bits consumed:          2357865
bytes consumed rounded: 0x47f4e
unused stream bytes:    0x68f9
md5(decoded):           f6540a416982958434e499a94a89564f
wrote:                  decoded_158000_v2.bin

compare file:           BU40N 1.00 Decompressed Code.bin
compare size:           0x71c84
md5(compare):           8d3e2aa8df4beaf815a602afb1ee44b4
first mismatch:         0x28a
decoded bytes:          68 ff 04 00 01 00 5a 48 6f f0 aa ee 61 09 01 20
compare bytes:          22 fe 04 00 01 00 5a 48 6f f0 aa ee 61 09 01 20
Current conclusion:
  • The table split and canonical-Huffman layer still look correct.
  • The header fields were reversed in my first script.
  • Symbol 256 is a length symbol, not literal zero.
  • Distance coding appears to be prefix-Huffman + 7 raw bits.
  • The decoder now gets past the old 0x1a failure and matches to 0x28a.
  • It is still not fully solved.

Re: BU40N MT1959 firmware format: packed region at 0x158000 (compression method known?)

Post by ibizara »

Thanks again. I spent more time comparing the stock 1.00 firmware decode against the RAM dump, and there is a much better update now.

The short version is that the decompression side now looks effectively solved for the bytes covered by the RAM dump. The remaining raw differences are not random decompression errors; they appear to be a Thumb branch relocation/fixup pass applied after decompression.

Earlier I said the corrected decoder only matched until 0x28a. That was true for the v2 decoder, but it turned out v2 still had the length mapping wrong.

The corrected v3 findings are:
  • The header fields are still interpreted as:

    Code: Select all

    0x158000  nominal decompressed/output size / upper bound = 0x72464
    0x158004  packed partition span from 0x158000       = 0x4e98f
  • The table layout still looks correct:

    Code: Select all

    0x158008  288-byte literal/length code-length table
    0x158128   32-byte distance code-length table
    0x158148  compressed bitstream
  • The bitstream size is therefore:

    Code: Select all

    0x4e98f - 0x148 = 0x4e847
  • The Huffman layer is still canonical-Huffman with bit-reversed lookup, and the physical bitstream is MSB-first.
  • Symbols 0..255 are literals.
  • Symbols 256..287 are LZ copy lengths.
  • The big v3 correction is that the length mapping is linear, not DEFLATE-style.
So the length mapping is:

Code: Select all

symbol 256 -> length 3
symbol 257 -> length 4
symbol 258 -> length 5
...
symbol 287 -> length 34
or simply:

Code: Select all

length = symbol - 253
The distance mapping still appears to be:

Code: Select all

distance_prefix = Huffman-coded distance symbol
distance_low7   = next 7 raw bits, read MSB-first
distance        = (distance_prefix << 7) | distance_low7
With that v3 length fix, the raw decompressed output is no longer going off-track. Instead, the raw decompressed stream and the RAM dump differ at branch-looking Thumb-2 instructions, and those differences are explained by a relocation/fixup pass.
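
As a quick check of how the two length mappings relate, the sketch below compares the v2 table with the linear v3 rule. `V2_LBASE` is copied from the v2 script; `v3_length` is just a name for `symbol - 253`.

```python
# Length bases used by the v2 decoder (index = symbol - 256).
V2_LBASE = [
    3, 4, 5, 6, 7, 8, 9, 10,
    11, 13, 15, 17, 19, 23, 27, 31,
    35, 43, 51, 59, 67, 83, 99, 115,
    131, 163, 195, 227, 258, 258, 258, 258,
]


def v3_length(symbol: int) -> int:
    """Linear v3 mapping: 256 -> 3, 257 -> 4, ..., 287 -> 34."""
    if not 256 <= symbol <= 287:
        raise ValueError(f"not a length symbol: {symbol}")
    return symbol - 253


diverging = [s for s in range(256, 288) if V2_LBASE[s - 256] != v3_length(s)]
print(diverging[0])    # 265
print(v3_length(287))  # 34
```

The two mappings agree for symbols 256..264 (lengths 3..11), so short copies decode identically under both, and only matches of 12 bytes or more expose the difference.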

Current v3 run against BU40N 1.00 stock:

Code: Select all

partition offset:          0x158000
nominal output size:       0x72464
packed span:               0x4e98f
stream size:               0x4e847
decoded size:              0x72279
bits consumed:             2572856
bytes consumed rounded:    0x4e847
unused stream bytes:       0x0
status:                    EOF at output=0x72279, bit=2572856
md5(decoded raw):          8d371b8acfe9ae094a445af10fd6d160
Comparison against the RAM dump:

Code: Select all

compare size:              0x71c84
compare length used:       0x71c84
raw mismatching bytes:     10706
branch fixup sites:        2995
post-fixup mismatches:     0
post-fixup result:         exact match over compare length
decoded extends past RAM:  0x5f5
So after accounting for the branch fixup pattern, the decoded output matches the RAM dump exactly over the full RAM dump length.

The match is not a set of isolated regions that happen to line up. It is continuous over:

Code: Select all

0x00000 .. 0x71c84
The raw decoded stream is:

Code: Select all

0x72279 bytes
The RAM dump is:

Code: Select all

0x71c84 bytes
So the decoder emits another:

Code: Select all

0x72279 - 0x71c84 = 0x5f5 bytes
after the end of the RAM dump.

That extra tail is not just zero padding. It contains useful-looking strings/tables such as:

Code: Select all

BDDMRUTIL_OBJVER_000D
BDPPCMD_OBJVER_0004
BDUTIL_OBJVER_0004
20160808
DVDPPCMD_OBJVER_0005
.m2ts
STREAM
VIDEO_TS
AUDIO_TS
So I think the RAM dump probably stops slightly before the true end of the decoded partition.
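
Listing the printable strings in the tail needs nothing more than a regular expression. The sketch below uses made-up helper names and a synthetic buffer. In practice `decoded` would be the raw decoder output and `ram_len` the RAM dump length (0x71c84 here).

```python
import re


def ascii_strings(data: bytes, min_len: int = 4) -> list:
    """Return (offset, text) for each run of printable ASCII of at least min_len."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]


def tail_strings(decoded: bytes, ram_len: int, min_len: int = 4) -> list:
    """Strings found only in the part of decoded that lies past ram_len."""
    tail = decoded[ram_len:]
    return [(ram_len + off, text) for off, text in ascii_strings(tail, min_len)]


# Synthetic example standing in for the decoder output.
decoded = b"\x00" * 8 + b"VIDEO_TS\x00\x01ab\x00.m2ts\x00"
print(tail_strings(decoded, ram_len=4))
```

Offsets are reported relative to the start of the decoded stream, so they can be compared directly with the figures above.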

There is still a mismatch between the nominal header output size and the actual decoded stream size:

Code: Select all

header nominal size:  0x72464
actual decoded size:  0x72279
difference:           0x1eb
So the first field now looks like a nominal output/workspace bound rather than the exact number of bytes emitted by the compressed stream.

The branch relocation/fixup is the interesting remaining bit.

Example at offset 0x2dc:

Code: Select all

raw decoded:  00 f0 00 f8
RAM dump:     ff f7 90 fe
The raw decoded instruction decodes as a BL-style immediate of 0. If I apply:

Code: Select all

new_imm = raw_imm - (output_offset + 4)
and re-encode the Thumb-2 branch immediate, it produces the RAM bytes exactly.

The v3 validation script does not blindly patch every BL-looking word, because data tables can contain values that look like Thumb instructions. It only counts/applies branch fixups that are confirmed by the RAM dump oracle. So the branch-fixup logic in this script is a validation tool, not yet a standalone loader.
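
The fixup rule and the oracle-confirmed check described above can be sketched as follows. The encoding is the standard Thumb-2 BL (T1) layout. The helper names, the halfword-aligned scan, and the inverse `unfix_branch` are assumptions for illustration, not the firmware's own logic. The inverse is the direction that would be needed when going from a RAM-form image back to the raw form.

```python
import struct


def bl_decode(word: bytes):
    """Return the signed immediate of a Thumb-2 BL, or None if not BL-shaped."""
    hw1, hw2 = struct.unpack("<HH", word)
    if (hw1 & 0xF800) != 0xF000 or (hw2 & 0xD000) != 0xD000:
        return None
    s = (hw1 >> 10) & 1
    j1, j2 = (hw2 >> 13) & 1, (hw2 >> 11) & 1
    i1, i2 = (j1 ^ s) ^ 1, (j2 ^ s) ^ 1
    imm = (s << 24) | (i1 << 23) | (i2 << 22)
    imm |= ((hw1 & 0x3FF) << 12) | ((hw2 & 0x7FF) << 1)
    return imm - (1 << 25) if s else imm


def bl_encode(imm: int) -> bytes:
    """Encode a signed 25-bit even immediate as a Thumb-2 BL."""
    imm &= (1 << 25) - 1
    s = (imm >> 24) & 1
    i1, i2 = (imm >> 23) & 1, (imm >> 22) & 1
    j1, j2 = (i1 ^ 1) ^ s, (i2 ^ 1) ^ s
    hw1 = 0xF000 | (s << 10) | ((imm >> 12) & 0x3FF)
    hw2 = 0xD000 | (j1 << 13) | (j2 << 11) | ((imm >> 1) & 0x7FF)
    return struct.pack("<HH", hw1, hw2)


def fix_branch(raw_word: bytes, offset: int):
    """Raw form -> RAM form: new_imm = raw_imm - (offset + 4)."""
    imm = bl_decode(raw_word)
    return None if imm is None else bl_encode(imm - (offset + 4))


def unfix_branch(ram_word: bytes, offset: int):
    """Hypothetical inverse: RAM form -> raw form."""
    imm = bl_decode(ram_word)
    return None if imm is None else bl_encode(imm + (offset + 4))


def confirmed_fixups(raw: bytes, oracle: bytes) -> list:
    """Offsets where the bytes differ and the fixup reproduces the oracle."""
    sites, off = [], 0
    limit = min(len(raw), len(oracle)) - 3
    while off < limit:
        r, o = raw[off:off + 4], oracle[off:off + 4]
        if r != o and fix_branch(r, off) == o:
            sites.append(off)
            off += 4
        else:
            off += 2  # assumption: candidates are halfword-aligned
    return sites


# The example from this post: offset 0x2dc, raw 00 f0 00 f8 -> RAM ff f7 90 fe.
print(fix_branch(bytes.fromhex("00f000f8"), 0x2DC).hex())  # fff790fe
```

In this example the raw immediate is 0 and the relocated branch targets offset 0 of the decoded image, which fits the idea that the raw form stores an absolute target.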

Current conclusion:
  • The packed-region decompression for BU40N 1.00 now looks effectively solved for the RAM-covered range.
  • The old divergence at 0x1a was caused by wrong LZ semantics.
  • The later 0x28a-style differences were branch relocation/fixup differences, not general decompression failure.
  • The v3 decoder consumes the whole compressed bitstream: unused stream bytes = 0.
  • The raw decoded stream extends 0x5f5 bytes beyond the provided RAM dump.
  • The remaining unknown is how the real firmware/decompressor identifies which branch sites to relocate without using a RAM oracle.
For OmniDrive / 1.03_MK, the practical warning is that the packed firmware probably stores the pre-relocation/raw branch form, not the RAM-style PC-relative branch form. So if we eventually patch decompressed code and recompress it, branch targets likely need to be represented in the raw/pre-fixup form, not copied directly from RAM.

Next things to try:
  • Run this v3 decoder against the 1.03_MK block.
  • Compare that output against a matching 1.03_MK RAM dump if available.
  • Reverse the real branch-fixup metadata/rule from the decompressor.
  • Only then start thinking seriously about recompression and safe patching.
Here is the current v3 decoder:

Code: Select all

#!/usr/bin/env python3
"""
decode_158000_v3.py

Experimental BU40N / MT1959 decoder for the packed block at 0x158000.

Status as of v3:

  * header[0] appears to be a nominal/decompressed upper size, not the packed size
  * header[1] appears to be the packed partition span from 0x158000
  * literal/length table: 288 one-byte canonical Huffman code lengths
  * distance table:       32 one-byte canonical Huffman code lengths
  * bitstream starts at:  offset + 0x148
  * bitstream physical bit order is MSB-first
  * Huffman lookup uses bit-reversed canonical codes
  * symbols 0..255 are literal bytes
  * symbols 256..287 are LZ copy lengths
  * length mapping is linear:
        length = symbol - 253
        therefore 256 -> 3, 257 -> 4, ..., 287 -> 34
  * distance mapping is:
        prefix = Huffman-coded distance symbol
        low7   = next 7 raw MSB-first bits
        distance = (prefix << 7) | low7

Important: the RAM dump appears to contain an extra runtime relocation/fixup pass
for Thumb-2 BL-like instructions. The raw decoded stream stores the branch
immediate as an absolute target/address-like value. The RAM image stores the
normal PC-relative branch encoding.

This script does NOT blindly patch every BL-looking word, because data tables
can contain values that look like BL instructions. Instead, when --compare is
provided, it counts and optionally applies only those branch fixups that are
confirmed by the RAM/oracle file.
"""

from __future__ import annotations

import argparse
import hashlib
import struct
from pathlib import Path


class BitReader:
    def __init__(self, data: bytes):
        self.data = data
        self.bitpos = 0

    def _read_physical_bit(self) -> int:
        if self.bitpos >= len(self.data) * 8:
            raise EOFError("ran out of compressed input")
        byte = self.data[self.bitpos >> 3]
        bit = (byte >> (7 - (self.bitpos & 7))) & 1
        self.bitpos += 1
        return bit

    def read_huffman_bits(self, n: int) -> int:
        # Physical bits are read MSB-first, but accumulated into bit 0 upwards
        # for the reversed canonical-code lookup.
        value = 0
        for i in range(n):
            value |= self._read_physical_bit() << i
        return value

    def read_raw_msb(self, n: int) -> int:
        value = 0
        for _ in range(n):
            value = (value << 1) | self._read_physical_bit()
        return value


def reverse_bits(value: int, width: int) -> int:
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def build_canonical_table(lengths: bytes) -> dict[tuple[int, int], int]:
    counts: dict[int, int] = {}

    for length in lengths:
        if length:
            counts[length] = counts.get(length, 0) + 1

    code = 0
    next_code: dict[int, int] = {}

    for bits in range(1, max(counts.keys(), default=0) + 1):
        code = (code + counts.get(bits - 1, 0)) << 1
        next_code[bits] = code

    table: dict[tuple[int, int], int] = {}

    for symbol, length in enumerate(lengths):
        if not length:
            continue

        canonical = next_code[length]
        next_code[length] += 1
        stored_code = reverse_bits(canonical, length)
        table[(stored_code, length)] = symbol

    return table


def decode_symbol(br: BitReader, table: dict[tuple[int, int], int]) -> int:
    code = 0

    for length in range(1, 32):
        code |= br.read_huffman_bits(1) << (length - 1)

        symbol = table.get((code, length))
        if symbol is not None:
            return symbol

    raise ValueError(f"bad Huffman code at bit {br.bitpos}")


def decode_partition(
    firmware: bytes,
    offset: int = 0x158000,
    output_limit: int | None = None,
) -> tuple[bytes, dict[str, int | str]]:
    nominal_output_size, packed_span = struct.unpack_from("<II", firmware, offset)

    lit_table_off = offset + 8
    dist_table_off = lit_table_off + 288
    stream_off = dist_table_off + 32
    packed_end = offset + packed_span

    if packed_end > len(firmware):
        raise ValueError(
            f"packed span extends beyond input file: end={packed_end:#x}, "
            f"file={len(firmware):#x}"
        )

    lit_lengths = firmware[lit_table_off:lit_table_off + 288]
    dist_lengths = firmware[dist_table_off:dist_table_off + 32]
    stream = firmware[stream_off:packed_end]

    lit_tree = build_canonical_table(lit_lengths)
    dist_tree = build_canonical_table(dist_lengths)

    if output_limit is None:
        output_limit = nominal_output_size

    br = BitReader(stream)
    out = bytearray()
    status = "ok"

    try:
        while len(out) < output_limit:
            symbol = decode_symbol(br, lit_tree)

            if symbol < 256:
                out.append(symbol)
                continue

            # v3 correction: linear length mapping, not DEFLATE-style bases.
            length = symbol - 253

            if length < 3:
                raise ValueError(
                    f"bad length symbol {symbol} at output={len(out):#x}, "
                    f"bit={br.bitpos}"
                )

            distance_prefix = decode_symbol(br, dist_tree)
            distance_low7 = br.read_raw_msb(7)
            distance = (distance_prefix << 7) | distance_low7

            if distance <= 0 or distance > len(out):
                raise ValueError(
                    f"invalid distance {distance:#x} at output={len(out):#x}, "
                    f"prefix={distance_prefix:#x}, low7={distance_low7:#x}, "
                    f"bit={br.bitpos}"
                )

            for _ in range(length):
                out.append(out[-distance])
                if len(out) >= output_limit:
                    break

    except EOFError:
        status = f"EOF at output={len(out):#x}, bit={br.bitpos}"

    stats: dict[str, int | str] = {
        "nominal_output_size": nominal_output_size,
        "packed_span": packed_span,
        "stream_size": len(stream),
        "bits_consumed": br.bitpos,
        "bytes_consumed_rounded": (br.bitpos + 7) // 8,
        "unused_stream_bytes": len(stream) - ((br.bitpos + 7) // 8),
        "status": status,
    }

    return bytes(out), stats


def thumb_bl_imm(h1: int, h2: int) -> int:
    # Thumb-2 BL-style immediate decode.
    s = (h1 >> 10) & 1
    imm10 = h1 & 0x3ff
    j1 = (h2 >> 13) & 1
    j2 = (h2 >> 11) & 1
    imm11 = h2 & 0x7ff

    i1 = (~(j1 ^ s)) & 1
    i2 = (~(j2 ^ s)) & 1

    imm = (
        (s << 24)
        | (i1 << 23)
        | (i2 << 22)
        | (imm10 << 12)
        | (imm11 << 1)
    )

    if s:
        imm -= 1 << 25

    return imm


def encode_thumb_bl_imm(imm: int, h1_orig: int, h2_orig: int) -> tuple[int, int]:
    if imm & 1:
        raise ValueError(f"odd Thumb BL immediate: {imm:#x}")

    val = imm & ((1 << 25) - 1)

    s = (val >> 24) & 1
    i1 = (val >> 23) & 1
    i2 = (val >> 22) & 1
    imm10 = (val >> 12) & 0x3ff
    imm11 = (val >> 1) & 0x7ff

    j1 = (~i1 ^ s) & 1
    j2 = (~i2 ^ s) & 1

    h1 = (h1_orig & 0xf800) | (s << 10) | imm10
    h2 = (h2_orig & 0xd000) | (j1 << 13) | (j2 << 11) | imm11

    return h1, h2


def branch_relocation_candidate(decoded: bytes, oracle: bytes, off: int) -> bool:
    if off + 4 > len(decoded) or off + 4 > len(oracle):
        return False

    h1, h2 = struct.unpack_from("<HH", decoded, off)

    # Thumb-2 BL/B.W-looking instruction. This can occur in data, so this
    # function is only safe because it checks the resulting bytes against oracle.
    if (h1 & 0xf800) != 0xf000 or (h2 & 0xd000) != 0xd000:
        return False

    imm = thumb_bl_imm(h1, h2)
    pc = off + 4
    new_imm = imm - pc

    if not (-(1 << 24) <= new_imm < (1 << 24)):
        return False

    new_h1, new_h2 = encode_thumb_bl_imm(new_imm, h1, h2)
    return struct.pack("<HH", new_h1, new_h2) == oracle[off:off + 4]


def compare_accounting_for_branch_fixups(
    decoded: bytes,
    oracle: bytes,
) -> tuple[bytes, dict[str, int]]:
    patched = bytearray(decoded)
    compare_len = min(len(decoded), len(oracle))
    branch_sites = 0

    for off in range(0, compare_len - 3, 2):
        if decoded[off:off + 4] == oracle[off:off + 4]:
            continue

        if branch_relocation_candidate(decoded, oracle, off):
            h1, h2 = struct.unpack_from("<HH", decoded, off)
            imm = thumb_bl_imm(h1, h2)
            new_h1, new_h2 = encode_thumb_bl_imm(imm - (off + 4), h1, h2)
            struct.pack_into("<HH", patched, off, new_h1, new_h2)
            branch_sites += 1

    mismatches = 0
    first_mismatch = -1

    for i in range(compare_len):
        if patched[i] != oracle[i]:
            mismatches += 1
            if first_mismatch < 0:
                first_mismatch = i

    raw_mismatches = sum(
        1 for i in range(compare_len) if decoded[i] != oracle[i]
    )

    stats = {
        "compare_len": compare_len,
        "raw_mismatching_bytes": raw_mismatches,
        "branch_fixup_sites": branch_sites,
        "post_fixup_mismatching_bytes": mismatches,
        "first_post_fixup_mismatch": first_mismatch,
    }

    return bytes(patched), stats


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Experimental BU40N/MT1959 packed block decoder"
    )
    parser.add_argument("firmware", help="input firmware .bin")
    parser.add_argument("-o", "--output", default="decoded_158000_v3_raw.bin")
    parser.add_argument("--offset", default="0x158000")
    parser.add_argument("--output-limit", default=None)
    parser.add_argument("--compare", help="optional RAM dump/oracle")
    parser.add_argument(
        "--write-oracle-patched",
        help=(
            "optional output path for a RAM-style image patched only at "
            "branch sites confirmed by --compare"
        ),
    )

    args = parser.parse_args()

    firmware = Path(args.firmware).read_bytes()
    offset = int(args.offset, 0)
    output_limit = int(args.output_limit, 0) if args.output_limit else None

    decoded, stats = decode_partition(firmware, offset, output_limit)
    Path(args.output).write_bytes(decoded)

    print(f"partition offset:          {offset:#x}")
    print(f"nominal output size:       {stats['nominal_output_size']:#x}")
    print(f"packed span:               {stats['packed_span']:#x}")
    print(f"stream size:               {stats['stream_size']:#x}")
    print(f"decoded size:              {len(decoded):#x}")
    print(f"bits consumed:             {stats['bits_consumed']}")
    print(f"bytes consumed rounded:    {stats['bytes_consumed_rounded']:#x}")
    print(f"unused stream bytes:       {stats['unused_stream_bytes']:#x}")
    print(f"status:                    {stats['status']}")
    print(f"md5(decoded raw):          {hashlib.md5(decoded).hexdigest()}")
    print(f"wrote raw:                 {args.output}")

    if args.compare:
        oracle = Path(args.compare).read_bytes()
        patched, cstats = compare_accounting_for_branch_fixups(decoded, oracle)

        print()
        print(f"compare file:              {args.compare}")
        print(f"compare size:              {len(oracle):#x}")
        print(f"compare length used:       {cstats['compare_len']:#x}")
        print(f"raw mismatching bytes:     {cstats['raw_mismatching_bytes']}")
        print(f"branch fixup sites:        {cstats['branch_fixup_sites']}")
        print(f"post-fixup mismatches:     {cstats['post_fixup_mismatching_bytes']}")

        if cstats["first_post_fixup_mismatch"] >= 0:
            off = cstats["first_post_fixup_mismatch"]
            print(f"first post-fixup mismatch: {off:#x}")
            print(f"decoded bytes:             {patched[off:off + 16].hex(' ')}")
            print(f"oracle bytes:              {oracle[off:off + 16].hex(' ')}")
        else:
            print("post-fixup result:         exact match over compare length")

        if len(decoded) > len(oracle):
            print(f"decoded extends past RAM:  {len(decoded) - len(oracle):#x}")
        elif len(oracle) > len(decoded):
            print(f"RAM extends past decoded:  {len(oracle) - len(decoded):#x}")

        if args.write_oracle_patched:
            Path(args.write_oracle_patched).write_bytes(patched)
            print(f"wrote oracle-patched:      {args.write_oracle_patched}")
            print(f"md5(oracle-patched):       {hashlib.md5(patched).hexdigest()}")


if __name__ == "__main__":
    main()
Example usage:

Code: Select all

python3 decode_158000_v3.py BU40N_1.00_stock.bin \
  -o decoded_158000_v3_raw.bin \
  --compare "BU40N 1.00 Decompressed Code.bin" \
  --write-oracle-patched decoded_158000_v3_ramstyle.bin

partition offset:          0x158000
nominal output size:       0x72464
packed span:               0x4e98f
stream size:               0x4e847
decoded size:              0x72279
bits consumed:             2572856
bytes consumed rounded:    0x4e847
unused stream bytes:       0x0
status:                    EOF at output=0x72279, bit=2572856
md5(decoded raw):          8d371b8acfe9ae094a445af10fd6d160
wrote raw:                 decoded_158000_v3_raw.bin

compare file:              BU40N 1.00 Decompressed Code.bin
compare size:              0x71c84
compare length used:       0x71c84
raw mismatching bytes:     10706
branch fixup sites:        2995
post-fixup mismatches:     0
post-fixup result:         exact match over compare length
decoded extends past RAM:  0x5f5
wrote oracle-patched:      decoded_158000_v3_ramstyle.bin
md5(oracle-patched):       ea5b760e057c05188a77e36f6b41592e
Also, I currently cannot reply to private messages on here, but I am happy to compare notes. If anyone wants to contact me directly, please feel free to send a message using the ce.uk web form.
Last edited by ibizara on Tue Jun 30, 2026 5:23 pm, edited 1 time in total.
RibShark
Posts: 19
Joined: Mon Apr 29, 2019 6:27 pm

Re: BU40N MT1959 firmware format: packed region at 0x158000 (compression method known?)

Post by RibShark »

ibizara wrote: Tue Jun 30, 2026 5:21 pm The v3 validation script does not blindly patch every BL-looking word, because data tables can contain values that look like Thumb instructions.
Believe it or not, the drive does exactly that: it patches every BL-looking instruction pair it finds. Here is some pseudocode as generated by IDA:

Code: Select all

i = 0;
if ( DecompressedDataSize >= 4 )
{
  do
  {
    instructions = &decompressedData[i];
    instruction1Byte1 = decompressedData[i + 1];
    isLongBranch = (instruction1Byte1 & 0xF8) == 0xF0;
    if ( isLongBranch )
    {
      instruction2Byte1 = instructions[3];
      isLongBranch = (~instruction2Byte1 & 0xF8) == 0;// (instruction2Byte1 & 0xF8) == 0xF8; 
    }
    if ( isLongBranch )
    {
      // Fixup Relocation
      v6 = (instructions[2] | (instruction1Byte1 << 19) | (decompressedData[i] << 11) | ((instruction2Byte1 & 7) << 8))
         - (i >> 1)
         - 2;
      instructions[1] = (v6 << 10 >> 29) | 0xF0;
      decompressedData[i] = v6 >> 11;
      instructions[3] = (v6 << 21 >> 29) | 0xF8;
      i += 2;
      instructions[2] = v6;
    }
    i += 2;
  }
  while ( i + 4 <= DecompressedDataSize );
}
The function for this is at 0x13DFE0 in the BU40N 1.00 firmware if you want to take a look yourself.

Also note that the MT1959 is ARMv5, so it's using THUMB rather than Thumb-2.
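
For anyone who wants to try this without the RAM oracle, here is a rough Python port of the pseudocode above. The name `apply_bl_fixups` is made up, the port assumes `v6` is treated as unsigned in the original, and it has only been checked against the single 0x2dc example from the first post, not against a full RAM dump.

```python
def apply_bl_fixups(data: bytes) -> bytes:
    """Unconditional BL fixup pass, ported from the IDA pseudocode (sketch)."""
    out = bytearray(data)
    i = 0
    while i + 4 <= len(out):
        b0, b1, b2, b3 = out[i:i + 4]
        # Prefix halfword 0xF0xx..0xF7xx followed by suffix 0xF8xx..0xFFxx.
        if (b1 & 0xF8) == 0xF0 and (b3 & 0xF8) == 0xF8:
            # 22-bit halfword offset: high 11 bits in the first halfword,
            # low 11 bits in the second.
            field = ((b1 & 7) << 19) | (b0 << 11) | ((b3 & 7) << 8) | b2
            # Subtract the PC in halfwords: (i + 4) / 2 == (i >> 1) + 2.
            field = (field - (i >> 1) - 2) & 0x3FFFFF
            out[i] = (field >> 11) & 0xFF
            out[i + 1] = 0xF0 | (field >> 19)
            out[i + 2] = field & 0xFF
            out[i + 3] = 0xF8 | ((field >> 8) & 7)
            i += 2  # skip the suffix halfword of the patched pair
        i += 2
    return bytes(out)


# Reproduce the 0x2dc example: zero padding, then the raw pair.
buf = bytearray(0x2E0)
buf[0x2DC:0x2E0] = bytes.fromhex("00f000f8")
print(apply_bl_fixups(bytes(buf))[0x2DC:0x2E0].hex(" "))  # ff f7 90 fe
```

Because the pass keeps the 0xF0/0xF8 top bits intact, scanning the patched image in the same order should find the same sites again, which is what would make an inverse pass before recompression possible.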