Extracting and comparing subtitle tracks?

Dankkh
Posts: 6
Joined: Mon Feb 03, 2020 11:58 pm

Post by Dankkh » Sun Jun 14, 2020 11:17 pm

I have a Blu-Ray copy of the movie "Accident Man" (2017). When I use MakeMKV to make a backup of the movie onto my computer, I notice there are a head-scratching total of 4 different English subtitle tracks. (There are obviously more languages available, but there are 4 tracks just for English alone).

Two of these are actually English SDH and appear to be identical, according to MediaInfo. The other two are just plain English, which are the ones I'm interested in; however, these two "regular" English subtitle tracks are actually not identical.

According to MediaInfo, the first track has 2742 elements, and the second track has 2748 elements. A difference of 6 lines of text (?). So my question is, what in the world could be the difference?

When I insert the disc into an actual Blu-Ray player, the disc menu only gives me an option for a single non-SDH English track. So the other English track must be a "hidden" subtitle track that wasn't intended to be used.

Without having to sit through the movie in 2 different players at the same time, I'd like to extract those subtitles as simple TXT or CSV files, and then use a program like DiffChecker to compare the difference.

My problem is: what program can I use to convert the subtitles into this format and compare them? I can get them into SRT files or SUP files, but nothing else. The program "CCExtractor" claims to be able to export as TXT, but this feature straight up doesn't work. There are also browser-based tools that claim to convert to CSV or TXT, but unfortunately they don't let me upload anything bigger than 30MB.

Any help here would be appreciated. Again, I just want the subtitles in a simple text or spreadsheet format that I can plug into a difference checker.
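Since the tracks can already be saved as SRT files, one possible route is to strip the cue numbers and timestamps and diff only the text. The following is a minimal sketch, assuming ordinary SRT input (cues separated by blank lines, a numeric index line, then a `-->` timing line); the function names `srt_cue_texts` and `diff_tracks` are made up for this example and do not come from any existing tool.

```python
import difflib
import re

# Matches an SRT timing line such as "00:01:02,345 --> 00:01:04,000".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_cue_texts(srt_text):
    """Return one string per cue, with the index and timing lines dropped."""
    cues = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        # Drop the leading numeric index, if present.
        if lines and lines[0].isdigit():
            lines = lines[1:]
        # Drop the timing line, if present.
        if lines and TIMING.match(lines[0]):
            lines = lines[1:]
        if lines:
            # Join multi-line cues so each cue is a single diffable string.
            cues.append(" ".join(lines))
    return cues


def diff_tracks(a_text, b_text, a_name="track_a", b_name="track_b"):
    """Return a unified diff (list of lines) of the cue texts of two SRT files."""
    a = srt_cue_texts(a_text)
    b = srt_cue_texts(b_text)
    return list(
        difflib.unified_diff(a, b, fromfile=a_name, tofile=b_name, lineterm="", n=1)
    )
```

To use it, read both SRT files into strings (for example with `open(path, encoding="utf-8").read()`), pass them to `diff_tracks`, and print the returned lines. Timing differences are ignored on purpose, so only changes in the wording or in how cues are split show up.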

For the record, here's the MediaInfo for the tracks I'm concerned about. The first English track:

Text #2
ID                          : 7
ID in the original source m : 4609 (0x1201)
Format                      : PGS
Codec ID                    : S_HDMV/PGS
Codec ID/Info               : Picture based subtitle format used on BDs/HD-DVDs
Duration                    : 1 h 40 min
Bit rate                    : 40.4 kb/s
Count of elements           : 2742
Stream size                 : 29.0 MiB (0%)
Language                    : English
Default                     : No
Forced                      : No
Original source medium      : Blu-ray

The second English track:

Text #4
ID                          : 11
ID in the original source m : 4611 (0x1203)
Format                      : PGS
Codec ID                    : S_HDMV/PGS
Codec ID/Info               : Picture based subtitle format used on BDs/HD-DVDs
Duration                    : 1 h 40 min
Bit rate                    : 40.4 kb/s
Count of elements           : 2748
Stream size                 : 29.0 MiB (0%)
Language                    : English
Default                     : No
Forced                      : No
Original source medium      : Blu-ray

Woodstock
Posts: 10333
Joined: Sun Jul 24, 2011 11:21 pm

Re: Extracting and comparing subtitle tracks?

Post by Woodstock » Mon Jun 15, 2020 12:57 am

The subtitles on a commercial DVD or BD are not text; they're pictures. And MakeMKV doesn't count the elements; it only looks at flags.

Fastest way I've found is to rip ALL subtitles, then use VLC to play the video to determine which is which.

After that, I use HandBrake to shrink the video and order the subtitles the way I want them. You can also use mkvmerge to copy the ones you want.
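Because the tracks are pictures, a rough way to compare two of them without OCR is to count the segments in each extracted .sup file. The sketch below assumes the commonly documented PGS layout for .sup files: each segment starts with a 13-byte header (the magic bytes "PG", a 4-byte PTS, a 4-byte DTS, a 1-byte segment type, and a 2-byte payload size, all big-endian). The function name `count_pgs_segments` is made up for this example, and it is only an assumption that any of these counts lines up with MediaInfo's "Count of elements".

```python
import struct
from collections import Counter

# Segment type codes as commonly documented for PGS streams.
SEGMENT_NAMES = {
    0x14: "PDS",  # palette definition
    0x15: "ODS",  # object (image) definition
    0x16: "PCS",  # presentation composition
    0x17: "WDS",  # window definition
    0x80: "END",  # end of display set
}

HEADER = struct.Struct(">2sIIBH")  # magic, PTS, DTS, type, payload size


def count_pgs_segments(data):
    """Count segments by type in the raw bytes of a .sup file."""
    counts = Counter()
    pos = 0
    while pos + HEADER.size <= len(data):
        magic, _pts, _dts, seg_type, size = HEADER.unpack_from(data, pos)
        if magic != b"PG":
            raise ValueError(f"unexpected data at offset {pos}: not a PGS segment")
        counts[SEGMENT_NAMES.get(seg_type, hex(seg_type))] += 1
        # Skip the header plus the payload to reach the next segment.
        pos += HEADER.size + size
    return counts
```

Running this on the bytes of both extracted files (read with `open(path, "rb").read()`) and comparing the two `Counter` results would show whether the tracks differ in the number of display sets, though not where.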

Dankkh
Posts: 6
Joined: Mon Feb 03, 2020 11:58 pm

Re: Extracting and comparing subtitle tracks?

Post by Dankkh » Mon Jun 15, 2020 1:13 am

I ended up solving the mystery by using a program called SubEdit.

While not the most efficient way of going about it, I basically just opened up 2 different instances of SubEdit and examined the 2 subtitle tracks side by side. To speed things up, I scanned the subtitles in "groups", skipping about 100 lines of text at a time. Then, as soon as I noticed the line numbers beginning to mismatch, I went backwards and forwards again until I pinpointed the exact place where the discrepancies began.

It turns out that the "6 fewer" lines of text in the one track were actually the result of taking 2 different lines of text and combining them into a single line. This happens 3 times. Thus, a total delta of 6 between tracks.

Other than that, I believe the two subtitle tracks are the same. Very odd that the Blu-Ray was authored in this manner. There is basically no meaningful difference between the two tracks, other than just 3 places where the text is combined.

@Woodstock: Thanks for the info on this. I did not realize that subtitles are technically images, but it makes sense to me now.
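
The manual search described above (jumping ahead until the line numbers stop matching, then backing up) could in principle be automated once both tracks are available as lists of cue texts. The following is a rough sketch, not what was actually done here: it assumes each track has already been reduced to a list of strings, one per subtitle line, and the names `find_merged_cues` and `_norm` are made up for this example.

```python
import difflib


def _norm(text):
    """Collapse whitespace so line-break differences do not matter."""
    return " ".join(text.split())


def find_merged_cues(split_cues, merged_cues):
    """Find places where two adjacent cues in split_cues appear as a single
    cue in merged_cues.

    Returns a list of (index_in_split, index_in_merged, merged_text).
    """
    found = []
    matcher = difflib.SequenceMatcher(a=split_cues, b=merged_cues, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "replace":
            continue  # only mismatching regions can contain a merge
        # Mismatching regions are small, so check every adjacent pair.
        for i in range(i1, i2 - 1):
            joined = _norm(split_cues[i] + " " + split_cues[i + 1])
            for j in range(j1, j2):
                if joined == _norm(merged_cues[j]):
                    found.append((i, j, merged_cues[j]))
    return found
```

For example, `find_merged_cues(["A", "Hello", "World", "B"], ["A", "Hello World", "B"])` reports that the cues at positions 1 and 2 of the first list were combined into position 1 of the second. Picture-based tracks would first need OCR to produce the text lists, and any OCR errors would show up as extra mismatches.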

Chetwood
Posts: 982
Joined: Mon Aug 30, 2010 9:16 am

Re: Extracting and comparing subtitle tracks?

Post by Chetwood » Mon Jun 15, 2020 4:53 am

You can also use BDSup2Sub to check out the images.
