Media Server Lessons

Below is a series of miscellaneous notes and lessons learned throughout the process of building out my media server. The end goal is not to be an exhaustive resource, but to fill in gaps that I found.

Time

Much more time has been spent on handling edge cases, doing research, and problem solving than will ever be saved by the convenience of having our whole media library just a click away. This has turned out to be more of a hobby, or an extension of my existing hobbies, than a time-saving endeavour.

A lot of these lessons have come from spending time on doing research, testing different things and learning along the way.

Extracting Media

Disclaimer: I am NOT a lawyer and this is not legal advice. You should check your local laws as to the legality of backing up media or using software like MakeMKV. As always, use your best judgment and consult a lawyer when in doubt.

99.9% of the media on the server started out on a physical disc. To help with this, I leveraged a popular tool called MakeMKV. The tool is very popular amongst archivists and folks who want to extract their media intact. In a nutshell, the app takes the video from the disc and dumps it into files ending in .mkv on your machine. For the most part, these files are unaltered from how they were put on the disc.

4K BluRay vs 1080p

When my journey began, support for 4K BluRay drives was a big unknown. I still haven't personally delved into this area, but I strongly suggest you read this thread from MakeMKV, whether you use this software or another. It's a great resource.

Running in Isolation

In keeping with the rest of the software running on my server machines, MakeMKV runs in an isolated container. I leveraged the docker image available here. The docker image has quite a few features, which have been useful from time to time, but the main feature it offers is the ability to "auto-rip" discs. This allows you to simply put a disc into the machine, let it extract the contents automatically, and have it eject the disc when it's done.

For ~80% of all the discs I've done, this automatic rip functionality is wonderful and works perfectly. After the disc is ejected you'll see a recognizable folder name and a single .mkv file that is significantly larger than the rest. This is usually the main feature. On discs with multiple tracks, like TV shows, the episodes are also easy to spot based on file size.

However, as with most things in life, the other 20% of the discs take 80% of my time.
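For the easy 80%, picking the main feature out of a rip folder by file size can be scripted. The following is a minimal sketch, not part of the docker image: the function name largest_mkv and the folder layout (all .mkv files directly inside one folder) are my own assumptions.

```shell
# Print the path of the largest .mkv file directly inside a rip folder.
# Assumes the main feature is the biggest file, which is usually but not
# always true (see the obfuscated discs below).
largest_mkv() {
  dir="$1"
  best=""
  best_size=-1
  for f in "$dir"/*.mkv; do
    [ -f "$f" ] || continue              # skip when the glob matches nothing
    size=$(( $(wc -c < "$f") ))          # file size in bytes
    if [ "$size" -gt "$best_size" ]; then
      best="$f"
      best_size="$size"
    fi
  done
  [ -n "$best" ] && printf '%s\n' "$best"
}

# Usage (hypothetical path):
#   largest_mkv "/output/MY_MOVIE"
```

The function prints nothing and returns a non-zero status when the folder holds no .mkv files, so it can be used in a conditional.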

Obfuscated BD Disks

There appear to be one or more distributors or manufacturers that like to make people's lives hard. These manufacturers add some tricky obfuscation to their discs. They do so by causing any software reading the disc to see hundreds, if not thousands, of "versions" of the main feature. Amongst those, only a single one will have the main feature as intended. The rest will jumble or mix pieces of the main feature in unintended and sometimes hard-to-spot ways.

The best solution I've found is to search the MakeMKV forums. You're likely not the first person to run into it, and folks are generally proactive about sharing whatever information they have available to help others. Here are some example threads:

Multiple "Video" Languages

There are also certain movie studios, generally animation ones, that provide a different video depending on the language that is selected on the menu. MakeMKV represents this with 3 .mkv files in the output that are identical except for a couple of key pieces where text is localized to the selected language. The English version is usually the first one, but not always.

The best way to select the right one is by trial and error. Usually selecting one and skipping to either the opening credits or end credits will tell you if it was the right version or not.

"Video" versions

There are also certain movies that come with multiple video versions. These are rare but important to note. For example, there is a particular movie that, as a gag, randomly chooses a video version every time it's played.

Storing Media

Don't underestimate it!

Seriously, lots of media takes a lot of space. For example, full feature-length movies from a BluRay disc tend to occupy anywhere between 21 GB and 30 GB. However, outliers can be anywhere from 15 GB to over 60 GB (for video spread between discs).

TV series are another category not to underestimate: a single TV series can be over 1 TB from BluRay discs. Even TV series on DVD can take hundreds of gigabytes. Making sure to plan out your storage needs is important.
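As a rough planning aid, the typical per-movie range above can be turned into a quick estimate. This is only a sketch: the function name estimate_movie_gb and the library size of 200 movies are made-up illustrations, and outliers and TV series are not accounted for.

```shell
# Rough library size range in GB for a given number of BluRay movies,
# using the typical 21 GB to 30 GB per-movie range quoted above.
estimate_movie_gb() {
  count="$1"
  low=$((count * 21))     # every movie at the small end of the range
  high=$((count * 30))    # every movie at the large end of the range
  echo "$low $high"
}

estimate_movie_gb 200     # hypothetical library; prints "4200 6000"
```

So a hypothetical 200-movie library would land somewhere between roughly 4 TB and 6 TB before any re-encoding.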

Re-encoding

One of the key ways you can save on space is by re-compressing or re-encoding the media. This is generally a controversial topic among different sets of folks, and good information around it is hard to come by.

To facilitate the automation of this, I had a single bash script to manage the different encoding modes. This helped keep things consistent and automate running the re-encoding process over many files. Below you'll see the ffmpeg commands used; here $FILE refers to the original filename and $FILENAME refers to the original filename without an extension.

Generally, this is the strategy I adopted.

  • Animated Content (TV Series or Movies)
ffmpeg -i "$FILE" -map 0 -c copy -c:v libx265 -crf 17 -pix_fmt yuv420p10le -tune animation "$FILENAME-10bit-anim.mkv";

HEVC (H.265), encoded here with libx265, is a relatively new video codec, so hardware support might be lacking. However, I have found it to be incredibly effective on animated content. The ffmpeg command above uses the animation tuning option (-tune animation) to make this even better and faster.

The file size savings here were huge, to the point that, for an animated series, the space originally taken by two episodes could now hold a 20-episode season without noticeable loss in quality.

An important note here is the choice of pixel format. Anecdotally I've found (and others have commented) that while 10-bit color takes up marginally more space, the encode generally tends to be of higher quality. I don't know if I can attribute this to a placebo effect or a real benefit, but the space savings from using 8-bit didn't seem worth it.

  • "Modern" 1080p+ Content
ffmpeg -i "$FILE" -map 0 -c copy -c:v libx265 -crf 17 -pix_fmt yuv420p10le "$FILENAME-10bit.mkv";

Similar to animated content, HEVC is well suited for "Modern" movies that have little to no film grain. Generally I draw the line at movies produced after the early 2000s. While you can use it for movies earlier than that, some folks dislike the handling of film grain in HEVC (Discussions). libx265 does offer a tuning option to help with it, but I decided to avoid this issue for now (libx265 tune docs).
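The two commands above could be wrapped in a single function along these lines. This is a minimal sketch and not the original script: the function name encode, the mode names anim and modern, and the DRY_RUN switch are my own assumptions. The ffmpeg flags are copied from the commands above.

```shell
# encode MODE FILE
#   MODE is "anim" or "modern" (hypothetical names for the two strategies).
#   Set DRY_RUN=1 to print the ffmpeg command instead of running it.
encode() {
  mode="$1"
  file="$2"
  filename="${file%.*}"                 # $FILENAME: name without extension
  case "$mode" in
    anim)   set -- -tune animation; suffix="-10bit-anim" ;;
    modern) set --;                 suffix="-10bit" ;;
    *)      echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  # "$@" now holds only the mode-specific flags (possibly none).
  ${DRY_RUN:+echo} ffmpeg -i "$file" -map 0 -c copy -c:v libx265 -crf 17 \
    -pix_fmt yuv420p10le "$@" "$filename$suffix.mkv"
}

# Usage: re-encode every .mkv in the current folder as animated content.
#   for f in *.mkv; do encode anim "$f"; done
```

Keeping the shared flags in one place means a change to the CRF or pixel format applies to both modes at once, which is the consistency benefit mentioned earlier.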

Btrfs RAID limitations

For the original plan, I had decided to maximize storage over redundancy and availability by choosing to stripe my storage drives using Btrfs' stripe mode. At first, this worked great, until I needed to expand with a larger drive. Btrfs supports adding new devices to a pool very easily; however, the amount of storage the tools reported did not represent reality. Btrfs' striping feature will ensure that a file is split between at least two devices / partitions. TODO: Do more research here and bring numbers
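To illustrate why the reported space can be misleading, here is a small sketch of the arithmetic under an assumed model, not measured numbers: striping spreads data across every device that still has free space and needs at least two such devices, so whatever the largest device has beyond the second largest cannot be used. The function name usable_raid0 and the drive sizes are hypothetical.

```shell
# Usable space for a striped pool under the assumed model described above.
# Arguments are device sizes in whole units (for example TB).
usable_raid0() {
  total=0
  largest=0
  second=0
  for size in "$@"; do
    total=$((total + size))
    if [ "$size" -gt "$largest" ]; then
      second=$largest
      largest=$size
    elif [ "$size" -gt "$second" ]; then
      second=$size
    fi
  done
  # The tail of the largest device has no partner to stripe with.
  echo $((total - (largest - second)))
}

usable_raid0 4 4 8    # two 4 TB drives plus a new 8 TB drive; prints 12
```

In this example the raw capacity is 16 TB, but under the assumed model only 12 TB is usable, because the last 4 TB of the larger drive has nothing to stripe against.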

Backing up / Archiving Data

With the exception of re-encode operations, media has the useful property of being immutable: once stored, it never changes. This makes it suitable for some interesting backup strategies like AWS Glacier and other "cold" storage solutions. For my particular use case, I opted for AWS S3 Glacier Deep Archive. At a monthly cost of ~$1 USD per TB, it is a cost-effective archiving solution. As with most cloud services, getting data in is free / cheap but getting data out is expensive. For this solution, the cost to restore the full data set is in the hundreds of dollars. However, it's important to realize that this is meant for a disaster recovery scenario where you'll be spending a lot of money already on new hardware.
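As a sketch of what an upload could look like with the AWS CLI, the function below copies a single file directly into the Deep Archive storage class. The function name archive_file, the bucket name my-media-archive, and the DRY_RUN switch are hypothetical; the sketch assumes the AWS CLI is installed and configured with credentials.

```shell
# archive_file FILE
#   Upload one file to S3 using the DEEP_ARCHIVE storage class.
#   BUCKET is a hypothetical bucket name; override it in the environment.
#   Set DRY_RUN=1 to print the command instead of running it.
archive_file() {
  file="$1"
  bucket="${BUCKET:-my-media-archive}"
  ${DRY_RUN:+echo} aws s3 cp "$file" "s3://$bucket/$file" \
    --storage-class DEEP_ARCHIVE
}

# Usage:
#   BUCKET=my-media-archive archive_file "movies/some-movie-10bit.mkv"
```

Because the files never change after upload, a one-time copy per file is enough, and there is no need for versioning or repeated syncs.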

I strongly suggest others do their own research on the various offerings. A more in-depth write-up of AWS S3 Glacier Deep Archive can be found here.

Copyright © Andres Ruiz 2020