Media Server Lessons
Below is a series of miscellaneous notes and lessons learned throughout the process of building out my media server. The end goal is not to be an exhaustive resource, but to fill in gaps that I found.
Time
Much more time has been spent on handling edge cases, doing research, and problem solving than will ever be saved by the convenience of having our whole media library just a click away. This has turned out to be more of a hobby, or an extension of my existing hobbies, than a time-saving endeavour.
A lot of these lessons have come from spending time on doing research, testing different things and learning along the way.
Extracting Media
Disclaimer: I am NOT a lawyer and this is not legal advice. You should check your local laws as to the legality of backing up media or using software like MakeMKV. As always, use your best judgment and consult a lawyer when in doubt.
99.9% of the media on the server started out on a physical disc. To help with this, I leveraged a popular tool called MakeMKV.
The tool is very popular amongst archivists and folks who want to extract their media intact. In a nutshell, the app takes the video from the disc and dumps it into files ending in .mkv on your machine. For the most part, these files are unaltered from how they were put on the disc.
4K BluRay vs 1080p
When my journey began, support for 4K BluRay drives was a big unknown. I still haven't personally delved into this area, but I strongly suggest you read this thread from MakeMKV, whether you use this software or another. It's a great resource.
Running in Isolation
In keeping with the rest of the software running on my server machines, MakeMKV runs in an isolated container. I leveraged the Docker image available here. The image has quite a few features, which have been useful from time to time, but the main one is the ability to "auto-rip" discs. This allows you to simply put a disc into the machine and let it extract the contents automatically and eject the disc when it's done.
For ~80% of all the discs I've done, this automatic rip functionality is wonderful and works perfectly. After the disc is ejected, you'll see a recognizable folder name containing a single .mkv file that is significantly larger than the rest. This is usually the main feature. On discs with multiple tracks, like TV shows, the episodes are also easy to spot based on file size.
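As a rough illustration of the file-size heuristic, the sketch below prints the largest .mkv file in a rip folder. The function name largest_mkv and the example path are hypothetical, and it only helps when the main feature really is the biggest file.

```shell
# Print the path of the largest .mkv file directly inside a folder.
# largest_mkv is a hypothetical helper name, not part of MakeMKV.
largest_mkv() {
  lm_best=""
  lm_best_size=-1
  for lm_file in "$1"/*.mkv; do
    [ -f "$lm_file" ] || continue          # skip if the glob matched nothing
    lm_size=$(($(wc -c < "$lm_file")))     # size in bytes, whitespace stripped
    if [ "$lm_size" -gt "$lm_best_size" ]; then
      lm_best=$lm_file
      lm_best_size=$lm_size
    fi
  done
  [ -n "$lm_best" ] && printf '%s\n' "$lm_best"
}

# Example (hypothetical output folder):
#   largest_mkv "/output/MY_MOVIE"
```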
However, as with most things in life, the other 20% of the discs take 80% of my time.
Obfuscated BD Discs
One or more distributors or manufacturers seem to like making people's lives hard by adding some tricky obfuscation to their discs. It causes any software reading the disc to see hundreds, if not thousands, of "versions" of the main feature. Amongst those, only a single one plays the main feature as intended. The rest jumble or mix pieces of the main feature in unintended and sometimes hard-to-spot ways.
The best solution to this, I've found, is to search the MakeMKV forums. You're likely not the first person to run into it, and folks are generally proactive about sharing whatever information they have to help others. Here are some example threads:
- https://www.makemkv.com/forum/viewtopic.php?t=21495
- https://www.makemkv.com/forum/viewtopic.php?t=5285
Multiple "Video" Languages
There are also certain movie studios, generally animation ones, that provide a different video depending on the language selected in the menu. MakeMKV represents this with 3 .mkv files in the output that are identical except for a couple of key scenes where text is localized to the selected language. The English version is usually the first one, but not always.
The best way to select the right one is by trial and error. Usually, opening one and skipping to either the opening or end credits will tell you whether it is the right version.
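To speed up the trial and error, one option is to grab a still frame from each candidate file at the same point and compare the images side by side. This is a minimal sketch, assuming ffmpeg is installed; grab_frames, the file names, and the timestamp are hypothetical, and you would pick a timestamp where localized text is on screen.

```shell
# Save one still frame from each candidate file at the same timestamp.
# grab_frames is a hypothetical helper; the timestamp is whatever point in the
# movie shows localized text (a title card or the credits).
grab_frames() {
  gf_time="$1"
  shift
  for gf_file in "$@"; do
    # -ss before -i seeks quickly; -frames:v 1 writes a single image.
    ffmpeg -nostdin -loglevel error -ss "$gf_time" -i "$gf_file" \
      -frames:v 1 -y "${gf_file%.*}-preview.png"
  done
}

# Example (hypothetical file names and timestamp):
#   grab_frames 00:01:30 title_t00.mkv title_t01.mkv title_t02.mkv
```

Each input produces a PNG next to it with a -preview suffix, so the versions can be compared without opening every file in a player.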
"Video" versions
There are also certain movies that come with multiple video versions. These are rare but important to note. For example, there is a particular movie that, as a gag, randomly chooses a video version every time it's played.
Storing Media
Don't underestimate it!
Seriously, lots of media takes a lot of space. For example, full feature-length movies from a BluRay disc tend to occupy anywhere between 21 GB and 30 GB. However, outliers can be anywhere from 15 GB to over 60 GB (for video spread across multiple discs).
TV series are also not to be underestimated: a single TV series can be over 1 TB from BluRay discs. Even TV series on DVD can take hundreds of gigabytes. Making sure to plan out your storage needs is important.
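A back-of-the-envelope estimate helps with that planning. The sketch below multiplies item counts by an assumed average size; the library counts are hypothetical, and the per-item sizes are rough picks from the ranges above.

```shell
# Rough storage estimate in GB: item count times an assumed average size.
# estimate_gb is a hypothetical helper; both inputs are whole numbers.
estimate_gb() {
  echo $(( $1 * $2 ))
}

# Hypothetical library: 200 BluRay movies at ~25 GB each,
# plus 10 BluRay TV series at ~1000 GB each.
movies_gb=$(estimate_gb 200 25)
series_gb=$(estimate_gb 10 1000)
echo "Movies: ${movies_gb} GB, series: ${series_gb} GB, total: $((movies_gb + series_gb)) GB"
```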
Re-encoding
One of the key ways you can save on space is by re-compressing or re-encoding the media. This is generally a controversial topic amongst different sets of folks, and good information on it is hard to come by.
To facilitate automating this, I had a single bash script to manage the different encoding modes. This helped keep things consistent and automate running the re-encoding process over many files. Below you'll see the ffmpeg commands used; here $FILE refers to the original file name and $FILENAME refers to the original file name without its extension.
Generally, this is the strategy I adopted.
- Animated Content (TV Series or Movies)
ffmpeg -i "$FILE" -map 0 -c copy -c:v libx265 -crf 17 -pix_fmt yuv420p10le -tune animation "$FILENAME-10bit-anim.mkv";
HEVC (H.265), encoded here with x265, is a relatively new video codec, so hardware support might be lacking. However, I have found it to be incredibly effective on animated content. The ffmpeg command above uses the animation tune to make this even better and faster.
The file size savings here were huge: for one animated series, the space originally taken by two episodes could now hold a full 20-episode season without noticeable loss in quality.
An important note here is the choice of pixel format. Anecdotally, I've found (and others have commented) that while 10-bit color depth takes up marginally more space, the encode generally tends to be of higher quality for a minimal increase in file size. I don't know if I can attribute this to a placebo effect or a real benefit, but the space savings from using 8-bit didn't seem worth it.
- "Modern" 1080p+ Content
ffmpeg -i "$FILE" -map 0 -c copy -c:v libx265 -crf 17 -pix_fmt yuv420p10le "$FILENAME-10bit.mkv";
Similar to animated content, HEVC is well suited for "Modern" movies that have little to no film grain. Generally, I draw the line at movies produced after the early 2000s. While you can use it for earlier movies, some folks dislike how HEVC handles film grain (Discussions). x265 does offer a tune to help with it, but I decided to avoid this issue for now (libx265 tune docs).
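The two modes above could be wrapped in a single function along these lines. This is only a sketch of how such a script might look, not the original one: the encode function and the mode names anim and modern are hypothetical, while the ffmpeg arguments are the ones listed above.

```shell
# Hypothetical wrapper: encode <mode> <file>
# Mode names are made up for this sketch; the ffmpeg flags match the
# commands listed above.
encode() {
  enc_mode="$1"
  FILE="$2"
  FILENAME="${FILE%.*}"   # original file name without its last extension
  case "$enc_mode" in
    anim)
      ffmpeg -i "$FILE" -map 0 -c copy -c:v libx265 -crf 17 \
        -pix_fmt yuv420p10le -tune animation "$FILENAME-10bit-anim.mkv"
      ;;
    modern)
      ffmpeg -i "$FILE" -map 0 -c copy -c:v libx265 -crf 17 \
        -pix_fmt yuv420p10le "$FILENAME-10bit.mkv"
      ;;
    *)
      echo "unknown mode: $enc_mode" >&2
      return 2
      ;;
  esac
}

# Example: re-encode every .mkv in the current folder as animated content.
#   for f in *.mkv; do encode anim "$f"; done
```

Keeping the mode selection in one place means every file in a batch gets the same settings, which is the consistency benefit mentioned earlier.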
Btrfs RAID limitations
For the original plan, I had decided to maximize storage over redundancy and availability by striping my storage drives using Btrfs' stripe mode. At first this worked great, until I needed to expand with a larger drive. Btrfs supports adding new devices to a pool very easily; however, the amount of storage the tools reported did not represent reality. Btrfs' striping feature ensures that a file is split between at least two devices / partitions. TODO: Do more research here and bring numbers
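To make the mismatch concrete, here is a simplified model, not a measurement. It assumes striped data always needs free space on at least two devices, so whatever the largest drive has beyond the second largest is left unusable. The raid0_usable function is hypothetical and the sizes are made-up whole numbers in TB; real allocation may differ.

```shell
# Simplified model of usable space when striping across unequal drives.
# Assumption: every stripe needs at least two devices with free space, so the
# part of the largest drive that exceeds the second largest cannot be used.
# raid0_usable is a hypothetical helper; sizes are whole numbers (e.g. TB).
raid0_usable() {
  [ "$#" -ge 2 ] || { echo "need at least two device sizes" >&2; return 1; }
  ru_total=0
  ru_largest=0
  ru_second=0
  for ru_size in "$@"; do
    ru_total=$((ru_total + ru_size))
    if [ "$ru_size" -gt "$ru_largest" ]; then
      ru_second=$ru_largest
      ru_largest=$ru_size
    elif [ "$ru_size" -gt "$ru_second" ]; then
      ru_second=$ru_size
    fi
  done
  echo $((ru_total - (ru_largest - ru_second)))
}

# Example: two 4 TB drives plus a new 8 TB drive.
#   raid0_usable 4 4 8
```

Under this model, adding an 8 TB drive to two 4 TB drives yields 12 TB of usable space rather than the 16 TB of raw capacity.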
Backing up / Archiving Data
With the exception of re-encode operations, media has the unique property that it is immutable: once stored, it never changes. This makes it suitable for some interesting backup strategies like AWS Glacier and other "cold" storage solutions. For my particular use case, I opted for AWS S3 Glacier Deep Archive. At a monthly cost of ~$1 USD per TB, it is a cost-effective archiving solution. As with most cloud services, getting data in is free or cheap, but getting data out is expensive. For this solution, the cost to restore the full data set is in the hundreds of dollars. However, it's important to realize that this is meant for a disaster recovery scenario, where you'll already be spending a lot of money on new hardware.
I strongly suggest others do their research on the various offerings. A more in-depth write-up of this AWS S3 Glacier Deep Archive setup can be found here.
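For a sense of what the upload side can look like, here is a minimal sketch using the AWS CLI. It assumes the CLI is installed and configured with credentials; archive_media, the bucket name, and the media/ key prefix are hypothetical.

```shell
# Upload a media folder straight into the Deep Archive storage class.
# archive_media, the bucket name and the "media" prefix are hypothetical.
archive_media() {
  am_dir="$1"
  am_bucket="$2"
  # sync only uploads files that are missing or changed in the bucket.
  aws s3 sync "$am_dir" "s3://$am_bucket/media" --storage-class DEEP_ARCHIVE
}

# Example (hypothetical path and bucket):
#   archive_media /srv/media my-media-archive
```

Because the files never change after they are stored, repeated runs should only upload newly added media.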