Project Updates for November 2016

Lately, I have been investing a great deal of time in infrastructure development, so there hasn’t been a lot of new stuff to post about (reports that amount to “today I made another bookcase” would get tedious pretty quickly). This has taken months rather than weeks, but things are improving:

  • This building now has a door! Yay! We haven’t really had a decent front wall since the fire in 2013, and the wildlife level in my office (bugs, mice, frogs, and even a snake have made appearances) has not been good for either my sanity or my productivity.
  • I now have on-site all of the components for the dedicated render cluster we are building. This should increase our in-house rendering capacity by a factor of about fifteen to twenty. The ducting system to cool it is just about finished. (This is all being paid for by some contracting work I did from July to September, a much-needed opportunity!)
  • The soundbooth for future voice recordings is mostly finished. I still have to add a layer of sound-baffling on the inside, as it currently still rings too much. And I still need to install the quiet ventilation fan to keep the temperature manageable.
  • We’re not completely at a standstill on production: Keneisha Perry has continued work on the remaining character models and some of the character animation for the pilot.
Anya and Sarah Materials Test

Test render of toon materials for “Anya” and “Sarah” characters (no rigs yet).

Render Cluster Details

What we’re building is a “Helmer cluster” DIY render farm to support Lunatics production. It’s loosely based on designs you can find online, built around the “Helmer” drawer cabinet from Ikea, which happens to fit an ATX-motherboard computer neatly in each drawer and costs only about $40. That’s cheaper than many single desktop cases, and far cheaper than any rackmount case.

I’ve described this before, but briefly: it will be six 8-core blades with AMD-architecture CPUs and 16 GB of RAM each. The budget for this is about $3000, which requires some clever design as well as careful shopping for components.

I plan to document this project both here and in a how-to video series.

I bought the motherboards for this project first, because I really wanted to avoid a cluster with mismatched motherboards. Since then, I’ve bought the rest of the components.

As I see it, the biggest challenge with clusters is adequate heat management. To handle this, I plan to rely primarily on a single HVAC duct fan rather than buying PC case fans, which are more expensive and not particularly powerful. I’m also using a two-stage power supply system: DC-DC regulator boards on each blade, fed by a single 1500 W, 120 VAC to 12 VDC switching supply (Mean Well RSP-1500-12) shared by all six blades. That gives each blade a budget of 250 W, and a design calculator estimated a heavy-load draw of about 170 W per blade for the configuration I’m using, so I think this leaves an adequate margin. This arrangement has the advantage that most of the heat generated by the supply comes from the step-down stage, which sits on top with its own cooling fan, so it neither adds heat to the blades nor blocks the airflow through the drawers. (Sadly, it isn’t any cheaper than buying individual supplies for each blade.)
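The power-budget arithmetic above can be written out as a quick sanity check. The 170 W figure is the design calculator’s estimate quoted above; everything else follows from the supply rating and blade count:

```python
# Rough power-budget check for the shared-supply design.
# Assumption: 6 blades share one Mean Well RSP-1500-12 (1500 W at 12 V).

SUPPLY_WATTS = 1500      # rated output of the shared 12 V supply
NUM_BLADES = 6
EST_BLADE_DRAW = 170     # estimated heavy-load draw per blade (W)

per_blade_budget = SUPPLY_WATTS / NUM_BLADES      # 250 W per blade
total_estimated_draw = EST_BLADE_DRAW * NUM_BLADES  # 1020 W
headroom = SUPPLY_WATTS - total_estimated_draw      # 480 W

print(f"Budget per blade: {per_blade_budget:.0f} W")
print(f"Estimated total draw: {total_estimated_draw} W")
print(f"Headroom: {headroom} W ({headroom / SUPPLY_WATTS:.0%})")
```

Roughly a third of the supply’s capacity is left in reserve, which is why the margin seems adequate.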

Since the cluster will dump up to about 1500 W of heat, I’m building the duct system to blow directly outside in the summer and to be diverted into the room for heating during the winter. Since it replaces a space heater that we’d be running anyway in the winter, the winter electric cost for the cluster will be essentially zero.
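The “essentially zero” claim is just a watt-for-watt offset: a resistive space heater converts electricity to heat at the same rate the cluster does. A sketch with an assumed (hypothetical) electricity price makes the cancellation explicit:

```python
# Winter-cost offset sketch. The electricity rate and load figures here
# are illustrative assumptions, not measured numbers from the build.

RATE_PER_KWH = 0.12   # assumed electricity price, $/kWh (hypothetical)
CLUSTER_KW = 1.0      # ~170 W x 6 blades under heavy load
HOURS_PER_DAY = 24

cluster_cost = CLUSTER_KW * HOURS_PER_DAY * RATE_PER_KWH
heater_offset = CLUSTER_KW * HOURS_PER_DAY * RATE_PER_KWH  # heat we no longer buy
net_winter_cost = cluster_cost - heater_offset

print(f"Cluster electricity per day: ${cluster_cost:.2f}")
print(f"Net winter cost per day:     ${net_winter_cost:.2f}")
```

The offset only works because resistive heating is already 100% efficient at turning watts into room heat, so the cluster is no worse a heater than the appliance it replaces.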

I expect more than an order-of-magnitude (10X) improvement in rendering capacity compared to rendering on my desktop workstation (a 4-core AMD CPU with 8 GB of RAM, and a somewhat lower clock speed than the cluster CPUs will have).
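The estimate falls out of the core counts given above. The clock-speed ratio below is a hypothetical placeholder (the post only says the workstation runs “somewhat lower”), so treat the result as a rough bound, not a benchmark:

```python
# Back-of-envelope speedup estimate: cluster vs. existing workstation.
# The clock ratio is an illustrative assumption, not a measured figure.

workstation_cores = 4
cluster_cores = 6 * 8                 # six 8-core blades = 48 cores

core_ratio = cluster_cores / workstation_cores   # 12x from core count alone
clock_ratio = 1.2                     # assumed modest clock advantage

print(f"Core-count ratio:  {core_ratio:.0f}x")
print(f"Estimated speedup: ~{core_ratio * clock_ratio:.0f}x")
```

Core count alone gives 12X, so even a small clock advantage pushes the estimate into the fifteen-to-twenty range mentioned earlier.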

AMD versus Intel

I know there’s often rivalry between these camps. Intel cores are somewhat faster, but the increased cost per core is a killer: AMD chips generally offer twice as many cores per CPU for about the same price, and since Intel cores aren’t twice as fast, Intel generally doesn’t win on a pure cost-per-operation basis. Because animation rendering is highly parallelizable, having lots of cores matters more than Intel’s modest per-core speed advantage. For our project in particular, GPU rendering is not practical, so I haven’t figured that in.
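The cost-per-operation argument can be sketched numerically. The prices and per-core speed ratio below are hypothetical placeholders, not quotes for specific SKUs; what matters is that doubling cores beats a sub-2X speed advantage for a parallel workload:

```python
# Cost-per-throughput comparison sketch. All figures are illustrative
# assumptions; the point is the ratio, not the dollar amounts.

amd_price, amd_cores, amd_speed = 160.0, 8, 1.0      # relative per-core speed
intel_price, intel_cores, intel_speed = 200.0, 4, 1.3

def cost_per_throughput(price, cores, speed):
    """Dollars per unit of aggregate (parallel) rendering throughput."""
    return price / (cores * speed)

print(f"AMD:   ${cost_per_throughput(amd_price, amd_cores, amd_speed):.2f}")
print(f"Intel: ${cost_per_throughput(intel_price, intel_cores, intel_speed):.2f}")
```

With these placeholder numbers, the AMD part delivers parallel throughput at roughly half the cost, which is the shape of the trade-off driving the build.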

Build, Buy, or Rent?

It’s pretty obvious that I’m saving (a LOT of) money by building this DIY cluster rather than buying rack-based servers, but you might wonder why I don’t just use cloud services for our rendering needs.

I did explore that option. For producing a single episode, the cost/benefit analysis came out about even either way. With $3000, I could buy a pretty good chunk of compute time from Amazon’s cloud services (or just pay a commercial render farm): sufficient, I think, to handle the needs of producing our first episode. It would probably also finish faster on the wall clock, because cloud services make it possible to provision a great many CPUs for a short time rather than a few CPUs for a long time.

However, we intend to go straight on to a second episode, and the cloud-based approach would mean we’d need just as much money again for that one. With the render cluster in-house, all we have to pay for is the electricity (which, as noted above, costs us essentially nothing in the winter because it offsets heating we’d buy anyway). As long as we’re merely renting these resources, we’d be at constant risk of running out of funds and leaving the project unfinished, especially if we run over budget or behind schedule, which of course has been a problem for us before. Better to plan for it.
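The rent-vs-own reasoning is a simple break-even calculation. Only the $3000 hardware budget comes from the text; the per-episode cloud cost below is the assumed "about even for one episode" figure, and the local per-episode cost is taken as zero per the winter-heating argument:

```python
# Rent-vs-own break-even sketch. The per-episode figures are assumptions
# drawn from the "about even for one episode" comparison, not quotes.

HARDWARE_COST = 3000.0       # one-time cluster build cost
CLOUD_PER_EPISODE = 3000.0   # assumed cloud / render-farm cost per episode
LOCAL_PER_EPISODE = 0.0      # winter electricity offset by heating savings

def total_cost(episodes, upfront, per_episode):
    """Cumulative cost after rendering the given number of episodes."""
    return upfront + episodes * per_episode

for n in (1, 2, 3):
    own = total_cost(n, HARDWARE_COST, LOCAL_PER_EPISODE)
    rent = total_cost(n, 0.0, CLOUD_PER_EPISODE)
    print(f"{n} episode(s): own ${own:.0f} vs. rent ${rent:.0f}")
```

Under these assumptions the two approaches tie at one episode, and owning wins from the second episode onward, which is exactly the multi-episode argument above.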

In fact, we might be able to use our cluster to join a render-farm collective, rather than just paying for time. Our machines would render for other people during idle periods, building up credit we could spend on free rendering time from others for our project. That would let us do occasional renders very fast.

It’s also worth noting that full control over the render cluster gives us better assurance that the software is exactly the right version and configuration to do our rendering correctly, which may become important since we are doing a lot of innovation in the rendering stage to get our non-photorealistic look.