On Hitachi Reliability

I was out of town last week visiting Colorado Springs. Towards the end of my trip, I got an alarming email alert that a drive in my RAID array was dead.

I use RAID 10 for my main array because I mostly run consumer drives, and a parity RAID rebuild is too risky on drives like that.
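To put a rough number on that rebuild risk, here is a back-of-the-envelope sketch in Python. Every figure in it is an assumption for illustration, not a measurement from this array: a hypothetical 4 TB drive size, a four-drive array, and an unrecoverable read error (URE) rate of one per 10^14 bits, a figure commonly quoted on consumer drive spec sheets. The model simply treats each bit read during a rebuild as an independent chance of hitting a URE.

```python
import math

# All three constants are assumptions for illustration only.
URE_PER_BIT = 1e-14   # commonly quoted consumer-drive spec: 1 error per 10^14 bits
DRIVE_BYTES = 4e12    # hypothetical 4 TB drives
ARRAY_DRIVES = 4      # hypothetical four-drive array


def p_clean_read(bytes_read, ure_per_bit=URE_PER_BIT):
    """Probability of reading `bytes_read` bytes without a single URE."""
    bits = bytes_read * 8
    # log1p keeps precision when the per-bit error rate is tiny.
    return math.exp(bits * math.log1p(-ure_per_bit))


# RAID 10 rebuild: only the failed drive's mirror partner is read in full.
raid10_clean = p_clean_read(DRIVE_BYTES)

# Single-parity rebuild: every surviving drive is read in full.
parity_clean = p_clean_read(DRIVE_BYTES * (ARRAY_DRIVES - 1))

print(f"RAID 10 rebuild with no URE: {raid10_clean:.1%}")
print(f"Parity rebuild with no URE:  {parity_clean:.1%}")
```

With these assumed numbers, the mirror rebuild comes out at roughly a 73% chance of a clean read, versus roughly 38% for the parity rebuild. The spec-sheet rate is a worst-case bound, so this is best read as a relative comparison of the two layouts rather than a prediction.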

I ordered a replacement drive (a WD Red) so that I could swap it in when I got home. Everything went fine.

Here’s what I replaced. Not a bad lifespan for 24×7 usage. Here’s hoping I get a similar life out of my WD Red drive, and that my 3 other Hitachi drives don’t die en masse.

On Random Yet Consistently Timed Crashes

For the past few weeks, I’ve been dealing with hard crashes on my Hyper-V server. They all happened at around the same time of day. Essentially, the VM would stop responding to anything beyond pings. If I tried to use the Hyper-V console to bring it up, it would just crash. If I tried to reboot or stop the VM, it would crash the host.

So I went through the event logs on the host and came across a bunch of errors on my HighPoint 2720 controller about ports not responding and the driver not responding. I have a scheduled drive pair verify that was running around the time of the crashes, so I suspected a problem with the drives in that pair.

I ran a full drive scan on the two drive pairs. Both scans finished without errors, and there were no crashes. After that, I ran a drive pair verify at a different time of day. That one succeeded as well.

Feeling thoroughly confused, I went through the event logs again and came across an error in the host log with the same timestamps. This error came from my PCIe network card, so at that point I started trying to figure out what could cause two PCIe cards to stop responding at the same time.

I went through the logs on the VM again and noticed some errors with event ID 129, but with no details given due to a missing component. I did some Googling and found an ancient MS forum post about this error. It traced the error to an issue with VSS and described similar problems with crashing VMs.

I then remembered that I had a Windows Server Backup job running on the VM at around the same time as the crashes. I disabled it, and suddenly the crashes stopped.
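The step that cracked this was lining up errors from different logs by timestamp. A minimal Python sketch of that idea follows. It assumes the events have already been exported as (timestamp, source) pairs. The source names and timestamps in the sample are made up for illustration and are not taken from the actual logs.

```python
from datetime import datetime, timedelta


def coincident_events(events, window=timedelta(minutes=5)):
    """Cluster events separated by no more than `window`, then keep only
    the clusters that involve more than one source."""
    clusters, current = [], []
    for ts, source in sorted(events):
        # A gap longer than the window closes the current cluster.
        if current and ts - current[-1][0] > window:
            clusters.append(current)
            current = []
        current.append((ts, source))
    if current:
        clusters.append(current)
    return [c for c in clusters if len({src for _, src in c}) > 1]


# Hypothetical sample data: placeholder source names and dates.
sample = [
    (datetime(2000, 1, 1, 2, 0), "storage-controller"),
    (datetime(2000, 1, 1, 2, 1), "network-card"),
    (datetime(2000, 1, 1, 2, 3), "vm-disk"),
    (datetime(2000, 1, 1, 14, 30), "network-card"),
    (datetime(2000, 1, 2, 2, 2), "storage-controller"),
    (datetime(2000, 1, 2, 2, 2), "vm-disk"),
]

clusters = coincident_events(sample)
for cluster in clusters:
    start = cluster[0][0]
    sources = sorted({src for _, src in cluster})
    print(start.isoformat(), sources)
```

Clusters that recur at the same time of day and span unrelated devices point towards a shared trigger, such as a scheduled job, rather than a fault in any one device.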


Rosewill RNG-407-Dualv2 Thoughts

A few weeks ago, I picked up a Rosewill RNG-407-Dualv2 for my home server. It’s a dual gigabit NIC that Amazon had for less than $40. Thanks to Hyper-V and a Ubiquiti managed switch, I was able to quickly set up a port channel, which gave me some extra speed on network operations and let me separate my VM network traffic from my management traffic.

Since they were so affordable, I decided to pick one up for my desktop as well. My desktop runs Windows 10 Pro with Hyper-V. I figured I’d set it up the opposite way from the server, with the desktop getting the 2 Gbps port channel connection and my VMs getting my existing onboard 1 Gbps connection. Little did I know that this was going to be a bit more of a hassle.

As any decent IT guy would do, I tossed aside the provided driver CD, and jumped online to grab the latest drivers. And then the fun started.

Rosewill’s drivers installed without trouble, and I suddenly had two LAN interfaces, as expected. However, Microsoft does not support teaming in Windows 10 natively. That was probably something I should have investigated before buying this, but what’s a home tech project that doesn’t have a few surprises?

I did some checking on Rosewill’s site, but it was sparse on details and instructions. There was a diagnostic driver that had a folder called teaming, but when I tried to install it, Windows blocked it due to incompatibilities.

At this point, I was starting to get concerned, so I decided to check with the chipset manufacturer and see what generic drivers they had. Fortunately, the network chip is a Realtek product, so they had several driver options on their website.

I downloaded the latest Windows 10 drivers and installed them. They were more recent than the Rosewill ones, so I had high initial hopes for them. Alas, there was still no way to configure teaming from the driver side.

Realtek had a diagnostic driver, so I attempted an install of that. Everything seemed great. The network cards showed up in network devices with the Realtek Teaming driver, and there were no alerts anywhere. So I fired up the Realtek Diagnostic Tool, and it failed with a protocol error. I did some Googling and turned up an old driver on ASRock’s site, of all places, that claimed teaming abilities.

Deciding that I had nothing left to lose, I downloaded and installed the driver pack. I tried to load the teaming utility, and it came right up. I was then able to set up network teaming through the rather archaic-looking utility. After a quick port channel config on the switch, I was able to get connected.

As a test, I started copying some large files to two separate systems to maximize speed. Right away, I hit 1.5 Gbps, which is exactly what I wanted to see.
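Copying to two separate systems matters because link aggregation generally pins each flow to a single member link, so one transfer on its own tops out at the speed of one link. The Python sketch below is a simplified illustration of that behaviour, not the actual algorithm used by the Realtek utility or the switch. It uses a CRC32 of the source and destination names as a stand-in for whatever header fields the real hardware hashes, and the host names are hypothetical.

```python
import zlib

LINK_GBPS = 1.0   # each member link is gigabit
LINKS = 2         # dual-port NIC in one port channel


def pick_link(src, dst, links=LINKS):
    """Map a flow to one member link. CRC32 of the endpoint names is a
    stand-in for the real hash over MAC/IP/port fields."""
    key = f"{src}->{dst}".encode()
    return zlib.crc32(key) % links


def aggregate_ceiling(flows, links=LINKS, link_gbps=LINK_GBPS):
    """Upper bound on combined throughput: one link's worth of bandwidth
    for each distinct member link that the flows land on."""
    used = {pick_link(src, dst, links) for src, dst in flows}
    return len(used) * link_gbps


# Hypothetical host names for illustration.
one_flow = [("desktop", "server-a")]
two_flows = [("desktop", "server-a"), ("desktop", "server-b")]

print("single transfer ceiling:", aggregate_ceiling(one_flow), "Gbps")
print("two transfers ceiling:  ", aggregate_ceiling(two_flows), "Gbps")
```

A single flow always reports a 1 Gbps ceiling here. Two flows can reach 2 Gbps only if the hash happens to put them on different links, which is why a test with multiple destinations is the meaningful one for a teamed connection.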

For now, I’m satisfied. However, I suspect I’ll be trying the Realtek diagnostic drivers again, since I’m not sure why those wouldn’t work when an older ASRock driver for Realtek would.