
Creating a 3-way 10 GbE Cluster without a Switch – Part 1

After finishing the PhD, I’m back in the lab to test out some new high-speed computing experiments. Recently, I retrieved one of my servers from co-location and put it back into my home-office lab, giving me 3 servers with Fusion-io cards all in the same spot. I’m trying to move around some virtual machines and update the trading optimizer (CapGen) investing databases, so I thought it would be useful to get all of the servers talking at 10 Gb/s or better. The Fusion-io cards easily overwhelm a 1 Gb/s connection, since even one of the Duos provides 1.5 GB/s, which is actually 12 Gb/s.
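As a quick check on the units, here is a minimal Python sketch of the byte-to-bit conversion, assuming the usual 8 bits per byte and ignoring any protocol overhead. The function name is only illustrative.

```python
def gbytes_to_gbits(gigabytes_per_second: float) -> float:
    """Convert a rate in GB/s to Gb/s (8 bits per byte)."""
    return gigabytes_per_second * 8


# One Duo at the 1.5 GB/s quoted above.
duo_gbps = gbytes_to_gbits(1.5)
print(f"{duo_gbps:.0f} Gb/s, or {duo_gbps / 1.0:.0f}x a 1 Gb/s link")
# prints "12 Gb/s, or 12x a 1 Gb/s link"
```

So a single Duo can already outrun a 10 Gb/s link on paper, let alone a 1 Gb/s one.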

A few months back I managed to get 2 servers working using Mellanox ConnectX cards on QSFP connections at 20 Gb/s, but that stopped working on me with a driver update (ConnectX is not supported on Windows Server 2012 R2), so I went to a better-supported, although slower, connection at 10 Gb/s. To do this, I got the Mellanox ConnectX-3 EN cards for 2 of the servers and bought the add-on dual 10 GbE adapter for one of the HP DL370 G6 servers. One advantage of the HP add-on adapter is that it doesn’t require an additional slot, although you do trade off 2 of the 1 Gb connectors.

This approach allows the maximum number of Fusion-io (HP IO Accelerator) cards in the server (currently at 9, with 8 of them being Duos) as shown below.

[Image: fio-cards]

In this arrangement, each server has a dedicated high-speed connection to each of the other two servers via its dual-port interface, as shown in the table below, without the need for a switch. Basically, it is just 3 cables connecting each server to the other two servers via the 6 total ports (2 on each server).

[Image]

Server   | Source Port: IP Address | Destination Server | Destination Port: IP Address
Server 1 | 1: 10.0.0.11            | Server 2           | 1: 10.0.0.21
Server 1 | 2: 10.0.0.12            | Server 3           | 1: 10.0.0.31
Server 2 | 1: 10.0.0.21            | Server 1           | 1: 10.0.0.11
Server 2 | 2: 10.0.0.22            | Server 3           | 2: 10.0.0.32
Server 3 | 1: 10.0.0.31            | Server 1           | 2: 10.0.0.12
Server 3 | 2: 10.0.0.32            | Server 2           | 2: 10.0.0.22
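The cabling plan is easy to get wrong by one digit, so here is a small Python sketch that checks it. It assumes the addressing pattern is 10.0.0.<server><port>, which makes port 2 on Server 1 10.0.0.12. The dictionary layout and function names are my own illustration, not part of any tool.

```python
from itertools import combinations

# Address plan: (server, port) -> IP, assuming the 10.0.0.<server><port> pattern.
PORTS = {
    ("Server 1", 1): "10.0.0.11",
    ("Server 1", 2): "10.0.0.12",  # assumed from the pattern
    ("Server 2", 1): "10.0.0.21",
    ("Server 2", 2): "10.0.0.22",
    ("Server 3", 1): "10.0.0.31",
    ("Server 3", 2): "10.0.0.32",
}

# Three cables, each joining one port on one server to one port on another.
CABLES = [
    (("Server 1", 1), ("Server 2", 1)),
    (("Server 1", 2), ("Server 3", 1)),
    (("Server 2", 2), ("Server 3", 2)),
]


def cables_for_full_mesh(servers: int) -> int:
    """Number of point-to-point cables needed so every pair is directly linked."""
    return servers * (servers - 1) // 2


def check_mesh(ports, cables):
    """Return a list of problems found in the plan (empty list means it is consistent)."""
    problems = []
    if len(set(ports.values())) != len(ports):
        problems.append("duplicate IP address")
    ends = [end for cable in cables for end in cable]
    if len(set(ends)) != len(ends):
        problems.append("port used by more than one cable")
    if any(end not in ports for end in ends):
        problems.append("cable plugged into an undefined port")
    linked = {frozenset((a[0], b[0])) for a, b in cables}
    servers = sorted({server for server, _ in ports})
    for pair in combinations(servers, 2):
        if frozenset(pair) not in linked:
            problems.append(f"no direct link between {pair[0]} and {pair[1]}")
    return problems


print(cables_for_full_mesh(3))    # prints 3
print(check_mesh(PORTS, CABLES))  # prints []
```

The same count shows why this trick does not scale far: 4 servers would need 6 cables and 3 ports per server, at which point a switch starts to look attractive.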

Below are pictures of the rear of a couple of the servers.

[Images: server2, server1b]

One of the pain points with setting up a separate private network is that the adapters end up in the Public network category by default. Some articles have been written about how to fix this with a script, but for my testing I am taking the lazy way out and just turning off the Windows Firewall on the servers. After having done that, I can map drives directly over the high-speed link, and I verified the ability to achieve 10 Gb/s throughput by copying files using Fusion-io drives as the source and target. I am now able to copy a 60 GB file from one server to the other, and since my VMs are between 40 and 80 GB, this should provide the ability to achieve a live migration of a VM in around a minute.
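The "around a minute" figure can be sanity-checked with a rough Python estimate. This is a best-case sketch: it assumes the link is the only bottleneck and ignores protocol overhead unless you lower the `efficiency` parameter, which is a hypothetical knob I added for illustration.

```python
def transfer_seconds(size_gb: float, link_gbps: float = 10.0,
                     efficiency: float = 1.0) -> float:
    """Seconds to move size_gb gigabytes over a link of link_gbps gigabits per second."""
    return size_gb * 8 / (link_gbps * efficiency)


for size in (40, 60, 80):
    print(f"{size} GB: {transfer_seconds(size):.0f} s")
# prints:
# 40 GB: 32 s
# 60 GB: 48 s
# 80 GB: 64 s
```

At full line rate, the 40 to 80 GB range lands between roughly half a minute and just over a minute, which is consistent with the estimate above.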

Now that the infrastructure is in place, I will start experimenting with clustering in the next post and will look into some alternatives beyond just the Ethernet/iSCSI approach, including RDMA. I will also do some experimentation with log shipping and the AlwaysOn capability in SQL Server, and I will try out the live migration features in Hyper-V to test how practical they are on the 10 Gb backbone. Lastly, I will test out my idea for a distributed SQL Server database that sits on top of iSCSI on the high-speed network, wherein the database is effectively scaled out beyond one server, with the other servers hosting the storage through file server roles.

I hope to finish some testing in the next couple of weeks in my spare time.
