Wednesday 22 October 2014

ZFS, OVM and the limitations using InfiniBand

The things they don't tell you...

What a pity: this combination does come with some shortfalls. Oracle doesn't tell you, because strictly speaking they never use all of those marvelous features combined themselves. Here is my experience, shared so that other people can avoid painting themselves into a corner.

ZFS direct access from OVM over InfiniBand

The best of all worlds, but not for OVM. OVM is an Oracle flavor of Xen, and Xen provides bonded networking to its guest machines. A network bond is only possible on Ethernet (OSI layer 2). On InfiniBand there is Ethernet over InfiniBand (EoIB), but that is not supported by Xen. So networking over InfiniBand is supported up to the host (Oracle Virtual Server), but not into its guests.
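
A quick way to see this on the Oracle VM Server itself is the link type the kernel reports per interface: Ethernet links can be enslaved in a bond and bridged to guests, IPoIB links cannot join an Ethernet bond. A minimal sketch (interface names on your host will differ):

    #!/usr/bin/env python
    # List interfaces on the OVM server (dom0) with their link type.
    import os

    ARPHRD_ETHER = 1          # Ethernet
    ARPHRD_INFINIBAND = 32    # IPoIB

    for iface in sorted(os.listdir('/sys/class/net')):
        with open('/sys/class/net/%s/type' % iface) as f:
            link_type = int(f.read().strip())
        if link_type == ARPHRD_ETHER:
            label = 'Ethernet - bondable, usable for guest networks'
        elif link_type == ARPHRD_INFINIBAND:
            label = 'InfiniBand (IPoIB) - host only, no Ethernet bond'
        else:
            label = 'other (type %d)' % link_type
        print('%-10s %s' % (iface, label))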

Alternative 1 - IB access from the guest

We could work around this by installing InfiniBand drivers in the guest. Those are supposed to be bridged (I have not tried it myself), but due to the way InfiniBand is implemented over PCI Express, the device state cannot be "frozen" during a live migration. This seriously cripples one of the best OVM features.

Alternative 2 - No direct access

Why would you want direct access in the first place? Possibly because an NFS mount on ZFS is handy and fast to implement. Go for a solution that is slightly more difficult to set up, but easier to maintain (let alone the easier security implementation): only allow the storage (ZFS) to be accessed as a local disk by the guest machine. That local disk may be virtual (a disk in the OVM repository) or a physical disk mounted to the guest. Those disks come from ZFS, either through NFS-over-IB or through iSCSI. Unfortunately, iSCSI over InfiniBand (iSER) is not supported by OVM. That said, despite the "IP" overhead of iSCSI over IP-over-InfiniBand (IPoIB), the ZFS plugin for OVM makes it possible to do all required disk administration from within OVM.
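
For the iSCSI-over-IPoIB route, the work on the server boils down to a target discovery and a login against the IPoIB address of the ZFS head. A minimal sketch with a made-up portal address and target IQN; in practice the ZFS plugin and OVM Manager drive this for you:

    #!/usr/bin/env python
    # Discover and log in to an iSCSI target exposed on the IPoIB network.
    import subprocess

    PORTAL = '192.168.10.20:3260'   # hypothetical IPoIB address of the ZFS head
    TARGET = 'iqn.1986-03.com.sun:02:example-ovm-lun'   # hypothetical target IQN

    # 1. Discover the targets the appliance offers on its IPoIB portal.
    subprocess.check_call(
        ['iscsiadm', '-m', 'discovery', '-t', 'sendtargets', '-p', PORTAL])

    # 2. Log in to the chosen target; the LUN then appears as a local block
    #    device (e.g. /dev/sdX) that OVM can present to a guest as a physical disk.
    subprocess.check_call(
        ['iscsiadm', '-m', 'node', '-T', TARGET, '-p', PORTAL, '--login'])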





Alternative 3 - No InfiniBand

Right, you want to stick to NFS mounts of ZFS, accessed in your guests. Then the only possibility is accessing ZFS through a "bond-able" interface; a small mount sketch follows the list. That is either:
  • An Ethernet interface directly on OVM, e.g. 10Gb Ethernet;
  • An Ethernet interface exposed through Xsigo, e.g. a 10Gb port on the Xsigo, routed over IB and accessible over IPoIB.
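
The mount itself is then plain NFS over the bonded guest interface. A minimal sketch, assuming a hypothetical bond name, ZFS host and export path:

    #!/usr/bin/env python
    # Inside a guest: check the bond state, then mount the ZFS NFS share over it.
    import subprocess

    BOND = 'bond0'                         # hypothetical bonded guest interface
    SHARE = 'zfs-head:/export/ovm_nfs'     # hypothetical ZFS NFS export
    MOUNTPOINT = '/u01/nfs'

    # The bonding driver reports its state (mode, active slave, link status)
    # under /proc/net/bonding/<bond>.
    with open('/proc/net/bonding/%s' % BOND) as f:
        print(f.read())

    # Mount the ZFS share over the bonded, and therefore redundant, path.
    subprocess.check_call(['mkdir', '-p', MOUNTPOINT])
    subprocess.check_call(['mount', '-t', 'nfs', '-o', 'vers=3,hard,intr',
                           SHARE, MOUNTPOINT])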

Alternative 4 - Indirect NFS

Just have a "simple" guest machine running an NFS server, and in turn use Alternative 2 to put its storage disk on ZFS.
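
A minimal sketch of what that NFS guest does, with a made-up path and client subnet; the exported filesystem itself sits on a ZFS-backed disk presented via Alternative 2:

    #!/usr/bin/env python
    # On the "NFS guest": add an export and activate it.
    import subprocess

    # Path on this guest (backed by a ZFS disk) and the subnet of the guests
    # that are allowed to mount it - both made-up examples.
    EXPORT_LINE = '/export/shared 192.168.50.0/24(rw,sync,no_root_squash)\n'

    with open('/etc/exports', 'a') as f:
        f.write(EXPORT_LINE)

    # Re-read /etc/exports and activate the new export.
    subprocess.check_call(['exportfs', '-ra'])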

Choosing OVM disks: physical or virtual



Virtual disks

All the benefits of having all disks in one repository, including the possibility of sparse-copy disks: the equivalent of thin provisioning, where only the blocks actually used are allocated, even though the disk has a larger quota/size.
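
To illustrate what sparse means here, the sketch below creates a file with a large apparent size but hardly any allocated blocks, which is exactly how a thin-provisioned virtual disk behaves in the repository (the path is just an example):

    #!/usr/bin/env python
    # Create a sparse file and compare apparent size with allocated blocks.
    import os

    path = '/tmp/example_virtual_disk.img'   # stands in for a disk image in the repo

    with open(path, 'wb') as f:
        f.truncate(20 * 1024 ** 3)   # 20 GiB apparent size, nothing allocated yet
        f.write(b'\0' * 4096)        # writing one block allocates just that block

    st = os.stat(path)
    print('apparent size : %d bytes' % st.st_size)
    print('allocated     : %d bytes' % (st.st_blocks * 512))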

Physical disks

With the drawback of more complex administration (even though most of it can be done from OVM through the plugin), the advantages of ZFS come into play here: snapshots. Ideal if you want snapshot capabilities for (groups of) disks, without forcing them onto your entire OVM repository.

Smart usage of snapshots in OVM

Say that, before patching, you want to snapshot your entire WebCenter deployment across multiple machines with the click of one button. Just keep all involved machines in one ZFS project, that is, all machines in a dedicated OVM repository.

Below is an example of 4 machines, each with a bunch of disks, in one dedicated OVM repo. That repo itself lives on one LUN, in one ZFS project.
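
The "one button" can then be a single, recursive snapshot of the dataset behind that LUN. A minimal sketch using the generic ZFS command line, with made-up host, pool and dataset names; on a ZFS Storage Appliance the same single action is a project-level snapshot through its BUI or CLI:

    #!/usr/bin/env python
    # Snapshot the dataset behind the repository LUN, just before patching.
    import subprocess
    import time

    ZFS_HOST = 'zfs-head'                    # hypothetical storage head
    DATASET = 'pool0/ovm_webcenter_repo'     # hypothetical dataset behind the repo LUN
    SNAPNAME = 'before-patching-%s' % time.strftime('%Y%m%d-%H%M')

    # 'zfs snapshot -r' captures the dataset and everything below it in one
    # consistent step, so all disks of all guests in the repository are frozen
    # at the same point in time.
    subprocess.check_call(
        ['ssh', 'root@%s' % ZFS_HOST,
         'zfs snapshot -r %s@%s' % (DATASET, SNAPNAME)])
    print('created %s@%s' % (DATASET, SNAPNAME))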