vSAN and Predictive DRS, Network-Aware DRS and Proactive HA
vSphere 6.5 saw the release of a number of improvements in the area of DRS. I won’t detail all of the improvements here, since my colleague Brian Graf has done a great job of describing Network-Aware DRS, Predictive DRS and Proactive HA in a number of separate blog posts. Instead, what I want to talk about in this post is how these features interoperate with vSAN, if they do at all. I’ve been asked this question a few times now, so after reaching out to Brian and a number of other resources, I thought it might be useful to share what I learned.
In a nutshell, Network-Aware DRS tries to include network resource utilization alongside CPU and memory when making placement and load-balancing decisions. In the first version, released in vSphere 6.5, the feature looks purely at the utilization of the ESXi host uplinks. It is possible that future versions will have much more granularity and look at specific port groups. The current version is fully supported with vSAN. It won’t make any decisions based on vSAN network utilization though, just on overall ESXi network utilization.
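To make the "overall uplink utilization" point concrete, here is a minimal sketch of the idea in Python. It is not how DRS is implemented. The `Host` class, the `pick_host` function and the 80% threshold are all assumptions I made up for illustration. The thing to notice is that the check only sees one aggregate uplink figure per host, so vSAN traffic is never singled out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    """Hypothetical view of an ESXi host as a placement candidate."""
    name: str
    cpu_free: float     # fraction of CPU capacity still free, 0.0 to 1.0
    mem_free: float     # fraction of memory capacity still free, 0.0 to 1.0
    uplink_util: float  # aggregate uplink utilization (all traffic types), 0.0 to 1.0


# Illustrative threshold only; an assumption for this sketch.
SATURATION_THRESHOLD = 0.80


def pick_host(hosts: List[Host],
              threshold: float = SATURATION_THRESHOLD) -> Optional[Host]:
    """Rank hosts on CPU/memory headroom, then skip network-saturated ones."""
    # Step 1: rank on CPU and memory, using the scarcer of the two resources.
    ranked = sorted(hosts, key=lambda h: min(h.cpu_free, h.mem_free), reverse=True)
    # Step 2: take the best-ranked host whose uplinks are not saturated.
    for host in ranked:
        if host.uplink_util < threshold:
            return host
    # Every host is saturated: fall back to the CPU/memory ranking alone.
    return ranked[0] if ranked else None


hosts = [
    Host("esx1", cpu_free=0.6, mem_free=0.6, uplink_util=0.9),
    Host("esx2", cpu_free=0.4, mem_free=0.5, uplink_util=0.3),
]
chosen = pick_host(hosts)
print(chosen.name)
```

In this example esx1 has more CPU and memory headroom, but its uplinks are busy, so esx2 is chosen. Whether that uplink traffic is vSAN, vMotion or VM traffic makes no difference to the outcome.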
Predictive DRS is about looking at past resource usage patterns, and making placement decisions based on those patterns. For example, if there is a spike in activity at the same time every day, or every week, DRS will try to ensure there are enough resources available for that spike. This feature requires vRealize Operations (vROps). Predictive DRS takes CPU and memory forecast metrics from vROps and pushes them to DRS. Other than ingesting these two metrics and determining whether something should happen beforehand, nothing has changed in the DRS algorithm. Since DRS works with vSAN, Predictive DRS works with vSAN in the same fashion.
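As a rough illustration of what "ingesting the forecast metrics" could look like, the sketch below feeds the larger of the current and the forecast value to an otherwise unchanged balancing step. The `effective_demand` function and the "take the maximum" rule are my own assumptions for the example; all I can say for certain is that CPU and memory forecasts from vROps are pushed to DRS.

```python
from typing import Dict


def effective_demand(current: Dict[str, float],
                     forecast: Dict[str, float]) -> Dict[str, float]:
    """Return, per metric, the larger of current and forecast demand.

    Only the metrics present in `current` are considered; a metric with no
    forecast simply keeps its current value.
    """
    return {
        metric: max(value, forecast.get(metric, value))
        for metric, value in current.items()
    }


# A VM that is quiet now but is forecast to spike on CPU.
now = {"cpu_mhz": 2000.0, "mem_mb": 4096.0}
predicted = {"cpu_mhz": 3500.0, "mem_mb": 2048.0}

demand = effective_demand(now, predicted)
print(demand)
```

Only CPU and memory appear in the inputs, so there is no storage term that vSAN would need to understand.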
To simplify this feature as much as possible, Proactive HA introduces a new state for ESXi hosts called “Quarantine mode”. Quarantine mode differs from maintenance mode in that the host’s resources can continue to be used while the host is quarantined, but no new workloads will be placed on it. However, vSAN does not currently recognize quarantine mode. So even though DRS may evacuate workloads from a quarantined host, or avoid using it for new workloads, vSAN will continue to place components on that host. It is therefore not possible to use vSAN with Proactive HA at this time. This is something that we know about, and we will be looking to address it as quickly as possible.
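The mismatch is easier to see when the two placement decisions are written side by side. The following is a simplified model, not actual DRS or vSAN code. The `HostState` enum and both functions are hypothetical names, and I am assuming for simplicity that a host in maintenance mode receives neither new VMs nor new vSAN components.

```python
from enum import Enum


class HostState(Enum):
    """Hypothetical host states for this sketch."""
    CONNECTED = "connected"
    QUARANTINE = "quarantine"
    MAINTENANCE = "maintenance"


def drs_accepts_new_vms(state: HostState) -> bool:
    # DRS understands quarantine: no new workloads on a quarantined host.
    return state is HostState.CONNECTED


def vsan_accepts_new_components(state: HostState) -> bool:
    # vSAN has no notion of quarantine, so a quarantined host looks like
    # any other healthy host. Only maintenance mode excludes it here.
    return state is not HostState.MAINTENANCE


# States in which the compute and storage layers disagree about the host.
mismatch = [
    state for state in HostState
    if drs_accepts_new_vms(state) != vsan_accepts_new_components(state)
]
print([state.value for state in mismatch])
```

The only state where the two layers disagree is quarantine, which is exactly the gap described above.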
Kudos again to Brian for his help with this.