This week I was in Berlin for our annual Tech Summit in EMEA. This is an event for our field folks in EMEA. I presented a number of VSAN sessions, including a design and sizing session. As part of that session, the topic of VSAN memory consumption was raised. In the past, we’ve only ever really talked about the host memory requirements for disk group configuration as highlighted in this post here. For example, as per the post, to run a fully configured Virtual SAN system, with 5 fully populated disk groups per host, and 7 disks in each disk group, a minimum of 32GB of host memory is needed. This is not memory consumed by VSAN, by the way; the same memory may also be used to run workloads. Consider it a configuration limit, if you will. As per the post above, if hosts have less than 32GB of memory, then we scale back on the number of disk groups that can be created on the host.
To the best of my knowledge, we never shared information about what contributes to memory consumption on VSAN clusters. That is what I plan to talk about in this post.
[Update]: Some pointed out that we have KB article 2113954 that explains this. One should refer to this article for the latest information, as memory consumption is growing as more and more features are added to new vSAN releases. The following formula is based on vSAN version 6.0 and 6.1, and should not be used to calculate memory usage on later versions of vSAN.
To understand memory consumption by Virtual SAN, the following equation may be used:
BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))
Where:
- BaseConsumption: This is the fixed amount of memory consumed by Virtual SAN per ESXi host. This is currently 3GB. This memory is mostly used to house the VSAN directory, per host metadata, and memory caches. When there are more than 16 nodes in a Virtual SAN cluster, the BaseConsumption increases by 300 MB to a total of 3.3 GB.
- NumDiskGroups: This is the number of disk groups in the host, and ranges from 1 to 5.
- DiskGroupBaseConsumption: This is the fixed amount of memory consumed by each individual disk group in the host. This is currently 500MB. This memory is mainly used as a resource to support in-flight operations on a per disk group level.
- SSDMemOverheadPerGB: This is the fixed amount of memory allocated for each GB of SSD capacity. This is currently 2 MB in hybrid systems and is 7 MB for all flash systems. Most of this memory is used for keeping track of blocks in the SSD used for write buffer and read cache.
- SSDSize: Size of the SSD in GB.
Caution: Please note that these numbers are for VSAN 6.0 and VSAN 6.1. These may change with future releases.
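The equation and constants above can be sketched in a few lines of Python. This is only an illustration for VSAN 6.0 and 6.1: the function name vsan_memory_mb and its parameter names are made up for this example, and it assumes 1 GB = 1000 MB, which is the rounding used in the worked examples in this post.

```python
def vsan_memory_mb(num_disk_groups, ssd_size_gb, all_flash=False, cluster_nodes=16):
    """Estimate per-host VSAN 6.0/6.1 memory consumption in MB.

    Hypothetical helper for illustration only; the constants come from
    this post and may not apply to later releases.
    """
    if not 1 <= num_disk_groups <= 5:
        raise ValueError("num_disk_groups must be between 1 and 5")

    # BaseConsumption: 3 GB, plus 300 MB when the cluster has more than 16 nodes.
    base_mb = 3300 if cluster_nodes > 16 else 3000
    # DiskGroupBaseConsumption: 500 MB per disk group.
    disk_group_base_mb = 500
    # SSDMemOverheadPerGB: 2 MB (hybrid) or 7 MB (all flash) per GB of SSD.
    overhead_mb_per_gb = 7 if all_flash else 2

    per_disk_group_mb = disk_group_base_mb + overhead_mb_per_gb * ssd_size_gb
    return base_mb + num_disk_groups * per_disk_group_mb


# One hybrid disk group with a 400GB SSD in a cluster of 16 nodes or fewer.
print(vsan_memory_mb(1, 400) / 1000, "GB")
```

Working in whole MB keeps the arithmetic in integers; the division by 1000 is only there to present the result in GB.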
Now that we understand the requirements, let us run through a few scenarios:
Scenario 1: Let’s look at some working examples where the hosts have more than 32GB of memory per host, the number of hosts in the cluster is 16 or fewer, and the SSD size is 400GB.
Example 1: One disk group per host, hybrid configuration:
BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))
= 3GB + (1 x (500MB + (2MB x 400)))
= 3GB + (500MB + 800MB)
= 3GB + 1.3GB
= 4.3 GB
Example 2: Three disk groups per host, hybrid configuration:
BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))
= 3GB + (3 x (500MB + (2MB x 400)))
= 3GB + (3 x (500MB + 800MB))
= 3GB + (3 x 1.3GB)
= 3GB + 3.9GB
= 6.9 GB
Example 3: One disk group per host, all flash configuration:
BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))
= 3GB + (1 x (500MB + (7MB x 400)))
= 3GB + (500MB + 2800MB)
= 3GB + 3.3GB
= 6.3 GB
Example 4: Three disk groups per host, all flash configuration:
BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))
= 3GB + (3 x (500MB + (7MB x 400)))
= 3GB + (3 x (500MB + 2800MB))
= 3GB + (3 x 3.3GB)
= 3GB + 9.9GB
= 12.9 GB
Scenario 2: Let’s look at some working examples where the hosts have more than 32GB of memory per host, the number of hosts in the cluster is more than 16, and the SSD size is 600GB. When there are more than 16 nodes in a Virtual SAN cluster, the BaseConsumption increases by 300 MB to a total of 3.3 GB.
Example 5: One disk group per host, hybrid configuration:
BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))
= 3.3GB + (1 x (500MB + (2MB x 600)))
= 3.3GB + (500MB + 1200MB)
= 3.3GB + 1.7GB
= 5 GB
Example 6: Three disk groups per host, hybrid configuration:
BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))
= 3.3GB + (3 x (500MB + (2MB x 600)))
= 3.3GB + (3 x (500MB + 1200MB))
= 3.3GB + (3 x 1.7GB)
= 3.3GB + 5.1GB
= 8.4 GB
Example 7: One disk group per host, all flash configuration:
BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))
= 3.3GB + (1 x (500MB + (7MB x 600)))
= 3.3GB + (500MB + 4200MB)
= 3.3GB + 4.7GB
= 8 GB
Example 8: Three disk groups per host, all flash configuration:
BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))
= 3.3GB + (3 x (500MB + (7MB x 600)))
= 3.3GB + (3 x (500MB + 4200MB))
= 3.3GB + (3 x 4.7GB)
= 3.3GB + 14.1GB
= 17.4 GB
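For anyone who wants to cross-check the arithmetic in Scenarios 1 and 2, here is a rough Python sketch that tabulates the same four configurations for a given SSD size and cluster size. The names vsan_memory_mb and scenario_table are invented for this illustration, the constants are the VSAN 6.0/6.1 values from this post, and 1 GB is treated as 1000 MB as in the examples.

```python
def vsan_memory_mb(num_disk_groups, ssd_size_gb, all_flash, cluster_nodes):
    """Per-host VSAN 6.0/6.1 memory estimate in MB (illustrative helper)."""
    base_mb = 3300 if cluster_nodes > 16 else 3000   # BaseConsumption
    overhead_mb_per_gb = 7 if all_flash else 2       # SSDMemOverheadPerGB
    return base_mb + num_disk_groups * (500 + overhead_mb_per_gb * ssd_size_gb)


def scenario_table(ssd_size_gb, cluster_nodes):
    """Return (disk groups, configuration, MB) rows for 1 and 3 disk groups."""
    rows = []
    for all_flash in (False, True):
        for num_disk_groups in (1, 3):
            label = "all flash" if all_flash else "hybrid"
            mb = vsan_memory_mb(num_disk_groups, ssd_size_gb, all_flash, cluster_nodes)
            rows.append((num_disk_groups, label, mb))
    return rows


# Scenario 1: 400GB SSD, 16 nodes or fewer. Scenario 2: 600GB SSD, more than 16 nodes.
for title, ssd_size_gb, nodes in (("Scenario 1", 400, 16), ("Scenario 2", 600, 17)):
    print(title)
    for num_disk_groups, label, mb in scenario_table(ssd_size_gb, nodes):
        print(f"  {num_disk_groups} disk group(s), {label}: {mb / 1000} GB")
```

The node counts 16 and 17 are just convenient stand-ins for "16 or fewer" and "more than 16"; any values on either side of that threshold give the same result.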
Scenario 3: Finally, let’s look at some examples where a host has less than 32GB of memory. In systems with less than 32GB of RAM, the amount of memory used will be scaled down linearly according to the formula (SystemMemory / 32) where SystemMemory is the amount of memory in the system in GB. Thus, if the system has 16 GB of RAM, the amount of memory consumed will be 1/2 of the output of the formula used to compute memory consumption. If the system has 8 GB, it will be scaled down to 1/4.
Let’s assume that the host has 16GB of memory, the number of hosts in the cluster is 16 or fewer, and the SSD size is 400GB.
Example 9: One disk group per host, hybrid configuration:
(BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))) * (SystemMemory / 32)
= (3GB + (1 x (500MB + (2MB x 400)))) * 0.5
= (3GB + (500MB + 800MB)) * 0.5
= (3GB + 1.3GB) * 0.5
= 4.3 GB * 0.5
= 2.15 GB
Example 10: Three disk groups per host, hybrid configuration:
(BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))) * (SystemMemory / 32)
= (3GB + (3 x (500MB + (2MB x 400)))) * 0.5
= (3GB + (3 x (500MB + 800MB))) * 0.5
= (3GB + (3 x 1.3GB)) * 0.5
= (3GB + 3.9GB) * 0.5
= 6.9 GB * 0.5
= 3.45 GB
Example 11: One disk group per host, all flash configuration:
(BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))) * (SystemMemory / 32)
= (3GB + (1 x (500MB + (7MB x 400)))) * 0.5
= (3GB + (500MB + 2800MB)) * 0.5
= (3GB + 3.3GB) * 0.5
= 6.3 GB * 0.5
= 3.15 GB
Example 12: Three disk groups per host, all flash configuration:
(BaseConsumption + (NumDiskGroups x (DiskGroupBaseConsumption + (SSDMemOverheadPerGB x SSDSize)))) * (SystemMemory / 32)
= (3GB + (3 x (500MB + (7MB x 400)))) * 0.5
= (3GB + (3 x (500MB + 2800MB))) * 0.5
= (3GB + (3 x 3.3GB)) * 0.5
= (3GB + 9.9GB) * 0.5
= 12.9 GB * 0.5
= 6.45 GB
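The scale-down for hosts with less than 32GB of memory can be sketched the same way. As before, this is an illustration rather than anything official: the function name scaled_vsan_memory_mb and its parameters are invented for this example, the constants are the VSAN 6.0/6.1 values from this post, and the sketch assumes the (SystemMemory / 32) factor is applied only below 32GB, with no scaling at or above it.

```python
def scaled_vsan_memory_mb(num_disk_groups, ssd_size_gb, all_flash,
                          system_memory_gb, cluster_nodes=16):
    """Per-host VSAN 6.0/6.1 memory estimate in MB, scaled for small hosts."""
    base_mb = 3300 if cluster_nodes > 16 else 3000   # BaseConsumption
    overhead_mb_per_gb = 7 if all_flash else 2       # SSDMemOverheadPerGB
    full_mb = base_mb + num_disk_groups * (500 + overhead_mb_per_gb * ssd_size_gb)

    # Hosts with less than 32GB of memory scale linearly by SystemMemory / 32;
    # hosts with 32GB or more use the unscaled figure (assumption of this sketch).
    scale = min(1.0, system_memory_gb / 32)
    return full_mb * scale


# The four 16GB-host configurations with a 400GB SSD, 16 nodes or fewer.
for num_disk_groups, all_flash in ((1, False), (3, False), (1, True), (3, True)):
    label = "all flash" if all_flash else "hybrid"
    mb = scaled_vsan_memory_mb(num_disk_groups, 400, all_flash, system_memory_gb=16)
    print(f"{num_disk_groups} disk group(s), {label}: {mb / 1000} GB")
```

Passing system_memory_gb=8 instead gives one quarter of the unscaled figure, in line with the 1/4 factor described for an 8 GB host.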
That completes the set of examples. From this, you should be able to calculate VSAN memory overhead. Once again, the considerations are as follows:
- VSAN scales back on its memory usage when hosts have less than 32GB of memory
- VSAN consumes additional memory when the number of nodes in the cluster is greater than 16
- All Flash VSAN consumes additional memory resources compared to hybrid configurations