
Wednesday, June 29, 2016

SDRS Q&A



Question 1: Can SDRS violate the space threshold?
Answer: Yes, SDRS may violate the space threshold when no datastore in the cluster is below it. The storage space threshold is only a threshold (a soft limit) used by SDRS for balancing and defragmentation; it is not a hard limit. SDRS tries to keep free space on datastores according to the space threshold, but it does not guarantee that datastores will always have some amount of free space. SDRS affinity rules can also lead to threshold violations.
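A minimal sketch of that soft-limit behavior (my own illustration, not VMware code): when every datastore is above the threshold, a placement still has to land somewhere, so the threshold is necessarily violated.

    #include <cstdio>
    #include <vector>

    // Hypothetical illustration: prefer the least-utilized datastore, but when
    // every datastore is above the space threshold, placement still succeeds
    // and the soft limit is violated.
    int PickDatastore(const std::vector<double>& utilization, double threshold) {
       int best = 0;
       for (int i = 1; i < (int)utilization.size(); i++) {
          if (utilization[i] < utilization[best]) {
             best = i;
          }
       }
       if (utilization[best] > threshold) {
          printf("all datastores above threshold; soft limit violated\n");
       }
       return best;
    }

    int main() {
       // Every datastore is above the 0.8 threshold; index 1 is still picked.
       printf("picked datastore %d\n", PickDatastore({0.92, 0.85, 0.88}, 0.8));
       return 0;
    }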

Question 2: Is the VM swap file considered by SDRS?
Answer: The SDRS initial placement algorithm does not take VM swap file capacity into account. However, subsequent rebalance calculations are based on the space usage of all datastores, so if a virtual machine is powered on and has a swap file, the swap file is counted toward the total space usage.
More information: The swap file size depends on the VM's configured RAM and its memory reservation. If the reserved RAM equals the RAM assigned to the VM, there will be no swap file for that VM. There is also a way to dedicate one of the datastores as the swap file datastore, where the swap files of all the VMs are stored.
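As a rough illustration of the sizing rule just described (a minimal sketch; the helper name and values are hypothetical, not from VMware code):

    #include <cstdio>

    // Hypothetical helper illustrating the rule above:
    // swap file size = configured RAM - reserved RAM (never negative).
    static long SwapFileSizeMB(long configuredRamMB, long reservedRamMB) {
       long unreserved = configuredRamMB - reservedRamMB;
       return unreserved > 0 ? unreserved : 0;
    }

    int main() {
       printf("%ld MB\n", SwapFileSizeMB(8192, 2048));   // 6144: 8 GB VM, 2 GB reserved
       printf("%ld MB\n", SwapFileSizeMB(8192, 8192));   // 0: fully reserved, no swap file
       return 0;
    }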
SDRS uses the construct “DrmDisk” as the smallest entity it can migrate. This means that SDRS creates a DrmDisk for each VMDK belonging to the VM. The interesting part is how it handles the collection of system files and the swap file belonging to the VM: SDRS creates a single DrmDisk representing all the system files. If, however, an alternate swap file location is specified, the vSwap file is represented as a separate DrmDisk, and SDRS is disabled on that swap DrmDisk.
Example: For a VM with 2 VMDKs and no alternate swap file location specified, SDRS creates 3 DrmDisks, as sketched after this list:
1.     A separate DrmDisk for each VM disk file (two in this example)
2.     A DrmDisk for the system files (VMX, swap, logs, etc.)
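A minimal sketch of that decomposition (DrmDisk here is my own simplified struct, not VMware's internal type):

    #include <cstdio>
    #include <string>
    #include <vector>

    // Simplified stand-in for the internal DrmDisk construct described above.
    struct DrmDisk {
       std::string name;
       bool sdrsEnabled;
    };

    // Build one DrmDisk per VMDK, plus one DrmDisk for the system files.
    // If an alternate swap location is set, the swap file becomes its own
    // DrmDisk with SDRS disabled on it.
    std::vector<DrmDisk> BuildDrmDisks(const std::vector<std::string>& vmdks,
                                       bool alternateSwapLocation) {
       std::vector<DrmDisk> disks;
       for (const auto& vmdk : vmdks) {
          disks.push_back({vmdk, true});
       }
       // System files (VMX, logs, and the swap file unless relocated).
       disks.push_back({"system files", true});
       if (alternateSwapLocation) {
          disks.push_back({"vSwap", false});   // SDRS disabled on the swap DrmDisk
       }
       return disks;
    }

    int main() {
       std::vector<DrmDisk> disks = BuildDrmDisks({"disk1.vmdk", "disk2.vmdk"}, false);
       printf("%zu DrmDisks\n", disks.size());   // prints: 3 DrmDisks
       return 0;
    }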

The technical details above show that the swap file is considered for load balancing when a VM is powered on and the swap file is located in the same directory as the VM's other disks.
Question 3: Which VM files does SDRS consider in initial placement and in subsequent rebalance calculations?
Answer: SDRS has a concept of 'system files' even during initial placement; 'system files' includes the VM configuration file (the VMX), snapshot files, etc. The estimated size may not be 100% accurate, but system files are taken into consideration for initial placement. Both initial placement and rebalance take all of the VM's system files and snapshot files into consideration.

Question 4: How does initial placement of a VM with multiple disks treat the disks – is the calculation per VM, or per individual disk?
Answer: Disks are considered individually, subject to the VM's disk affinity: they can be placed on the same datastore or on different datastores, but each disk is evaluated on its own.

Question 5: In a healthy, balanced environment I would expect SDRS rebalances to occur only at each interval (8 hours or whatever is selected). We were seeing SDRS rebalancing happening during an initial deploy; my suspicion was that this was due to imbalance, moving VMs in order to “fit” the new VM in. Can you confirm when we would expect rebalancing to occur – should it be at the interval, and outside that only if balancing is required to “fit” a VM in – or is there any other scenario that could account for this behaviour?

Answer: Rebalancing happens 1) at the regular interval (default 8 hours); 2) when a threshold violation is detected, as above; 3) when a user requests a configuration change; 4) on an API call, e.g. clicking “Run Storage DRS” in the client.
If a datastore threshold is crossed, we will rebalance, but we are conservative: the cost of a Storage vMotion is high and we don't want to penalize other VMs, so the behavior is geared toward not doing too many svMotions.
Initial deployment itself does not trigger a load balance run, but it can generate a placement recommendation with prerequisite svMotion recommendations (to make room for the VM that is to “fit” in). That said, in past releases a threshold violation could trigger excessively frequent load balance runs. That issue will be fixed in the vSphere 6.0 Update 3 and vSphere 2016 releases.

Question 6: For a sample message like the one below, can you point me to the equations used to derive the values 0.961178 and 0.9?
2016-05-17T08:25:34.586+02:00 info vpxd[06784] [Originator@6876 sub=MoDatastore opID=HB-host-297603@165862-40f6dc1c] [CheckForThresholdViolationInt] Datastore LIT005_032 utilization(0.961178) > threshold(0.9); scheduling SDRS
Answer: Such a message is generated when the used disk space of a datastore is greater than its threshold. Both values are fractions of capacity. The former is the actual used disk space on the datastore divided by its capacity:
   double utilization = (double) dsUsedSpace / dsCapacity;
The latter is the space threshold value that has been set for the datastore cluster.
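Putting the two values together, here is a minimal self-contained sketch of the check (the capacity and usage numbers are hypothetical, chosen to roughly reproduce the logged utilization):

    #include <cstdio>

    int main() {
       double dsCapacity = 1048576.0;    // hypothetical 1 TB datastore, in MB
       double dsUsedSpace = 1007870.0;   // hypothetical usage, in MB
       double threshold = 0.9;           // 90% space threshold

       double utilization = dsUsedSpace / dsCapacity;   // ~0.96118
       if (utilization > threshold) {
          // Mirrors the vpxd log line quoted in the question.
          printf("utilization(%f) > threshold(%f); scheduling SDRS\n",
                 utilization, threshold);
       }
       return 0;
    }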

Question 7: If I start multiple VM deployments (either cloneVM or createVM operations) from vRA, how does SDRS process each request?
Answer: SDRS uses the RecommendDatastores() API for initial placement requests, and this API processes one VM at a time. For any given cluster, calls to this API are processed sequentially, regardless of whether they are for cloning a VM, creating a VM, or another type of operation.
Additional information: SDRS is an intelligent engine that prepares placement recommendations for initial placement as well as recommendations for continuous load balancing (based on space and I/O load). That means other software components (the C# Client, Web Client, PowerCLI, vRealize Automation, vCloud Director, etc.) are responsible for the actual initial placement provisioning, and SDRS gives them recommendations on the best place to put new storage objects (VMDK files or VM system files).
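A minimal sketch of the per-cluster serialization described above (the lock, names, and structure are my own illustration, not VMware's implementation):

    #include <cstdio>
    #include <mutex>
    #include <string>

    // Hypothetical illustration only: initial placement requests against a
    // given datastore cluster are handled one at a time, whatever operation
    // (clone, create, ...) produced them.
    struct DatastoreCluster {
       std::mutex placementLock;   // serializes placements for this cluster
    };

    std::string RecommendDatastoresFor(DatastoreCluster& cluster,
                                       const std::string& vmName) {
       std::lock_guard<std::mutex> guard(cluster.placementLock);
       // ... run the SDRS placement algorithm for this single VM ...
       printf("placing %s\n", vmName.c_str());
       return "recommended-datastore";
    }

    int main() {
       DatastoreCluster pod;
       RecommendDatastoresFor(pod, "vm-01");   // processed first
       RecommendDatastoresFor(pod, "vm-02");   // processed after vm-01 completes
       return 0;
    }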

Question 8: With I/O thresholds turned off, is it expected that the decision is based only on free space – i.e. should we always pick the datastore with the most free space, or do we account for other things? The motivation for this question is that we have noted it is not always the datastore with the most free space that gets selected since I/O thresholds were turned off.
Answer: Rebalance and initial placement decisions are based on free space, but also on the configured affinity/anti-affinity rules, the growth rate of the VMDKs, etc. SDRS need not always pick the datastore with the most free space. When selecting a datastore, initial placement takes both DRS and SDRS threshold metrics into account; it selects the host with the least utilization and the highest connectivity to place the VM.

Question 9: How are simultaneous initial placement requests handled? The customer scenario was: they requested initial placement for 2 VMs (2 VMDKs) on the same datastore (not sure how it was selected), but that datastore had space for only one VMDK. SDRS recommended the same datastore for both VMDKs, and eventually one of the VMDKs failed with an insufficient-space fault.
Answer: We don't support truly simultaneous initial placement requests. The RecommendDatastores API accepts one VM as the input parameter, and when calling the API for placement you can't specify a datastore in the input spec.
Multiple-VM provisioning can behave less deterministically because of other SDRS calculation factors (I/O load, space load, and the growth rate of the disk in the case of thin-provisioned disks), and also because of the particular provisioning workflow – the exact timing of when the SDRS recommendation is requested versus when the datastore space is actually consumed. Recall that a datastore's reported free capacity is one of the main factors in the next SDRS recommendation.
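A minimal sketch of that timing race (the capacities are hypothetical; this is my illustration of the scenario, not VMware code):

    #include <cstdio>

    int main() {
       double freeMB = 60000.0;    // datastore's reported free capacity
       double vmdkMB = 40000.0;    // each placement request needs ~40 GB

       // Request A: recommendation is computed from reported free capacity.
       bool fitsA = vmdkMB <= freeMB;
       // Request B is recommended before A's disk is actually written, so it
       // still sees the same reported free capacity and also "fits".
       bool fitsB = vmdkMB <= freeMB;
       // A's provisioning then consumes the space; B later fails with an
       // insufficient-space fault.
       freeMB -= vmdkMB;
       printf("A fits: %d, B fits: %d, free after A: %.0f MB\n",
              fitsA, fitsB, freeMB);
       return 0;
    }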
Question 10: The datastore selected by SDRS was unpredictable – if anything, it seemed to favor the smaller datastores. (We disabled the I/O metric, as I assumed that was the cause – smaller datastores having smaller I/O – and we also added storage, as usage of around 90% would account for many problems.)
I am finding it difficult to find information on the balancing algorithm – the main source I am using is below, but it is quite old (https://wiki.eng.vmware.com/DRSMN/Storage-IO-LoadBalancing). Is this still relevant with 6.x, and is there any newer information?

Answer: Yes, the above resource still holds good even though it looks old; we haven't changed the core logic of the SDRS algorithm, only fixed some problems. We have soft constraints based on storage profiles, the space threshold, HBR replication, SRM, etc., and we also look at expected space growth, I/O saturation, and the space threshold. Overall, many factors contribute to the “goodness” value calculated for the datastore to be recommended.
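To make the multi-factor idea concrete, here is a heavily simplified toy score (the factor set, weights, and numbers are mine; the real SDRS calculation is internal and different):

    #include <cstdio>

    // Hypothetical per-datastore inputs to a goodness calculation.
    struct DatastoreStats {
       double freeSpaceFrac;      // fraction of capacity that is free
       double ioLatencyMs;        // observed I/O latency
       double expectedGrowthFrac; // expected space growth of resident thin disks
       bool violatesSoftRule;     // profile/replication soft-constraint violation
    };

    // Toy goodness score (higher is better), showing only that several factors
    // besides free space contribute to the recommendation.
    double Goodness(const DatastoreStats& ds) {
       double score = ds.freeSpaceFrac;
       score -= 0.01 * ds.ioLatencyMs;   // penalize I/O saturation
       score -= ds.expectedGrowthFrac;   // penalize expected growth
       if (ds.violatesSoftRule) {
          score -= 0.5;                  // penalize soft-rule violations
       }
       return score;
    }

    int main() {
       DatastoreStats a = {0.40, 5.0, 0.05, false};
       DatastoreStats b = {0.55, 30.0, 0.10, true};   // more free space, worse overall
       printf("a=%.2f b=%.2f\n", Goodness(a), Goodness(b));   // a=0.30 b=-0.35
       return 0;
    }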
Question 11: What are the soft constraints in SDRS?
Answer: Soft constraints, or soft rules, are used by SDRS to determine which rules should be dropped if no ideal match is available for initial placement. There are multiple categories of soft rules. If a user is using SRM and has placed disks on a datastore that is part of a consistency group, we would ideally like to move those disks only to a datastore that is part of the same consistency group. Another use case relates to storage profiles: if a user wants to place a VMDK on, say, Storage-Profile1, we attempt to place it on a datastore that can satisfy Storage-Profile1. So when the ideal placement is not possible given the hard rules (affinity and anti-affinity rules), we start to drop soft constraints in order of severity and re-run the algorithm to find a better match.
Soft constraints are constraints that can be dropped during the initial placement and datastore maintenance workflows in a second run, when we fail to make a recommendation in the first run. SDRS will try to correct soft rule violations during the load balancing run. The constraint categories are listed below; a sketch of the drop-and-retry loop follows the list.
SOFT_CONSTR_STOR_OVRHD_VERY_HIGH,    // SRM protected datastore->nonprotected
SOFT_CONSTR_STOR_OVRHD_HIGH,               // SRM protected1 datastore->protected2
SOFT_CONSTR_STOR_OVRHD_MEDIUM,       // SRM replication group1->group2
SOFT_CONSTR_STOR_OVRHD_TRIVIAL,         // SRM replicated1 datastore->replicated2
SOFT_CONSTR_STORAGE_PROFILE,                // Across different storage profiles
SOFT_CONSTR_SPACE_THRESH,                      // Space threshold violation
SOFT_CONSTR_IO_RESERV,                             // Honor IO reservations when balancing
SOFT_CONSTR_DATASTORETAG,                   // Across datastore dedup/TP pool
SOFT_CONSTR_CORRELATION,                      // Across correlated datastores
SOFT_CONSTR_STOR_OVRHD_INFO,           // SRM nonprotected->nonprotected
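A minimal sketch of that drop-and-retry behavior, using simplified stand-ins for a few of the categories above (the ordering and retry logic are my own simplification; the real algorithm is internal to SDRS):

    #include <cstdio>
    #include <vector>

    // Simplified stand-ins for a few of the categories listed above, kept in
    // severity order (most severe first); the full ordering is internal to SDRS.
    enum SoftConstraint {
       STOR_OVRHD_VERY_HIGH,
       STORAGE_PROFILE,
       SPACE_THRESH,
    };

    // Hypothetical placement attempt that honors the hard rules plus the
    // currently active soft constraints.
    bool TryPlacement(const std::vector<SoftConstraint>& active) {
       // ... run the placement algorithm ...
       return active.size() <= 1;   // pretend success only after two drops
    }

    int main() {
       std::vector<SoftConstraint> active = {
          STOR_OVRHD_VERY_HIGH, STORAGE_PROFILE, SPACE_THRESH,
       };
       // Drop the least severe remaining constraint and re-run until a
       // recommendation can be made (hard rules are never dropped).
       while (!TryPlacement(active) && !active.empty()) {
          printf("dropping soft constraint %d, retrying\n", (int)active.back());
          active.pop_back();
       }
       return 0;
    }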


Question 12: Can we get more detail on this? I was under the impression that only the I/O and space thresholds were accounted for – can we get details of how SRM and HBR are accounted for as well (or are they sub-components of the I/O calculation)? Also, is there a threshold priority – for example, if both the I/O threshold and the space threshold cannot be satisfied on one datastore, which threshold would SDRS drop first in order to try to place the VM?
Answer: SRM and HBR are not considered in the I/O calculations, but they are considered so as not to break consistency groups or replication availability. The space threshold is dropped first; the I/O threshold is more important, as it affects the existing VMs on that datastore.
For more details on SDRS interop with SRM and HBR (VR), refer to: http://www.yellow-bricks.com/2015/02/09/what-is-new-for-storage-drs-in-vsphere-6-0/
Either threshold violation (space or I/O) will cause SDRS to run the load balancing algorithm, and SDRS will try its best to correct it. When SDRS runs, it is possible that both the space and I/O thresholds are violated, and SDRS will try to correct both of them. Correction is not guaranteed to be successful.

Question 13: Are the SDRS I/O metric and SIOC the same thing? (Optional)
Answer: No, SIOC != the SDRS I/O metric. SIOC can be used without SDRS enabled.
There is a component of SIOC (the sdrsinjector) that is used for stats collection; SDRS uses those stats for I/O load balancing.
For more details on SIOC (Storage I/O Control): http://www.vmware.com/in/products/vsphere/features/storage-io-control
Question 14: Is it recommended to have a datastore cluster where all the datastores are connected to all the contributing hosts? (Optional)
Answer: Yes, a fully connected datastore cluster is recommended (i.e. a pod that contains only datastores available to all contributing ESXi hosts). Partially connected datastores can be added to an SDRS cluster as well, but they impose mobility constraints on SDRS from the initial placement and load balancing perspectives. SDRS always prefers fully connected datastores.
Question 15: How are thin-provisioned VMDKs considered by SDRS? (Optional)
Answer: A VMFS datastore accurately reports 'committed', 'uncommitted' and 'unshared' blocks. An NFS datastore is by default always thin-provisioned, as we do not know how the NFS server allocates blocks.
Thin-provisioned and thick-provisioned disks use the same calculated space and I/O metrics. One additional aspect we use during load balancing is the disk's growth rate.


NOTE - from Sarat Kakarla <skakarla@vmware.com>
The only thing I would like to add to the final doc concerns the space calculation for swap space: during initial placement, reserved memory is added to committedMB and the remaining space is added to uncommittedMB. After that, when calculating the entitled space requirement, the following formula is used.

drmStorageIO.cpp
    int idlePercentInt =
        GetModule()->OptVal(DRM_OPT_PERCENT_IDLE_MB_IN_SPACE_DEMAND);
    ASSERT(0 <= idlePercentInt && idlePercentInt <= 100);
    _entitledMB = int(vd->GetCommittedMB() +
                      vd->GetUncommittedMB() * (idlePercentInt / 100.0));

By default DRM_OPT_PERCENT_IDLE_MB_IN_SPACE_DEMAND is set to 25, which means 25% of the swap space is accounted for in the entitled space; the same applies to thin-provisioned space.
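A worked example of that formula with hypothetical numbers (a standalone rewrite of the internal snippet above, not the actual SDRS code):

    #include <cstdio>

    int main() {
       // Hypothetical VM: 20 GB thick disk, 8 GB RAM with a 2 GB reservation.
       int committedMB = 20480 + 2048;    // disk blocks + reserved memory
       int uncommittedMB = 8192 - 2048;   // unreserved memory -> potential swap
       int idlePercentInt = 25;           // DRM_OPT_PERCENT_IDLE_MB_IN_SPACE_DEMAND

       // Entitled space = committed + 25% of uncommitted (swap/thin space).
       int entitledMB = int(committedMB + uncommittedMB * (idlePercentInt / 100.0));
       printf("entitledMB = %d\n", entitledMB);   // 22528 + 1536 = 24064
       return 0;
    }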
