I have been asked about vSAN encryption a lot lately, as security requirements and standards keep tightening across the IT industry. Data encryption is a key factor in making sure mission-critical data is not stolen. Here is a quick and simple guide to configuring vSAN encryption with a HyTrust appliance, along with best practices for deploying and configuring an encryption cluster with high availability.
Introduction to vSAN Encryption
Here is a pictorial representation of the vSAN encryption workflow, showing the handshakes that take place before the physical disks used by vSAN are finally encrypted. You may also refer to the official VMware page for additional information.
KEK retrieval process
- vCenter (VC) establishes mutual trust with the KMS server over an SSL/TLS channel using the KMIP protocol
- The KMS certificate, once uploaded to VC, is stored in the VC trusted certificate store
- VC requests a KEK from the KMS server
- The KMS generates a KEKID for the vSAN cluster
- VC pushes the KMS certificate, together with the KEKID, to all hosts that are part of the vSAN cluster
- Hosts contact the KMS server using the KEKID and retrieve the KEK
- Hosts store the KMS certificate under /etc/vmware/ssl
- The KMS configuration is stored in /etc/vmware/esx.conf
Note: Once the initial configuration has been pushed down to the ESXi hosts that participate in the encrypted vSAN cluster, VC does not need to stay in the communication path between the KMS servers and the ESXi hosts. VC is required only for the initial configuration.
Encrypting data using DEK
- The host stores the KEK in RAM, in a protected area called the key cache
- A DEK is generated randomly for each vSAN disk, and the KEK is used to encrypt these DEKs
- The DEK is held temporarily in the host key cache alongside the KEK and is used to encrypt plaintext data
- The key cache safeguards the KEK and DEK from unauthorized access
- LSOM is the component that retrieves the DEK from host memory, encrypts and decrypts the data, and places the data on the disks
- The KEK encrypts the DEK, and the encrypted DEK is stored persistently in the vSAN datastore
- A host key is provided by the KMS server per vSAN cluster
- If an ESXi host crashes, a coredump is generated; because the host keeps the KEK and DEK in RAM, the coredump will contain both keys
- To prevent unauthorized access, the host key is used to encrypt the coredump
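The KEK/DEK relationship described above is a form of envelope encryption: data is encrypted with the DEK, and only a KEK-wrapped copy of the DEK is persisted. The following is a minimal sketch of that pattern only. The cipher here is a toy SHA-256 keystream chosen so the example runs with the Python standard library; it is an assumption for illustration and is not the algorithm vSAN or the KMS actually uses. The function names (`toy_encrypt`, `toy_decrypt`) are hypothetical.

```python
import hashlib
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 over key, nonce and a counter). Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt by XOR-ing with the toy keystream; a random nonce is prepended."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Reverse toy_encrypt: split off the nonce, regenerate the keystream, XOR."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


# In the article the KEK is retrieved from the KMS; here it is simply random.
kek = secrets.token_bytes(32)

# One randomly generated DEK per disk.
dek = secrets.token_bytes(32)

# Data is encrypted with the DEK; only the KEK-wrapped DEK is persisted.
ciphertext = toy_encrypt(dek, b"block of VM data")
wrapped_dek = toy_encrypt(kek, dek)

# Later (for example after a reboot): fetch the KEK, unwrap the DEK, read data.
recovered_dek = toy_decrypt(kek, wrapped_dek)
plaintext = toy_decrypt(recovered_dek, ciphertext)
```

The point of the design is that the data on disk never needs to be re-encrypted just because the key held by the KMS changes; only the small wrapped DEK depends on the KEK.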
In case of a compromising situation
We can have the KEK and the DEKs regenerated through vCenter; however, this is an exhaustive and time-intensive process.
Steps to deploy and configure a KMS HA cluster with HyTrust KMS
In this deployment we will configure a HyTrust KeyControl virtual appliance, version 4.2, with a vSAN cluster running vSAN 6.7. The same procedure applies to a vSAN 6.6.x cluster as well.
- Make sure you deploy the KMS node on non-encrypted storage, and definitely not on the same cluster whose data we are about to encrypt. Doing so leads to a catch-22: the hosts need the KMS to unlock the disks, but the KMS itself would live on those locked disks.
- Verify that the systems you want to use meet the basic system requirements
- Configure the first KeyControl node
- Initialize KeyControl through the KeyControl webGUI for the first node
- If desired, install additional KeyControl nodes and join them to the cluster. The number of nodes you can install is dictated by your KeyControl license. It is always good practice to have at least two appliances in an HA configuration. vCenter allows up to 6 appliances to be added to a single KMS cluster.
Start Deploying HyTrust KeyControl 4.2 OVA
1. Fill in the OVA deployment parameters ⇒ complete the deployment ⇒ power on the VM ⇒ complete the initial configuration on the console.
2. Head over to the web page and log in with the user/password “secroot/secroot”. Remember that this is not the password that was set up on the console, and it will need to be changed after the first login on the web page.
3. Accept all agreements and set up a new web GUI password.
4. You may choose to configure email alerts from the KMS appliance. I chose to skip this.
5. Enable KMIP on the KMS server and click Proceed. If prompted, continue with Overwrite, since we are configuring this KMS cluster for the first time. Also make a note of the port number.
6. Create a client-side certificate so that VC can communicate with the KMS server.
7. Add the KMS server to vCenter and establish trust between the KMS certificate and the vCenter server.
8. HyTrust relies on uploading the KMS certificate and private key to the vCenter server so that the KMS trusts the vCenter server. Hence we need to download the client certificate (KMS certificate and private key) and upload it to the vCenter server.
Note: The PEM file obtained from the KMS contains both of the required items. The same file is used for both the KMS certificate section and the private key section.
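Before adding the KMS server to vCenter in step 7, it can help to confirm that the KMIP port noted in step 5 is reachable over the network. Below is a minimal sketch of such a check. The helper name `kmip_port_reachable` and the host name in the usage comment are hypothetical; the default of 5696 is the IANA-registered KMIP port and is an assumption here, so use whatever port the KeyControl KMIP page actually shows. This only tests TCP reachability and does not perform a TLS or KMIP handshake.

```python
import socket


def kmip_port_reachable(host: str, port: int = 5696, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example usage (hypothetical host name):
# print(kmip_port_reachable("kms01.example.local", 5696))
```

A `False` result usually points at a firewall rule, a wrong port, or KMIP not yet being enabled on the appliance.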
Deploy the secondary KMS appliance for HA-clustering
1. Create a 16-character passphrase and authenticate connectivity from the web GUI on the primary node.
2. Add the secondary node to the existing KMS-cluster.
[root@enc-h1:~] vdq -iH
Mappings:
   DiskMapping:
      SSD: naa.6000c295967afd2c6e6f2a7a862bf998
      MD: naa.6000c29b928af4bfa3fc109492da454e

[root@enc-h1:~] localcli vsan storage list |grep -i 'Device\|In CMMDS\|VSAN UUID\|VSAN Disk Group UUID\|Is Capacity Tier' |sed 'N;N;N;;N;s/\n//g';
Device: naa.6000c29b928af4bfa3fc109492da454e VSAN UUID: 5218a577-d999-0cbc-a084-33d29473cf79 VSAN Disk Group UUID: 524e53e9-7e08-30e6-61c0-017bc8a01ad0 In CMMDS: true Is Capacity Tier: true
Device: naa.6000c295967afd2c6e6f2a7a862bf998 VSAN UUID: 524e53e9-7e08-30e6-61c0-017bc8a01ad0 VSAN Disk Group UUID: 524e53e9-7e08-30e6-61c0-017bc8a01ad0 In CMMDS: true Is Capacity Tier: false

[root@enc-h1:~] localcli vsan storage list |grep -i encryption
Encryption: true
Encryption: true

Host Live vmkernel.log during DG creation :
2018-08-24T02:30:29.123Z cpu0:2099983 opID=25e20366)Encryption enabled.
2018-08-24T02:30:29.123Z cpu0:2099983 opID=25e20366)StorageEfficiency enabled.
2018-08-24T02:30:29.123Z cpu0:2099983 opID=25e20366)Config: 862: "LicensedFeatures" = "vit,allflash,stretchedcluster,erasurecoding,storageefficiency,encryption", Old value: "allflash,stretchedcluster" (Status: 0x0)
2018-08-24T02:36:51.221Z cpu0:2110224)CpuSched: 693: user latency of 2112402 VSAN_0x43045b04df28_PLOG 0 changed by 2110224 vsanmgmtd-worke -6
2018-08-24T02:36:51.281Z cpu0:2110224)Created VSAN Slab LSOM_IORETRY_EncSlab (objSize=65536 align=64 minObj=0 maxObj=819 overheadObj=0 minMemUsage=0k maxMemUsage=55692k)
2018-08-24T02:36:51.314Z cpu0:2110224)LSOMCommon: IORETRY_Create:2570: An IORETRY queue for diskUUID 524e53e9-7e08-30e6-61c0-017bc8a01ad0 (0x4309dd9ac200) is encrypted
2018-08-24T02:36:51.328Z cpu0:2110224)LSOMCommon: IORETRY_Create:2585: An IORETRY queue for diskUUID 524e53e9-7e08-30e6-61c0-017bc8a01ad0 (0x4309dd9b9a90) is NOT encrypted
2018-08-24T02:36:52.331Z cpu0:2110224)PLOG: PLOGAnnounceSSD:7268: Successfully added VSAN SSD (naa.6000c295967afd2c6e6f2a7a862bf998:2) with UUID 524e53e9-7e08-30e6-61c0-017bc8a01ad0. kt 0, en 1, enC 1.
2018-08-24T02:36:52.331Z cpu1:2112402)CpuSched: 693: user latency of 2112411 VSAN_0x43045b477e58_LSOMLLOG 0 changed by 2112402 VSAN_0x43045b04df28_PLOG -6
2018-08-24T02:36:52.339Z cpu0:2110224)LSOMCommon: IORETRY_Create:2570: An IORETRY queue for diskUUID 5218a577-d999-0cbc-a084-33d29473cf79 (0x4309de580610) is encrypted
2018-08-24T02:36:52.352Z cpu0:2110224)LSOMCommon: IORETRY_Create:2585: An IORETRY queue for diskUUID 5218a577-d999-0cbc-a084-33d29473cf79 (0x4309de6905d0) is NOT encrypted
2018-08-24T02:36:53.356Z cpu1:2110224)PLOG: PLOGInitAndAnnounceMD:7737: Successfully announced VSAN MD (naa.6000c29b928af4bfa3fc109492da454e:2) with UUID: 5218a577-d999-0cbc-a084-33d29473cf79. kt 0, en 1, enC 1.
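If you collect the joined `localcli vsan storage list` output from several hosts, a small script can turn it into a per-disk-group summary. The sketch below parses the one-line-per-device format shown in the host output above; the sample text is copied from that output, while the function names `parse_disks` and `summarize` are hypothetical helpers, not part of any VMware tooling.

```python
import re

# Sample copied from the host output in this post (one device per line).
SAMPLE = """\
Device: naa.6000c29b928af4bfa3fc109492da454e VSAN UUID: 5218a577-d999-0cbc-a084-33d29473cf79 VSAN Disk Group UUID: 524e53e9-7e08-30e6-61c0-017bc8a01ad0 In CMMDS: true Is Capacity Tier: true
Device: naa.6000c295967afd2c6e6f2a7a862bf998 VSAN UUID: 524e53e9-7e08-30e6-61c0-017bc8a01ad0 VSAN Disk Group UUID: 524e53e9-7e08-30e6-61c0-017bc8a01ad0 In CMMDS: true Is Capacity Tier: false
"""

# Matches "<field name>: <value>" for the five fields selected by the grep.
FIELD_RE = re.compile(
    r"(Device|VSAN UUID|VSAN Disk Group UUID|In CMMDS|Is Capacity Tier):\s*(\S+)"
)


def parse_disks(text):
    """Return one dict of field name -> value per device line."""
    disks = []
    for line in text.splitlines():
        fields = dict(FIELD_RE.findall(line))
        if "Device" in fields:
            disks.append(fields)
    return disks


def summarize(disks):
    """Group devices by disk group UUID and label each as cache or capacity."""
    groups = {}
    for disk in disks:
        tier = "capacity" if disk["Is Capacity Tier"] == "true" else "cache"
        groups.setdefault(disk["VSAN Disk Group UUID"], {})[disk["Device"]] = tier
    return groups


disks = parse_disks(SAMPLE)
summary = summarize(disks)
for group_uuid, members in summary.items():
    print(group_uuid, members)
```

For the sample above this yields a single disk group containing one cache device and one capacity device, which matches the `vdq -iH` mapping.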
Please feel free to read through the vSAN Encryption Troubleshooting Deep-Dive blog to understand the different scenarios and methods for approaching and fixing problems with the KMS and disk groups.