Post-mortem? They automated LUN binding policies, restricted SAN reconfiguration rights, and made the intern write a 10-page essay on SCSI device IDs.
Maya grabbed coffee and her battle-tested Linux VM with vmfs-tools compiled. First rule of VMFS recovery: do not write anything to the affected LUN . She used a rescue Linux live CD on a physical host with HBA access. recover vmfs datastore
Within 90 minutes, 34 of 37 VMs booted cleanly. Three had corrupted swap files—recreated them. The fraud detection system was back online at 4:15 AM, just before the London trading desk opened. Post-mortem
Step 3: Deeper scan. She ran vmfs6-recover (part of vmfs-tools ). It parsed backup VMFS metadata—the first copy of the file system descriptor had been overwritten when the host re-scanned the "new" LUN, but VMware stores a second copy at offset 512 MB. First rule of VMFS recovery: do not write
She logged in. Heart sank. The 12-TB VMFS volume—hosting a real-time fraud detection system—wasn’t just offline. It was gone. ls -la /vmfs/volumes/ showed only the local datastore. Someone (an intern following an outdated runbook) had accidentally zapped the LUN mapping from the SAN side, then re-presented it—but as a new device signature.