Archive for category VMware vSphere

Operation failed, diagnostics report: iScsi Adapter in unbound status, adding Static Target can not be done.


Hello,

You might see this error in VMware ESXi when a LUN is created on the EMC VNXe3150 and you grant access to hosts that have been added to the VNXe3150.

Operation failed, diagnostics report: iScsi Adapter in unbound status, adding Static Target can not be done.

VNXe3150 Error in ESXi
This error is generated because the EMC VNXe3150 tries to add the iSCSI Servers/Targets configured on the array to each ESXi host that has been granted access to the LUNs. Because those iSCSI targets are already present in the host's Dynamic/Static iSCSI Targets tab, the operation fails and this error is raised.
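If you want to confirm what the array is complaining about, a quick look at the host's iSCSI port bindings and already-configured targets usually explains it. A minimal check from the ESXi 5.x shell, assuming the software iSCSI adapter on your host is vmhba33 (verify the name with the first command):

# esxcli iscsi adapter list
# esxcli iscsi networkportal list --adapter=vmhba33
# esxcli iscsi adapter discovery statictarget list --adapter=vmhba33

If the VNXe targets already show up in the discovery lists and the VMkernel ports are bound, the error can be treated as cosmetic.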

I think EMC should introduce some logic into the EMC VNXe3150 SAN storage so that it first checks whether the iSCSI targets are already added on the host; if they are, adding them again via the VNXe should be skipped.

Alternatively, the VNXe3150 could loop through the storage pools/LUNs, and wherever it finds LUNs already presented to ESXi hosts from the same array, skip adding the iSCSI targets. Otherwise this spurious error will keep presenting itself.


Setting up & Configuring EMC VNXe3150 iSCSI SAN Storage with High Availability


Setting up EMC VNXe3150 iSCSI SAN Storage

Finally, the EMC VNXe3150 is installed and configured, and we are almost ready to start the transition from the old EMC AX4 to the new VNXe3150. In the initial stage I found it a bit difficult to get it configured correctly, as every EMC document says something different, especially when it comes to iSCSI high availability; they mix up NFS HA and iSCSI HA. In reality, the two differ from each other: NFS uses link aggregation with LAG/LACP, while iSCSI does not.

Equipment

  • VNXe3150 with dual storage processors, software version 2.4.0.20932
  • Two Dell stackable 6224 24-port Gigabit switches, configured with jumbo frames for iSCSI and flow control enabled
  • One VMware ESXi 5.1.0 build-799733 host with six Gigabit pNICs, two of which are used for iSCSI connectivity only

Configuration on the VNXe

The configuration part is a bit of a dilemma when it comes to iSCSI connectivity. On the VNXe I set up two iSCSI Servers, one for Storage Processor A and one for Storage Processor B. Each SP has two IP addresses configured, one for each of its Ethernet interfaces, eth2 and eth3.

iSCSI Server Settings-1

iSCSI Server Settings-2

iSCSI Server Settings-3


All the Ethernet interfaces are configured with an MTU size of 9000 for jumbo frames.

MTU Config SPB

MTU Config SPB-eth3


Storage Resources Configuration

Storage Element  iSCSI Server   Port  IP Address   MAC Address      pSwitch Port  iSCSI PortGroup  VMkernel
iSCSI-A          iSCSI_ServerA  eth2  10.90.8.1    8:0:1b:57:71:3e  2/g1          iSCSI-01         vmk1 (10.90.8.78)
                                eth3  10.100.8.1   8:0:1b:82:78:dd  1/g1          iSCSI-02         vmk2 (10.100.8.78)
iSCSI-B          iSCSI_ServerB  eth2  10.90.8.2    8:0:1b:58:59:0f  2/g2          iSCSI-01         vmk1 (10.90.8.78)
                                eth3  10.100.8.2   8:0:1b:cd:f3:26  1/g2          iSCSI-02         vmk2 (10.100.8.78)

As you can see in the screenshots and the configuration table above, each storage processor has two Ethernet ports, and each Ethernet port is connected to an iSCSI pSwitch; eth2 on SPA is paired with eth2 on SPB. Since both of these interfaces are connected to the same pSwitch and configured on the same IP subnet, a single iSCSI VMkernel port group on that subnet can reach both storage processors through a single physical adapter (vmnic).

VNXe Connectivity Diagram

 VNXe Network Diagram


iSCSI Switches Configuration

Both network switches are configured as master and slave in a stack. Strictly speaking, this type of configuration does not require stacking, because each pair of SP Ethernet ports is connected to the same switch, i.e. SPA-eth2 and SPB-eth2 both connect to pSwitch1. With an NFS configuration, however, you would need to stack the switches, because true high availability requires LAG/LACP.

Set up jumbo frames on the two iSCSI Gigabit switches so that all ports support jumbo frames. The commands below configure all ports with an MTU size of 9000:

Console(config)#Interface range ethernet all

Console(config-if)#mtu 9000

ESXi host configuration

Each of the Gigabit ports on the ESXi host is connected to one of the physical iSCSI switches. Two VMkernel port groups were created on vSwitch1, and each iSCSI VMkernel interface is mapped to a single physical interface on the ESXi host.
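For reference, here is a minimal sketch of how this layout can be built from the ESXi 5.x shell; the vSwitch, port group, vmnic and IP values are the ones from my environment above, so adjust them to yours:

# Create vSwitch1 with jumbo frames and add the two iSCSI uplinks
esxcli network vswitch standard add --vswitch-name=vSwitch1
esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
esxcli network vswitch standard uplink add --vswitch-name=vSwitch1 --uplink-name=vmnic2
esxcli network vswitch standard uplink add --vswitch-name=vSwitch1 --uplink-name=vmnic6

# One port group per iSCSI path, each pinned to a single active uplink
esxcli network vswitch standard portgroup add --vswitch-name=vSwitch1 --portgroup-name=iSCSI-01
esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI-01 --active-uplinks=vmnic2
esxcli network vswitch standard portgroup add --vswitch-name=vSwitch1 --portgroup-name=iSCSI-02
esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI-02 --active-uplinks=vmnic6

# VMkernel interfaces with MTU 9000 and static IPv4 addresses
esxcli network ip interface add --interface-name=vmk1 --portgroup-name=iSCSI-01
esxcli network ip interface set --interface-name=vmk1 --mtu=9000
esxcli network ip interface ipv4 set --interface-name=vmk1 --ipv4=10.90.8.78 --netmask=255.255.255.0 --type=static
esxcli network ip interface add --interface-name=vmk2 --portgroup-name=iSCSI-02
esxcli network ip interface set --interface-name=vmk2 --mtu=9000
esxcli network ip interface ipv4 set --interface-name=vmk2 --ipv4=10.100.8.78 --netmask=255.255.255.0 --type=static

The resulting configuration looks like this: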

Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks

vSwitch1         128         8           128               9000    vmnic2,vmnic6

PortGroup Name        VLAN ID  Used Ports  Uplinks

iSCSI-02              0        1           vmnic6

iSCSI-01              0        1           vmnic2

vmk1       iSCSI-01            IPv4      10.90.8.78   255.255.255.0   10.90.8.255     00:50:56:6e:ea:87 9000    65535     true    STATIC

vmk2       iSCSI-02            IPv4      10.100.8.78  255.255.255.0   10.100.8.255    00:50:56:64:c0:6d 9000    65535     true    STATIC

iSCSI vSwitch Ports

To check that connectivity is mapped correctly and that each iSCSI port group reaches the correct Ethernet interface on the storage processor, use vmkping with -I to specify the source VMkernel interface for each iSCSI target. This tests the whole path end to end, from the ESXi host through the physical iSCSI switches to the VNXe storage, and confirms that traffic can flow with jumbo frames.
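One extra check worth doing: with -s 9000 alone the packet may still be fragmented along the way, so to prove that jumbo frames really pass end to end add the -d (don't fragment) flag and use an 8972-byte payload (9000 minus the 28 bytes of IP/ICMP headers). If any device in the path is not at MTU 9000, these pings will fail:

# vmkping -I vmk1 -d -s 8972 10.90.8.1
# vmkping -I vmk1 -d -s 8972 10.90.8.2
# vmkping -I vmk2 -d -s 8972 10.100.8.1
# vmkping -I vmk2 -d -s 8972 10.100.8.2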

iSCSI Adapter Port Binding

Both iSCSI VMkernel port groups have to be bound to the iSCSI initiator adapter of the ESXi host.
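This can be done from the vSphere Client, as in the screenshot below, or from the ESXi shell. A minimal sketch, assuming the software iSCSI adapter on your host is vmhba33 (verify the actual name with the first command):

# esxcli iscsi adapter list
# esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk1
# esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk2
# esxcli iscsi networkportal list --adapter=vmhba33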

iSCSI Port Bindings


Connectivity Results

# vmkping -I  vmk1 10.90.8.1 -c 50 -s 9000

PING 10.90.8.1 (10.90.8.1): 9000 data bytes

9008 bytes from 10.90.8.1: icmp_seq=0 ttl=255 time=0.596 ms

9008 bytes from 10.90.8.1: icmp_seq=1 ttl=255 time=0.575 ms

9008 bytes from 10.90.8.1: icmp_seq=2 ttl=255 time=0.548 ms

--- 10.90.8.1 ping statistics ---

3 packets transmitted, 3 packets received, 0% packet loss

round-trip min/avg/max = 0.548/0.573/0.596 ms

# vmkping -I vmk1 10.90.8.2 -c 50 -s 9000

PING 10.90.8.2 (10.90.8.2): 9000 data bytes

9008 bytes from 10.90.8.2: icmp_seq=0 ttl=255 time=0.591 ms

9008 bytes from 10.90.8.2: icmp_seq=1 ttl=255 time=0.617 ms

9008 bytes from 10.90.8.2: icmp_seq=2 ttl=255 time=0.603 ms

# vmkping -I vmk2 10.100.8.1 -c 50 -s 9000

PING 10.100.8.1 (10.100.8.1): 9000 data bytes

9008 bytes from 10.100.8.1: icmp_seq=0 ttl=255 time=0.634 ms

9008 bytes from 10.100.8.1: icmp_seq=1 ttl=255 time=0.661 ms

9008 bytes from 10.100.8.1: icmp_seq=2 ttl=255 time=0.642 ms

--- 10.100.8.1 ping statistics ---

5 packets transmitted, 5 packets received, 0% packet loss

round-trip min/avg/max = 0.634/0.661/0.708 ms

# vmkping -I vmk2 10.100.8.2 -c 50 -s 9000

PING 10.100.8.2 (10.100.8.2): 9000 data bytes

9008 bytes from 10.100.8.2: icmp_seq=0 ttl=255 time=0.694 ms

9008 bytes from 10.100.8.2: icmp_seq=1 ttl=255 time=0.658 ms

9008 bytes from 10.100.8.2: icmp_seq=2 ttl=255 time=0.690 ms

Add ESXi hosts to VNXe

Set up the ESXi hosts to access the VNXe iSCSI SAN storage. This is done by browsing to VNXe > Hosts > VMware, which lets you find ESXi hosts either by typing in the IP address of vCenter or the management address of the ESXi host itself. Then create a VMFS datastore on the VNXe and make sure you assign the ESXi host permission to access the newly created LUN.

After the LUN is presented to the ESXi host and formatted with VMFS, it is time to change the path selection policy from the default Fixed to Round Robin, and to change the default Round Robin IOPS limit in ESXi from 1000 to 1, which allows you to utilize all the iSCSI paths:

esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1 --device=naa.6006048c2fb691695617fc52e06065a2

Once it's changed, you will see all the paths shown as Active (I/O) for each LUN that was switched from Fixed to Round Robin.
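For completeness, here is a sketch of switching a device to Round Robin and verifying the result from the ESXi shell; the naa identifier is the one from my environment, so substitute your own LUN's device ID:

# Switch the device from Fixed to Round Robin, then confirm the IOPS setting and path states
esxcli storage nmp device set --device=naa.6006048c2fb691695617fc52e06065a2 --psp=VMW_PSP_RR
esxcli storage nmp psp roundrobin deviceconfig get --device=naa.6006048c2fb691695617fc52e06065a2
esxcli storage core path list --device=naa.6006048c2fb691695617fc52e06065a2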

Path Round Robin


Failover – Failback Testing Scenarios

For the failover testing I presented a 500 GB LUN and created two virtual machines. The first runs Windows 2008 R2 Enterprise Edition with the following roles installed:

  • Microsoft Active Directory Role Services
  • Microsoft DNS Server Services
  • Exchange Server 2010 with all the Roles.

The second virtual machine is a Windows 7 Professional client with Microsoft Outlook 2010 connected to the Exchange 2010 MAPI profile. Sending and receiving email internally works in normal operation.

Testing Networking

I tested failover under network failure by disconnecting one pNIC (vmnic2) from vSwitch1, the uplink mapped to the iSCSI-01 port group, while vmkping -I vmk1 was running against both VNXe iSCSI target IPs on that subnet, SPA-eth2 (10.90.8.1) and SPB-eth2 (10.90.8.2); the results are shown below. Likewise, if a storage processor (SPA) fails or is rebooted on the VNXe, the surviving storage processor (SPB) picks up the workload that was handled by SPA.

Testing Networking-1

  • As you can see in the above screenshots, the Windows 7 client VM continued to ping the Exchange Server and the Exchange Server continued to ping the Windows 7 client.
  • vmk1 (iSCSI-01, mapped to vmnic2) stopped pinging SPA-eth2 and SPB-eth2.
  • Both LUN paths mapped to vmnic2 on subnet 10.90.8.x went dead, while the 10.100.8.x paths mapped to vmnic6 (vmk2, iSCSI-02) stayed live and Active (I/O).

Path Round Robin - Failure


Ping via vmk2 (vmnic6) to 10.100.8.1 and 10.100.8.2:

# vmkping  -I vmk2 10.100.8.1

PING 10.100.8.1 (10.100.8.1): 56 data bytes

64 bytes from 10.100.8.1: icmp_seq=0 ttl=255 time=0.229 ms

64 bytes from 10.100.8.1: icmp_seq=1 ttl=255 time=0.192 ms

64 bytes from 10.100.8.1: icmp_seq=2 ttl=255 time=0.238 ms

--- 10.100.8.1 ping statistics ---

3 packets transmitted, 3 packets received, 0% packet loss

round-trip min/avg/max = 0.192/0.220/0.238 ms

~ # vmkping  -I vmk2 10.100.8.2

PING 10.100.8.2 (10.100.8.2): 56 data bytes

64 bytes from 10.100.8.2: icmp_seq=0 ttl=255 time=0.235 ms

64 bytes from 10.100.8.2: icmp_seq=1 ttl=255 time=0.245 ms

--- 10.100.8.2 ping statistics ---

2 packets transmitted, 2 packets received, 0% packet loss

round-trip min/avg/max = 0.235/0.240/0.245 ms

After relinking vmnic2 to vSwitch1, pings to SPA-eth2 and SPB-eth2 resumed.

ReLink vmnic2 vSwitch1


LUN paths resumed

Path Round Robin-Path Resumed


Testing Power Failure of VNXe Storage Processor A and Storage Processor B

The second test was done by removing the physical power from Storage Processor B and initiating vmkping to both Ethernet interfaces of SPB from both VMkernel interfaces, vmk1 and vmk2. The vmkping continued, as the traffic was routed via the peer SP's ports.

Testing Power Failure


Result of ping after SP-B shutdown

The result below shows that the Exchange VM continued to ping the client VM while Storage Processor B was shut down.

Result of Ping after SP-B Power Failure


I did the same with Storage Processor A and initiated pings to both Ethernet interfaces of SPA. The pings to both Ethernet interfaces continued, the pings inside each VM (Exchange Server to client and vice versa) continued as well, and the Exchange Server VM did not freeze or log any errors in the event viewer.

Conclusion

The VNXe3150 high-availability design, at both the storage level and the networking level, protects data access against any single component failure.


Configuring NTP for VMWare vSphere Environment


NTP is a critical component in every network environment, especially when it comes to virtualization, because virtual machines may rely on the host to get their time. When the domain controller itself runs as a virtual machine, care must be taken with time synchronization between the hosts, the virtualized domain controllers, and the other virtual machines.

In my environment, since the server VLAN has access to the public Internet, it wasn't a big issue to get synchronization working properly for the domain controllers.

Luckily, I have two domain controllers, one running as a physical server (PDC) and the second running as a virtual machine (ADC). Initially all the FSMO roles were configured on the virtual domain controller, so the virtual DC held the PDC emulator role, which sits at the top of the domain time hierarchy and is therefore responsible for the time service as well.

To get this done without the hassle of time-protocol errors, below is the entire configuration.

  1. Backed up the virtual DC with Veeam Backup & Replication.
  2. Transferred the FSMO roles from the virtual DC to the physical DC.
  3. Configured the physical DC to be responsible for the time service as per the Microsoft article http://support.microsoft.com/kb/816042 (see the sketch after this list).
  4. Configured the virtual DC to obtain time from the physical DC.
  5. Since the majority of our servers are virtual and run on VMware vSphere, configured the ESXi hosts to obtain time from the same physical DC.
  6. In the vSphere Client, configured each ESXi host with the physical domain controller as its NTP server.
  7. Ran ntpdate -q 192.168.10.20 on the ESXi host against the physical domain controller.
  8. Restarted the ESXi NTP service with "service ntpd restart".
  9. Ran watch "ntpq -p" to watch the time drift from the ESXi console.
  10. Domain-member virtual machines are configured not to obtain time from the ESXi host; only VMs that are not members of the domain sync with the host.
  11. The time is now properly synchronized from the Internet down to our virtual and physical servers.
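For reference, a hedged sketch of the w32tm commands behind step 3, run on the physical DC that holds the PDC emulator role; the pool.ntp.org peers are only an example, so substitute whatever upstream time source your firewall allows:

w32tm /config /manualpeerlist:"0.pool.ntp.org 1.pool.ntp.org" /syncfromflags:manual /reliable:yes /update
net stop w32time
net start w32time
w32tm /resync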


Steps to Upgrade Veeam Backup & Replication v6.1.x to v6.5.0.109


Hello,

I don't want to go through a detailed step-by-step procedure for upgrading from your current version of Veeam Backup & Replication to the latest one, but I would like to outline the procedure I followed in my environment, in the hope that you will benefit from it.

1. Stop the Veeam services (see the sketch after this list).
2. Back up the SQL database.
3. Upgrade SQL Express 2005 to SQL Express 2008 R2 by running SQLEXPR_x64_ENU.exe /action=upgrade from the command prompt.
4. Once that is done, start the Veeam services.
5. Upgrade Enterprise Manager first, if you are using it.
6. Upgrade Veeam Backup & Replication.
7. Launch the Veeam console; you will be prompted to upgrade the agents on the repository servers and proxy servers.
8. Finally, make sure everything is back to normal (jobs, logs, configuration) and run a test job to confirm everything is fine.
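A minimal sketch for steps 1 and 2; the SQL instance and database names below (VEEAM, VeeamBackup) are assumptions based on a default installation, so check what your server actually uses before running them:

rem Stop all Veeam services before touching the database (step 1)
powershell -Command "Get-Service Veeam* | Stop-Service -Force"

rem Back up the Veeam configuration database (step 2)
sqlcmd -S .\VEEAM -Q "BACKUP DATABASE [VeeamBackup] TO DISK='C:\Backup\VeeamBackup.bak'"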


Backup job failed. Cannot create a shadow copy of the volumes containing Exchange writer’s data.


I had been facing an Exchange VSS writer issue when backing up my Exchange Server 2007 with Veeam Backup & Replication for quite some time, and the only way to clear it was to reboot the Exchange server; occasionally the job would pass on the second or third retry.

On Friday 16th Nov 2012 my Veeam backup jobs were scheduled to run as Active Full backups, as per my configuration. The backups went fine for all the jobs except the one containing the Exchange mailbox server, which gave the usual error:

11/16/2012 8:43:20 PM :: Unable to release guest. Error: Unfreeze error: [Backup job failed.
Cannot create a shadow copy of the volumes containing writer's data.
A VSS critical writer has failed. Writer name: [Microsoft Exchange Writer]. Class ID: [{76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}]. Instance ID: [{8ea7190d-337c-448f-b264-3401303b586b}]. Writer's state: [VSS_WS_FAILED_AT_FREEZE]. Error code: [0x800423f2].]

I rebooted the server and retried the job, but no joy, it didn't help. I rebooted a second and a third time, but still no joy; the error was persistent.

I searched and searched for a solution, but the usual suggestions came up: either reboot the server to clear the timed-out VSS writer, or restart the Microsoft Information Store service, which also clears the timed-out writer. I tried both, without any luck.
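Before resorting to yet another reboot, it is worth listing the VSS writers inside the guest and checking the state of the Microsoft Exchange Writer; in my case it was the one stuck in the failed state shown in the error above:

C:\>vssadmin list writers

Look for "Microsoft Exchange Writer" in the output and check its State and Last error fields.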

Fed up with troubleshooting during my holiday, I left home to go fishing. While I was out at sea, it occurred to me: why not exclude the C: drive of the Exchange VM from the job and select only the drives that contain the Exchange databases?

It seemed like a brilliant idea. As soon as I got home, I immediately logged in remotely, excluded the C: drive VMDK SCSI (0:0) from the job, and selected only the disks that contain the Exchange databases (these disks are vRDMs):


SCSI (0:1)
SCSI (0:2)
SCSI (0:3)
SCSI (0:4)
SCSI (0:5)
SCSI (0:6)
SCSI (0:8)
SCSI (0:9)
SCSI (0:10)
SCSI (0:11)
SCSI (0:12)
SCSI (0:13)
SCSI (0:14)
SCSI (0:15)
SCSI (1:0)
SCSI (1:1)
SCSI (1:2)
SCSI (1:3)
SCSI (1:4)
SCSI (1:5)
SCSI (1:6)
SCSI (1:8)

To my surprise, the job passed the snapshot process very quickly and started reading straight away; the Exchange database was put into backup mode and the backup speed was noticeably better, with 1.6 TB finished in 4 hours.

Wondering:
Why, when excluding SCSI (0:0), which is the C:\ system drive of the Exchange 2007 virtual machine, and including only the vRDM SCSI drives, does the process complete successfully without a VSS error or timeout?

What is the restore impact of backing up only the drives where the Exchange databases reside, without the C: drive SCSI (0:0)?

Would it be possible to configure another job against the Exchange VM to back up only the C: drive SCSI (0:0) without the rest of the drives? And when I want to restore, would I restore the job that contains the C: drive first, followed by the job that backed up the database drives?

Update:
I can confirm that when I selected All Disks under Disk Exclusions, the job failed. When I amended the selection to the individual SCSI disks, including SCSI (0:0) which contains the virtual machine's system drive, the backup succeeded.
Here is why I think it fails when the All Disks radio button is selected but works when the SCSI disks are selected explicitly, including SCSI (0:0) with the system drive.

A virtual machine is limited to 4 virtual SCSI controllers with 15 targets each, roughly 60 virtual SCSI disks, and when Veeam Backup initiates a job it loops through the SCSI controllers/targets selected for the VM; any disk found on a selected controller is added to the backup. Exchange is rather sensitive to how long the VSS freeze lasts, and waiting for Veeam to walk the whole range from SCSI (0:0) to SCSI (3:15) meant the snapshot did not complete within the time frame Veeam allows, so the job failed.
By selecting only the SCSI controllers/disks that are actually present in the virtual machine, Veeam seems to be intelligent enough to pick up just those SCSI disks that are presented/selected.


How to Power-Off / Shutdown Virtual Machine on VMWare ESXi5



Symptoms

You are experiencing one or more of these issues:

  • You cannot power off a virtual machine hosted on ESXi.
  • A virtual machine is not responsive and cannot be stopped or killed.
  • You cannot power off a virtual machine from the vSphere Client because the host is hung with a large amount of outstanding SCSI storage I/O.

You can start by obtaining the <VMID> and then run the required power action against the VM ID returned by the commands below:

vim-cmd vmsvc/getallvms – list all VMs registered on the host

vim-cmd vmsvc/power.getstate <vmid> – show power state of a VM
vim-cmd vmsvc/power.shutdown  <vmid>  – shutdown a VM (shutdown guest)
vim-cmd vmsvc/power.reset  <vmid>  – reset a VM
vim-cmd vmsvc/power.off <vmid>  – power off a VM
vim-cmd vmsvc/power.on <vmid> – power on a VM
vim-cmd vmsvc/power.reboot <vmid> – reboot a VM
vim-cmd vmsvc/get.summary  <vmid>  – get summary information for a VM
vim-cmd vmsvc/unregister  <vmid>  – unregister a VM from a host

Example:

# vim-cmd vmsvc/getallvms

Vmid   Name             File                  Guest OS       Version   Annotation

2      VM01   [vESXi5-01] VM01/VM01.vmx   winLonghornGuest   vmx-08

~ # vim-cmd vmsvc/power.getstate 2

Retrieved runtime info

Powered on

~ # vim-cmd vmsvc/power.shutdown 2

(vim.fault.ToolsUnavailable) {

dynamicType = <unset>,

faultCause = (vmodl.MethodFault) null,

msg = “Cannot complete operation because VMware Tools is not running in this virtual machine.”,}

In the above example, the shutdown returns an error because VMware Tools is not running in the virtual machine.

But running the power.off command to forcibly power off the VM immediately powered off the named VM ID (2).

~ # vim-cmd vmsvc/power.off 2

Powering off VM:

~ # vim-cmd vmsvc/power.getstate 2

Retrieved runtime info

Powered off

~ #
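If even power.off does not respond, ESXi 5.x also offers the esxcli vm namespace, which kills the VM's running world directly. A hedged sketch, where 123456 is just a placeholder for the World ID reported by the list command:

~ # esxcli vm process list
~ # esxcli vm process kill --type=soft --world-id=123456

The --type option also accepts hard and force for progressively more forceful kills.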

Commands available under vmsvc/:

~ # vim-cmd vmsvc/
Commands available under vmsvc/:
acquiremksticket                 get.spaceNeededForConsolidation
acquireticket                    get.summary
connect                          get.tasklist
convert.toTemplate               getallvms
convert.toVm                     gethostconstraints
createdummyvm                    login
destroy                          logout
device.connection                message
device.connusbdev                power.getstate
device.disconnusbdev             power.hibernate
device.diskadd                   power.off
device.diskaddexisting           power.on
device.diskremove                power.reboot
device.getdevices                power.reset
device.toolsSyncSet              power.shutdown
device.vmiadd                    power.suspend
device.vmiremove                 power.suspendResume
devices.createnic                queryftcompat
get.capability                   reload
get.config                       setscreenres
get.config.cpuidmask             snapshot.create
get.configoption                 snapshot.dumpoption
get.datastores                   snapshot.get
get.disabledmethods              snapshot.remove
get.environment                  snapshot.removeall
get.filelayout                   snapshot.revert
get.filelayoutex                 snapshot.setoption
get.guest                        tools.cancelinstall
get.guestheartbeatStatus         tools.install
get.managedentitystatus          tools.upgrade
get.networks                     unregister
get.runtime                      upgrade
get.snapshotinfo
~ #


Unable to allocate processing resources. Error: No backup proxy is able to backup this VM. Check processing mode settings on proxies.


Hello,
Are you seeing this error when you try to back up your VMs with Veeam Backup & Replication 6.5 in Direct SAN Access mode? Do your proxy servers see the VMFS LUNs from which the VM data is being fetched?

This error is generated because the job is configured to back up VMs via Direct SAN Access ("Using source proxy VMware Backup Proxy [san]") while the proxy does not have access to the VMFS LUNs.

Solution:

To sort out this issue:
1. Make sure your proxy server can see the VMFS LUNs in Windows Disk Management (see the note after this list).
2. Make sure the proxy is configured for Direct SAN Access.
3. Make sure the Direct SAN Access proxy is selected in the backup job configuration, under the Storage section.
4. If you are still getting the error and the backup keeps failing, open the proxy configuration under Backup Infrastructure -> Backup Proxies, right-click the desired proxy, select Properties, and under Selected Datastores manually select the VMFS LUNs that contain the VMs you want to back up, then try again.
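Related to step 1, a common precaution on a Direct SAN proxy is to let Windows see the VMFS LUNs without ever mounting or initializing them. A minimal sketch using diskpart, run on the proxy before the LUNs are presented (hedged, as the right SAN policy depends on your environment):

DISKPART> san policy=OfflineShared
DISKPART> automount disable
DISKPART> automount scrub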

Hope it helps..
