Hands-On: Adding a Disk to a DELL PERC H700 Running VMware ESXi

We have a DELL server with a PERC H700 RAID controller, running VMware ESXi 4.1.
Business growth has left us short on disk space, so we bought a new drive to add to it.
The existing array is a RAID 5 built from three 300 GB disks; the plan is to reconfigure the array to bring a fourth disk into it.

The process took quite a few twists and turns, but looking back it was not that hard; it should go much more smoothly next time.


Step 0: Back Up the Data

Before reshaping the array, back up everything on it. A reconstruction that fails partway through can take the whole virtual disk with it.

Step 1: Expand the Array

Download the MegaRAID VMware release from LSI's website, unpack it, and upload the files to the server with the vSphere Client. Then run the following commands on the console:

/tmp/MegaCLI # ./MegaCli -LDRecon start r5 [Add PhysDrv[32:3]] L0 -a0
 
Start Reconstruction of Virtual Drive Success.
 
Exit Code: 0x00
 
/tmp/MegaCLI # ./MegaCli -LDRecon ShowProg L0 -a0
 
Reconstruction on VD #0 (target id #0) Completed 0% in 0 Minutes.
 
Exit Code: 0x00

The second command shows reconstruction progress. With our 300 GB disk the whole thing took a bit under three hours.
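Since ShowProg reports a percentage and the elapsed minutes, you can extrapolate a rough finish time. Here is a small illustrative helper; the line format is taken from the ShowProg output above, and linear progress is an assumption (reconstruction speed actually varies):

```python
import re

def estimate_remaining_minutes(showprog_line: str) -> float:
    """Estimate minutes left from a MegaCli -LDRecon ShowProg line.

    Expects a line like:
    'Reconstruction on VD #0 (target id #0) Completed 76% in 200 Minutes.'
    Assumes progress is roughly linear, which is only an approximation.
    """
    m = re.search(r"Completed (\d+)% in (\d+) Minutes", showprog_line)
    if m is None:
        raise ValueError("unrecognized ShowProg line")
    percent, minutes = int(m.group(1)), int(m.group(2))
    if percent == 0:
        raise ValueError("no progress yet; cannot extrapolate")
    total = minutes * 100 / percent
    return total - minutes

line = "Reconstruction on VD #0 (target id #0) Completed 76% in 200 Minutes."
print(round(estimate_remaining_minutes(line)))  # 63 (about an hour left)
```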

/tmp/MegaCLI # ./MegaCli -cfgdsply -aAll
 
==============================================================================
Adapter: 0
Product Name: PERC H700 Integrated
Memory: 512MB
BBU: Present
Serial No: 11B05V7
==============================================================================
Number of DISK GROUPS: 1
 
DISK GROUP: 0
Number of Spans: 1
SPAN: 0
Span Reference: 0x00
Number of PDs: 4   << up from 3
Number of VDs: 1
Number of dedicated Hotspares: 0
Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :Virtual Disk 0
RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
Size                : 836.625 GB  << up from 557.75 GB
State               : Optimal
Strip Size          : 64 KB
Number Of Drives    : 4
Span Depth          : 1
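The new size is exactly what RAID 5 arithmetic predicts: parity consumes one disk's worth of space, so usable capacity is (N - 1) times the per-disk size. A quick check against the cfgdsply numbers:

```python
def raid5_usable_gb(num_disks: int, per_disk_gb: float) -> float:
    # RAID 5 stores one disk's worth of parity, so (N - 1) disks hold data.
    return (num_disks - 1) * per_disk_gb

# The controller reported 557.75 GB for the 3-disk array,
# so each disk contributes 557.75 / 2 = 278.875 GB of usable space.
per_disk = 557.75 / 2
print(raid5_usable_gb(4, per_disk))  # 836.625, matching the output above
```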

Step 2: Extend the VMware Datastore
First, survey the current state of the system:

~ # esxcfg-scsidevs -l
...
naa.6782bcb00f7de00014fe794f0593cfee
   Device Type: Direct-Access
   Size: 856704 MB
   Display Name: Local DELL Disk (naa.6782bcb00f7de00014fe794f0593cfee)
   Multipath Plugin: NMP
   Console Device: /vmfs/devices/disks/naa.6782bcb00f7de00014fe794f0593cfee
   Devfs Path: /vmfs/devices/disks/naa.6782bcb00f7de00014fe794f0593cfee
   Vendor: DELL      Model: PERC H700         Revis: 2.10
   SCSI Level: 5  Is Pseudo: false Status: on
   Is RDM Capable: false Is Removable: false
   Is Local: true
...
 
~ # fdisk -l
 
Disk /dev/disks/naa.6782bcb00f7de00014fe794f0593cfee: 898.3 GB, 898319253504 bytes
64 heads, 32 sectors/track, 856704 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
 
                                           Device Boot      Start         End      Blocks  Id System
/dev/disks/naa.6782bcb00f7de00014fe794f0593cfeep1             5       900    917504    5  Extended
/dev/disks/naa.6782bcb00f7de00014fe794f0593cfeep2           901      4995   4193280    6  FAT16
/dev/disks/naa.6782bcb00f7de00014fe794f0593cfeep3          4996    571136 579728384   fb  VMFS
/dev/disks/naa.6782bcb00f7de00014fe794f0593cfeep4   *         1         4      4080    4  FAT16
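This fdisk build counts in 1 MiB "cylinders" (64 heads * 32 sectors * 512 bytes), which makes the partition math easy to sanity-check. A small check of the listing above; the end cylinder 856704 used below is the last cylinder of the grown disk, so it shows how large the VMFS partition (p3) can become once recreated:

```python
# fdisk on this box uses 1 MiB "cylinders": 64 heads * 32 sectors * 512 bytes.
CYL_BYTES = 64 * 32 * 512   # 1048576 bytes per cylinder
BLOCK = 1024                # the fdisk "Blocks" column is in 1 KiB units

def blocks(start_cyl: int, end_cyl: int) -> int:
    # Number of 1 KiB blocks spanned by an inclusive cylinder range.
    return (end_cyl - start_cyl + 1) * (CYL_BYTES // BLOCK)

# Sanity-check the current VMFS partition (p3) against the listing above:
print(blocks(4996, 571136))   # 579728384, matching the Blocks column

# If p3 is recreated to end at the disk's last cylinder (856704):
print(blocks(4996, 856704))   # 872150016 KiB, i.e. about 831.7 GiB
```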

Now, following the references at the end, delete the VMFS partition and recreate it so that it covers the newly added space.

~ # fdisk /vmfs/devices/disks/naa.6782bcb00f7de00014fe794f0593cfee
 
The number of cylinders for this disk is set to 856704.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
 
Command (m for help): p
 
Disk /vmfs/devices/disks/naa.6782bcb00f7de00014fe794f0593cfee: 898.3 GB, 898319253504 bytes
64 heads, 32 sectors/track, 856704 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
 
                                                    Device Boot      Start         End      Blocks  Id System
/vmfs/devices/disks/naa.6782bcb00f7de00014fe794f0593cfeep1             5       900    917504    5  Extended
/vmfs/devices/disks/naa.6782bcb00f7de00014fe794f0593cfeep2           901      4995   4193280    6  FAT16
/vmfs/devices/disks/naa.6782bcb00f7de00014fe794f0593cfeep3          4996    571136 579728384   fb  VMFS
/vmfs/devices/disks/naa.6782bcb00f7de00014fe794f0593cfeep4   *         1         4      4080    4  FAT16

Then check in the vSphere Client: the available space has grown.

Initial state:

After adding the disk:

Final result after fixing the partition table:

References
Introduction to common MegaCli parameters
http://www.mysqlsupport.cn/megacli-study/
MegaCli -adpCount [show the number of adapters]
MegaCli -AdpGetTime -aALL [show adapter time]
MegaCli -AdpAllInfo -aAll [show all adapter information]
MegaCli -LDInfo -LALL -aAll [show all logical disk group information]
MegaCli -PDList -aAll [show all physical drive information]
MegaCli -AdpBbuCmd -GetBbuStatus -aALL | grep 'Charger Status' [check charging status]
MegaCli -AdpBbuCmd -GetBbuStatus -aALL [show BBU status information]
MegaCli -AdpBbuCmd -GetBbuCapacityInfo -aALL [show BBU capacity information]
MegaCli -AdpBbuCmd -GetBbuDesignInfo -aALL [show BBU design parameters]
MegaCli -AdpBbuCmd -GetBbuProperties -aALL [show current BBU properties]
MegaCli -cfgdsply -aALL [show RAID card model, RAID configuration, and disk information]
MegaCli -cfgdsply -aALL | grep Policy [check cache policy settings]
MegaCli -AdpBbuCmd -GetBbuStatus -aALL | grep 'Relative State of Charge' [check charge percentage]
Disk states as a drive is pulled and then re-inserted:
Device         | Normal  | Damaged                | Rebuilding | Normal
Virtual Drive  | Optimal | Degraded               | Degraded   | Optimal
Physical Drive | Online  | Failed -> Unconfigured | Rebuild    | Online
Use of LSI MegaCli tool in ESX 4
http://communities.vmware.com/thread/228615

PERC H700 Online Capacity Expansion or RAID Level Migration
http://lists.us.dell.com/pipermail/linux-poweredge/2011-March/044451.html

MegaCli -LDRecon {-Start -rX [{-Add | -Rmv} -Physdrv[E0:S0,...]]}|-ShowProg|-ProgDsply -Lx -aN

I was able to successfully reconfigure the VD from a 3 drive RAID5 set
to a 4 drive RAID5 set using the following command:
# /opt/MegaRAID/MegaCli/MegaCli64 -LDRecon start r5 [Add PhysDrv[32:3]] L0 -a0
Start Reconstruction of Virtual Drive Success.

And could check the status using:
# /opt/MegaRAID/MegaCli/MegaCli64 -LDRecon ShowProg L0 -a0
Reconstruction on VD #0 (target id #0) Completed 76% in 200 Minutes.

Four hours later I'd gone from:
> Virtual Drive: 0 (Target Id: 0)
> Name :
> RAID Level : Primary-5, Secondary-0, RAID Level Qualifier-3
> Size : 837.25 GB
> State : Optimal
> Strip Size : 64 KB
> Number Of Drives : 3
To:
> Virtual Drive: 0 (Target Id: 0)
> Name :
> RAID Level : Primary-5, Secondary-0, RAID Level Qualifier-3
> Size : 1.225 TB
> State : Optimal
> Strip Size : 64 KB
> Number Of Drives : 4
>>> /opt/MegaRAID/MegaCli/MegaCli64 -getLdExpansionInfo -Lall -aALL
>>>
>>> Virtual Disk: 0 (Target Id: 0)
>>> Expansion : Not Possible
>>>
>>> Exit Code: 0x00

Using MegaCli and smartctl to get information about regular and SSD disks
http://hi.baidu.com/stealth_space/blog/item/416e91acde14e8a4cb130c97.html

Adding a VMFS Extent using vmkfstools
http://www.zealkabi.com/2009/03/adding-vmfs-extent-using-vmkfstools.html

Extending an Existing VMFS-3 Volume

vmkfstools -Z --extendfs

This option adds another extent to a previously created VMFS volume. You must specify the full path name, for example /vmfs/devices/disks/vmhba0:1:2:1, not just the short name vmhba0:1:2:1. Each time you use this option, you extend a VMFS-3 volume with a new extent so that the volume spans multiple partitions. A logical VMFS-3 volume can have at most 32 physical extents.

Example for Extending a VMFS-3 Volume:

vmkfstools -Z /vmfs/devices/disks/vmhba0:1:2:1 /vmfs/devices/disks/vmhba1:3:0:1

This example extends the logical file system by allowing it to span to a new partition.
The extended file system spans two partitions, vmhba1:3:0:1 and vmhba0:1:2:1. In
this example, vmhba1:3:0:1 is the name of the head partition.

Listing Attributes of a VMFS Volume

vmkfstools -P --queryfs [-h --human-readable]

When you use this option on any file or directory that resides on a VMFS volume, the
option lists the attributes of the specified volume. The listed attributes include the
VMFS version number (VMFS-2 or VMFS-3), the number of extents comprising the
specified VMFS volume, the volume label if any, the UUID, and a listing of the device
names where each extent resides.

You can specify the -h suboption with the -P option. If you do so, vmkfstools lists the capacity of the volume in a
more readable form, for example, 5k, 12.1M, or 2.1G.

For example:

# vmkfstools -P /vmfs/volumes/MSALUN12/AM_W2k3/AM_W2k3-flat.vmdk
VMFS-3.31 file system spanning 1 partitions.
File system label (if any): MSALUN12
Mode: public
Capacity 83214991360 (79360 file blocks * 1048576), 8964276224 (8549 blocks) avail
UUID: 47669f06-beeb8164-ed10-000e7fb4371c
Partitions spanned (on "lvm"):
vmhba1:0:12:1
(One or more partitions spanned by this volume may be offline)

How to increase the size of a local datastore … on an ESXi4?

fdisk /vmfs/devices/disks/mpx.vmhba1:C0:T0:L0

As you can see on the picture, the local VMFS datastore is only 651MB. Also if you add up all the partitions you end up with a bit more than 1500MB. I’m definitely +500MB short.

Press d, then 2 and Enter, to delete /vmfs/devices/disks/mpx.vmhba1:C0:T0:L0p2.
Press n and Enter to start creating a new partition.
Press p and Enter to identify that you are creating a primary partition.
Press 2 and Enter to identify that you are creating a second partition. A primary partition already exists.
Press Enter to accept the default ending value.
Press t and Enter to identify that you want to change the type of a partition. We want to change partition# 2.
Type fb and press Enter. This sets the volume to type VMFS.
Press w and press Enter to save the changes and exit fdisk.

Now that we have set properly the partition table, we need to grow the file system on it. The :2 in the command identifies that this operation is performed on the second partition. Run the following command:
vmkfstools --growfs /vmfs/devices/disks/vml.0000000000766d686261313a303a30:2 /vmfs/devices/disks/vml.0000000000766d686261313a303a30:2
Now in the Configuration tab of my ESXi4.0 host, I go to Storage, select the local datastore (datastore1), right click and select Refresh. Now the local VMFS datastore is 1120MB (1.12GB) and uses all remaining disk space.
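As an aside, those long vml.* names are not opaque. For legacy local devices like this one, the part after the ten-hex-digit prefix appears to be just the hex-encoded vmhba path; decoding the name used above (the prefix-length convention here is an assumption based on this example):

```python
# Decode a legacy vml.* device name: "vml." + 10 hex digits + hex-encoded path.
# (Assumed naming convention for legacy local devices; NAA-based names differ.)
name = "vml.0000000000766d686261313a303a30"
hex_part = name[len("vml.") + 10:]  # strip "vml." and the 10-digit prefix
print(bytes.fromhex(hex_part).decode("ascii"))  # vmhba1:0:0
```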

[UPDATE] In certain occasion, you cannot grow the VMFS disk because it doesn’t exist anymore. In that case you will have to create a VMFS datastore. By default it will take all space available. The command is: vmkfstools -C vmfs3 /vmfs/devices/disks/vml.0000000000766d686261313a303a30:2
Note that you will have to add that new VMFS datastore to your host using the vCenter Client. You will be able to give it a name, for instance 'datastore1', and set the block size (1 MB by default).

Growing a datastore from the Service Console in ESX 4.0