Due to some issues with the way the VGs were initially laid out, a standard PV migration isn't possible on one of the AIX servers we administer. Another option is to use the ISL on the SAN side, which basically does a block-level copy of the LUNs from one SAN to the other. The steps needed to accomplish this task:
* do an emcgrab (plan on doing one before and one after)
* dump some configuration information out to text files: lspv, lsvg, lsvg VGNAME, lsvg -l VGNAME, lsvg -p VGNAME, lsdev -Cc disk, lsdev -Cc adapter, lsdev -Cc driver, lsdev -Cc if (see the sketch after this list)
* perform a mksysb backup (ensure the mksysb is on another system, or run from a nim master)
* Collect fibre switch logs
* Stop the application / Databases and unmount the file systems
* varyoffvg VGNAME
(repeat for each VG); confirm with lsvg -o that the varied-off VG is no longer listed as active
* exportvg VGNAME
* rmdev -Rdl hdiskXX (if you are running SAN boot, keep the boot/rootvg disks)
* rmdev -Rdl fcsX (if running SAN BOOT, skip this step)
* odmget CuAt | grep hdisk (confirm all are gone, unless you have SAN Boot)
* lsdev -Cc disk (if NO SAN Boot, confirm all of the hdisks have been removed)
* Shut down the server
* move any fibre strands (if applicable). Some setups will allow you to see both SANs at the same time. If other systems are still running over those shared fibre strands, you'll have to wait until all migrations are done before doing this step.
* Re-zone system on the SAN side, and perform the copy
* change zone mappings so the system can't see LUNs from the old SAN
* Restart server
* Scan the bus for hardware changes (cfgmgr -v)
* lsdev -Cc disk (see what hdisk devices are shown, and their state)
* mount the filesystems (assuming all the PVs were re-added to their respective VGs and were automatically varied on).
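A rough sketch of the capture and cleanup steps above, assuming VGNAME, hdiskXX, fcsX and the /tmp paths are placeholders for your environment, and that none of the disks being removed are SAN boot disks:
 # Dump configuration to text files for later reference
 lspv              > /tmp/lspv.txt
 lsdev -Cc disk    > /tmp/lsdev_disk.txt
 lsdev -Cc adapter > /tmp/lsdev_adapter.txt
 lsdev -Cc driver  > /tmp/lsdev_driver.txt
 lsdev -Cc if      > /tmp/lsdev_if.txt
 for VG in $(lsvg); do
     echo "-----"
     lsvg $VG          # VG name, VGID and characteristics
     lsvg -l $VG       # Logical Volumes in the VG
     lsvg -p $VG       # Physical Volumes in the VG
 done > /tmp/lsvgs.txt

 # Vary off, export, and remove each migrated VG's devices
 # (skip the rmdev steps for SAN boot / rootvg disks)
 varyoffvg VGNAME
 lsvg -o                    # the varied-off VG should no longer be listed
 exportvg VGNAME
 rmdev -Rdl hdiskXX         # repeat for each hdisk being migrated
 rmdev -Rdl fcsX            # skip entirely if SAN booted
 odmget CuAt | grep hdisk   # confirm no stale entries remain
 lsdev -Cc disk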
= Troubleshooting =
If, after the copy and reboot of the server, the hdisks were added but no VGs are there (or no PVs in the VGs), then you should start looking at the PVs. Use the command lqueryvg -At -p hdiskX to view the VGDA information residing on the disk. This will contain information like the name of the VG and the VG characteristics, including the Logical Volumes.
The first thing to try is the importvg command, e.g. importvg hdisk1.
If no luck, you can attempt to remove the disk(s), run cfgmgr -v, try importvg hdisk1 again, and finally a recreatevg -y VGNAME hdisk1.
NOTE: An importvg should only be required for one of the PVs (per VG), as the VGDA information on the other PVs will be found, automatically imported into the VG, and the VG will then be varied on.
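A sketch of that troubleshooting sequence, assuming hdisk1 and VGNAME are placeholders for the affected disk and Volume Group:
 lqueryvg -At -p hdisk1     # view the VGDA on the disk (VG name, LVs, VGID)
 importvg hdisk1            # first attempt: import the VG from one of its PVs
 # If that fails, remove and rediscover the disk, then retry
 rmdev -dl hdisk1
 cfgmgr -v
 importvg hdisk1
 # Last resort from the steps above: recreate the VG from the disk
 recreatevg -y VGNAME hdisk1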
== Actual Outage ==
With the actual outage recently performed, I can state that a dump of the VGs was quite helpful. Running lsvg | awk '{print "echo -----"; print "lsvg "$1}' | sh > /tmp/lsvgs.txt grabbed the human-readable name of each Volume Group along with its VGID. When the server had its LUNs migrated and the system rebooted, the affected VGs weren't listed.
Running lqueryvg -At -p hdiskX read the VGDA information from the drive. By taking the VGID and grepping it against the /tmp/lsvgs.txt file, it was very easy to tell the name of the original VG. The VG can then be imported with importvg -y ORIGINALVGNAME hdiskX (replacing ORIGINALVGNAME with the Volume Group name gathered from /tmp/lsvgs.txt and hdiskX with the affected hdisk).
NOTE: The importvg command (per VG) is only required to be run against one of the PVs. If the VG consists of 15 PVs, importvg will find all 15 PVs and bring them into that Volume Group.
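A sketch of that lookup, assuming /tmp/lsvgs.txt was captured before the outage and that hdisk4, datavg, and the VGID shown are hypothetical placeholders:
 # Read the VGDA on the migrated disk; note the VGID it reports
 lqueryvg -At -p hdisk4
 # The first line of each "lsvg VGNAME" block holds both the VG name and
 # its identifier, so grepping the VGID reveals the original name
 grep 00c1234500004c00 /tmp/lsvgs.txt
 # Import the VG under its original name using any one of its PVs
 importvg -y datavg hdisk4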
After the importvg command was run, the ODM was out of sync with the LVM data (fixed by running synclvodm VGNAME) and the VG size had changed, so to update the VG one would run chvg -g VGNAME. Once that is done, the system would be pretty much back to normal. The last couple of steps done were:
* mount -a (mounts any missing File Systems)
* run the MPIO settings against the newly discovered PVs
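A sketch of those final steps, assuming datavg is a placeholder VG name, the default AIX MPIO PCM is in use, and the algorithm/reserve_policy values are examples rather than the settings used in this outage:
 synclvodm -v datavg   # resync the ODM with the LVM data after the import
 chvg -g datavg        # re-examine the disks in case the LUN sizes changed
 mount -a              # mount any file systems that are not already mounted
 # Apply MPIO attributes to each newly discovered disk (example values);
 # if a disk is busy, add -P to chdev and apply the change at the next reboot
 for D in $(lsvg -p datavg | awk 'NR>2 {print $1}'); do
     chdev -l $D -a algorithm=round_robin -a reserve_policy=no_reserve
 done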