dhanaunix: July 2016

Sunday, July 31, 2016

Removing LDoms on Solaris 10 SPARC Machines

1) Removing the configuration

# ldm ls-config

# ldm rm-config <config name>

2) Disable the LDom services

# svcadm disable ldmd

# svcadm disable vntsd

3) Removing the packages

# pkgrm SUNWldm

# pkgrm SUNWjass

4) From the console prompt, reset the settings

sc> bootmode config="factory-default"

sc> poweroff -y

5) Power the system back on and attach to the console

sc> poweron -c

To erase the data on a tape

# mt -f /dev/rmt/1m erase

Thursday, July 28, 2016

Giving crontab permission to the oracle user in HP-UX

1) Checking cron jobs for oracle user.

#crontab -l oracle
crontab: can't open your crontab file.

2) Add the oracle user to the /var/adm/cron/cron.allow file to give that user crontab permission.

# vi /var/adm/cron/cron.allow
oracle

3) The oracle user can now create cron jobs.

#crontab -e
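
Step 2 can also be done without an editor. The following is only a minimal sketch: it assumes the HP-UX location /var/adm/cron/cron.allow, the function name allow_cron_user is made up for illustration, and the optional second argument lets you try it on a scratch file first.

```shell
# Append a user to a cron.allow file only if it is not already listed.
# Default path is an assumption (HP-UX layout); pass another path to override.
allow_cron_user() {
    user=$1
    allow_file=${2:-/var/adm/cron/cron.allow}
    # -x: whole-line match, -F: fixed string, -q: no output
    if grep -qxF -- "$user" "$allow_file" 2>/dev/null; then
        echo "$user already allowed"
    else
        echo "$user" >> "$allow_file"
        echo "$user added"
    fi
}
```

Usage: allow_cron_user oracle (as root). Running it twice does not create a duplicate line.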

Tuesday, July 26, 2016

Adding a swap device from a logical volume in HP-UX

1) Use swapinfo command to view the current status of swap devices.

#swapinfo

2) Create a new logical volume called lv-swap, in the vg01 volume group, to be used as a secondary swap device.

#lvcreate -L 1000 -C y -n lv-swap vg01

3) Edit the /etc/fstab file to add the swap device.

#vi /etc/fstab
Add the line: /dev/vg01/lv-swap ... swap pri=1 0 0

4) Activate the new swap device.

# swapon -a

5) Finally, check the swapinfo for the newly added swap device.

#swapinfo

Monday, July 25, 2016

Online mirror root-disk replacement in HP-UX

Caution: Before starting, make sure that the remaining disk is really bootable. Use the vxvmboot and lifls commands to verify this. Also make sure that a recent Ignite-UX recovery image has been taken and that you have a valid backup of your data.

1) Without removing mirror

# vxdisk -o alldgs list | grep root

# setboot
Primary bootpath : 2/0/1/0/0.1.0
Alternate bootpath : 2/0/1/0/0.0.0

==> We would like to replace c0t1d0

2) Remove the disk from kernel

# vxdg -k -g rootdg rmdisk rootdisk01
# vxdisk -o alldgs list | grep root

3) Pull out the removed disk and put in the replacement disk.

4) Run vxdisksetup on the newly inserted disk:

# /etc/vx/bin/vxdisksetup -iB c0t1d0

# vxdisk -o alldgs list | grep root

You can go ahead even if the new disk does not show up in the list.

5) Add the replaced disk back into rootdg

# vxdg -k -g rootdg adddisk rootdisk01=c0t1d0

# vxdisk -o alldgs list | grep rootdg

6) Check whether the mirror disk plexes are still in the DISABLED RECOVER state

# vxprint -thg rootdg

7) Recover the mirror

# vxrecover -b -g rootdg rootdisk01

Check with vxtask list whether the job has finished.

8) Use vxprint to check that all the plexes are ENABLED ACTIVE

# vxprint -thg rootdg

9) Only when step 7 has finished, make the replacement mirror disk bootable

# /usr/lib/vxvm/bin/vxbootsetup rootdisk01

# vxvmboot -v /dev/rdsk/c0t1d0

# lifls -l /dev/rdsk/c0t1d0

Corrupt sar data files

The most likely cause of this error message is that two sa1 sar processes are running at the same time.

With two sa1 processes writing to the same file, one overwrites data written by the other. This corrupts the sar data file and confuses sar when the file is read. Check that there are no duplicate entries in cron that run sa1 and sa2 at the same time.

Also check /var/adm/messages for warning messages from the disk during the time cron is trying to run accounting/sar. These could indicate that a disk is going bad. Such messages are tagged like this:

[ID 107833 kern.warning]
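
The duplicate-entry check can be scripted. This is a rough sketch only: the function name dup_sa_entries is hypothetical, it expects the crontab saved to a file, and it reports exact duplicate lines only, not entries whose schedules merely overlap.

```shell
# Print cron lines that run sa1 or sa2 and occur more than once.
# Comment lines are skipped; uniq -d prints one copy of each repeated line.
dup_sa_entries() {
    grep -v '^[[:space:]]*#' "$1" | grep 'sa[12]' | sort | uniq -d
}
```

Usage, assuming sar is collected from the sys crontab: crontab -l sys > /tmp/sys.cron, then dup_sa_entries /tmp/sys.cron. No output means no exact duplicates were found.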

Fan Sensor Problem on Solaris Server

The server has an issue with the fan sensor at FANBD0/FM0/F0: its status is Unknown in the prtdiag output. We raised a case with Sun and received the following update:

"This is a known issue:

If you would update your kernel patch to 137137-09 (or latest) as well as update your firmware with patch 136932, this issue should disappear. "

Both patches were applied and the issue was fixed.

Sunday, July 24, 2016

Interface Bonding in Redhat Linux

Step 1: Creating the Bonding Channel
# vi /etc/modprobe.d/bonding.conf
alias bond0 bonding

Step 2: Creating the Channel Bonding Interface
# vi /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.1.8
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
USERCTL=no

Step 3: Configuring Channel Bonding Interface

After the channel bonding interface is created, the network interfaces to be bound together must be configured by adding the MASTER and SLAVE directives to their configuration files. The configuration files for the channel-bonded interfaces can be nearly identical. For example, if two Ethernet interfaces, eth0 and eth1, are being channel bonded, both may look like the following. Edit the physical interface details as shown.

For eth0

# vi /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none

For eth1

# vi /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
TYPE=Ethernet

- DEVICE: the device name
- USERCTL: whether non-root users can control this device (here, no)
- ONBOOT: whether the device is brought up at boot time
- MASTER: the bonding master this device belongs to (here, bond0)
- SLAVE: whether this device acts as a slave
- BOOTPROTO: how the IP address is obtained; none means no DHCP, so the address is static

Step 4: Restarting Network Service

# service network restart
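
After the restart it is worth confirming that both slaves joined the bond. The sketch below reads the bonding driver's status file, /proc/net/bonding/bond0, and prints each slave with its link state. The function name bond_slaves is made up for illustration, and the parsing assumes the usual "Slave Interface:" / "MII Status:" layout of that file.

```shell
# Print "<slave> <MII status>" for every slave listed in a bonding status file.
bond_slaves() {
    awk -F': ' '
        /^Slave Interface:/ { slave = $2 }
        # The first MII Status line belongs to the bond itself, so it is
        # skipped while no slave name has been seen yet.
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

Usage: bond_slaves, which should list both eth0 and eth1 as up if the configuration above took effect.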

Changing the hostname in Red Hat 6

1) hostname (shows the current name)
2) hostname newname (sets the new name; this is temporary and lost at reboot)
3) hostname (check the change)
4) change the hostname in /etc/hosts if it is not in DNS
5) change the hostname in /etc/sysconfig/network
6) reboot

Preventing /etc/resolv.conf from changing automatically (entries often change after reboot) - Linux

1) cat /etc/resolv.conf
service network restart
cat /etc/resolv.conf (entries will have changed)
2) service NetworkManager status (this should be stopped)
NetworkManager reads the configuration from the files under /etc/sysconfig/network-scripts and rebuilds the resolv.conf file.
service NetworkManager stop
3) chkconfig --list NetworkManager
chkconfig NetworkManager off
4) cd /etc/sysconfig/network-scripts
vi ifcfg-eth0 (configuration file for the DHCP interface)

PEERDNS=no (this makes sure that the resolv.conf entries do not change)
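
The PEERDNS edit can be made repeatable with a small helper. This is a sketch under assumptions: the function name set_peerdns_no is hypothetical, and it takes the path of the ifcfg file as its argument, so it can be tried on a copy first.

```shell
# Force PEERDNS=no in an ifcfg file: rewrite the line if present, else append it.
set_peerdns_no() {
    cfg=$1
    if grep -q '^PEERDNS=' "$cfg"; then
        # Write to a temporary file and move it back, since in-place sed
        # options differ between platforms.
        sed 's/^PEERDNS=.*/PEERDNS=no/' "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
    else
        echo 'PEERDNS=no' >> "$cfg"
    fi
}
```

Usage: set_peerdns_no /etc/sysconfig/network-scripts/ifcfg-eth0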

Adding Windows route

Syntax - route add -p <client host backend network> mask <netmask> <gateway>

Ex. - route add -p 10.214.6.0 mask 255.255.255.0 10.214.6.0

Decommissioning the Backup Server

1) When servers are to be decommissioned, we need to disable their backups from the console server.
2) Log in to the management server (Windows Mgmt Server), open the console and click on Manage.
3) Search for the host that needs to be decommissioned; if there is more than one, select all at once.
4) Right-click and choose Unassign computers -> Disable the licenses and the backup jobs.
5) Right-click and choose Unmanage computers, and check that the servers have been removed.
6) Click on My Computer -> C: -> winbkp -> bkpconfig -> Remove the servers from the corresponding file.
Before that, confirm whether and from where the servers are backed up.
7) Open the bkpconfig.txt file and remove all entries that belong to the servers being decommissioned.
8) Go to the scheduled job, remove the schedule and mark it as a free schedule.
9) Fire any tape backups that have not been sent yet.

Wednesday, July 20, 2016

Route adding in Solaris 11

route -p add net <network> -netmask <netmask ip> <gateway> -ifp <interface>

route -p add net 10.170.28.0 -netmask 255.255.255.0 10.170.8.1 -ifp tst

Tuesday, July 19, 2016

Increasing File System in VXVM

1) Verify recent flasharchive on the system.

2) Verify a recent backup of the file systems to be modified.

3) Notify the server's CMDB management group and Operations that the change is starting.

4) Use 'df -k' on the partitions to be resized, and store the total size of the file systems.

# df -k /app

Initialize the disk

# /etc/vx/bin/vxdisksetup -i c5t3d9 format=cdsdisk

5) Add the disk into the required diskgroup "datadg"

# vxdg -g datadg adddisk data_emcd34=c5t3d9

Verify that there is sufficient space to complete the request:

# vxassist -g datadg maxsize

6) Increase the size of the /app filesystem by 25GB.

# vxresize -g datadg app +25g

Use 'df -k' to verify increased space when compared to the size before the resize.

# df -k /app
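
The before/after comparison is easier if the size is captured as a number. A minimal sketch, assuming a df that accepts -P (on Solaris that is /usr/xpg4/bin/df); the function names fs_total_kb and growth_gb are made up for illustration.

```shell
# Total size of a mounted file system in KB (second column of df -kP).
# -P keeps each file system on one line even when the device name is long.
fs_total_kb() {
    df -kP "$1" | awk 'NR == 2 { print $2 }'
}

# Whole GB gained between two KB figures: growth_gb <before_kb> <after_kb>
growth_gb() {
    echo $(( ($2 - $1) / 1048576 ))
}
```

Usage: before=$(fs_total_kb /app) ahead of the vxresize, after=$(fs_total_kb /app) once it is done, then growth_gb "$before" "$after". File system overhead can make the result slightly smaller than the amount requested.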

Memory Utilization is High on Solaris Server

Please find the memory utilization status for server radha34.

radha34% prstat -t
NPROC USERNAME SIZE RSS MEMORY TIME CPU
446 tcadmin 10G 5241M 88% 84:10.28 9.6%
64 root 518M 249M 4.1% 145:25.19 4.8%
3 balep 13M 8624K 0.1% 0:00.00 0.2%
1 nobody 3224K 2080K 0.0% 0:00.00 0.0%
4 topazmon 16M 7712K 0.1% 0:00.00 0.0%
3 rchr08 12M 6432K 0.1% 0:00.01 0.0%

radha34% swap -s
total: 2492656k bytes allocated + 2925688k reserved = 5418344k used, 8517704k available
radha34%

radha34% prstat | grep -v tcadmin
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
681 root 5240K 3552K sleep 0 0 8:37.14 0.9% automountd/5
1106 root 7632K 6176K sleep 44 0 26:28.37 0.3% sysedge.sol28-s/1
21308 balep 4624K 4360K cpu1 38 0 0:00.00 0.2% prstat/1
Total: 524 processes, 3040 lwps, load averages: 0.18, 0.32, 0.41

tcadmin uses 88% of memory, and we could not find any other significant process running on this server.

Kill the process if you get clearance from the responsible team member; otherwise escalate the issue.
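
For reports like this it helps to turn the swap -s line into a single percentage. The sketch below is illustrative only: the function name swap_used_pct is hypothetical, and it assumes the "NNNk used, NNNk available" wording shown in the swap -s output above.

```shell
# Read `swap -s` output on stdin and print used swap as a whole percentage.
swap_used_pct() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($(i) == "used,")     used  = $(i - 1)
            if ($(i) == "available") avail = $(i - 1)
        }
        # Strip the trailing "k" unit before doing arithmetic
        sub(/k$/, "", used); sub(/k$/, "", avail)
        printf "%d\n", used * 100 / (used + avail)
    }'
}
```

Usage: swap -s | swap_used_pct. With the figures above (5418344k used, 8517704k available) it prints 38.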

Monday, July 18, 2016

Removing files older than 30 days in Unix

To list the larger files (here, more than 10000 512-byte blocks, about 5 MB):

find . -size +10000 -exec ls -lh {} \;

To remove files older than 30 days:

find /var/spool/clientmqueue -mtime +30 -exec rm {} \;
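
Before deleting, it is safer to list what would be removed. A small sketch using the same -mtime test; the function names list_old_files and remove_old_files are made up for illustration.

```shell
# List regular files under a directory that are older than N days.
list_old_files() {
    find "$1" -type f -mtime +"$2" -print
}

# Remove the same set of files; "{} +" batches many names into one rm call.
remove_old_files() {
    find "$1" -type f -mtime +"$2" -exec rm -f {} +
}
```

Usage: list_old_files /var/spool/clientmqueue 30 to review, then remove_old_files /var/spool/clientmqueue 30.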

Replacing failed root disk in VXVM

First, we must remove the disk for replacement:

# vxdg -g rootdg -k rmdisk rootdisk

This will save all associated objects in the diskgroup and place the disk into a "removed:was" state.

At this point you will want to do the physical replacement of the disk.
1. Swap out old drive for new drive
2. Get the new drive visible to format and label

NOTE: Your replacement drive must be of equal size or larger than the one you are replacing.

Once labeled ensure VxVM can see the new disk:
# vxdisk scandisks

The drive should appear as: "online invalid"

Once in this state, set up the disk for VxVM usage (use whatever name the disk has in the disk list in place of Disk_0):
# /etc/vx/bin/vxdisksetup -i Disk_0 format=sliced

Once the disk is set up and shown in the disk list as "online", we can put it back into the configuration:
# vxdg -g rootdg -k adddisk rootdisk=Disk_0

Then recover all the objects (this forces a sync):
# vxrecover -g rootdg rootdisk &

You can monitor the above sync with:
# vxtask -l list

At this stage, the best thing is to reboot the server; after that we carry on with adding the rootdisk to rootdg.

Server panic reboot - Created a case to Sun Support

# ls -ltr |grep pab028
-rw-r--r-- 1 root apac 29587258 Aug 23 21:29 explorer.849d6570.pab028-2008.08.24.02.22.tar.gz

# cp explorer.849d6570.pab028-2008.08.24.02.22.tar.gz /tmp/suncase66149117.fnac.explorer.849d6570.pab028-2008.08.24.02.22.tar.gz

# cd /tmp
# ls -ltr |grep suncase66149117
-rw-r--r-- 1 root apac 29587258 Aug 23 21:44 suncase66149117.fnac.explorer.849d6570.pab028-2008.08.24.02.22.tar.gz

# cd /var/crash
# ls -l
total 4
drwx------ 2 root root 512 Nov 23 18:48 pab028
drwx------ 2 root root 512 Aug 1 2007 moplgtotest
# cd pab028
# ls -l
total 12393154
-rw-r--r-- 1 root root 2 Nov 23 18:48 bounds
-rw-r--r-- 1 root root 2481536 Nov 23 18:42 unix.0
-rw-r--r-- 1 root root 6339698688 Nov 23 18:48 vmcore.0

# gzip unix.0
# gzip vmcore.0

# mv unix.0.gz 66166352.unix.0.gz
# mv vmcore.0.gz 66166352.vmcore.0.gz

We uploaded the explorer and core files under the cores directory on supportfiles.sun.com.

Vxresize fails with error message "Subdisk data_emcd1-02 would overlap subdisk data_emcd1-01"

bash-2.05# /etc/vx/bin/vxresize -g datadg health +49g
VxVM vxassist ERROR V-5-1-10127 creating subdisk data_emcd1-02:
Subdisk data_emcd1-02 would overlap subdisk data_emcd1-01
VxVM vxresize ERROR V-5-1-4703 Problem running vxassist command for volume health, in diskgroup datadg
bash-2.05#

"vxprint -thrg datadg" did not show the new disk "c2t2d1" that had been added to that disk group.

# vxprint -thrg datadg | grep health-01
sd data_emcd1-01 health-01 data_emcd1 0 209704704 0 c2t0d0 ENA

Solution:

Cleared the issue with the vxconfigd daemon by running "vxconfigd -k -x cleartempdir" and then extended the volume.

This command removes and recreates the vxconfigd temporary directory online, without affecting normal operation of the server.
# vxconfigd -k -x cleartempdir
# vxprint -thrg datadg | grep health-01
sd data_emcd1-01 health-01 data_emcd1 0 209704704 0 c2t0d0 ENA
sd data_emcd2-02 health-01 data_emcd2 0 102760448 209704704 c2t2d1 ENA

# /etc/vx/bin/vxresize -g datadg health +49g
# df -h /opt/health
Filesystem size used avail capacity Mounted on
/dev/vx/dsk/datadg/health
149G 62G 86G 42% /opt/health

Users are not able to log in after migration

If you are not sure before the migration which authentication method the users use, check it with the tool below. It worked for me.

# authconfig-tui

In our case, enabling local authorization fixed the issue.

VNC service went to maintenance mode on Solaris 10

# svcs -l svc:/application/x11/xvnc-inetd:default
fmri svc:/application/x11/xvnc-inetd:default
name X server that displays to VNC viewers
enabled true
state maintenance
next_state none
state_time Thu Jun 30 16:15:33 2016
restarter svc:/network/inetd:default

Error in /var/svc/log :

Executing start method ("/lib/svc/method/fs-local") ]
cannot mount 'rpool/export' on '/export': directory is not empty
WARNING: /usr/sbin/zfs mount -a failed: one or more file systems failed to mount
[ Jun 17 00:52:19 Method "start" exited with status 0 ]

Error in /var/adm/messages :

inetd[15492]: [ID 702911 daemon.error] Property 'name' of instance svc:/application/x11/xvnc-inetd:default is missing, inconsistent or invalid
Jun 30 16:15:27 muvmzn015 inetd[15492]: [ID 702911 daemon.error] Property 'proto' of instance svc:/application/x11/xvnc-inetd:default is missing, inconsistent or invalid
Jun 30 16:15:33 muvmzn015 inetd[15492]: [ID 702911 daemon.error] Property 'name' of instance svc:/application/x11/xvnc-inetd:default is missing, inconsistent or invalid
Jun 30 16:15:33 muvmzn015 inetd[15492]: [ID 702911 daemon.error] Property 'proto' of instance svc:/application/x11/xvnc-inetd:default is missing, inconsistent or invalid
Jun 30 16:15:33 muvmzn015 inetd[15492]: [ID 702911 daemon.error] Invalid configuration for instance svc:/application/x11/xvnc-inetd:default, placing in maintenance

Issue: the vnc-server entry in /etc/services was commented out.

home$ cat /etc/services | grep -i vnc
#vnc-server 5900/tcp # Xvnc

Fixed after editing as below:

home$ cat /etc/services | grep -i vnc
vnc-server 5900/tcp # Xvnc

Restart the inetd service (sometimes the system needs a reboot).

Host lost its virtual interface to DOM on Oracle Linux

[root@host1 ~]# netstat -rn

Kernel IP routing table

Destination Gateway Genmask Flags MSS Window irtt Iface

140.85.50.16 0.0.0.0 255.255.255.240 U 0 0 0 eth0

144.20.63.32 10.225.160.1 255.255.255.224 UG 0 0 0 eth1

140.85.21.0 10.225.160.1 255.255.255.128 UG 0 0 0 eth1

144.20.110.128 10.225.160.1 255.255.255.128 UG 0 0 0 eth1

144.20.116.128 10.225.160.1 255.255.255.128 UG 0 0 0 eth1

10.224.124.0 10.225.160.1 255.255.255.0 UG 0 0 0 eth1

144.20.118.0 10.225.160.1 255.255.255.0 UG 0 0 0 eth1

144.20.54.0 10.225.160.1 255.255.255.0 UG 0 0 0 eth1

140.85.2.0 10.225.160.1 255.255.254.0 UG 0 0 0 eth1

140.85.12.0 10.225.160.1 255.255.252.0 UG 0 0 0 eth1

10.225.160.0 0.0.0.0 255.255.248.0 U 0 0 0 eth1

10.224.96.0 10.225.160.1 255.255.248.0 UG 0 0 0 eth1

169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth1

0.0.0.0 140.85.50.17 0.0.0.0 UG 0 0 0 eth0

[root@host1 ~]# ping 10.225.160.1

PING 10.225.160.1 (10.225.160.1) 56(84) bytes of data.

From 10.225.161.133 icmp_seq=1 Destination Host Unreachable

From 10.225.161.133 icmp_seq=2 Destination Host Unreachable

From 10.225.161.133 icmp_seq=3 Destination Host Unreachable

[root@host1 ~]# ping abc12stor12-nas

PING abc12stor12-nas.us.oracle.com (10.225.163.240) 56(84) bytes of data.

--- abc12stor12-nas.us.oracle.com ping statistics ---

0 packets transmitted, 0 received

[root@host1 ~]#

[root@host1 ~]# ping abc12osb12-nfs

PING abc12osb12-nfs.us.oracle.com (10.225.164.135) 56(84) bytes of data.

From host1-nfs.us.oracle.com (10.225.161.133) icmp_seq=2 Destination Host Unreachable

From host1-nfs.us.oracle.com (10.225.161.133) icmp_seq=3 Destination Host Unreachable

From host1-nfs.us.oracle.com (10.225.161.133) icmp_seq=4 Destination Host Unreachable

--- abc12osb12-nfs.us.oracle.com ping statistics ---

5 packets transmitted, 0 received, +3 e

On DOM ---

[root@audom11 ~]# xm list

Name ID Mem VCPUs State Time(s)

45476_host1 5 8192 2 r----- 1881798.9

45481_server1 6 16384 4 -b---- 2652174.1

45486_server2 7 16384 4 -b---- 1111889.1

45491_server3 12 32768 8 -b---- 1285069.8

45496_server4 14 8192 2 -b---- 15698.7

Domain-0 0 2048 24 r----- 1462970.9

[root@audom11 ~]

It is possible that the VM has "lost" its link to dom0.

First get the dom0 details for the host.

Check all the bridges and bonds (bond1 is usually the NFS interface).

It is definitely a dom0 problem, because that dom0 runs 4 other VMs, all with the same issue.

[root@audom11 ~]# xm list

Name ID Mem VCPUs State Time(s)

45476_host1 5 8192 2 -b---- 1880611.3

45481_server1 6 16384 4 -b---- 2652141.5

45486_server2 7 16384 4 -b---- 1111874.7

45491_server3 12 32768 8 -b---- 1285042.4

45496_server4 14 8192 2 -b---- 15683.1

Domain-0 0 2048 24 r----- 1462740.8

[root@audom11 ~]# xm network-list 45476_host1

Idx BE MAC Addr. handle state evt-ch tx-/rx-ring-ref BE-path

0 0 00:16:3E:14:0B:02 0 4 13 1280 /1281 /local/domain/0/backend/vif/5/0

1 0 00:16:3E:38:24:93 1 4 14 1282 /1283 /local/domain/0/backend/vif/5/1

[root@audom11 ~]# xm network-detach 45476_host1 1

[root@audom11 ~]# xm network-list 45476_host1

Idx BE MAC Addr. handle state evt-ch tx-/rx-ring-ref BE-path

0 0 00:16:3E:14:0B:02 0 4 13 1280 /1281 /local/domain/0/backend/vif/5/0

[root@audom11 ~]# xm network-attach 45476_host1 bridge=br93 mac=00:16:3E:38:24:93

[root@audom11 ~]# xm network-list 45476_host1

Idx BE MAC Addr. handle state evt-ch tx-/rx-ring-ref BE-path

0 0 00:16:3E:14:0B:02 0 4 13 1280 /1281 /local/domain/0/backend/vif/5/0

2 0 00:16:3E:38:24:93 2 4 14 1282 /1365 /local/domain/0/backend/vif/5/2

The issue was fixed after detaching and re-attaching the interface as above.

Sunday, July 17, 2016

Restoring the spfile to a pfile from the RMAN prompt

RMAN> restore spfile to pfile '/home/oracle/init24.ora' from '/testbackup/rmanfull20140821/JBL_T24_DB_CTL_c-1296243675-20140821-01';

RMAN> set DBID 1296243675

executing command: SET DBID

RMAN> startup force nomount

Oracle instance started

Total System Global Area 20911292416 bytes

To check available tape backups from RMAN

instance user $ rman target /

RMAN> list backup; - to check the RMAN backups
RMAN> list backup summary;
RMAN> list backup by file;

RMAN> list backup completed before 'sysdate' device type disk; - list of backups available on disk
RMAN> show all; - to check the retention policy
RMAN> list backup completed before 'sysdate' device type SBT_TAPE; - list of tape backups

Command that lists the backups that can be used for a restore:

RMAN> list backup summary tag INCR_BACKUPSET_0 completed after '04-NOV-2012' device type disk;

RMAN> backup database plus archivelog; (the spfile and control file are backed up too; the time taken depends on size)

sg_map got stuck on Linux media server

[root@adc12osbmed01 by-id]# sg_map -i -x
Strange, could not find device /dev/nst0 mapped to sg device??
Strange, could not find device /dev/nst2 mapped to sg device??
device /dev/nst5 failed on scsi ioctl(idlun), skip: Input/output error
Strange, could not find device /dev/nst7 mapped to sg device??
Strange, could not find device /dev/nst8 mapped to sg device??
Strange, could not find device /dev/nst11 mapped to sg device??
Strange, could not find device /dev/nst12 mapped to sg device??
Strange, could not find device /dev/nst13 mapped to sg device??
Strange, could not find device /dev/nst14 mapped to sg device??
Strange, could not find device /dev/nst15 mapped to sg device??
Strange, could not find device /dev/nst16 mapped to sg device??
/dev/sg0 0 0 0 0 0 /dev/sda HITACHI H103014SCSUN146G A2A8
/dev/sg1 0 0 1 0 0 /dev/sdb HITACHI H103014SCSUN146G A2A8
/dev/sg2 0 0 2 0 0 /dev/sdc ATA SEAGATE ST95000N SF03
/dev/sg3 0 0 3 0 0 /dev/sdd ATA SEAGATE ST95000N SF03
/dev/sg4 0 0 4 0 0 /dev/sde ATA SEAGATE ST95000N SF03
/dev/sg5 0 0 5 0 0 /dev/sdf ATA SEAGATE ST95000N SF03
/dev/sg6 0 0 6 0 0 /dev/sdg ATA SEAGATE ST95000N SF03
/dev/sg7 0 0 7 0 0 /dev/sdh ATA SEAGATE ST95000N SF03
/dev/sg8 -2 -2 -2 -2 -2
/dev/sg9 7 0 1 0 1 /dev/nst1 HP Ultrium 5-SCSI I59S
/dev/sg10 -2 -2 -2 -2 -2
/dev/sg11 7 0 3 0 1 /dev/nst3 HP Ultrium 5-SCSI I3CS
/dev/sg12 7 0 4 0 1 /dev/nst4 HP Ultrium 5-SCSI I3CS
/dev/sg13 -2 -2 -2 -2 -2
/dev/sg14 8 0 1 0 1 /dev/nst6 HP Ultrium 5-SCSI I3CS
/dev/sg15 -2 -2 -2 -2 -2
/dev/sg16 -2 -2 -2 -2 -2
/dev/sg17 8 0 4 0 1 /dev/nst9 HP Ultrium 5-SCSI I3CS
/dev/sg18 8 0 5 0 1 /dev/nst10 HP Ultrium 5-SCSI I3CS
/dev/sg19 -2 -2 -2 -2 -2
/dev/sg20 -2 -2 -2 -2 -2
/dev/sg21 -2 -2 -2 -2 -2
/dev/sg22 -2 -2 -2 -2 -2
/dev/sg23 -2 -2 -2 -2 -2
/dev/sg24 -2 -2 -2 -2 -2
/dev/sg25 -2 -2 -2 -2 -2
/dev/sg26 9 0 0 0 5 /dev/scd0 TEAC DV-W28SS-R 1.0C

sg_map itself is fine. We dismounted one bad media, cancelled one job and refired it, re-added the SCSI devices for the errored drives and ran the inventory. All the drives except one now seem to be okay.

sg_scan would also do.

[root@adc12osbmed01 ~]# sg_map -i -x
/dev/sg0 0 0 0 0 0 /dev/sda HITACHI H103014SCSUN146G A2A8
/dev/sg1 0 0 1 0 0 /dev/sdb HITACHI H103014SCSUN146G A2A8
/dev/sg2 0 0 2 0 0 /dev/sdc ATA SEAGATE ST95000N SF03
/dev/sg3 0 0 3 0 0 /dev/sdd ATA SEAGATE ST95000N SF03
/dev/sg4 0 0 4 0 0 /dev/sde ATA SEAGATE ST95000N SF03
/dev/sg5 0 0 5 0 0 /dev/sdf ATA SEAGATE ST95000N SF03
/dev/sg6 0 0 6 0 0 /dev/sdg ATA SEAGATE ST95000N SF03
/dev/sg7 0 0 7 0 0 /dev/sdh ATA SEAGATE ST95000N SF03
/dev/sg8 7 0 0 0 1 /dev/nst0 HP Ultrium 5-SCSI I3CS
/dev/sg9 7 0 1 0 1 /dev/nst1 HP Ultrium 5-SCSI I59S
/dev/sg10 7 0 2 0 1 /dev/nst2 HP Ultrium 5-SCSI I3CS
/dev/sg11 7 0 3 0 1 /dev/nst3 HP Ultrium 5-SCSI I3CS
/dev/sg12 7 0 4 0 1 /dev/nst4 HP Ultrium 5-SCSI I3CS
/dev/sg13 -2 -2 -2 -2 -2
/dev/sg14 8 0 0 0 1 /dev/nst5 HP Ultrium 5-SCSI I3CS
/dev/sg15 8 0 1 0 1 /dev/nst6 HP Ultrium 5-SCSI I3CS
/dev/sg16 8 0 2 0 1 /dev/nst7 HP Ultrium 5-SCSI I3CS
/dev/sg17 8 0 3 0 1 /dev/nst8 HP Ultrium 5-SCSI I3CS
/dev/sg18 8 0 4 0 1 /dev/nst9 HP Ultrium 5-SCSI I3CS
/dev/sg19 8 0 5 0 1 /dev/nst10 HP Ultrium 5-SCSI I3CS
/dev/sg20 8 0 6 0 1 /dev/nst11 HP Ultrium 5-SCSI I59S
/dev/sg21 8 0 7 0 1 /dev/nst12 HP Ultrium 5-SCSI I59S
/dev/sg22 8 0 8 0 1 /dev/nst13 HP Ultrium 5-SCSI I59S
/dev/sg23 8 0 9 0 1 /dev/nst14 HP Ultrium 5-SCSI I59S
/dev/sg24 8 0 10 0 1 /dev/nst15 HP Ultrium 5-SCSI I59S
/dev/sg25 8 0 11 0 1 /dev/nst16 HP Ultrium 5-SCSI I59S
/dev/sg26 9 0 0 0 5 /dev/scd0 TEAC DV-W28SS-R 1.0C
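
When comparing listings like the two above, the rows to look for are the sg devices whose address columns are -2. A minimal filter, with the hypothetical name unmapped_sg, that reads sg_map -i -x output on stdin:

```shell
# Print the sg devices whose second column is -2 (no usable mapping shown).
# Lines such as "Strange, could not find ..." are ignored by the first test.
unmapped_sg() {
    awk '$1 ~ /^\/dev\/sg[0-9]+$/ && $2 == "-2" { print $1 }'
}
```

Usage: sg_map -i -x | unmapped_sg. On the second listing above it would report only /dev/sg13.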

To Stop & Start Database services

To stop database services, use the following:

# srvctl stop listener -l <listener_name>
# srvctl stop database -d <db_unique_name>

To start database services, use the following:

# srvctl start listener -l <listener_name>
# srvctl start database -d <db_unique_name>

To verify database services, use the following:

# srvctl status listener -l <listener_name>
# srvctl status database -d <db_unique_name>

Saturday, July 16, 2016

Assigning and Unassigning volumes from ACSLS Web GUI

Unassigning volumes:

Configuration and Administration -> Logical Library Configuration -> Unassign Volumes -> Select the server from which you want to unassign -> Select volumes -> Tick the media and click on Continue -> Click on Unassign

Run an inventory on the OSB server and check for the media.

Select volumes -> If the volumes are not displayed, they have not been unassigned. Alternatively, click on Custom Filter under the Filter drop-down -> Select the volume id -> Is -> Media -> OK

Assigning volumes:

Select logical library - zone -> Select volumes -> Tick the media and click on Continue -> Assign -> Run an inventory and check the media on the zone -> It will be labeled as Barcode

Battery down on ACSLS Server

1. Stop OSB services (make sure all drives are unloaded)
2. acsss disable
3. db_export.sh -f /export/backup/pre_outage
   option: 8
4. scp the file pre_outage to adc-acsls
5. acsss shutdown
6. acsss status
7. Bring down the server
8. Change the battery and correct the date
9. Verify the date/time on the OS before bringing up services
10. Bring up services

After replacing a drive, ACSLS is not able to mount it

ACSSA> dismount 000000 1,0,10,1 force
Dismount: Dismount failed, Process failure.

After this, the tape could be mounted in the drive.

ACSSA> q drive 1,0,10,1
2012-08-31 04:40:17 Drive Status
Identifier State Status Volume Type
1, 0,10, 1 online in use SB0079 HP-LTO5

CAP shows offline in ACSLS (Automated Cartridge System Library Software)

1. Login as Service on the SL console.

2. Select Tools > Diagnostics

3. Expand CAP folder and select desired CAP

4. CAP should show as "Locked"

5. Change right hand drop box to "False"

6. Click Apply

7. Unlock the CAP on the SLC Console

8. Push flashing CAP button

9. Open CAP and remove carts

10. Push CAP button on CAP again to close the CAP door.

11. CAP goes into an unreserved state not assigned to a partition.

12. From the ACSLS cmd_proc, issue "vary CAP x,x,x online"

RDP to Windows machine from Linux Box

1) rpm -qa | grep -i rdesktop
2) yum search rdesktop
3) yum install rdesktop.x86_64
4) yum info rdesktop

On the Windows machine: My Computer -> Properties -> Remote -> allow remote logins

5) rdesktop windows-v8-1-test

If this gives errors, note that there is a bug in version 6.

6) In that case get the version 7 package and update it
rpm -Uvh rdesktop

7) rpm -qa | grep -i rdesktop
8) rdesktop -f <windows hostname> (to get the full screen)

RAID Implementation on Linux Servers

# mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/hda6 /dev/hda7 (RAID 0)
continue creating array - y
# mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/hda8 /dev/hda9 (RAID 1)
continue creating array - y

# mdadm --create /dev/md5 --level=5 --raid-devices=3 /dev/hda10 /dev/hda11 /dev/hda12 (RAID 5)
continue creating array - y

# mkfs.ext3 /dev/md0 - to create a file system
# mkfs.ext3 /dev/md1
# mkfs.ext3 /dev/md5

# cd /root
# mkdir raid0
# mkdir raid1
# mkdir raid5

# mount /dev/md0 raid0
# mount /dev/md1 raid1
# mount /dev/md5 raid5

# df -hP

# cat /proc/mdstat - to check
# mdadm --detail /dev/md0
# mdadm --detail /dev/md1
# mdadm --detail /dev/md5

# vi /etc/fstab

/dev/md0 /root/raid0 ext3 defaults 0 0
/dev/md1 /root/raid1 ext3 defaults 0 0
/dev/md5 /root/raid5 ext3 defaults 0 0
:wq

# mount -a
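
For the mirrored and parity arrays, /proc/mdstat also shows whether a member is missing: the status looks like [UU] when healthy and contains an underscore, such as [U_], when degraded. The sketch below summarizes that. The function name md_redundancy_state is hypothetical, and RAID 0 arrays are not listed because they have no such status field.

```shell
# Print "<array> <level> ok|degraded" for arrays that report a [UU]-style status.
md_redundancy_state() {
    awk '
        /^md[0-9]+ :/ { name = $1; level = $4 }
        # The status field is the last one on the line after the array header
        /\[[U_]+\]/ && name != "" {
            state = ($NF ~ /_/) ? "degraded" : "ok"
            print name, level, state
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Usage: md_redundancy_state, or pass a saved copy of /proc/mdstat as the argument.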

Adding route in Oracle Linux Server

route add -net 10.217.208.0 netmask 255.255.248.0 gw 10.217.48.1
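
With the newer ip command the same route is written with a prefix length instead of a netmask, for example ip route add 10.217.208.0/21 via 10.217.48.1. The helper below is a minimal sketch for the conversion; the name mask_to_prefix is made up, and it does not check that the mask bits are contiguous.

```shell
# Convert a dotted netmask such as 255.255.248.0 to a prefix length.
mask_to_prefix() {
    bits=0
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the mask into its four octets
    IFS=$old_ifs
    for octet in "$@"; do
        case $octet in
            255) bits=$((bits + 8)) ;;
            254) bits=$((bits + 7)) ;;
            252) bits=$((bits + 6)) ;;
            248) bits=$((bits + 5)) ;;
            240) bits=$((bits + 4)) ;;
            224) bits=$((bits + 3)) ;;
            192) bits=$((bits + 2)) ;;
            128) bits=$((bits + 1)) ;;
            0) ;;
            *) echo "invalid netmask octet: $octet" >&2; return 1 ;;
        esac
    done
    echo "$bits"
}
```

Usage: mask_to_prefix 255.255.248.0 prints 21.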

Assigning IP address in Linux Server

# vi /etc/sysconfig/network-scripts/ifcfg-eth0 (assign/change the ip here)
# service network restart

Changing hostname

vi /etc/sysconfig/network
HOSTNAME=XXXXXX

SSH is not working on Solaris/Linux Hosts

>@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
>@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
>Someone could be eavesdropping on you right now (man-in-the-middle attack)!
>It is also possible that the RSA host key has just been changed.
>The fingerprint for the RSA key sent by the remote host is
>86:0a:09:0d:b6:b9:7e:67:53:e3:f1:df:50:cd:06:e7.
>Please contact your system administrator.
>Add correct host key in /home/cgi/.ssh/known_hosts to get rid of this message.
>Offending key in /home/cgi/.ssh/known_hosts:23
>RSA host key for 10.193.208.170 has changed and you have requested strict checking.
>Host key verification failed.
>lost connection

Remove the old entry by running the command below:

ssh-keygen -R <ip>

Now connect using ssh again. ssh adds a new entry for the host and you can connect to the server without any issue.

Linux OS Backup restore using Dump and Restore

1 Insert the OS media and reboot.
2 Go to Linux rescue mode. Do not mount your root directory in /mnt/sysimage; skip that step.
3 Bring up the eth0 or eth1 interface.
Give your IP and subnet mask
4 If you are not able to ping the other server, check the routing table
# netstat -rn
If no gateway is listed, add your default gateway.
# route add default gw <gateway IP>
Ping your NFS server.
5 Mount the NFS share on the local server.
6 Create an example directory like /example-mount_point.
mount server:/nfs path /example-mount_point
7 Delete the existing filesystem using fdisk command
# fdisk /dev/lmo/c0d0
8 Now create the File systems as per the Prod server and label it accordingly.
# fdisk /dev/lmo/c0d0
Eg: mkfs.ext3 /dev/lmo/c0d0p1
9 Then create directories as following:
mkdir /newroot
mkdir /newvar
mkdir /newopt
mkdir /newboot
10 Identify and label your root, boot, var,opt FS
11 Now mount your file systems on the newly created mount points
Ex: mount /dev/mapper/VolGroup00-lvol00 /newboot
12 Now cd to /newboot and give the following command
# restore -r -f /restore/servername/boot-backup
13 Wait until the full backup has been restored
14 Repeat steps 11-13 for the rest of the file systems.
15 Now label all the file systems as per the DC standard. If it is not a fresh server there is no need to relabel them, because the updated details are taken from your fstab file.
Eg: #e2label /dev/lmo/c0d0p1 /
e2label /dev/lmo/c0d0p2 /boot
e2label /dev/lmo/c0d0p3 /var
Make changes in /etc/fstab file
16 Do the necessary changes in grub.conf file according to your boot partition
Eg: root (hd0,0) for /dev/lmo/c0d0p1 Partition 1st
17 Now reboot the server.
18 Now create the swap partition or swap FS if it is not there. Use the fdisk utility to create it.
# mkswap /dev/lmo/c0d0p5
# swapon /dev/lmo/c0d0p5
# swapon -s

If you face any GRUB-related problem during booting:
Insert the OS media and reboot.
Go to Linux rescue mode and mount your root directory in /mnt/sysimage
#chroot /mnt/sysimage
#/sbin/grub-install /dev/lmo/c0d0

Deleting multiple files in a directory

find . -name 'jive*' | xargs rm -v > test.txt
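
The xargs form above splits on whitespace, so it can misbehave when file names contain spaces. A sketch of an alternative that lets find run rm itself; the function name remove_matching is hypothetical.

```shell
# Remove regular files matching a name pattern and print each removed path.
# -print runs before -exec, so the output doubles as a log of what was deleted.
remove_matching() {
    find "$1" -type f -name "$2" -print -exec rm -f {} \;
}
```

Usage: remove_matching . 'jive*' > test.txt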

Stopping & Starting Linux Cluster

Halting package

clusvcadm -d "package name"

service rgmanager stop

service cman stop

service ccsd stop

Starting package

service ccsd start

service cman start

service rgmanager start

clusvcadm -e "package name"

Migrating Solaris Servers from One Data Center to Other DC

1. Power down all the zones and apps that are writing to the specific ZFS file systems.
2. Mark a ZFS snapshot of these ZFS file systems.
3. Power on the Zones again so that services can be restored.
4. Copy the snapshots onto external hard drives
5. Ship the hard drives to the new Data Center.
6. Restore the snapshots into the destination servers.
7. Use rsync to transfer the zone configurations from the original server to the destination server.
8. Start the Zones and test the services at the destination server - this is for testing purposes.
9. Power down the zones and revert the ZFS file systems back to their snapshots (since the Zone based apps may have written to the ZFS file systems).
10. At the source data center, power down all the zones and apps that are writing to the specific ZFS file systems.
11. Mark a ZFS snapshot of these ZFS file systems.
12. Use zfs send to export a delta between the previous transported snapshot and the newly marked snapshot.
13. Use rsync to transfer the delta files to the destination Data center
14. Use zfs receive to restore the snapshots.
15. Start the Zones at the destination data center.
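
Steps 11 to 14 map onto a handful of zfs commands. The sketch below only prints the commands as a dry run so they can be reviewed first. The function name plan_zfs_delta and the dataset, snapshot and file names are all placeholders, and the use of receive -F (which rolls the target back to its latest snapshot before applying the delta) is an assumption about how step 9 is handled.

```shell
# Print the commands for an incremental transfer between two snapshots.
# Arguments: <file system> <previous snapshot> <new snapshot> <delta file>
plan_zfs_delta() {
    fs=$1 prev=$2 cur=$3 out=$4
    # On the source: take the new snapshot, then export only the changes
    echo "zfs snapshot ${fs}@${cur}"
    echo "zfs send -i ${fs}@${prev} ${fs}@${cur} > ${out}"
    # On the destination, after the delta file has been copied over
    echo "zfs receive -F ${fs} < ${out}"
}
```

Usage: plan_zfs_delta tank/zones mig1 mig2 /backup/zones.delta, then run the printed commands on the appropriate servers once they look right.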

Add Network Printer

To add a printer (sup12 being the name of the printer and 10.10.1.4 its IP address):

# lpadmin -p sup12 -s 10.10.1.4

To make the printer your default printer:

# lpadmin -d sup12

To view printers:

# lpstat -a

Friday, July 15, 2016

OS shows only 3 GB in 32 bit Linux VM

The OS not showing the full 4 GB of memory is a known issue, specifically on 32-bit RHEL 5.
The server was installed with a non-PAE (Physical Address Extension) kernel, so memory addresses that spill over the 32-bit boundary cannot be used. This is a limitation of the non-PAE kernel: it can only support up to 3 GB of memory.

Nothing can be done in this case, as it is an OS limitation. Kindly relay this to the customer if you think it will be a showstopper. I don't foresee any issue, since the OS is still running with the same 3 GB of RAM.
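
To confirm which case a server is in, the kernel release and the CPU flags can be checked. This is a minimal sketch under two assumptions: that the PAE kernel package appends "PAE" to the release string reported by uname -r, and that the CPU flags are listed in /proc/cpuinfo. Both function names are hypothetical.

```shell
# Return success if a kernel release string names a PAE kernel.
# Assumption: PAE kernels carry a "PAE" suffix in the release,
# e.g. 2.6.18-194.el5PAE (example value, not from the server above).
is_pae_kernel() {
    case "$1" in
        *PAE*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Return success if the flags line of a cpuinfo-style file lists "pae".
# Defaults to /proc/cpuinfo when no file is given.
cpu_has_pae() {
    grep -Eq '^flags[[:space:]]*:.* pae( |$)' "${1:-/proc/cpuinfo}"
}
```

Usage: is_pae_kernel "$(uname -r)" && echo "PAE kernel running"; cpu_has_pae && echo "CPU supports PAE"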

Users are not able to login - Solaris/Linux

1. ssh-keygen -t rsa
Press Enter at each prompt.
2. cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

3. chmod og-wx ~/.ssh/authorized_keys
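
With StrictModes enabled in sshd_config (the OpenSSH default), key logins are also refused when the home directory or ~/.ssh is writable by other users. A minimal sketch that tightens all three follows. The 700 and 600 modes are an assumption that goes a little further than the chmod og-wx in step 3, and fix_ssh_perms is a hypothetical helper name.

```shell
# Tighten permissions on a user's home directory and SSH files.
# Takes the home directory as an argument; defaults to $HOME.
fix_ssh_perms() {
    home=${1:-$HOME}
    # Home must not be writable by group or others.
    chmod go-w "$home" &&
    # Only the owner may enter ~/.ssh or read the key list.
    chmod 700 "$home/.ssh" &&
    chmod 600 "$home/.ssh/authorized_keys"
}
```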

Creating Volumes and qtrees in Netapp

Volume creation :

[root@adm01 ~]# rsh adc1ntap1 vol create test_exp_14sep2012 aggra 20G
Creation of volume 'test_exp_14sep2012' with size 20g on containing aggregate
'aggra' has completed.

qtree creation :

[root@adm01 ~]# rsh adc1ntap1 qtree create /vol/test_exp_14sep2012/test -m 2000m
[root@adm01 ~]#
[root@adm01 ~]# rsh adc1ntap1 vol status | grep -i test_exp_14sep2012
test_exp_14sep2012 online raid_dp, flex

Aggregate full on Netapp Storage

Procedure for handling Volume and Aggregate Full

1. If the aggregate has enough free space (i.e. more than 40G), add space to the volume.

To check free space in aggregate : rsh <filername> df -Ah

2. Check for any offline volumes and destroy them if they have expired.

rsh <filername> vol status | grep -i offline

Check for any special instructions on the offline volumes and, after auditing the SR/RFC# and expiry date, delete the volume.

rsh <filername> vol destroy <offline volume name>

3. Check for any unused volumes (0% usable space)

rsh <filername> df -h | grep -v snap | grep -i GB

Before reducing the space, check if that volume name has any entry in the AFSM exception list.

4. Check for volumes with high snapshot usage, delete the expired snapshots, and assign the reclaimed space to the aggregate.

If the aggregate usage is more than 99%, keep 2 to 3 scheduled snapshots for non-prod instances and delete the rest of the scheduled snapshots.

rsh <filername> df -h | grep -i snap | grep -i GB

rsh <filername> snap reclaimable <volumename> <snapshotname>

- The snap reclaimable command reports how much space would be freed by deleting that particular snapshot.

5. Check for any overallocated code volumes

./overallocate.sh <filername> ( Customized script )

If any code volume occupies more than 300G, raise a "code cleanup SR" with the monitoring team.

6. Check for any .del qtrees under the arch volume. Check for any unused snapmirror snapshots and delete them. Check for any arch qtrees with usage above the threshold and clean them up using the logarch script.

rsh <filername> qtree status <archvolume name>
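
The 99% check in the procedure above can be scripted by filtering the df -A output. This is a sketch that assumes the capacity figure is the last column and is written as a percentage such as 99%; aggr_over is a hypothetical helper name. It reads the df output on stdin, so the filer is queried separately.

```shell
# Read "df -A"-style lines on stdin and print the aggregates whose
# capacity column is at or above the given percentage.
# Snapshot reserve lines (names containing "snap") are skipped.
aggr_over() {
    awk -v limit="$1" '
        $NF ~ /^[0-9]+%$/ && $1 !~ /snap/ {
            pct = $NF
            sub(/%/, "", pct)          # strip the percent sign
            if (pct + 0 >= limit + 0) print $1, $NF
        }'
}
```

Usage: rsh <filername> df -A | aggr_over 99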

Running a tar backup on a media server using obtar (Oracle Secure Backup)

obtar -C -H -f <drive name> < Path > -v

Re-adding a SCSI device on Linux Servers

echo "scsi remove-single-device 7 0 1 3" > /proc/scsi/scsi
echo "scsi add-single-device 7 0 1 3" > /proc/scsi/scsi
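
The four numbers in these commands are the host (adapter), channel, target ID and LUN of the device. A small sketch that builds the string from named arguments makes that order explicit. scsi_cmd is a hypothetical helper name, and it only prints the string.

```shell
# Build the command string understood by /proc/scsi/scsi.
# Argument order: action (add or remove), host, channel, target ID, LUN.
scsi_cmd() {
    action=$1; host=$2; channel=$3; id=$4; lun=$5
    case "$action" in
        add|remove) ;;
        *) echo "usage: scsi_cmd add|remove host channel id lun" >&2
           return 1 ;;
    esac
    echo "scsi $action-single-device $host $channel $id $lun"
}
```

Usage, equivalent to the two commands above: scsi_cmd remove 7 0 1 3 > /proc/scsi/scsi, then scsi_cmd add 7 0 1 3 > /proc/scsi/scsi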

Slots are showing vacant in Oracle Secure Backup Servers

Sol : a) Run fixvol for whole logical library from ACS and run inventory

b) move the db file and then run the inventory (it creates a new db file)

obrobotd hung issue on SL3K Tape libraries

Check the smce log in ACSLS.

obtool dumpdev -s 2013/06/08 drive1 - to see the error clearly (only one job will be showing the error)

# cd /usr/local/oracle/backup/admin/log/device/robot0/
# tail -f obrobotd
2013/08/07.21:33:18 (amh) state.pass = 0
2013/08/07.21:34:15 LMse: dte 11: VAL, lastse 110, oid 0x0 (0), vid "", barcode "RA0256", code 0x0
2013/08/07.21:36:27 (amh) state.pass = 0
2013/08/07.21:44:36 LMse: dte 23: VAL,VAC, lastse 0, oid 0x0 (0), vid "", barcode "", code 0x0
2013/08/07.21:47:38 (amh) state.pass = 1
2013/08/07.21:47:38 (amh) state.last_se_checked = 119
2013/08/07.21:47:38 (amh) state.mediainfo_pass = 1
2013/08/07.21:47:38 (amh) state.mediainfo_loops = 1
2013/08/07.21:47:38 (amh) state.rls_eltype = se
2013/08/07.21:47:38 (amh) state.rls_elnum = 119

[root@ robot0]# date
Wed Aug 7 23:11:24 MDT 2013

Cancel the job that has the error. Check whether the tape is stuck and remove it.

If it cannot be removed, just bring down the drive.

It has been observed that on specific OSB backup servers, obtool commands (lsjob, catxcr, etc.) start hanging. Further analysis showed that the obrobotd process was locked by obndmpd, which caused the entire OSB environment to become unresponsive.

Create InfiniBand Partition for IPMP group on Solaris

# dladm create-part -f -l net7 -P FFFF bondib0_0 (letter ‘l’)

# dladm create-part -f -l net7 -P FFFF bondib0_1

Create interface for each partition

ipadm create-ip bondib0_0

ipadm create-ip bondib0_1

Create IPMP group out of two interfaces

ipadm create-ipmp -i bondib0_0,bondib0_1 bondib0

Create IP, assign network mask, and bring the interface online

ipadm create-addr -T static -a 192.168.10.1/24 bondib0/v4

Installing patch bundle and qlc driver patch on Solaris 10

a) Download the patch bundle, copy it to /var/tmp, give it full permissions, and run it in single-user mode (make sure that the patch is owned by the nobody user)

# cp -rp 10_Recommended_CPU_2014-01 /var/tmp/

# cd /var/tmp

# chmod -R 777 10_Recommended_CPU_2014-01

b) Install the patch bundle

[root@10_Recommended_CPU_2014-01]# ./installpatchset --s10patchset

Setup .

CPU OS Patchset 2014/01 Solaris 10 SPARC (2014.02.17)

The patch set will complete installation in this session. No intermediate
reboots are required.

Application of patches started : 2014.04.18 07:21:37

Applying 120900-04 ( 1 of 364) ... skipped
Applying 150616-01 (360 of 364) ... skipped
Applying 150618-02 (361 of 364) ... skipped
Applying 150756-03 (362 of 364) ... success
Applying 150836-01 (363 of 364) ... success
Applying 150840-01 (364 of 364) ... success

Application of patches finished : 2014.04.18 08:04:51

Installation of patch set complete. PLEASE REBOOT THE SYSTEM.

Install log files written :
/var/sadm/install_data/s10s_rec_patchset_short_2014.04.18_07.21.37.log
/var/sadm/install_data/s10s_rec_patchset_verbose_2014.04.18_07.21.37.log

[root@10_Recommended_CPU_2014-01]# reboot
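
With 364 patches in the bundle, a quick tally of the results is easier to read than the raw output. This sketch counts the "Applying ..." lines by their final word, assuming the console output shown above has been saved to a file or is piped in. summarize_patches is a hypothetical helper name, and only the success and skipped results seen above are counted.

```shell
# Count patch results from installpatchset console output read on stdin.
# Each relevant line looks like: Applying 150756-03 (362 of 364) ... success
summarize_patches() {
    awk '/^Applying / { count[$NF]++ }
         END { printf "success=%d skipped=%d\n", count["success"], count["skipped"] }'
}
```

Usage: summarize_patches < saved_output.txt (the file name is a placeholder)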

c) Installing single qlc patch (SUNWqlc, SUNWqlcu)

#patchadd 149175-04

Validating patches...Loading patches installed on the system...

Done!
Loading patches requested to install.
Done!
Checking patches that you specified for installation.
Done!
Approved patches will be installed in this order:
149175-04
Checking installed patches...
Executing prepatch script...
Installing patch packages...
Patch 149175-04 has been successfully installed.
See /var/sadm/patch/149175-04/log for details
Executing postpatch script...