Sunday, July 31, 2016

Removing LDoms on Solaris 10 SPARC Machines

1) Removing the configuration

# ldm ls-config

# ldm rm-config <config name>

2) Disable the LDom services

# svcadm disable ldmd

# svcadm disable vntsd

3) Removing the packages

# pkgrm SUNWldm

# pkgrm SUNWjass

4) From the console prompt, reset the configuration to factory default and power off

sc> bootmode config="factory-default"

sc> poweroff -y

5) Power the system back on

sc> poweron -c

To erase the data on a tape


mt -f /dev/rmt/1m erase

Thursday, July 28, 2016

Giving Crontab permission for Oracle User in HP-Ux



1) Checking cron jobs for oracle user.

#crontab -l oracle
crontab: can't open your crontab file.

2) Add the oracle user to the /var/adm/cron/cron.allow file to give the oracle user crontab permission.

#vi /var/adm/cron/cron.allow
oracle

3) The oracle user will now be able to create cron jobs.

#crontab -e 
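
The cron.allow edit can also be scripted so that it is safe to run more than once. This is only a sketch: the function name allow_cron_user is made up for illustration, and the path of the cron.allow file is passed in as an argument rather than assumed.

```shell
# Append a user to a cron.allow file only if the name is not already listed.
# Usage: allow_cron_user <path to cron.allow> <username>
# (allow_cron_user is a hypothetical helper, not a system command)
allow_cron_user() {
    allow_file=$1
    user=$2
    # -x matches the whole line, -F treats the name as a fixed string
    if [ -f "$allow_file" ] && grep -xF "$user" "$allow_file" > /dev/null; then
        echo "$user already listed in $allow_file"
    else
        echo "$user" >> "$allow_file"
        echo "added $user to $allow_file"
    fi
}

# Example (run as root, with CRON_ALLOW set to your cron.allow path):
# allow_cron_user "$CRON_ALLOW" oracle
```

Running it twice leaves a single oracle line in the file, which avoids duplicate entries when the step is repeated.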

Tuesday, July 26, 2016

Adding swap device from a logical volume in HP-Ux


1) Use  swapinfo command to view the current status of swap devices.

#swapinfo

2) Create a new logical volume called lv-swap, in the vg01 volume group, to be used as a secondary swap device.

#lvcreate -L 1000 -C y -n lv-swap vg01

3) Edit the /etc/fstab file to add the swap device.

#vi /etc/fstab
Add the line: /dev/vg01/lv-swap /swap swap 0 0

4) Activate the new swap device.

#swapon -a

5) Finally, check the swapinfo for the newly added swap device.

#swapinfo


Monday, July 25, 2016

Mirror root-disk Replacement online in HP-Ux


Caution: Before starting, make sure that the remaining disk is really bootable. Use the vxvmboot and lifls commands to verify this. Also make sure that a recent Ignite-UX recovery image has been taken and that you have a valid backup of your data.

1) Without removing mirror

# vxdisk -o alldgs list | grep root

# setboot
Primary bootpath : 2/0/1/0/0.1.0
Alternate bootpath : 2/0/1/0/0.0.0

==> We would like to replace c0t1d0


2) Remove the disk from kernel

# vxdg -k -g rootdg rmdisk rootdisk01
# vxdisk -o alldgs list | grep root

3. Pull out the removed disk and put in the replacement disk.

4. Run vxdisksetup on recently inserted disk :

# /etc/vx/bin/vxdisksetup -iB c0t1d0
# vxdisk -o alldgs list | grep root

Even if the new disk doesn't show up in the list, you can go ahead.

5. Add replaced disk back in rootdg

# vxdg -k -g rootdg adddisk rootdisk01=c0t1d0
# vxdisk -o alldgs list | grep rootdg

6. Check if the mirrordisk plexes are still in status DISABLED RECOVER

# vxprint -thg rootdg

7. Recover mirror

# vxrecover -b -g rootdg rootdisk01

Check with vxtask list whether the job has finished.

8. Use vxprint to check if all the plexes are ENABLED ACTIVE

# vxprint -thg rootdg


9. Only when step seven is finished, make the replacement mirrordisk bootable

# /usr/lib/vxvm/bin/vxbootsetup rootdisk01

# vxvmboot -v /dev/rdsk/c0t1d0 

# lifls -l /dev/rdsk/c0t1d0

Corruption of sar data files

The most likely cause of this error message is that two sa1 (sar) processes are running at the same time.

With two sa1 processes writing to the same file, one overwrites data written by the other. This corrupts the sar data file and confuses sar when the file is read. Check that there are no duplicate entries in cron where sa1 and sa2 run at the same time.

Also check /var/adm/messages for warning messages from the disk at the time cron runs the accounting/sar jobs. Such messages could indicate that a disk is going bad.

[ID 107833 kern.warning]
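
A quick way to spot identical sa1 entries is to filter a crontab listing. The sketch below is illustrative only: the function name duplicate_sa1_entries is made up, and it detects exact duplicate lines, not different schedules that happen to overlap.

```shell
# Read a crontab listing on stdin and print every active sa1 line
# that appears more than once (commented lines are ignored).
# (duplicate_sa1_entries is a hypothetical helper name)
duplicate_sa1_entries() {
    grep -v '^ *#' | grep '/sa1' | sort | uniq -d
}

# Example, assuming the sar jobs live in the sys crontab:
# crontab -l sys | duplicate_sa1_entries
```

No output means no line is repeated verbatim; entries with different but overlapping schedules still need to be compared by eye.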

Fan Sensor Problem on Solaris Server


The server has an issue with the fan sensor at FANBD0/FM0/F0: its status is Unknown in the prtdiag output. We raised a case with Sun and received the update below:

"This is a known issue:

 If you would update your kernel patch to 137137-09 (or latest) as well as update your firmware with patch 136932, this issue should disappear. "

The patches were applied and the issue was fixed.

Sunday, July 24, 2016

Interface Bonding in Redhat Linux



Step 1: Creating Bonding Channel
# cd /etc/modprobe.d/
# vi bonding.conf
alias bond0 bonding


Step 2: Creating Channel Bonding Interface
# cd /etc/sysconfig/network-scripts/
# touch ifcfg-bond0
# vi ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.1.8
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
USERCTL=no

Step 3: Configuring Channel Bonding Interface

After the channel bonding interface is created, the network interfaces to be bound together must be configured by adding the MASTER and SLAVE directives to their configuration files. The configuration files for the channel-bonded interfaces can be nearly identical. For example, if two Ethernet interfaces, eth0 and eth1, are being bonded, their files may look like the following. Edit the details of each physical interface as shown below.

For eth0

# vi /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none

For eth1

# vi /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
TYPE=Ethernet

DEVICE: The name of the device.
USERCTL: Whether non-root users can control this device (here it is no).
ONBOOT: Whether the device should be brought up at boot time.
MASTER: The bonding interface this device is bound to (here it is bond0).
SLAVE: Whether this device acts as a slave of the bond.
BOOTPROTO: The protocol used to obtain an IP address at boot. It is set to none, so DHCP is not used.
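
Because the slave files differ only in the device name, they can be generated from one template. The following is a minimal sketch that only prints the file contents; the function name make_slave_cfg is hypothetical, and the output should be reviewed before it is written to any ifcfg file.

```shell
# Print the body of an ifcfg file for one slave of a bond.
# Usage: make_slave_cfg <slave device> <bond device>
# (make_slave_cfg is a hypothetical helper name)
make_slave_cfg() {
    cat <<EOF
DEVICE=$1
USERCTL=no
ONBOOT=yes
MASTER=$2
SLAVE=yes
BOOTPROTO=none
EOF
}

# Preview the files for eth0 and eth1 without writing anything
for dev in eth0 eth1; do
    echo "# ifcfg-$dev"
    make_slave_cfg "$dev" bond0
done
```

Printing instead of writing keeps the sketch harmless; redirect the output to the real file only once the preview looks right.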


 Step 4: Restarting Network Service

 # service network restart



Changing the hostname in Redhat 6


1) hostname (shows the current hostname)
2) hostname newname (sets the new hostname on Red Hat 6; this change is temporary)
3) hostname (to check the change)
4) Change the hostname in /etc/hosts if it is not in DNS
5) Change the hostname in /etc/sysconfig/network
6) reboot


Preventing /etc/resolv.conf from changing automatically (entries change after reboot often) - Linux


1) cat /etc/resolv.conf
service network restart
cat /etc/resolv.conf  (the entries will have changed)
2) service NetworkManager status  (this service should be stopped)
NetworkManager reads the configuration from the files under /etc/sysconfig/network-scripts and rebuilds the resolv.conf file.
service NetworkManager stop
3) chkconfig --list NetworkManager
chkconfig NetworkManager off
4) cd /etc/sysconfig/network-scripts
vi ifcfg-eth0  (configuration file for the DHCP interface)

PEERDNS=no  (this makes sure that the resolv.conf entries don't change)
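
The PEERDNS change can be applied with a small helper instead of editing by hand. This is a sketch under assumptions: the function name set_peerdns_no is made up, and the interface file is passed as an argument so nothing is hard-coded.

```shell
# Set PEERDNS=no in an ifcfg file: replace an existing PEERDNS line,
# or append one if the file has none.
# Usage: set_peerdns_no <ifcfg file>
# (set_peerdns_no is a hypothetical helper name)
set_peerdns_no() {
    cfg=$1
    if grep '^PEERDNS=' "$cfg" > /dev/null; then
        # Rewrite through a temp file, then copy back so the original
        # file keeps its ownership and permissions
        sed 's/^PEERDNS=.*/PEERDNS=no/' "$cfg" > "$cfg.tmp" &&
            cat "$cfg.tmp" > "$cfg" &&
            rm -f "$cfg.tmp"
    else
        echo 'PEERDNS=no' >> "$cfg"
    fi
}

# Example (as root):
# set_peerdns_no /etc/sysconfig/network-scripts/ifcfg-eth0
```

The helper is idempotent, so running it again does not add a second PEERDNS line.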

Adding Windows route 


Syntax - route add -p <Client host backend network> mask <netmask> <gateway>

Ex. - route add -p 10.214.6.0 mask 255.255.255.0 10.214.6.0

Decommissioning the Backup Server

1) Before the servers are decommissioned, disable their backups from the Console server.
2) Log in to the Mgmt server and open the console. Click on Manage (Windows Mgmt Server).
3) Search for the host that needs to be decommissioned. If there is more than one, select all at once.
4) Right-click and choose Unassign computers -> Disable the licenses and the backup jobs.
5) Right-click and choose Unmanage computers, and check that the servers have been removed.
6) Click on My Computer -> C: -> winbkp -> bkpconfig -> Remove the servers from the corresponding file.
Before that, confirm whether and from where the servers are backed up.
7) Open the bkpconfig.txt file and remove all entries that belong to the servers being decommissioned.
8) Go to the scheduled job, remove the schedule, and mark it as a free schedule.
9) Fire any tape backups that have not been sent yet.

Wednesday, July 20, 2016

Route adding in Solaris 11 


route -p add net <network> -netmask <netmask ip> <gateway> -ifp <interface>

route  -p add net 10.170.28.0 -netmask 255.255.255.0 10.170.8.1 -ifp tst

Tuesday, July 19, 2016

Increasing File System in VXVM

1) Verify recent flasharchive on the system.
2) Verify a recent backup of the file systems to be modified.  
3) Notify the server's CMDB management group and Operations that the change is starting.

4) Use 'df -k' on the partitions to be resized, and store the total size of the file systems.
# df -k /app

Initialize the disk
# /etc/vx/bin/vxdisksetup -i c5t3d9 format=cdsdisk

5) Add the disk into the required diskgroup "datadg"
# vxdg -g datadg adddisk data_emcd34=c5t3d9

Verify that there is sufficient space to complete the request:
# vxassist -g datadg maxsize

6) Increase the size of the /app filesystem by 25GB.
# vxresize -g datadg app +25g

Use 'df -k' to verify increased space when compared to the size before the resize.
# df -k /app


Memory Utilization is High on Solaris Server


Please find the memory utilization status for server radha34.

radha34% prstat -t
NPROC USERNAME  SIZE   RSS MEMORY      TIME  CPU
  446 tcadmin    10G 5241M    88%  84:10.28 9.6%
   64 root      518M  249M   4.1% 145:25.19 4.8%
    3 balep      13M 8624K   0.1%   0:00.00 0.2%
    1 nobody   3224K 2080K   0.0%   0:00.00 0.0%
    4 topazmon   16M 7712K   0.1%   0:00.00 0.0%
    3 rchr08     12M 6432K   0.1%   0:00.01 0.0%


radha34% swap -s
total: 2492656k bytes allocated + 2925688k reserved = 5418344k used, 8517704k available
radha34%


radha34%  prstat | grep -v tcadmin
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
   681 root     5240K 3552K sleep    0    0   8:37.14 0.9% automountd/5
  1106 root     7632K 6176K sleep   44    0  26:28.37 0.3% sysedge.sol28-s/1
 21308 balep    4624K 4360K cpu1    38    0   0:00.00 0.2% prstat/1
Total: 524 processes, 3040 lwps, load averages: 0.18, 0.32, 0.41


The tcadmin user is using 88% of memory, and we couldn't find any other process consuming significant memory on this server.

Kill the process if you get clearance from the respective team member. Otherwise, escalate the issue.


Monday, July 18, 2016

Removing files older than 30 days in Unix


 find . -size +10000 -exec ls -lh {} \;   -- Lists files larger than 10000 blocks of 512 bytes (about 5 MB)

 find /var/spool/clientmqueue -mtime +30 -exec rm {} \;   -- Removes the files older than 30 days
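
Before deleting, it helps to list what the same criteria would match. The sketch below wraps the find test in a function; the name list_old_files is made up for illustration, and -type f is added on the assumption that only regular files should be removed.

```shell
# List regular files under a directory that were last modified more
# than N days ago, without deleting anything.
# Usage: list_old_files <directory> <days>
# (list_old_files is a hypothetical helper name)
list_old_files() {
    find "$1" -type f -mtime +"$2" -print
}

# Review the list first, then delete with the same criteria:
# list_old_files /var/spool/clientmqueue 30
# find /var/spool/clientmqueue -type f -mtime +30 -exec rm {} \;
```

Listing first guards against a mistyped directory or day count, since rm run from find does not ask for confirmation.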

Replacing failed root disk in VXVM

First, we must remove the disk for replacement:

# vxdg -g rootdg -k rmdisk rootdisk

This will save all associated objects in the diskgroup and place the disk into a "removed:was" state.

At this point you will want to do the physical replacement of the disk.
1.  Swap out old drive for new drive
2.  Get the new drive visible to format and label

NOTE:  Your replacement drive must be of equal size or larger than the one you are replacing.

Once labeled ensure VxVM can see the new disk:
# vxdisk scandisks

The drive should appear as:  "online invalid"

Once in this state, setup the disk for VxVM usage:
# /etc/vx/bin/vxdisksetup -i Disk_0 format=sliced   (whatever the disk name is in disk list)

Once set up and seen in the disk list as "online"

We can put the new disk back into the configuration:
# vxdg -g rootdg -k adddisk rootdisk=Disk_0

Then recover all the objects:  (force a sync)
# vxrecover -g rootdg rootdisk&

You can monitor the above sync with:
# vxtask -l list


At this stage, it is best to reboot the server; after that we will carry out adding the rootdisk to the rootdg.

Server panic reboot - Created a case to Sun Support


# ls -ltr |grep pab028
-rw-r--r--   1 root     apac     29587258 Aug 23 21:29 explorer.849d6570.pab028-2008.08.24.02.22.tar.gz

# cp explorer.849d6570.pab028-2008.08.24.02.22.tar.gz /tmp/suncase66149117.fnac.explorer.849d6570.pab028-2008.08.24.02.22.tar.gz

# cd /tmp
# ls -ltr |grep suncase66149117
-rw-r--r--   1 root     apac     29587258 Aug 23 21:44 suncase66149117.fnac.explorer.849d6570.pab028-2008.08.24.02.22.tar.gz




# cd /var/crash
# ls -l
total 4
drwx------   2 root     root         512 Nov 23 18:48 pab028
drwx------   2 root     root         512 Aug  1  2007 moplgtotest
# cd pab028
# ls -l
total 12393154
-rw-r--r--   1 root     root           2 Nov 23 18:48 bounds
-rw-r--r--   1 root     root     2481536 Nov 23 18:42 unix.0
-rw-r--r--   1 root     root     6339698688 Nov 23 18:48 vmcore.0

# gzip unix.0
# gzip vmcore.0

# mv unix.0.gz 66166352.unix.0.gz
# mv vmcore.0.gz 66166352.vmcore.0.gz


We uploaded the explorer files to the cores directory on supportfiles.sun.com.

Vxresize fails with error message "Subdisk data_emcd1-02 would overlap subdisk data_emcd1-01"


bash-2.05# /etc/vx/bin/vxresize -g datadg health +49g
VxVM vxassist ERROR V-5-1-10127 creating subdisk data_emcd1-02:
        Subdisk data_emcd1-02 would overlap subdisk data_emcd1-01
VxVM vxresize ERROR V-5-1-4703 Problem running vxassist command for volume health, in diskgroup datadg
bash-2.05#

"vxprint -thrg datadg" would not show the new disk "c2t2d1" that was added to that disk group.

# vxprint -thrg datadg | grep health-01
sd data_emcd1-01 health-01  data_emcd1 0      209704704 0        c2t0d0   ENA


Solution:

Cleared the issue with the vxconfigd daemon by running "vxconfigd -k -x cleartempdir", then extended the volume.

This command removes and recreates the vxconfigd temporary directory online, without affecting normal operation of the server.
# vxconfigd -k -x cleartempdir
# vxprint -thrg datadg | grep health-01
sd data_emcd1-01 health-01  data_emcd1 0      209704704 0        c2t0d0   ENA
sd data_emcd2-02 health-01  data_emcd2 0      102760448 209704704 c2t2d1  ENA

# /etc/vx/bin/vxresize -g datadg health +49g
# df -h /opt/health
Filesystem             size   used  avail capacity  Mounted on
/dev/vx/dsk/datadg/health
                       149G    62G    86G    42%    /opt/health

Users are not able to login after migration 


If you are not sure before the migration which authentication method the users use, check it with the tool below. It worked for me.

#authconfig-tui

We enabled local authorization, which fixed the issue.



VNC service went to maintenance mode on Solaris 10

# svcs -l svc:/application/x11/xvnc-inetd:default
fmri         svc:/application/x11/xvnc-inetd:default
name         X server that displays to VNC viewers
enabled      true
state        maintenance
next_state   none
state_time   Thu Jun 30 16:15:33 2016
restarter    svc:/network/inetd:default

Error in /var/svc/log :

 Executing start method ("/lib/svc/method/fs-local") ]
cannot mount 'rpool/export' on '/export': directory is not empty
WARNING: /usr/sbin/zfs mount -a failed: one or more file systems failed to mount
[ Jun 17 00:52:19 Method "start" exited with status 0 ]

Error in /var/adm/messages :

 inetd[15492]: [ID 702911 daemon.error] Property 'name' of instance svc:/application/x11/xvnc-inetd:default is missing, inconsistent or invalid
Jun 30 16:15:27 muvmzn015 inetd[15492]: [ID 702911 daemon.error] Property 'proto' of instance svc:/application/x11/xvnc-inetd:default is missing, inconsistent or invalid
Jun 30 16:15:33 muvmzn015 inetd[15492]: [ID 702911 daemon.error] Property 'name' of instance svc:/application/x11/xvnc-inetd:default is missing, inconsistent or invalid
Jun 30 16:15:33 muvmzn015 inetd[15492]: [ID 702911 daemon.error] Property 'proto' of instance svc:/application/x11/xvnc-inetd:default is missing, inconsistent or invalid
Jun 30 16:15:33 muvmzn015 inetd[15492]: [ID 702911 daemon.error] Invalid configuration for instance svc:/application/x11/xvnc-inetd:default, placing in maintenance

Issue :

home$ cat /etc/services | grep -i vnc
#vnc-server      5900/tcp                        # Xvnc

Fixed after uncommenting the entry, as below

home$ cat /etc/services | grep -i vnc
vnc-server      5900/tcp                        # Xvnc

Restart the inetd service (sometimes the system needs a reboot).
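
The commented-out entry can be detected with a small check before restarting anything. This is a sketch only: the function name service_entry_active is made up, and the services file is passed as an argument.

```shell
# Succeed if the services file has an active (uncommented) entry for
# the given service name, fail otherwise.
# Usage: service_entry_active <services file> <service name>
# (service_entry_active is a hypothetical helper name)
service_entry_active() {
    tab=$(printf '\t')
    # An active entry starts in column one with the service name,
    # followed by a space or a tab; commented lines start with '#'.
    grep "^$2[ $tab]" "$1" > /dev/null
}

# Example:
# service_entry_active /etc/services vnc-server || echo "vnc-server entry missing or commented out"
```

A plain grep -i vnc matches the commented line too, which is why the check anchors the name at the start of the line.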








Host lost its virtual interface to DOM on Oracle Linux 
[root@host1 ~]# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
140.85.50.16    0.0.0.0         255.255.255.240 U         0 0          0 eth0
144.20.63.32    10.225.160.1    255.255.255.224 UG        0 0          0 eth1
140.85.21.0     10.225.160.1    255.255.255.128 UG        0 0          0 eth1
144.20.110.128  10.225.160.1    255.255.255.128 UG        0 0          0 eth1
144.20.116.128  10.225.160.1    255.255.255.128 UG        0 0          0 eth1
10.224.124.0    10.225.160.1    255.255.255.0   UG        0 0          0 eth1
144.20.118.0    10.225.160.1    255.255.255.0   UG        0 0          0 eth1
144.20.54.0     10.225.160.1    255.255.255.0   UG        0 0          0 eth1
140.85.2.0      10.225.160.1    255.255.254.0   UG        0 0          0 eth1
140.85.12.0     10.225.160.1    255.255.252.0   UG        0 0          0 eth1
10.225.160.0    0.0.0.0         255.255.248.0   U         0 0          0 eth1
10.224.96.0     10.225.160.1    255.255.248.0   UG        0 0          0 eth1
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth1
0.0.0.0         140.85.50.17    0.0.0.0         UG        0 0          0 eth0

 [root@host1 ~]# ping 10.225.160.1
PING 10.225.160.1 (10.225.160.1) 56(84) bytes of data.
From 10.225.161.133 icmp_seq=1 Destination Host Unreachable
From 10.225.161.133 icmp_seq=2 Destination Host Unreachable
From 10.225.161.133 icmp_seq=3 Destination Host Unreachable


[root@host1 ~]# ping abc12stor12-nas
PING abc12stor12-nas.us.oracle.com (10.225.163.240) 56(84) bytes of data.

--- abc12stor12-nas.us.oracle.com ping statistics ---
0 packets transmitted, 0 received

[root@host1 ~]# 

[root@host1 ~]# ping abc12osb12-nfs
PING abc12osb12-nfs.us.oracle.com (10.225.164.135) 56(84) bytes of data.
From host1-nfs.us.oracle.com (10.225.161.133) icmp_seq=2 Destination Host Unreachable
From host1-nfs.us.oracle.com (10.225.161.133) icmp_seq=3 Destination Host Unreachable
From host1-nfs.us.oracle.com (10.225.161.133) icmp_seq=4 Destination Host Unreachable

--- abc12osb12-nfs.us.oracle.com ping statistics ---
5 packets transmitted, 0 received, +3 e


On DOM --- 

[root@audom11 ~]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
45476_host1                           5  8192     2     r----- 1881798.9
45481_server1                           6 16384     4     -b---- 2652174.1
45486_server2                           7 16384     4     -b---- 1111889.1
45491_server3                          12 32768     8     -b---- 1285069.8
45496_server4                           14  8192     2     -b----  15698.7
Domain-0                                     0  2048    24     r----- 1462970.9
[root@audom11 ~]

It's possible the VM has "lost" the link to the dom0.
First get the dom0 details for the host.
I'm checking all the bridges and bonds (bond1 is usually the NFS interface).
It's definitely a dom0 problem, because that dom0 runs 4 other VMs, all with the same issue.

[root@audom11 ~]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
45476_host1                           5  8192     2     -b---- 1880611.3
45481_server1                           6 16384     4     -b---- 2652141.5
45486_server2                           7 16384     4     -b---- 1111874.7
45491_server3                          12 32768     8     -b---- 1285042.4
45496_server4                           14  8192     2     -b----  15683.1
Domain-0                                     0  2048    24     r----- 1462740.8

[root@audom11 ~]# xm network-list 45476_host1
Idx BE     MAC Addr.     handle state evt-ch tx-/rx-ring-ref BE-path
0   0  00:16:3E:14:0B:02    0     4      13    1280 /1281    /local/domain/0/backend/vif/5/0  
1   0  00:16:3E:38:24:93    1     4      14    1282 /1283    /local/domain/0/backend/vif/5/1  

[root@audom11 ~]# xm network-detach 45476_host1 1

[root@audom11 ~]# xm network-list 45476_host1
Idx BE     MAC Addr.     handle state evt-ch tx-/rx-ring-ref BE-path
0   0  00:16:3E:14:0B:02    0     4      13    1280 /1281    /local/domain/0/backend/vif/5/0  

[root@audom11 ~]# xm network-attach 45476_host1 bridge=br93 mac=00:16:3E:38:24:93

[root@audom11 ~]# xm network-list 45476_host1
Idx BE     MAC Addr.     handle state evt-ch tx-/rx-ring-ref BE-path
0   0  00:16:3E:14:0B:02    0     4      13    1280 /1281    /local/domain/0/backend/vif/5/0  
2   0  00:16:3E:38:24:93    2     4      14    1282 /1365    /local/domain/0/backend/vif/5/2  

Fixed after doing above.

Sunday, July 17, 2016

Restoring spfile to pfile from RMAN prompt


RMAN> restore spfile to pfile '/home/oracle/init24.ora' from '/testbackup/rmanfull20140821/JBL_T24_DB_CTL_c-1296243675-20140821-01';


RMAN> set DBID 1296243675

executing command: SET DBID

RMAN> startup force nomount

Oracle instance started

Total System Global Area 20911292416 bytes

To check available tape backups from RMAN


instance user $ rman target /

RMAN> list backup;   - lists the RMAN backups
RMAN> list backup summary;
RMAN> list backup by file;


RMAN> list backup completed before 'sysdate' device type disk;   - lists the backups available on disk
RMAN> show all;   - shows the configuration, including the retention policy
RMAN> list backup completed before 'sysdate' device type SBT_TAPE;   - lists the tape backups


Command that lists the backups that can be used for a restore:

RMAN> list backup summary tag INCR_BACKUPSET_0 completed after '04-NOV-2012' device type disk;


RMAN> backup database plus archivelog;   (spfile and control file are backed up; the time it takes depends on the size)


sg_map got stuck on Linux media server

[root@adc12osbmed01 by-id]# sg_map -i -x
Strange, could not find device /dev/nst0 mapped to sg device??
Strange, could not find device /dev/nst2 mapped to sg device??
device /dev/nst5 failed on scsi ioctl(idlun), skip: Input/output error
Strange, could not find device /dev/nst7 mapped to sg device??
Strange, could not find device /dev/nst8 mapped to sg device??
Strange, could not find device /dev/nst11 mapped to sg device??
Strange, could not find device /dev/nst12 mapped to sg device??
Strange, could not find device /dev/nst13 mapped to sg device??
Strange, could not find device /dev/nst14 mapped to sg device??
Strange, could not find device /dev/nst15 mapped to sg device??
Strange, could not find device /dev/nst16 mapped to sg device??
/dev/sg0  0 0 0 0  0  /dev/sda  HITACHI   H103014SCSUN146G  A2A8
/dev/sg1  0 0 1 0  0  /dev/sdb  HITACHI   H103014SCSUN146G  A2A8
/dev/sg2  0 0 2 0  0  /dev/sdc  ATA       SEAGATE ST95000N  SF03
/dev/sg3  0 0 3 0  0  /dev/sdd  ATA       SEAGATE ST95000N  SF03
/dev/sg4  0 0 4 0  0  /dev/sde  ATA       SEAGATE ST95000N  SF03
/dev/sg5  0 0 5 0  0  /dev/sdf  ATA       SEAGATE ST95000N  SF03
/dev/sg6  0 0 6 0  0  /dev/sdg  ATA       SEAGATE ST95000N  SF03
/dev/sg7  0 0 7 0  0  /dev/sdh  ATA       SEAGATE ST95000N  SF03
/dev/sg8  -2 -2 -2 -2  -2
/dev/sg9  7 0 1 0  1  /dev/nst1  HP        Ultrium 5-SCSI    I59S
/dev/sg10  -2 -2 -2 -2  -2
/dev/sg11  7 0 3 0  1  /dev/nst3  HP        Ultrium 5-SCSI    I3CS
/dev/sg12  7 0 4 0  1  /dev/nst4  HP        Ultrium 5-SCSI    I3CS
/dev/sg13  -2 -2 -2 -2  -2
/dev/sg14  8 0 1 0  1  /dev/nst6  HP        Ultrium 5-SCSI    I3CS
/dev/sg15  -2 -2 -2 -2  -2
/dev/sg16  -2 -2 -2 -2  -2
/dev/sg17  8 0 4 0  1  /dev/nst9  HP        Ultrium 5-SCSI    I3CS
/dev/sg18  8 0 5 0  1  /dev/nst10  HP        Ultrium 5-SCSI    I3CS
/dev/sg19  -2 -2 -2 -2  -2
/dev/sg20  -2 -2 -2 -2  -2
/dev/sg21  -2 -2 -2 -2  -2
/dev/sg22  -2 -2 -2 -2  -2
/dev/sg23  -2 -2 -2 -2  -2
/dev/sg24  -2 -2 -2 -2  -2
/dev/sg25  -2 -2 -2 -2  -2
/dev/sg26  9 0 0 0  5  /dev/scd0  TEAC      DV-W28SS-R        1.0C

sg_map is fine now. Dismounted one bad media, cancelled one job and refired it, and re-added the SCSI devices for the errored drives. Ran the inventory. All drives except one seem to be okay.

sg_scan would also do.

[root@adc12osbmed01 ~]# sg_map -i -x
/dev/sg0  0 0 0 0  0  /dev/sda  HITACHI   H103014SCSUN146G  A2A8
/dev/sg1  0 0 1 0  0  /dev/sdb  HITACHI   H103014SCSUN146G  A2A8
/dev/sg2  0 0 2 0  0  /dev/sdc  ATA       SEAGATE ST95000N  SF03
/dev/sg3  0 0 3 0  0  /dev/sdd  ATA       SEAGATE ST95000N  SF03
/dev/sg4  0 0 4 0  0  /dev/sde  ATA       SEAGATE ST95000N  SF03
/dev/sg5  0 0 5 0  0  /dev/sdf  ATA       SEAGATE ST95000N  SF03
/dev/sg6  0 0 6 0  0  /dev/sdg  ATA       SEAGATE ST95000N  SF03
/dev/sg7  0 0 7 0  0  /dev/sdh  ATA       SEAGATE ST95000N  SF03
/dev/sg8  7 0 0 0  1  /dev/nst0  HP        Ultrium 5-SCSI    I3CS
/dev/sg9  7 0 1 0  1  /dev/nst1  HP        Ultrium 5-SCSI    I59S
/dev/sg10  7 0 2 0  1  /dev/nst2  HP        Ultrium 5-SCSI    I3CS
/dev/sg11  7 0 3 0  1  /dev/nst3  HP        Ultrium 5-SCSI    I3CS
/dev/sg12  7 0 4 0  1  /dev/nst4  HP        Ultrium 5-SCSI    I3CS
/dev/sg13  -2 -2 -2 -2  -2
/dev/sg14  8 0 0 0  1  /dev/nst5  HP        Ultrium 5-SCSI    I3CS
/dev/sg15  8 0 1 0  1  /dev/nst6  HP        Ultrium 5-SCSI    I3CS
/dev/sg16  8 0 2 0  1  /dev/nst7  HP        Ultrium 5-SCSI    I3CS
/dev/sg17  8 0 3 0  1  /dev/nst8  HP        Ultrium 5-SCSI    I3CS
/dev/sg18  8 0 4 0  1  /dev/nst9  HP        Ultrium 5-SCSI    I3CS
/dev/sg19  8 0 5 0  1  /dev/nst10  HP        Ultrium 5-SCSI    I3CS
/dev/sg20  8 0 6 0  1  /dev/nst11  HP        Ultrium 5-SCSI    I59S
/dev/sg21  8 0 7 0  1  /dev/nst12  HP        Ultrium 5-SCSI    I59S
/dev/sg22  8 0 8 0  1  /dev/nst13  HP        Ultrium 5-SCSI    I59S
/dev/sg23  8 0 9 0  1  /dev/nst14  HP        Ultrium 5-SCSI    I59S
/dev/sg24  8 0 10 0  1  /dev/nst15  HP        Ultrium 5-SCSI    I59S
/dev/sg25  8 0 11 0  1  /dev/nst16  HP        Ultrium 5-SCSI    I59S
/dev/sg26  9 0 0 0  5  /dev/scd0  TEAC      DV-W28SS-R        1.0C

To Stop & Start Database services


# To stop database services, use the following:

# srvctl stop listener -l <Instance>                                    
# srvctl stop database -d <Instance>  

# To start database services, use the following:          

# srvctl start listener -l <Instance>                                  
# srvctl start database -d <Instance>    

# To verify database services,use the following:        

# srvctl status listener -l <Instance>                          
# srvctl status database -d <Instance>

Saturday, July 16, 2016

Assigning and Unassigning volumes from ACSLS Web GUI



Unassigning volumes:

Configuration and Administration -> Logical Library Configuration -> Unassign Volumes -> Select the server from which you want to unassign -> Select volumes -> Tick the media and click on Continue -> Click on Unassign

Run an inventory on the OSB server and check for the media.


Select volumes -> If the volumes are not displayed, then they have not been unassigned. Alternatively, click on Custom filter under the Filter drop-down -> Select the volume id -> Is -> Medias -> Ok


Assigning volumes :

Select logical library - zone -> Select volumes -> Tick the media and click on Continue -> Assign -> Run an inventory and check the media on the zone -> It will be labeled as Barcode


Battery down on ACSLS Server


1. Stop OSB services (make sure all drives are unloaded)
2. acsss disable
3. db_export.sh -f /export/backup/pre_outage
   option: 8
4. scp the file pre_outage to adc-acsls
5. acsss shutdown
6. acsss status
7. Bring down the server
8. Correct the date (change the battery)
9. Verify the date and time on the OS before bringing up services
10. Bring up services

After replacing the drive, it's not able to mount the drive on ACSLS


ACSSA> dismount 000000 1,0,10,1 force
Dismount: Dismount failed, Process failure.


Now able to mount the tape into the drive.

ACSSA> q drive 1,0,10,1
2012-08-31 04:40:17               Drive Status
 Identifier   State           Status      Volume               Type
   1, 0,10, 1 online          in use      SB0079               HP-LTO5

CAP shows offline in ACSLS (Automated Cartridge System Library Software)


1.  Login as Service on the SL console.

2.  Select Tools > Diagnostics

3.  Expand CAP folder and select desired CAP

4.  CAP should show as "Locked"

5.  Change right hand drop box to "False"

6.  Click Apply

7.  Unlock the CAP on the SLC Console

8.  Push flashing CAP button

9.  Open CAP and remove carts

10.  Push CAP button on CAP again to close the CAP door.

11.  CAP goes into an unreserved state not assigned to a partition.

12.  From the ACSLS cmd_proc,  issue "vary CAP x,x,x online"

RDP to Windows machine from Linux Box


1) rpm -qa | grep -i rdesktop
2) yum search rdesktop
3) yum install rdesktop.x86_64
4) yum info rdesktop

On the Windows machine: My Computer - Properties - Remote - allow remote logins

5) rdesktop windows-v8-1-test

If it gives errors, note that version 6 has a bug.

6) Get the version 7 package and update it
rpm -Uvh rdesktop

7) rpm -qa | grep -i rdesktop
8) rdesktop hostname(windows) -f   (to get the full screen)

RAID Implementation on Linux Servers


# mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/hda6 /dev/hda7          (RAID 0)
continue creating array - y
# mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/hda8 /dev/hda9          (RAID 1)
continue creating array - y

# RAID1 conf printout:

# mdadm --create /dev/md5 --level=5 --raid-devices=3 /dev/hda10 /dev/hda11 /dev/hda12          (RAID 5)
continue creating array - y

# mkfs.ext3 /dev/md0 - to create a file system
# mkfs.ext3 /dev/md1
# mkfs.ext3 /dev/md5

# mkdir raid0
# mkdir raid1
# mkdir raid5

#mount /dev/md0 raid0
#mount /dev/md1 raid1
#mount /dev/md5 raid5

# df -hP

# cat /proc/mdstat - to check
# mdadm --detail /dev/md0
# mdadm --detail /dev/md1
# mdadm --detail /dev/md5
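
When checking /proc/mdstat, a degraded array shows an underscore in its status map, for example [U_] instead of [UU]. The sketch below picks those arrays out of mdstat-style text; the function name degraded_arrays is made up for illustration.

```shell
# Read /proc/mdstat-style text on stdin and print the names of arrays
# whose status map (for example [U_]) shows a missing member.
# (degraded_arrays is a hypothetical helper name)
degraded_arrays() {
    awk '
        # Remember the array named on the most recent "mdN :" line
        /^md[0-9]+ :/ { name = $1 }
        # A status map containing an underscore marks a missing device
        /\[[U_]*_[U_]*\]/ { if (name != "") print name }
    '
}

# Example:
# degraded_arrays < /proc/mdstat
```

No output means every listed array has all of its members; mdadm --detail on a reported array shows which device is missing.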

# vi /etc/fstab

/dev/md0 /root/raid0 ext3 defaults 0 0
/dev/md1 /root/raid1 ext3 defaults 0 0
/dev/md5 /root/raid5 ext3 defaults 0 0
:wq

# mount -a

Adding route in Oracle Linux Server 

route add -net 10.217.208.0 netmask 255.255.248.0 gw 10.217.48.1


Assigning IP address in Linux Server

# vi /etc/sysconfig/network-scripts/ifcfg-eth0  (assign/change the ip here)
# service network restart

Changing hostname

vi /etc/sysconfig/network
HOSTNAME=XXXXXX







SSH is not working on Solaris/Linux Hosts 


>@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
>@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
>Someone could be eavesdropping on you right now (man-in-the-middle attack)!
>It is also possible that the RSA host key has just been changed.
>The fingerprint for the RSA key sent by the remote host is
>86:0a:09:0d:b6:b9:7e:67:53:e3:f1:df:50:cd:06:e7.
>Please contact your system administrator.
>Add correct host key in /home/cgi/.ssh/known_hosts to get rid of this message.
>Offending key in /home/cgi/.ssh/known_hosts:23
>RSA host key for 10.193.208.170 has changed and you have requested strict checking.
>Host key verification failed.
>lost connection

Remove the old entry by executing the command below:

ssh-keygen -R <ip>

Now connect using ssh again. It generates a new entry, and you can connect to the server without issue.

Linux OS Backup restore using Dump and Restore

1 Insert the OS media and reboot.
2 Go to Linux rescue mode. Don't mount your root directory in /mnt/sysimage; skip that step.
3 Make interface up for eth0 or eth1.
   Give your IP and subnet mask
4 If you are not able to ping the other server, check the routing table
   #netstat -rn
   If no gateway is listed, add your default gateway.
   #route add default gw <gateway IP>
   Ping your NFS server.
5 Mount the nfs file to local server.
6 Create an example directory like /example-mount_point.
   mount server:/nfs path /example-mount_point.
7 Delete the existing filesystem using fdisk command
   # fdisk /dev/lmo/c0d0
8 Now create the File systems as per the Prod server and label it accordingly.
   # fdisk /dev/lmo/c0d0
   Eg: mkfs.ext3  /dev/lmo/c0d0p1
9 Then create directories as following:
   mkdir /newroot
   mkdir /newvar
   mkdir /newopt
   mkdir /newboot
10 Identify and label your root, boot, var,opt FS
11 Now mount your FSs on the newly created mount points
     Ex: mount /dev/mapper/VolGroup00-lvol00 /newboot
12 Now cd to /newboot and give the following command
   #restore -r -f  /restore/servername/boot-backup
13 Now wait until the full backup has been restored
14 Follow the same steps from 6-9 for the rest of the FSs.
15 Now rename (label) all the mount points as per the DC standard. If it is not a fresh server, there is no need to rename the mount points, because the updated details are taken from your fstab file.
   Eg: #e2label  /dev/lmo/c0d0p1  /
   e2label  /dev/lmo/c0d0p2  /boot
   e2label  /dev/lmo/c0d0p3  /var
   Make changes in /etc/fstab file
16 Do the necessary changes in grub.conf file according to your boot partition
     Eg: root (hd0,0) for  /dev/lmo/c0d0p1 Partition 1st
17 Now reboot the server.
18 Now create the swap partition if it does not exist. Use the fdisk utility to create it.
    #mkswap /dev/lmo/c0d0p5
    #swapon /dev/lmo/c0d0p5
    #swapon -s
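The mount and restore steps repeat for every filesystem, so it can help to print the full command list before typing it in rescue mode. A minimal dry-run sketch: plan_restore is a hypothetical helper, the device names are placeholders, and the dump files are assumed to be named <name>-backup, following the boot-backup example above.

```shell
# Dry run only: print the mount and restore commands for each filesystem.
# Reads "name device" pairs on stdin; dump files are assumed to be named
# <name>-backup under the backup directory (a hypothetical layout).
plan_restore() {
    backup_dir="$1"
    while read -r name device; do
        [ -n "$name" ] || continue
        echo "mount $device /new$name"
        echo "cd /new$name && restore -r -f $backup_dir/$name-backup"
    done
}

# Device names here are placeholders; substitute your own layout.
plan_restore /restore/servername <<'EOF'
boot /dev/mapper/VolGroup00-lvol00
var /dev/mapper/VolGroup00-lvol01
EOF
```

Nothing is executed; the output is only a checklist to review against the Prod server layout.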
 
If you face any grub-related problem during booting:
Insert the OS media and reboot.
Go to Linux rescue mode and mount your root directory on /mnt/sysimage.
     #chroot /mnt/sysimage
     #/sbin/grub-install /dev/lmo/c0d0

Deleting multiple files in a directory

find . -name 'jive*'  | xargs rm -v > test.txt
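The pipeline above splits file names on whitespace, so names containing spaces are not removed correctly. A sketch of a variant that lets find call rm directly; delete_matching is a hypothetical helper name, and rm -v is assumed to be available (GNU or BSD rm):

```shell
# Delete regular files matching a name pattern under a directory and log
# what was removed. delete_matching is a hypothetical helper name.
delete_matching() {
    dir="$1"
    pattern="$2"
    log="$3"
    # -exec ... {} + hands file names straight to rm, so spaces and
    # quotes in names are handled; -type f skips directories.
    find "$dir" -type f -name "$pattern" -exec rm -v {} + > "$log"
}

# Example: delete_matching . 'jive*' test.txt
```

Running find with -print in place of the -exec clause first shows which files would be deleted.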

Stopping & Starting Linux Cluster

Halting package

clusvcadm -d "package name"

service rgmanager stop

service cman stop

service ccsd stop



Starting package

service ccsd start

service cman start

service rgmanager start

clusvcadm -e "package name"
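The start sequence is the stop sequence reversed. A small dry-run sketch that prints the commands in the right order for either action; cluster_plan is a hypothetical helper and nothing is executed:

```shell
# Print the cluster commands in the order used above. Dry run only:
# nothing is executed. cluster_plan is a hypothetical helper name.
cluster_plan() {
    action="$1"
    package="$2"
    case "$action" in
        stop)
            echo "clusvcadm -d \"$package\""
            for svc in rgmanager cman ccsd; do
                echo "service $svc stop"
            done
            ;;
        start)
            for svc in ccsd cman rgmanager; do
                echo "service $svc start"
            done
            echo "clusvcadm -e \"$package\""
            ;;
        *)
            echo "usage: cluster_plan stop|start <package>" >&2
            return 1
            ;;
    esac
}

# Example: cluster_plan stop "package name"
```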

Migrating Solaris Servers from One Data Center to Another



1. Power down all the zones and apps that are writing to the specific ZFS file systems.
2. Take a ZFS snapshot of these ZFS file systems.
3. Boot the zones again so that services can be restored.
4. Copy the snapshots onto external hard drives.
5. Ship the hard drives to the new Data Center.
6. Restore the snapshots into the destination servers.
7. Use rsync to transfer the zone configurations from the original server to the destination server.
8. Start the Zones and test the services at the destination server - this is for testing purposes.
9. Power down the zones and revert the ZFS file systems back to their snapshots (since the Zone based apps may have written to the ZFS file systems).
10. At the source data center, power down all the zones and apps that are writing to the specific ZFS file systems.
11. Take a second ZFS snapshot of these ZFS file systems.
12. Use zfs send to export a delta between the previously transported snapshot and the newly taken snapshot.
13. Use rsync to transfer the delta files to the destination data center.
14. Use zfs receive to restore the snapshots.
15. Start the Zones at the destination data center.
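
The ZFS commands behind steps 2, 4, 6, 9 and 11-14 can be sketched as a dry run. This is an illustration under assumptions: zfs_migration_plan is a hypothetical helper, and the dataset, snapshot and directory names are placeholders. The commands are only printed, never executed.

```shell
# Dry run: print the ZFS commands behind steps 2, 4, 6, 9 and 11-14.
# Dataset, snapshot and directory names are hypothetical placeholders.
zfs_migration_plan() {
    fs="$1"        # dataset, e.g. tank/zones
    first="$2"     # name of the first snapshot
    second="$3"    # name of the second snapshot
    dump_dir="$4"  # directory holding the stream files

    echo "zfs snapshot $fs@$first"
    echo "zfs send $fs@$first > $dump_dir/$first.zfs"
    echo "zfs receive $fs < $dump_dir/$first.zfs"
    echo "zfs rollback $fs@$first"
    echo "zfs snapshot $fs@$second"
    echo "zfs send -i $fs@$first $fs@$second > $dump_dir/delta.zfs"
    echo "zfs receive $fs < $dump_dir/delta.zfs"
}

# Example: zfs_migration_plan tank/zones pre-move final /mnt/usb
```

The rollback in step 9 matters because an incremental stream can only be received onto a file system that still matches the first snapshot.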

Add Network Printer


To add a printer (sup12 is the name of the printer and 10.10.1.4 is its IP address):

# lpadmin -p sup12 -s 10.10.1.4

To make the printer your default printer:

# lpadmin -d sup12

To view printers:

# lpstat -a

Friday, July 15, 2016

OS shows only 3 GB in 32-bit Linux VM


The OS not showing the full 4 GB of memory is a known issue, specifically on 32-bit RHEL 5.
The server was installed with a non-PAE (Physical Address Extension) kernel. Without PAE, physical addresses are limited to 32 bits, and part of that 4 GB address range is reserved for devices, so the RAM that would sit in that range cannot be used. This is a limitation of the non-PAE kernel: only about 3 GB of memory can be supported.

Nothing can be done in this case because it is an OS limitation. Kindly relay this message to the customer if you think it will be a show stopper. I do not foresee any issue, since the OS is still running with the same 3 GB of RAM.
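
To confirm which kernel flavour a server is running, the release string can be checked. A minimal sketch, assuming that PAE kernels carry "PAE" in the output of uname -r (as the RHEL 5 kernel-PAE package does); is_pae_kernel is a hypothetical helper name:

```shell
# Return success if a kernel release string looks like a PAE build.
# Assumption: PAE kernels carry "PAE" in the release string, as the
# RHEL 5 kernel-PAE package does (for example 2.6.18-348.el5PAE).
is_pae_kernel() {
    case "$1" in
        *PAE*) return 0 ;;
        *)     return 1 ;;
    esac
}

if is_pae_kernel "$(uname -r)"; then
    echo "PAE kernel running"
else
    echo "non-PAE kernel running"
fi
```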

Users are not able to log in - Solaris/Linux


1. ssh-keygen -t rsa
Press Enter at each prompt to accept the defaults.
2. cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

3. chmod og-wx ~/.ssh/authorized_keys
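
Step 2 appends the key again every time it is run, and sshd commonly rejects keys when ~/.ssh or authorized_keys is writable by other users. A sketch that adds the key only once and tightens the permissions; install_pubkey is a hypothetical helper name, and the paths are passed in as arguments so it can be tried on a scratch directory first:

```shell
# Append a public key to authorized_keys only if it is not already
# there, then tighten permissions. install_pubkey is a hypothetical
# helper name.
install_pubkey() {
    pubkey_file="$1"
    ssh_dir="$2"
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    # -x whole line, -F fixed string, -f read the pattern from the key file.
    if ! grep -qxF -f "$pubkey_file" "$ssh_dir/authorized_keys"; then
        cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}

# Example: install_pubkey ~/.ssh/id_rsa.pub ~/.ssh
```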

Creating Volumes and qtrees in Netapp


Volume creation :

[root@adm01 ~]# rsh adc1ntap1 vol create test_exp_14sep2012 aggra 20G
Creation of volume 'test_exp_14sep2012' with size 20g on containing aggregate
'aggra' has completed.


qtree creation :


[root@adm01 ~]# rsh adc1ntap1 qtree create /vol/test_exp_14sep2012/test -m 2000m
[root@adm01 ~]#
[root@adm01 ~]# rsh adc1ntap1 vol status | grep -i test_exp_14sep2012
test_exp_14sep2012 online          raid_dp, flex

Aggregate full on Netapp Storage 


Procedure for handling Volume and Aggregate Full

1. If the aggregate has enough free space (i.e. more than 40G), add space to the volume.

To check free space in the aggregate: rsh <filername> df -Ah

2. Check for any offline volumes and destroy them if they have expired.

rsh <filername> vol status | grep -i offline

Check for any special instructions on the offline volumes and, after auditing the SR/RFC# and expiry date, delete the volume.

rsh <filername> vol destroy <offline volume name>

3. Check for any unused volumes (0% used space)

rsh <filername> df -h | grep -v snap | grep -i GB

Before reducing the space, check whether the volume has an entry in the AFSM exception list.

4. Check for volumes with high snapshot usage, delete the expired snapshots and return the reclaimed space to the aggregate.

If the aggregate usage is more than 99%, keep 2 to 3 scheduled snapshots for non-prod instances and delete the rest of the scheduled snapshots.

rsh <filername> df -h | grep -i snap | grep -i GB

rsh <filername> snap reclaimable <volumename> <snapshotname>

- The snap reclaimable command reports how much space would be freed by deleting that particular snapshot.

5. Check for any overallocated code volumes

./overallocate.sh <filername>   ( Customized script )

If any code volume occupies more than 300G, raise a "code cleanup SR" with the monitoring team.

6. Check for any .del qtrees under the arch volume. Check for any unused snapmirror snapshots and delete them. Check for any arch qtree usage above the threshold and work on cleanup using the logarch script.

rsh <filername> qtree status <archvolume name>
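
The first check in this procedure (how full is each aggregate) can be filtered with awk. This is only a sketch: aggr_over_threshold is a hypothetical helper, and the column layout it expects (Aggregate, total, used, avail, capacity) is an assumption about the df -A output, so verify it against your filer before relying on it.

```shell
# List aggregates at or above a usage threshold from "df -A" style output
# read on stdin. Assumed column layout (not taken from the text above):
#   Aggregate  total  used  avail  capacity
aggr_over_threshold() {
    threshold="$1"
    awk -v t="$threshold" '
        NR == 1 { next }               # skip the header line
        $1 ~ /\.snapshot$/ { next }    # skip snapshot reserve rows
        {
            pct = $5
            sub(/%$/, "", pct)
            if (pct + 0 >= t + 0) print $1, pct "%"
        }'
}

# Example: rsh <filername> df -Ah | aggr_over_threshold 90
```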


Running a tar backup on the media server using obtar (Oracle Secure Backup)

obtar -C -H -f <drive name>  < Path > -v

Re-adding a SCSI device on Linux Servers

echo "scsi remove-single-device 7 0 1 3" > /proc/scsi/scsi
echo "scsi add-single-device 7 0 1 3" > /proc/scsi/scsi
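
The four numbers are the host, channel, id and lun of the device. A sketch that wraps the two writes in one function; scsi_readd is a hypothetical helper name, and the optional fifth argument exists only so the function can be tried against a scratch file instead of /proc/scsi/scsi:

```shell
# Remove and re-add one SCSI device through the legacy /proc interface.
# Arguments: host channel id lun, plus an optional target file that
# defaults to /proc/scsi/scsi (the override exists only for dry runs).
scsi_readd() {
    host="$1"; channel="$2"; id="$3"; lun="$4"
    target="${5:-/proc/scsi/scsi}"
    echo "scsi remove-single-device $host $channel $id $lun" > "$target" || return 1
    echo "scsi add-single-device $host $channel $id $lun" > "$target"
}

# Example (as root): scsi_readd 7 0 1 3
```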

Slots are showing vacant in Oracle Secure Backup Servers

Solution:
a) Run fixvol for the whole logical library from ACS and run an inventory.
b) Move the db file and then run the inventory (it creates a new db file).

obrobotd hung issue on SL3K Tape libraries

Check the smce log in ACSLS.

obtool dumpdev -s 2013/06/08 drive1 - to see the error clearly (only one job will show the error)

# cd /usr/local/oracle/backup/admin/log/device/robot0/
# tail -f obrobotd
2013/08/07.21:33:18 (amh)  state.pass = 0
2013/08/07.21:34:15 LMse: dte 11: VAL, lastse 110, oid 0x0 (0), vid "", barcode "RA0256", code 0x0
2013/08/07.21:36:27 (amh)  state.pass = 0
2013/08/07.21:44:36 LMse: dte 23: VAL,VAC, lastse 0, oid 0x0 (0), vid "", barcode "", code 0x0
2013/08/07.21:47:38 (amh)  state.pass = 1
2013/08/07.21:47:38 (amh)  state.last_se_checked = 119
2013/08/07.21:47:38 (amh)  state.mediainfo_pass = 1
2013/08/07.21:47:38 (amh)  state.mediainfo_loops = 1
2013/08/07.21:47:38 (amh)  state.rls_eltype = se
2013/08/07.21:47:38 (amh)  state.rls_elnum = 119
[root@ robot0]# date
Wed Aug  7 23:11:24 MDT 2013

Cancel the job that has the error. Check whether the tape is stuck and remove it.
If it cannot be removed, just bring down the drive.

It has been observed on specific OSB backup servers that obtool commands (lsjob, catxcr etc.) start hanging. On further analysis, it was found that the obrobotd process was locked by obndmpd, which causes the entire OSB environment to become unresponsive.
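
In the log excerpt above, the last entry is stamped 21:47:38 while date reports 23:11:24, a gap that may indicate obrobotd has stopped making progress. A sketch for pulling out the last timestamp so it can be compared with the current time; last_log_stamp is a hypothetical helper name:

```shell
# Print the timestamp of the last line of an obrobotd-style log as
# "YYYY-MM-DD HH:MM:SS". last_log_stamp is a hypothetical helper name.
last_log_stamp() {
    tail -n 1 "$1" |
        sed -n 's|^\([0-9]\{4\}\)/\([0-9]\{2\}\)/\([0-9]\{2\}\)\.\([0-9:]\{8\}\).*|\1-\2-\3 \4|p'
}

# Example:
# last_log_stamp /usr/local/oracle/backup/admin/log/device/robot0/obrobotd
# date '+%Y-%m-%d %H:%M:%S'
```

Nothing is printed when the last line does not start with a timestamp in that format.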

Create InfiniBand Partition for IPMP group on Solaris


# dladm create-part -f -l net7 -P FFFF bondib0_0    (the -l option is the letter 'l')
# dladm create-part -f -l net7 -P FFFF bondib0_1

Create interface for each partition
ipadm create-ip bondib0_0
ipadm create-ip bondib0_1

Create IPMP group out of two interfaces
ipadm create-ipmp -i bondib0_0,bondib0_1 bondib0

Create the IP address, assign the netmask, and bring the interface online
ipadm create-addr -T static -a 192.168.10.1/24 bondib0/v4

Installing patch bundle and qlc driver patch on Solaris 10


a) Download the patch bundle, copy it to /var/tmp, give it full permissions and run it in single-user mode (make sure the patch is owned by the nobody user)

# cp -rp 10_Recommended_CPU_2014-01 /var/tmp/
# cd /var/tmp
# chmod -R 777 10_Recommended_CPU_2014-01
 
b) Install the patch bundle

[root@10_Recommended_CPU_2014-01]# ./installpatchset --s10patchset

Setup .

CPU OS Patchset 2014/01 Solaris 10 SPARC (2014.02.17)
The patch set will complete installation in this session. No intermediate
reboots are required.
Application of patches started : 2014.04.18 07:21:37
Applying 120900-04 (  1 of 364) ... skipped
Applying 150616-01 (360 of 364) ... skipped
Applying 150618-02 (361 of 364) ... skipped
Applying 150756-03 (362 of 364) ... success
Applying 150836-01 (363 of 364) ... success
Applying 150840-01 (364 of 364) ... success
Application of patches finished : 2014.04.18 08:04:51
Installation of patch set complete. PLEASE REBOOT THE SYSTEM.
Install log files written :
  /var/sadm/install_data/s10s_rec_patchset_short_2014.04.18_07.21.37.log
  /var/sadm/install_data/s10s_rec_patchset_verbose_2014.04.18_07.21.37.log
[root@10_Recommended_CPU_2014-01]# reboot
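
With several hundred patches in a bundle, a quick tally of the "Applying" lines is easier to read than the full output. A minimal sketch, assuming the console output was saved to a file; patch_summary is a hypothetical helper name, and it counts only the two result words shown above:

```shell
# Count results in installpatchset console output read on stdin.
# Only the "success" and "skipped" results shown above are tallied.
patch_summary() {
    awk '
        /^Applying / { count[$NF]++ }
        END {
            printf "success=%d skipped=%d\n", count["success"], count["skipped"]
        }'
}

# Example: patch_summary < /var/tmp/patchset-console.out
```

The file name in the example is a placeholder for wherever the console output was captured.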


c) Installing single qlc patch (SUNWqlc, SUNWqlcu)

#patchadd 149175-04
Validating patches...Loading patches installed on the system...
Done!
Loading patches requested to install.
Done!
Checking patches that you specified for installation.
Done!
Approved patches will be installed in this order:
149175-04
Checking installed patches...
Executing prepatch script...
Installing patch packages...
Patch 149175-04 has been successfully installed.
See /var/sadm/patch/149175-04/log for details
Executing postpatch script...