Friday, July 15, 2016

obrobotd hang issue on SL3K tape libraries

Check the smce log in ACSLS.

obtool dumpdev -s 2013/06/08 drive1 - to see the logged error clearly (only one job will show an error)

# cd /usr/local/oracle/backup/admin/log/device/robot0/
# tail -f obrobotd
2013/08/07.21:33:18 (amh)  state.pass = 0
2013/08/07.21:34:15 LMse: dte 11: VAL, lastse 110, oid 0x0 (0), vid "", barcode "RA0256", code 0x0
2013/08/07.21:36:27 (amh)  state.pass = 0
2013/08/07.21:44:36 LMse: dte 23: VAL,VAC, lastse 0, oid 0x0 (0), vid "", barcode "", code 0x0
2013/08/07.21:47:38 (amh)  state.pass = 1
2013/08/07.21:47:38 (amh)  state.last_se_checked = 119
2013/08/07.21:47:38 (amh)  state.mediainfo_pass = 1
2013/08/07.21:47:38 (amh)  state.mediainfo_loops = 1
2013/08/07.21:47:38 (amh)  state.rls_eltype = se
2013/08/07.21:47:38 (amh)  state.rls_elnum = 119
[root@ robot0]# date
Wed Aug  7 23:11:24 MDT 2013
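
In the excerpt above, the last obrobotd entry is stamped 21:47:38 while date reports 23:11:24, so the log has been silent for well over an hour. A minimal sketch of how that gap could be measured follows. It assumes the log stamps are in the same local time zone as the date output; the function names last_log_time and minutes_silent are hypothetical helpers, not part of OSB.

```python
import re
from datetime import datetime

# obrobotd log lines begin with a stamp such as 2013/08/07.21:47:38
STAMP = re.compile(r"^(\d{4}/\d{2}/\d{2}\.\d{2}:\d{2}:\d{2}) ")


def last_log_time(lines):
    """Return the timestamp of the last stamped line, or None if there is none."""
    last = None
    for line in lines:
        match = STAMP.match(line)
        if match:
            last = datetime.strptime(match.group(1), "%Y/%m/%d.%H:%M:%S")
    return last


def minutes_silent(lines, now):
    """Whole minutes between the last stamped log line and 'now'."""
    last = last_log_time(lines)
    if last is None:
        return None
    return int((now - last).total_seconds() // 60)


# Last lines of the excerpt, plus the shell prompt line, which is ignored.
sample = [
    "2013/08/07.21:47:38 (amh)  state.rls_eltype = se",
    "2013/08/07.21:47:38 (amh)  state.rls_elnum = 119",
    "[root@ robot0]# date",
]
print(minutes_silent(sample, datetime(2013, 8, 7, 23, 11, 24)))  # prints 83
```

How long a silence counts as a hang is a site-specific judgment and is deliberately left out of the sketch.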

Cancel the job that has the error. Check whether the tape is stuck and, if so, remove it.
If it cannot be removed, just bring the drive down.

It has been observed on specific OSB backup servers that obtool commands (lsjob, catxcr, etc.) start hanging. Further analysis showed that the obrobotd process had been locked by obndmpd, which caused the entire OSB environment to become unresponsive.
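
Because the symptom is an obtool command that never returns, one way to probe for it without tying up a terminal is to run the command under a timeout. The sketch below is only an illustration: command_hangs is a hypothetical helper, and the timeout value is an assumption to be tuned per site.

```python
import subprocess
import sys


def command_hangs(cmd, timeout_s):
    """Return True if cmd has not finished within timeout_s seconds."""
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return True
    return False


# Self-check with a stand-in command that sleeps longer than the timeout.
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
print(command_hangs(slow, 1))  # prints True
```

On a backup server the call would look like command_hangs(["obtool", "lsjob"], 30). A True result only says the command did not return in time; it does not by itself confirm that obrobotd is locked.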
