Hello Friends,

I have a small challenge: I need to modify a shell script designed to monitor an Oracle database. It should notify the DBA when the listener or the database is down, and also when there is any error in the alert log.

Note: The sendmail feature should not keep sending mail continuously while the database is down. Instead, it should check whether the instance is down, fire one mail, and then fire another mail every five minutes for as long as the outage lasts.

What I currently have keeps firing emails for as long as the listener or database is down, so for any database downtime I get over 30000 mails.

Kindly help review the code. I have included the shell script I am currently using as an attachment.

Regards,
Whales.

Attachments
#!/bin/ksh
#############################################################################
# checkalert.sh
#
# Script to see if Oracle and the Listener are up.  It also checks the
# Oracle alert log for errors.  Once the script is submitted, it will
# continue running.
# Modified by :Wale Odusanya to suit xxxxx. requirement (11/02/2010)
# 
# Author    : Mike Selvaggio (based on Biju Thomas script from www.dba-oracle.com)
#             Orsel Consulting Inc.
# Created   : 07/23/2002 
# Modified  : 11/02/2010
#############################################################################
#
# Set variables
#
#############################################################################
werr=0
wdate=`date '+%m%d'`
ENV=/oraprod1/test/ora10g/admin/TEST1_sun1/bdump
LOG=/oraprod1/test/ora10g/admin/TEST1_sun1/bdump/log
distlist="whalesola@yahoo.com"
ORACLE_BASE=/oraprod1/test/ora10g
wenvfile=$LOG/calertenv.log
#############################################################################

#############################################################################
#export EDITOR=vi
#export ORACLE_BASE=/oraprod1/test/ora10g
#export ORACLE_HOME=/oraprod1/test/ora10g
#export CONTEXT_NAME=TEST1_sun1
#export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/usr/dt/lib:/usr/openwin/lib:$ORACLE_HOME/ctx/lib
#export TNS_ADMIN=$ORACLE_HOME/network/admin/$CONTEXT_NAME
#export NLS_LANG=american
#export NLS_DATE_FORMAT='Mon DD YYYY HH24:MI:SS'
#export ORATAB=/etc/oratab
#export PATH=$PATH:$ORACLE_HOME:$ORACLE_HOME/bin:/usr/ccs/bin:/bin:/usr/bin:/usr/sbin:/sbin:/usr/openwin/bin:/opt/bin:.
#export DBALIST="whalesola@yahoo.com"
#############################################################################

#############################################################################
#
# Evaluate hour and run script every  two minutes
#
#############################################################################
datestring=$(eval 'date +%H')
daystring=$(eval 'date +%a')
while [ 0 ]
do
mytest=0 
while [ "$datestring" -lt  23 ]
do 
   	  mytest=1 
          ORACLE_SID=TEST1
   	  export ORACLE_SID
          ORACLE_HOME='/oraprod1/test/ora10g'
          export ORACLE_HOME
          wlogfile=$LOG/calertlog.$ORACLE_SID
          werrfile=$LOG/calerterr.$ORACLE_SID
	    #############################################################################
	    #
	    # Initialize message files
	    #
	    #############################################################################
 	    echo "**********************************************************************" >  $wlogfile
          echo "**********************************************************************" >  $werrfile
          echo "**********************************************************************" >  $wenvfile
	      #############################################################################
            #
            # Verify if required Oracle Home and SID are Set
            #
            #############################################################################
            if test `env | grep ORACLE_SID= | wc -l` -ne 1
             then
             werr=1
             echo "Environment Variable ORACLE_SID Not Available \n" >> $wenvfile
             echo "**********************************************************************" >>  $wenvfile
            fi
            if test `env | grep ORACLE_HOME= | wc -l` -ne 1
             then
             werr=1
             echo "Environment Variable ORACLE_HOME Not Available \n" >> $wenvfile
             echo "**********************************************************************" >>  $wenvfile
            fi 
          #############################################################################
	    #
	    # Check if listener and database are up
 	    #
          #############################################################################
  	    $ORACLE_HOME/bin/lsnrctl status > /dev/null
  	    STATUS=$?
  	    if test $STATUS -eq 0
  	      then
             echo "Listener for $ORACLE_SID is up and running" >> $wlogfile
            else
             echo "Listener for $ORACLE_SID is down" >> $werrfile
             echo "**********************************************************************" >>  $werrfile
SUBJ="Listener Monitor (sun1)"
TO=whalesola@yahoo.com
 (
cat << !
TO : ${TO}
Subject : ${SUBJ}
!
echo "Listener for $ORACLE_SID is down") |/usr/sbin/sendmail -F "Monitor" ${TO} ${CC}
#          echo "Listener for $ORACLE_SID is down"|/usr/sbin/sendmail -v  ${distlist}
            
            fi
          #
          ps -ef | grep pmon | grep ${ORACLE_SID} > /dev/null
          STATUS=$?
          if test $STATUS -eq 0
            then
             echo "Oracle up and running" >> $wlogfile
            else
             echo "Oracle Not Available $ORACLE_SID" >> $werrfile
             echo "**********************************************************************" >>  $werrfile
              
SUBJ="Instance Monitor (sun2)"
TO=whalesola@yahoo.com
 (
cat << !
TO : ${TO}
Subject : ${SUBJ}
!
 echo "Oracle Not Available $ORACLE_SID" ) |/usr/sbin/sendmail -F "AlertMonitor" ${TO} ${CC}
#            echo "Oracle Not Available $ORACLE_SID"|/usr/sbin/sendmail -v  ${distlist}
          fi
          #############################################################################  
          #
          # Check the alert log file for any errors since the last run
          #
          #############################################################################
          walertfile=/oraprod1/test/ora10g/admin/TEST1_sun1/bdump/alert_${ORACLE_SID}
          if test -f ${walertfile}.last
            then
             echo "Compare file exists" > /dev/null 
            else
             touch ${walertfile}.last
            fi
         if test -f ${walertfile}.log
           then
             if test `diff ${walertfile}.last ${walertfile}.log |grep -v "WARNING"  | grep -v "ORA-00480" |grep -v "ORA-3217" | grep "ORA-" | wc -l` -ne 0
               then
                 echo "Following errors written to the Alert log file. Please verify" >> $werrfile
                 diff ${walertfile}.last ${walertfile}.log | grep "ORA-" |grep -v "WARNING" |grep -v "ORA-00480" |grep -v "ORA-3217" >> $werrfile
                 echo "**********************************************************************" >>  $werrfile
               else
                 echo "No Errors in the Alert log file" >> $wlogfile
             fi
             cp ${walertfile}.log ${walertfile}.last
             #############################################################################
             #
             # Check if errors encountered, if yes send mail 
             # 
             #############################################################################
             if test `cat $werrfile | wc -l` -ne 1
               then
                echo "**********************************************************************" >>  $werrfile
                echo "Date        : "`date '+%m/%d/%y %X %A '` >> $werrfile
                echo "Database    : "$ORACLE_SID >> $werrfile
                echo "Server      : "`uname -n` >> $werrfile
                echo "**********************************************************************" >>  $werrfile
SUBJ="Alert Log (s0888001)"
TO=whalesola@yahoo.com
 (
cat << !
TO : ${TO}
Subject : ${SUBJ}
!
cat ${werrfile} ) |/usr/sbin/sendmail -F "Monitor" ${TO} ${CC}  
#cat ${werrfile} |/usr/sbin/sendmail -F whalesola@yahoo.com whalesola@yahoo.com  Subject: "Test"

 #cat ${werrfile} | mailx -s "Errors found in routine alert file checkup - $ORACLE_SID" ${distlist} 
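                continue   # NOTE: jumps back to the loop test, so the "sleep 120" below is skipped
               else
                echo "Successful completion of routine checkup" >> $wlogfile
                echo "No Errors / Alerts Encountered" >> $wlogfile
                echo "**********************************************************************" >>  $wlogfile
                continue   # NOTE: same here; while the alert log exists, the loop never reaches the sleep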
             fi
         fi   
   #   fi
#   esac

sleep  120
#################################
# Set current hour for date check
#################################
datestring=$(eval 'date +%H')
done
if [ "$mytest" -eq 1 ] ; then
#################################
# Remove Compare file 
#################################
#echo "Removing Compare File ${walertfile}.last"
rm ${walertfile}.last
cp ${walertfile}.log /oraprod1/test/ora10g/admin/TEST1_sun1/bdump/alertback/alert_${ORACLE_SID}_${wdate}.log 
find /oraprod1/test/ora10g/admin/TEST1_sun1/bdump/alertback/ -type f -mtime +3 -exec rm -r {} \;
fi ;
done
# End of Script
Last Post by thekashyap

Well, you're not waiting five minutes between checks, so naturally you get that many mails. Just sleep for 300 seconds between checks once you detect that the DB is down. While the DB is up, you can sleep for a shorter time if you want more frequent checks.
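
A rough sketch of that first idea, in the same shell style as the posted script. The names `UP_INTERVAL`, `DOWN_INTERVAL` and `next_sleep` are made up for illustration and are not in the original script; the 120-second value is simply the script's existing `sleep 120`. For this to help, the `sleep` has to sit where every pass of the loop actually reaches it, which is not the case when a `continue` runs before it.

```shell
#!/bin/sh
# Hypothetical helper: choose the pause before the next check.
UP_INTERVAL=120     # seconds between checks while everything is up
DOWN_INTERVAL=300   # seconds between checks (and so between mails) while down

# next_sleep DOWN_FLAG
# Prints DOWN_INTERVAL when DOWN_FLAG is non-zero, UP_INTERVAL otherwise.
next_sleep() {
    if [ "$1" -ne 0 ]; then
        echo "$DOWN_INTERVAL"
    else
        echo "$UP_INTERVAL"
    fi
}

# Usage at the very end of the monitoring loop, assuming the checks set
# down=1 when the listener or instance test failed and down=0 otherwise:
#   sleep "$(next_sleep "$down")"
```

With this shape, one mail goes out per pass while the database is down, and passes are five minutes apart.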
The other way, of course, is to run the check at a constant frequency but call sendmail only if more than 300 seconds have elapsed since the last mail was sent. For how to find the time difference, check this thread.
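
The second approach could look roughly like the sketch below. Everything named here (`MAIL_INTERVAL`, `STATE_FILE`, `should_send`, `record_sent`, `clear_sent`) is hypothetical and not part of the posted script. It also assumes the caller can get the current time in epoch seconds, for example with `date +%s`, which not every older Unix `date` supports.

```shell
#!/bin/sh
# Hypothetical throttle: check as often as you like, mail at most once
# per MAIL_INTERVAL seconds while the outage lasts.
MAIL_INTERVAL=300
STATE_FILE=${STATE_FILE:-/tmp/calert_lastmail.$$}   # epoch time of last mail

# should_send NOW
# Succeeds if no mail has been recorded yet, or if at least
# MAIL_INTERVAL seconds have passed since the recorded one.
should_send() {
    now=$1
    [ -s "$STATE_FILE" ] || return 0
    last=$(cat "$STATE_FILE")
    [ $((now - last)) -ge "$MAIL_INTERVAL" ]
}

# record_sent NOW: remember when the last mail went out.
record_sent() {
    echo "$1" > "$STATE_FILE"
}

# clear_sent: forget the last mail, so the next outage mails immediately.
clear_sent() {
    rm -f "$STATE_FILE"
}

# Usage inside the loop, where "down" was set by the listener/pmon checks:
#   now=$(date +%s)            # assumption: this date supports %s
#   if [ "$down" -ne 0 ]; then
#       if should_send "$now"; then
#           ... pipe the message to sendmail as before ...
#           record_sent "$now"
#       fi
#   else
#       clear_sent
#   fi
```

Keeping the timestamp in a file rather than a shell variable means the throttle survives a restart of the script, provided `STATE_FILE` is set to a fixed path instead of the `$$`-based default used here.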
