As a software engineer, the first thing your job demands is automation. And who does not love having automation in place? You have been doing a routine job regularly and doing it well, and people tend to leave a routine job to whoever ran it in the first place and takes pains to keep it running without any impact or adverse effect.
When things settle down and you enter the comfort zone, you notice that most others don't want to get involved in the application you have been handling so perfectly. But now you need a break, so you share the set of commands for the repetitive job.
And then you come back to find things badly messed up. You may have given the right instructions, but maybe you missed a step in the sequence, or the temporary file you created in the /tmp directory, where you kept the application lock files, got deleted. The system does not regenerate that file automatically, so the job failed when the file was missing. The person who was handling it for you did not bother to check the script or the log to see what actually went wrong; they just opened a bug saying it failed and left it for you to investigate while you were on your break.
If you put your set of commands in a file and keep running it every day to make sure it can handle the routine job on its own, you are on the first step of automation.
Here comes crontab, all yours: it lets you set the day, hour and minute you decide on, or rather the frequency at which you want the job to run. Voila! You are on top, ready to set it right.
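As a rough sketch of what such a schedule looks like: a crontab line has five time fields (minute, hour, day of month, month, day of week) followed by the command. The helper name `cron_line` and the path `daily_job.sh` are made up for illustration; only the five-field layout comes from cron itself.

```shell
#!/bin/bash
# cron_line is a hypothetical helper: it only prints a crontab line
# built from the five time fields and the command to run.
cron_line() {
    local minute="$1" hour="$2" dom="$3" month="$4" dow="$5"
    shift 5
    printf '%s %s %s %s %s %s\n' "$minute" "$hour" "$dom" "$month" "$dow" "$*"
}

# Every day at 02:15 (the script path is an assumed example).
cron_line 15 2 '*' '*' '*' /home/sanjeev/daily_job.sh

# Every 30 minutes, every day.
cron_line '*/30' '*' '*' '*' '*' /home/sanjeev/pingcheck.sh
```

The stars are quoted so the shell does not expand them into file names. On most Linux systems a printed line can be appended to your existing table with something like `( crontab -l 2>/dev/null; cron_line ... ) | crontab -`, but check how your cron handles that before relying on it.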
Handing it all to cron is not enough, though. Now you need to set up a log file, so you know that the things that are supposed to happen do happen the right way. It is not only humans who make mistakes; systems also go wrong, and you should be sure of that.
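A minimal sketch of such a log file, assuming a single file that every run appends to. The function name `log` and the default path `/tmp/myjob.log` are assumptions for illustration, not something cron provides.

```shell
#!/bin/bash
# Assumed log location; point it wherever your job should write.
LOG_FILE="${LOG_FILE:-/tmp/myjob.log}"

# Append one timestamped line to the log file.
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') : $*" >> "$LOG_FILE"
}
```

Inside the job you would then call `log "Started nightly job"` before the work and `log "Done nightly job"` after it, so a missing "Done" line tells you the run died halfway.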
Whether things go right or wrong, send out a mail from the cron job containing the log file, which records every instruction and its exit code. Now when you are on your break, the mail shows whether the script really ran well, and you can tell whoever is covering for you what to do.
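One way to get the exit codes into the log, sketched under some assumptions: `run_step` and `mail_subject` are hypothetical helpers, and the log path is a placeholder.

```shell
#!/bin/bash
LOG_FILE="${LOG_FILE:-/tmp/myjob.log}"   # assumed location
FAILED=0

# Run one command, append its output and exit code to LOG_FILE,
# and remember if any step failed.
run_step() {
    local rc
    echo "$(date) : running: $*" >> "$LOG_FILE"
    "$@" >> "$LOG_FILE" 2>&1
    rc=$?
    echo "$(date) : exit code $rc: $*" >> "$LOG_FILE"
    [ "$rc" -eq 0 ] || FAILED=1
    return "$rc"
}

# Subject line that shows at a glance how the run went.
mail_subject() {
    if [ "$FAILED" -eq 0 ]; then
        echo "cron job OK on $(uname -n)"
    else
        echo "cron job FAILED on $(uname -n)"
    fi
}
```

A job would call `run_step` for each command and finish with something like `mail -s "$(mail_subject)" you@example.com < "$LOG_FILE"`, where the address is a placeholder. Putting OK or FAILED in the subject means the person covering for you does not even need to open the mail to know whether to act.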
Wait a minute, have you completed the automation? No. The system on which the cron job is set up can also go down, so this is not yet the best design.
What to do next? You should have another machine that runs the same script, but now you need some way to share state between the two machines, so that when one machine has run the cron job the other does not run the same job again. One way to achieve this is to open a common port between them, over which they share the state of the cron run.
Alternatively, the machine that has already run the cron job can leave a status file for the other machine. When the other machine sees a fresh status file it does not kick off the job; if it does not see one, it kicks off its own run.
The status file can be exposed over HTTP, copied with scp, or kept on common storage that both machines check.
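The status-file check can be sketched roughly like this, assuming GNU `stat` and `date`. The helper name `ran_recently` is hypothetical, and how the file reaches this machine (scp, HTTP download, shared storage) is left out.

```shell
#!/bin/bash
# ran_recently STATUS_FILE MAX_AGE_SECONDS
# Succeeds only if the status file exists and was modified less than
# MAX_AGE_SECONDS ago, i.e. some machine has already done the job.
ran_recently() {
    local status_file="$1" max_age="$2" now mtime
    [ -f "$status_file" ] || return 1
    now=$(date +%s)
    mtime=$(stat -c %Y "$status_file") || return 1
    [ $(( now - mtime )) -lt "$max_age" ]
}
```

Each machine would start its job with `ran_recently "$STATUS_FILE" 3600 && exit 0`, and after a successful run refresh the file and push it to its peer. A missing or stale file is treated as "nobody has run it", so if the other machine is down this one still does the work.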
So here is what I do for my cron job in a similar case.
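#!/bin/bash
# Ping-check PING_HOST every 30 minutes from two hosts; if it was already done in the last hour, do not repeat.
# We run every 30 minutes, so if one run in the hour is missed, the second one still does the check.
# crontab: */30 * * * * /home/sanjeev/pingcheck.sh
# The variable values below were not part of the original listing; they are placeholders to adjust.
PING_HOST="${PING_HOST:?set PING_HOST}"
OTHER_HOST="${OTHER_HOST:?set OTHER_HOST to the scp destination of the log on the other host}"
mkdir -p tmp
LOG_FILE=tmp/pingcheck.log
TMP_LOG_FILE=tmp/pingcheck.tmp.log
STOP_FILE=tmp/pingcheck.stop
CT=`date '+%Y-%m-%d %H:%M:%S'`
CZ=`date -d "$CT" +"%s"`
# From here on, everything the script prints goes to the temporary log.
exec &> "$TMP_LOG_FILE"
# Creating the stop file is the manual switch to disable the check.
if [ -e "$STOP_FILE" ] ; then
  echo "`date` : Exiting ping check"
  exit 0
fi
# LOG_FILE is refreshed by this host or copied in by the other one.
if [ -f "$LOG_FILE" ]; then
  hs=`stat -c '%Z' "$LOG_FILE"`
  # Whole hours since the last check (reconstructed; the original line was lost).
  cDIFF=$(( (CZ - hs) / 3600 ))
  if [ $cDIFF -lt 1 ]; then
    echo "`date` : ping checked already done"
    exit 0
  fi
fi
echo "`date` : Started ping check"
echo "Ping check for : $PING_HOST"
ping -c3 -w3 "$PING_HOST"
echo "`date` : Done ping check"
cp "$TMP_LOG_FILE" "$LOG_FILE"
cat "$LOG_FILE" | mail -s "Ping Check: $CT $PING_HOST" sanjeev@$PING_HOST
# Copy the log to the other host so it sees the check as already done.
scp "$LOG_FILE" "$OTHER_HOST"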
I have seen many of my engineers do this, but most of the time when the cron job fails, they just look into fixing it and often forget to make it highly available. Systems fail, either the host that runs the job or the host that sends the mail.
Please do not overlook logging and checking. When systems run on their own, they are left unattended and we get into a lot of trouble.
I tried to talk about this in technical terms, but never found that it changed how people work, so what is the best way?
Storytelling. So I have put this up here; it may be helpful for technical and non-technical people writing their own simple automation or cron job.