Frustration with my home network's WAN connectivity randomly flapping, combined with being somewhat cheap, led me to see if I could get paged without having to worry about email relays or paying for a traditional notification service à la PagerDuty (pricey) or VictorOps (unreliable).
The first order of business was to write a simple script to perform some checks against my home network. It's a shell script run through cron that maintains a running log of check attempts and results. Each time it runs, it reads the last result, performs new checks, and, if the state changed from the previous attempt, sends out an alert.
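The "alert only on a state change" idea can be sketched on its own, separate from the network checks. This is a minimal illustration, not the script itself: the helper names last_status and status_changed are hypothetical, and it assumes log lines of the form "Jan 05 12:00:00 OK details...", so the status is the fourth whitespace-separated field.

```shell
#!/bin/bash
# Sketch only: alert when the new status differs from the last logged one.
# Assumes log lines look like "Jan 05 12:00:00 OK details...", so the
# status is the fourth whitespace-separated field.

# Hypothetical helper: print the status field of the last line in a log file.
last_status () {
    tail -n 1 "$1" 2>/dev/null | awk '{ print $4 }'
}

# Hypothetical helper: return 0 (true) only when the status has changed.
status_changed () {
    local log_file=$1 new_status=$2
    [ "$(last_status "${log_file}")" != "${new_status}" ]
}

# Demo against a throwaway log file.
demo_log=$(mktemp)
echo "Jan 05 12:00:00 OK sample result" > "${demo_log}"
if status_changed "${demo_log}" "OK"; then echo "alert"; else echo "quiet"; fi
if status_changed "${demo_log}" "WARNING"; then echo "alert"; else echo "quiet"; fi
rm -f "${demo_log}"
```

The demo prints quiet for the repeated OK and alert for the change to WARNING. With an empty or missing log the last status is empty, so the very first run counts as a change and would send an alert.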
The full script is below; it is also available on GitHub.
The script runs from cron; I have it scheduled to run every 2 minutes from a machine outside my home network. I don't have the script post to a channel; instead, it sends me a direct message. Then, in my Slack notification settings, I make sure push notifications are enabled for direct messages and for certain words/phrases.

#!/bin/bash
## Very basic check to see if home network is up.
## First attempt SSH connection to server.
## If this fails, then attempt to ping router.
## Record result of check to logfile.
## If current check status is different than last check status
## send message to Slack via webhook

## Host and SSH port to check
hName="[HostName]"
port="[Port]"

## Slack Hook and Channel
slackWebHook="https://hooks.slack.com/services/[WebHookLink]"
slackChannel="@[SlackUserName]"

## LogFile to maintain check state.
logFile="/[someLogDirectory]/check-home.log"

## Send a color coded message to Slack
function notifySlack () {
    status=$1
    hostname=$2
    result=$3
    case "${status}" in
        OK) color="good" ;;
        WARNING) color="warning" ;;
        CRITICAL) color="danger" ;;
        *) color="#909090" ;;
    esac
    payload="\"attachments\": [{ \"title\": \"${hostname} status is ${status}\", \"text\": \"${result}\", \"color\": \"${color}\" }]"
    if [ ! -z "${slackChannel}" ]; then
        curl -s -XPOST --data-urlencode "payload={ \"channel\": \"${slackChannel}\", ${payload} }" ${slackWebHook} > /dev/null 2>&1
    else
        curl -s -XPOST --data-urlencode "payload={ ${payload} }" ${slackWebHook} > /dev/null 2>&1
    fi
}

## Get previous status
lastEvent=`tail -1 ${logFile}`
lastStatus=`echo ${lastEvent} | awk '{ print $4 }'`

### First Check Host if SSH is up.
results=`echo QUIT | nc -v -w 5 ${hName} ${port} 2>&1 | grep -v mismatch`
res=`echo ${results} | awk '{print $5}'`

## Check connection status, set to WARNING if this fails
if [ "${res}" != "open" ]; then
    echo `date +"%h %d %T"` WARNING $results >> ${logFile}
    status="WARNING"
    ## Try pinging host
    results2=`ping -c 5 ${hName} | grep packets`
    percent=`echo ${results2} | awk '{ print $6 }' | sed -e 's/\%//'`
    if [ "${percent}" -gt 0 ]; then
        echo `date +"%h %d %T"` CRITICAL $results2 >> ${logFile}
        status="CRITICAL"
    fi
else
    ## Otherwise we are okay
    echo `date +"%h %d %T"` OK $results >> ${logFile}
    status="OK"
fi

## Do we notify Slack?
if [ "${status}" != "${lastStatus}" ]; then
    results=`tail -1 ${logFile}`
    notifySlack "${status}" "${hName}" "${results}"
fi
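The severity decision in the script (OK, WARNING, or CRITICAL) can be pulled out into a small function so it can be exercised without touching the network. This is a sketch rather than part of the original script: classify_status is a hypothetical name, and it assumes the packet loss has already been reduced to a whole number, as the script's awk and sed steps do for ping output such as "0% packet loss".

```shell
#!/bin/bash
# Sketch only: map the two check results to a status, mirroring the script.
#   $1 = "open" when the SSH port check succeeded, anything else otherwise
#   $2 = ping packet loss as a whole-number percentage (defaults to 0)
# classify_status is a hypothetical helper, not part of the original script.
classify_status () {
    local ssh_result=$1 packet_loss=${2:-0}
    if [ "${ssh_result}" = "open" ]; then
        echo "OK"          # SSH answered, nothing else to check
    elif [ "${packet_loss}" -gt 0 ]; then
        echo "CRITICAL"    # SSH failed and ping is losing packets
    else
        echo "WARNING"     # SSH failed but the host still answers ping
    fi
}

classify_status "open" 0
classify_status "refused" 0
classify_status "refused" 100
```

Run as-is, this prints OK, WARNING, and CRITICAL in that order. A ping that reports fractional loss (for example "0.0%") would need extra handling before the integer comparison.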
In the above, the system alerted when its SSH check against my home machine failed. I only consider this a warning, as the overall network could still be okay. The logic is that if the SSH check fails, the script then attempts to ping the router. The below shows an alert and recovery related to the WAN interface dropping off the Internet.
Overall, it works pretty well. Using a script and cron is pretty rudimentary monitoring, but it accomplishes the basic need. These alerts have allowed me to narrow down my search through my home router's logs to track down its problems.