A couple of people have asked me lately about network/infrastructure monitoring, so I thought I would share a product that I use to monitor a couple of sites, including my own internal infrastructure.
Without going into too much detail (and in order to get to the pretty pictures as quickly as possible), the concept behind network monitoring is to add sensors that constantly check for issues with servers, printers, routers, wireless access points, etc. Warning and full alert levels are set depending on the device or service in question, and notifications are sent to the relevant staff when those levels are reached.
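To make the idea a little more concrete, here is a minimal sketch in Python of how a sensor reading might be checked against a warning level and an alert level. The function name, the server names, and the threshold values are all made up for illustration; they are not taken from the product I use.

```python
def classify(value, warning_level, alert_level):
    """Classify a reading where a higher value is worse (e.g. memory used, in %)."""
    if value >= alert_level:
        return "alert"
    if value >= warning_level:
        return "warning"
    return "ok"


# Hypothetical memory-usage readings (percent used) for three made-up servers.
readings = {"server-a": 45.0, "server-b": 82.5, "server-c": 96.0}

# Assumed thresholds: warn at 80% used, full alert at 95% used.
statuses = {name: classify(value, 80.0, 95.0) for name, value in readings.items()}

# Anything that is not "ok" would trigger a notification to the relevant staff.
to_notify = sorted(name for name, status in statuses.items() if status != "ok")

for name in to_notify:
    print(f"{name}: {statuses[name]}")
```

A real monitoring product does far more than this (scheduling, history, graphs, notification delivery), but the warning/alert comparison at the heart of each sensor is roughly this simple.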
Without monitoring, there is often no indication that things are going bad, and the first I hear of it is when all the users are locked out of the server. With monitoring, I can see trends or emerging issues and restart a server or take some other action after hours or at a convenient time, and thereby avoid a crisis. This type of product is not a magic bullet, but it's a good tool to have and it has saved me a couple of times.
Image 1 (below) is a listing of all the monitored objects on the network. Note the memory warning on one of the servers:
Image 2 (below) shows the detail for that memory issue. Note the yellow "warning" level and the red "alert" level on the graph.