Nagios: Watching the Watchmen

By AdamT

by Lithium Guru on ‎05-26-2009 10:47 PM

Even with a skilled team, it helps to have systems in place that tell us about problems, whether real or potential. The Production Operations team's monitoring system of choice is Nagios.

 

Nagios is a 10-year-old, battle-hardened system for monitoring... just about anything. At Lithium, we use it to make sure our communities are running, to make sure our servers are healthy and to give us insights into services that may be due for a preemptive restart or configuration change.

 

[Screenshot: service-detail.png]

 

Lithium has extended Nagios in several ways. We've created a custom IRC bot that alerts us to any change in service status. We also work actively with our development teams to build monitoring hooks directly into our application, so that we can monitor the internals of our communities. These hooks give us rich access to health data: garbage collection statistics over time, detailed views of the memory usage of specific pieces of the application, and analyses of internal structures that give us early warning of possible issues.
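To make the idea of a monitoring hook feeding Nagios more concrete, here is a minimal sketch of a check in the standard Nagios plugin style: one status line, optional performance data after a `|`, and a result code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). This is not Lithium's actual check. The function name `evaluate_heap`, the heap-usage metric and the 80/90 percent thresholds are all assumptions made up for illustration.

```python
# Standard Nagios plugin result codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_heap(used_pct, warn=80.0, crit=90.0):
    """Map a heap-usage percentage to (exit_code, status_line).

    Hypothetical example: the metric and the default thresholds are
    illustrative, not taken from any real deployment.
    """
    # A missing or out-of-range reading is reported as UNKNOWN, not OK.
    if used_pct is None or not 0.0 <= used_pct <= 100.0:
        return UNKNOWN, "HEAP UNKNOWN - no usable reading"

    if used_pct >= crit:
        code = CRITICAL
    elif used_pct >= warn:
        code = WARNING
    else:
        code = OK

    # Text after '|' is performance data: label=value[unit];warn;crit
    line = "HEAP {} - {:.1f}% used | heap_used={:.1f}%;{:.1f};{:.1f}".format(
        LABELS[code], used_pct, used_pct, warn, crit
    )
    return code, line


# Example: a reading of 85% falls between the two thresholds.
code, line = evaluate_heap(85.0)
print(code, line)
```

A real plugin would fetch the reading from the application's hook instead of taking it as an argument, print the status line, and exit with the returned code so that Nagios can pick up the state.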

 

Nagios is a critical tool for the Production Operations team, but its extension and the maturity of Lithium's internal tools that surround it are what make the process shine.

comments
VIP Council on ‎05-28-2009 09:01 PM

Adam,

 

Great post - when everything is running well, it's easy to think of hosting as a utility: always on, with the assumption that the site will always be up. One can start to take it for granted.

 

Our community has run exceptionally well. There are some performance issues every now and again, but time and again they have turned out to be internet routing faults, where traffic has broken down between hops, and nothing to do with the hosting or the application itself. I think that speaks volumes about the work your team does and the tools and processes you have developed. Thanks for sharing a bit of what's going on behind the scenes.

 

Mark
