So I think the Charting API is fairly well documented (completely documented, if nothing else); time will tell, of course. For instance, just today I was implementing vaxis_title on timelines. Never thought that would be needed, but it was.
Charts API
Just a small change - vaxis_title is now supported as a meta option for timeline charts.
Cacti
Cacti is simple. For the Cacti system itself, everything is stored within a single folder (for security reasons I won’t publish the full directory). Then there’s the Apache config file and a cron job that runs the poller every five minutes. The poller is a PHP script that updates all of the graphs each time it runs. Cacti runs off of graph templates, which are defined in XML files: each file defines what is graphed, how to get that data, and so on. Some files define more than one graph.
These graph templates have to be imported using the “Import Templates” screen.
Aftermarket (Pimp my Cactus?)
To ease the process of deploying graphs across all of the machines, I created the ‘Add graph to all devices’ link in the left sidebar. Clicking this link brings you to a page where you can select a graph from a dropdown and press the associated button. Pushing the button runs a script that drops that graph onto all of the machines (or only the dbmasters or webs, according to the button).
These are shell scripts located in the Cacti root directory.
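To make the fan-out idea concrete, here is a minimal sketch of what "drop one graph onto every machine in a group" amounts to. It is not the actual shell script: it is written in Python, the host IDs and group contents are made up, and it assumes the graphs are added through the add_graphs.php command-line script that ships in Cacti's cli/ directory, with the option names as I remember them. It only builds the commands, so nothing is executed against Cacti.

```python
from typing import Dict, List

# Hypothetical inventory: Cacti host IDs grouped by role.
# Real IDs come from your own Cacti install; these are illustrative only.
HOSTS_BY_GROUP: Dict[str, List[int]] = {
    "dbmasters": [1, 2],
    "webs": [3, 4, 5],
}


def build_add_graph_commands(
    template_id: int,
    group: str = "all",
    cli_path: str = "cli/add_graphs.php",  # assumed location of the Cacti CLI script
) -> List[List[str]]:
    """Return one argv list per host that should receive the graph template."""
    if group == "all":
        # Union of every group, de-duplicated and in a stable order.
        host_ids = sorted({h for ids in HOSTS_BY_GROUP.values() for h in ids})
    else:
        host_ids = HOSTS_BY_GROUP[group]
    return [
        [
            "php", "-q", cli_path,
            "--graph-type=cg",  # assumed flag for a plain graph-template graph
            f"--graph-template-id={template_id}",
            f"--host-id={host_id}",
        ]
        for host_id in host_ids
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for argv in build_add_graph_commands(template_id=7, group="webs"):
        print(" ".join(argv))
```

Each button on the page would then correspond to a different group argument ("all", "dbmasters" or "webs").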
AutoCacti
To make life easier, most of the above was consolidated inside an ‘autocacti’ folder. The TOOLCHAIN-INSTALL file documents how to use the toolchain. AutoCacti automates the installation of Cacti and the Better Templates set. The ‘add-target.sh’ script inside the toolchain/ directory is used to add new machines to the list of monitored machines. At the moment it simply prints instructions for how to do that, but at some point it will actually add the target to the monitored devices.
Nagios
Nagios is an alerting system that essentially duplicates Server Density, while providing a few features that Server Density lacks (and, likewise, lacking a few that it has). Nagios depends heavily on its configuration files, which can be a pain to understand.
Nagios is similar to Cacti in its internal mechanics: they both run a scheduled task every so often that checks a list of machines for certain statistics. In Nagios, all of these checks are handled through external programs called, well, plugins. Adding a plugin is done by modifying commands.cfg and conf.d/services.cfg (unfortunately, you still have to do this manually).
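As a rough illustration of what those two manual edits involve, the sketch below renders a command definition (the kind of thing that goes in commands.cfg) and a service definition that uses it (for conf.d/services.cfg). The object syntax and the $USER1$, $HOSTADDRESS$ and $ARGn$ macros are standard Nagios; everything else is an assumption for the example: the command name, the host name web1, the remote plugin path, the thresholds, and the generic-service template from the stock sample config.

```python
from typing import Dict


def render_object(kind: str, directives: Dict[str, str]) -> str:
    """Render one Nagios object definition ("define <kind> { ... }") as text."""
    # Pad directive names so the values line up in a column.
    width = max(len(name) for name in directives) + 4
    lines = [f"define {kind} {{"]
    for name, value in directives.items():
        lines.append(f"    {name.ljust(width)}{value}")
    lines.append("}")
    return "\n".join(lines)


# Would be appended to commands.cfg: how to run the plugin.
# check_by_ssh runs a plugin on the remote machine; the remote path is hypothetical.
command_cfg = render_object("command", {
    "command_name": "check_remote_load",
    "command_line": (
        "$USER1$/check_by_ssh -H $HOSTADDRESS$ "
        '-C "/usr/lib/nagios/plugins/check_load -w $ARG1$ -c $ARG2$"'
    ),
})

# Would be appended to conf.d/services.cfg: which host gets the check.
service_cfg = render_object("service", {
    "use": "generic-service",          # assumed template name
    "host_name": "web1",               # hypothetical host
    "service_description": "Load",
    "check_command": "check_remote_load!5,4,3!10,8,6",
})

if __name__ == "__main__":
    print(command_cfg)
    print()
    print(service_cfg)
```

The arguments after each "!" in check_command fill $ARG1$ and $ARG2$ in the command line, which is how one command definition can serve many services with different thresholds.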
Adding a new machine can be done by running the add-web-host.sh script found in the config directory, passing it the name of the machine that you wish to add. It will automatically load up the default set of alerts, and will start alerting for them.
The actual semantics of the config files are quite complex, and I won’t cover them here. Yet.
One thing that’s good to know is that the user that runs the web interface for Nagios MUST be allowed to SSH to the monitored machines without entering credentials.
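A quick way to verify this is to attempt a connection with OpenSSH's BatchMode option, which disables password and passphrase prompts so the attempt fails instead of hanging. The sketch below assumes key-based login is how the passwordless access is set up, and the host name is hypothetical. It has to be run as the same user that runs the Nagios web interface for the result to mean anything.

```python
import subprocess
from typing import Callable, List


def batch_ssh_argv(host: str, timeout_s: int = 5) -> List[str]:
    """Build an ssh command that never prompts and just runs `true` remotely."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                 # fail rather than ask for a password
        "-o", f"ConnectTimeout={timeout_s}",   # do not hang on unreachable hosts
        host,
        "true",
    ]


def can_ssh_without_prompt(
    host: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True if the current user can log in to `host` non-interactively."""
    result = run(
        batch_ssh_argv(host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    for host in ["web1"]:  # hypothetical machine name
        status = "ok" if can_ssh_without_prompt(host) else "FAILED"
        print(f"{host}: {status}")
```

The run parameter exists only so the check can be exercised without a network; in normal use the default subprocess.run is what executes ssh.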