****************
  Introduction
****************

This document describes the Nagios plugins mainly used to monitor NorduGrid
ARC compute elements and related resources, though some probes should also be
usable to test non-ARC resources.  The package includes commands for:

* LDAP queries and tests on the information system, including GLUE 2.0 and
  legacy schemas.

* Job submission and monitoring of jobs with additional custom checks.

* Transfers to and from storage elements using various protocols.

The following chapters will cover the probes related to each of these topics.
This chapter will describe common configuration and options.

**Acknowledgements.**
This work is co-funded by the EC EMI project under the FP7 Collaborative
Projects Grant Agreement Nr. INFSO-RI-261611.


.. _configuration-files:

Configuration Files
===================

The configuration is merged from a list of INI-format files, where
settings from later files take precedence, and missing files are ignored.
By default the files considered are ::

    /etc/nagios/plugins.ini
    /etc/nagios/plugins/arcnagios-dist.ini
    /etc/nagios/plugins/arcnagios.ini
    /etc/nagios/plugins/arcnagios-local.ini

but this can be overridden by setting the environment variable
``$ARCNAGIOS_CONFIG`` to a colon-separated list.  ``arcnagios-dist.ini`` is
distributed with the plugins and contains a small collection of predefined
tests for the CE and infosys probes.
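
As a minimal sketch of such an override, the following sets
``$ARCNAGIOS_CONFIG`` to two files and reports which of the listed files
exist.  The file ``$HOME/arcnagios-test.ini`` and the helper
``list_config_files`` are hypothetical names chosen for illustration; they are
not part of the package.

.. code-block:: sh

    # Later entries take precedence over earlier ones.
    # $HOME/arcnagios-test.ini is a hypothetical per-user file.
    ARCNAGIOS_CONFIG=/etc/nagios/plugins/arcnagios-dist.ini:$HOME/arcnagios-test.ini
    export ARCNAGIOS_CONFIG

    # Hypothetical helper: report which of the listed files exist.
    # The subshell keeps the IFS and globbing changes local.
    list_config_files() {
        (
            IFS=:
            set -f
            for f in $ARCNAGIOS_CONFIG; do
                if [ -e "$f" ]; then
                    echo "found   $f"
                else
                    echo "missing $f"
                fi
            done
        )
    }

    list_config_files

Since missing files are ignored, an entry reported as ``missing`` is harmless,
but it may point to a typo in the list.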

Each probe has a main configuration section, which is named after the probe.
In this section you can provide defaults for command-line options.  The name
of the configuration variable corresponding to an option is obtained by
stripping the initial "``--``" and replacing "``-``" with "``_``", e.g.
"``--home-dir``" becomes "``home_dir``".

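The naming rule can be expressed as a one-line transformation.  The function
``option_to_variable`` below is a hypothetical helper written for this
illustration, not a command shipped with the plugins.

.. code-block:: sh

    # Hypothetical helper: strip the leading "--" and turn "-" into "_".
    option_to_variable() {
        printf '%s\n' "$1" | sed -e 's/^--//' -e 's/-/_/g'
    }

    option_to_variable --home-dir              # prints: home_dir
    option_to_variable --multiline-separator   # prints: multiline_separator
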

Common Options
==============

The following options are common to all probes:

``--home-dir=<dir>``
    Override $HOME at startup. This is a workaround for external commands
    which store things under $HOME on systems where the user account running
    Nagios does not have an appropriate or writable home directory.

``--loglevel=(debug|info|warning|error)``
    This option allows you to increase the verbosity of the Nagios probes.
    Additional messages will occur as extended status lines in Nagios.

``--multiline-separator=<chars>``
    Replacement for newlines when submitting multi-line results to passive
    services. Pass the empty string to drop extra lines. This option exists
    because Nagios currently does not support multi-line passive results.

``--command-file=<path>``
    The path of the Nagios command file.  By default $NAGIOS_COMMANDFILE is
    used, which is usually the right thing.

``--how-invoked=(nagios|manual)``, ``--dump-options``
    These are only needed for debugging purposes.
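
The intended effect of ``--multiline-separator`` can be pictured with a small
shell sketch.  The ``join_lines`` function is a hypothetical stand-in and not
the probes' own code; it assumes that the separator simply replaces each
newline, and that an empty separator keeps only the first line.

.. code-block:: sh

    # Hypothetical stand-in: $1 is the separator, stdin the multi-line result.
    join_lines() {
        awk -v sep="$1" '
            NR == 1   { printf "%s", $0; next }     # first line is always kept
            sep != "" { printf "%s%s", sep, $0 }    # later lines only if sep is set
            END       { print "" }
        '
    }

    printf 'Job failed\nstage-in error\n' | join_lines ' | '
    # prints: Job failed | stage-in error

    printf 'Job failed\nstage-in error\n' | join_lines ''
    # prints: Job failed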


.. _x509-proxy:

Proxy Certificate
=================

The ``check_arcce`` and ``check_gridstorage`` probes require a proxy
certificate to succeed.  The probes will maintain a proxy when provided an X509
certificate and key.  You can place these in a common section:

.. code-block:: ini

    [gridproxy]
    default_voms = <voms>
    user_key = <path>
    user_cert = <path>
    #user_proxy = <path> # Optionally override the path of the generated proxy.

The probes which require an X509 proxy have a ``--voms=<voms>`` option to
specify the VOMS server to contact instead of ``default_voms``.  When a
``user_key`` and ``user_cert`` pair is given, the default ``user_proxy`` path
is unique to the selected VOMS.

To use a pre-initialized proxy, make sure ``user_key`` and ``user_cert`` are
not set.  You will probably want to use a non-default location for the
proxy.  Either point to it with the environment variable ``X509_USER_PROXY``
or set it in the configuration file:

.. code-block:: ini

    [gridproxy]
    user_proxy = <path>

If you use several VOs which require different certificates, you can replace
the above section with one section ``gridproxy.<voms>`` per ``<voms>`` and use
the ``--voms`` option to select which section to use.  These sections don't
have the ``default_voms`` setting.
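
A minimal sketch of such a setup follows.  It writes two per-VO sections to a
scratch configuration file.  The VO names ``vo-a.example.org`` and
``vo-b.example.org``, the certificate paths, and the file name are all
hypothetical placeholders.

.. code-block:: sh

    # Hypothetical scratch file; adjust to your own configuration layout.
    conf=${TMPDIR:-/tmp}/arcnagios-multivo.ini

    # One gridproxy.<voms> section per VO, without default_voms.
    printf '%s\n' \
        '[gridproxy.vo-a.example.org]' \
        'user_key = /etc/nagios/certs/vo-a-key.pem' \
        'user_cert = /etc/nagios/certs/vo-a-cert.pem' \
        '' \
        '[gridproxy.vo-b.example.org]' \
        'user_key = /etc/nagios/certs/vo-b-key.pem' \
        'user_cert = /etc/nagios/certs/vo-b-cert.pem' \
        > "$conf"

With this file included in the configuration, a probe invoked with
``--voms=vo-b.example.org`` would use the second section.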


Security Notice
===============

The configuration files of these probes should not be generated, or have parts
substituted, from an untrusted source without proper filtering.  In particular,
the job tests pick up shell code to be executed on cluster nodes from
configuration variables, and the ARIS tests use the Python interpreter to
evaluate custom expressions.


Running Probes from the Command-Line
====================================

The following instructions apply to ``check_arcce_submit``,
``check_arcce_monitor``, ``check_arcce_clean``, ``check_aris``,
``check_egiis``, ``check_arcglue2``, and ``check_arcstorage``.  They also apply
to the deprecated ``check_arcinfosys`` and ``check_arcce``.  The other probes
can be invoked from the command-line without special attention.

For testing and debugging, it can be convenient to invoke the probes manually
as a regular user.  This can be done as follows.  Choose a directory where you
can store run-time state.  Below, we use ``/tmp``, but it may be tidier to
create a fresh directory.  Then, create a configuration like

.. code-block:: ini

    [DEFAULT]
    plugins_spooldir = /tmp

    [gridproxy]
    default_voms = <your-vo>

    [gridproxy.<your-vo>]
    user_proxy = /tmp/x509up_u<your-user-id>

substituting suitable values for the ``<your-*>`` meta-variables.  You may
need to add further settings depending on what you test, of course.  Next,
acquire a proxy certificate (if needed) and point the probes to the new
configuration file:

.. code-block:: sh

    arcproxy -S <your-vo>
    export ARCNAGIOS_CONFIG=<your-config>

The probes can now be run as

.. code-block:: sh

    check_arcce_submit --how-invoked=manual ...
    check_arcce_monitor --how-invoked=manual ...
    check_arcce_clean --how-invoked=manual ...
    check_egiis --how-invoked=manual ...
    check_aris --how-invoked=manual ...
    check_arcglue2 --how-invoked=manual ...

The main purpose of ``--how-invoked=manual`` is to tell the probe that any
passive results shall be printed to the screen rather than submitted to the
Nagios command pipe.  It is not strictly needed for active-only probes.


Deprecated Probes
=================

The following probes are deprecated.  They will be removed in a future
release.

* ``check_arcce`` is obsoleted by ``check_arcce_submit``,
  ``check_arcce_monitor`` and ``check_arcce_clean``.
* ``check_arcinfosys`` is obsoleted by ``check_aris``, ``check_arcglue2``, and
  ``check_egiis``.
