[[tsi_api]]

The TSI API
-----------

This document describes the API to the TSI as used by UNICORE/X (XNJS). 
The parts of the TSI that interact with the target system have been 
isolated and are documented here with their function calls.

The functions are implemented in the TSI as calls to Perl methods (with the methods
loaded through modules). Input data from the XNJS is passed as arguments to the method.
Output is returned to the XNJS by calling some global methods documented below or by
directly accessing the TSI's command and data channels.
TSIs are shipped with default implementations of all the functions and can be
tailored by changing the supplied code or by implementing new versions of the functions
that need to change for the system.

Note that this document is not a complete definition of the API; it is 
a general overview. The full API specification can be derived by reading the 
TSI code supplied with a UNICORE release.

Initialisation
~~~~~~~~~~~~~~

For authentication of the XNJS, a callback mechanism is used. First, the XNJS will
contact the main TSI (the TSI shepherd) to request the creation of a new TSI worker process. 
The main TSI will call back the XNJS and create the necessary communications. It will
receive any initialisation information sent by the XNJS.
After successful creation of the TSI worker process, the XNJS can communicate with the worker
and ask it to execute commands. The XNJS-TSI connection uses two sockets, a data and a 
command socket.

After initialisation is complete, the +infinite_loop()+ function (MainLoop.pm module) 
is entered which reads messages from the XNJS and dispatches processing to the various 
TSI functions.

Messages to the XNJS
~~~~~~~~~~~~~~~~~~~~

The TSI provides methods to pass messages to the XNJS.
In particular, the XNJS expects every method to call either +ok_report+ or +failed_report+ at
the end of its execution. The messaging methods are:

 * +ok_report(string)+ Sends a message to the XNJS to say that execution of the command was successful.
The string is also logged as a debug message.
 
 * +failed_report(string)+ Sends a message to the XNJS to say that execution of the command failed. 
The string is sent to the XNJS as part of the failure message. It is also logged.

 * +debug_report(string)+ Logs string as a debug message.


User identity and environment setting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In production mode the TSI will be started as a privileged user capable of changing the
TSI worker process' uid and gid to the user and account requested by the UNICORE user. 
This change is made before the TSI executes any external actions. The identity is passed 
as a line in the message string sent by the XNJS, which starts with +#TSI_IDENTITY+.

The TSI performs three types of work: the execution and monitoring of jobs prepared by
the user, the transfer and manipulation of files on storages, and the management of
Uspaces (job working directories). Only the first type of work, execution of jobs, 
needs a complete user environment. The other two types of TSI work use a restricted set 
of standard commands (mkdir, cp, rm etc) and should not require access to specific 
environments set up by users. Furthermore, job execution is not done directly by 
the TSI but is passed off to the local Batch Subsystem which ensures that a full 
user environment is set before a job is executed. Therefore, the TSI only needs to 
set a limited user environment for any child processes that it creates.
The TSI sets the following environment in any child process:

 * +$USER+ This is set to the user name supplied by the XNJS.
 * +$LOGNAME+ This is set to the user name supplied by the XNJS.
 * +$HOME+ This is set to the home directory of the user as given by the target system's
password file.
 * +$PATH+ This is inherited from the parent TSI process (see the +tsi+ script file).

Localisations of the TSI can also set any other environment necessary to access the BSS.
This is done through the Perl +%ENV+ hash.

For testing, the TSI may be started as a non-privileged user and so no changing of uid and gid
is possible.
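
The limited child environment described above can be illustrated with a short sketch. It is written in Python purely for illustration (the TSI itself is Perl), and the function name +child_environment+ and its parameters are hypothetical, not part of the TSI code.

```python
import os
import pwd


def child_environment(user_name, parent_env=None):
    """Build the limited environment for a child process (illustrative only)."""
    parent_env = os.environ if parent_env is None else parent_env
    # HOME is looked up in the target system's password database.
    home = pwd.getpwnam(user_name).pw_dir
    return {
        "USER": user_name,
        "LOGNAME": user_name,
        "HOME": home,
        # PATH is the only variable inherited from the parent process.
        "PATH": parent_env.get("PATH", ""),
    }
```

Nothing else from the parent environment is copied in this sketch; a localisation that needs extra variables for the BSS would add them explicitly.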

Method dispatch
~~~~~~~~~~~~~~~

To determine which method to call, the +infinite_loop+ function checks the message
from the XNJS for the occurrence of special tags (followed by a new line). For
example, the occurrence of +#TSI_SUBMIT+ will lead to execution of the +submit()+ function.
Before entering any method, user/group ID switching is performed, as explained in the previous
section.
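
As a rough illustration of tag-based dispatch, the following Python sketch looks for a known tag on a line of its own and calls the matching handler. The names +find_command+ and +dispatch+, and the handler table, are assumptions made for this example; the real logic lives in the Perl +infinite_loop+ function and may differ in detail.

```python
def find_command(message, known_tags):
    """Return the first known tag that occurs on a line of its own, else None."""
    lines = [line.strip() for line in message.splitlines()]
    for tag in known_tags:
        if tag in lines:
            return tag
    return None


def dispatch(message, handlers):
    """Call the handler registered for the tag found in the message."""
    tag = find_command(message, list(handlers))
    if tag is None:
        raise ValueError("no known TSI command tag in message")
    return handlers[tag](message)


# Hypothetical handler table mapping tags to functions.
handlers = {
    "#TSI_SUBMIT": lambda message: "submit",
    "#TSI_QSTAT": lambda message: "get_status_listing",
}
result = dispatch("#TSI_SUBMIT\n#TSI_JOBNAME demo\necho hello\n", handlers)
```

Matching whole lines rather than substrings keeps a tag from being confused with a longer tag that shares its prefix.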

Job submission (#TSI_SUBMIT)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The +submit(string)+ function submits a user script to the BSS.

Input
+++++

As input, the script to be executed is expected. The string from the XNJS 
is processed to replace all instances of $USER by the user's
name and $HOME by the user's home directory. No further processing needs to 
be done on the script.
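
The substitution step amounts to a plain textual replacement. A minimal Python sketch, with the hypothetical helper name +substitute_user_and_home+, might look like this:

```python
def substitute_user_and_home(text, user_name, home_dir):
    """Replace every literal $USER and $HOME in the text."""
    return text.replace("$USER", user_name).replace("$HOME", home_dir)


script = substitute_user_and_home(
    "cd $HOME/work\necho $USER\n", "alice", "/home/alice"
)
```

Because the replacement is purely textual, it also rewrites longer names that begin with the same characters, such as +$USERNAME+.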

The XNJS will embed information in the script that the TSI may need to use. This
information will be embedded as comments so no further processing is needed.
Each piece of information will be on a separate line with the format:

-------
#TSI_name value
-------

If the value is the string 'NONE', then the particular information should not be
supplied to the BSS during submission. The information is:

 * +#TSI_JOBNAME+ This is the name that should be given to the job. If this is +NONE+, 
the TSI will use a default jobname.

 * +#TSI_PROJECT+ The user's project (for accounting)

 * +#TSI_STDOUT+ and +#TSI_STDERR+ The names of the standard output and standard error files.

 * +#TSI_OUTCOME_DIR+ The directory to which the stdout and stderr files are written. 
In general this is the same as +#TSI_USPACE_DIR+.

 * +#TSI_USPACE_DIR+ The initial working directory of the script (i.e. the Uspace 
directory).

 * +#TSI_TIME+ The run time (wall clock) limit requested by this job in seconds

 * +#TSI_MEMORY+ The memory requirement of the job (in megabytes).
The XNJS supplies this as a per-node value.

 * +#TSI_TOTAL_PROCESSORS+ The number of processors required by the job.

 * +#TSI_PROCESSORS+ The number of processors per node required by the job.

 * +#TSI_NODES+ The number of nodes required by this job.

 * +#TSI_QUEUE+ The BSS queue to which this job should be submitted.

 * +#TSI_UMASK+ The default umask for the job

 * +#TSI_EMAIL+ The email address to which the BSS should send any status change emails.

 * +#TSI_RESERVATION_REFERENCE+ if the job should be run in a reservation, this parameter
contains the reservation ID.

 * +#TSI_PREFER_INTERACTIVE <junk>+ The presence of this indicates that the task 
should be executed 'interactively' i.e. on the TSI node without submission to the BSS. 
The TSI can reply with an OK and not the BSS id.

 * +#TSI_BSS_NODES_FILTER <filterstring>+ Administrators can define a string in the IDB which is 
to be used as nodes filter, if the BSS supports this.
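
To make the +#TSI_name value+ convention concrete, the following Python sketch collects the embedded lines from a script and drops those whose value is +NONE+. The function name +parse_tsi_directives+ is hypothetical, and the sketch assumes that name and value are separated by a single space, as in the format shown above.

```python
def parse_tsi_directives(script):
    """Collect '#TSI_name value' lines, dropping entries whose value is NONE."""
    directives = {}
    for line in script.splitlines():
        if not line.startswith("#TSI_"):
            continue
        name, _, value = line.partition(" ")
        value = value.strip()
        if value == "NONE":
            continue  # must not be passed on to the BSS
        directives[name] = value
    return directives


example = (
    "#!/bin/sh\n"
    "#TSI_SUBMIT\n"
    "#TSI_JOBNAME NONE\n"
    "#TSI_QUEUE batch\n"
    "#TSI_TIME 3600\n"
    "echo hello\n"
)
directives = parse_tsi_directives(example)
```

Tags that carry no value, such as +#TSI_SUBMIT+, are kept with an empty string in this sketch.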

Output
++++++

 * Normal: the output is the BSS identifier of the job, unless the execution was interactive.
In that case the execution is complete when the TSI returns from this call, and the output
is that from +ok_report()+.

 * Error: +failed_report()+ called with the reason for failure


Reading files (#TSI_FILECHUNK)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The +get_file_chunk(string)+ function is called by the XNJS to fetch the contents of 
a file. 

Input
+++++

 * +#TSI_FILE <file name>+ The full path name of the file to be sent to the XNJS
 * +#TSI_START <start byte>+ Where to start reading the file
 * +#TSI_LENGTH <chunk length>+ How many bytes to return

The file name is modified by the TSI to substitute all occurrences of the string '$USER'
by the name of the user and all occurrences of the string '$HOME' by the home
directory of the user.

Output
++++++

 * Normal: The XNJS has a copy of the requested part of the file (sent via the data socket)
 * Error: +failed_report()+ is called with the reason for failure.
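
Reading a chunk defined by a start byte and a length can be sketched as follows. This is an illustrative Python fragment with a hypothetical function name, +read_file_chunk+; it only covers reading the bytes and leaves out sending them over the data socket.

```python
def read_file_chunk(path, start, length):
    """Return up to `length` bytes of the file, beginning at byte offset `start`."""
    with open(path, "rb") as handle:
        handle.seek(start)
        return handle.read(length)
```

If the requested range extends past the end of the file, this sketch returns only the bytes that exist, possibly none.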


Writing files (#TSI_PUTFILES)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The +put_files+ function is called by the XNJS to write the contents of one or more files 
to a directory accessible by the TSI.

Input
+++++

The +#TSI_FILESACTION+ parameter contains the action to take if the file exists (or does not):
0 = don't care, 1 = only write if the file does not exist, 2 = only write if the file exists,
3 = append to file. This action applies to all the files in a call of +put_files+.

The data to write is then read from the data channel following this pseudo code:

* while there are files to transfer:
** read filename and permissions from command channel
** substitute all occurrences of the string '$USER' by the name of the user and all occurrences of the string
'$HOME' by the home directory of the user.
** while there are more bytes:
*** read packet_size from command channel
*** read packet_size bytes from data channel
*** write bytes to file
 
Where 'permissions' are the permissions to set on the file.
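
The four +#TSI_FILESACTION+ values can be illustrated with a small Python sketch that applies one action to a single file. The function name +write_with_action+ and its boolean return value are assumptions for this example; how the TSI reports a skipped file to the XNJS is not covered here.

```python
import os


def write_with_action(path, data, action):
    """Apply one #TSI_FILESACTION value to a single file; return True if written."""
    if action not in (0, 1, 2, 3):
        raise ValueError(f"unknown file action: {action}")
    exists = os.path.exists(path)
    if action == 1 and exists:
        return False  # only write if the file does not exist
    if action == 2 and not exists:
        return False  # only write if the file exists
    mode = "ab" if action == 3 else "wb"  # 3 appends, everything else overwrites
    with open(path, mode) as handle:
        handle.write(data)
    return True
```

In a full implementation the data would arrive packet by packet from the data channel, and the permissions would be set on the file once it has been written.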

Output
++++++

 * Normal: The TSI has written the files to the directory.
 * Error: +failed_report()+ called with the reason for failure.


Script execution (#TSI_EXECUTESCRIPT)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This function executes the script directly from the TSI process, without submitting the
script to the batch subsystem. This function is used by the XNJS to create and manipulate 
the Uspace, to perform file management functions, and to execute helper scripts like
+tsi_ls+. The XNJS also uses this to execute user-defined code.
Input
+++++

The script to be executed. The string from the XNJS is processed to replace all instances 
of +$USER+ by the user's name and +$HOME+ by the user's home directory. No further processing 
needs to be done on the script.
If a +#TSI_DISCARD_OUTPUT+ string is present, no output will be gathered.

Output
++++++

 * Normal: The script has been executed. Concatenated stderr and stdout from the execution of the 
script is sent to the XNJS following the  +ok_report()+ call.
 * Error: +failed_report()+ called with the reason for failure.

Job control
~~~~~~~~~~~

 * +#TSI_ABORTJOB+ The +abort_job+ function sends a command to the BSS to abort the 
named BSS job. Any stdout and stderr produced by the job before the abort takes effect 
must be saved.

 * +#TSI_CANCELJOB+ The +cancel_job+ function sends a command to the BSS to cancel 
the named BSS job. Cancelling means both finishing execution on the BSS (as for abort) 
and removing any stdout and stderr.
 
 * +#TSI_HOLDJOB+ The +hold_job+ function sends a command to the BSS to hold execution 
of the named BSS job. Holding means suspending execution of a job that has started, or 
not starting execution of a queued job. Note that a suspended job may keep the resources 
allocated to it even though it is not executing, so some sites may not allow this. 
For that reason, an acceptable implementation is for +hold_job+ to return without executing a command.
Some sites can hold a job's execution and release the resources held by the job (leaving
the job on the BSS so that it can resume execution). This is called freezing. The XNJS can
send a request for a freeze (+#TSI_FREEZE+), which the TSI may execute. If no 
freeze command is configured, the TSI may execute a hold in its place.

 * +#TSI_RESUMEJOB+ The +resume_job+ function sends a command to the BSS to resume execution 
of the named BSS job. Note that suspending execution can result in the resources allocated to the 
job being held by the job even though it is not executing and so some sites may not allow this. 
An acceptable implementation is for resume_job to return without executing a command (if hold_job did
the same).

Input
+++++

All job control functions require the BSS job ID as a parameter in the form
+#TSI_BSSID <identifier>+.

Output
++++++

 * Normal: the job control function was invoked. No extra output.
 * Error: +failed_report()+ called with the reason for failure.
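
The relationship between hold and freeze requests described above can be sketched as a simple selection rule. The Python fragment below is illustrative only: the function name +select_hold_command+, the +commands+ table, and the command strings are hypothetical placeholders for whatever a site configures.

```python
def select_hold_command(request, commands):
    """Pick the BSS command for a hold or freeze request, or None for a no-op."""
    if request == "#TSI_FREEZE" and commands.get("freeze"):
        return commands["freeze"]
    # A plain hold was requested, or no freeze command is configured.
    # A result of None means: return without executing a command.
    return commands.get("hold")


# Hypothetical site configuration without a freeze command.
site_commands = {"hold": "bss-hold"}
chosen = select_hold_command("#TSI_FREEZE", site_commands)
```

With this configuration a freeze request falls back to the hold command, and a site that configures neither command gets the no-op behaviour.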


Status listing (#TSI_QSTAT)
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The +get_status_listing+ function returns the status of all the jobs on the BSS that have been 
submitted through any TSI providing access to the BSS.

This method is called with the TSI's identity set to the special user ID
configured in the XNJS (+CLASSICTSI.priveduser+ property). This is because the XNJS expects 
the returned listing to contain every UNICORE job from every UNICORE user but some BSS only 
allow a view of the status of all jobs to privileged users.

Input
+++++

None.

Output
++++++

 * Normal: The first line is 'QSTAT'. There follows an arbitrary number of lines, each line 
containing the status of a job on the BSS with the following format: 
"id status <queuename>", where +id+ is the BSS identifier of the job 
and +status+ is one of: QUEUED, RUNNING, SUSPENDED or COMPLETED. Optionally, the queue name
can be listed as well. The output must include all jobs still on the BSS that were submitted 
by a TSI executing on the target system (including all those submitted by TSIs other than 
the one executing this command). The output may include lines for jobs on the BSS submitted by other
means.

 * Error: +failed_report()+ called with the reason for failure.
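
A consumer of the status listing could parse it along the following lines. This Python sketch assumes whitespace-separated fields and uses the hypothetical name +parse_status_listing+; it is not taken from the XNJS code.

```python
VALID_STATES = {"QUEUED", "RUNNING", "SUSPENDED", "COMPLETED"}


def parse_status_listing(reply):
    """Parse a QSTAT listing into {bss_id: (status, queue_or_None)}."""
    lines = [line.strip() for line in reply.splitlines() if line.strip()]
    if not lines or lines[0] != "QSTAT":
        raise ValueError("status listing must start with a QSTAT line")
    jobs = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < 2 or fields[1] not in VALID_STATES:
            raise ValueError(f"malformed status line: {line!r}")
        # The queue name is optional.
        queue = fields[2] if len(fields) > 2 else None
        jobs[fields[0]] = (fields[1], queue)
    return jobs


jobs = parse_status_listing("QSTAT\n1234 RUNNING batch\n1235 QUEUED\n")
```

Lines for jobs that were not submitted through UNICORE parse the same way and can be ignored by the caller.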


File ACL operations (#TSI_FILE_ACL)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The +process_acl+ function sets or gets the access control list on a given
file or directory. Please refer to the file +ACL.pm+ to learn about this part of the 
API.

Resource reservation 
~~~~~~~~~~~~~~~~~~~~

The TSI offers functionality to create and manage reservations.
For full information, please refer to the file +ResourceReservation.pm+.


Creating a reservation (#TSI_MAKE_RESERVATION)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is used to create a reservation.

Input
+++++

 * +#TSI_RESERVATION_OWNER <xlogin>+ The user ID (xlogin) of the reservation owner
 * +#TSI_STARTTIME <time>+ The requested start time in ISO 8601 format (yyyy-MM-dd'T'HH:mm:ssZ)
 * The requested resources are passed in the same way as for job submission
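
A start time in the pattern given above could be produced as in the following Python sketch. It assumes that the trailing +Z+ of the pattern stands for a numeric UTC offset such as +0100 (as in Java date patterns); the function name +format_start_time+ is hypothetical.

```python
from datetime import datetime, timezone


def format_start_time(moment):
    """Format an aware datetime as yyyy-MM-dd'T'HH:mm:ssZ (numeric UTC offset)."""
    if moment.tzinfo is None:
        raise ValueError("start time needs a time zone")
    return moment.strftime("%Y-%m-%dT%H:%M:%S%z")


start = format_start_time(datetime(2030, 1, 15, 8, 30, 0, tzinfo=timezone.utc))
```

Rejecting naive datetimes avoids sending a start time whose offset the receiving side would have to guess.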

Output
++++++

 * Normal: The command replies with a single reservation ID string.

 * Error: +failed_report()+ called with the reason for failure


Querying a reservation (#TSI_QUERY_RESERVATION)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is used to query a reservation.

Input
+++++

 * +#TSI_RESERVATION_REFERENCE <reservation_ID>+ The reservation reference

Output
++++++

 * Normal: The command produces two lines. The first line contains the 
   status (UNKNOWN, INVALID, WAITING, READY, ACTIVE, FINISHED or OTHER) and 
   an optional start time (ISO 8601). The second line contains a human-readable
   description.

 * Error: +failed_report()+ called with the reason for failure


Cancelling a reservation (#TSI_CANCEL_RESERVATION)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is used to cancel a reservation.

Input
+++++

 * +#TSI_RESERVATION_REFERENCE <reservation_ID>+ The reservation reference

Output
++++++

 * Normal: no output

 * Error: +failed_report()+ called with the reason for failure
