[[ucc_datamanagement]]
Data management functions
-------------------------

UCC offers access to all the data management functions in UNICORE.
You can upload or download data from a remote server, initiate
a server-to-server transfer, create directories and so on.


Specifying remote locations
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Remote locations can be specified in two ways. The first way is to 
use a URI that includes protocol, storage server and filename, for example

-------
BFT:https://mygateway:8080/SITE/services/StorageManagement?res=default_storage#/file
-------

which specifies a file named "/file" on the storage instance 
"https://mygateway:8080/SITE/services/StorageManagement?res=default_storage", 
using the BFT protocol. 

[NOTE]
==================
Paths are relative to the storage root, not the root of
the actual file system.
==================
    
This explicit format is sometimes inconvenient, so you can use a shorter, more intuitive
format. This is also a URI, but you need to know only the name of the virtual site (target system), 
and the storage or job id. For example

-------
unicore6://SITE/Home/file?protocol=PROTOCOL
-------

or shorter

-------
u6://SITE/Home/file?protocol=PROTOCOL
-------


This will resolve the current user's "Home" storage at the target system named "SITE".
If you do not specify a protocol, BFT is used by default.

You can also refer to a job Uspace (the job's working directory) on a given site. For this,
you will need the unique ID of that job, which you can get for example using the 'list-jobs'
command. For example,

-------
u6://SITE/1f3bc2e2-d814-406e-811d-e533f8f7a93b/outfile
-------

refers to the file "outfile" in the working directory of the given job on the "SITE" target 
system.
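
The short locations shown so far all follow the same pattern: site name, then storage name or job ID, then the path, with an optional protocol. As an illustration only, the following shell function assembles such a location from its parts. The name 'u6_uri' is hypothetical and not part of UCC.

```shell
# Hypothetical helper, not part of UCC: build a short "u6://" location.
# $1 site name, $2 storage name or job ID, $3 path, $4 optional protocol
u6_uri() {
    # Strip a leading slash from the path so the result has no "//".
    _uri="u6://$1/$2/${3#/}"
    # Append the protocol only when one is given; otherwise UCC uses BFT.
    if [ -n "${4:-}" ]; then
        _uri="$_uri?protocol=$4"
    fi
    printf '%s\n' "$_uri"
}
```

For instance, `u6_uri SITE Home file UFTP` prints a location of the form shown above with "?protocol=UFTP" appended.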

It is also possible to refer to storage services that are registered in the registry using their name,
for example

-------
u6://SHARE/myfiles/a_file
-------

can be used to refer to the shared storage named "SHARE" if it is registered in the registry.

Though convenient, the "unicore6://" form is much slower than an explicit URI and
generates additional network traffic, since UCC must first look up the real address.
If you do a lot of operations on the same resource, you should use
the 'resolve' command to find out the URI of the resource, and use that later.


==== The resolve command
  
This command resolves a "unicore6://" (or "u6://") URL, as described above, to its "real" address.

------------
ucc resolve u6://SHARE/
------------
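
When many operations target the same storage, the lookup can be done once and the result reused. The sketch below rests on an assumption this section does not confirm: that 'ucc resolve' prints only the resolved storage address on standard output. The function names 'resolve_storage' and 'explicit_uri' are hypothetical. The explicit location is built according to the "PROTOCOL:storage-address#/path" pattern described at the start of this section.

```shell
# Hypothetical helpers, not part of UCC.

# Resolve a short "u6://" location once.
# Assumption: 'ucc resolve' writes only the resolved address to stdout.
resolve_storage() {
    ucc resolve "$1"
}

# Build an explicit location of the form PROTOCOL:STORAGE_ADDRESS#/PATH.
# $1 protocol (e.g. BFT), $2 resolved storage address,
# $3 path relative to the storage root
explicit_uri() {
    printf '%s:%s#/%s\n' "$1" "$2" "${3#/}"
}

# Possible use (not executed here):
#   storage=$(resolve_storage u6://SHARE/)
#   ucc get-file -s "$(explicit_uri BFT "$storage" myfiles/a_file)" -t a_file
```

Storing the address in a variable means the lookup happens once per script run rather than once per command.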

  
Data movement
~~~~~~~~~~~~~

==== get-file

Use 'get-file' to download remote files to your local machine.
  
Example
  
-------
ucc get-file -s u6://DEMO-SITE/Home/test.txt -t my_test.txt
-------

The "-s" (source) and "-t" (target) options denote
the source file(s) and the target file or directory. The wildcard characters '*' and '?'
are supported. For example,

-------
ucc get-file -s u6://DEMO-SITE/Home/*.pdf -t pdfs/
-------

will download all *.pdf files and write them to the "pdfs" directory (which must exist).
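
Because the target directory must already exist, a script may want to create it before calling 'get-file'. The following is a minimal sketch of such a wrapper. The function name 'fetch_matching' is hypothetical, and quoting the pattern is a precaution against the local shell expanding the wildcard before UCC sees it.

```shell
# Hypothetical wrapper, not part of UCC.
# $1 remote pattern (may contain * or ?), $2 local target directory
fetch_matching() {
    # The target directory must exist before the download starts.
    mkdir -p "$2" || return 1
    # Quoting "$1" keeps the local shell from expanding the wildcard,
    # so the pattern reaches UCC unchanged.
    ucc get-file -s "$1" -t "$2/"
}

# Possible use (not executed here):
#   fetch_matching 'u6://DEMO-SITE/Home/*.pdf' pdfs
```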


==== put-file

Use 'put-file' to upload a local file to a remote location.

Example:
  
-------
ucc put-file -s test.txt -t u6://DEMO-SITE/Home/test.txt 
-------

If you specify the "-a" option, data will be appended to an existing 
file.

==== copy-file

This will initiate a server-to-server data transfer. Use the "-a" option to run
asynchronously, i.e. ucc will not wait for the transfer to complete. Instead, a 
file containing the transfer reference will be written, which can be passed to the 
'copy-file-status' command for status checking later.
  
In case the source and target file are on the same storage resource, UCC will issue 
the remote copy command and return immediately, as there is no need for an 
asynchronous mode.

Example:
  
------
ucc copy-file -s u6://OTHER-SITE/Home/test.txt -t u6://DEMO-SITE/Home/test.txt 
------

  
Sometimes a user wishes to schedule the time when a server-to-server transfer
is executed, for example because she knows that more network bandwidth will be
available at that time.

[NOTE]
==================
This feature only works with server release 6.4.0 or higher.
==================

To schedule the file transfer, you can use the "-S" option to the ucc "copy-file"
command:

----------------
ucc copy-file -S "12:30" ...
----------------

The format is simply "HH:mm" (hours and minutes). Alternatively you can give 
the time in the full ISO 8601 format including year, date, time 
and time zone:

--------------
ucc copy-file -S "2011-12-24T12:30:00+0200" ...
--------------
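
The full timestamp can be generated rather than typed by hand. The sketch below assumes a 'date' implementation that accepts the "-d" option with a "YYYY-MM-DD HH:MM" argument (GNU date does; others may not). The "%z" format yields a numeric offset such as "+0200", which matches the example above. The function names 'schedule_stamp' and 'copy_file_at' are hypothetical.

```shell
# Hypothetical helpers, not part of UCC.
# Assumption: 'date' supports -d with a "YYYY-MM-DD HH:MM" argument (GNU date).

# Print a local date and time in the form YYYY-MM-DDTHH:MM:SS+ZZZZ.
# $1 date and time, e.g. "2011-12-24 12:30"
schedule_stamp() {
    date -d "$1" '+%Y-%m-%dT%H:%M:%S%z'
}

# Schedule a server-to-server copy for the given local time.
# $1 date and time, $2 source location, $3 target location
copy_file_at() {
    _stamp=$(schedule_stamp "$1") || return 1
    ucc copy-file -S "$_stamp" -s "$2" -t "$3"
}
```

The time zone offset is taken from the local environment, so the scheduled time is interpreted as local time on the machine running UCC.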

==== copy-file-status

This will print the status of the given data transfer. As its argument, it expects either the name
of a file containing the transfer reference, or the reference itself.
 
Example (for Unix) which captures the reference into a shell variable:
  
-------
export ID=$(ucc copy-file -a -s u6://OTHER-SITE/Home/test.txt -t u6://DEMO-SITE/Home/test.txt)
ucc copy-file-status $ID
-------

==== Specifying the file transfer protocol

To use a protocol other than the default BFT, you can use the "-P" option to specify a list
of preferred protocols. UCC will try to match them with the capabilities of the storage and use
the first match. Your preferred protocols can also be listed in your preferences file using the
"protocols" key:

----------
protocols=UFTP BFT
----------

[NOTE]
================
If necessary, you can specify additional file transfer options in your preferences file as well.
For example, to use the UFTP protocol you may want to specify the client host address
and the number of parallel streams explicitly:

-----
uftp.client.host=your_client_hostname
uftp.streams=2
#encrypt data (at the cost of performance)
uftp.encryption=true
-----

You can even override the UFTP server host, which can be useful in case the UFTP server is accessible
via multiple network interfaces:

----
uftp.server.host=myhost.com
----

UCC will try to use reasonable defaults for any missing parameters.
================

Handling directories
~~~~~~~~~~~~~~~~~~~~

==== mkdir
  
This will create a directory (including required parent directories) remotely.
  
Example
  
-----
ucc mkdir u6://DEMO-SITE/Home/testdirectory/data/pdfs
-----

==== rm
  
This will remove a file or directory remotely. By default, UCC will ask for a confirmation. 
Use the "--quiet" or "-q" option to disable this confirmation (e.g. when using 
this command in scripts).
  
Example
  
------
ucc rm u6://DEMO-SITE/Home/testdirectory/data/pdfs
------
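
In scripts, 'mkdir', 'put-file' and 'rm' are often combined. The following sketch creates a remote directory, uploads a set of local files into it, and offers a non-interactive cleanup. The function names are hypothetical. How 'mkdir' behaves when the directory already exists is not covered here, so the sketch simply stops if any UCC call reports failure.

```shell
# Hypothetical wrappers, not part of UCC.

# Create a remote directory and upload local files into it.
# $1 remote directory in "u6://" form, remaining arguments: local files
upload_files() {
    _dir=${1%/}
    shift
    # Create the remote directory (parents included) before uploading.
    ucc mkdir "$_dir" || return 1
    for _f in "$@"; do
        # Keep the local file name on the remote side.
        ucc put-file -s "$_f" -t "$_dir/$(basename "$_f")" || return 1
    done
}

# Remove a remote file or directory without the confirmation prompt.
remove_quietly() {
    ucc rm -q "$1"
}
```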

Finding data
~~~~~~~~~~~~

==== ls

This will list a remote directory. Useful options are: "-l" (detailed output), "-H" (human-friendly)
and "-R" (recurse). Example:

-----
ucc ls u6://DEMO-SITE/Home -l -H
-----

If the storage supports metadata, you can get the metadata of a single file using "ls -l -m":

-----
ucc ls u6://DEMO-SITE/Home/.bashrc -l -m
-----


==== find

This command is similar to the well-known Unix utility, but much less powerful.
It lets you do recursive listings and retrieve files matching certain conditions.
Currently only "name match" is available. For example, to get all PDF files on a storage:

-----
ucc find -r -l u6://DEMO-SITE/Home/ -N .pdf
-----

[NOTE]
============
The 'find' command is currently implemented synchronously, and may thus run into a network timeout
when it takes too long. This limitation will be overcome in future versions of this command.
============

