#
# Apache Hadoop integration for UNICORE 6
#


Overview
********
Apache Hadoop (http://hadoop.apache.org) is a software platform for storing and
processing vast amounts of data on clusters of commodity servers. 

The uas-hadoop module provides access to Apache Hadoop filesystems (HDFS)
through UNICORE. This makes it possible to build large, reliable and
scalable data storage systems and use them from UNICORE.

This document assumes that you already have an Apache Hadoop installation that you
want to access. For information on how to set up and configure Hadoop, please 
consult the web page cited above.

Installation
************

Place the hadoop-core and uas-hadoop jar files in your UNICORE/X lib/ folder.


Configuration
*************
There are two scenarios for using Hadoop storages.

1) as a shared "default_storage" that is published in the registry. This is very useful
if you are running the workflow system and need a large shared file storage.
2) as an additional filesystem attached to target systems.

In both scenarios you need to configure the locations of the usual Hadoop config files.
This is done with the following properties in uas.config:
  
  unicore.hadoop.core=conf/core-site.xml
  unicore.hadoop.hdfs=conf/hdfs-site.xml
  unicore.hadoop.mapred=conf/mapred.xml

Scenario 1:
 - open unicorex/conf/uas.config and edit the settings controlling the "default_storage":
     
    defaultsms.class=de.fzj.unicore.uas.hadoop.SMSHadoopImpl
    
    #root path on the HDFS
    defaultsms.workdir=/unicore-data
    
    #optional: choose a nice name
    defaultsms.name=HADOOP
    
 The workdir setting denotes the path on the HDFS file system that is used as the "root
 path" for the UNICORE data.
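
 Putting the pieces together, the Hadoop-related part of uas.config for this
 scenario might look like the following sketch. It only combines the properties
 shown above. The file locations and the /unicore-data path are the example
 values used in this document, not required values, so adapt them to your
 installation.

```properties
# Locations of the Hadoop config files (example paths, adapt as needed)
unicore.hadoop.core=conf/core-site.xml
unicore.hadoop.hdfs=conf/hdfs-site.xml
unicore.hadoop.mapred=conf/mapred.xml

# Use the Hadoop storage implementation for the shared default storage
defaultsms.class=de.fzj.unicore.uas.hadoop.SMSHadoopImpl

# Root path on the HDFS (example value)
defaultsms.workdir=/unicore-data

# Optional display name
defaultsms.name=HADOOP
```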
 
Scenario 2:
 - adding filesystems is generally described in the online manual at
 http://www.unicore.eu/documentation/manuals/unicore6/unicorex/configuration.html
 - to use Hadoop, just set the storage type to "CUSTOM" and specify the correct
 class and root path as in scenario 1. For example:
 
   uas.targetsystem.storage.1=Work
   uas.targetsystem.storage.1.type=CUSTOM
   uas.targetsystem.storage.1.class=de.fzj.unicore.uas.hadoop.SMSHadoopImpl
   uas.targetsystem.storage.1.path=/unicore-data


Acknowledgements
****************
This software is based on a prototype developed by Wasim Bari during his master's thesis
at the Distributed Systems and Grid Computing group of the Jülich Supercomputing Centre
(http://www.fz-juelich.de/jsc/vsgc).
