ENCP release notes, from v3_6 to v3_7
Encp changes:
=============
Solaris 10 bug fix for encp to not abort when reading from tape onto the
/tmp swap partition.
New encp switches:
--copies <COPIES> Write N copies of the file.
--copy <COPY> Read copy N of the file. (0 = original)
These switches are to support encp writing multiple copies of the same file
to different physical locations. (Originally in encp v3_6h.)
The false warnings that look like:
Got error while trying to obtain configuration: ('KEYERROR', "Configuration Server: no such name: 'pnfs_agent'")
are gone since v3_6f.
Fixed a bug (originally in v3_6g) to prevent encp from clearing a layer
with a single space character while writing from dCache. This space
character would cause retries to fail with layers 1 and 4 not being empty.
Encp will now ignore security port scans. (This was originally in encp
v3_6c. v3_6c remains the minimum allowed version of encp to work at FNAL
as of 12-10-2007.)
A workaround for the "invalid directory entry" file problem was implemented.
The root problem is a PNFS bug that creates directory entries pointing
to non-existent i-nodes. The workaround is to clean up the actual filename
of the file, and leave the encp temporary filename in a broken state.
This allows for users to retry writing files without administrator
intervention. (Originally in encp v3_6b.)
Misc.:
======
Detailed cvs commit logs
========== show_volume_cgi.py ====================================================================================
WARNING I changed "cgi-bin/enstore" to "cgi-bin"
========== file_clerk.py ====================================================================================
mylint doesn't like +=
reconnect only on connection error, not all pg.Error
more exception handling with reconnection to the database
bug fix on missing ', again
bug fix on missing '
log multiple copies intents
These changes allow --erase to be usable for the file clerk client. Support for hidden options now exists in option.py. --erase for the "enstore file" command is now a hidden option. This means that it is not shown in the --help or --usage output and is available only to administrators. Currently, the code has the constant ALLOW_ERASE set to false; to enable this command, this constant only needs to be set to true.
A minor change about how the length of the bfid brand is used for comparison checks.
Fixed a log message to be more clear to the reader about the registering of copy bfids.
add command interface to find_copies, find_all_copies, find_original, find_the_original, and find_duplicates
add find_all_copies(), find_original(), find_the_original() and find_duplicates()
bug fix
This comment is for the previous commit, which was done with the wrong -m: [1] bug fix in set_deleted() [2] checks size, crc, and sanity_cookie when a new_bit_file is requested for a copy [3] returns F-ERROR in case of error
file_clerk.py
add multiple copies capability
========== pnfs_agent.py ====================================================================================
Added get_xreference() and get_file_size().
========== alarm.py ====================================================================================
remove rexec dependency
remove r_a before comparing
move r_a to enstore_constants
revert to old version until can debug
need to remove RA when read in from alarm file too
need to remove RA when read in from alarm file too
do not output r_a info on alarm page
ignore r_a in the alarm info
========== inquisitor_plots.py ====================================================================================
fix typo caught by bless.py
make sure we are not using any external argument to get points directory and points nodes (so it is SDE compliant)
extract username from accounting server and pass it to modified accounting_query
use db port if specified in config file
fix call to accounting_query (need to specify port number)
a temporary solution to show GCC mount latencies
========== enstore_files.py ====================================================================================
create outage file if does not exist
add filenames to encp history page error lines
========== accounting_query.py ====================================================================================
removed unnecessary subclassing of accounting.accDB
fix call to accounting_query (need to specify port number)
========== operation.py ====================================================================================
d0ensrv4 --> d0ensrv4n
add configuration for database server
operation db has moved to stkensrv0
bug fix
bug fix
add library argument for get_last_write_Protect_*_job_time()
fix quoting of remedy ticket category
change remedy ticket PTI
support CD-LTO3, D0-LTO3 in addition to CDF-LTO3
bug fix for wrong protect/permit directory
help for recent
[1] alternate doors for ADIC [2] use standard paths
bug fix
complete auto_close_all() for adic and sl8500
[1] add sl8500 [2] clean up library types
[1] add aml2 library type [2] add list recent [n] [3] many improvements
switch from media_type to library for tab-flipping selection
use pwd.getpwuid() to figure out user rather than os.getlogin()
sending email when automatically closing jobs
add more help for auto_close_all
Implement the mechanism to close finished open jobs. The command is "auto_close_all"
[1] allow "no_limit" in auto_write_protect_on/auto_write_protect_off [2] allow "limit " in recommend_write_protect_on/recommend_write_protect_off [3] more help information
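The switch from os.getlogin() to pwd.getpwuid() noted in this log can be sketched as below. One common reason for such a change is that os.getlogin() depends on a controlling terminal and can fail in batch or cron jobs, although the log does not say that this was the motivation. The helper name current_username is made up for this illustration and is not taken from operation.py.

```python
import os
import pwd


def current_username():
    """Return the login name for the real uid of this process."""
    uid = os.getuid()
    try:
        # Look the uid up in the password database; no terminal is needed.
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # The uid has no passwd entry; fall back to the number itself.
        return str(uid)


print(current_username())
```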
========== enstore_functions.py ====================================================================================
Replaced calls to option.default_port() and default_host() with those from enstore_functions2.
if ENSTORE_HOME is defined set tmp dir relative to it
========== generic_client.py ====================================================================================
Support the new --print switch for the configuration client.
Catch sys.stderr.write() errors.
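"Catch sys.stderr.write() errors" recurs for many modules in this log. A minimal sketch of the idea follows; the wrapper name safe_stderr_write and the choice to return a boolean are assumptions made here, not the actual Enstore code.

```python
import sys


def safe_stderr_write(message):
    """Write to stderr; return False instead of raising if the write fails."""
    try:
        sys.stderr.write(message)
        sys.stderr.flush()
        return True
    except (IOError, OSError, ValueError):
        # IOError/OSError: broken pipe, full disk and similar.
        # ValueError: the stream object has already been closed.
        return False
```

A caller can then report problems without risking a second traceback when stderr itself is unusable.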
========== configuration_client.py ====================================================================================
Support the new --print switch for the configuration client.
removed extra print that broke tools/service_ips
Added support for --file-fallback in the configuration client. This switch alters the use of --show to read from the local copy of the configuration file (if present), when the configuration server is down.
more logic added to get_config_dict
added get_config_dict to get config dict either from config server or from config file in case the server does not respond
Fixed the library header to be "library manager" instead of "media changer" for --list-library-managers.
Added get_movers2().
Added get_library_managers2() and get_media_changers2().
add configdict_from_file()
Added --list-movers, --list-library_managers, --list-media-changers to the "enstore config" command.
Comment out code that was useful for some debugging.
Added code to aid in debugging the encp "pnfs_agent" false errors.
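The --file-fallback and get_config_dict entries in this log describe reading the local configuration file when the configuration server does not respond. The sketch below shows that pattern only. The function name config_with_fallback and both callables are hypothetical stand-ins, and the real client may detect a dead server differently.

```python
def config_with_fallback(ask_server, read_local_file):
    """Return (config_dict, source), preferring the server over the file.

    Both arguments are hypothetical zero-argument callables that return a
    dict; they stand in for the real server query and file parser.
    """
    try:
        return ask_server(), "server"
    except OSError:
        # Covers socket errors and timeouts on Python 3.
        return read_local_file(), "file"


def server_down():
    raise OSError("no response from configuration server")


config, source = config_with_fallback(server_down, lambda: {"known": 1})
print(source)
```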
========== enstore_status.py ====================================================================================
remove rexec dependency
========== volume_assert.py ====================================================================================
Added more logging. Most error messages only went to stderr.
Catch sys.stderr.write() errors.
========== inquisitor_client.py ====================================================================================
--is-up now looks into both outage and offline
add reason for override command line
add reason for override
add --is-up
Catch sys.stderr.write() errors.
========== mover-nanny.py ====================================================================================
1. do not restart or reboot movers by default. 2. do not check null movers by default.
remove rexec dependency
========== histogram.py ====================================================================================
created plotter framework, a module used by framework, example main module and example ratekeeper plotter (to replace makeplot)
re-arrange classes for better maintainability
fixed bug in fill of zero bin
added ability to change x label on time axis
added ability to change time axis format
fix bug caused by mismatch in time formats
fix bin overflow problem
fixed integral finally
fixed (I think) a bug with plotting integrated data vs time
========== e_errors.py ====================================================================================
Sometimes a Memory Error occurs when reading data from the network (memory leak). This was incorrectly interpreted as ENCP_GONE. Now, if this occurs, the mover will dismount the tape and restart itself.
Added a "NOT SUPPORTED" Enstore error.
========== manage_queue.py ====================================================================================
added trace
bug fixed
Add file family counters for write requests. If the file family counter is bigger than the file family width, let an idle mover pick up the request rather than wait for the mover with the bound volume.
fixed updating priority
request counter fixed
request counter fixed
trace added
trace added
restrict the number of requests in the queue
make priority to grow by 1 every 1/2 hour by default
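The default aging rule in the last entry (priority grows by 1 every half hour) works out as in the sketch below. The function and parameter names are invented here, and the log does not say whether the real code grows in whole steps or continuously; whole steps are assumed.

```python
def aged_priority(base_priority, seconds_waiting, step=1, interval=1800):
    """Priority after waiting: `step` is added per full `interval` seconds."""
    return base_priority + step * int(seconds_waiting // interval)


# A request queued at priority 5 that has waited two hours.
print(aged_priority(5, 2 * 3600))
```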
========== migrate.py ====================================================================================
bug fix for new migration file family rule
relax the constraint that '-MIGRATION' has to be the suffix
backoff changes
library_manager.py
========== host_config.py ====================================================================================
Modified the order in which enstore/encp will look for ENSTORE_CONFIG_HOST/PORT/FILE. With this change it will: 1) Check for the environmental variable.* 2) Check for ~/.enstorerc 3) Check for /etc/enstore.conf 4) Check for /pnfs/.../.(config)(enstore)/(enstorerc) 5) Use the default enstore constant.* (* means the step is the same as before.)
Catch sys.stderr.write() errors.
Have unset_route() and update_route() return if a type error occurs. This now works like set_route() does. This avoids a traceback.
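The lookup order for ENSTORE_CONFIG_HOST/PORT/FILE listed in this log can be sketched as a first-match search. Everything named here is an assumption for illustration: the helper names, the KEY=value file layout, and the way rc file paths are passed in. The pnfs location is left to the caller because the log elides part of that path.

```python
import os


def read_rc_value(path, key):
    """Return the value for `key` from a file of KEY=value lines, or None.

    The KEY=value layout is an assumption made for this sketch.
    """
    try:
        with open(path) as rc_file:
            for line in rc_file:
                name, sep, value = line.strip().partition("=")
                if sep and name.strip() == key:
                    return value.strip()
    except (IOError, OSError):
        pass  # missing or unreadable file: move on to the next source
    return None


def lookup_setting(key, rc_paths, default, environ=None):
    """Check the environment, then each rc file in order, then the default."""
    environ = os.environ if environ is None else environ
    if environ.get(key):
        return environ[key]
    for path in rc_paths:
        value = read_rc_value(path, key)
        if value is not None:
            return value
    return default
```

A caller would pass the rc files in the documented order, for example `[os.path.expanduser("~/.enstorerc"), "/etc/enstore.conf"]` followed by the pnfs location.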
========== enstore_overall_status.py ====================================================================================
Added --html-dir and --config-hosts switches.
Fixed this to obtain the known_config_servers section of the default configuration, then loop over it to find the configuration server list and obtain each system's status. Then make the web page using this information (this part didn't change).
temporary fix for d0en, remove hardcoded values
put into production local changes
change red ball to question mark for communication timeout
========== interface.py ====================================================================================
Catch sys.stderr.write() errors.
added mover-dump to dump internal variables of mover
========== mover_client.py ====================================================================================
Added a --list switch to the mover, library manager and media_changer clients. This will list the currently configured servers, respectively.
added mover-dump to dump internal variables of mover
========== inquisitor.py ====================================================================================
if mover in wam_queue is None do not process it
add reason for override
========== monitor_client.py ====================================================================================
Catch sys.stderr.write() errors.
========== alarm_client.py ====================================================================================
Updated the severity help string to include 'C' for emailable errors.
========== mounts_plot.py ====================================================================================
introduce drive utilization plot
re-implemented mounts plot so it shows all libraries (non null media type)
adopted mounts_plot to work with database
========== drivestat_server.py ====================================================================================
bug fix
add 30 second delay and retry for the first db connection failure
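The "30 second delay and retry" entry describes a one-shot retry. A minimal sketch follows, assuming a hypothetical zero-argument connect callable. The broad `except Exception` is a simplification; the real server presumably catches its database driver's specific error.

```python
import time


def connect_with_retry(connect, delay=30, sleep=time.sleep):
    """Call connect(); on the first failure wait `delay` seconds, retry once."""
    try:
        return connect()
    except Exception:
        sleep(delay)
        # A second failure propagates to the caller.
        return connect()
```

Passing `sleep` as a parameter keeps the function testable without an actual 30 second wait.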
========== enstore_constants.py ====================================================================================
created plotter framework, a module used by framework, example main module and example ratekeeper plotter (to replace makeplot)
Modified the order in which enstore/encp will look for ENSTORE_CONFIG_HOST/PORT/FILE. With this change it will: 1) Check for the environmental variable.* 2) Check for ~/.enstorerc 3) Check for /etc/enstore.conf 4) Check for /pnfs/.../.(config)(enstore)/(enstorerc) 5) Use the default enstore constant.* (* means the step is the same as before.)
move r_a to enstore_constants
Added two constants for the mover and entv: MIN_TRANSFER_TIME and MAX_TRANSFER_TIME.
========== scanfiles.py ====================================================================================
handle duplicate location errors better
Remove skipping of "volmap", ".A" and ".B" files or directories.
Add an age specifier to duplicate location. Also, fixed a missing layer 1 error.
Address duplicate location errors with one of the files marked deleted.
If a location is duplicated and the tape is in a shelf library, then the information is not flagged as an error.
If one of the files flagged for a "duplicate location" is deleted, consider it a warning, not an error.
Handle the case where a reverse scan finds a file without layer 1, but does have layer 4.
Fixed a problem with handling some orphan files. Also, ignored some false errors if a user creates multiple hardlinks to a file, then deletes the original.
Scan directories that begin with ".removed". Don't continue to ignore them.
After all the recent speed improvements, handling multiple copy files became broken. This is now fixed.
Fixed a stupid bug introduced in the last commit.
Fixed a bug where missing file DB info was not getting checked when scanning an entire tape. Also, tweaked get_layer() to go faster.
Made some changes for speeding up the scan. First, made changes for scans of dzero to handle files originally written as /pnfs/sam/lto to be scanned as /pnfs/fs/usr/sam-lto without doing a full name lookup. Also, fixed process_mtab() to stop doing the same os.path.join() twice.
Tweak a parameter to avoid some unnecessary path lookups. It was 2, now is 5. This corresponds to the number of path names to look at when trying to find out if they are not in the current pnfs path. For example, a file originally written as /pnfs/sam/dzero is under /pnfs/fs/usr/dzero while being scanned. This parameter skips N directories in hopes of skipping over /sam/ and matching to dzero. Other changes made to the way scanfiles.py works make 2 unrealistic and thus the change to 5.
Added retries in get_layer(). Commented out code for using alarm() in get_stat() (signals in multithreaded python programs don't work correctly yet). Warnings about no alarm or log server found are suppressed (a bug existed from a previous attempt). Lastly, using new orphan detecting code in pnfs, we can better/correctly report orphan files.
Bug fix to both parse_mtab() functions. They assumed the mtab file is always named /etc/mtab, but that may not be true. Solaris, for example, calls theirs /etc/mnttab.
Added support for the scanfiles.py to be usable via "enstore scanfiles".
Have scanfiles.py use the new pnfs.py that has get_path() returning a list of possible matches.
Modified the error message reported when there is trouble retrieving volume information.
Handle multiple copy files correctly.
Put try ... except around os.stat() call.
Fix get_layer_1() if there is no layer 1 information. Also, when looking for the file by name, give the "no pnfsid in db" error if necessary.
Added --external-transistion. Also, fixed some other reverse scan issues.
Fixed some minor bugs. get_layer_1() now returns a string instead of a one element long list. check_bit_file() will now search all pnfs DB before declaring it couldn't find a file (when it finds a wrong file that just happens to have the same pnfs id).
Added recent fixes for get_layer_1() and get_layer_4() by abstracting the common parts out into get_layer(). This made doing the same thing for layer 2 (and even get_database()) much easier. Also, the scan should now stop with the annoying messages about the log_server and alarm_server entries not being found in the configuration.
Handle the possibility that what once was a file path is now a directory path for reverse scans. Also for reverse scans, before doing a full PNFS reverse lookup, first reset the effective uid and gid back to root (0) if the real id is root (0). Leaving the effective uid as is can lead to permission errors being reported.
Address issues regarding the /pnfs/fnal.gov/usr/ type of paths. These were causing full path lookups, which really slowed things down. Also, address issues with reading ".(get)(database)" and layer 4 information when the effective UID does not have sufficient permissions. Now it retries as root (if originally started as user root).
Fixed some /pnfs/xyz to /pnfs/fs/usr/xyz translation code. This will hopefully avoid get_path() calls.
Modified the scan to be able to handle pnfs mount points from multiple pnfs servers. It should now be patient and find the correct mount point, then add it to the cache to speed up the remaining file scans.
Don't clobber older mount points in the internal list just because another mount point on another pnfs server has the same database id.
If a permission error is found while searching for the file (because the scan is not running as root), give the permission error. Also, fixed an unrelated bug when running the scan as root. The code that detects if it needs to munge the original pnfs path to reflect the current mount point was defective. Now it does the correct thing and works much faster, since it avoids full reverse name lookups.
Fix the setting of last_db_tried. It should prevent unnecessary lookups.
Fixed some more typos.
Changed basname() to basename(). Added the seteuid() calls to forward scans. Catch sys.stderr.write() errors.
When running as root, set the uid and gid to the owner of the current file.
Use pnfs.strip_pnfs_mountpoint() to avoid unnecessary get_path() calls.
If just the file family is mismatching between layer 4 and the current file family (from the volume family) don't give an error but info instead. In the cases where the volume's ff was changed this conflict would be expected.
Don't use csc.get_library_managers(). There was a problem if there were not fully configured library_managers. This in turn caused a problem with running scans against offline Enstore database.
Fixed the case where one file is hard linked from two (or more) directories. The directory that parent_id pointed to would have scanned as okay, but the "copies" would not. This change does an additional stat() on the alternate filepath and compares the stats. If they match there is no error.
Fixed a bug caught by pychecker/mylint.
Allow scanfiles.py to take as input its own output from a previous execution. This looks for the string " ... " as the delimiter.
Modified the scan to identify more cases of orphaned files correctly.
Two things. First, a bug was fixed in normalizing the current path. If the traditional enstore path was found, but the file was originally written through /pnfs/fs/usr, the code would try to do an expensive get_path() call. Second, pnfs.Pnfs.__init__() calls get_path() and in addition scanfiles calls get_path(). We only need one. Avoid the __init__() one by rewriting this line of code to only call get_path() once.
Fixed a bug for reverse scans. If a mount point is found that knows about the real pnfs database, but a mountpoint for that database was not found we just use the ".(access)()" name.
Greatly refactored how check_bit_file() works. It was just too slow. It is now a lot faster.
Bug fix if scanning on a node that doesn't have /pnfs/fs mounted.
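The hard link entry in this scanfiles.py log compares stat() results for two paths. The log does not say which fields are compared; the sketch below assumes the usual test for "same underlying file", device number plus inode number. The function name same_file is illustrative only.

```python
import os


def same_file(path_a, path_b):
    """True when both paths resolve to the same inode on the same device."""
    stat_a = os.stat(path_a)
    stat_b = os.stat(path_b)
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)
```

The standard library's os.path.samefile() performs an equivalent comparison.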
========== ensync.py ====================================================================================
Catch sys.stderr.write() errors.
========== log_server.py ====================================================================================
if log path does not exist try to create it
Catch sys.stderr.write() errors.
========== generic_server.py ====================================================================================
When a server receives a NEWCONFIGFILE message from the event relay, it should take actions to reinitialize itself. These changes make that possible. The test servers are the ratekeeper and the info server.
Moved automatic updating of the list of valid ips to handle_er_msg() from DispatchingWorker.process_request(). This saves having to look up if the list is accurate every time and resolves membership issues warned by pychecker.
========== pnfs.py ====================================================================================
After writing layer 1 or layer 4, try reading it back right away to make sure that pnfs processed the write() correctly.
Fixed a bug that prevented Pnfs.__init__() from correctly handling pnfsids as the pnfsFilename parameter. A change that get_path() now returns a list was not corrected previously.
A call to os.geteuid was missing its parentheses. Thus the comparison between the function os.geteuid and the integer zero always failed to match. Now the returned value from os.geteuid() is used to test if it equals zero or not.
Fixes to better detect and report orphaned files.
Forgot a break statement in the previous commit.
Bug fix to both parse_mtab() functions. They assumed the mtab file is always named /etc/mtab, but that may not be true. Solaris, for example, calls theirs /etc/mnttab.
Fixed an issue with get_path() handling /pnfs/fs being the only mount point that knows about our pnfsid.
It is sys.exc_info() not just sys.exc_info for getting traceback info.
Better dealing with metadata when a single machine mounts multiple pnfs servers. This applies both to encp when reading and to "enstore pnfs --path".
Slight change to previous commit to include one more weird combination while finding the correct mount point.
Fixed a few things pychecker didn't like with previous commit.
Fixed a number of spots in the code where it tries to find the path of a file knowing the pnfsid. These changes are related to handling of cases where a machine has numerous pnfs mount points to different pnfs servers.
do away with string exceptions
Modification to better weed out the wrong pnfs mountpoints.
Additional fix to previous commit.
Fixed a problem in _get_mount_point2(). os.stat() calls behave differently with respect to nameof vs access file(name)s.
Return the correct path for some cases when searching for the correct pnfs db starting directory.
Fixed get_mount_point2() to use the correct database number when checking if the current pnfs mount point is the one we are looking for.
Have _get_mount_point2 report permission errors correctly. Previously, they were interpreted as "No such file or directory".
Fixed get_pnfs_db_directory() to deal with systems that have pnfs mounted from two different pnfs servers. This allows it to tell the two apart and return the correct response instead of just the first one it finds.
Make --size a USER2 (aka dCache) switch.
Handle errors better in get_bit_file_id(). Give the real filename for ENOENT.
Added function strip_pnfs_mountpoint() that removes the "/pnfs/" or "/pnfs/fs/usr" from a pnfs path. Also, modified _get_parent() to use readline() instead of readlines(); there looks to be some performance gain.
When doing a --nameof for an orphaned file, give an accurate error message.
Fix a bug with recent changes to get_path(). It dealt with finding the path for files in a database not at the top of a mountpoint.
Modified pparent(), pnameof(), et al. to catch ValueErrors. This is what gets raised if the argument passed in is not a pnfsid.
Added support for pnfs commands that take a pnfs id, to work from any directory. It uses the /etc/mtab file to find the correct mount point. Also, gave get_mount_point() the correct semantics. Introduced get_pnfs_db_directory() to support the old get_mount_point() semantics.
Add a comment about the behavior of get_path().
Readability format change. Nothing functional.
Allow N.__init__() to optionally take a directory as an argument. Allow --library to set comma separated lists of libraries. Catch certain errors when exceptions are raised trying to instantiate the Pnfs, N and T classes (i.e. file does not exist).
Modified is_pnfs_path() to deal with ENOENT differently than other errors when checking for the existence of the file.
Added the --mount-point switch. It will return the top directory of the current pnfs db.
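The os.geteuid fix recorded in this pnfs.py log is easy to reproduce. The snippet below shows the general Python pitfall rather than the actual pnfs.py code, and the variable names are made up: comparing the function object to 0 is always false, so the root-only branch could never run.

```python
import os

# Bug: compares the function object itself with 0, which is never equal.
broken_check = (os.geteuid == 0)

# Fix: call the function and compare the returned integer.
fixed_check = (os.geteuid() == 0)

print(broken_check)  # False regardless of who runs it
print(fixed_check)   # True only when the effective uid is root
```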
========== ftt.py ====================================================================================
_ftt was renamed to ftt2. This change reflects the name change.
========== media_changer_client.py ====================================================================================
Fixed aci wrapper to build with swig 1.3
Added --list-clean to the media changer client command line interface. It prints out the cleaning tapes and their remaining cleaning counts.
Added timeout and retry arguments to list_slots(), list_drives() and list_volumes().
Added a --list switch to the mover, library manager and media_changer clients. This will list the currently configured servers, respectively.
Implemented --list-slots for AML2 and STK. It lists the total, free, used and disabled slots/cells for each robot.
Add support for "enstore media --list-volumes".
Added --show-robot, --show-drive, --show-volume and --list drives to the "enstore media" command. Many related changes to support obtaining the underlying information from the robot(s).
Added a --volume command to return the results of a "query volume" request.
========== aml2.py ====================================================================================
Fixed aci wrapper to build with swig 1.3
Uses aci_drivestatus3() instead of aci_drivesstatus2(). Also, replaced whrandom.randint() with random.randint() since whrandom is deprecated.
Include a flag for AML2 to indicate if a drive is empty or not.
Report the correct error when aci_robstat() fails.
Implemented --list-slots for AML2 and STK. It lists the total, free, used and disabled slots/cells for each robot.
Add support for "enstore media --list-volumes".
Added --show-robot, --show-drive, --show-volume and --list drives to the "enstore media" command. Many related changes to support obtaining the underlying information from the robot(s).
========== accounting_client.py ====================================================================================
revert to 1.18
bugs fixed
added some queries for encp transfers and encp errors
========== ftt_driver.py ====================================================================================
some cleanup
========== dcache_monitor.py ====================================================================================
take care of moved files as well
fix indexerror
fix against IndexError
check volatile files for minos 26 hours back
handle zero length files with zero size correctly; handle volatile files
catch exception
filter out files in volatile pool
send only differences to the dcache-admin
changes to allow running this script as root
syntax error in rsh command
add user="enstore" to p.DB call
correct rm option
correct rm option
fix typo
fix typo
fix typo
fix typo
remove comparison to boolean
use pnfsidparser
studying bless.py behaviour
studying bless.py behaviour
studying bless.py behaviour
studying bless.py behaviour
added e-mail notification if there are files older than 24 hours
added print
make sure we are looking at files created at least one hour prior to running
remove test database
========== option.py ====================================================================================
Support the new --print switch for the configuration client.
Added --html-dir and --config-hosts switches for enstore_overall_status.py.
Added support for --file-fallback in the configuration client. This switch alters the use of --show to read from the local copy of the configuration file (if present), when the configuration server is down.
Added --list-clean to the media changer client command line interface. It prints out the cleaning tapes and their remaining cleaning counts.
Modified the order in which enstore/encp will look for ENSTORE_CONFIG_HOST/PORT/FILE. With this change it will: 1) Check for the environmental variable.* 2) Check for ~/.enstorerc 3) Check for /etc/enstore.conf 4) Check for /pnfs/.../.(config)(enstore)/(enstorerc) 5) Use the default enstore constant.* (* means the step is the same as before.)
Added support for "encp --copy N" where N is the number of extra copies to make of the file. Also, included is a regression test.
Added --list-movers, --list-library_managers, --list-media-changers to the "enstore config" command.
Updated a comment to reflect which clients use --list.
Implemented --list-slots for AML2 and STK. It lists the total, free, used and disabled slots/cells for each robot.
add show_file and show_copies
Add support for "enstore media --list-volumes".
Added --show-robot, --show-drive, --show-volume and --list drives to the "enstore media" command. Many related changes to support obtaining the underlying information from the robot(s).
These changes allow --erase to be usable for the file clerk client. Support for hidden options now exists in option.py. --erase for the "enstore file" command is now a hidden option. This means that it is not shown in the --help or --usage output and is available only to administrators. Currently, the code has the constant ALLOW_ERASE set to false; to enable this command, this constant only needs to be set to true.
Add support for scanfiles.py --external-transistion. This required a bit of redesign to support handling the options correctly.
add --is-up
Catch sys.stderr.write() errors.
Support the building of a version of encp for dCache.
Update encp to allow for regression tests to force encp to use the pnfs agent; even if the specified filesystem is already mounted.
added mover-dump to dump internal variables of mover
========== callback.py ====================================================================================
We need to catch IOError or OSError in case the open of /proc/net/tcp fails. On 9-10-2007, an encp gave a traceback opening /proc/net/tcp because of "No such file or directory". How that can happen to a file in /proc, I don't know.
remove rexec dependency
Added ValueError to the list of exceptions caught in __get_socket_state().
Document what would be needed to only use cPickle.
Modify read_tcp_obj() to first try and un-pickle the message. If that does not work it will try the _eval(). This should allow for us to move away from repr() and _eval() altogether some time.
Log more information when a bytecount error occurs. The new information is peername of other end of the socket.
Corrected an incorrect comment.
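The read_tcp_obj() entry in this callback.py log describes trying to un-pickle a message first and falling back to evaluating its repr(). The sketch below shows that order of attempts. The function name decode_message is hypothetical, and ast.literal_eval is swapped in here for the _eval() helper, whose implementation the log does not show.

```python
import ast
import pickle


def decode_message(raw_bytes):
    """Try pickle first; fall back to parsing a repr() string."""
    try:
        return pickle.loads(raw_bytes)
    except Exception:
        # Unpickling can raise several exception types, so catch broadly.
        # Not a pickle: treat the payload as the repr() of a literal.
        return ast.literal_eval(raw_bytes.decode("utf-8"))


ticket = {"work": "ping", "id": 7}
print(decode_message(pickle.dumps(ticket)) == ticket)
print(decode_message(repr(ticket).encode("utf-8")) == ticket)
```

Unpickling data from an untrusted peer is unsafe in general, so a sketch like this only suits connections between trusted hosts.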
========== movcmd_mc.py ====================================================================================
remove rexec dependency
========== entv.py ====================================================================================
Turn off the alarm() when leaving mainloop(). I believe that alarms live across exec()s and the alarm is triggering before the new entv process gets a chance to set the signal handler.
Removed local defines of r_eval functions.
Catch sys.stderr.write() errors.
Up the alarm signals from 10 seconds to 10 minutes.
Added support for the main thread to use SIGALRM signals to wake up a test to see if anything is going on.
Attempt to speed up entv.
Fixed a recently introduced bug that prevents all of the .entvrc file from being read. Now all of the client_color and library_color lines will be read.
Fixed a problem with the entv window being resized. Now each display Canvas is also resized.
Patch to fix the resizing of each canvas.
Speed up the get_entvrc() function.
Modification to help stop entv from consuming too much memory.
Improved how the fast connect_command() works. Also, cleaned up how the connection library vs. client colors are toggled.
Fixed a bug for displaying scheduled down and outage reasons. Only the last reason processed would 'win.' Speed up the startup and shutdown of entv. Most of this was changing how the .entvrc file was accessed. Added the functionality to have different systems in separate canvases, instead of having them share one canvas. Makes it easier to read. Lastly, added two radio buttons to the menu. These are for changing how the connection color is chosen. The choices are for client or mover (as grouped by library) colors.
Use the $DISPLAY environmental variable if it is set for screenName.
========== udp_server.py ====================================================================================
Immediately after closing self.server_socket in __del__() set self.server_socket to None.
Create and use the udp_common.r_eval() and udp_common.r_repr() functions. This will help make dumping eval and repr a little easier someday.
When a server receives a NEWCONFIGFILE message from the event relay, it should take actions to reinitialize itself. These changes make this possible. The test servers are the ratekeeper and the info server.
Cleanup Trace message usage.
Fixed dispatching_worker.process_request() to handle a None value being returned from udp_server.process_request(). Also, added better debugging messages to dispatching_worker and udp_server.
process_request() can now handle requests that didn't originally come from get_message(). The media_changer uses functionality of the dispatching_worker.py to communicate messages between forked processes. It is these messages that process_request() can now process.
Fix "get" to work with the new udp_server.py changes.
Attempt to fix the config server problem of receiving the wrong response. udp_server was not thread safe in the interaction between process_request() and reply_to_caller*().
========== library_manager_client.py ====================================================================================
fixed a bug
moved thread start
Added a --list switch to the mover, library manager and media_changer clients. This will list the currently configured servers, respectively.
========== file_utils.py ====================================================================================
remove rexec dependency
========== plotter.py ====================================================================================
fix typo
define pts_dir from config server
fill destination directories with *gifs
import string to make pychecker happy
first attempt to fix plotter to run w/o hardcoded (external) parameters
create destination directories if not existent
========== delete_at_exit.py ====================================================================================
Catch sys.stderr.write() errors.
========== enstore_stop.py ====================================================================================
removed a debug print statement
fixed to stop configuration_server
some mods for dealing with service IPs
more fixes
more fixes
removed print statement
another fix
bug-fix, half way
fixed a bug
fixes for mover stop
try to stop mover by sending quit command
========== esgdb.py ====================================================================================
bug fix
========== priority_selector.py ====================================================================================
read config in init
reload allowed to write discipline and priority when a new config file is loaded
========== enstore_display.py ====================================================================================
Reduced the frequency that the messages get processed. There was a math error that was doubling the intended number of calls to process_messages().
Minor code cleanup.
If the mover is in the "Unknown" state and there is no reason for it to be down from the inquisitor/scheduler, then change the background color of the mover to yellow.
For python 2.2, don't use __builtins__ when defining sum().
Catch sys.stderr.write() errors.
Define last_message_processed in the Display.__init__() function.
Added support for the main thread to use SIGALRM signals to wake up a test to see if anything is going on.
Remove canceling the 'after' action of the next timer update. This is because this action's callback is being called, and there is no need to cancel it anymore. This will hopefully save some cpu cycles.
Attempt to speed up entv.
Added some "--verbose 5" trace messages. Modified process_messages() to handle a large number of messages better.
Modification to help stop entv from consuming too much memory.
Improved how the fast connect_command() works. Also, cleaned up how the connection library vs. client colors are toggled.
Fixed a bug for displaying scheduled down and outage reasons. Only the last reason processed would 'win.' Speed up the startup and shutdown of entv. Most of this was changing how the .entvrc file was accessed. Added the functionality to have different systems in separate canvases, instead of having them share one canvas. Makes it easier to read. Lastly, added two radio buttons to the menu. These are for changing how the connection color is chosen. The choices are for client or mover (as grouped by library) colors.
========== get_total_bytes_counter.py ====================================================================================
fix it so it no longer depends on external parameters
Updated the list of libraries for the total bytes counter.
========== backup.py ====================================================================================
log making JOURNALS directory
do not rely on the return code of enrsh any more
fix the bug in making remote directory
create JOURNALS directory if it is not there
use copy instead of mv for archiving on the same node
Replaced calls to option.default_port() and default_host() with those from enstore_functions2.
print dbInfo, too
print dbInfo, too
print backup_config for debugging
========== enstore_functions2.py ====================================================================================
Modified the order that enstore/encp will look for ENSTORE_CONFIG_HOST/PORT/FILE. With this change it will: 1) Check for the environmental variable.* 2) Check for ~/.enstorerc 3) Check for /etc/enstore.conf 4) Check for /pnfs/.../.(config)(enstore)/(enstorerc) 5) Use the default enstore constant.* (* Means it is the same as before.)
Add functions this_host() and is_on_host().
Added the get_media_changers() function to the configuration server. Also, modified the get_movers() function to accept an empty library name to mean return all movers. And cleaned up some of the code. This uses a new function in enstore_functions2.py that returns the location of the current configuration file.
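The lookup order listed above for ENSTORE_CONFIG_HOST/PORT/FILE (environment first, then a series of files, then a built-in default) can be sketched as below. This is a sketch under stated assumptions, not the enstore_functions2.py implementation: resolve_setting() is a hypothetical name, the "NAME=value" file format is assumed because the notes do not describe it, and the caller supplies the file list because the PNFS path is elided in the notes.

```python
import os


def resolve_setting(name, default, candidate_files, environ=None):
    """Return the value for `name` (hypothetical helper).

    Check the environment first, then each candidate file in order,
    and finally fall back to `default`.
    """
    environ = os.environ if environ is None else environ
    if environ.get(name):
        return environ[name]
    for path in candidate_files:
        try:
            with open(os.path.expanduser(path)) as config_file:
                for line in config_file:
                    # Assumed format: NAME=value, one setting per line.
                    key, sep, value = line.strip().partition("=")
                    if sep and key.strip() == name:
                        return value.strip()
        except (IOError, OSError):
            continue  # Missing or unreadable file: try the next location.
    return default
```

A call such as resolve_setting("ENSTORE_CONFIG_HOST", "localhost", ["~/.enstorerc", "/etc/enstore.conf"]) would follow steps 1 to 3 and 5 of the list; the "localhost" default here is illustrative only.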
========== cleanUDP.py ====================================================================================
If select.select() in Select() gives a traceback with errno 4, "Interrupted system call", then retry the select.select().
SO_NO_CHECKSUM is backwards. Setting this to 1 turns off the checksum checking. We really want to force it to zero (the default) for the udp checksum checks to be performed.
Turn on UDP checksums for Linux. Linux supports these checksums on a per-socket basis. Almost all other Unixes support it on a per system basis.
__del__ needs to be taken out. python 2.4.3 complains about it: Exception exceptions.TypeError: "'NoneType' object is not callable" in > ignored. Also, __del__() functions cause the garbage collector problems. There are times when it doesn't know in what order to call them (from different objects), so it doesn't try.
If the reply/sendto address is empty don't try and send the message. This allows UDPServer.process_messages() to process internal media_changer messages without cleanUDP.sendto() breaking.
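The retry on errno 4 ("Interrupted system call") described for Select() can be sketched as follows. This is a minimal illustration, not the cleanUDP.py code: select_retry() and its select_func parameter are hypothetical names, the latter added only so the behavior can be exercised without real signals. On Python 3.5 and later, select.select() already retries interrupted calls itself, so the loop matters mainly for older interpreters such as the ones these notes target.

```python
import errno
import select


def select_retry(rlist, wlist, xlist, timeout=None, select_func=select.select):
    """Call select(), retrying when it is interrupted by a signal."""
    while True:
        try:
            return select_func(rlist, wlist, xlist, timeout)
        except OSError as err:
            if err.errno != errno.EINTR:
                raise  # Any other error is a real failure.
            # errno 4 (EINTR): a signal arrived during the call; try again.
```

This simple version restarts with the full timeout after each interruption; a stricter version would subtract the time already spent waiting.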
========== enstore_admin.py ====================================================================================
Building encp/enstore against python 2.4.3 succeeds, but then running enstore fails with an ImportError stating that the _strptime module could not be found. In reality this is a python library, not a module. Explicitly importing this library solves the problem.
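The fix described above amounts to one explicit import. A minimal sketch, assuming the failure comes from time.strptime() loading _strptime on first use, where a build tool that scans import statements cannot see it; parse_timestamp() and its default format are hypothetical and only show the import in context.

```python
import time
import _strptime  # Explicit import so the build includes the module.


def parse_timestamp(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a timestamp string; relies on _strptime being available."""
    return time.strptime(text, fmt)
```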
========== log_trans_fail.py ====================================================================================
Get the FAILED Transfers log page going again.
Made SDE ready.
Made SDE ready.
get log_dir from configuration server
========== inventory.py ====================================================================================
replace cms_volume_with_all_deleted_files
use makedirs()
take care of missing directories
better format for RECYCLABLE_VOLUMES
fix a typo
show mount_count instead of active in RECYCLABLE_VOLUMES
correct a typo
Exclude "null" and "8MM" media types from recycling consideration.
fix a typo
[1] Take care of all 9940B libraries [2] list readonly volumes with all deleted files as candidates for recycling
take care of LTO3
bug fix
add wp count for blanks
========== info_server.py ====================================================================================
Added a log message to reinit() saying that it is reconfiguring itself after being notified by the configuration server there is a new configuration.
mylint doesn't like +=
reconnect only on connection error, not all pg.Error
remove a redundant Trace.log()
more exception handling with reconnection to the database
bug fix
add multiple copies inquiries
add file_info()
catch bad bfid_info exception
catch bad bfid_info exception
When a server receives a NEWCONFIGFILE message from the event relay, it should take actions to reinitialize itself. These changes make this possible. The test servers are the ratekeeper and the info server.
========== encp.py ====================================================================================
bumping version to v3_7 because of encpCut
Allow FQDN in the disk volume labels. For charset, hostnamecharset is now defined.
On Solaris 10 /tmp is a swap partition. os.pathconf(PC_FILESIZEBITS) returns -1 instead of the number of bits. In this case use 32 for the number of bits that can store the size of a file.
Include the user_level (ADMIN, USER or USER2) in the tickets for possible future use.
Fixed the bug in encp that was preventing the RESUBMITS count from being incremented correctly.
When --get-bfid and --override-deleted are used, test the filename to be longer than 0.
When sending an original request (with more copies to follow) update the work_ticket to include the number of copies still to come.
Modified the "encp aborted from" output to include more info.
The patch fixes a bug if --get-bfid matches too many files for the pnfsid. It uses the bfid in layer 1 to determine which one it really wants.
Use getattr(errno, 'EFSCORRUPTED', errno.EIO) instead of errno.EFSCORRUPTED to avoid pychecker warnings. Added a check in librarysize_check() to check the result of the returned library info from the config server. Currently, it goes straight into pulling out data, and the error saying the library does not exist is ignored.
Report a more accurate "Trouble with pnfs" error message in set_pnfs_settings.
Corrected the /etc/mnttab filename for SunOS. Was previously /etc/mntab, which is the wrong name. Also, corrected the Trace.log() line that is supposed to output this error.
Fixed a bug regarding errno module. It should have been errno.EIO and not errno["EIO"].
bumping version to v3_6i because of encpCut
bumping version to v3_6h because of encpCut
Added support for "encp --copy N" where N is the number of extra copies to make of the file. Also, included is a regression test.
Fix encp.py for "get" to work correctly if --sequential-filenames is used.
Support handling 0 or 1 seeded adler32 CRC values instead of just 0 seeded adler32 values.
Fixed some bug that the regression testing found with the previous commit.
Better dealing with metadata when a single machine mounts multiple pnfs servers. This applies both to encp when reading and to enstore pnfs --path.
Fix the annoying bug that causes sdsscp/get to create new copies of files that have already been read and their metadata created.
bumping version to v3_6g because of encpCut
Address the error handling when encp finds just whitespace located in pnfs layers. It didn't help that encp would set layers to a single space in clear_layers_1_and_4() (also fixed).
bumping version to v3_6f because of encpCut
Change --file-family-wrapper from an admin option to a user2/dcache option.
Log the exception if it occurs in __is_pnfs_local_path(). This should give a clue to what was the real cause the next time it happens. Patched get_volume_clerk_info() to handle a None value for the returned vc_ticket.
Fix a get specific issue related to the create_read_requests() function split into create_read_requests() and create_read_request().
Make the changes for encp to use the new alarm server functionality to send e-mail based on storage groups.
Fixed a spelling error: uninque => unique.
Catch sys.stderr.write() errors.
Fixed a pnfs_agent related bug. If encp was told to only use the pnfs_agent for PNFS access, then encp was re-throwing the OSError "Force use of pnfs_agent". This re-raise would cause a traceback.
Do the same for create_write_requests() that was done for create_read_requests(). There are now create_write_requests() and create_write_request() functions. This is to provide better error reporting.
Split create_read_requests() into two functions. The new function name is create_read_request(). This is to facilitate better error messages.
If an exception occurs sending an error to the accounting server, don't worry about it. Just log the error and move on. Also, removed some code from create_read_requests() when reading layer 1. This functionality was moved to pnfs.py.
Fixed a bug if the input file is non-existent in a non-pnfs filesystem. Previously an incorrect error message was given. Now the correct one stating that the file does not exist is given.
Fixed is_pnfs_path() to honor REMOTE_ENCP. This was done by using __is_pnfs_remote_path() instead of doing the same thing (incorrectly) inline.
Fixed a bug when reading a deleted file and the user used --override-deleted. The problem files were deleted files that had valid 'pnfs_name0' entries in the Enstore DB. encp was trying to stat() these files and failing the transfer over it.
Fix "get" to work with the new udp_server.py changes.
bumping version to v3_6d because of encpCut
Cleanup of the HSM vs. RHSM file handling.
Protect calls to get_pac() if the pnfs_agent should not be used. This should avoid future false 'pnfs_agent' configuration server errors.
Yet another modification to get_stat(). This one is for it to better take into account the --pnfs-is-automounted switch.
Modified get_stat() to handle intermittent pnfs errors better. This might be new with Linux 2.6, but the EIO errno value can be returned (in addition to or replace ENOENT?). Also, added a check for the environmental variable REMOTE_ENCP to avoid yet another way to get the 'pnfs_agent' CONFIG error.
Fix a bug where specifying the wrong inputname when reading was resulting in the ".(use)(1)()" file being returned instead of the actual filename.
bumping version to v3_6c because of encpCut
This should allow for encp to ignore bogus callbacks. This is in response to the security team doing 65,535 port scans. Their scans find encp waiting for a mover to callback and connect. Since the scan does not send what encp is looking for it should ignore the connection. It previously was doing that, but doing so by doing a full retry. Now it goes back to waiting for another connection on the listening socket.
Fix using --get-bfid and --override-deleted switches together. This was a recently introduced bug. Also, catch a situation in handle_retries() to avoid going into an infinite loop when we don't know yet what request attempt failed (aka TCP_EXCEPTION error).
Fixed one more location where the: Got error while trying to obtain configuration: ('KEYERROR', "Configuration Server: no such name: 'pnfs_agent'") errors were coming from.
Fixed a bug in read/write_stall_transfer(). There can be a socket.error raised that wasn't being caught. Now both select.error and socket.error are handled. Also, fixed a bug as to how encp notices if the input file is a directory. Thirdly, fixed a bug that would cause encp to use the wrong tags if the pnfsid contained an "F" because is_access_name() would fail.
bumping version to v3_6b because of encpCut
Support the building of a version of encp for dCache. Workaround for the PNFS ghost file problem. Instead of leaving the true filename as a ghost file, we will leave the temporary filename as a ghost file.
Removed debugging statements.
Missed a global statement that pychecker found.
bumping version to v3_6a because of encpCut
Update encp to allow regression tests to force encp to use the pnfs agent, even if the specified filesystem is already mounted.
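The entry above about handling 0 or 1 seeded adler32 CRC values can be illustrated with Python's zlib module, whose adler32() takes an optional starting value. This is a sketch of the idea only, not the encp.py logic: crc_matches() is a hypothetical helper that accepts a stored CRC if it matches the data under either seed.

```python
import zlib


def crc_matches(data, expected_crc):
    """Return True if expected_crc is the Adler-32 of data computed with
    a seed of 1 (the standard starting value) or a seed of 0."""
    for seed in (1, 0):
        # Mask to an unsigned 32-bit value for a consistent comparison.
        if zlib.adler32(data, seed) & 0xFFFFFFFF == expected_crc:
            return True
    return False
```

Accepting both seeds lets files recorded under the older 0-seeded convention still verify after the default moves to 1.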
========== enstore_show_inv_summary_cgi.py ====================================================================================
change 'unknown' cluster to the node name
update list of special files
========== alarm_server.py ====================================================================================
fix send_mail program
add more messages
catch Exception and print message
catch the exception in alarm_server.py
Make the changes for encp to use the new alarm server functionality to send e-mail based on storage groups.
add comment
revert the previous change
handling alarm_info
mail per a pattern in config
========== event_relay_client.py ====================================================================================
When a server receives a NEWCONFIGFILE message from the event relay, it should take actions to reinitialize itself. These changes make this possible. The test servers are the ratekeeper and the info server.
========== acc_daily_summary.py ====================================================================================
fix a typo
add dbport
========== set_lm_noread.py ====================================================================================
Replaced calls to option.default_port() and default_host() with those from enstore_functions2.
========== atomic.py ====================================================================================
Workaround for the PNFS ghost file problem. Instead of leaving the true filename as a ghost file, we will leave the temporary filename as a ghost file.
Add filenames to error messages.
========== event_relay.py ====================================================================================
default_port() and default_host() are now in enstore_functions2.py.
========== pnfs_agent_client.py ====================================================================================
remove rexec dependency
Fixed the return code that "enstore pnfs_agent" gives the caller.
========== configuration_server.py ====================================================================================
Allow the configuration_server to listen on all IPs configured for the machine.
Added comments about get_library_managers() and get_media_changers() not being thread safe.
Modified the order that enstore/encp will look for ENSTORE_CONFIG_HOST/PORT/FILE. With this change it will: 1) Check for the environmental variable.* 2) Check for ~/.enstorerc 3) Check for /etc/enstore.conf 4) Check for /pnfs/.../.(config)(enstore)/(enstorerc) 5) Use the default enstore constant.* (* Means it is the same as before.)
Added the get_media_changers() function to the configuration server. Also, modified the get_movers() function to accept an empty library name to mean return all movers. And cleaned up some of the code. This uses a new function in enstore_functions2.py that returns the location of the current configuration file.
========== enstore.py ====================================================================================
Replaced calls to option.default_port() and default_host() with those from enstore_functions2.
Added support for the scanfiles.py to be usable via "enstore scanfiles".
========== multiple_interface.py ====================================================================================
Catch sys.stderr.write() errors.
========== dispatching_worker.py ====================================================================================
added reset_interval method
Moved the incrementing of self.n_children to after the os.fork() call. If os.fork() were to throw an exception then self.n_children would be incorrect.
Added align_interval bool as an argument to add_interval_func(). This will allow a server to align its scheduled functions to the nearest interval. For example: if the interval is 15 minutes then the scheduling will run the function at minute 0, 15, 30 and 45 instead of every 15 minutes from when it was started. If you want them to start the scheduling now, set align_interval to false.
added diagnostic message
Reverting changes that added enabling UDP checksums.
enable checksum
Fixed dispatching_worker.process_request() to handle a None value being returned from udp_server.process_request(). Also, added better debugging messages to dispatching_worker and udp_server.
Added some logging of errors inside process_request(). Hopefully, this will allow us to track the root cause of an error.
More changes with respect to using the thread safe udp_server. This set of changes has to do with not breaking the media_changer that uses functionality of the dispatching worker to allow child and parent process to communicate.
Don't use variables before they have been defined.
Add TypeError to the list of exceptions to catch in process_request().
Attempt to fix the config server problem of receiving the wrong response. udp_server was not thread safe in the interaction between process_request() and reply_to_caller*().
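The align_interval behavior described above for add_interval_func() can be sketched as a small calculation of the next run time. This is an illustration under assumptions, not the dispatching_worker.py code: next_run_time() is a hypothetical helper, and times are taken to be seconds since the epoch.

```python
def next_run_time(now, interval, align_interval=True):
    """Return the next time (seconds since the epoch) at which a function
    scheduled every `interval` seconds should run."""
    if align_interval:
        # Round up to the next whole multiple of the interval, so a
        # 900-second interval lands on minutes 0, 15, 30 and 45.
        return (int(now) // interval + 1) * interval
    # Unaligned: simply one interval after the current time.
    return now + interval
```

For example, with a 15 minute (900 second) interval and now = 1000, the aligned schedule runs next at 1800, while the unaligned schedule runs at 1900.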
========== get.py ====================================================================================
bumping version to v1_53 because of sdsscpCut
If wait_for_final_dialog() gets an error, don't send a pprint.pformat()ed ticket to the log server.
bumping version to v1_52 because of sdsscpCut
Catch sys.stderr.write() errors.
bumping version to v1_51 because of sdsscpCut
Fix "get" to work with the new udp_server.py changes.
bumping version to v1_50 because of sdsscpCut
bumping version to v1_49 because of sdsscpCut
Replace a call to udp_socket.process_request() with one for udp_socket.do_request().
========== udp_client.py ====================================================================================
Reverting changes that added enabling UDP checksums.
enable checksum
Create and use the udp_common.r_eval() and udp_common.r_repr() functions. This will help make dumping eval and repr a little easier someday.
Catch sys.stderr.write() errors.
========== volume_clerk_client.py ====================================================================================
remove rexec dependency
bug fixing
revert to AAXX99{L1|L2}
allow volume label format AAXX99{L1|L2|L3}
========== discipline.py ====================================================================================
reload allowed to write discipline and priority when a new config file is loaded
========== net_driver.py ====================================================================================
comment trace.log
========== mover.py ====================================================================================
additional check for eod cookie
Include a patch for DiskMover (Mover already has it) to pass along the number of copies to follow.
set mover to offline state with tape in it if set_volume_no_access has failed.
Set volume to NOACCESS and generate alarm if set_remaining_bytes failed.
fixed a bug in read_tape
bug fixed
fixed crc processing for disk mover reads
fixed bug
in disk mover if data directory does not exist create it
increase LM poll interval 3 times if not in IDLE or HAVE_BOUND
Raise an alarm if the write tab status reported by the drive does not match the write tab status recorded in the volume DB.
added pid to start_draining return ticket
fixed the quit method
fixed the quit method
added quit function
test code is removed
improve memory error exception processing
reversed changes for immediate restart, it does not work
send error message to lm in transfer failed for memory error
restart mover immediately on memory error, do not dismount the tape
fixed conversion of crc in read tape
fixed conversion of crc in read tape
pychecker inspired cleanup
pychecker inspired cleanup
Sometimes Memory Error occurs when reading data from network (memory leak). This was incorrectly interpreted as ENCP_GONE. Now if this occurs mover will dismount a tape and restart itself
added diagnostics for ENCP_GONE investigation
modifications related to crc_seed
modifications related to crc_seed
changed crc_seed presentation in the configuration
1. By default set CRC seed to 1. This is to select a correct value of seed for Adler32. crc_seed has to be specified in the configuration file to use 0 as it was before. 2. Added "copies" key to transfer to file clerk for multiple copies intent.
flush the driver to help tape thread to exit faster
fixed indentation problem
If the mover has retried the loadvol due to MC_QUEUE_FULL, there can be a situation where the tape was actually mounted; the next mount attempt will then result in the error message that the tape has already been mounted. This fix is to verify that the requested tape is actually in the requested drive.
Python 2.0 had a fcntl and a FCNTL module. Starting with python 2.2 there was only fcntl. These changes comment out the code that would put the FCNTL constants into fcntl for old python 2.0 versions.
added error processing of mc GetWork reply on a mover startup
Limit the 'transfer' notify messages to no closer than 1 second apart. Also, send a 'transfer' notify message no more than 5 seconds after the last one.
make LM and mover work with different volume clerks for sharing movers across systems
log_state and dump vars in memory_usage if mem. ran out of limits
commented dump_vars in log_state
restart mover when it approaches a memory limit to deal with memory error
diagnostics added
log state before each transfer
fixed a bug
don't call init_data_buffer in transfer_failed as it may cause exceptions in some thread. Also minor fix for log message
fixed a bug
set tape readonly on FTT_EBLANK
if transfer fails in media thread when positioning the tape dismount volume directly not starting another media thread
added diagn for write error
fixed a bug
call reset init_data_buffer transfer is completed to reset buffer
call reset after transfer is completed to reset buffer
log message added
added mover-dump to dump internal variables of mover
========== enstore_html.py ====================================================================================
fixed path to log files
fixed getting configuration either from config server or from config file
debugging issues on ccfsrv2
get html directory from config server
modify location of log files
bug fixed
WARNING I changed "cgi-bin/enstore" to "cgi-bin"
remove rexec dependency
paint time stamp if picture is out of date
paint time stamp if picture is out of date
use WWW_DIR environment variable if set
do not output r_a info for an alarm
========== enstore_user.py ====================================================================================
Building encp/enstore against python 2.4.3 succeeds, but then running enstore fails with an ImportError stating that the _strptime module could not be found. In reality this is a python library, not a module. Explicitly importing this library solves the problem.
========== library_manager.py ====================================================================================
bug fixed
bug fixed
bug fixed
bug fixed
bug fixed
bug fixed
apply some conditions before calling restrict_host_access in next_work_this_volume
For bound volumes allow requests for machines that exceed the number of ongoing transfers. For this the max_permitted argument in restrict_host_access and the corresponding arg in the discipline section of configuration dictionary can be set to a tuple (n1,n2,n3) where n1 - as before, max permitted; n2 - add this for bound volume for read requests; n3 - add this for bound volume for write requests
back to not considering SG for exceeded limits. It has proven to improve the performance
set lock state from configuration on reinit
reload LM configuration dictionary in reinit
reload LM configuration dictionary in reinit
back to considering SG for exceeded limits
better vcc creation procedure
typo fixed
In next_work_this_volume if the number of requests exceeded the SG limit, do not consider requests for this SG
bugs fixed
timing added
check for how long the mover is in its state and generate alarm if time is expired
keep the local list of write volumes to not send excess requests to volume clerk
bug fixed
fixed the problem exposed at cdf. In this_volume, it cycles through the same requests near "BBB". Fixed by breaking if the same request was already processed.
fixed a logic bug
some additional fixes
some additional fixes
some additional fixes
fixed a bug
bug fixed
bug fixed
bug fixed
Add file family counters for write requests. If file family counter is bigger than file family width let idle mover to pick up the request, rather than wait for the mover with bound volume.
if no postponed requests check the last one
if no requests left take tmp_rq
modified output
fixed a bug
fixed a bug
fixed a bug
fixed a bug
fixed a bug
fixed a bug
fixed a bug
fixed a bug
check how long the mover has not updated the at_movers list, and if for more than 10 minutes, remove it from the list
moved check of the number of requests
trace added to see what is in encp ticket
reinitialize max_requests when configuration changes
bug fixed
bug fixed
trace added
trace added
trace added
restrict the number of requests in the queue
fixed reinit
reload allowed to write discipline and priority when a new config file is loaded
replace log of 'access delayed for ...' by trace level 20
Python 2.0 had a fcntl and a FCNTL module. Starting with python 2.2 there was only fcntl. These changes comment out the code that would put the FCNTL constants into fcntl for old python 2.0 versions.
modify allow_access before applying
overridden handle_er_msg of generic server to update self.allow
fixed a typo
selectively allow reads from certain nodes
search from the beginning of the pattern in access_granted
selectively allow writes from certain nodes
backward compatibility with old mover ticket
make LM and mover work with different volume clerks for sharing movers across systems
========== hostaddr.py ====================================================================================
Catch sys.stderr.write() errors.
========== file_clerk_client.py ====================================================================================
remove rexec dependency
These changes allow for --erase to be usable for the file clerk client. Support for hidden options now exists in option.py. --erase for the "enstore file" command is now a hidden option. This means that it is not shown in the --help or --usage output and is available only to administrators. Currently, the code has the constant ALLOW_ERASE set to false; to enable this command this constant only needs to be set to true.
catch errors in --modify
add command interface to find_copies, find_all_copies, find_original, find_the_original, and find_duplicates
add find_all_copies(), find_original(), find_the_original() and find_duplicates()
bug fix
add multiple copies capability
========== edb.py ====================================================================================
added a new time zone entry
========== Trace.py ====================================================================================
add fractions of the second to do-print
Catch sys.stderr.write() errors.
========== ratekeeper.py ====================================================================================
Cleanup acc_daily_summary() and filler() functionality in accounting_server and update_DRVBusy() and update_slots() in ratekeeper. These changes have forks to execute the updates to avoid pauses to requests.
If the directory to write the deprecated rate files in is not in the configuration, continue with just inserting into the DB. If the directory is defined and it doesn't exist, continue with just inserting into the DB.
Add tape_library to the drive-utilization table. CDF had a problem when just the drive_type was not unique, because 9940B are in both the CDF silo(s) and the D0 silo(s).
Added the ability for the ratekeeper to insert the drive_utilization and the slot_usage information directly into the accounting DB.
removed timezone from insert query string
fix errors
added code to fill in "rate" table
Catch sys.stderr.write() errors.
When a server receives a NEWCONFIGFILE message from the event relay, it should take actions to reinitialize itself. These changes make this possible. The test servers are the ratekeeper and the info server.
========== tab_flipping_nanny.py ====================================================================================
bug fix
[1] take -l --library, -o --output arguments [2] works for all libraries
========== media_changer.py ====================================================================================
process the message received during cap operations
Fixed aci wrapper to build with swig 1.3
the comment added
Added mtx class for accessing the Overland library. The code was taken from the Vanderbilt implementation, which uses the mtx command line interface.
Added --list-clean to the media changer client command line interface. It prints out the cleaning tapes and their remaining cleaning counts.
Fix support for the STK version of list_volumes(). It was trying to send all the information back over UDP, but failed because there is too much. The AML2 implementation did it right by sending it via TCP, so I copied the code from there.
For STK media changer, make the 'free' slots returned via list_slots() an integer instead of a string. For the AML2 MC, fork inside list_slots() to avoid resource leak from aci_getcellinfo() and check the drive type with the movers (to tell an LTO1 from an LTO2).
Include a flag for AML2 to indicate if a drive is empty or not.
Implemented --list-slots for AML2 and STK. It lists the total, free, used and disabled slots/cells for each robot.
Add support for "enstore media --list-volumes".
Added --show-robot, --show-drive, --show-volume and --list drives to the "enstore media" command. Many related changes to support obtaining the underlying information from the robot(s).
Fixed the timed_command() function for STK to handle the shorter copyright notice from "query server" commands. The old code would ignore the answer from newer ACSLSes since fewer lines were received than it wanted.
Python 2.0 had a fcntl and a FCNTL module. Starting with python 2.2 there was only fcntl. These changes comment out the code that would put the FCNTL constants into fcntl for old python 2.0 versions.
Return the NOT_SUPPORTED error from the AML2_MediaLoader and Manual_MediaLoader robotQuery() calls. Previously these classes didn't define it. "enstore med --show" commands were hanging with these types of media changers.
if 'max_work' in config use it to set max work
========== charset.py ====================================================================================
Allow FQDN in the disk volume labels. For charset, hostnamecharset is now defined.
========== aci.py ====================================================================================
Fixed aci wrapper to build with swig 1.3
Include a flag for AML2 to indicate if a drive is empty or not.
Implemented --list-slots for AML2 and STK. It lists the total, free, used and disabled slots/cells for each robot.
Add support for "enstore media --list-volumes".
========== ejournal.py ====================================================================================
potential bug fix
bug fix
========== checkdb.py ====================================================================================
create missing directories
remove /diskc dependency
use system_inhibit_0
skip excluded storage group
add volume_storage_group_idx
========== enstore_start.py ====================================================================================
move the output file instead of copying
removed print statement
some more mods
use Interfaces to detect what network ifaces are up on this host
some mods for dealing with service IPs
removed prints
make enstore start start the configuration server through its service name (alias)
Modified enstore restart to not execute the enstore_start.py and enstore_stop.py files directly. Instead now it calls the python functions inside enstore_start/stop.py directly.
Fixed the starting of movers to use sudo to run it as root. It should have continued to do this, but became broken with revision 1.36.
if ENSTORE_HOME is defined set tmp dir relative to it
Fixed a situation where a (test) mover or media_changer wouldn't get started because there wasn't a configured library manager.
If a mover belongs to two (or more) libraries, don't restart it once for each library.
Fix the starting of the movers (as root via sudo) by using the symbolic link in $ENSTORE_DIR/sbin.
Changes necessary for enstore start to be able to start Enstore from sources and also from stand alone executables. This does entail creating symbolic links in $ENSTORE_DIR/sbin/ for the enstore servers. Add enmv to the encp and enstore builds.
Catch sys.stderr.write() errors.
========== enstore_file_listing_cgi.py ====================================================================================
remove hard coded path
========== info_client.py ====================================================================================
more verbose deleted status
add show_copies
add multiple copies inquiries
add file_info()
========== enstore_restart.py ====================================================================================
Modified enstore restart to not execute the enstore_start.py and enstore_stop.py files directly. Instead now it calls the python functions inside enstore_start/stop.py directly.
========== makeplot.py ====================================================================================
if log_dir does not exist create it
if log_dir does not exist create it
Catch sys.stderr.write() errors.
========== enmv.py ====================================================================================
Consider supplemental groups when considering if the user has the ability to move/rename a file. Also, move the permissions tests to before the rename() function call.
Add a description to the help and usage.
Catch sys.stderr.write() errors.
Fixed a scenario where enmv gets a permission error while (re)setting the output file permission bits and then fails to follow through with updating the Enstore DB. This should only have been able to happen in cases where rename() succeeded.
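As an illustration of the supplemental-group consideration mentioned in this section, here is a rough sketch of a permission test that could run before calling rename(). The function name and argument layout are assumptions, not enmv's actual code. It only inspects the classic owner/group/other bits and ignores ACLs and sticky-bit rules.

```python
import stat


def can_move_in_directory(dir_stat, uid, gid, groups):
    """Return True if the mode bits grant write+search on a directory.

    dir_stat: an os.stat() result (or any object with st_mode, st_uid, st_gid).
    uid, gid: effective user and primary group ids of the caller.
    groups:   supplemental group ids, e.g. from os.getgroups().
    """
    if uid == 0:
        return True  # root bypasses the mode bits
    mode = dir_stat.st_mode
    if dir_stat.st_uid == uid:
        needed = stat.S_IWUSR | stat.S_IXUSR
    elif dir_stat.st_gid == gid or dir_stat.st_gid in groups:
        # Supplemental groups count as well, not just the primary gid.
        needed = stat.S_IWGRP | stat.S_IXGRP
    else:
        needed = stat.S_IWOTH | stat.S_IXOTH
    return (mode & needed) == needed
```

On a Unix system this could be called as can_move_in_directory(os.stat(directory), os.geteuid(), os.getegid(), os.getgroups()) before attempting the rename, so that a permission problem is reported before anything on disk changes.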
========== volume_clerk.py ====================================================================================
mylint doesn't like +=
reconnect only on connection error, not all pg.Error
more exception handling with reconnection to the database
use issubclass() to check pg.Error
make reconnection deal with dead database server
better handling of database reconnection
add reason for override
Instead of using self.r_eval() use udp_common.r_eval().
log new-library to volume change history
[1] relax the length restriction on volume comment [2] log each set_comment in history
========== enstore_pg.py ====================================================================================
revert to 1.5
bugs fixed
use accounting server, not direct query, which may hang inquisitor
use db port if specified in config file
add dbport for connection
========== enstore_show_inventory_cgi.py ====================================================================================
WARNING I changed "cgi-bin/enstore" to "cgi-bin"
change 'unknown' cluster to the node name
update list of special files
========== accounting_server.py ====================================================================================
revert to 1.25
[1] fix last_xfers() and last_bad_xfers() [2] reconnect whenever it restarts
added some queries for encp transfers and encp errors
Fixed syntax error from previous commit.
Cleanup acc_daily_summary() and filler() functionality in accounting_server and update_DRVBusy() and update_slots() in ratekeeper. These changes have forks to execute the updates to avoid pauses to requests.
add 30 second delay and retry for the first db connection failure
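The "30 second delay and retry for the first db connection failure" behavior can be sketched as below. The helper and its parameters are hypothetical. Real code would catch the database driver's specific connection error rather than the broad Exception used here.

```python
import time


def connect_with_one_retry(connect, delay=30, sleep=time.sleep):
    """Call connect(); if the first attempt fails, wait and try once more.

    connect: zero-argument callable returning a connection (hypothetical).
    delay:   seconds to wait before the single retry.
    sleep:   injectable sleep function so the delay can be skipped in tests.
    """
    try:
        return connect()
    except Exception:
        # Assumption: any failure on the first attempt is treated as transient.
        sleep(delay)
        # A second failure is allowed to propagate to the caller.
        return connect()
```

Only the first failure is absorbed, so a database that is truly down still surfaces as an error after one delay.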
========== udp_common.py ====================================================================================
remove rexec dependency
Replace "raise sys.exc_info()" with "raise sys.exc_info()[0], sys.exc_info()[1], sys.exc_info()[2]" in r_eval().
Create and use the udp_common.r_eval() and udp_common.r_repr() functions. This will help make dumping eval and repr a little easier someday.
========== pnfs_backup_plot.py ====================================================================================
copy data from pnfs server and then plot. Still partially hardcoded
========== weekly_summary_report.py ====================================================================================
fix a typo
eliminate /diska dependency
change db_port to dbport
add genser@fnal.gov to cdf mailing list
========== get_all_bytes_counter.py ====================================================================================
chmod +x
get_all_bytes_counter is compatible with SDE