ENCP release notes, from v2_18 to v2_19
Encp changes:
=============
Encp on 64bit alpha (OSF1) will now work. The previous version had a problem
converting 64bit unsigned longs to 32bit unsigned ints.
Moved a corrupted filesystem check that listed the entire contents of a
directory. This particular check is now only done if an error occurs first.
For directories with a large number of files this was a large performance hit.
Encp now checks the input file's size on reads against the recorded
file size. This is to catch files that have zero length in pnfs but have
been successfully written to tape without any other errors.
Numerous log file messages were changed and/or added. Most improve the use
of the unique id as a search string when reading the log files manually.
When using encp via dcache, files and directories can now have the name "root".
In the log file vendor was spelled venor. This is now corrected.
Misc.:
======
There is a program called ecrc that will calculate the CRC for a local file.
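As a rough illustration of what such a tool computes, the sketch below checksums a local file in fixed-size chunks. It assumes the Adler-32 algorithm mentioned in the EXfer.c notes; the function name file_adler32, the chunk size and the starting value are illustrative choices and are not taken from ecrc itself.

```python
import zlib


def file_adler32(path, chunk_size=1024 * 1024):
    """Return the Adler-32 checksum of the file at `path` as an unsigned 32-bit int."""
    checksum = 1  # zlib's default starting value; the real tool may seed differently
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Feed the running value back in so the file is never fully in memory.
            checksum = zlib.adler32(chunk, checksum)
    return checksum & 0xFFFFFFFF
```

Reading in chunks keeps memory use flat for very large files, which matters for the multi-gigabyte transfers these notes deal with.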
Detailed cvs commit logs
========== ./doc/WWW/Makefile ====================================================================================
add making route files
========== ./doc/WWW/index.html ====================================================================================
get rid of bad colon
========== ./doc/WWW/talks.html ====================================================================================
add route.ps
========== ./etc/stk.conf ====================================================================================
Added 100 tapes to the miniboone allocation.
discipline ketchup
Updated MINOS quota
Moved 994081 and 91 back to production
eagle42 back to production
added test.cw quota
Put the test library quota back to 10
Increase test quota from 10 to 20 (from the right server node this time)
Increase test quota from 10 to 20
move eagle42 to test lib for cern_wrapper testing, correctly this time
using eagle42 for cern_wrapper testing
enable DAQ access for miniboone
Swapped 9940b drives to alternative controllers
Take 9940 81 and 91 back to evaluation library
restrict host access for chimichanga, snickers and mustard is set back to 3
Moved 9940B10 to eval-b from eval-bb library
change quota for T1
restrict host access for chimichanga, snickers and mustard is set to 5
added sg for test tape
9940 81 and 91 to 9940 lib once more
remove discipline restrictions for snickers and chimichanga
remove colon
assign 9940B10 mover to eval-bb library
give sdss more tapes
Put d0lib-archive's quota back to 75.
Try to get d0lib-archive quota to work.
increased allocation for d0lib-archive
increase the buffer sizes for 9940B to 1.3Gb
uppercased b in 9940b in block size def
down to 500Mb of max_buffers for 9940b drives
downgraded buffers to 1Gb from 1.35Gb for 9940B
returned eagle42 to eagle library from test
added cern wrapper info
eagle42 to test library for cern wrapper testing
Switched 9940B10 to tps2d0n
Modify dev entry for 9940B 10 and 11
added quota for eval-a and eval-b
Fixed "NUL1BM" typo to "NULLBM"
Corrected 9940b rate
eval-a and eval-b libraries created LE10 and LE11 9940B drives setup
9940 movers 81 and 91 moved to test library for performance testing
allow 3 simultaneous requests from chimichanga
added user allocation for cdf-sam of 85 9940 tapes - tj
========== ./etc/enstore_alarm_search.html ====================================================================================
sam.conf
========== ./etc/enstore_cambot.html ====================================================================================
add .fnal.gov to adiccam links
========== ./etc/enstore_log_file_search.html ====================================================================================
sam.conf
========== ./etc/d0en.enstore.k5login ====================================================================================
added new users
added d0enmvr25a
add enstore on hppc
Added d0endca3a to these files.
add d0enmvr4a and 7a back into d0en
========== ./etc/enstore_user.html ====================================================================================
sam.conf
========== ./etc/plotHelp.html ====================================================================================
update
========== ./etc/rip.conf ====================================================================================
remove colon
========== ./etc/rip.enstore.k5login ====================================================================================
added new users
========== ./etc/root.k5login ====================================================================================
added new users
added d0enmvr25a
Added d0endca3a to these files.
Added stken movers 10 and 11, and stkenout 1 and 2
add user principals to allow ksu to root
========== ./etc/sam.conf ====================================================================================
added d0enmvr25a to the mezsilo library
Remove sammam library, for real this time
Moved D30A, B, C into samlto library
Removed sammam library
Added D30A, B, C LTO drives
Updated d0enmvr19a configuration
Removed DI36, 37, 44, 45
Removed DC03 and DC04 at Frank's request
disable write access to sammam and sam-m2 libraries
994025 added mezsilo
remove colon
sam.conf
typo in prev DC03 dev entry
DC03 and DC04 /dev/ entries have changed.. no idea why.. firmware reloaded perhaps?
change dismount delay time
applied discipline to d0bbin and fnd+ nodes per Jon's request
add support for extra links on plot page
set adminpri for d0ola,b,c
set update_interval for LTO movers to 5 s
changed dismount_delay and max_dismount_delay to 10s for LTO and 9940 movers
Modified logname for meztest.library_manager to MZTSTLM
changed library for d0enmvr4a and d0enmvr7a
added d0enmvr4a, port 7606 - added d0enmvr7a, port 7607 and meztest.library_manager port 2524
========== ./etc/stken.enstore.k5login ====================================================================================
added new users
add enstore on hppc
add stkensrv5
Added stken movers 10 and 11, and stkenout 1 and 2
========== ./etc/auth_stk.conf ====================================================================================
more allocate for sdss
steve authorized sdss to write to the eagles
Added user allocation for cdf-sam of twenty tapes.
added e835 and e907 allocations
========== ./etc/enstore_system_info.html ====================================================================================
remove user command section
sam.conf
add inventory summary
========== ./etc/cdfen.enstore.k5login ====================================================================================
added new users
add enstore on hppc
========== ./etc/hosts ====================================================================================
Added d0endca3a to these files.
add mvr10a, 11a, out1, out2
add d0enmvr4a and d0enmvr7a back into hosts correctly
========== ./etc/cdf.conf ====================================================================================
remove colon
set online priority for user stager stager coming from fcdfsgi1
========== ./etc/enstore_system_html_d0ensrv2 ====================================================================================
add user data bytes count
========== ./etc/enstore_system_html_stkensrv2 ====================================================================================
add user data bytes count
========== ./etc/enstore_system_top.html ====================================================================================
add user data bytes count
========== ./etc/make_enstore_system_html ====================================================================================
only cat file if it exists
========== ./etc/enstore_system_html_cdfensrv2 ====================================================================================
add user data bytes count
========== ./etc/enstore_system_middle.html ====================================================================================
add link to ngop monitoring of enstore
sam.conf
production page on www-ccf now
add user data bytes count
========== ./modules/.cvsignore ====================================================================================
ignore ecrc binary
========== ./modules/EXfer.c ====================================================================================
On alpha machines we need to be careful about using unsigned longs. They are actually 64bit sized values, which doesn't always play nice when coercing down to 32bits. Since we only need 32bits for the adler32 algorithm, just use unsigned integer.
The 64 bit unsigned long size problem is now fixed for the threaded version of this code.
Changed the crc variable from an unsigned long to unsigned int. On true 64bit machines where the long was 8 bytes this created a conversion error to 32 bits. This should work since all supported platforms define unsigned int to be 4 bytes. However, that is not necessarily going to remain true. C99 defines a header file called stdint.h that defines various int types. One of them is uint32_t. Unfortunately, only Linux 6x and 7x have this header file. It might be a while until other platforms have releases with this functionality.
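The width issue described in these entries can be illustrated in Python, where integers have no fixed size. This is only a sketch: the helper name to_uint32 is hypothetical, and the "wide" value merely simulates a checksum carried in a 64bit long with stray high bits.

```python
import zlib

UINT32_MASK = 0xFFFFFFFF


def to_uint32(value):
    """Keep only the low 32 bits, as storing into a 4-byte unsigned int would."""
    return value & UINT32_MASK


data = b"enstore"
crc = to_uint32(zlib.adler32(data))

# A 64bit long can carry bits above bit 31; masking recovers the 32bit value.
wide = (0xDEADBEEF << 32) | crc
assert to_uint32(wide) == crc
```

Adler-32 only ever needs 32 bits, so any code path that compares or transmits the value should agree on that width.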
========== ./modules/Makefile ====================================================================================
added ecrc
========== ./sbin/.cvsignore ====================================================================================
ignore ecrc link
========== ./sbin/encpCut ====================================================================================
add ecrc
include encp_t
========== ./sbin/release-notes ====================================================================================
better parsing left to do: automatically specify current and previous version - it is hard coded now - which is terrible.
========== ./sbin/routes ====================================================================================
I've added d0endca3a to the file.
add stkenmvr10a,11a,out1,out2, clean up private lan
========== ./sbin/ADICDrvBusy ====================================================================================
remove DLT plots
========== ./sbin/checkPNFS ====================================================================================
move rc to inside fail loop
output result to stdout and not separate file that gets lost
========== ./sbin/netscan ====================================================================================
cut can be in 2 different places on linux machines
remove check for processors and disk space
add option to not check ipmi stuff
allow sendmail on srv2 from now on - needed for automated processing of helpdesk ticket summary
more attempts to ignore normal monitoring items
========== ./sbin/ntpset ====================================================================================
cut can be in 2 different places on linux machines
allow ntpdc for rh7 systems
========== ./sbin/readDcache ====================================================================================
correct typo on dccp file name
correct { } errors
remove rm of files in pnfs written via dcache because this is causing problems - rm before written to enstore
fix weak read test on cdf, fix different ports on cdf/stk
no alarms on errors, but final error code
========== ./sbin/silo-check ====================================================================================
allow tape to be in any library if sg is test
========== ./sbin/choose_ran_file ====================================================================================
Added the list of active volumes to the output at the request of ISA.
========== ./sbin/keytab_check ====================================================================================
cut can be in 2 different places on linux machines
========== ./sbin/tapes-burn-rate.py ====================================================================================
complete path to binaries
changes needed to make this run on the production nodes and not airedale
========== ./sbin/tapes-plot-sg.py ====================================================================================
complete-r path to binaries
complete path to binaries
========== ./src/Makefile ====================================================================================
remove a bad comment
========== ./src/alarm.py ====================================================================================
move non enstore import functions to enstore_functions2
========== ./src/atomic.py ====================================================================================
Moved the filesystem is corrupted test from encp.py to here to remove a listdir for each transfer. Also, fixed the way default errnos are determined.
Better error detection for file creation problems.
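A minimal sketch of the idea behind these two entries, under stated assumptions: the file is created exclusively, and the expensive directory listing is consulted only after creation has already failed. The function name create_exclusive, the use of EIO as the fallback errno and the exact corruption test are illustrative and are not taken from atomic.py.

```python
import errno
import os


def create_exclusive(path, mode=0o666):
    """Create `path` atomically; inspect the directory only if creation fails."""
    try:
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, mode)
    except OSError as exc:
        # Not every exception is guaranteed to carry a usable errno.
        code = getattr(exc, "errno", None) or errno.EIO
        directory = os.path.dirname(path) or "."
        # The listing runs only on this error path, never per transfer.
        try:
            listed = os.path.basename(path) in os.listdir(directory)
        except OSError:
            listed = False
        if code == errno.EEXIST and not listed:
            # The OS says the file exists but the directory does not list it.
            raise OSError(errno.EIO, "filesystem looks corrupted", path)
        raise
```

In the common case the function returns right after os.open(), so a directory with many entries costs nothing extra.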
========== ./src/delete_at_exit.py ====================================================================================
move non enstore import functions to enstore_functions2
========== ./src/e_errors.py ====================================================================================
Fixed a bug in is_ok().
Added various is_XXX() functions. These test for retriable, non-retriable, alarmable and resendable error conditions. Also, includes a test for OK.
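The status helpers might look roughly like the sketch below. It assumes a status is a tuple of length two whose first element is the error code, and that a ticket is a dictionary holding that tuple under 'status' (as the enstore_functions.py entry describes for is_ok()). The constant values and the contents of RETRIABLE are hypothetical placeholders.

```python
OK = "ok"  # assumed value of the success code
RETRIABLE = frozenset(["TIMEDOUT", "NET_ERROR"])  # hypothetical error names


def _status_of(value):
    """Accept a (code, detail) tuple or a dict holding one under 'status'."""
    if isinstance(value, dict):
        value = value.get("status", (None, None))
    return value


def is_ok(value):
    """True when the status code equals OK."""
    return _status_of(value)[0] == OK


def is_retriable(value):
    """True when the status code is one the caller may retry."""
    return _status_of(value)[0] in RETRIABLE
```

Callers can then write is_ok(ticket) or is_ok(ticket['status']) interchangeably instead of comparing status fields by hand.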
========== ./src/encp.py ====================================================================================
bumping version to v2_19 because of encpCut
Fixed more potential problems with errno usage.
Included a check on the input file for reads to make sure the os filesize and the pnfs filesize match.
Fixed a bug when writing to enstore from dcache. If the file only had read permissions then an unnecessary test for write permissions (file and directory) would falsely fail the transfer.
Fixed a dcache interfacing problem. Encp was trying to make sure the output file had write privileges. It shouldn't even care what the output file's permissions are when dcache is involved. It cannot be assumed that all system exceptions contain the attribute errno. Defaults are now in place where needed.
Moved the FSCORRUPTED test to atomic.py. This removed doing a listdir for each write transfer.
Moved a call to os.listdir() outside of a loop. This could be a performance hit for large directories and multi-file transfers. Added the unique id to some more log messages...
Added some log messages. These are to help trace when encp does certain things. Others are to make the unique_id more usable.
Cleaned up the code to use new functions from e_errors for checking the status fields of tickets.
Spelling fix: recieved -> received. Created the setup_signal_handling() function. Moved code from "__main__" to do so.
Added a log message that associates unique_id with filenames. Changed FNCTL.O_NONBLOCK to os.O_NONBLOCK in anticipation of using python 2.2.
bumping version to x2_18_2 because of encpCut
Fixed an internal comment.
Made changes to handle problems with 64bit OSF nodes making encp requests. These machines could handle file sizes larger than 2GB-1 within a C int type variable. When the integer was placed in the stringified dictionary it did not contain an appended L for long type. It was this missing L that caused the problem for the library manager since it runs on a 32bit Linux node.
venor is now spelled vendor. "unable to registor bfid" error now a warning not informational. Bug fix to the internal error handling mechanism.
bumping version to x2_18_1 because of encpCut
move non enstore import functions to enstore_functions2
Change to pass mylint.py.
Three bug fixes. 1) Empty permissions left after read. 2) OS file size zero and pnfs layer 4 size correct after writes. 3) Files written under 2) were allowed to be read.
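The read-side size check mentioned in this log can be sketched as follows. The function name check_read_size and the choice of ValueError are assumptions made for illustration; how encp obtains the recorded size and reports the failure is not shown here.

```python
import os


def check_read_size(path, recorded_size):
    """Raise if the size the OS reports differs from the recorded size."""
    actual = os.stat(path).st_size
    if actual != recorded_size:
        raise ValueError(
            "size mismatch for %s: os reports %d, recorded %d"
            % (path, actual, recorded_size))
    return actual
```

A file that shows zero length in the filesystem while a non-zero size was recorded at write time fails this comparison before any tape is mounted.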
========== ./src/enstore_constants.py ====================================================================================
add D_MPD_FILE
no trace levels less than 6
add extra page support to plot page
check more the mover and lm status tickets
add fields for lm
ping node before rcping
========== ./src/enstore_files.py ====================================================================================
add handling of timestamp from client on log line
add mounts/drive type plots
move non enstore import functions to enstore_functions2
remove enstore_files import, use enstore_functions2
check more the mover and lm status tickets
add import
type
check for pending queue as []
check if lm queue is valid first
========== ./src/enstore_html.py ====================================================================================
add mounts/drive type plots
bug fixes
change functions to functions2
add enstatusonlypage back in
remove enstore_files import, use enstore_functions2
add volume_audit to alarm page
remove safe_dict
make sure lm queue is a dict
========== ./src/enstore_make_plot_page.py ====================================================================================
add label for chk_prod_code
add comma
add user_bytes
========== ./src/enstore_saag.py ====================================================================================
move non enstore import functions to enstore_functions2
========== ./src/enstore_status.py ====================================================================================
add volume name to messages
move non enstore import functions to enstore_functions2
check more the mover and lm status tickets
get lm state correct
get movers state right
check for drive_id
========== ./src/enstore_up_down.py ====================================================================================
move non enstore import functions to enstore_functions2
========== ./src/entv.py ====================================================================================
Now uses a .entvrc file to record the geometry of the window when it is closed. It is also possible to change the background color with the .entvrc file.
Typing on the command line after invoking entv in the foreground ("entv.py d0en" instead of "entv.py d0en &") would cause the status and message threads to abort (as if a ^C happened). Now only a ^C will kill entv.
Working on reducing CPU usage.
Doesn't crash when resized. Child window for mover status seems stable.
move non enstore import functions to enstore_functions2
Removed a debug return that prevented unused clients from being deleted.
Volume background now disappears along with the text. Other little fixes.
Everything seems to work *correctly* now.
Volume class gone. Trace class used. Debugging output (mostly) gone.
debug
Added a sleep call. This seems to limit the amount of CPU it uses.
Faster startup. With new movers, entv can determine the client machine at startup. Death of entv is handled more gracefully.
Speed up the initialization. Put mover timeouts in during initial status check. Before 0 through k movers were positioned for each k, now when k is positioned that is the only one positioned. Also, changed where the movers get drawn to.
This includes some general fixes. Mostly having to do with cleaner start and stopping. But some with location of graphics.
major cleanup. better threading. better startup. ability to reinitialize.
========== ./src/espion.py ====================================================================================
move non enstore import functions to enstore_functions2
========== ./src/enstore_functions.py ====================================================================================
Changed is_ok() to accept a status (aka a tuple of length two) or a dictionary with an element named 'status' that is a tuple of length two. Previously, it only accepted the dictionary.
move read_erc to enstore_erc_functions
move non enstore import functions to enstore_functions2
add subscribe for new config msg
bug in format
ping is different on d0ensrv2
ping node before rcping
========== ./src/pnfs.py ====================================================================================
Fixed bug if a directory named "root" was in the directory path of a command that took a pnfs id as argument. Enstore was confusing this 'root' directory with the "root" pnfs directory.
Added two new options, --tagchmod and --tagchown.
revert to previous version
if layer 4 is missing, instantiate it as a new file
take care of moved file
handle missing field in initialization
fix File.set_size() again
fix File.set_size()
Modified to use option.check_correct_count() to prevent the user from entering too many options by mistake. Removed erroneous retry attempts from the pnfs "No such file or directory" bug.
Fixed the --pnfs-state command.
Determines if the user is currently in a pnfs directory before allowing a tag to be written/read.
========== ./src/ftt_driver.py ====================================================================================
make diagnostic messages reflecting what seek does
========== ./src/inquisitor.py ====================================================================================
move non enstore import functions to enstore_functions2
check more the mover and lm status tickets
check ticket from mover for completeness
mover state check
remove safe_dict
remove old interface docs
========== ./src/inquisitor_plots.py ====================================================================================
add mounts/drive type plots
add import of enstore_functions2
move non enstore import functions to enstore_functions2
check if node is up before rcp
========== ./src/inventory.py ====================================================================================
move non enstore import functions to enstore_functions2
also output a file to be read in to create the total bytes counter
========== ./src/makeplot.py ====================================================================================
move non enstore import functions to enstore_functions2
========== ./src/monitored_server.py ====================================================================================
check more the mover and lm status tickets
add self and default
add fields for lm
check ticket from mover for completeness
========== ./src/mover.py ====================================================================================
made the same change to disk mover
do not calculate CRC in the tape thread if net thread detected a transfer failure
added volume name to alarm message
generate an alarm if mover is too long in certain states
fixed a bug
added log message after dismount completes
1. in assert_volume check a volume label and then decide if it is correct even if there is no label at all. 2. do not dismount tape on ENCP_GONE.
replace current_work_ticket with wrapper_dict in the vol_labels call
added trace for complete CRC
removed rewriting label for 9940B because firmware was fixed
Changed FNCTL.O_NONBLOCK to os.O_NONBLOCK in preparation for python 2.2.
use create_wrapper_dict to get an expected by wrapper dictionary
added handle_error
moved log message
use a vol_labels wrapper method to label blank tapes to match whatever the wrapper requires
Due to a HW problem a driver does not report a label on the mounted tape, which was causing a traceback during a forced dismount.
modified restart, added timestamps for mounting and mounted log messages
attempt to fix an attribute error when sending status while reinitializing a buffer
fixed bug
changes related to volume assert and drive type in the mounting and mount log messages
1st running code for the volume assert
volume assert bugs fixed
volume assert added
fixed lint complaint
changed message in stop_draining
better way of dealing with specifics of 9940B drive, actually it is bug in the firmware
Added hack to write label twice. On initial tape density conversion in a 9940B, the drive fails to write the filemark. Rewriting the label and filemark covers up the problem.
Silly bug in pad subtraction fixed and neat Wayne's shortcut for pad calculation
read last block with padding and remove pad length from CRC checks this is due to an ILI/ENOMEM on 9940B when you try to read less than full block
return client_ip in the status message
move non enstore import functions to enstore_functions2
added client hostname to status info
========== ./src/null_wrapper.py ====================================================================================
added vol_labels
========== ./src/plotter.py ====================================================================================
add debug messages
add extra page support to plot page
move non enstore import functions to enstore_functions2
========== ./src/verify_db.py ====================================================================================
make mylint happy
use low level cursor to make it much faster
========== ./src/volume_clerk.py ====================================================================================
fix check veto list
guard inquire_vol() against external_label == None
log when listing all volumes
add a missing return
========== ./src/file_clerk_client.py ====================================================================================
survive premature failure of get_brand()
========== ./src/option.py ====================================================================================
Added the --get-asserts for the library manager. Added the --mover-timeout for the volume_assert.
Added two new options, --tagchmod and --tagchown.
Added the function check_correct_count(). This allows the code to check if the user specified extra options that were not expected.
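A check of this kind could be sketched as below. The standalone signature shown here (a list of leftover arguments plus the number expected) is hypothetical; the real check_correct_count() in option.py may take different parameters and report errors differently.

```python
def check_correct_count(args, expected):
    """Return an error message if `args` holds more than `expected` entries, else None."""
    extra = args[expected:]
    if extra:
        return "unexpected extra argument(s): %s" % " ".join(extra)
    return None
```

For example, check_correct_count(["fileA", "fileB"], 1) reports "fileB" as unexpected, while check_correct_count(["fileA"], 1) returns None.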
========== ./src/configuration_client.py ====================================================================================
bump up trace severities
remove print
add subscribe for new config msg
========== ./src/event_relay.py ====================================================================================
remove some log messages
========== ./src/file_clerk.py ====================================================================================
log when listing the tape
ignore pnfs_mapname if the record does not have it
========== ./src/enstore_display.py ====================================================================================
Negative positions are possible. The code did not take this into account. The regular expressions parsing the geometry now handle negative positions.
Now uses a .entvrc file to record the geometry of the window when it is closed. It is also possible to change the background color with the .entvrc file.
The checkbutton on the menubar is now set to on initially. Also, the resize event -- from a user -- waits a finite amount of time until the window contents are redrawn. Should extra resize events occur the wait time is reset to the full wait time. This should cut down on the number of times the canvas is redrawn when the user is resizing the window.
Added the menubar with the option to turn off animation.
Working on reducing CPU usage.
Fixed the MoverDisplay windows. User can click on a connection and it will change color. Other general cleanups.
Doesn't crash when resized. Child window for mover status seems stable.
move non enstore import functions to enstore_functions2
Removed a debug return that prevented unused clients from being deleted.
Volume background now disappears along with the text. Other little fixes.
When a connection is terminated the line is removed from the display.
Everything seems to work *correctly* now.
Volume class gone. Trace class used. Debugging output (mostly) gone.
Font sizes are good. Some font color changes too. Changes to the timer. Code cleanup.
The font selection for the mover text will select a size to fit in the designated space. Geometry selection is cleaner.
Bug fix for font size selection.
Faster startup. With new movers, entv can determine the client machine at startup. Death of entv is handled more gracefully.
Speed up the initialization. Put mover timeouts in during initial status check. Before 0 through k movers were positioned for each k, now when k is positioned that is the only one positioned. Also, changed where the movers get drawn to.
This includes some general fixes. Mostly having to do with cleaner start and stopping. But some with location of graphics.
major cleanup. better threading. better startup. ability to reinitialize.
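The geometry parsing fix noted in this log can be illustrated with a small sketch. It assumes the usual Tk geometry form WIDTHxHEIGHT+X+Y, in which an offset may itself be negative (for example "+-20"). The names _GEOMETRY and parse_geometry are illustrative, not those used in enstore_display.py.

```python
import re

# width x height, then a +/- anchor and a possibly negative offset for x and y
_GEOMETRY = re.compile(r"^(\d+)x(\d+)([+-])(-?\d+)([+-])(-?\d+)$")


def parse_geometry(text):
    """Split a geometry string into (width, height, x_anchor, x, y_anchor, y)."""
    match = _GEOMETRY.match(text.strip())
    if match is None:
        raise ValueError("unrecognised geometry: %r" % (text,))
    width, height, x_anchor, x, y_anchor, y = match.groups()
    return int(width), int(height), x_anchor, int(x), y_anchor, int(y)
```

With this pattern "800x600+-20+30" yields an x offset of -20, where a pattern that only accepts digits after the sign would fail to match.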
========== ./src/library_manager.py ====================================================================================
fixed a bug in the processing write requests
This change returns a status to volume_assert.py on the initial request.
Undo the previous changes.
Made changes to the volume assert code. Largest of which is to add the assert pending and active queues to the output from --get-queue.
These changes are for the volume assert test.
queue requests with restricted host access, do not ignore them
fixed a bug introduced in the previous rev.
expanded functionality of restrict_host_access
more flexible match in restrict_host_access
explicitly delete a 3rd argument returned by discipline for restrict_version_access
added diagnostic messages
do not process adminpri write requests if at least one for the given volume family has been processed
modified postponed queue
add subscribe for new config msg
change affecting only disk movers
========== ./src/host_config.py ====================================================================================
Fixed a problem with the file passing mylint.py/pychecker.
This file was not correctly handling the case where a hostip line was listed in the enstore.conf file, but no interfaces listed.
========== ./src/volume_clerk_client.py ====================================================================================
fix --set-comment argument counting bug
sort file list according to location cookie
========== ./src/cpio_odc_wrapper.py ====================================================================================
make ticket optional
add vol_labels procedure
========== ./src/cern_wrapper.py ====================================================================================
fix default for declaration date
fix vol_labels call
fix indenting
remove unused DEVICE
add fermi specific info
fix bug in getting info from ticket
lots of changes, backup
bug fixes
========== ./src/discipline.py ====================================================================================
make deepcopy of arguments to return, because if not and args are modified, the original args are modified as well
discard changes made in the previous release
do not reread configuration information
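The aliasing problem behind the deepcopy entry can be shown with a short sketch. The Rule class, its lookup() method and the layout of the argument list are hypothetical; only the use of copy.deepcopy to protect stored arguments reflects the log message.

```python
import copy


class Rule:
    """Holds a function name and its (nested, mutable) arguments."""

    def __init__(self, function_name, args):
        self.function_name = function_name
        self.args = args

    def lookup(self):
        """Return the rule's arguments without exposing the stored objects."""
        return self.function_name, copy.deepcopy(self.args)


rule = Rule("restrict_host_access", [["chimichanga"], 3])
_, args = rule.lookup()
args[0].append("snickers")  # the caller mutates its own copy
assert rule.args == [["chimichanga"], 3]  # stored arguments are unchanged
```

A shallow copy would not be enough here, because the inner list would still be shared between the caller and the stored rule.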
========== ./src/ratekeeper.py ====================================================================================
move non enstore import functions to enstore_functions2
========== ./src/show_volume_cgi.py ====================================================================================
calculate bytes written
========== ./src/generic_server.py ====================================================================================
check for msg existence first
add subscribe for new config msg
========== ./src/scanfiles.py ====================================================================================
skip symbolic links
ignore .removed directories
ignore .bad directory from top-down scan
use alternative path if necessary. take alternative path into consideration while comparing the path
make path mismatch to be warning only
fix drive comparison
fix typo
deal with missing field when missing layer 4
take care of no layer 4 exceptions
fix a typo
now does batch
take care of missing layer 1 and/or layer 4
take care of very large file size
make mylint and pychecker happy
consistent treatment to symbolic links
protected for missing keys
skip volmap directories
guard for non-enstore file
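Several of these entries describe what a scan should skip. A minimal sketch of such a walk is shown below; the directory names come from the messages in this log, while the function name iter_regular_files and the use of os.walk are assumptions rather than the way scanfiles.py is written.

```python
import os

SKIP_DIRS = {".removed", ".bad", "volmap"}  # names taken from the log messages


def iter_regular_files(top):
    """Yield file paths under `top`, skipping symlinks and ignored directories."""
    for dirpath, dirnames, filenames in os.walk(top):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames
                       if d not in SKIP_DIRS
                       and not os.path.islink(os.path.join(dirpath, d))]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                yield path
```

Pruning dirnames in place is what keeps the walk from ever entering the ignored trees, which is cheaper than filtering their files afterwards.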
========== ./src/monitor_client.py ====================================================================================
If the node name cannot be resolved into an ip, skip the node.
Changed FNCTL.O_NONBLOCK to os.O_NONBLOCK in preparation for python 2.2.
move non enstore import functions to enstore_functions2
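Skipping nodes whose names do not resolve might be done along the lines of the sketch below. The function name resolvable_nodes and the injectable resolve parameter are illustrative; the parameter exists only so the behaviour can be exercised without a network.

```python
import socket


def resolvable_nodes(names, resolve=socket.gethostbyname):
    """Return (name, ip) pairs, leaving out names that do not resolve."""
    resolved = []
    for name in names:
        try:
            resolved.append((name, resolve(name)))
        except OSError:
            # socket.gaierror and socket.herror are OSError subclasses;
            # an unresolvable node is skipped instead of aborting the run.
            continue
    return resolved
```

One bad entry in the node list then costs a single failed lookup while the remaining nodes are still checked.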
========== ./src/enstore_overall_status.py ====================================================================================
bug fixes
change output directory
remove enstore_files import, use enstore_functions2
use enstore_functions2
make enstore_overall_status run on hppc
lint fix
reduce frequency of emails
send mail when can't rcp from a node for overall status page
ping node before rcping
========== ./src/monitor_server.py ====================================================================================
Changed FNCTL.O_NONBLOCK to os.O_NONBLOCK in preparation for python 2.2.
move non enstore import functions to enstore_functions2
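For reference, the os.O_NONBLOCK constant named in this log is typically applied to a descriptor as in the sketch below. The helper name set_nonblocking is hypothetical, and the example is written for current Python on a Unix system, not for the Python 2.2 era code these notes describe.

```python
import fcntl
import os


def set_nonblocking(fd):
    """Add O_NONBLOCK to the file status flags of descriptor `fd`."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
```

Reading the existing flags first and OR-ing the new bit in preserves whatever other status flags were already set on the descriptor.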
========== ./src/enstore_saag_network.py ====================================================================================
move non enstore import functions to enstore_functions2
========== ./src/library_manager_client.py ====================================================================================
Undo the previous changes.
Made changes to the volume assert code. Largest of which is to add the assert pending and active queues to the output from --get-queue.
These changes are for the volume assert test.
========== ./src/enstore_plots.py ====================================================================================
do not traceback when there are no xfers in a day
add mounts/drive type plots
move non enstore import functions to enstore_functions2
========== ./src/manage_queue.py ====================================================================================
Undo the previous changes.
Made changes to the volume assert code. Largest of which is to add the assert pending and active queues to the output from --get-queue.
========== ./tools/pychecker/Config.py ====================================================================================
Updating to pychecker 0.8.6.
========== ./tools/pychecker/OP.py ====================================================================================
Updating to pychecker 0.8.6.
========== ./tools/pychecker/Stack.py ====================================================================================
Updating to pychecker 0.8.6.
========== ./tools/pychecker/__init__.py ====================================================================================
Updating to pychecker 0.8.6.
========== ./tools/pychecker/checker.py ====================================================================================
Updating to pychecker 0.8.6.
========== ./tools/pychecker/warn.py ====================================================================================
Updating to pychecker 0.8.6.