Q: What has changed from version 1.2.0 to 1.3.0?
A: Below is a list of changes:
  • datacondAPI:

    • Occasionally a strange error message like this is seen:
        LDAS-DEV35212650 error!
        aborted after 1022.36 seconds in datacondAPI.
        You have encountered an LDAS system bug!
        This condition will be automatically reported to the developer
        who is responsible for maintaining the datacond API.
        User command LDAS-DEV35212650 aborting!
      
      This happens when the managerAPI aborts a job that is hung in the datacondAPI before the datacondAPI has had a chance to try to unhang it. This typically results in a memory leak.

      To avoid this kind of failure, the manager API resource variable ::MANAGER_ABORT_AFTER_N_SECONDS_IN_DATACOND_API should be set at least 200 seconds higher than the datacond API resource variable ::DC_LONG_RUNNING_THREAD_WARNING.
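
      As a minimal sketch of this rule, the two settings might look like the following in their respective resource files. The numeric values are illustrative assumptions, not recommended defaults, and the sketch assumes that ::DC_LONG_RUNNING_THREAD_WARNING is expressed in seconds.

```tcl
# In the datacond API resource file.
# Hypothetical example value, assumed to be in seconds.
set ::DC_LONG_RUNNING_THREAD_WARNING 900

# In the manager API resource file.
# At least 200 seconds more than the datacond value above (900 + 200).
set ::MANAGER_ABORT_AFTER_N_SECONDS_IN_DATACOND_API 1100
```

      If the datacond warning threshold is later raised, the manager abort value needs to be raised with it so that the 200-second margin is preserved.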

  • diskcacheAPI:

    • There is a known issue in the diskcacheAPI that causes the loss of metadata when a RAID archive server that is visible to the diskcache API is rebooted.

      If an archive server is rebooted, the diskcache API should be restarted via the Control and Monitor Interface.

    • There is a new resource variable in LDASdiskcache.rsc for controlling the excessive email generated when rapidly updating directories exist on a system.

      The LDASdiskcache.rsc resource variable ::IGNORE_REMOVED_DIRS_UNDER_MTPT can be set to a list of directory names that are otherwise defined in the ::MOUNT_PT resource variable to reduce the number of email-level log messages.

      Any ::MOUNT_PT entry that is also in the ::IGNORE_REMOVED_DIRS_UNDER_MTPT list will only generate email when a CONFLICTING DATA error occurs. Log entries that would otherwise raise email due to REMOVED type errors will be logged with a purple ball.

      Note that ADDED type 'errors' are always logged with blue balls unless another type of error occurred under the same mount point, in which case they are logged as part of the real error report to facilitate diagnostic analysis.

      Example:

      set ::IGNORE_REMOVED_DIRS_UNDER_MTPT [ list frames/full /samraw/extra ]

  • mpiAPI:

      There is a new utility that can be used to detect Linux system errors on the Beowulf cluster. It is called scancluster, and it is usually found in the /ldas/bin directory.
      Before use, the script should be examined and, if necessary, modified to set the base name of the Beowulf node machines. The default value is node.
      The scancluster utility is run on the Beowulf gateway, and it produces a report called scancluster.log.NNNNNNNNNN, where the file extension is the Unix timestamp.
      This utility issues rsh commands as the user running it to capture information from the system logs on all the nodes.
      The log file will have a separate section for each node on the cluster, and each node report will have two subsections. The first subsection will contain error conditions, and the second subsection will contain the last 10 lines of the /var/log/messages file.
      The scancluster log is inspected visually for system errors. Even a fairly large cluster can be examined relatively quickly by this method.
  • User Commands

    • The createRDS command has several new options. They are:
      • -framesperfile
      • -secperframe
      • -filechecksum
      • -allowshortframes
      • -generatechecksum
      • -fillmissingdatavalid
      The documentation for these new options can be found here.

    Closed Problem Reports



