About

For many years I have been dealing with the HP OpenView OA (Operations Agent), either in its full incarnation of the software stack or just the OVPA (Performance Agent) part.

For a variety of reasons the software will at times cease working or simply refuse to start. Here are some key notes on troubleshooting such issues.

oacore hangs at startup with “communication” problems

Problem

# perfstat -p

list of performance tool processes:
----------------------------------

 Perf Agent status:
    Running midaemon              (Measurement Interface daemon) pid 15866
    Running ttd                   (ARM registration daemon) pid 16987

 Perf Agent Server status:

    Running ovcd                  (OV control component) pid 8591
    Running ovbbccb               (BBC5 communication broker) pid 10347
WARNING: oacore is not active (oacore component)
WARNING: perfalarm is not active (alarm generator)
OV Operation Agent status:
oacore      Operations Agent Core               AGENT,OA              Aborted
ovbbccb     OV Communication Broker             CORE         (10347)  Running
ovcd        OV Control                          CORE         (8591)   Running
ovconfd     OV Config and Deploy                COREXT       (12927)  Running


******** (end of perfstat output: note above warning) ********

Typical log entries in /var/opt/OV/log/System.txt look like this:

0: INF: Sun May 29 11:00:28 2016: ovcd (7007/53): (ctrl-214) oacore has been exited. PID: 25488
0: WRN: Sun May 29 11:00:28 2016: ovcd (7007/53): (ctrl-208) Component 'oacore' with pid 25488 exited with exit value '1'. Restarting component.
0: ERR: Sun May 29 11:00:43 2016: ovcd (7007/3): (ctrl-42) Initialization of component 'oacore' failed. Stopping component.
0: INF: Sun May 29 11:01:10 2016: oacore (961/1): Collection intervals: Process = 60   Global = 300   DataFile Rollover% = 20.
0: INF: Sun May 29 11:01:12 2016: oacore (961/1): Initializing CODA
0: INF: Sun May 29 11:01:12 2016: oacore (961/1): No CODA files to process
0: INF: Sun May 29 11:01:12 2016: oacore (961/1): Waiting to initialize SCOPE ...
0: INF: Sun May 29 11:01:17 2016: oacore (961/1): SCOPE datasource initialization succeeded
0: INF: Sun May 29 11:01:17 2016: oacore (961/1): OA datasource initialization succeeded
0: INF: Sun May 29 11:01:24 2016: ovcd (7007/57): (ctrl-214) oacore has been exited. PID: 961
0: ERR: Sun May 29 11:01:24 2016: ovcd (7007/57): (ctrl-209) Component 'oacore' exited with '1', automatic restart limit exceeded. Use 'ovc -start oacore'.
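
To spot this condition quickly, the log can be filtered for the ctrl-209 message. The following is a minimal sketch, not part of the agent tooling: the function name restart_limit_components is made up for illustration, and the pattern assumes the message wording shown in the entries above.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every component for which ovcd
# logged a ctrl-209 "automatic restart limit exceeded" message.
# $1 is the log file to scan (System.txt in the example above).
restart_limit_components() {
    # Capture the quoted component name from each ctrl-209 line,
    # then collapse duplicates.
    sed -n "s/.*(ctrl-209) Component '\([^']*\)' exited.*/\1/p" "$1" | sort -u
}

# Usage:
#   restart_limit_components /var/opt/OV/log/System.txt
```

Any component it prints has to be started by hand, as the log message itself suggests.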

Solution

In this case the fix is to make sure that the CORE components start before any other components:

# /opt/OV/bin/ovc -status
(ctrl-112) Ovcd is being initialized..

# /opt/OV/bin/ovc -kill

# /opt/OV/bin/ovc -start CORE

# /opt/OV/bin/bbcutil -deregister "*"

  (bbc-292) The OV Communication Broker on host 'localhost' denied the request due to an authorization failure. Ensure the proper SSL certificates
         are installed and configured.

# /opt/OV/bin/ovc -kill

# /opt/OV/bin/ovc -start

# perfstat -p

list of performance tool processes:
----------------------------------

 Perf Agent status:
    Running midaemon              (Measurement Interface daemon) pid 5701
    Running ttd                   (ARM registration daemon) pid 11248

 Perf Agent Server status:

    Running ovcd                  (OV control component) pid 19139
    Running ovbbccb               (BBC5 communication broker) pid 20517
    Running oacore                (Operations Agent Core) pid(s) 21945
       Configured DataSources(2)
                  SCOPE
                  CODA

    Running perfalarm             (alarm generator) pid(s) 16087
OV Operation Agent status:
hpsensor    HP Compute Sensor                   AGENT,OA     (22889)  Running
oacore      Operations Agent Core               AGENT,OA     (21945)  Running
ovbbccb     OV Communication Broker             CORE         (20517)  Running
ovcd        OV Control                          CORE         (19139)  Running
ovconfd     OV Config and Deploy                COREXT       (23976)  Running
rtmd        HP Real Time Measurement            AGENT                 Aborted


************* (end of perfstat -p output) ****************
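
Note that rtmd still shows as Aborted in the output above even though oacore is back. A small filter makes such components easier to spot. This is only a sketch: the function name aborted_components is hypothetical, and it assumes the state is the last column of each line, as in the status table shown here.

```shell
#!/bin/sh
# Hypothetical helper: read an `ovc -status` style table on stdin and
# print the short name (first column) of every component whose last
# column is "Aborted".
aborted_components() {
    awk '$NF == "Aborted" { print $1 }'
}

# Usage:
#   /opt/OV/bin/ovc -status | aborted_components
```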

midaemon exits after startup

Problem

The midaemon log file, /var/opt/perf/status.mi, shows:

Unable to allocate MI shared memory. The current MI shared memory size is
too small. Possible MI subclass overflow (too  many processes, threads,
transactions, etc.) Further allocation errors will be suppressed. Terminate
the midaemon and restart using the -smdvss option. For more details see the
midaemon man page.

mi_create - MI pid structures (-11664) allocation failed
Not enough space.

Solution

Allocate a larger shared memory segment:

  1. Stop the Agent:
# /opt/perf/bin/ovpa stop
# /opt/perf/bin/midaemon -T
# /opt/perf/bin/ttd -k
  2. Search for any active midaemon shared memory segments using ipcs:
# ipcs -m | grep 0x0c6629c9
  3. Remove these entries from the shared memory table with ipcrm -m <ID>. Example:
# ipcs -m | grep 0x0c6629c9
m            502562828 0x0c6629c9 --rw-r-----      root       sys
# ipcrm -m 502562828
# ipcs -m | grep 0x0c6629c9
  4. Edit the /etc/rc.config.d/ovpa configuration and alter the MIPARMS parameter:
MIPARMS="-p -smdvss 512M"
  5. Start the Agent:
# /opt/perf/bin/ovpa start
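
The lookup of the segment ID can be scripted instead of copied by hand. The sketch below rests on two assumptions: the ipcs -m column order seen in the example above (type, ID, key), and the key 0x0c6629c9 taken from that same example. The function name shm_ids_for_key is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `ipcs -m` output on stdin and print the ID of
# every shared memory segment whose key equals $1.
# Assumes the column order from the example above: type, ID, key, mode, ...
shm_ids_for_key() {
    awk -v key="$1" '$1 == "m" && $3 == key { print $2 }'
}

# Usage (as root, after the agent has been stopped):
#   for id in $(ipcs -m | shm_ids_for_key 0x0c6629c9); do
#       ipcrm -m "$id"
#   done
```

Matching on the key column rather than grepping the whole line avoids removing a segment that merely contains the same string elsewhere in its entry.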
