
Workload Scheduler events on version 9.4 not working after switch over in production


Looking at the conman showcpus (sc) output, the STATE is incorrect:

%sc
CPUID  RUN  NODE          LIMIT  FENCE  DATE      TIME   STATE    METHOD  DOMAIN
TWS1   842  *UNIX MASTER  10     0      06/14/17  09:06  I J MDe          MDM

After JnextPlan, everything else still works properly, but the STATE field still reports MDe instead of MDEA.

From the WAS logs:

com.ibm.tws.util.jmx.JMXBrowser.getSSLAttributeList(JMXBrowser.java:391)
at com.ibm.tws.util.jmx.JMXBrowser.loadSSLServerConfiguration(JMXBrowser.java:1078)
at com.ibm.tws.util.jmx.JMXBrowser.getSSLTrustFilePassword(JMXBrowser.java:1061)
at com.ibm.tws.event.EIFListener.addSSLCertsProperties(EIFListener.java:668)
at com.ibm.tws.event.EIFListener.loadServerProperties(EIFListener.java:641)
at com.ibm.tws.event.EIFListener.generateConfigurationFile(EIFListener.java:310)
at com.ibm.tws.event.EIFListener.start(EIFListener.java:163)
at com.ibm.tws.conn.event.engine.EventRuleEngineImpl.startEventProcessor(EventRuleEngineImpl.java:638)
at com.ibm.tws.conn.event.engine.ConnEventRuleEngineBean.startEventProcessor(ConnEventRuleEngineBean.java:314)
at com.ibm.tws.conn.event.engine.EJSLocalStatelessConnEventRuleEngine_28e79c7e.startEventProcessor(Unknown Source)
at com.ibm.tws.conn.event.engine.ConnEventRuleEngineEjbLocalImpl.startEventProcessor(ConnEventRuleEngineEjbLocalImpl.java:245)
at com.ibm.tws.cli.events.command.StartEventProcCommand.execute(StartEventProcCommand.java:116)

Workload Scheduler is running, but events are not triggering...


Solution

  • To help understand the state:

    The STATE field has a lowercase e
    If the STATE field has a lowercase e, the event processor is installed
    but not running. Start the event processor using the conman startevtproc
    command, or the Dynamic Workload Console. If you use conman, for
    example, you will see the following output:
    %startevtproc
    AWSJCL528I The event processor has been started successfully.

    The STATE field has no M
    If the STATE field has no M, monman is not
    running. Start monman using the conman startmon command. You will see
    the following output:
    %startmon
    AWSBHU470I A startmon command was issued for CPU_MASTER.

    The STATE field has no D
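
The flag checks described above can be sketched as a small shell helper. This is only an illustration: check_state_flags is a hypothetical name, the input is assumed to be the last group of the STATE column (for example MDe), and an uppercase E is assumed to mean "running" because MDEA is the expected healthy value. The remedy for a missing D is not given in the text above, so the sketch only reports it.

```shell
#!/bin/sh
# Hypothetical helper: interpret the flag group at the end of the STATE
# column of the showcpus output (for example "MDe").
check_state_flags() {
    flags=$1

    # No M: monman is not running.
    case "$flags" in
        *M*) ;;
        *) echo "no M: monman is not running; start it with: conman startmon" ;;
    esac

    # Uppercase E is assumed to mean running; lowercase e means installed
    # but not running.
    case "$flags" in
        *E*) ;;
        *e*) echo "e: event processor installed but not running; start it with: conman startevtproc" ;;
    esac

    # No D: only reported, because the remedy is not described in the text.
    case "$flags" in
        *D*) ;;
        *) echo "no D: see the product documentation" ;;
    esac
}

check_state_flags "MDe"
```

With the MDe value from the output above, the helper reports only the lowercase e condition; with MDEA it prints nothing.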

    Resolution 1

    1) Check whether the event processor port stored in the
    database is wrong:
    db2 => select mpr_value from mdl.mpr_model_properties
    where mpr_name='EVPROC_HTTPS_PORT'

    This returns the event processor HTTPS port, which should
    be something like 31116; in this error case it could be 0 or -1.

    2) If the value is wrong, the correct value has to be
    saved. Run wastools/showHostProperties.sh to retrieve
    the value of httpsPort (suppose it is 31116).

    3) Update the database:
    db2 => UPDATE MDL.MPR_MODEL_PROPERTIES SET MPR_VALUE='31116'
    WHERE MPR_NAME='EVPROC_HTTPS_PORT'

    4) The change takes effect at the next JnextPlan.
    To apply the change immediately, first check whether
    Carry Forward is set to ALL (run "optman ls" and look at the cf value).
    If it is not set to ALL, take note of its current value and run the following:

    optman chg cf = ALL

    5) Run JnextPlan -for 0000

    6) If Carry Forward was not set to ALL, revert it
    to its original value with optman chg cf =

    7) Run updateWas.sh -user -password

    where the user is the primary admin ID (TWS admin user).

    The user specified in the useropts file should be the user
    defined as the WAS primary admin ID in the security.xml file.
    These are the steps to perform:
    - Log in as tws_user.
    - Remove the file useropts_tws_user.
    - To recreate the file, launch "composer sc":
    you will be asked to specify a username and
    a password. Specify the primary admin ID
    as the user, together with its password.
    - Now run "conman stopeventprocessor".
    - Run "conman starteventprocessor".
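
Steps 1 to 6 of Resolution 1 can be sketched as a dry run that only prints the commands, so that they can be reviewed before anything is changed. The helper names is_valid_port and print_fix_plan are hypothetical, and the value YES passed as the current Carry Forward setting is just an example; the printed commands are the ones quoted in the steps above.

```shell
#!/bin/sh
# Hypothetical helpers; they only print commands, nothing is executed.

# A usable port is a whole number between 1 and 65535, so the 0 and -1
# values seen in the error case are rejected.
is_valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Print the commands of steps 3 to 6 for a given port (from
# showHostProperties.sh) and the current cf value (from "optman ls").
print_fix_plan() {
    port=$1
    cf=$2
    is_valid_port "$port" || { echo "invalid port: $port" >&2; return 1; }
    echo "UPDATE MDL.MPR_MODEL_PROPERTIES SET MPR_VALUE='$port' WHERE MPR_NAME='EVPROC_HTTPS_PORT'"
    # Carry Forward is only switched (and later restored) if it is not ALL.
    [ "$cf" = "ALL" ] || echo "optman chg cf = ALL"
    echo "JnextPlan -for 0000"
    [ "$cf" = "ALL" ] || echo "optman chg cf = $cf"
}

print_fix_plan 31116 YES
```

Printing instead of executing keeps the sketch safe to run on a production master; the UPDATE statement still has to be entered at the db2 prompt as shown in step 3.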

    If the problem still occurs:
    - Check whether the planman command has an issue: planman showinfo
    - Check the /etc/TWA/twainstanceX.TWA.properties file to see whether EWas_basePath
    is correct. (The default is /opt/IBM/WebSphere/AppServer.)
    If neither check reveals an issue, run truss on the problematic command
    using the following syntax:
    truss -o /tmp/truss_conman.out conman "stopappserver;wait"
    Check the truss_conman.out output file and look for an error similar to
    this:

    /1: stat("/opt/IBM/WebSphere/AppServer", 0xFFFFFD7FFFDEFB00) Err#13 EACCES [file_dac_search]
    /1: write(2, " A W S B H U 6 2 6 W T".., 76) = 76
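
As a rough aid for reading a long truss output file, the following sketch lists the paths whose system calls failed with Err#13. The function name find_eacces_paths is hypothetical, and the sketch assumes each failing call appears on a single line with the path as its first quoted argument, as in the sample above.

```shell
#!/bin/sh
# Hypothetical helper: list the paths that failed with Err#13 (EACCES)
# in a truss output file such as /tmp/truss_conman.out.
find_eacces_paths() {
    # Keep only the failing calls, extract the quoted path argument,
    # and drop duplicates.
    grep 'Err#13' "$1" | sed -n 's/.*("\([^"]*\)".*/\1/p' | sort -u
}

# Run only if the truss output from the command above exists.
if [ -f /tmp/truss_conman.out ]; then
    find_eacces_paths /tmp/truss_conman.out
fi
```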

    Resolution 2

    Consult your UNIX system administrator to investigate the cause of
    the operating-system-level error code 13 (EACCES).

    Compare the permissions of /opt, /opt/IBM, /opt/IBM/WebSphere,
    /opt/IBM/WebSphere/AppServer, and their subdirectories against a working
    environment, and correct any that differ.

    Correct permissions should look like the following:

    drwxr-xr-x 6 root root 4096 Apr 8 2015 ibm
    drwxr-xr-x 18 root root 4096 Oct 3 2016 IBM

    drwxr-x--- 2 root root 4096 Apr 17 12:34 CAP
    drwxr-xr-x 6 root root 4096 Jan 15 2016 IMShared
    drwxr-xr-x 5 root root 4096 Apr 13 2015 InstallationManager
    drwxr-xr-x 3 root root 4096 Apr 13 2015 ISA
    drwxr-xr-x 14 root root 4096 Jun 21 15:12 JazzSM
    drwxr-xr-x 3 root root 4096 Apr 8 2015 tsamp
    drwxr-xr-x 2 m92 root 4096 Mar 1 09:59 TWA
    drwxr-xr-x 8 root root 4096 May 24 12:18 TWAUI
    drwxr-xr-x 3 root root 4096 Aug 6 2015 WebSphere

    drwxr-xr-x 37 root root 4096 Jun 21 15:11 AppServer
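
To collect a comparable listing on both the failing and the working environment, each level of the path can be inspected in turn. The sketch below is an assumption about how one might do this; path_chain is a hypothetical helper that prints a directory and each of its parents, top down, and the loop then runs ls -ld on every level.

```shell
#!/bin/sh
# Hypothetical helper: print a directory and each of its parents,
# top down, one per line (the root directory itself is skipped).
path_chain() {
    path=$1
    chain=
    while [ -n "$path" ] && [ "$path" != "/" ] && [ "$path" != "." ]; do
        chain="$path
$chain"
        path=$(dirname "$path")
    done
    printf '%s' "$chain"
}

# Show owner and permissions of every level; a level that cannot be
# read is reported instead of stopping the loop.
path_chain /opt/IBM/WebSphere/AppServer | while IFS= read -r dir; do
    ls -ld "$dir" || echo "cannot access: $dir"
done
```

Running the same loop as the TWS user on both systems and comparing the two outputs should show which directory level lacks the permissions listed above.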