ATG Endeca Commerce: Indexing fails at the last-mile crawl


Indexing fails with the following error at the EndecaScriptService stage.

The CAS crawl fails with a weird SocketTimeoutException that occurs seemingly at random. I am not sure whether this is caused by the size of the data set being indexed.

INFO: Starting baseline CAS crawl with id 'art-last-mile-crawl'.
Aug 02, 2017 11:18:44 AM com.endeca.soleng.eac.toolkit.script.Script runBeanShellScript
SEVERE: Error starting baseline crawl 'art-last-mile-crawl'.
Occurred while executing line 11 of valid BeanShell script:
[[

 8|      Dgidx.cleanDirs();
 9|
10|      // run crawl and archive any changes in dvalId mappings
11|      CAS.runBaselineCasCrawl("art-last-mile-crawl");
12|      CAS.archiveDvalIdMappingsForCrawlIfChanged("art-last-mile-crawl");
13|
14|      // archive logs and run the indexer

]]

Aug 02, 2017 11:18:44 AM com.endeca.soleng.eac.toolkit.Controller execute
SEVERE: Caught an exception while invoking method 'run' on object 'BaselineUpdate'. Releasing locks.
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at com.endeca.soleng.eac.toolkit.Controller.invokeRequestedMethod(Controller.java:933)
        at com.endeca.soleng.eac.toolkit.Controller.execute(Controller.java:271)
        at com.endeca.soleng.eac.toolkit.Controller.main(Controller.java:138)
Caused by: com.endeca.soleng.eac.toolkit.exception.AppControlException: Error executing valid BeanShell script.
        at com.endeca.soleng.eac.toolkit.script.Script.runBeanShellScript(Script.java:180)
        at com.endeca.soleng.eac.toolkit.script.Script.run(Script.java:127)
        ... 7 more
Caused by: com.endeca.soleng.eac.toolkit.exception.CasCommunicationException: Error starting baseline crawl 'art-last-mile-crawl'.
        at com.endeca.eac.toolkit.component.cas.ContentAcquisitionServerComponent.startBaselineCasCrawl(ContentAcquisitionServerComponent.java:451)
        at com.endeca.eac.toolkit.component.cas.ContentAcquisitionServerComponent.runBaselineCasCrawl(ContentAcquisitionServerComponent.java:357)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at bsh.Reflect.invokeMethod(Unknown Source)
        at bsh.Reflect.invokeObjectMethod(Unknown Source)
        at bsh.Name.invokeMethod(Unknown Source)
        at bsh.BSHMethodInvocation.eval(Unknown Source)
        at bsh.BSHPrimaryExpression.eval(Unknown Source)
        at bsh.BSHPrimaryExpression.eval(Unknown Source)
Caused by: java.net.SocketTimeoutException: Read timed out
        at org.apache.axis.AxisFault.makeFault(AxisFault.java:101)
        at org.apache.axis.transport.http.HTTPSender.invoke(HTTPSender.java:154)
        at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
        at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
        at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
        at org.apache.axis.client.AxisClient.invoke(AxisClient.java:165)
        at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
        at org.apache.axis.client.Call.invoke(Call.java:2767)

Using ATG 11.2 & Endeca 11.2


Solution

  • From the ATG Support Documentation:

    "httpSocketTimeout" is the maximum period of inactivity, in milliseconds, between two consecutive data packets before the HTTP connection times out. An inadequate value here can cause the "Read timed out" error.

    By default, "httpSocketTimeout" is set to 60 seconds (60000 ms). Increasing it resolves the read timeout error.

    To fix this issue, edit /<app>/config/script/DataIngest.xml in the deployed app and add <property name="httpSocketTimeout" value="(some value)" />, with the value given in milliseconds.

    Your file will look something like this (180000 ms is 3 minutes):

    <property name="casHost" value="localhost" />
    <property name="casPort" value="8500" />
    <property name="httpSocketTimeout" value="180000" />
    <property name="numPartialsBackups" value="5" />
    

    You will need to experiment with the httpSocketTimeout value until the timeout no longer occurs, and you may need to increase it again later.
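
    For orientation, here is a rough sketch of where these properties usually sit in DataIngest.xml. The class name is taken from the stack trace above, and the id "CAS" matches the CAS.runBaselineCasCrawl(...) call in the BeanShell script. The host-id value and the overall element layout are assumptions based on a stock Deployment Template application, so check them against your own file. Other child elements that your component may contain are omitted here.

    ```xml
    <!-- Sketch only: host-id and layout are assumed, not taken from the original post. -->
    <custom-component id="CAS" host-id="CASHost"
        class="com.endeca.eac.toolkit.component.cas.ContentAcquisitionServerComponent">
      <properties>
        <property name="casHost" value="localhost" />
        <property name="casPort" value="8500" />
        <!-- Read timeout in milliseconds: 180000 ms = 3 minutes -->
        <property name="httpSocketTimeout" value="180000" />
        <property name="numPartialsBackups" value="5" />
      </properties>
    </custom-component>
    ```

    If your file defines the CAS component differently, only the httpSocketTimeout line needs to be added alongside the existing casHost and casPort properties.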