ATG - Endeca: the baseline update is failing in the ART apps with the error below, although partial indexing succeeds.
The CAS logs for the corresponding error are included as well.
Aug 15, 2017 12:23:37 PM com.endeca.soleng.eac.toolkit.script.Script runBeanShellScript
SEVERE: Crawl 'ART-last-mile-crawl' failed with error: Problem running full acquisition on data source for ART-last-mile-crawl: Error reading from Record Store ART-data: malformed input around byte 10.
Occurred while executing line 11 of valid BeanShell script:
[[
8| Dgidx.cleanDirs();
9|
10| // run crawl and archive any changes in dvalId mappings
11| CAS.runBaselineCasCrawl("ART-last-mile-crawl");
12| CAS.archiveDvalIdMappingsForCrawlIfChanged("ART-last-mile-crawl");
13|
14| // archive logs and run the indexer
]]
Aug 15, 2017 12:23:37 PM com.endeca.soleng.eac.toolkit.Controller execute
SEVERE: Caught an exception while invoking method 'run' on object 'BaselineUpdate'. Releasing locks.
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
In CAS the error logs are:
2017-08-15 12:23:36,485 ERROR [ART-data] [cas-ART-last-mile-crawl-worker-1] com.endeca.itl.recordstore.impl.RecordStoreImpl: Error executing method RecordStoreImpl.readRecords()
com.endeca.itl.recordstore.RecordStoreException: malformed input around byte 10
at com.endeca.itl.recordstore.impl.ReadCursor.read(ReadCursor.java:81)
at com.endeca.itl.recordstore.impl.RecordStoreImpl.readRecords(RecordStoreImpl.java:480)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.endeca.itl.service.ServicePublisher$1.invoke(ServicePublisher.java:121)
at com.sun.proxy.$Proxy57.readRecords(Unknown Source)
at com.endeca.itl.recordstore.RecordStoreReader.fetchNextChunk(RecordStoreReader.java:267)
at com.endeca.itl.recordstore.RecordStoreReader.hasNext(RecordStoreReader.java:244)
at com.endeca.itl.extension.source.merger.RecordStoreMergerDataSourceRuntime$RecordStoreReadSession.runFull(RecordStoreMergerDataSourceRuntime.java:252)
at com.endeca.itl.extension.source.merger.RecordStoreMergerDataSourceRuntime.runFullAcquisition(RecordStoreMergerDataSourceRuntime.java:148)
at com.endeca.itl.util.CasExtensionRegistry$ContextClassLoaderDataSourceExtensionRuntime$2.doWork(CasExtensionRegistry.java:220)
at com.endeca.itl.util.CasExtensionRegistry$ContextClassLoaderDataSourceExtensionRuntime$2.doWork(CasExtensionRegistry.java:218)
at com.endeca.itl.plugin.ThreadContextRunner.run(ThreadContextRunner.java:136)
at com.endeca.itl.plugin.ThreadContextRunner.run(ThreadContextRunner.java:89)
at com.endeca.itl.util.CasExtensionRegistry$ContextClassLoaderDataSourceExtensionRuntime.runFullAcquisition(CasExtensionRegistry.java:218)
at com.endeca.itl.executor.extension.ExtensionDataSourceProcessor.processRecord(ExtensionDataSourceProcessor.java:104)
at com.endeca.itl.executor.extension.IncrementalDataSourceProcessor.processRecord(IncrementalDataSourceProcessor.java:106)
at com.endeca.itl.executor.TaskManager$2.work(TaskManager.java:166)
at com.endeca.itl.executor.WorkExecutor$WorkRunnable.run(WorkExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
at com.endeca.itl.util.LoggingContextAwareThread.run(LoggingContextAwareThread.java:71)
Caused by: java.io.UTFDataFormatException: malformed input around byte 10
at java.io.DataInputStream.readUTF(DataInputStream.java:656)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at com.endeca.itl.recordstore.impl.storage.RecordStorageEntry.load(RecordStorageEntry.java:114)
Any input on how to triage and debug this further would be helpful.
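For anyone triaging the same trace: the root cause line is a java.io.UTFDataFormatException thrown by DataInputStream.readUTF, which usually means the bytes stored for a record are not valid modified UTF-8 (for example, because the stored data is damaged). The following is a minimal sketch of that failure mode in plain Java. It does not use any Endeca classes; the class name ReadUtfDemo, the sample string, and the choice of which byte to damage are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;

// Hypothetical demo class; not part of CAS or the Record Store.
class ReadUtfDemo {

    /** Encodes a string as writeUTF does: a 2-byte length prefix, then modified UTF-8. */
    static byte[] encode(String value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(value);
        }
        return buffer.toByteArray();
    }

    /** Returns the decoded string, or null if the bytes are not valid modified UTF-8. */
    static String decodeOrNull(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        } catch (UTFDataFormatException e) {
            System.out.println("readUTF failed: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] good = encode("record.id=12345");
        System.out.println("intact:  " + decodeOrNull(good));

        byte[] damaged = good.clone();
        // Skip the 2-byte length prefix, then overwrite one payload byte with 0x80.
        // 0x80 is a continuation byte and can never start a character.
        damaged[2 + 10] = (byte) 0x80;
        System.out.println("damaged: " + decodeOrNull(damaged));
    }
}
```

The intact bytes decode normally, while the damaged copy makes readUTF throw the same exception type as in the CAS log. This only illustrates what the exception means; it does not show how the data in ART-data came to be damaged.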
The above error was resolved by completing the following steps.
Step 1 - The CAS record store configuration for ART-data was exported, the following setting was added to the exported file, and the configuration was imported again: <ignoreInvalidRecords>true</ignoreInvalidRecords>
recordstore-cmd.sh get-configuration -a ART-data -f dataConfig.xml
recordstore-cmd.sh set-configuration -a ART-data -f dataConfig.xml
Step 2 - The same change was made in the record store configuration file under <CAS_WS>/workspace/state/ART-data:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<recordStoreConfiguration xmlns="http://recordstore.itl.endeca.com/">
<changePropertyNames/>
<idPropertyName>record.id</idPropertyName>
<ignoreInvalidRecords>true</ignoreInvalidRecords><!-- newly added -->
<jdbmSettings/>
</recordStoreConfiguration>
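The edit made to the exported file in Step 1 (between get-configuration and set-configuration) can also be scripted instead of done by hand. The following is a rough sketch using only the JDK's XML APIs, assuming the exported dataConfig.xml has the same shape as the file shown above. The class name IgnoreInvalidRecordsPatch is hypothetical, and placing the new element before jdbmSettings simply mirrors the example file; whether CAS requires that order is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper; not shipped with CAS.
class IgnoreInvalidRecordsPatch {
    // Namespace taken from the configuration file shown in the post.
    static final String NS = "http://recordstore.itl.endeca.com/";

    /** Returns the configuration XML with ignoreInvalidRecords set to true. */
    static String enable(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element root = doc.getDocumentElement();

        NodeList existing = root.getElementsByTagNameNS(NS, "ignoreInvalidRecords");
        Element flag;
        if (existing.getLength() > 0) {
            flag = (Element) existing.item(0); // already present: just update the value
        } else {
            flag = doc.createElementNS(NS, "ignoreInvalidRecords");
            NodeList jdbm = root.getElementsByTagNameNS(NS, "jdbmSettings");
            if (jdbm.getLength() > 0) {
                Node anchor = jdbm.item(0);
                anchor.getParentNode().insertBefore(flag, anchor);
            } else {
                root.appendChild(flag);
            }
        }
        flag.setTextContent("true");

        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String exported =
                "<recordStoreConfiguration xmlns=\"" + NS + "\">"
                + "<changePropertyNames/>"
                + "<idPropertyName>record.id</idPropertyName>"
                + "<jdbmSettings/>"
                + "</recordStoreConfiguration>";
        System.out.println(enable(exported));
    }
}
```

In practice the input would be the contents of the file written by get-configuration, and the result would be saved back to that file before running set-configuration.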
Step 3 - The cas_output folder, <Endeca_apps>/ART/data/cas_output, was replaced with the most recent available backup, which was two days old.
After the above steps, indexing was started directly from the backend by invoking the scripts. Once that run succeeded, indexing was invoked from Dynamo, and it was also successful.
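The folder swap in Step 3 can be sketched as follows. This is only an illustration under assumptions: the class name CasOutputRestore, the ".corrupt" suffix, and the command-line arguments are hypothetical, and the post does not say whether any services were stopped before the folder was replaced. The sketch moves the current folder aside rather than deleting it, so the suspect data stays available for inspection.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper; not part of the Endeca deployment template.
class CasOutputRestore {

    /** Moves the live directory aside, then copies the backup tree into its place. */
    static Path restore(Path live, Path backup, String asideSuffix) throws IOException {
        if (!Files.isDirectory(backup)) {
            throw new IOException("Backup directory not found: " + backup);
        }
        Path aside = live.resolveSibling(live.getFileName() + asideSuffix);
        if (Files.exists(live)) {
            Files.move(live, aside); // fails if the aside directory already exists
        }
        // Files.walk visits each directory before its contents, so parent
        // directories exist by the time the files inside them are copied.
        try (Stream<Path> tree = Files.walk(backup)) {
            tree.forEach(source -> {
                Path target = live.resolve(backup.relativize(source).toString());
                try {
                    if (Files.isDirectory(source)) {
                        Files.createDirectories(target);
                    } else {
                        Files.copy(source, target);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return aside;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CasOutputRestore <cas_output dir> <backup dir>");
            return;
        }
        Path aside = restore(Paths.get(args[0]), Paths.get(args[1]), ".corrupt");
        System.out.println("Previous contents kept at: " + aside);
    }
}
```

Here the first argument would be the <Endeca_apps>/ART/data/cas_output folder and the second the location of the backup copy.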