Tags: apache, spark-streaming, flume, flume-ng, spark-structured-streaming

Unable to read a log file (server running log) from Windows using Flume


Hi, I am facing an issue while reading an application's running log file on Windows via Apache Flume. Please find below the configuration details I have used in "flume-conf.properties":

# The configuration file needs to define the sources, 
# the channels and the sinks.
# Sources, channels and sinks are defined per agent, 
# in this case called 'agent'

wh.sources = ws
wh.channels = mem
wh.sinks = k1

# For each one of the sources, the type is defined
wh.sources.ws.type = exec
wh.sources.ws.command = tail -F C:/Users/nirmal.b/Desktop/serverlogs/serverlog.txt


# The channel can be defined as follows.
wh.sources.ws.channels = mem

# Each sink's type must be defined
wh.sinks.k1.type = logger

#Specify the channel the sink should use
wh.sinks.k1.channel = mem

# Each channel's type is defined.
wh.channels.mem.type = memory

# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
wh.channels.mem.capacity = 1000
wh.channels.mem.transactionCapacity = 1000

When I start the Flume agent with the below command:

flume-ng agent -n wh --conf ./conf/ -f C:/apache-flume-1.8.0-bin/conf/flume-conf-serverlogs.properties

I get the below exception from Flume:

Sourcing environment configuration script ./conf/\flume-env.ps1
WARN: Did not find ./conf/\flume-env.ps1
Including Hadoop libraries found in (C:\hadoop-2.7.6) for DFS access
Including HBase libraries found via (C:\hbase-2.1.0) for HBase access
WARN: HIVE_HOME not found

  Running FLUME agent :
    class: org.apache.flume.node.Application
    arguments: -n wh -f "C:\apache-flume-1.8.0-bin\conf\flume-conf-serverlogs.properties"

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/apache-flume-1.8.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/hadoop-2.7.6/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
18/10/16 14:23:20 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
18/10/16 14:23:20 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:C:\apache-flume-1.8.0-bin\conf\flume-conf-serverlogs.properties
18/10/16 14:23:20 INFO conf.FlumeConfiguration: Processing:k1
18/10/16 14:23:20 INFO conf.FlumeConfiguration: Processing:k1
18/10/16 14:23:20 INFO conf.FlumeConfiguration: Added sinks: k1 Agent: wh
18/10/16 14:23:20 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [wh]
18/10/16 14:23:20 INFO node.AbstractConfigurationProvider: Creating channels
18/10/16 14:23:20 INFO channel.DefaultChannelFactory: Creating instance of channel mem type memory
18/10/16 14:23:20 INFO node.AbstractConfigurationProvider: Created channel mem
18/10/16 14:23:20 INFO source.DefaultSourceFactory: Creating instance of source ws, type exec
18/10/16 14:23:20 INFO sink.DefaultSinkFactory: Creating instance of sink: k1, type: logger
18/10/16 14:23:20 INFO node.AbstractConfigurationProvider: Channel mem connected to [ws, k1]
18/10/16 14:23:20 INFO node.Application: Starting new configuration:{ sourceRunners:{ws=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource{name:ws,state:IDLE} }}
sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@2ebdc61b counterGroup:{ name:null counters:{} } }} channels:{mem=org.apache.flume.channel.MemoryChan
nel{name: mem}} }
18/10/16 14:23:20 INFO node.Application: Starting Channel mem
18/10/16 14:23:20 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: mem: Successfully registered new MBean.
18/10/16 14:23:20 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: mem started
18/10/16 14:23:20 INFO node.Application: Starting Sink k1
18/10/16 14:23:20 INFO node.Application: Starting Source ws
18/10/16 14:23:20 INFO source.ExecSource: Exec source starting with command: tail -F C:/Users/nirmal.b/Desktop/serverlogs/serverlog.txt
18/10/16 14:23:20 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: ws: Successfully registered new MBean.
18/10/16 14:23:20 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: ws started
18/10/16 14:23:20 ERROR source.ExecSource: Failed while running command: tail -F C:/Users/nirmal.b/Desktop/serverlogs/serverlog.txt
java.io.IOException: Cannot run program "tail": CreateProcess error=2, The system cannot find the file specified
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
        at org.apache.flume.source.ExecSource$ExecRunnable.run(ExecSource.java:302)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.lang.ProcessImpl.create(Native Method)
        at java.lang.ProcessImpl.<init>(ProcessImpl.java:386)
        at java.lang.ProcessImpl.start(ProcessImpl.java:137)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        ... 6 more
18/10/16 14:23:20 INFO source.ExecSource: Command [tail -F C:/Users/nirmal.b/Desktop/serverlogs/serverlog.txt] exited with -1073741824

I have also attached a screenshot of the console.

My goal is to stream the data from the server log file into HDFS via Flume. Kindly help me resolve this issue.


Solution

  • tail is a native Unix/GNU tool for Linux systems.
    The tail command is not available on your system (Windows).

    Solution 1
    Install UnxUtils for Windows so that the tail command is available on your Windows system (make sure the directory containing tail is present in your PATH environment variable).

    Solution 2
    Use Flume's Spooling Directory Source instead of the exec source.
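For Solution 2, here is a minimal sketch of the agent above rewritten to use a Spooling Directory Source and an HDFS sink (matching the stated goal). The spool directory path and the HDFS URL are assumptions; adjust them to your environment. Note that the spooldir source only processes files that are complete and immutable once placed in the directory, so you would copy rotated/finished log files into it rather than pointing it at the live log:

```properties
wh.sources = ws
wh.channels = mem
wh.sinks = k1

# Spooling Directory Source: reads completed files dropped into spoolDir.
# The directory below is a hypothetical example path.
wh.sources.ws.type = spooldir
wh.sources.ws.spoolDir = C:/Users/nirmal.b/Desktop/serverlogs-spool
wh.sources.ws.channels = mem

# HDFS sink instead of the logger sink, to land events in HDFS.
# The NameNode address/port is an assumption; use your cluster's value.
wh.sinks.k1.type = hdfs
wh.sinks.k1.hdfs.path = hdfs://localhost:9000/flume/serverlogs
wh.sinks.k1.hdfs.fileType = DataStream
wh.sinks.k1.channel = mem

# Memory channel, unchanged from the original configuration.
wh.channels.mem.type = memory
wh.channels.mem.capacity = 1000
wh.channels.mem.transactionCapacity = 1000
```

With `hdfs.fileType = DataStream`, events are written as plain text rather than Flume's default SequenceFile format, which is usually what you want for raw server logs.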