**While running the flume command i am getting the following error,i tried changing the envi variables in .bashrc along with classpath in flume.env.sh ,still no use
Picked up JAVA_TOOL_OPTIONS: -javaagent:/usr/share/java/jayatanaag.jar
16/12/08 01:57:11 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
16/12/08 01:57:11 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:../conf/twitter.conf
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.path
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.path = hdfs://localhost:8020/datamain/tweets
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.writeFormat
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.writeFormat = Text
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.rollCount
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.rollCount = 10000
16/12/08 01:57:11 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent: TwitterAgent
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.rollSize
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.rollSize = 0
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.channels
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.channels = MemChannel
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.batchSize
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.batchSize = 1000
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.fileType
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.fileType = DataStream
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.type
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.type = hdfs
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Invalid property specified: sink.HDFS.hdfs.rollInterval
16/12/08 01:57:11 WARN conf.FlumeConfiguration: Configuration property ignored: TwitterAgent.sink.HDFS.hdfs.rollInterval = 600
**Also seems like there is problem with my sink command in twitter.conf but i am unable to figure it out,below is the twitter.conf file
16/12/08 01:57:11 WARN conf.FlumeConfiguration: no context for sinkHDFS
16/12/08 01:57:12 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [TwitterAgent]
16/12/08 01:57:12 INFO node.AbstractConfigurationProvider: Creating channels
16/12/08 01:57:12 INFO channel.DefaultChannelFactory: Creating instance of channel MemChannel type memory
16/12/08 01:57:12 INFO node.AbstractConfigurationProvider: Created channel MemChannel
16/12/08 01:57:12 INFO source.DefaultSourceFactory: Creating instance of source Twitter, type org.apache.flume.source.twitter.TwitterSource
16/12/08 01:57:12 ERROR node.PollingPropertiesFileConfigurationProvider: Failed to load configuration data. Exception follows.
org.apache.flume.FlumeException: Unable to load source type: org.apache.flume.source.twitter.TwitterSource, class: org.apache.flume.source.twitter.TwitterSource
at org.apache.flume.source.DefaultSourceFactory.getClass(DefaultSourceFactory.java:67)
at org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:40)
at org.apache.flume.node.AbstractConfigurationProvider.loadSources(AbstractConfigurationProvider.java:327)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.flume.source.twitter.TwitterSource
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:195)
at org.apache.flume.source.DefaultSourceFactory.getClass(DefaultSourceFactory.java:65)
... 11 more
`TwitterAgent.sources =Twitter
TwitterAgent.channels =MemChannel
TwitterAgent.sinks = HDFS
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = ###
TwitterAgent.sources.Twitter.consumerSecret = ###
TwitterAgent.sources.Twitter.accessToken = ###
TwitterAgent.sources.Twitter.accessTokenSecret = ####
TwitterAgent.sources.Twitter.keywords =donald trump,republican,democratic
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 10000
TwitterAgent.sink.HDFS.channels = MemChannel
TwitterAgent.sink.HDFS.type = hdfs
TwitterAgent.sink.HDFS.hdfs.path =hdfs://localhost:8020/datamain/tweets
TwitterAgent.sink.HDFS.hdfs.fileType = DataStream
TwitterAgent.sink.HDFS.hdfs.writeFormat = Text
TwitterAgent.sink.HDFS.hdfs.batchSize = 1000
TwitterAgent.sink.HDFS.hdfs.rollSize = 0
TwitterAgent.sink.HDFS.hdfs.rollCount = 10000
TwitterAgent.sink.HDFS.hdfs.rollInterval = 600 `
here are my flume.env.sh file details
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# If this file is placed at FLUME_CONF_DIR/flume-env.sh, it will be sourced
# during Flume startup.
# Enviroment variables can be set here.
JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
# Give Flume more memory and pre-allocate, enable remote monitoring via JMX
JAVA_OPTS="-Xms500m -Xmx1000m -Dcom.sun.management.jmxremote"
# Note that the Flume conf directory is always included in the classpath.
FLUME_CLASSPATH=/home/user/hadoop_store/flume-sources-1.0-SNAPSHOT.jar
.bashrc details
##Flume home directory
export FLUME_HOME=/home/user/hadoop_store/apache-flume-1.4.0-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export FLUME_CLASSPATH=$FLUME_CONF_DIR
export PATH=$FLUME_HOME/bin:$PATH
##Flume home directory
tried changing snapshot file path still didnt work,
By looking your log file messages, you are missing two things:
flume-sources-1.0-SNAPSHOT.jar
must be present in
your flume lib directory.Replace sink with sinks keyword below:
TwitterAgent.sinks.HDFS.channels = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path =hdfs://localhost:8020/datamain/tweets
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.sinks.HDFS.hdfs.rollInterval = 600
Hope this help you!!!