Good Morning,
I'm trying to implement the Cassandra bulk data load example, using the bulk-loading post (http://www.datastax.com/dev/blog/bulk-loading) as a guide.
That example resolves its dependencies with a script (http://www.datastax.com/wp-content/uploads/2011/08/DataImport), but the Cassandra libraries it expects are not in the directories it lists, because I'm working with DSE, which ships Cassandra 2.0. After trying to cover those dependencies myself, I ended up with the following script.
#!/bin/sh
# paths to the cassandra source tree, cassandra jar and java
CASSANDRA_HOME="/usr/share/dse/cassandra"
# CASSANDRA_JAR="./apache-cassandra-2.0.10.jar"
JAVA=`which java`
# Java classpath. Must include:
# - directory of DataImportExample
# - directory with cassandra/log4j config files
# - cassandra jar
# - cassandra dependencies jar
CLASSPATH=".:/usr/share/dse/dse.jar:./slf4j-1.7.7/slf4-nop-1.7.7.jar:./slf4j-1.7.7/slf4j-simple-1.7.7.jar:/etc/dse/cassandra"
for jar in $CASSANDRA_HOME/lib/*.jar; do
CLASSPATH=$CLASSPATH:$jar
done
$JAVA -ea -cp $CLASSPATH -Xmx256M \
-Dlog4j.configuration=log4j-tools.properties \
CassandraDataBulk "$@"
CASSANDRA_JAR is commented out; instead I use "cassandra-all-2.0.8.39.jar", which is located in "/usr/share/dse/cassandra/lib" and is therefore already included by the loop.
I resolved the slf4j dependencies by downloading version 1.7.7.
Because of the difference in Cassandra versions, I also had to adapt the SSTableSimpleUnsortedWriter constructor call:
IPartitioner partitioner = new RandomPartitioner();
SSTableSimpleUnsortedWriter sourcesWriter = new SSTableSimpleUnsortedWriter(
directory,
partitioner,
keyspace,
table,
AsciiType.instance,
null,
64
);
The problem now seems to be that some dependencies are still missing. The error trace I get is below.
A class is missing, but since it is "org.apache.commons.configuration.ConfigurationRuntimeException", the real problem might be something else. Could it be a bad configuration in "cassandra.yaml"?
Thanks, and regards!
[dmdb@vm-dmdb01 ~]$ ./init_env.sh export.csv
[main] ERROR org.apache.cassandra.cql3.QueryProcessor - Unable to initialize MemoryMeter (jamm not specified as javaagent). This means Cassandra will be unable to measure object sizes accurately and may consequently OOM.
[main] INFO org.apache.cassandra.config.YamlConfigurationLoader - Loading settings from file:/etc/dse/cassandra/cassandra.yaml
[main] INFO org.apache.cassandra.config.DatabaseDescriptor - Data files directories: [/data01, /data02]
[main] INFO org.apache.cassandra.config.DatabaseDescriptor - Commit log directory: /datatmp/commitlog
[main] INFO org.apache.cassandra.config.DatabaseDescriptor - DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
[main] INFO org.apache.cassandra.config.DatabaseDescriptor - disk_failure_policy is stop
[main] INFO org.apache.cassandra.config.DatabaseDescriptor - commit_failure_policy is stop
[main] INFO org.apache.cassandra.config.DatabaseDescriptor - Global memtable threshold is enabled at 61MB
[main] INFO com.datastax.bdp.snitch.Workload - Setting my workload to Cassandra
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/configuration/ConfigurationRuntimeException
at com.datastax.bdp.config.ConfigUtil.defaultValue(ConfigUtil.java:18)
at com.datastax.bdp.config.DseConfig.<clinit>(DseConfig.java:51)
at com.datastax.bdp.snitch.DseDelegateSnitch.<init>(DseDelegateSnitch.java:42)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at java.lang.Class.newInstance(Class.java:374)
at org.apache.cassandra.utils.FBUtilities.construct(FBUtilities.java:488)
at org.apache.cassandra.config.DatabaseDescriptor.createEndpointSnitch(DatabaseDescriptor.java:508)
at org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:341)
at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:111)
at org.apache.cassandra.io.sstable.AbstractSSTableSimpleWriter.<init>(AbstractSSTableSimpleWriter.java:50)
at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter.<init>(SSTableSimpleUnsortedWriter.java:96)
at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter.<init>(SSTableSimpleUnsortedWriter.java:80)
at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter.<init>(SSTableSimpleUnsortedWriter.java:91)
at CassandraDataBulk.main(CassandraDataBulk.java:35)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.configuration.ConfigurationRuntimeException
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 17 more
Your java call is missing a "javaagent" parameter, which is why the log reports "Unable to initialize MemoryMeter". Add the following:
-javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar
Your final call should look like:
$JAVA -ea -cp $CLASSPATH -Xmx256M \
-Dlog4j.configuration=log4j-tools.properties \
-javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar \
CassandraDataBulk "$@"
NOTE: Adjust the path to jamm.jar as necessary
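If you would rather not hard-code the jamm version, the path can be looked up at run time. This is only a minimal sketch: find_jamm_jar is a name I made up, and it assumes the jar is named jamm-<version>.jar and sits directly in the directory you pass in.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cassandra or DSE): print the first
# jamm-*.jar found in the given directory. Prints nothing and returns
# non-zero when no such jar exists.
find_jamm_jar() {
    lib_dir=$1
    for candidate in "$lib_dir"/jamm-*.jar; do
        # If the glob matches nothing, the loop sees the literal pattern,
        # so make sure the candidate really is a file.
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Possible use in the launch script (left commented out on purpose):
# JAMM_JAR=$(find_jamm_jar "$CASSANDRA_HOME/lib") || echo "jamm jar not found" >&2
# $JAVA -ea -cp "$CLASSPATH" -javaagent:"$JAMM_JAR" ... CassandraDataBulk "$@"
```

The lookup only picks the first match, so if several jamm versions are present you may still want to pin one explicitly.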
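As for the NoClassDefFoundError, download the Apache Commons 'lang' library and add it to your classpath. Note that the class named in the trace lives in the org.apache.commons.configuration package, which belongs to Apache Commons Configuration, so that jar needs to be on the classpath as well.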
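One way to get downloaded jars onto the classpath is to put them in a directory of their own and append that directory the same way your loop handles $CASSANDRA_HOME/lib. A minimal sketch, where build_classpath and ./extra-libs are names I invented for illustration:

```shell
#!/bin/sh
# Hypothetical helper: append every *.jar found in the given directories
# to a base classpath and print the result.
# Usage: build_classpath BASE DIR...
build_classpath() {
    result=$1
    shift
    for dir in "$@"; do
        for jar in "$dir"/*.jar; do
            # Skip the unexpanded pattern when a directory has no jars.
            [ -f "$jar" ] || continue
            result="$result:$jar"
        done
    done
    printf '%s\n' "$result"
}

# Possible use in the launch script (left commented out on purpose):
# CLASSPATH=$(build_classpath ".:/etc/dse/cassandra" "$CASSANDRA_HOME/lib" ./extra-libs)
```

Keeping the downloaded jars in a separate directory also makes it easier to see later which ones were added by hand.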
If you get NEW exceptions after applying that fix, download google-common.jar and guava-16.0.1.jar and add them to your classpath as well. These are all of the JARs that my own bulk loader has required so far.
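When a further NoClassDefFoundError shows up, it helps to check whether a jar you already have provides the class before downloading anything. The sketch below is one way to do that; find_class_jar is a made-up name, and it assumes the unzip command is installed.

```shell
#!/bin/sh
# Hypothetical helper: print every jar in a directory that contains the
# given class. Returns non-zero when no jar contains it.
# Usage: find_class_jar fully.qualified.ClassName DIR
find_class_jar() {
    # Turn the dotted class name into the entry path used inside a jar.
    entry="$(printf '%s' "$1" | tr '.' '/').class"
    status=1
    for jar in "$2"/*.jar; do
        [ -f "$jar" ] || continue
        # unzip -l prints the archive listing; grep -F matches it literally.
        if unzip -l "$jar" 2>/dev/null | grep -qF "$entry"; then
            printf '%s\n' "$jar"
            status=0
        fi
    done
    return "$status"
}

# Possible use (left commented out on purpose):
# find_class_jar org.apache.commons.configuration.ConfigurationRuntimeException /usr/share/dse/cassandra/lib
```

If the function prints nothing for any of the directories on your classpath, the class really is missing and the jar has to be added.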