I am trying to configure beans for a Hadoop/Hive environment. According to the documentation, I need the Apache Hadoop Configuration class, which should be autowired. See: http://docs.spring.io/spring-hadoop/docs/2.4.0.RELEASE/reference/html/springandhadoop-store.html (section 6.2.2, "Configuring the dataset support").
Yet, when I try to run my app, I get: NoSuchBeanDefinitionException: No qualifying bean of type [org.apache.hadoop.conf.Configuration] found for dependency: expected at least 1 bean which qualifies as autowire candidate for this dependency.
My class is very simple:
@SpringBootApplication
public class HiveTestApp implements CommandLineRunner {
@Autowired
private org.apache.hadoop.conf.Configuration hadoopConfiguration;
...
I am using a Cloudera cluster. Here are the dependencies:
dependencies {
compile(
'org.springframework.boot:spring-boot-starter-web',
'org.springframework.data:spring-data-hadoop-hive:2.4.0.RELEASE-cdh5',
'org.apache.hive:hive-jdbc:1.1.0-cdh5.4.3'
)
}
I might be wrong, but I remember using an autowired configuration in the past, and it worked fine. Has anything changed in the latest version? Am I missing something?
OK, here's the solution.
@Configuration
public class ApplicationConfiguration {
@Value("${com.domain.app.hadoop.fs-uri}")
private URI hdfsUri;
@Value("${com.domain.app.hadoop.user}")
private String user;
@Value("${com.domain.app.hadoop.hive.jdbc-uri}")
private String hiveUri;
// Expose the Hadoop configuration as a bean so that other classes can inject it.
@Bean
public org.apache.hadoop.conf.Configuration hadoopConfiguration() {
return new org.apache.hadoop.conf.Configuration();
}
@Bean
public HdfsResourceLoader hdfsResourceLoader() {
// Call the @Bean method directly rather than autowiring the bean into the class that defines it.
return new HdfsResourceLoader(hadoopConfiguration(), hdfsUri, user);
}
@Bean
public HiveTemplate hiveTemplate() {
return new HiveTemplate(() -> {
final SimpleDriverDataSource dataSource = new SimpleDriverDataSource(new HiveDriver(), hiveUri);
return new HiveClient(dataSource);
});
}
}
The configuration file is below.
com.domain.app.hadoop:
  fs-uri: "hdfs://hadoop-cluster/"
  user: "hdfs-user"
  hive.jdbc-uri: "jdbc:hive2://hadoop-cluster:10000/hive-db"
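For reference, the nested YAML keys above flatten to the dotted property names that the @Value placeholders look up. Here is a minimal sketch, using only the JDK, that spells out those flattened names and checks that the two URI values parse as expected. The class HadoopSettingsCheck and its methods are hypothetical names for illustration; they are not part of Spring or Spring for Apache Hadoop.

```java
import java.net.URI;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of any library.
public class HadoopSettingsCheck {

    // The YAML entries written out as flat keys, matching the @Value placeholders.
    static Properties flattenedSettings() {
        Properties props = new Properties();
        props.setProperty("com.domain.app.hadoop.fs-uri", "hdfs://hadoop-cluster/");
        props.setProperty("com.domain.app.hadoop.user", "hdfs-user");
        props.setProperty("com.domain.app.hadoop.hive.jdbc-uri",
                "jdbc:hive2://hadoop-cluster:10000/hive-db");
        return props;
    }

    // Parses the HDFS URI; the hdfsUri field is a java.net.URI, so the value must be a valid URI.
    static URI fsUri(Properties props) {
        return URI.create(props.getProperty("com.domain.app.hadoop.fs-uri"));
    }

    // Strips the "jdbc:" prefix so that host, port and database can be read as URI parts.
    static URI hiveEndpoint(Properties props) {
        String jdbcUri = props.getProperty("com.domain.app.hadoop.hive.jdbc-uri");
        if (!jdbcUri.startsWith("jdbc:")) {
            throw new IllegalArgumentException("Not a JDBC URI: " + jdbcUri);
        }
        return URI.create(jdbcUri.substring("jdbc:".length()));
    }

    public static void main(String[] args) {
        Properties props = flattenedSettings();
        URI fs = fsUri(props);
        URI hive = hiveEndpoint(props);
        System.out.println("HDFS scheme and host: " + fs.getScheme() + ", " + fs.getHost());
        System.out.println("Hive host and port: " + hive.getHost() + ", " + hive.getPort());
        System.out.println("Hive database path: " + hive.getPath());
    }
}
```

A check like this can be handy when an @Value field fails to bind, since a misspelled key or a malformed URI shows up without starting the whole application.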
I've made the Hadoop configuration object a bean because I need to inject it into one of my classes. If you don't need a bean, you can just create a new instance yourself.