
Spring Hadoop config - No qualifying bean of type org.apache.hadoop.conf.Configuration


I am trying to configure beans for a Hadoop/Hive environment. According to the documentation, I need an instance of the Apache Hadoop Configuration class, which should be autowired. See: http://docs.spring.io/spring-hadoop/docs/2.4.0.RELEASE/reference/html/springandhadoop-store.html (section 6.2.2, Configuring the dataset support)

Yet, when I try to run my app, I get: NoSuchBeanDefinitionException: No qualifying bean of type [org.apache.hadoop.conf.Configuration] found for dependency: expected at least 1 bean which qualifies as autowire candidate for this dependency.

My class is very simple:

@SpringBootApplication
public class HiveTestApp implements CommandLineRunner {
    @Autowired
    private org.apache.hadoop.conf.Configuration hadoopConfiguration;

    ...

I am using a Cloudera cluster. Here are the dependencies:

dependencies {
    compile(
            'org.springframework.boot:spring-boot-starter-web',
            'org.springframework.data:spring-data-hadoop-hive:2.4.0.RELEASE-cdh5',
            'org.apache.hive:hive-jdbc:1.1.0-cdh5.4.3'
    )
}

I might be wrong, but I remember using an autowired configuration in the past, and it worked fine. Has anything changed in the latest version? Am I missing something?


Solution

  • OK here's the solution.

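    import java.net.URI;

    import org.apache.hive.jdbc.HiveDriver;
    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.hadoop.fs.HdfsResourceLoader;
    import org.springframework.data.hadoop.hive.HiveClient;
    import org.springframework.data.hadoop.hive.HiveTemplate;
    import org.springframework.jdbc.datasource.SimpleDriverDataSource;

    @Configuration
    public class ApplicationConfiguration {
        @Value("${com.domain.app.hadoop.fs-uri}")
        private URI hdfsUri;
    
        @Value("${com.domain.app.hadoop.user}")
        private String user;
    
        @Value("${com.domain.app.hadoop.hive.jdbc-uri}")
        private String hiveUri;
    
        // Plain Hadoop configuration, exposed as a bean so other classes can inject it.
        @Bean
        public org.apache.hadoop.conf.Configuration hadoopConfiguration() {
            return new org.apache.hadoop.conf.Configuration();
        }
    
        @Bean
        public HdfsResourceLoader hdfsResourceLoader() {
            // Call the @Bean method directly so this class does not depend on a bean it creates itself.
            return new HdfsResourceLoader(hadoopConfiguration(), hdfsUri, user);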
        }
    
        @Bean
        public HiveTemplate hiveTemplate() {
            return new HiveTemplate(() -> {
                final SimpleDriverDataSource dataSource = new SimpleDriverDataSource(new HiveDriver(), hiveUri);
                return new HiveClient(dataSource);
            });
        }
    }
    

    The matching configuration file (YAML):

    com.domain.app.hadoop:
      fs-uri: "hdfs://hadoop-cluster/"
      user: "hdfs-user"
      hive.jdbc-uri: "jdbc:hive2://hadoop-cluster:10000/hive-db"
    

    I've made the Hadoop configuration object a bean because I need to inject it into one of my classes. If you don't need it as a bean, you can simply create a new instance yourself.
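
The configuration file nests its keys, while the `@Value` placeholders use flat dotted names; Spring Boot flattens YAML into those names when it loads the file. As a rough illustration, the plain-Java sketch below (no Spring or Hadoop on the classpath) spells out the flattened keys and shows why `fs-uri` converts cleanly to `java.net.URI` while the Hive JDBC URL is easier to keep as a `String`. The class name `HadoopSettingsSketch` and its helper methods are hypothetical and only stand in for what Spring does during injection; the values are copied from the configuration file above.

```java
import java.net.URI;
import java.util.Properties;

public class HadoopSettingsSketch {
    // Prefix shared by the three settings in the configuration file (hypothetical helper constant).
    static final String PREFIX = "com.domain.app.hadoop.";

    // Flattened, dotted form of the nested YAML keys; values copied from the answer.
    static Properties flattened() {
        Properties props = new Properties();
        props.setProperty(PREFIX + "fs-uri", "hdfs://hadoop-cluster/");
        props.setProperty(PREFIX + "user", "hdfs-user");
        props.setProperty(PREFIX + "hive.jdbc-uri", "jdbc:hive2://hadoop-cluster:10000/hive-db");
        return props;
    }

    // Hand-written stand-in for the String-to-URI conversion the hdfsUri field relies on.
    static URI fsUri(Properties props) {
        return URI.create(props.getProperty(PREFIX + "fs-uri"));
    }

    public static void main(String[] args) {
        Properties props = flattened();

        URI fs = fsUri(props);
        System.out.println("scheme=" + fs.getScheme() + ", host=" + fs.getHost());

        // To java.net.URI the JDBC URL is opaque (scheme "jdbc", no host component),
        // which is one reason to keep it as a String.
        URI jdbc = URI.create(props.getProperty(PREFIX + "hive.jdbc-uri"));
        System.out.println("jdbc opaque=" + jdbc.isOpaque());
    }
}
```

If a placeholder such as `${com.domain.app.hadoop.hive.jdbc-uri}` fails to resolve at startup, comparing it against the flattened key names listed in `flattened()` is a quick way to spot an indentation or naming mismatch in the YAML.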