java spring hbase spring-data-hadoop

Spring HbaseTemplate keeping connection alive


I managed to integrate HBase into a Spring app using HbaseTemplate:

import org.apache.hadoop.hbase.client.Scan;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.hadoop.hbase.HbaseTemplate;
import org.springframework.stereotype.Component;

import java.util.List;

@Component
public class ItemRepositoryImpl implements ItemRepository {

    @Autowired
    private HbaseTemplate hbaseTemplate;

    @Override
    public List<Item> findAll() {
        Scan scan = new Scan();
        scan.addColumn(CF, CQ);
        // find(...) returns the mapped rows as a List
        return hbaseTemplate.find("TABLE_NAME", scan, (result, rowNum) -> {
            return new Item(...);
        });
    }
}

However, a connection to HBase is opened every time I call findAll() and closed right afterwards. I read somewhere that the way to keep the connection alive is to use Connection and Table for calls to HBase. The problem is that HbaseTemplate uses HConnection and HTableInterface.

How can I keep my connection alive using HbaseTemplate? Initiating a new connection is very time-consuming and I'd like to do it only once. Alternatively, is there another way to connect to HBase from a Spring app?

I'm using:

org.springframework.data:spring-data-hadoop:2.5.0.RELEASE
org.apache.hbase:hbase-client:1.1.2

Solution

  • I found two solutions to this problem:

    Custom HbaseTemplate which extends HbaseAccessor and implements HbaseOperations

    The best way seems to be to create a custom class that extends HbaseAccessor and implements HbaseOperations in a similar way to the original HbaseTemplate, but uses the newer API (i.e. Table instead of HTableInterface, etc.).

    One example of such an implementation can be found in the easyhbase project.

    Injecting Connection instead of HbaseTemplate

    The other solution is to inject a Connection into the repository and do all the heavy lifting there:

    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Component;
    
    import java.io.IOException;
    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.StreamSupport;
    
    @Component
    public class ItemRepositoryImpl implements ItemRepository {
    
        @Autowired
        private Connection connection;
    
        @Override
        public List<Item> findAll() throws IOException {
            Scan scan = new Scan();
            scan.addColumn(CF, CQ);
            // Table and ResultScanner are closed after each call;
            // the injected Connection stays open and is reused.
            try (Table table = connection.getTable(TableName.valueOf(TABLE_NAME));
                 ResultScanner scanner = table.getScanner(scan)) {
                return StreamSupport
                    .stream(scanner.spliterator(), false)
                    .map(...)
                    .collect(Collectors.toList());
            }
        }
    }
    

    The Connection @Bean can be configured like this:

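    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    
    import java.io.IOException;
    
    @Configuration
    public class HbaseConfiguration {
    
        // Singleton bean: the Connection is created once and shared
        @Bean
        public Connection connection() throws IOException {
            org.apache.hadoop.conf.Configuration conf = HBaseConfiguration.create();
            // configuration setup
            return ConnectionFactory.createConnection(conf);
        }
    
    }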
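
Both solutions above rest on the same idea: create the expensive connection once, and open and close only a cheap table handle for each call. Below is a minimal sketch of that pattern, written with stand-in classes so it runs without HBase or Spring on the classpath. StubConnection, StubTable and SimpleTemplate are hypothetical names invented for this illustration; they only mimic the roles of Connection, Table and a custom template, and are not the API of HBase, Spring Data Hadoop or the easyhbase project.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class SharedConnectionTemplateSketch {

    /** Stand-in (hypothetical) for a heavyweight connection that is created once. */
    public static class StubConnection implements AutoCloseable {
        private int tablesOpened = 0;
        private int tablesClosed = 0;
        private boolean closed = false;

        public StubTable getTable(String name) {
            if (closed) {
                throw new IllegalStateException("connection is closed");
            }
            tablesOpened++;
            return new StubTable(this, name);
        }

        public int tablesOpened() { return tablesOpened; }
        public int tablesClosed() { return tablesClosed; }
        public boolean isClosed() { return closed; }

        @Override
        public void close() { closed = true; }
    }

    /** Stand-in (hypothetical) for a lightweight table handle opened per call. */
    public static class StubTable implements AutoCloseable {
        private final StubConnection owner;
        private final String name;

        StubTable(StubConnection owner, String name) {
            this.owner = owner;
            this.name = name;
        }

        /** Returns two fake rows; a real table would run a scan here. */
        public List<String> scan() {
            return Arrays.asList(name + ":row1", name + ":row2");
        }

        @Override
        public void close() { owner.tablesClosed++; }
    }

    /** Template that borrows a table from the shared connection for one callback. */
    public static class SimpleTemplate {
        private final StubConnection connection;

        public SimpleTemplate(StubConnection connection) {
            this.connection = connection;
        }

        public <T> T execute(String tableName, Function<StubTable, T> action) {
            // Only the table handle is closed here; the connection is left open.
            try (StubTable table = connection.getTable(tableName)) {
                return action.apply(table);
            }
        }
    }

    public static void main(String[] args) {
        StubConnection connection = new StubConnection();
        SimpleTemplate template = new SimpleTemplate(connection);

        // Two calls reuse the same connection, each with its own table handle.
        template.execute("items", StubTable::scan);
        template.execute("items", StubTable::scan);

        System.out.println("tables opened: " + connection.tablesOpened());
        System.out.println("connection still open: " + !connection.isClosed());
    }
}
```

In a real application the connection would be the Spring-managed Connection bean, and closing it would be left to the container at shutdown rather than to the repository or template.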