regexhivehive-query

How to select columns in Hive SQL with the same prefix (beginning) or suffix (ending) or key word in the middle (including)


EDIT 1: Note, I know some of us will question why not list different parts of information in different attributes, so that I will have a relational database to query. The real case is not like the example I am listing below, the variable names are just used here for convenience.

EDIT 2: To reduce the confusion of database design, I change the variable names in the example.

In Hive query, I am looking for a way to select columns with the same prefix, or the same suffix, or including the same key word in the middle of the variable names from the same table.

Here is an example: I have a list of variables like this:

a_A_1, a_A_2, a_B_1, a_B_2,
b_A_1, b_A_2, b_B_1, b_B_2

Exercise 1 I want to select all the attributes starting with 'a'.

Exercise 2 I want to select all the attributes ending with '1'.

Exercise 3 I want to select all the attributes including 'B'.

Much thanks in advance!


Solution

  • Luckily I found a way to do so and I hope it can benefit many others who are looking for the same answer.

    First of all, you need to run this setting in your Hive environment:

    set hive.support.quoted.identifiers=none;
    

    See solutions below

    Exercise 1

    select `a.*` from test_table;
    

    Exercise 2

    select `.*1$` from test_table;
    

    Exercise 3

    select `.*B.*` from test_table;