Does hive have any array comparison functions/udf's to see if array=array
?
For example:
Select a.xyz, b.abc from a left join b on a.C=b.D
Where C
and D
are arrays
Arrays are ordered structures,
concat_ws( <separator> , <array> )
this function will join all the array elements into a string using the separator.
create table arrayDemo( id bigint, list array<String> );
create table dummy( a int );
insert into table dummy values ( 1 ) ;
insert into arraydemo select 1, array("Paperino", "Topolino") from dummy;
insert into arraydemo select 2, array("Pippo", "Pluto") from dummy;
insert into arraydemo select 1, array("Pippo", "Pluto") from dummy;
select * from arraydemo;
+---------------+--------------------------+--+
| arraydemo.id | arraydemo.list |
+---------------+--------------------------+--+
| 1 | ["Paperino","Topolino"] |
| 2 | ["Pippo","Pluto"] |
| 1 | ["Pippo","Pluto"] |
+---------------+--------------------------+--+
select *
from arraydemo as a1
inner join arraydemo as a2
on concat_ws( "|", a1.list ) = concat_ws("|", a2.list );
+--------+--------------------------+--------+--------------------------+--+
| a1.id | a1.list | a2.id | a2.list |
+--------+--------------------------+--------+--------------------------+--+
| 1 | ["Paperino","Topolino"] | 1 | ["Paperino","Topolino"] |
| 2 | ["Pippo","Pluto"] | 2 | ["Pippo","Pluto"] |
| 1 | ["Pippo","Pluto"] | 2 | ["Pippo","Pluto"] |
| 2 | ["Pippo","Pluto"] | 1 | ["Pippo","Pluto"] |
| 1 | ["Pippo","Pluto"] | 1 | ["Pippo","Pluto"] |
+--------+--------------------------+--------+--------------------------+--+
I'm using it in this example in a "toString()" fashion. Sometimes the best is to compare only the important pieces of the arrays
select *
from arraydemo as a1
inner join arraydemo as a2
on a1.list[0] = a2.list[0];
Hope this helps.