hive> CREATE TEMPORARY FUNCTION row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence';
OK
CREATE TABLE IF NOT EXISTS users_inc
(
ID string,
name string
) row format delimited fields terminated by ',' stored as textfile;
id ID value is int ,Then (ID int ...use this when creating table)
hive> insert into table users_inc
select m.max+row_sequence() as inc , ename
from (select ename from emp_csv) e
join
(select max(ID) as max from users_inc) m;
handling nulls when inserting table first time : (max val we get null when table is empty)
================
insert into table users_inc
select m.max+row_sequence() as inc , ename
from (select ename from emp_csv) e
join
(select coalesce(max(ID),0) as max from users_inc) m;
select * from users_inc;
===>for alpha numeric id increment value
insert into table users_inc
select concat("ABC-",m.max+row_sequence()) as inc , ename
from (select ename from emp_csv) e
join
(select coalesce(max(substr(ID,5)),0) as max from users_inc) m;
---->regexp_replace
insert into table users_inc
select regexp_replace(concat("ABC-",m.max+row_sequence()),'\\.0','') as inc, ename from (select ename from emp_csv) e
join
(select coalesce(max(substr(ID,5)),0) as max from users_inc) m; coalesce() : COALESCE(T v1, T v2, ...)
Returns the first v that is not NULL in list of values, or NULL if all v's are NULL.
 
Bro Can you show the steps with Screenshots of hive sessions. It's not working for me.
ReplyDelete