Hive: how to get the nth string after a pipe delimiter
I have a table in Hive, and I want to extract the 5th part of the string from one of its columns, as shown below. Sample data:
john:12|doe|google|usa|google.com|newspaper - title - 1 - volume - 1234|360671191
john:34|doe|fb|usa|google.com|newspaper - title - X - volume - 1233|360671192
john:45|doe|twitter|usa|google.com|newspaper - title - Y - volume - 1232|360671193
jane:45:1323
I want to parse out the 5th string after the first pipe character (|). The values of the output column should be:
newspaper - title - 1 - volume - 1234
newspaper - title - X - volume - 1233
newspaper - title - Y - volume - 1232
jane:45:1323
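If the title is not present (as in record 4), the original string should be returned.

Use the split function. The array it returns is zero-based, so index 5 is the 5th field after the first pipe. For a row with fewer fields, that index yields NULL and nvl falls back to the original string: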
-- Sample rows standing in for the real table
with your_data as (
  select stack(4,
    'john:12|doe|google|usa|google.com|newspaper - title - 1 - volume - 1234|360671191',
    'john:34|doe|fb|usa|google.com|newspaper - title - X - volume - 1233|360671192',
    'john:45|doe|twitter|usa|google.com|newspaper - title - Y - volume - 1232|360671193',
    'jane:45:1323'
  ) as str
)
-- Element 5 of the split array, or the original string when it is NULL
select nvl(splitted_str[5], original_str) result
from
(
  -- split takes a regular expression, so the pipe is escaped as '\\|'
  select split(str,'\\|') splitted_str, str original_str
  from your_data
)s;
Returns:
newspaper - title - 1 - volume - 1234
newspaper - title - X - volume - 1233
newspaper - title - Y - volume - 1232
jane:45:1323
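
The same lookup can probably be written without the subquery. The following is only a sketch of that variant, not part of the original answer: your_table is a hypothetical table name and str a hypothetical column name standing in for the real ones, and coalesce is used in place of nvl on the assumption that the two behave the same for two arguments.

```sql
-- Hypothetical names: your_table and str stand in for the real table and column.
-- split(str, '\\|') breaks the string on literal pipes; [5] is the 5th field
-- after the first pipe (the array is zero-based). If the row has fewer fields,
-- the lookup is NULL and coalesce returns the original string instead.
select coalesce(split(str, '\\|')[5], str) as result
from your_table;
```

To pick a different field, change the index: n selects the nth string after the first pipe.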