Issue
I have a Japanese string "さいたま市　中央区" in my Hive table. I just want the first part of the string, i.e. さいたま市.
I have tried the split function and regular expressions, in both Hive and Python, but neither works.
I tried all of the following, and none of them worked:
select split("さいたま市 中央区",'')[0];
select regexp_extract("さいたま市 中央区","^(.*?)\\s(.*)",1)
select regexp_extract("さいたま市 中央区","[ur'[\u4e00-\ufaff]']",1)
I just want the first part of the string.
Solution
Posting this as an answer as well ...
Copy/pasting the text from your question and running repr on it gives me:
>>> repr("""I have a japanese string "さいたま市 中央区" in my hive table""")
'\'I have a japanese string "さいたま市\\u3000中央区" in my hive table\''
This suggests that split(... that text ..., '\u3000')[0] should produce the result you want.
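As a minimal Python sketch of the same idea (the variable names are only for illustration, and the text is assumed to contain U+3000, IDEOGRAPHIC SPACE, as the repr output above indicates), splitting on that exact character returns the first part:

```python
# The separator between the two parts is U+3000 (IDEOGRAPHIC SPACE),
# not an ASCII space, so it has to be named explicitly.
text = "さいたま市\u3000中央区"

# Split on the ideographic space and keep the first field.
first_part = text.split("\u3000")[0]
print(first_part)

# In Python 3, str.split() with no argument splits on any Unicode
# whitespace, which includes U+3000, so this gives the same result.
first_part_any_ws = text.split()[0]
```

This only shows the Python side; the Hive call in the answer has not been verified here.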
The expression "[ur'[\u4e00-\ufaff]']" looks extremely wrong; correcting it to ur'[\u4e00-\ufaff]' would perhaps work as well. Or maybe try simply "[\u4e00-\ufaff]".
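To check the regular-expression route, here is a small Python 3 sketch using the standard re module (variable names are illustrative). One caveat that may matter: さいたま is written in hiragana (U+3040 to U+309F), which lies outside \u4e00-\ufaff, so that range on its own would only pick up the kanji. Matching a run of non-whitespace characters, or widening the range to include kana, are two possible alternatives. Note that Python 3's \s and \S are Unicode-aware for str patterns; Hive uses Java regular expressions, where \s by default does not match U+3000, which may be why the second attempt in the question failed.

```python
import re

text = "さいたま市\u3000中央区"

# The kanji-only range skips the hiragana, so the first run it finds is just the kanji.
kanji_runs = re.findall(r"[\u4e00-\ufaff]+", text)

# Alternative 1: take the leading run of non-whitespace characters.
# In Python 3, \S excludes U+3000 because it counts as Unicode whitespace.
leading = re.match(r"\S+", text).group(0)

# Alternative 2: widen the class to include hiragana and katakana (U+3040 to U+30FF).
kana_kanji = re.match(r"[\u3040-\u30ff\u4e00-\ufaff]+", text).group(0)

print(kanji_runs, leading, kana_kanji)
```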
Answered By - tripleee