Regex tool: regexr.com

GBK (GB2312/GB18030)

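x81-xfe GBK double-byte lead byte (trail byte: x40-x7e or x80-xfe)
x00-x7f ASCII (x20-x7e are the printable characters)
xa1-xfe Chinese (GB2312: both bytes of a double-byte character)
x80-xff any non-ASCII byte (loose match for Chinese in GBK text)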
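
The byte ranges above can be tried out on raw bytes. The following is a minimal Python sketch, not a full GBK validator: the names `GBK_DOUBLE_BYTE` and `gbk_double_byte_chars` are made up for illustration, and it assumes the input is valid GBK text scanned from its first byte.

```python
import re

# Lead byte 0x81-0xFE followed by a trail byte 0x40-0x7E or 0x80-0xFE
# (0x7F is not a valid trail byte). This is a bytes pattern, not a str pattern.
GBK_DOUBLE_BYTE = re.compile(rb'[\x81-\xfe][\x40-\x7e\x80-\xfe]')


def gbk_double_byte_chars(data):
    """Return the double-byte characters found in GBK-encoded bytes.

    Hypothetical helper: pairs that are unassigned in GBK are decoded
    with errors='replace' instead of raising.
    """
    return [pair.decode('gbk', errors='replace')
            for pair in GBK_DOUBLE_BYTE.findall(data)]


sample = '中文abc'.encode('gbk')
print(gbk_double_byte_chars(sample))
```

Matching a whole lead/trail pair is stricter than the loose single-byte class x80-xff, which removes or matches half a character at a time.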

UTF-8 (Unicode)

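u4e00-u9fa5 (Chinese, the commonly used CJK Unified Ideographs)
u3130-u318f (Korean, Hangul Compatibility Jamo)
uac00-ud7a3 (Korean, Hangul Syllables)
u3040-u30ff (Japanese kana; the often-quoted u0800-u4e00 also covers many non-Japanese scripts)
uff21-uff3a, uff41-uff5a fullwidth Latin letters A-Z and a-z (uff3b-uff40 in between are punctuation)
uff01-uff09 fullwidth ! " # $ % & ' ( ); uff02 is the double quote, uff06 is the ampersand (the Chinese ellipsis …… is U+2026 twice, not part of this block)
uff10-uff19 fullwidth digits 0-9
uff20 fullwidth @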
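
Because the fullwidth forms mirror ASCII in the same order, they can be folded back to ASCII by subtracting a fixed offset. The sketch below assumes the whole block U+FF01-U+FF5E (which maps to ASCII 0x21-0x7E) and also treats the ideographic space U+3000 as a space; `to_halfwidth` and `FULLWIDTH_OFFSET` are illustrative names, not from any library.

```python
# Distance between a fullwidth form and its ASCII counterpart:
# U+FF01 (fullwidth !) - U+0021 (!) = 0xFEE0.
FULLWIDTH_OFFSET = 0xFEE0


def to_halfwidth(text):
    """Convert fullwidth ASCII variants in text to plain ASCII."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0xFF01 <= code <= 0xFF5E:
            # Fullwidth letters, digits and punctuation.
            out.append(chr(code - FULLWIDTH_OFFSET))
        elif code == 0x3000:
            # Ideographic space (assumption: map it to an ASCII space).
            out.append(' ')
        else:
            out.append(ch)
    return ''.join(out)


print(to_halfwidth('ＡＢＣ１２３＠'))
```

Normalizing first means a single ASCII pattern such as `[A-Za-z0-9@]` can be used afterwards instead of listing the fullwidth ranges in every regex.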

Hangul syllables (uac00-ud7a3) sit above u9fa5, but not every character above u9fa5 is Korean.

Regex examples (using PHP):

    $str = preg_replace('/[\x80-\xff]/', '', $str);             // GBK: remove every non-ASCII byte
    $str = preg_replace('/[\x{4e00}-\x{9fa5}]/u', '', $str);    // UTF-8: PCRE needs \x{...} and the /u modifier

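Some languages (JavaScript, Python and Java, for example) write the code points with \u escapes, so the class becomes [\u4e00-\u9fa5].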
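
As a rough Python sketch of the escaped form, the snippet below removes or detects characters in the U+4E00-U+9FA5 range. The helper names `strip_chinese` and `has_chinese` are invented for this example.

```python
import re

# Python's re module understands \uXXXX inside a str pattern,
# so a raw string keeps the escape intact for the regex engine.
CHINESE = re.compile(r'[\u4e00-\u9fa5]')


def strip_chinese(text):
    """Remove every character in U+4E00-U+9FA5 from text."""
    return CHINESE.sub('', text)


def has_chinese(text):
    """Return True if text contains at least one such character."""
    return CHINESE.search(text) is not None


print(strip_chinese('abc中文123'))
print(has_chinese('hello'))
```

Note that this range only covers the commonly used ideographs; rarer characters in the extension blocks need additional ranges.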

r'\u1100-\u11FF'    # Hangul Jamo
r'\u3040-\u309F'    # Hiragana
r'\u30A0-\u30FF'    # Katakana
r'\u3130-\u318F'    # Hangul Compatibility Jamo
r'\u3400-\u4DBF'    # CJK Unified Ideographs Extension A
r'\u4E00-\u9FFF'    # CJK Unified Ideographs
r'\uA960-\uA97F'    # Hangul Jamo Extended-A
r'\uAC00-\uD7A3'    # Hangul Syllables
r'\uD7B0-\uD7FF'    # Hangul Jamo Extended-B
r'\uF900-\uFAFF'    # CJK Compatibility Ideographs
r'\uFF65-\uFF9F'    # Halfwidth Katakana
r'\uFFA0-\uFFDC'    # Halfwidth Hangul (compatibility jamo)
r'\U00020000-\U0002A6DF'  # CJK Unified Ideographs Extension B
r'\U0002A700-\U0002B73F'  # CJK Unified Ideographs Extension C
r'\U0002B740-\U0002B81F'  # CJK Unified Ideographs Extension D
r'\U0002B820-\U0002CEAF'  # CJK Unified Ideographs Extension E
r'\U0002F800-\U0002FA1F'  # CJK Compatibility Ideographs Supplement
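
The ranges above can be grouped into one character class per script. The following is one possible sketch in Python; the grouping into four scripts and the names `SCRIPT_PATTERNS` and `scripts_in` are assumptions made for illustration. Code points beyond U+FFFF are written with the eight-digit \U form, since \u only takes four hex digits.

```python
import re

# One character class per script, built from the ranges listed above.
SCRIPT_PATTERNS = {
    'hiragana': re.compile(r'[\u3040-\u309F]'),
    'katakana': re.compile(r'[\u30A0-\u30FF\uFF65-\uFF9F]'),
    'hangul': re.compile(
        r'[\u1100-\u11FF\u3130-\u318F\uA960-\uA97F'
        r'\uAC00-\uD7A3\uD7B0-\uD7FF\uFFA0-\uFFDC]'
    ),
    'han': re.compile(
        r'[\u3400-\u4DBF\u4E00-\u9FFF\uF900-\uFAFF'
        r'\U00020000-\U0002A6DF\U0002A700-\U0002B73F'
        r'\U0002B740-\U0002B81F\U0002B820-\U0002CEAF'
        r'\U0002F800-\U0002FA1F]'
    ),
}


def scripts_in(text):
    """Return the set of script names that occur at least once in text."""
    return {name for name, pattern in SCRIPT_PATTERNS.items()
            if pattern.search(text)}


print(scripts_in('漢字とカタカナ 한국어'))
```

Han ideographs alone do not identify a language, since Chinese and Japanese share them; the presence of kana or Hangul is the more useful signal.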

References