Skip to content

Invalid links returned with some chinese characters as delimiters #15

@tommedema

Description

@tommedema

Steps to reproduce:

  1. linkify the following text

【视频奇志大兵《发烧友》 在线观看 - 酷6视频】奇志大兵《发烧友》 在线观看,奇志 大兵 搞笑双簧 _ 发烧友 (追星族) http://t.cn/RZwjG7U(分享自 @酷6网)

  1. the output link is

http://t.cn/RZwjG7U(分享自

whereas the output link should be

http://t.cn/RZwjG7U

The reason is that ( is not recognized as a separating delimiter, yet it is quite common in Chinese.

Out of 500 posts I gathered, about 20 to 30 of them had links like this, resulting in invalid links reported by linkify.

Note that I realize that these users are technically posting invalid URLs, but 20-30 out of 500 is very common and therefore there should be a way to deal with this. Any suggestion?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions