
feat: Allow loading the dictionary directly from a file object #24

@WFLing-seaer

Description

Use case: cn2t uses cutword for word segmentation, but to supply a custom dictionary it currently has to resort to the following workaround:

import os
import cutword

path_kwd_temp = os.path.expandvars("%temp%/cn2t_kwd.txt")
with open(path_kwd_temp, "w", encoding="utf-8") as f:
    CUTWORD_EXCLUDE = {"_NUM", ".", " "}
    # One dictionary entry per line: keyword<TAB>1<TAB>N
    f.write("\n".join(keyword + "\t1\tN" for keyword in keywords if keyword not in CUTWORD_EXCLUDE))

cutter = cutword.Cutter(dict_name=path_kwd_temp)

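In other words, the word list must first be written to a file, and only then can Cutter load it from that file.
Could cutter get a `file` or similar parameter so that it can load the dictionary directly from a file object (such as BytesIO), or even from a dict?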
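
Until something like that exists, the temp-file step can at least be hidden behind a small helper. The following is only a sketch of that idea, not part of cutword: `iter_dict_lines` and `as_dict_path` are hypothetical names, and the `word<TAB>1<TAB>N` line layout is copied from the snippet above (the meaning of the second and third fields is assumed, not confirmed). It accepts a dict or a text/binary file object and exposes it as a temporary file path.

```python
import io
import os
import tempfile
from collections.abc import Mapping
from contextlib import contextmanager
from typing import IO, Iterator, Union

# Hypothetical input type: a {word: count} mapping or an open file-like object.
DictSource = Union[Mapping, IO]


def iter_dict_lines(source: DictSource) -> Iterator[str]:
    """Yield dictionary lines (word<TAB>count<TAB>tag) from a mapping or file object."""
    if isinstance(source, Mapping):
        for word, count in source.items():
            # The trailing "N" tag mirrors the snippet in this issue (assumed format).
            yield f"{word}\t{count}\tN"
        return
    for raw in source:
        # BytesIO yields bytes, StringIO yields str; normalise to str.
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if line:
            yield line


@contextmanager
def as_dict_path(source: DictSource) -> Iterator[str]:
    """Write the source to a temporary file, yield its path, then delete it."""
    fd, path = tempfile.mkstemp(prefix="cutword_dict_", suffix=".txt")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write("\n".join(iter_dict_lines(source)))
        yield path
    finally:
        os.remove(path)


# Intended use (assumes Cutter reads the file inside its constructor,
# since the file is removed when the with-block exits):
#
#     with as_dict_path(io.BytesIO(data)) as path:
#         cutter = cutword.Cutter(dict_name=path)
```

This keeps the caller free of path handling, but it still goes through the filesystem; a real `file` or dict parameter on Cutter would avoid that round trip entirely.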
