Replies: 1 comment
-
I was unable to find any existing documentation or issues in the Dify repository that directly address periods (.) being omitted during text preprocessing in version 1.3. The available documentation and code references describe general text preprocessing steps, such as removing extra spaces, URLs, emails, and stopwords, and segmenting text using separators.

Based on what is documented, periods should not be removed by default during preprocessing, unless they are affected by a custom segmentation rule or by an unintended side effect of the chosen segment identifier. You mentioned that periods are only retained when the newline character (\n) is used as the identifier.

At this time, I could not find a documented solution or workaround specific to this problem in version 1.3. If you have already tried adjusting the segment identifier and the issue persists, it may be an undocumented or new issue.

If you need further troubleshooting, reviewing the specific preprocessing or segmentation logic in your deployment, or sharing more details about your configuration, might help identify the root cause.
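To illustrate the "unintended side effect of the chosen segment identifier" mentioned above, here is a minimal Python sketch. It is not Dify's actual preprocessing code; the function names `split_dropping_separator` and `split_keeping_separator` are hypothetical and exist only for this example. It shows how a splitter that consumes its separator would drop periods whenever the separator is a period, while a newline separator would leave them intact.

```python
import re


def split_dropping_separator(text: str, separator: str) -> list[str]:
    """Split on a literal separator; the separator itself is discarded."""
    return [chunk.strip() for chunk in text.split(separator) if chunk.strip()]


def split_keeping_separator(text: str, separator: str) -> list[str]:
    """Split right after each separator so it stays at the end of its chunk."""
    # Zero-width lookbehind: match the position just after the separator.
    pattern = f"(?<={re.escape(separator)})"
    return [chunk.strip() for chunk in re.split(pattern, text) if chunk.strip()]


text = "First sentence. Second sentence.\nThird sentence."

dropped = split_dropping_separator(text, ".")      # periods are consumed
kept = split_keeping_separator(text, ".")          # periods are preserved
by_newline = split_dropping_separator(text, "\n")  # periods are untouched

print(dropped)
print(kept)
print(by_newline)
```

If the behavior in your deployment matches the first function, that would be consistent with periods surviving only when \n is the segment identifier. Whether Dify 1.3 actually behaves this way is an assumption that would need to be checked against its source.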
-
Self Checks
Provide a description of requested docs changes
Hi,
I'm writing to inquire about an issue I'm encountering while trying to input internal knowledge.
Specifically, during the text preprocessing stage (the step that follows step 1, selecting the data source), I've noticed that periods (.) are being omitted from the chunked content.
I've tried adjusting the segment identifier. However, unless I use \n (newline character) as the identifier, periods continue to be omitted.
I am using version 1.3.
Is there a solution to prevent periods from being omitted?