docs/content/docs/reference/tools/stringmanipulatortool/index.md
Lines changed: 184 additions & 9 deletions
@@ -1,7 +1,8 @@
1
1
---
2
2
title: String Manipulator Tool
3
-
description: Used to process the String fields of a work item. This is useful for cleaning up data. It will limit fields to a max length and apply regex replacements based on what is configured. Each regex replacement is applied in order and can be enabled or disabled.
3
+
description: Processes and cleans up string fields in work items by applying regex patterns, length limitations, and text transformations. Essential for data cleanup and standardization during migration.
The String Manipulator Tool applies configurable string manipulations to every string field of a work item during migration, so that data can be cleaned up, standardized, and corrected before it reaches the target system.
18
19
19
-
{{< class-options >}}
20
+
The tool processes string fields through a series of regex-based manipulators that can remove invalid characters, standardize formats, replace text patterns, and enforce field length limits. Each manipulation is applied in sequence and can be individually enabled or disabled.
21
+
22
+
### How It Works
23
+
24
+
The String Manipulator Tool operates on all string fields within work items during migration:
25
+
26
+
1. **Field Processing**: The tool identifies all string-type fields in each work item
2. **Sequential Application**: Each configured manipulator is applied in the order defined in the configuration
3. **Regex Transformations**: Pattern-based replacements using regular expressions
4. **Length Enforcement**: Truncates fields that exceed the maximum allowed length
5. **Conditional Execution**: Each manipulator can be individually enabled or disabled
31
+
32
+
The tool is automatically invoked by migration processors and applies transformations before work items are saved to the target system.
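The processing order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's implementation: the `Manipulator` class and `manipulate` function are hypothetical names, Python's `re` module stands in for the .NET regex engine the tool runs on, and truncation is assumed to happen after the regex replacements because that is the order in which the steps are listed.

```python
import re
from dataclasses import dataclass


@dataclass
class Manipulator:
    """Hypothetical stand-in for one entry in the Manipulators array."""
    pattern: str
    replacement: str
    enabled: bool = True


def manipulate(value: str, manipulators: list[Manipulator], max_length: int) -> str:
    """Apply each enabled manipulator in order, then enforce the length limit."""
    for m in manipulators:
        if not m.enabled:
            continue  # disabled manipulators are skipped entirely
        value = re.sub(m.pattern, m.replacement, value)
    # Assumption: truncation runs last, mirroring the order of the listed steps.
    return value[:max_length]


rules = [
    Manipulator(pattern=r"\s+", replacement=" "),  # collapse whitespace runs
    Manipulator(pattern=r"DRAFT", replacement="", enabled=False),  # disabled, so ignored
]
result = manipulate("DRAFT:   fix    login", rules, max_length=12)
print(repr(result))
```

Because the second rule is disabled, `DRAFT` survives, and the collapsed string is then cut to 12 characters.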
33
+
34
+
### Use Cases
35
+
36
+
Common scenarios where the String Manipulator Tool is essential:
20
37
21
-
## Samples
38
+
- **Data Cleanup**: Removing invalid Unicode characters, control characters, or formatting artifacts
- **Format Standardization**: Converting text patterns to consistent formats
- **Length Compliance**: Ensuring field values don't exceed target system limits
- **Character Encoding**: Fixing encoding issues from legacy systems
- **Pattern Replacement**: Updating URLs, paths, or references to match target environment
43
+
44
+
## Configuration Structure
45
+
46
+
### Options
47
+
48
+
{{< class-options >}}
22
49
23
50
### Sample
24
51
@@ -28,13 +55,161 @@ discussionId: 2643
28
55
29
56
{{< class-sample sample="defaults" >}}
30
57
31
-
### Classic
58
+
### Basic Examples
59
+
60
+
The String Manipulator Tool is configured with an array of manipulators, each defining a specific text transformation:
61
+
62
+
```json
{
  "StringManipulatorTool": {
    "Enabled": true,
    "MaxStringLength": 1000000,
    "Manipulators": [
      {
        "$type": "RegexStringManipulator",
        "Enabled": true,
        "Description": "Remove invalid characters",
        "Pattern": "[^\\x20-\\x7E\\r\\n\\t]",
        "Replacement": ""
      }
    ]
  }
}
```
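The doubled backslashes in `Pattern` are JSON escaping, not part of the regex. The following Python sketch shows what the regex engine receives after the JSON is decoded. It assumes Python's `json` and `re` modules as stand-ins for the tool's own configuration loader and the .NET regex engine, and the sample input is made up.

```python
import json
import re

# Raw string, so the text holds the same characters as the JSON file on disk.
config_text = r'''
{
  "Pattern": "[^\\x20-\\x7E\\r\\n\\t]",
  "Replacement": ""
}
'''

entry = json.loads(config_text)

# JSON decoding turns each "\\" into a single backslash, so the regex
# engine sees the escapes \x20, \x7E, \r, \n and \t.
decoded_pattern = entry["Pattern"]

sample = "Title\x00 with\x07 control\tchars\r\n"
cleaned = re.sub(decoded_pattern, entry["Replacement"], sample)
print(decoded_pattern)
print(repr(cleaned))
```

The NUL and BEL characters are removed, while the tab and the line ending are kept because the character class lists them explicitly.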
79
+
80
+
### Complex Examples
81
+
82
+
#### Manipulator Types
83
+
84
+
Currently, the tool supports the following manipulator types:
85
+
86
+
- **RegexStringManipulator**: Applies regular expression pattern matching and replacement
87
+
88
+
#### Manipulator Properties
89
+
90
+
Each manipulator supports these properties:
91
+
92
+
- **$type**: Specifies the manipulator type (e.g., "RegexStringManipulator")
- **Enabled**: Boolean flag to enable/disable this specific manipulator
- **Description**: Human-readable description of what the manipulator does
- **Pattern**: Regular expression pattern to match text
- **Replacement**: Text to replace matched patterns (can be empty string for removal)
97
+
98
+
## Common Scenarios
99
+
100
+
### Removing Invalid Characters
101
+
102
+
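Remove control characters and other characters outside printable ASCII that may cause issues in the target system. The pattern matches wherever such characters occur, and it also strips legitimate non-ASCII text such as accented letters, so narrow it if the source data is not plain ASCII: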
103
+
104
+
```json
{
  "$type": "RegexStringManipulator",
  "Description": "Remove characters outside printable ASCII, tabs, and line breaks",
  "Enabled": true,
  "Pattern": "[^( -~)\n\r\t]+",
  "Replacement": ""
}
```
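The reach of this pattern is easy to underestimate, so here is a small Python check. It assumes Python's `re` module behaves like the tool's .NET regex engine for this simple character class, and the sample string is invented for illustration.

```python
import re

# Same characters the regex engine receives once the JSON escapes
# \n, \r and \t have been decoded into real control characters.
pattern = "[^( -~)\n\r\t]+"

# Accented letters, a zero-width space (U+200B) and a vertical tab (0x0B).
sample = "Caf\u00e9\u200b r\u00e9sum\u00e9\x0b ok\n"
cleaned = re.sub(pattern, "", sample)
print(repr(cleaned))
```

The vertical tab and the zero-width space are removed as intended, but so is every `é`, which leaves `Caf rsum ok`.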
113
+
114
+
### Standardizing Line Endings
115
+
116
+
Convert all line endings to a consistent format:
117
+
118
+
```json
{
  "$type": "RegexStringManipulator",
  "Description": "Standardize line endings to CRLF",
  "Enabled": true,
  "Pattern": "\r\n|\n|\r",
  "Replacement": "\r\n"
}
```
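A quick Python sketch of the same replacement shows why the order of the alternatives matters. It assumes Python's `re` module as a stand-in for the .NET engine, and the input string is made up.

```python
import re

mixed = "line one\nline two\r\nline three\rline four"

# \r\n is listed first, so an existing CRLF is matched as one unit.
normalized = re.sub("\r\n|\n|\r", "\r\n", mixed)

# With \r first, the CR and LF of a CRLF pair are matched separately
# and each one is expanded, which doubles the line break.
wrong_order = re.sub("\r|\n|\r\n", "\r\n", "a\r\nb")

print(repr(normalized))
print(repr(wrong_order))
```

Running the correct pattern a second time leaves the text unchanged, so the manipulator is safe to apply to fields that already use CRLF.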
127
+
128
+
### Cleaning HTML Content
129
+
130
+
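Strip HTML tags from text fields. The pattern removes everything between a `<` and the next `>`, so entities such as `&amp;` and the text inside removed elements stay in place: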
131
+
132
+
```json
{
  "$type": "RegexStringManipulator",
  "Description": "Remove HTML tags",
  "Enabled": true,
  "Pattern": "<[^>]*>",
  "Replacement": ""
}
```
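The limits of a tag-stripping regex show up quickly on real markup. The Python sketch below assumes `re` as a stand-in for the .NET engine and uses an invented HTML fragment.

```python
import re

html = '<p>Fix <b>login</b> &amp; logout</p><img alt="a > b" src="x.png">'
stripped = re.sub(r"<[^>]*>", "", html)
print(stripped)
```

The entity `&amp;` is left untouched, and the `>` inside the `alt` attribute ends the match early, so part of the `<img>` tag leaks into the output. For fields with attribute-heavy HTML, review the results before relying on this pattern.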
141
+
142
+
### Fixing Encoding Issues
143
+
144
+
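Replace common encoding artifacts (mojibake), such as curly quotes whose UTF-8 bytes were decoded as Windows-1252. This sample maps every matched artifact to a straight apostrophe, including those that were originally double quotes: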
145
+
146
+
```json
{
  "$type": "RegexStringManipulator",
  "Description": "Fix common encoding issues",
  "Enabled": true,
  "Pattern": "’|“|â€\u009d",
  "Replacement": "'"
}
```
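Where these artifacts come from can be reproduced in Python, which may help when building patterns for other characters. This is a sketch under assumptions: the helper `mojibake` is a hypothetical name, and only the first two alternatives of the pattern are reproduced, because the third relies on byte `0x9D`, which has no Windows-1252 mapping and makes Python's `cp1252` codec raise an error.

```python
import re


def mojibake(ch: str) -> str:
    """Simulate UTF-8 bytes being misread as Windows-1252."""
    return ch.encode("utf-8").decode("cp1252")


right_single = mojibake("\u2019")  # right single quotation mark
left_double = mojibake("\u201c")   # left double quotation mark

# Build the alternation from the generated artifacts.
pattern = "|".join(re.escape(s) for s in (right_single, left_double))

text = f"It{right_single}s {left_double}done"
fixed = re.sub(pattern, "'", text)
print(fixed)
```

Both artifacts collapse to `'`, so the opening double quote in the sample also becomes an apostrophe. Separate manipulators with different replacements would be needed to keep the two kinds of quote apart.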
155
+
156
+
## Good Practices
157
+
158
+
### Pattern Testing
159
+
160
+
- **Test regex patterns** thoroughly before applying to production data
- **Use regex testing tools** to validate patterns against sample data
- **Consider edge cases** and unintended matches in your patterns
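One lightweight way to follow these points is a table of input and expected-output pairs that is run against each pattern before migration. The Python sketch below is an assumption-laden example: `failing_cases` is a hypothetical helper, the cases are invented, and Python's `re` module differs from .NET regex in places, so patterns that use engine-specific syntax should also be checked in a .NET-compatible tester.

```python
import re

PATTERN = r"[^\x20-\x7E\r\n\t]"
REPLACEMENT = ""

# (input, expected output) pairs, including edge cases.
CASES = [
    ("plain text", "plain text"),
    ("tab\tkept", "tab\tkept"),
    ("bell\x07removed", "bellremoved"),
    ("caf\u00e9", "caf"),  # non-ASCII letters are removed too
    ("", ""),              # empty field
]


def failing_cases(pattern, replacement, cases):
    """Return (input, expected, actual) for every case that does not match."""
    compiled = re.compile(pattern)
    failed = []
    for source, expected in cases:
        actual = compiled.sub(replacement, source)
        if actual != expected:
            failed.append((source, expected, actual))
    return failed


failed = failing_cases(PATTERN, REPLACEMENT, CASES)
print(failed)
```

An empty list means every case produced the expected result. Writing down the `café` case makes the loss of accented letters an explicit decision, not a surprise.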
163
+
164
+
### Performance Considerations
165
+
166
+
- **Order manipulators deliberately**: Manipulators run in sequence on every string field, so place patterns that remove large amounts of text before more complex ones
- **Use specific patterns**: Avoid overly broad regex that may match unintended content
- **Consider field length**: Set appropriate `MaxStringLength` to prevent excessive processing
169
+
170
+
### Data Safety
171
+
172
+
- **Backup source data**: Always maintain backups before applying string manipulations
- **Test with sample data**: Validate manipulations on a subset before full migration
- **Review results**: Check processed fields to ensure transformations are correct
175
+
176
+
### Configuration Management
177
+
178
+
- **Document patterns**: Include clear descriptions for each manipulator
- **Version control**: Maintain configuration files in version control
- **Incremental changes**: Test one manipulator at a time when developing complex transformations
181
+
182
+
## Troubleshooting
183
+
184
+
### Common Issues
185
+
186
+
**Manipulations Not Applied:**
187
+
188
+
- Verify the tool is enabled (`"Enabled": true`)
189
+
- Check that individual manipulators are enabled
190
+
- Review regex patterns for syntax errors
191
+
- Ensure the tool is configured in the processor's tool list
192
+
193
+
**Unexpected Results:**
194
+
195
+
- Test regex patterns in isolation with sample data
196
+
- Check the order of manipulators (they execute sequentially)
197
+
- Verify escape sequences in JSON configuration
198
+
- Review field content before and after processing
199
+
200
+
**Performance Issues:**
32
201
33
-
{{< class-sample sample="classic" >}}
202
+
- Consider reducing `MaxStringLength` if processing very large fields
203
+
- Optimize regex patterns to avoid catastrophic backtracking
204
+
- Disable unnecessary manipulators
205
+
- Process smaller batches of work items
34
206
35
-
## Metadata
207
+
**Regex Pattern Errors:**
36
208
37
-
{{< class-metadata >}}
209
+
- Validate regex syntax using online tools or testing utilities
210
+
- Escape special characters properly in JSON configuration
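Both kinds of error can be caught before a migration run with a small pre-flight check. The Python sketch below is illustrative only: `validate_config` is a hypothetical helper, the configuration layout is assumed to match the basic example, and Python's `re` syntax is not identical to .NET's, so a pattern that compiles here is not guaranteed to compile in the tool, and vice versa.

```python
import json
import re


def validate_config(config_text: str) -> list[str]:
    """Return a list of problems found in the configuration text."""
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]

    problems = []
    manipulators = config["StringManipulatorTool"]["Manipulators"]
    for index, manipulator in enumerate(manipulators):
        try:
            re.compile(manipulator["Pattern"])
        except re.error as exc:
            problems.append(f"manipulator {index}: {exc}")
    return problems


# A single backslash before "d" is not a valid JSON escape.
bad_json = r'{"StringManipulatorTool": {"Manipulators": [{"Pattern": "\d+"}]}}'
# Valid JSON, but the character class is never closed.
bad_regex = r'{"StringManipulatorTool": {"Manipulators": [{"Pattern": "[a-"}]}}'
# Correctly escaped: the regex engine receives \d+.
good = r'{"StringManipulatorTool": {"Manipulators": [{"Pattern": "\\d+"}]}}'

for text in (bad_json, bad_regex, good):
    print(validate_config(text))
```

The first configuration fails at the JSON stage, the second at the regex stage, and the third passes both.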