Commit 1271339

Updates for v0.0.3
1 parent 356863d commit 1271339

15 files changed: +213 -48 lines changed

CHANGELOG.md

Lines changed: 41 additions & 4 deletions
@@ -11,15 +11,52 @@ Here are a few common issues.

 ## [Unreleased]

-## [0.0.2]
+## [0.0.3] - 2019-11-01
+
+### Added
+
+- Code provider:
+    - Strings starting with a digit detection.
+    - DateTime patterns (e.g. `dd/MM/yyyy hh:mm`) detection.
+    - `img` HTML expression detection.
+    - Mathematical symbols detection (e.g. `this > that`).
+    - Strings with more than two contiguous spaces.
+- Class provider:
+    - `calc()` JavaScript expression detection.
+    - GraphQL expressions (`query`, `mutation`) detection.
+- New configuration setting `string-checker-js.file-extension-exclude` to specify file extensions to exclude from scan (default: `.d.ts`, `.min.js`).
+- New configuration setting `string-checker-js.variable.non-alpha-ratio-threshold` to specify the amount of non-alphabetical characters allowed in a string (default: `0.2`).
+
+### Changed
+
+- Entropy provider:
+    - Entropy is now computed from a string cleaned of its non-alphabetical characters.
+      Example:
+      `'"{field}" is invalid!'` has an entropy of 3.63.
+      `'field is invalid'` has an entropy of 3.12.
+- File extension filtering now uses a suffix-compare method to support multiple-dots extension (e.g. `.d.ts`).
+- Configuration setting `string-checker-js.workspace.file-max` maximum value set to 1000 (500 before).
+- Tokens are all collapsed by default when switching to token/file view.
+- Code refactoring.
+
+## [0.0.2] - 2019-10-30
+
+### Added
+
+- Code provider:
+    - Environment variable detection.
+- Class provider:
+    - `rgb()` JavaScript expression detection.
+
+### Changed

 - Formatting in README file.
 - "Release Notes" section removed from README file.
 - "Known Issues" section moved from README to CHANGELOG file.
-- Code provider improvement (environment variable detection).
-- Class provider improvement (`rgb()` javascript statement detection).
 - Code refactoring.

-## [0.0.1]
+## [0.0.1] - 2019-10-28
+
+### Added

 - Initial release.
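
Reviewer note: the changelog entry about suffix comparison can be pictured with a small sketch. The helper names below (`hasSuffix`, `isExcludedByExtension`) are hypothetical and are not taken from the extension's source; the case-insensitive comparison is also an assumption. The point is only that comparing against the end of the file name handles multi-dot extensions such as `.d.ts`, which a "text after the last dot" approach would see as `.ts`.

```typescript
// Hypothetical helper: true when `value` ends with `suffix`.
function hasSuffix(value: string, suffix: string): boolean {
    return suffix.length <= value.length
        && value.indexOf(suffix, value.length - suffix.length) !== -1;
}

// Hypothetical helper: true when the file name ends with one of the excluded suffixes.
// Assumption: the comparison ignores case.
function isExcludedByExtension(fileName: string, excluded: string[]): boolean {
    const lower = fileName.toLowerCase();
    return excluded.some(ext => hasSuffix(lower, ext.toLowerCase()));
}

// Defaults listed for the `string-checker-js.file-extension-exclude` setting.
const defaultExcluded = ['.d.ts', '.min.js'];

console.log(isExcludedByExtension('vscode.d.ts', defaultExcluded));  // true
console.log(isExcludedByExtension('app.min.js', defaultExcluded));   // true
console.log(isExcludedByExtension('extension.ts', defaultExcluded)); // false
```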

README.md

Lines changed: 5 additions & 2 deletions
@@ -30,12 +30,14 @@ Strings are evaluated by different providers, each being dedicated to a specific
 | Provider | Description | Example |
 |---|---|---|
 | Keywords provider | Detects strings from **user list**. | "far fa-smile" will be detected as a [Font Awesome smile icon](https://fontawesome.com/icons/smile?style=regular). |
-| Class provider | Detects strings as **class names**. | |
+| Class provider | Detects strings as **class names** or **expressions**. | "use strict" will be detected as JavaScript expression. |
 | Code provider | Detects strings as **code** (variable names). | "../path/to/my/file" will be detected as a path.<br>"someVariable" will be detected as a camel case variable. |
 | Natural language provider | Detects strings as **natural language**. | ["Ceci n'est pas une pipe"](https://en.wikipedia.org/wiki/Ren%C3%A9_Magritte) will be detected as French language. |
-| Entropy provider | Detects string as **[Gibberish](https://en.wikipedia.org/wiki/Gibberish)**.<br>String [entropy](https://en.wikipedia.org/wiki/Entropy_(information_theory)) threshold can be configured in settings (`entropy.threshold`, default = 3). | "abbcccddddeeeee" has an entropy of 2.15.<br>"dd/MM/yyyy hh:mm:ss" has an entropy of 2.88.<br>["Gloubi-boulga"](https://fr.wikipedia.org/wiki/Gloubi-boulga) has an entropy of 2.93. |
+| Entropy provider | Detects string as **[Gibberish](https://en.wikipedia.org/wiki/Gibberish)**.<br>String [entropy](https://en.wikipedia.org/wiki/Entropy_(information_theory)) [1] threshold can be configured in settings (`entropy.threshold`, default = 3). | "abbcccddddeeeee" has an entropy of 2.44.<br>"dd/MM/yyyy hh:mm:ss" has an entropy of 2.88.<br>["Gloubi-boulga"](https://fr.wikipedia.org/wiki/Gloubi-boulga) has an entropy of 2.75. |
 | String provider | **Pass-through** detection. | *Any string will be detected as such.* |

+- [1] Starting at version v0.0.3, **string entropy** is computed after removing non-alphabetical characters.
+
 The `string.checker.js.testString` [command](#extension-settings) brings a convenient way to test all providers for a given string.

 ![demo-test-string](https://raw.githubusercontent.com/michelcaradec/string-checker-js/master/readme_assets/demo-test-string.gif)
@@ -71,4 +73,5 @@ This extension contributes the following settings:
 - [Font Awesome](https://fontawesome.com/icons/) icon is used for tokens activity bar.
 - [freeicons.io](https://www.freeicons.io/) icons are used for tokens view.
 - [franc](https://github.com/wooorm/franc) library is used for natural language detection.
+- [escape-string-regexp](https://github.com/sindresorhus/escape-string-regexp) library is used for escaping strings in regular expressions.
 - [TypeScript-Node-Starter](https://github.com/Microsoft/TypeScript-Node-Starter) Microsoft sample project is used for demonstration purpose.

package-lock.json

Lines changed: 16 additions & 5 deletions
Generated file; diff not rendered.

package.json

Lines changed: 24 additions & 4 deletions
@@ -9,7 +9,7 @@
     "url": "https://github.com/michelcaradec/string-checker-js"
   },
   "license": "MIT",
-  "version": "0.0.2",
+  "version": "0.0.3",
   "engines": {
     "vscode": "^1.38.0"
   },
@@ -212,18 +212,26 @@
           ],
           "description": "Set workspace file extensions to include in document scan."
         },
+        "string-checker-js.file-extension-exclude": {
+          "type": "[string]",
+          "default": [
+            ".d.ts",
+            ".min.js"
+          ],
+          "description": "Set workspace file extensions to exclude from document scan."
+        },
         "string-checker-js.folder-name-exclude": {
           "type": "[string]",
           "default": [
             "node_modules"
           ],
-          "description": "Set workspace folder names to exclude in document scan."
+          "description": "Set workspace folder names to exclude from document scan."
         },
         "string-checker-js.workspace.file-max": {
           "type": "number",
           "default": 100,
           "minimum": 0,
-          "maximum": 500,
+          "maximum": 1000,
           "description": "Set maximum number of files to process while executing workspace documents scan."
         },
         "string-checker-js.entropy.threshold": {
@@ -235,8 +243,12 @@
         "string-checker-js.language.languages-check": {
           "type": "[string]",
           "default": [
-            "fra", "eng", "spa", "ita"
+            "fra",
+            "eng",
+            "spa",
+            "ita"
           ],
+          "description": "Set languages to detect (see franc (https://github.com/wooorm/franc) library).",
           "markdownDescription": "Set languages to detect (see [franc](https://github.com/wooorm/franc) library)."
         },
         "string-checker-js.variable.word-min-length": {
@@ -251,6 +263,13 @@
           "minimum": 10,
           "description": "Set variable detection maximum word length. All words above this length will be detected as technical items."
         },
+        "string-checker-js.variable.non-alpha-ratio-threshold": {
+          "type": "number",
+          "default": 0.2,
+          "minimum": 0,
+          "maximum": 1,
+          "description": "Set non-alphabetical characters threshold."
+        },
         "string-checker-js.parser.jquery-exclude": {
           "type": "boolean",
           "default": true,
@@ -280,6 +299,7 @@
     "typescript": "^3.6.4"
   },
   "dependencies": {
+    "escape-string-regexp": "^2.0.0",
     "franc": "^4.1.0",
     "typescript": "^3.6.4"
   }
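
Reviewer note: the new `non-alpha-ratio-threshold` setting is a number between 0 and 1 with a default of 0.2. The sketch below shows one way such a ratio could be computed and compared with the threshold. It is an assumption, not the extension's `getNonAlphaRatio` (which this commit view does not show): here only ASCII letters count as alphabetical and whitespace is ignored entirely.

```typescript
// Hypothetical re-implementation: share of characters that are neither
// ASCII letters nor whitespace. The real helper may count differently.
function nonAlphaRatio(text: string): number {
    const visible = text.replace(/\s/g, '');
    if (visible.length === 0) {
        return 0;
    }
    const nonAlphaCount = visible.replace(/[a-z]/gi, '').length;
    return nonAlphaCount / visible.length;
}

// Default of `string-checker-js.variable.non-alpha-ratio-threshold`.
const threshold = 0.2;

// Mirrors the comparison in the diff: a ratio strictly above the threshold is "technical".
function looksTechnical(text: string): boolean {
    return nonAlphaRatio(text) > threshold;
}

console.log(looksTechnical('Save changes?'));      // false (1 of 12 visible characters)
console.log(looksTechnical('{0}: [{1}] => #{2}')); // true (no letters at all)
```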

src/checker/providers/classNameProvider.ts

Lines changed: 20 additions & 0 deletions
@@ -15,9 +15,29 @@ export class ClassNameDetect implements IDetectProvider {
             return [ConfidenceLevel.Technical, 'javascript'];
         }

+        if (/^calc\([^\)]+\)$/m.test(text)) {
+            return [ConfidenceLevel.Technical, 'javascript'];
+        }
+
         if (text.search(/fa[rs]? fa(-[^-]*)*/g) >= 0) {
             return [ConfidenceLevel.Technical, 'font awesome'];
         }
+
+        if (/^([a-z]+)\s\1-.+$/.test(text)) {
+            // `table table-*`.
+            return [ConfidenceLevel.Technical, 'class'];
+        }
+
+        if (/^query +\{/.test(text)
+            || /^query\(/.test(text)
+            || /^query +[a-zA-Z_][a-zA-Z\d_]+\(/.test(text)
+            || /^mutation\(/.test(text)) {
+            // `query {...}`
+            // `query(...)`
+            // `query myFunction(...)`
+            // `mutation(...)`
+            return [ConfidenceLevel.Technical, 'graphql'];
+        }

         if (text.search(/btn btn(-[^-]*)*/g) >= 0) {
             return [ConfidenceLevel.Technical, 'button'];
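
Reviewer note: the behaviour of the new class-provider patterns can be checked in isolation. In the sketch below the regular expressions are copied from the hunk above, while the `classify` wrapper and the sample strings are illustrative only; the other checks of the real provider are left out.

```typescript
// Patterns copied from the hunk above; the wrapper itself is hypothetical.
function classify(text: string): string | undefined {
    if (/^calc\([^\)]+\)$/m.test(text)) {
        return 'javascript';
    }
    if (/^([a-z]+)\s\1-.+$/.test(text)) {
        return 'class'; // e.g. `table table-striped`
    }
    if (/^query +\{/.test(text)
        || /^query\(/.test(text)
        || /^query +[a-zA-Z_][a-zA-Z\d_]+\(/.test(text)
        || /^mutation\(/.test(text)) {
        return 'graphql';
    }
    return undefined;
}

console.log(classify('calc(100% - 20px)'));           // javascript
console.log(classify('table table-striped'));         // class
console.log(classify('query { user { id } }'));       // graphql
console.log(classify('query getUser($id: ID!) { }')); // graphql
console.log(classify('mutation addUser($n: String)')); // undefined
console.log(classify('btn btn-primary'));             // class
```

Two things stand out in this sketch. A named mutation such as `mutation addUser(...)` is not matched, because only `mutation(` is covered. And `btn btn-primary` satisfies the generic `word word-*` pattern, so if the checks run in the order shown in the hunk, such a string would be labelled `class` before the `btn btn` check is reached.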

src/checker/providers/codeDetect.ts

Lines changed: 35 additions & 10 deletions
@@ -2,17 +2,20 @@ import * as vscode from 'vscode';
 import { IDetectProvider } from "./detectProvider";
 import { ConfidenceLevel } from "../../enumerations";
 import { Constants } from '../../constants';
+import { isCamelCase, isPascalCase, getNonAlphaRatio } from '../../helpers/utils';

 export class CodeDetect implements IDetectProvider {
     private _minWordLength: number;
     private _maxWordLength: number;
+    private _nonAlphaThreshold: number;
     readonly name: string = 'Code provider';

     readonly isStopOnEval: boolean = true;

     constructor() {
         this._minWordLength = vscode.workspace.getConfiguration(Constants.ExtensionID).get<number>('variable.word-min-length')!;
         this._maxWordLength = vscode.workspace.getConfiguration(Constants.ExtensionID).get<number>('variable.word-max-length')!;
+        this._nonAlphaThreshold = vscode.workspace.getConfiguration(Constants.ExtensionID).get<number>('variable.non-alpha-ratio-threshold')!;
     }

     private static readonly _startChars = [
@@ -62,6 +65,31 @@ export class CodeDetect implements IDetectProvider {
             return [ConfidenceLevel.Technical, 'env'];
         }

+        if (/^\d.*$/.test(text)) {
+            // Starting with a digit.
+            return [ConfidenceLevel.Technical, 'digit'];
+        }
+
+        if (/^((d{2,3}|m{2,3}|yyyy|yy|hh|s{2,3})(\/|\-|:|\.|,| )?)+$/i.test(text)) {
+            // DateTime patterns.
+            return [ConfidenceLevel.Technical, 'datetime'];
+        }
+
+        if (/^img\d+x\d+\s/.test(text)) {
+            // img.
+            return [ConfidenceLevel.Technical, 'img'];
+        }
+
+        if (/ ?(>|<|\+|\*|=|\/) ?/.test(text)) {
+            // Text with mathematical symbols.
+            return [ConfidenceLevel.Technical, 'formula'];
+        }
+
+        if (/\s{3,}/.test(text)) {
+            // Many contiguous spaces (2 contiguous might be a typo, but not 3 or more).
+            return [ConfidenceLevel.Technical, 'spaces'];
+        }
+
         const posSpace = text.search(/\s/g);
         if (posSpace < 0) {
             // No white space...
@@ -75,11 +103,11 @@ export class CodeDetect implements IDetectProvider {
                 return [ConfidenceLevel.Technical, 'length'];
             }

-            if (this.isCamelCase(text)) {
+            if (isCamelCase(text)) {
                 return [ConfidenceLevel.Technical, 'camelCase'];
             }

-            if (this.isPascalCase(text)) {
+            if (isPascalCase(text)) {
                 return [ConfidenceLevel.Technical, 'PascalCase'];
             }
         } else {
@@ -89,14 +117,11 @@ export class CodeDetect implements IDetectProvider {
             }
         }

-        return [ConfidenceLevel.Unknown, ''];
-    }
-
-    private isCamelCase(text: string): boolean {
-        return /^[a-z][^\s]+$/.test(text);
-    }
+        const nonAlphaRatio = getNonAlphaRatio(text);
+        if (nonAlphaRatio > this._nonAlphaThreshold) {
+            return [ConfidenceLevel.Technical, `non-alpha=${nonAlphaRatio.toFixed(2)}`];
+        }

-    private isPascalCase(text: string): boolean {
-        return /^([A-Z][^\sA-Z]+)+$/.test(text);
+        return [ConfidenceLevel.Unknown, ''];
     }
 }
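
Reviewer note: the five new code-provider checks are order-dependent, which is easiest to see by running them outside the class. The regular expressions below are copied from the hunk above in the same order; the `checks` table, the `firstMatch` wrapper and the sample strings are illustrative assumptions.

```typescript
// Regular expressions copied from the hunk above, in the same order.
const checks: Array<[RegExp, string]> = [
    [/^\d.*$/, 'digit'],
    [/^((d{2,3}|m{2,3}|yyyy|yy|hh|s{2,3})(\/|\-|:|\.|,| )?)+$/i, 'datetime'],
    [/^img\d+x\d+\s/, 'img'],
    [/ ?(>|<|\+|\*|=|\/) ?/, 'formula'],
    [/\s{3,}/, 'spaces'],
];

// Hypothetical wrapper: returns the label of the first matching check.
function firstMatch(text: string): string | undefined {
    for (let i = 0; i < checks.length; i++) {
        if (checks[i][0].test(text)) {
            return checks[i][1];
        }
    }
    return undefined;
}

console.log(firstMatch('3 items selected')); // digit
console.log(firstMatch('dd/MM/yyyy hh:mm')); // datetime
console.log(firstMatch('this > that'));      // formula
console.log(firstMatch('yes/no'));           // formula
console.log(firstMatch('left   right'));     // spaces
console.log(firstMatch('Save changes'));     // undefined
```

Because the spaces around the symbol are optional in the `formula` pattern, any string containing `>`, `<`, `+`, `*`, `=` or `/` matches it, including ordinary text such as `yes/no`. The date pattern `dd/MM/yyyy hh:mm` is only labelled `datetime` because that check runs before the `formula` one.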

src/checker/providers/entropyDetect.ts

Lines changed: 2 additions & 1 deletion
@@ -2,6 +2,7 @@ import * as vscode from 'vscode';
 import { IDetectProvider } from "./detectProvider";
 import { ConfidenceLevel } from "../../enumerations";
 import { Constants } from '../../constants';
+import { stripNonAlphaCharacters } from '../../helpers/utils';

 export class EntropyDetect implements IDetectProvider {
     private _threshold: number;
@@ -15,7 +16,7 @@ export class EntropyDetect implements IDetectProvider {
     readonly isStopOnEval: boolean = false;

     check(text: string): [ConfidenceLevel, string] {
-        var entropy = this.getShannonEntropy(text.toLowerCase());
+        var entropy = this.getShannonEntropy(stripNonAlphaCharacters(text.toLowerCase())!);

         return [entropy > this._threshold ? ConfidenceLevel.Message : ConfidenceLevel.Unknown, `entropy=${entropy.toFixed(2)}`];
     }
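
Reviewer note: the effect of cleaning the string before computing its entropy can be reproduced with a short sketch. Neither `getShannonEntropy` nor `stripNonAlphaCharacters` is visible in this commit view, so both functions below are stand-ins. The cleaning step assumes that ASCII letters and whitespace are kept and everything else is dropped; under that assumption the results agree with the figures quoted in the changelog (about 3.63 before cleaning, about 3.12 after).

```typescript
// Assumption: only ASCII letters and whitespace survive the cleaning step.
function keepLettersAndSpaces(text: string): string {
    return text.replace(/[^a-z\s]/gi, '');
}

// Shannon entropy in bits per character: H = -sum(p * log2(p)).
function shannonEntropy(text: string): number {
    if (text.length === 0) {
        return 0;
    }
    // Count occurrences of each character.
    const counts: { [ch: string]: number } = {};
    for (let i = 0; i < text.length; i++) {
        const ch = text.charAt(i);
        counts[ch] = (counts[ch] || 0) + 1;
    }
    // Sum the contribution of each distinct character.
    let entropy = 0;
    for (const ch in counts) {
        const p = counts[ch] / text.length;
        entropy -= p * Math.log(p) / Math.LN2;
    }
    return entropy;
}

const raw = '"{field}" is invalid!'.toLowerCase();
console.log(shannonEntropy(raw));                       // entropy of the raw string
console.log(shannonEntropy(keepLettersAndSpaces(raw))); // entropy of the cleaned string
```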

src/commands.ts

Lines changed: 7 additions & 2 deletions
@@ -194,7 +194,7 @@ export class Commands {
         UserDictionaryPersist.add(DictionaryType.ExcludeFolder, () => {
             const fullPath = path.parse((<TreeItemFile>node).uri.fsPath).dir;

-            return nameOnly ? ItemRegex.fromValue(`\\${path.sep}${path.basename(fullPath)}$`).rawValue : fullPath;
+            return nameOnly ? ItemRegex.fromValue(`\\${path.sep}${ItemRegex.escape(path.basename(fullPath))}$`).rawValue : fullPath;
         });
     }

@@ -203,7 +203,12 @@ export class Commands {
             return;
         }

-        UserDictionaryPersist.add(DictionaryType.ExcludeFile, () => nameOnly ? ItemRegex.fromValue(`\\${path.sep}${path.basename((<TreeItemFile>node).uri.fsPath)}$`).rawValue : (<TreeItemFile>node).uri.fsPath);
+        UserDictionaryPersist.add(
+            DictionaryType.ExcludeFile,
+            () =>
+                nameOnly
+                    ? ItemRegex.fromValue(`\\${path.sep}${ItemRegex.escape(path.basename((<TreeItemFile>node).uri.fsPath))}$`).rawValue
+                    : (<TreeItemFile>node).uri.fsPath);
     }

     static addTokenDictionary(type: DictionaryType, node: vscode.TreeItem): void {
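
Reviewer note: the reason for escaping the base name before embedding it in a regular expression can be shown with a hand-written escape function. This is a sketch only: the commit calls `ItemRegex.escape`, whose implementation is not shown here, and the README credits the escape-string-regexp package for this job. `escapeRegExp` and `nameOnlyPattern` below are hypothetical names, and the separator is passed in explicitly to keep the example free of Node imports.

```typescript
// Hand-written escape for RegExp metacharacters; for illustration only.
function escapeRegExp(value: string): string {
    return value.replace(/[|\\{}()[\]^$+*?.-]/g, '\\$&');
}

// Builds a "name only" pattern: separator, then the literal name, anchored at the end of the path.
function nameOnlyPattern(name: string, sep: string): RegExp {
    return new RegExp(escapeRegExp(sep) + escapeRegExp(name) + '$');
}

const unescaped = new RegExp('/' + 'lib.v2' + '$');
const escaped = nameOnlyPattern('lib.v2', '/');

console.log(unescaped.test('/src/libXv2')); // true: the unescaped "." matches any character
console.log(escaped.test('/src/libXv2'));   // false
console.log(escaped.test('/src/lib.v2'));   // true
```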

src/constants.ts

Lines changed: 7 additions & 1 deletion
@@ -1,6 +1,6 @@
 export class Constants {
     static readonly ExtensionName = 'string-checker-js';
-    static readonly ExtensionVersion = 'v0.0.2';
+    static readonly ExtensionVersion = 'v0.0.3';
     static readonly ExtensionID = 'string-checker-js';
     static readonly ItemStringPrefix = 'string:';
     static readonly ItemRegexPrefix = 'regex:';
@@ -31,3 +31,9 @@ export class Messages {
     static readonly EnterString = 'Type a string and press [Enter]';
     static readonly PressEscapeToExit = 'Press [Escape] or [Enter] to close';
 }
+
+export class ConfidenceLevelStr {
+    static readonly Unknown = '?';
+    static readonly Technical = 'Technical';
+    static readonly Message = 'Message';
+}
