Description
- remark-embedder-core version: 3.0.3
- node version: v20.10.0
- npm version: 10.2.3
Relevant code or config
const getUrlString = (url: string): string | null => {
  const urlString = url.startsWith('http') ? url : `https://${url}`
  try {
    return new URL(urlString).toString()
  } catch (error: unknown) {
    return null
  }
}
const urlString = getUrlString(value)
What you did:
I ran a simple markdown document like the one below:
#### Output
- Fruit
- Apple
- Orange
- Banana
- Dairy
- Milk
- Cheese
I created my own version of the oEmbed transformer with a fallback: if the extracted link is not an oEmbed link, I wrap it in my own custom bookmark.
What happened:
Unfortunately, @remark-embedder/core treats even plain strings as URLs:
~ remarkEmbedder, node: {
  type: 'text',
  value: 'Banana',
  position: {
    start: { line: 171, column: 9, offset: 3707 },
    end: { line: 171, column: 15, offset: 3713 }
  }
}
~ remarkEmbedder, isValidLink: false
~ remarkEmbedder, value: Banana
~ remarkEmbedder, urlString: Banana
~ shouldTransform: ~ url: https://banana/
Reproduction repository:
Problem description:
The getUrlString function in @remark-embedder/core returns every single-line string as a URL (with https:// prepended to it). It seems to rely on the shouldTransform function to filter such links out later, but in some cases that is too late. I can't check in shouldTransform whether the link is a valid URL, because everything coming out of getUrlString already is one.
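To make the behavior concrete, here is a small sketch that copies the helper quoted above and calls it with plain list-item text. It only relies on the standard WHATWG URL class; the sample inputs are taken from the markdown in this issue.

```typescript
// Copy of the helper quoted in this issue.
const getUrlString = (url: string): string | null => {
  const urlString = url.startsWith('http') ? url : `https://${url}`
  try {
    // new URL('https://Banana') parses fine: "banana" is a legal hostname.
    return new URL(urlString).toString()
  } catch (error: unknown) {
    return null
  }
}

console.log(getUrlString('Banana')) // https://banana/
console.log(getUrlString('Milk')) // https://milk/
```

Because any single word is a syntactically valid hostname, the try/catch never rejects it, which is why a later validity check cannot tell these results apart from real links.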
Suggested solution:
Enhance the getUrlString function to check whether the given text is actually a link, using some robust regex, and only return a true, viable URL.
Something like this works:
const getUrlString = url => {
  const urlRegex = new RegExp(
    /((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=+$,\w]+@)?[A-Za-z0-9.-]+(:[0-9]+)?|(?:www.|[-;:&=+$,\w]+@)[A-Za-z0-9.-]+)((?:\/[+~%/.\w-_]*)?\??(?:[-+=&;%@.\w_]*)#?(?:[\w]*))?)/
  );
  if (!urlRegex.test(url)) {
    console.log('Not a valid URL!:', url);
    return null;
  }
  const urlString = url.startsWith('http') ? url : `https://${url}`;
  try {
    return new URL(urlString).toString();
  } catch (error) {
    return null;
  }
};
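As a possible alternative to the regex, the check could lean on the URL parser itself. The sketch below is only an illustration: getUrlStringStrict is a hypothetical name, not part of @remark-embedder/core, and it assumes that text without an explicit http(s) scheme should count as a link only when its hostname contains a dot.

```typescript
// Hypothetical stricter variant; not the library's implementation.
const getUrlStringStrict = (value: string): string | null => {
  const trimmed = value.trim()
  // Empty text or text containing whitespace is treated as prose, not a link.
  if (trimmed === '' || /\s/.test(trimmed)) return null

  const hasScheme = /^https?:\/\//i.test(trimmed)
  const candidate = hasScheme ? trimmed : `https://${trimmed}`
  try {
    const parsed = new URL(candidate)
    // Assumption: without a scheme, require a dotted hostname so that
    // bare words such as "Banana" are rejected.
    if (!hasScheme && !parsed.hostname.includes('.')) return null
    return parsed.toString()
  } catch {
    return null
  }
}

console.log(getUrlStringStrict('Banana')) // null
console.log(getUrlStringStrict('example.com/watch?v=1')) // https://example.com/watch?v=1
```

The trade-off is that scheme-less single-label hosts such as localhost are rejected unless written with an explicit scheme, which may or may not suit every transformer.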