Spanish web content not displayed correctly: '?' is shown instead of the correct character #189
Description
Spanish words with accents are not displayed properly: accented characters are replaced with a "?" character.
Why is this happening? How can I tell the scraper that I'm dealing with Spanish-language content?
code:
$web = new \Spekulatius\PHPScraper\PHPScraper;
$web->go("https://www.marca.com");
return $web->outlineWithParagraphs;
I return the outline to the client in JSON format. The result I'm getting looks like this:
[
{
"tag": "h2",
"content": "Joao F?lix: \"El Bar?a siempre ha sido mi primera opci?n\""
}
]
I have already tried to solve the problem by putting this at the beginning of the script: setlocale(LC_ALL, 'es_AR')
F?lix and opci?n are not displayed properly in the response; they should be Félix and opción. A ? is shown instead of é and ó.
When I return the result of this function, the characters display correctly:
utf8_encode(file_get_contents("https://www.marca.com"))
I have also tried requesting the document with file_get_contents, encoding the result, and then passing it to the $web->setContent function. This way I get the expected output:
$web = new PHPScraper;
$rawPageContent = utf8_encode(file_get_contents("https://www.marca.com"));
$web->setContent("https://www.marca.com",$rawPageContent);
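Note that utf8_encode only converts from ISO-8859-1 and is deprecated as of PHP 8.2, so a variant of the workaround above could use mb_convert_encoding instead. The following is a minimal sketch, assuming the mbstring extension is available and that the page is served as ISO-8859-1 (an assumption; the actual charset of the site is not confirmed here). The helper name toUtf8 is hypothetical and not part of PHPScraper.

```php
<?php
// Hypothetical helper (not part of PHPScraper): normalise raw bytes to UTF-8.
// $assumedCharset is an assumption about the source page; adjust as needed.
function toUtf8(string $raw, string $assumedCharset = 'ISO-8859-1'): string
{
    // If the input is already valid UTF-8, return it unchanged
    // so that it is not encoded a second time.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }

    // Otherwise convert from the assumed source charset to UTF-8.
    return mb_convert_encoding($raw, 'UTF-8', $assumedCharset);
}

// "Félix" as ISO-8859-1 bytes: é is the single byte 0xE9.
$latin1 = "F\xE9lix";
$utf8 = toUtf8($latin1);

// In UTF-8, é is the two-byte sequence 0xC3 0xA9.
echo bin2hex($utf8), PHP_EOL;
```

The converted string could then be passed to $web->setContent in place of the utf8_encode result, as in the snippet above. Checking for valid UTF-8 first avoids double-encoding pages that are already served as UTF-8.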