This repository was archived by the owner on Aug 14, 2021. It is now read-only.

Commit f95ef87

Merge pull request #39 from andreskrey/logging
Preparing for release
2 parents 8caf03b + 43155bd commit f95ef87

5 files changed, 162 additions and 7 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ All notable changes to this project will be documented in this file.

 - Added 'data-orig' as an URL source for images
 - Removed 'modal' as a negative property from classes
+- Added option to inject a logger
 - Removed all references to the `data-readability` tags that don't apply anymore to the new structure
 - Merged PR #38 (Missing DOMEntityReference)

README.md

Lines changed: 16 additions & 1 deletion
@@ -84,6 +84,21 @@ Then you pass this Configuration object to Readability. The following options ar
 - **OriginalURL**: default value `http://fakehost`, original URL from the article used to fix relative URLs.
 - **SummonCthulhu**: default value `false`, remove all `<script>` nodes via regex. This is not ideal as it might break things, but might be the only solution to [libxml problems with unescaped javascript](https://github.com/andreskrey/readability.php#known-issues). If you're not parsing Javascript tutorials, it's recommended to always set this option as `true`.

+### Debug log
+
+Logging is optional and you will have to inject your own logger to save all the debugging messages. To do so, use a logger that implements the [PSR-3 logging interface](https://github.com/php-fig/log) and pass it to the configuration object. For example:
+
+```
+// Using monolog
+
+$log = new Logger('Readability');
+$log->pushHandler(new StreamHandler('path/to/my/log.txt'));
+
+$configuration->setLogger($log);
+```
+
+In the log you will find information about the parsed nodes, why they were removed, and why they were considered relevant to the final article.
+
 ## Limitations

 Of course the main limitation is PHP. Websites that load the content through lazy loading, AJAX, or any type of javascript fueled call will be ignored (actually, *not ran*) and the resulting text will be incorrect, compared to the readability.js results. All the articles you want to parse with readability.php need to be complete and all the content should be in the HTML already.
@@ -125,7 +140,7 @@ Self closing tags like `<br />` get automatically expanded to `<br></br`. No way

 ## Dependencies

-Readability.php has no dependencies to other libraries.
+Readability.php uses the [PSR Log](https://github.com/php-fig/log) interface to define the allowed type of loggers.

 ## To-do
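
The README example added in the "Debug log" hunk above uses `Logger` and `StreamHandler` without importing them. A fuller sketch of the same wiring could look like the following. It assumes Monolog and readability.php are both installed through Composer; the autoloader path and the bare `new Configuration()` call are assumptions about the surrounding project, not part of this commit.

```php
<?php

// Assumption: dependencies were installed with Composer and the autoloader
// lives next to this script. Adjust the path to your project layout.
require __DIR__ . '/vendor/autoload.php';

use andreskrey\Readability\Configuration;
use Monolog\Handler\StreamHandler;
use Monolog\Logger;

// Channel name and log path are the ones used in the README example.
$log = new Logger('Readability');
$log->pushHandler(new StreamHandler('path/to/my/log.txt'));

$configuration = new Configuration();

// setLogger() is provided by Psr\Log\LoggerAwareTrait, which this commit
// adds to the Configuration class.
$configuration->setLogger($log);
```

Any other PSR-3 implementation can be passed in the same way, since `setLogger()` only requires a `Psr\Log\LoggerInterface`.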

composer.json

Lines changed: 5 additions & 1 deletion
@@ -21,9 +21,13 @@
         "php": ">=5.6.0",
         "ext-dom": "*",
         "ext-xml": "*",
-        "ext-mbstring": "*"
+        "ext-mbstring": "*",
+        "psr/log": "^1.0"
     },
     "require-dev": {
         "phpunit/phpunit": "^5.7"
+    },
+    "suggest": {
+        "monolog/monolog": "Allow logging debug information"
     }
 }

src/Configuration.php

Lines changed: 19 additions & 0 deletions
@@ -2,11 +2,17 @@

 namespace andreskrey\Readability;

+use Psr\Log\LoggerAwareTrait;
+use Psr\Log\LoggerInterface;
+use Psr\Log\NullLogger;
+
 /**
  * Class Configuration.
  */
 class Configuration
 {
+    use LoggerAwareTrait;
+
     /**
      * @var int
      */
@@ -48,6 +54,19 @@ class Configuration
      */
     protected $originalURL = 'http://fakehost';

+    /**
+     * @return LoggerInterface
+     */
+    public function getLogger()
+    {
+        // If no logger has been set, just return a null logger
+        if ($this->logger === null) {
+            return new NullLogger();
+        } else {
+            return $this->logger;
+        }
+    }
+
     /**
      * @return int
      */
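
The `getLogger()` method above falls back to a null logger, so calling code can log unconditionally without checking whether a logger was injected. The sketch below illustrates that behaviour in isolation. Every class and interface in it (`SketchLogger`, `SketchNullLogger`, `SketchMemoryLogger`, `SketchConfiguration`) is a hypothetical stand-in written so the example runs without Composer; the real code relies on `Psr\Log\LoggerInterface`, `Psr\Log\NullLogger` and `Psr\Log\LoggerAwareTrait` instead.

```php
<?php

// Hypothetical one-method stand-in for Psr\Log\LoggerInterface.
interface SketchLogger
{
    public function debug($message);
}

// Stand-in for Psr\Log\NullLogger: accepts messages and discards them.
class SketchNullLogger implements SketchLogger
{
    public function debug($message)
    {
        // Intentionally empty.
    }
}

// Test logger that keeps messages in memory so they can be inspected.
class SketchMemoryLogger implements SketchLogger
{
    public $lines = [];

    public function debug($message)
    {
        $this->lines[] = $message;
    }
}

// Mirrors the shape of Configuration::getLogger() from the diff.
class SketchConfiguration
{
    protected $logger;

    public function setLogger(SketchLogger $logger)
    {
        $this->logger = $logger;
    }

    public function getLogger()
    {
        // If no logger has been set, return a logger that does nothing.
        if ($this->logger === null) {
            return new SketchNullLogger();
        }

        return $this->logger;
    }
}

$config = new SketchConfiguration();

// No logger injected yet: the call is safe and the message is discarded.
$before = $config->getLogger();
$before->debug('discarded');

// After injection, the same call reaches the injected logger.
$memory = new SketchMemoryLogger();
$config->setLogger($memory);
$config->getLogger()->debug('kept');
```

Returning a null object rather than `null` keeps the null check in one place instead of repeating it at every log call.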
