pdfbox/src/main/java/org/apache/pdfbox/text
1 file changed: +8 -8 lines changed
@@ -167,14 +167,14 @@ public class PDFTextStripper extends LegacyPDFStreamEngine
      * The charactersByArticle is used to extract text by article divisions. For example a PDF that has two columns like
      * a newspaper, we want to extract the first column and then the second column. In this example the PDF would have 2
      * beads(or articles), one for each column. The size of the charactersByArticle would be 5, because not all text on
-     * the screen will fall into one of the articles. The five divisions are shown below
-     *
-     * Text before first article
-     * first article text
-     * text between first article and second article
-     * second article text
-     * text after second article
-     *
+     * the screen will fall into one of the articles. The five divisions are shown below:
+     * <ol>
+     * <li> Text before first article</li>
+     * <li> first article text</li>
+     * <li> text between first article and second article</li>
+     * <li> second article text</li>
+     * <li> text after second article</li>
+     * </ol>
      * Most PDFs won't have any beads, so charactersByArticle will contain a single entry.
      */
     protected ArrayList<List<TextPosition>> charactersByArticle = new ArrayList<>();
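The division count described in the comment (5 entries for 2 beads, a single entry for none) can be illustrated with a minimal sketch. It assumes the pattern generalizes to one leading division plus two per bead, which matches both cases in the comment but is not stated there. The class `ArticleDivisions` and its methods are hypothetical names for illustration, not part of PDFBox, and `String` stands in for `TextPosition`.

```java
import java.util.ArrayList;
import java.util.List;

public class ArticleDivisions {

    // Hypothetical helper (assumption): one division for text before the first
    // bead, then for each bead one division for its own text and one for the
    // text that follows it.
    static int divisionCount(int beadCount) {
        return 1 + 2 * beadCount;
    }

    // Builds the empty list-of-lists layout; String stands in for TextPosition.
    static List<List<String>> emptyDivisions(int beadCount) {
        int n = divisionCount(beadCount);
        List<List<String>> divisions = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            divisions.add(new ArrayList<>());
        }
        return divisions;
    }

    // With the ordering listed in the comment, the text of bead i (0-based)
    // sits at an odd index; even indices hold text outside any bead.
    static int beadTextIndex(int beadIndex) {
        return 2 * beadIndex + 1;
    }

    public static void main(String[] args) {
        System.out.println(emptyDivisions(2).size()); // prints 5 (two-column example)
        System.out.println(emptyDivisions(0).size()); // prints 1 (no beads)
    }
}
```

Under this layout the first column's text would land at index 1 and the second column's at index 3, with indices 0, 2, and 4 collecting whatever falls outside both articles.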