PDFPaperParser (Dr. Inventor Text Mining Library (UPF) 4.0 API)

java.lang.Object
- edu.upf.taln.dri.common.analyzer.pdf.PDFPaperParser

```
public class PDFPaperParser
extends Object
```
Collection of utilities to mine data from PDF documents

Constructor Summary

Constructors
Constructor Description

PDFPaperParser()

Method Summary

Modifier and Type	Method	Description
`static List<String>`	`getHeaderSentences(InputStream PDFInputStream, boolean onlyHeaderParagraph)`	Extracts a list of paragraphs from the PDF by converting the PDF file to HTML.
`static List<String>`	`getHeaderSentences(URL PDF_URL, boolean onlyHeaderParagraph)`	Extracts a list of paragraphs from the PDF by converting the PDF file to HTML.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - PDFPaperParser
```
public PDFPaperParser()
```
- Method Detail
  - getHeaderSentences
```
public static List<String> getHeaderSentences(InputStream PDFInputStream,
                                              boolean onlyHeaderParagraph)
```
    Extracts a list of paragraphs from the PDF by converting the PDF file to HTML.
    
    Parameters:
    
    PDFInputStream -
    
    onlyHeaderParagraph - set to true to extract only the header paragraph
    
    Returns:
  - getHeaderSentences
```
public static List<String> getHeaderSentences(URL PDF_URL,
                                              boolean onlyHeaderParagraph)
```
    Extracts a list of paragraphs from the PDF by converting the PDF file to HTML.
    
    Parameters:
    
    PDF_URL -
    
    onlyHeaderParagraph - set to true to extract only the header paragraph
    
    Returns:
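
Usage sketch

The following is a minimal sketch of how the `InputStream` overload might be called and its result tidied up; it is not taken from the library itself. `HeaderSentencesSketch`, `extractClean`, and the stub extractor are hypothetical names introduced for illustration. The extractor is passed in as a `BiFunction` so that the sketch compiles and runs without the library on the classpath; with the library available, the method reference `PDFPaperParser::getHeaderSentences` should fit the same parameter, judging by the signature documented above. This page does not state whether the method can return `null` or empty strings, so the null check and the blank filtering are defensive assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

public class HeaderSentencesSketch {

    // Hypothetical helper (not part of the library): runs an extractor on a
    // PDF stream and returns its results trimmed, with blank entries dropped.
    public static List<String> extractClean(
            InputStream pdfStream,
            boolean onlyHeaderParagraph,
            BiFunction<InputStream, Boolean, List<String>> extractor) {
        List<String> raw = extractor.apply(pdfStream, onlyHeaderParagraph);
        List<String> clean = new ArrayList<>();
        if (raw == null) {
            // Assumption: treat a null result as "nothing extracted".
            return clean;
        }
        for (String s : raw) {
            if (s != null && !s.trim().isEmpty()) {
                clean.add(s.trim());
            }
        }
        return clean;
    }

    public static void main(String[] args) {
        // Stub standing in for PDFPaperParser::getHeaderSentences so that the
        // sketch runs on its own; the strings below are made-up sample data.
        BiFunction<InputStream, Boolean, List<String>> stub =
                (in, onlyHeader) -> onlyHeader
                        ? Arrays.asList("  A Sample Title  ", "")
                        : Arrays.asList("  A Sample Title  ", "", "First body paragraph.");

        // An empty stream is enough here because the stub never reads it.
        InputStream fakePdf = new ByteArrayInputStream(new byte[0]);
        System.out.println(extractClean(fakePdf, true, stub));
    }
}
```

With the library on the classpath, the stub would be replaced by `PDFPaperParser::getHeaderSentences` and the empty stream by a stream opened on a real PDF file, for example inside a try-with-resources block so that the stream is closed after extraction.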