Class PDFPaperParser
java.lang.Object
    edu.upf.taln.dri.common.analyzer.pdf.PDFPaperParser
public class PDFPaperParser extends Object
Collection of utilities to mine data from PDF documents
Constructor Summary
Constructor | Description
PDFPaperParser() |
Method Summary
Modifier and Type | Method | Description
static List<String> | getHeaderSentences(InputStream PDFInputStream, boolean onlyHeaderParagraph) | Extracts a list of paragraphs from the PDF by converting the PDF file to HTML.
static List<String> | getHeaderSentences(URL PDF_URL, boolean onlyHeaderParagraph) | Extracts a list of paragraphs from the PDF by converting the PDF file to HTML.
Method Detail
getHeaderSentences
public static List<String> getHeaderSentences(InputStream PDFInputStream, boolean onlyHeaderParagraph)
Extracts a list of paragraphs from the PDF by converting the PDF file to HTML.
Parameters:
PDFInputStream -
onlyHeaderParagraph -
Returns:
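The call pattern for the InputStream overload might look like the sketch below. It is a minimal illustration, not code from the library: HeaderSentencesExample, HeaderExtractor and headerParagraphs are hypothetical names, and a fake extractor stands in for the parser so the example runs without the library on the classpath. In real use the method reference PDFPaperParser::getHeaderSentences would presumably fit the same shape, assuming the method declares no checked exceptions (the signature above lists none). Passing true for onlyHeaderParagraph is a guess based on the parameter name, since its effect is not documented here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class HeaderSentencesExample {

    /** Hypothetical stand-in with the same shape as getHeaderSentences(InputStream, boolean). */
    @FunctionalInterface
    interface HeaderExtractor {
        List<String> extract(InputStream pdfInputStream, boolean onlyHeaderParagraph);
    }

    /** Runs the extractor on the stream, closes the stream afterwards, and never returns null. */
    static List<String> headerParagraphs(HeaderExtractor extractor, InputStream pdf) {
        try (InputStream in = pdf) {
            // Assumption: true restricts the result to the header paragraph.
            List<String> paragraphs = extractor.extract(in, true);
            // The documentation does not say whether null can be returned, so guard against it.
            return paragraphs == null ? Collections.<String>emptyList() : paragraphs;
        } catch (IOException e) {
            // Only close() can throw here; rethrow unchecked to keep call sites simple.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Fake extractor so the sketch runs on its own; with the library available,
        // PDFPaperParser::getHeaderSentences would be passed instead (assumption).
        HeaderExtractor fake = (in, onlyHeader) -> Arrays.asList("A Title", "An abstract paragraph");
        InputStream pdf = new ByteArrayInputStream("%PDF".getBytes(StandardCharsets.US_ASCII));
        for (String paragraph : headerParagraphs(fake, pdf)) {
            System.out.println(paragraph);
        }
    }
}
```

Wrapping the call in try-with-resources keeps stream cleanup in one place, which matters because the documentation does not state whether getHeaderSentences closes the stream it is given.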