Package edu.upf.taln.dri.module.importer.pdf
Class Summary

Class: Description
ImporterGROBID: From the XML mark-up generated by GROBID, this processing resource identifies all the textual contents.
ImporterPDFEXT: From the XML mark-up generated by PDFEXT, this processing resource identifies all the textual contents.
ImporterPDFX: From the XML mark-up generated by PDFX, this processing resource identifies all the textual contents and splits them into sentences using a customized REGEXP Sentence Splitter of ANNIE.
UtilPDFX: Class including static methods to normalize the XML files generated by PDFX.
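
The general idea behind these importers (collecting textual contents from XML mark-up, then splitting them into sentences) can be illustrated with a minimal sketch that uses only the standard Java XML APIs. This is not the code of the classes in this package: the class name XmlTextSketch, its methods, and the element names in the usage note are hypothetical, and the regular expression is a naive stand-in, not the ANNIE sentence splitter.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of edu.upf.taln.dri.module.importer.pdf.
public class XmlTextSketch {

    /** Collects the trimmed, non-empty text nodes of an XML string in document order. */
    public static List<String> extractText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> out = new ArrayList<>();
        collect(doc.getDocumentElement(), out);
        return out;
    }

    // Depth-first walk: record text nodes, then recurse into children.
    private static void collect(Node node, List<String> out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                out.add(text);
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), out);
        }
    }

    /** Naive splitter (an assumption): breaks after '.', '!' or '?' followed by whitespace. */
    public static String[] splitSentences(String text) {
        return text.split("(?<=[.!?])\\s+");
    }
}
```

For example, calling XmlTextSketch.extractText on a made-up input such as `<doc><title>A title</title><p>First one. Second one.</p></doc>` yields two text fragments, and splitSentences on the second fragment yields two sentences. A real splitter has to handle abbreviations, numbers, and citations, which this regular expression does not.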