Class ImporterPDFX
- java.lang.Object
  - gate.util.AbstractFeatureBearer
    - gate.creole.AbstractResource
      - gate.creole.AbstractProcessingResource
        - gate.creole.AbstractLanguageAnalyser
          - edu.upf.taln.dri.module.importer.ImporterBase
            - edu.upf.taln.dri.module.importer.pdf.ImporterPDFX
- All Implemented Interfaces:
DRIModule, gate.creole.ANNIEConstants, gate.Executable, gate.LanguageAnalyser, gate.ProcessingResource, gate.Resource, gate.util.FeatureBearer, gate.util.NameBearer, Serializable
@CreoleResource(name="DRI Modules - PDFX importer")
public class ImporterPDFX extends ImporterBase
From the XML mark-up generated by PDFX, this processing resource identifies all the textual contents and splits them into sentences using a customized RegEx Sentence Splitter from ANNIE.
- See Also:
- Serialized Form
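The class description mentions regular-expression sentence splitting. As a rough illustration of that general idea only, the following plain-Java sketch splits text at sentence-final punctuation followed by whitespace and an uppercase letter. The class name `RegexSentenceSplitSketch` and the boundary rule are assumptions made for this example; they are not ANNIE's actual splitter rules and not the customization used by ImporterPDFX.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: a naive regex sentence splitter, unrelated to ANNIE's real rule files.
final class RegexSentenceSplitSketch {

    // Assumed boundary rule: '.', '!' or '?' followed by whitespace and an uppercase letter.
    // The lookbehind/lookahead keep the punctuation and the next letter in the sentences.
    private static final Pattern BOUNDARY = Pattern.compile("(?<=[.!?])\\s+(?=\\p{Lu})");

    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        for (String part : BOUNDARY.split(text.trim())) {
            if (!part.isEmpty()) {
                sentences.add(part);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "PDFX produces XML. The importer reads it. Sentences follow.";
        for (String sentence : split(text)) {
            System.out.println(sentence);
        }
    }
}
```

Because the rule requires an uppercase letter after the whitespace, an abbreviation such as "e.g." followed by a lowercase word does not trigger a split. A production splitter needs many more exceptions than this sketch handles.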
-
-
Field Summary
Fields
Modifier and Type, Field
static String PDFXAnnSet
static String PDFXbibEntry
static String PDFXbibEntry_IdFeat
static String PDFXcitMarker
static String PDFXcitMarker_IdFeat
static String PDFXcitMarker_refTypeFeat
-
Fields inherited from class edu.upf.taln.dri.module.importer.ImporterBase
abstractAnnType, babelnet_AnnSet, babelnet_DisItem, babelnet_DisItem_babelnetURLfeat, babelnet_DisItem_coherenceScoreFeat, babelnet_DisItem_dbpediaURLfeat, babelnet_DisItem_golbalScoreFeat, babelnet_DisItem_numTokensFeat, babelnet_DisItem_scoreFeat, babelnet_DisItem_sourceFeat, babelnet_DisItem_synsetIDfeat, bibEntry_IdAnnFeat, bibEntryAnnType, captionAnnType, causality_AnnSet, coref_Candidate, coref_ChainAnnSet, coref_Coreference, coref_SpotAnnSet, driAnnSet, emailAnnType, figureAnnType, h1AnnType, h2AnnType, h3AnnType, h4AnnType, h5AnnType, headerAffilAnnType, headerAnnType, headerAuthorAnnType, headerDOC_Affiliation, headerDOC_AnnSet, headerDOC_Author, headerDOC_JAPEemail, headerDOC_Lookup, headerDOC_OrigDocFeat, headerDOC_Sentence, headerDOC_Token, inlineCitationAnnType, inlineCitationMarkerAnnType, langAnnFeat, metaAnnotator_AnnSet, metaAnnotator_FundingAgencyAnnType, metaAnnotator_LookupAnnType, metaAnnotator_OntologyAnnType, metaAnnotator_ProjectAnnType, outlayerAnnType, sentence_isAcknowledgement, sentence_lexRankScore, sentence_POSpatternFeat, sentence_RhetoricalAnnFeat, sentence_titleSimScore, sentenceAnnType, tableAnnType, term_AnnSet, term_CandOcc, term_CandOcc_actualPOSFeat, term_CandOcc_regexPOSFeat, titleAnnType, token_LemmaFeat, token_POSfeat, tokenAnnType
-
Fields inherited from interface gate.creole.ANNIEConstants
ANNOTATION_COREF_FEATURE_NAME, DATE_ANNOTATION_TYPE, DATE_POSTED_ANNOTATION_TYPE, DEFAULT_FILE, DOCUMENT_COREF_FEATURE_NAME, JOB_ID_ANNOTATION_TYPE, LOCATION_ANNOTATION_TYPE, LOOKUP_ANNOTATION_TYPE, LOOKUP_CLASS_FEATURE_NAME, LOOKUP_INSTANCE_FEATURE_NAME, LOOKUP_LANGUAGE_FEATURE_NAME, LOOKUP_MAJOR_TYPE_FEATURE_NAME, LOOKUP_MINOR_TYPE_FEATURE_NAME, LOOKUP_ONTOLOGY_FEATURE_NAME, MONEY_ANNOTATION_TYPE, ORGANIZATION_ANNOTATION_TYPE, PERSON_ANNOTATION_TYPE, PERSON_GENDER_FEATURE_NAME, PLUGIN_DIR, SENTENCE_ANNOTATION_TYPE, SPACE_TOKEN_ANNOTATION_TYPE, TOKEN_ANNOTATION_TYPE, TOKEN_CATEGORY_FEATURE_NAME, TOKEN_KIND_FEATURE_NAME, TOKEN_LENGTH_FEATURE_NAME, TOKEN_ORTH_FEATURE_NAME, TOKEN_STRING_FEATURE_NAME
-
-
Constructor Summary
Constructors
ImporterPDFX()
-
Method Summary
Modifier and Type, Method, Description
void execute()
String getInputSentenceASname()
String getInputSentenceAStype()
boolean resetAnnotations()
Delete the annotations provided by a module
void setInputSentenceASname(String inputSentenceASname)
void setInputSentenceAStype(String inputSentenceAStype)
-
Methods inherited from class edu.upf.taln.dri.module.importer.ImporterBase
getInputASname, getOutputASname, setInputASname, setOutputASname
-
Methods inherited from class gate.creole.AbstractLanguageAnalyser
getCorpus, getDocument, setCorpus, setDocument
-
Methods inherited from class gate.creole.AbstractProcessingResource
addProgressListener, addStatusListener, cleanup, fireProcessFinished, fireProgressChanged, fireStatusChanged, getRuntimeParameterValues, getRuntimeParameterValues, init, interrupt, isInterrupted, reInit, removeProgressListener, removeStatusListener
-
Methods inherited from class gate.creole.AbstractResource
checkParameterValues, flushBeanInfoCache, forgetBeanInfo, getBeanInfo, getInitParameterValues, getInitParameterValues, getName, getParameterValue, getParameterValue, getParameterValues, removeResourceListeners, setName, setParameterValue, setParameterValue, setParameterValues, setParameterValues, setResourceListeners, toString
-
Field Detail
-
PDFXAnnSet
public static final String PDFXAnnSet
- See Also:
- Constant Field Values
-
PDFXbibEntry
public static final String PDFXbibEntry
- See Also:
- Constant Field Values
-
PDFXbibEntry_IdFeat
public static final String PDFXbibEntry_IdFeat
- See Also:
- Constant Field Values
-
PDFXcitMarker
public static final String PDFXcitMarker
- See Also:
- Constant Field Values
-
PDFXcitMarker_refTypeFeat
public static final String PDFXcitMarker_refTypeFeat
- See Also:
- Constant Field Values
-
PDFXcitMarker_IdFeat
public static final String PDFXcitMarker_IdFeat
- See Also:
- Constant Field Values
-
-
Method Detail
-
getInputSentenceASname
public String getInputSentenceASname()
-
setInputSentenceASname
@RunTime
@CreoleParameter(defaultValue="Analysis", comment="The name of the input annotation set to read sentence annotations from (sentence annotations previously added by sentence splitter execution)")
public void setInputSentenceASname(String inputSentenceASname)
-
getInputSentenceAStype
public String getInputSentenceAStype()
-
setInputSentenceAStype
@RunTime
@CreoleParameter(defaultValue="Sentence", comment="The name of the input annotation type to read sentence annotations (sentence annotations previously added by sentence splitter execution)")
public void setInputSentenceAStype(String inputSentenceAStype)
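The two runtime parameters above name an annotation set (default "Analysis") and an annotation type (default "Sentence"). The sketch below is a plain-Java stand-in, with no GATE dependency, for how such a pair of settings could jointly select sentence annotations. `SentenceInputSketch`, its nested `Ann` class, and `selectSentences` are hypothetical names introduced for illustration; only the two default values come from this page, and how ImporterPDFX uses the parameters internally is not documented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in; the real class operates on GATE annotation sets.
final class SentenceInputSketch {

    /** Hypothetical annotation record: set name, type, and character offsets. */
    static final class Ann {
        final String setName;
        final String type;
        final int start;
        final int end;

        Ann(String setName, String type, int start, int end) {
            this.setName = setName;
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    // Defaults copied from the @CreoleParameter declarations on this page.
    private String inputSentenceASname = "Analysis";
    private String inputSentenceAStype = "Sentence";

    String getInputSentenceASname() { return inputSentenceASname; }
    void setInputSentenceASname(String name) { this.inputSentenceASname = name; }
    String getInputSentenceAStype() { return inputSentenceAStype; }
    void setInputSentenceAStype(String type) { this.inputSentenceAStype = type; }

    /** Assumed semantics: keep annotations whose set name and type both match. */
    List<Ann> selectSentences(List<Ann> all) {
        List<Ann> selected = new ArrayList<>();
        for (Ann a : all) {
            if (a.setName.equals(inputSentenceASname) && a.type.equals(inputSentenceAStype)) {
                selected.add(a);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Ann> anns = Arrays.asList(
            new Ann("Analysis", "Sentence", 0, 18),
            new Ann("Analysis", "Token", 0, 4),
            new Ann("Other", "Sentence", 0, 18));
        SentenceInputSketch cfg = new SentenceInputSketch();
        System.out.println(cfg.selectSentences(anns).size());
    }
}
```

In an actual GATE pipeline these values would be supplied as runtime parameters of the processing resource rather than through a stand-in class like this one.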
-
execute
public void execute() throws gate.creole.ExecutionException
- Specified by:
execute in interface gate.Executable
- Overrides:
execute in class gate.creole.AbstractProcessingResource
- Throws:
gate.creole.ExecutionException
-
resetAnnotations
public boolean resetAnnotations()
Description copied from interface: DRIModule
Delete the annotations provided by a module
- Returns:
-
-