Package edu.upf.taln.dri.module.header
Class HeaderAnalyzer
- java.lang.Object
  - gate.util.AbstractFeatureBearer
    - gate.creole.AbstractResource
      - gate.creole.AbstractProcessingResource
        - gate.creole.AbstractLanguageAnalyser
          - edu.upf.taln.dri.module.header.HeaderAnalyzer
- All Implemented Interfaces:
DRIModule, gate.creole.ANNIEConstants, gate.Executable, gate.LanguageAnalyser, gate.ProcessingResource, gate.Resource, gate.util.FeatureBearer, gate.util.NameBearer, Serializable
@CreoleResource(name="DRI Modules - Citation Aware MATE Parser") public class HeaderAnalyzer extends gate.creole.AbstractLanguageAnalyser implements gate.ProcessingResource, DRIModule
Analyze the header of a paper to extract authors, affiliations, and e-mails.
- See Also:
Serialized Form
-
Field Summary
-
Fields inherited from interface gate.creole.ANNIEConstants
ANNOTATION_COREF_FEATURE_NAME, DATE_ANNOTATION_TYPE, DATE_POSTED_ANNOTATION_TYPE, DEFAULT_FILE, DOCUMENT_COREF_FEATURE_NAME, JOB_ID_ANNOTATION_TYPE, LOCATION_ANNOTATION_TYPE, LOOKUP_ANNOTATION_TYPE, LOOKUP_CLASS_FEATURE_NAME, LOOKUP_INSTANCE_FEATURE_NAME, LOOKUP_LANGUAGE_FEATURE_NAME, LOOKUP_MAJOR_TYPE_FEATURE_NAME, LOOKUP_MINOR_TYPE_FEATURE_NAME, LOOKUP_ONTOLOGY_FEATURE_NAME, MONEY_ANNOTATION_TYPE, ORGANIZATION_ANNOTATION_TYPE, PERSON_ANNOTATION_TYPE, PERSON_GENDER_FEATURE_NAME, PLUGIN_DIR, SENTENCE_ANNOTATION_TYPE, SPACE_TOKEN_ANNOTATION_TYPE, TOKEN_ANNOTATION_TYPE, TOKEN_CATEGORY_FEATURE_NAME, TOKEN_KIND_FEATURE_NAME, TOKEN_LENGTH_FEATURE_NAME, TOKEN_ORTH_FEATURE_NAME, TOKEN_STRING_FEATURE_NAME
-
-
Constructor Summary
Constructor Description
HeaderAnalyzer()
-
Method Summary
Modifier and Type, Method, Description
static void addOrganizationNames(gate.Document headerDoc, gate.Document originalDoc, gate.Annotation titleAnnotation)
static void addPersonNames(gate.Document headerDoc, gate.Document originalDoc, gate.Annotation titleAnnotation)
    Add the header annotations to the original document.
void execute()
static void extractOrganizationAddresses(gate.Document headerDoc, gate.Document originalDoc, gate.Annotation titleAnnotation)
String getInputTitleAS()
String getInputTitleAStype()
gate.Document getOriginalDocument()
String getUseBibsonomy()
String getUseGoogleScholar()
static void matchPersonOrganization(gate.Document headerDoc, gate.Document originalDoc, gate.Annotation titleAnnotation)
static void parseHeaderDoc(gate.Document headerDoc, gate.Document originalDoc, gate.Annotation titleAnnotation)
    Parse the header of a document.
boolean resetAnnotations()
    Delete the annotations provided by a module.
void setInputTitleAS(String inputTitleAS)
void setInputTitleAStype(String inputTitleAStype)
void setOriginalDocument(gate.Document originalDocument)
void setUseBibsonomy(String useBibsonomy)
void setUseGoogleScholar(String useGoogleScholar)
-
Methods inherited from class gate.creole.AbstractLanguageAnalyser
getCorpus, getDocument, setCorpus, setDocument
-
Methods inherited from class gate.creole.AbstractProcessingResource
addProgressListener, addStatusListener, cleanup, fireProcessFinished, fireProgressChanged, fireStatusChanged, getRuntimeParameterValues, getRuntimeParameterValues, init, interrupt, isInterrupted, reInit, removeProgressListener, removeStatusListener
-
Methods inherited from class gate.creole.AbstractResource
checkParameterValues, flushBeanInfoCache, forgetBeanInfo, getBeanInfo, getInitParameterValues, getInitParameterValues, getName, getParameterValue, getParameterValue, getParameterValues, removeResourceListeners, setName, setParameterValue, setParameterValue, setParameterValues, setParameterValues, setResourceListeners, toString
-
Method Detail
-
getInputTitleAS
public String getInputTitleAS()
-
setInputTitleAS
@RunTime @CreoleParameter(defaultValue="Analysis", comment="The name of the input annotation set for the title") public void setInputTitleAS(String inputTitleAS)
-
getInputTitleAStype
public String getInputTitleAStype()
-
setInputTitleAStype
@RunTime @CreoleParameter(defaultValue="Title", comment="The name of the input annotation type for the title") public void setInputTitleAStype(String inputTitleAStype)
-
getOriginalDocument
public gate.Document getOriginalDocument()
-
setOriginalDocument
@RunTime @CreoleParameter(comment="The original GATE document to get data from") public void setOriginalDocument(gate.Document originalDocument)
-
getUseGoogleScholar
public String getUseGoogleScholar()
-
setUseGoogleScholar
@RunTime @CreoleParameter(defaultValue="true", comment="Set to true to parse biblio entries by Google Scholar") public void setUseGoogleScholar(String useGoogleScholar)
-
getUseBibsonomy
public String getUseBibsonomy()
-
setUseBibsonomy
@RunTime @CreoleParameter(defaultValue="true", comment="Set to true to parse biblio entries by Bibsonomy") public void setUseBibsonomy(String useBibsonomy)
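The runtime parameters above can be summarized in a small, GATE-independent sketch. It collects the documented defaults under the bean property names implied by the setters (assuming the usual JavaBeans naming, e.g. setInputTitleAS gives inputTitleAS) and shows one way a caller might read the String-valued flags. The class HeaderAnalyzerParams and both of its methods are hypothetical helpers, not part of the DRI library, and how HeaderAnalyzer itself interprets the flag strings is not documented here; the "true" check is an assumption. originalDocument has no documented default, so it is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the DRI library. */
public class HeaderAnalyzerParams {

    /** Documented default values, keyed by the bean property name of each setter. */
    public static Map<String, Object> defaults() {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("inputTitleAS", "Analysis");   // annotation set holding the title
        params.put("inputTitleAStype", "Title");  // annotation type of the title
        params.put("useGoogleScholar", "true");   // String-valued flag
        params.put("useBibsonomy", "true");       // String-valued flag
        return params;
    }

    /** Assumption: a flag counts as enabled only when its value is "true", ignoring case. */
    public static boolean isEnabled(Map<String, Object> params, String name) {
        Object value = params.get(name);
        return value instanceof String && Boolean.parseBoolean(((String) value).trim());
    }

    public static void main(String[] args) {
        Map<String, Object> params = defaults();
        params.put("useBibsonomy", "false"); // switch one lookup off
        System.out.println(isEnabled(params, "useGoogleScholar")); // prints "true"
        System.out.println(isEnabled(params, "useBibsonomy"));     // prints "false"
    }
}
```

With GATE on the classpath, the same name/value pairs could presumably be applied to an instance through the inherited setParameterValue method or through the setters directly; that step is omitted so the sketch stays runnable without GATE.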
-
execute
public void execute() throws gate.creole.ExecutionException
- Specified by:
execute in interface gate.Executable
- Overrides:
execute in class gate.creole.AbstractProcessingResource
- Throws:
gate.creole.ExecutionException
-
parseHeaderDoc
public static void parseHeaderDoc(gate.Document headerDoc, gate.Document originalDoc, gate.Annotation titleAnnotation)
Parse the header of a document.
- Parameters:
headerDoc - The parsed header document
originalDoc - The original document
titleAnnotation - The title annotation of the original document to be enriched with features
-
addPersonNames
public static void addPersonNames(gate.Document headerDoc, gate.Document originalDoc, gate.Annotation titleAnnotation)
Add the header annotations to the original document.
- Parameters:
headerDoc -
originalDoc -
titleAnnotation -
addOrganizationNames
public static void addOrganizationNames(gate.Document headerDoc, gate.Document originalDoc, gate.Annotation titleAnnotation)
-
extractOrganizationAddresses
public static void extractOrganizationAddresses(gate.Document headerDoc, gate.Document originalDoc, gate.Annotation titleAnnotation)
-
matchPersonOrganization
public static void matchPersonOrganization(gate.Document headerDoc, gate.Document originalDoc, gate.Annotation titleAnnotation)
-
resetAnnotations
public boolean resetAnnotations()
Description copied from interface: DRIModule
Delete the annotations provided by a module.
- Specified by:
resetAnnotations in interface DRIModule
- Returns: