Class MateParser

  • All Implemented Interfaces:
    DRIModule, gate.creole.ANNIEConstants, gate.Executable, gate.LanguageAnalyser, gate.ProcessingResource, gate.Resource, gate.util.FeatureBearer, gate.util.NameBearer, Serializable

    @CreoleResource(name="DRI Modules - Citation Aware MATE Parser")
    public class MateParser
    extends gate.creole.AbstractLanguageAnalyser
    implements gate.ProcessingResource, Serializable, DRIModule
    Citation-aware Mate-tools parser (POS tagger, lemmatizer, dependency parser and semantic role labeller). Reference: https://code.google.com/p/mate-tools/
    See Also:
    Serialized Form
    • Constructor Detail

      • MateParser

        public MateParser()
    • Method Detail

      • getSentenceAnnotationSetToAnalyze

        public String getSentenceAnnotationSetToAnalyze()
      • setSentenceAnnotationSetToAnalyze

        @RunTime
        @CreoleParameter(defaultValue="Analysis",
                         comment="Set name of the annotation set where the sentences to parse are annotated")
        public void setSentenceAnnotationSetToAnalyze​(String sentenceAnnotationSetToAnalyze)
      • getSentenceAnnotationTypeToAnalyze

        public String getSentenceAnnotationTypeToAnalyze()
      • setSentenceAnnotationTypeToAnalyze

        @RunTime
        @CreoleParameter(defaultValue="Sentence",
                         comment="The type of sentence annotations")
        public void setSentenceAnnotationTypeToAnalyze​(String sentenceAnnotationTypeToAnalyze)
      • getTokenAnnotationSetToAnalyze

        public String getTokenAnnotationSetToAnalyze()
      • setTokenAnnotationSetToAnalyze

        @RunTime
        @CreoleParameter(defaultValue="Analysis",
                         comment="Set name of the annotation set where the tokens of the sentences to parse are annotated")
        public void setTokenAnnotationSetToAnalyze​(String tokenAnnotationSetToAnalyze)
      • getTokenAnnotationTypeToAnalyze

        public String getTokenAnnotationTypeToAnalyze()
      • setTokenAnnotationTypeToAnalyze

        @RunTime
        @CreoleParameter(defaultValue="Token",
                         comment="The type of token annotations")
        public void setTokenAnnotationTypeToAnalyze​(String tokenAnnotationTypeToAnalyze)
      • getExcludeThreshold

        public Integer getExcludeThreshold()
      • setExcludeThreshold

        @RunTime
        @CreoleParameter(defaultValue="0",
                         comment="The value of exclude threshold of the parser.")
        public void setExcludeThreshold​(Integer excludeThreshold)
      • getCitancesEnabled

        public Boolean getCitancesEnabled()
      • setCitancesEnabled

        @CreoleParameter(defaultValue="false",
                         comment="Make the parser aware of the presence of cite span annotations in order to handle them properly while parsing sentences. If set to false, the parameters citeSpanAnnotationSetToExclude and citeSpanAnnotationTypeToExclude are ignored.")
        public void setCitancesEnabled​(Boolean citancesEnabled)
      • getCiteSpanAnnotationSetToExclude

        public String getCiteSpanAnnotationSetToExclude()
      • setCiteSpanAnnotationSetToExclude

        @RunTime
        @CreoleParameter(defaultValue="Analysis",
                         comment="The name of the annotation set that includes cite span annotations. Valid only if citancesEnabled is true.")
        public void setCiteSpanAnnotationSetToExclude​(String citeSpanAnnotationSetToExclude)
      • getCiteSpanAnnotationTypeToExclude

        public String getCiteSpanAnnotationTypeToExclude()
      • setCiteSpanAnnotationTypeToExclude

        @RunTime
        @CreoleParameter(defaultValue="CitSpan",
                         comment="The name of the annotation type of cite span annotations. Valid only if citancesEnabled is true.")
        public void setCiteSpanAnnotationTypeToExclude​(String citeSpanAnnotationTypeToExclude)
      • getLemmaModelPath

        public String getLemmaModelPath()
      • setSentenceIdsToAnalyze

        @RunTime
        @CreoleParameter(defaultValue="",
                         comment="The ids of all the sentence type annotations to parse. If empty or null all annotations of sentence type will be parsed.")
        public void setSentenceIdsToAnalyze​(Set<String> sentenceIdsToAnalyze)
      • getSentenceIdsToAnalyze

        public Set<String> getSentenceIdsToAnalyze()
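
        The "empty or null means all" rule stated for sentenceIdsToAnalyze can be illustrated with a minimal sketch. It uses plain strings instead of GATE annotations; the class SentenceIdFilterSketch and its select method are hypothetical names introduced for illustration and are not part of MateParser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class SentenceIdFilterSketch {

    /**
     * Returns the ids to parse: every id when the filter is null or empty,
     * otherwise only the ids contained in the filter (document order is kept).
     */
    static List<String> select(List<String> allSentenceIds, Set<String> idsToAnalyze) {
        if (idsToAnalyze == null || idsToAnalyze.isEmpty()) {
            // No filter supplied: all sentence annotations are parsed.
            return new ArrayList<>(allSentenceIds);
        }
        List<String> selected = new ArrayList<>();
        for (String id : allSentenceIds) {
            if (idsToAnalyze.contains(id)) {
                selected.add(id);
            }
        }
        return selected;
    }
}
```

        How the real module maps the String ids to annotation ids is not documented here, so the sketch treats them as opaque strings.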
      • setLemmaModelPath

        @RunTime
        @CreoleParameter(comment="Full path to the lemmatizer model.")
        public void setLemmaModelPath​(String lemmaModelPath)
      • getPostaggerModelPath

        public String getPostaggerModelPath()
      • setPostaggerModelPath

        @RunTime
        @CreoleParameter(comment="Full path to the POS tagger model.")
        public void setPostaggerModelPath​(String postaggerModelPath)
      • getParserModelPath

        public String getParserModelPath()
      • setParserModelPath

        @RunTime
        @CreoleParameter(comment="Full path to the dep parser model.")
        public void setParserModelPath​(String parserModelPath)
      • getSrlModelPath

        public String getSrlModelPath()
      • setSrlModelPath

        @RunTime
        @CreoleParameter(comment="Full path to the semantic role labeller model.")
        public void setSrlModelPath​(String srlModelPath)
      • init

        public gate.Resource init()
        Specified by:
        init in interface gate.Resource
        Overrides:
        init in class gate.creole.AbstractProcessingResource
      • execute

        public void execute()
        Specified by:
        execute in interface gate.Executable
        Overrides:
        execute in class gate.creole.AbstractProcessingResource
      • annotateSentences

        public int annotateSentences​(List<gate.Annotation> sentencesSorted,
                                     gate.Document doc,
                                     int t,
                                     gate.AnnotationSet citeAnnotationSet)
        Annotates a list of sentences by means of the parser.
        Parameters:
        sentencesSorted - list of sentences to annotate
        doc - document the sentences belong to
        t - threshold for the parser
        Returns:
      • sortSetenceList

        public List<gate.Annotation> sortSetenceList​(gate.AnnotationSet sentences)
        Given an AnnotationSet instance, returns a sorted list of its elements. Sorting is done by position (offset) in the document.
        Parameters:
        sentences - AnnotationSet whose annotations are to be sorted
        Returns:
        Sorted list of Annotation instances.
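
        Sorting by document offset can be sketched without GATE as follows. The Span class is a hypothetical stand-in for gate.Annotation that carries only start and end offsets, and breaking ties on the end offset is an assumption of this sketch, not documented behaviour of sortSetenceList.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

final class OffsetSortSketch {

    /** Hypothetical stand-in for gate.Annotation: only the offsets matter here. */
    static final class Span {
        final long start;
        final long end;

        Span(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Returns a new list ordered by start offset, then by end offset for ties (assumption). */
    static List<Span> sortByOffset(Collection<Span> spans) {
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort(Comparator.comparingLong((Span s) -> s.start)
                              .thenComparingLong(s -> s.end));
        return sorted;
    }
}
```

        The input collection is copied before sorting, so the caller's collection is left unchanged.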
      • languageAwareAnnotationParsing

        public static void languageAwareAnnotationParsing​(boolean isLangAware,
                                                          Map<LangENUM,​MateParser> parsersLangMap,
                                                          gate.Document doc,
                                                          List<gate.Annotation> sentenceAnnList,
                                                          String sentAnnSet,
                                                          String sentAnnType,
                                                          String tokenAnnSet,
                                                          String tokenAnnType)
        Given a map of parser instances for different languages, determines the majority language of the list of sentence annotations and parses all of them with the parser for that language. If the list of sentence annotations is null or empty, all annotations of the specified sentence type are parsed.
        Parameters:
        isLangAware - if false, the English parser is always used
        parsersLangMap -
        doc -
        sentenceAnnList -
        sentAnnSet -
        sentAnnType -
        tokenAnnSet -
        tokenAnnType -
        Throws:
        ResourceAccessException
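
        The majority-language selection described above can be sketched as a simple vote count. This is a minimal illustration under stated assumptions: the Lang enum is a hypothetical stand-in for LangENUM (whose constants are not listed here), the per-sentence languages are passed in directly, and both the English fallback for empty input and the tie-breaking rule are choices made for this sketch.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

final class MajorityLanguageSketch {

    /** Hypothetical stand-in for LangENUM; the real constants may differ. */
    enum Lang { EN, ES }

    /** Picks the most frequent language among the sentences. */
    static Lang pickLanguage(boolean isLangAware, List<Lang> sentenceLangs) {
        // Documented behaviour: without language awareness the English parser is used.
        // Falling back to English for null or empty input is an assumption of this sketch.
        if (!isLangAware || sentenceLangs == null || sentenceLangs.isEmpty()) {
            return Lang.EN;
        }
        Map<Lang, Integer> counts = new EnumMap<>(Lang.class);
        for (Lang lang : sentenceLangs) {
            counts.merge(lang, 1, Integer::sum);
        }
        Lang best = Lang.EN;
        int bestCount = 0;
        // EnumMap iterates in declaration order, so ties go to the earlier constant (assumption).
        for (Map.Entry<Lang, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    /** Looks up the parser registered for the selected language; null if none is registered. */
    static <P> P selectParser(boolean isLangAware, Map<Lang, P> parsers, List<Lang> sentenceLangs) {
        return parsers.get(pickLanguage(isLangAware, sentenceLangs));
    }
}
```

        The generic parameter P stands in for MateParser, so the lookup mirrors a Map<LangENUM, MateParser> without depending on the GATE classes.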
      • languageAwareAnnotationParsing

        public static void languageAwareAnnotationParsing​(boolean isLangAware,
                                                          Map<LangENUM,​MateParser> parsersLangMap,
                                                          gate.Document doc,
                                                          gate.Annotation sentenceAnn,
                                                          String sentAnnSet,
                                                          String sentAnnType,
                                                          String tokenAnnSet,
                                                          String tokenAnnType)
      • main

        public static void main​(String[] args)
      • resetAnnotations

        public boolean resetAnnotations()
        Description copied from interface: DRIModule
        Delete the annotations provided by a module
        Specified by:
        resetAnnotations in interface DRIModule
        Returns: