All Classes (Dr. Inventor Text Mining Library (UPF) 4.0 API)

Class	Description
AbstractDictionary
ActionLexicon
AnimacyENUM
AnnotationPositionInDocumentC	Get a Double from 0 to 100 that reflects the position of the annotation (central offset ID) with respect to the whole document.
AnnotationPositionInSectionC	Get a Double from 0 to 100 that reflects the position of the annotation (central offset ID) with respect to the section the sentence belongs to.
AnnotationTextC	Get the text of the annotation.
AnnotatorAgreementC	Get the agreement type of the specific kind of annotation.
AnnotatorAgreementC.AnnotationTypeENUM
AppositionSieve	Co-reference by apposition spotting.
Author	Author of a paper
AuthorImpl	IMPORTANT: Never instantiate directly this class! Author of a paper.
BabelfyUtil	Collection of utility methods to access Babelfy Disambiguation service.
BabelnetAnnotator	This module enriches the textual contents of a paper by applying WSD through BabelNet. REFERENCE: http://babelnet.org/
BabelSynsetOcc	Interface to access a Babelnet synset occurrence mined from the document.
BabelSynsetOccImpl	IMPORTANT: Never instantiate directly this class! A Babelnet synset occurrence mined from the document.
BaseDocumentElem	Base class for all document element objects, including a reference to the DocCacheManager object they are contained in
BiblioEntryParser	Parse the contents of bibliographic entries by querying Bibsonomy, FreeCite, CrossRef and GoogleScholar
BibsonomyStandaloneConn	Collection of static methods to retrieve bibliographic entries from Bibsonomy. REF: http://www.bibsonomy.org/help/doc/api.html
BibTexWrap	Extension of the class that represents a BibTeX entry
CandidateTermOcc	Interface to access a candidate term mined from the document.
CandidateTermOccImpl	IMPORTANT: Never instantiate directly this class! A term mined from the document.
Citation	Interface to access a citation of a paper
CitationImpl	IMPORTANT: Never instantiate directly this class! CitationImpl represents a citation of the document
CitationLinker	Link InlineCitation with the corresponding bibliographic entry
CitationMarker	Interface to access a citation marker of a paper, that is, an inline reference to a bibliography citation
CitationMarkerImpl	IMPORTANT: Never instantiate directly this class! CitationMarkerImpl represents an inline reference to a citation in a paper
CitationSourceENUM	Type of external Web Service used to gather bibliographic entry metadata
ConceptLexicon
Const	Utility in-memory constant
ContainSubjectivityCueC	Check if the text contains any subjectivity cue
ContainTextInListC	Match the words in the annotation against a list provided by the constructor and return the number of words of the list matched
ContainWordsInListC	Match the words in the annotation against a list provided by the constructor and return the number of tokens that matched one word in the list
CorefChainBuilder	Spot mention of entities and build coreference chains
CrossRefConn	Collection of static methods to parse bibliographic entries by CrossRef REF: http://search.crossref.org/help/api
CrossRefResult	CrossRef search result - data model
DependencyGraph	Dependency Graph data structure and methods
DERIV_CITSprevNextSentencesC	Retrieve if the main verb of the sentence is active or passive
DERIV_CITSprevNextSentencesC.REF_SENT_AND_TYPE
DERIV_ContainTextInList_AL_negC	Match the words in the annotation against a list provided by the constructor and return the number of words of the list matched
DERIV_PassiveModalTenseC	Retrieve if the main verb of the sentence is active or passive
DERIV_PassiveModalTenseC.PASSIVE_MODAL_TENSE
DERIV_RelativePositionOfFirstMatchOfAnnotationNominalC	Get the sentence relative position of the first match of a certain annotation
DERIV_RelativePositionOfFirstMatchOfWordsInListC	Get the sentence relative position of the first match of the words in the annotation against a list of words provided by the constructor
DERIV_SentLengthNominalC	Get the number of intersecting annotations from a given annotation set, of a given type and optionally with a specific feature that has a specific String value.
DERIV_TFIDFsimilarityC	Retrieve if the main verb of the sentence is active or passive
DERIV_TFIDFsimilarityC.TYPE_SIM
DictCollections	Co-reference mention spotting in-memory dictionaries
DivClassCSSProperties	Captures some class properties of a div tag: bottom (y..), font-size (fs..), font family (ff..), height (h..), left (x..)
DocCacheManager
DocGraphTypeENUM	Types of graph generation approaches
DocParse	Collection of utility methods to extract specific data from document annotations
Document	Interface to access a Document processed by Dr Inventor. To get a Document instance through the `Document` interface, always use one of the `Factory` methods: `Factory.createNewDocument()`, `Factory.createNewDocument(String absoluteFilePath)`, `Factory.createNewDocument(File file)`
DocumentCtx	Hold sentence-context for rhetorical sentence classification.
DocumentIdentifierC	Get the name of the document (or a random integer if no name can be retrieved)
DocumentImpl	IMPORTANT: Never instantiate directly this class! To get a Document instance through the `Document` interface, always use one of the `Factory` methods: `Factory.createNewDocument()`, `Factory.createNewDocument(String absoluteFilePath)`, `Factory.createNewDocument(File file)`, `Factory.getEmptyDocument()`
DocumentJSON
DocumentSectionIdentifierC	Get the id and title of the document section that contains the sentence.
DocumentSectionIdentifierC.SectionTypeENUM
DocumentSectionPositionC	Get the order number of the sentence inside the document.
DocumentSectionPositionC.SectionPositionENUM
DocumentSentenceIdentifierC	Get the order number of the sentence inside the document (the number 1 is assigned to the first sentence of the abstract).
DRIexception	Base class for the exception hierarchy of Dr. Inventor.
DRIModule	Interface that every module should implement
DummyItem	Just a dumb item to test whether or not LexRank actually works.
ExactMatchSieve
ExampleBibsonomy	Example of bibliographic entry search by Bibsonomy REF: http://www.bibsonomy.org/help/doc/api.html
ExampleCrossRef	CrossRef connector example
ExampleFreeCite	FreeCite connector example
ExampleGoogleScholar	Google Scholar connector example
ExamplePDFproxyConn	PDFX connector example
ExamplePDFX	PDFX connector example
Factory	Factory class to get an instance of PDFimporter through the `PDFimporter` interface and to get new Document instances through the `Document` interface.
FeatG
FeatureFilter
FeatureValueOfFirstFilteredMatchC	Get a feature value of the feature with name equal to featureName of the first annotation that matches the filters (name, filterName, filterValue and filterStartsWith) of the annotation
FormulaicExpressionMatcher
FreeCiteConn	Collection of static methods to parse bibliographic entries by FreeCite REF: http://freecite.library.brown.edu/welcome/api_instructions
FreeCiteResult	FreeCite search result - data model
GateUtil	Collection of utility methods to interact with GATE documents.
GenderENUM
GenericDirectedGraph	Graph interaction methods to interact with directed graphs
GenericDirectedGraphGRPHimpl	IMPORTANT: Never instantiate directly this class! Generic graph implementation.
GoogleScholarConn	Utility methods to parse bibliographic entries by Google Scholar REF: https://scholar.google.com/
GoogleScholarResult	Google Scholar search result - data model
GraphToStringENUM	Types of graph to string serializations
GROBIDloaderImpl	IMPORTANT: Never instantiate directly this class! Implementation of the PDF loading methods of Dr Inventor.
Header	Interface to access the metadata of the header of the paper.
HeaderAnalyzer	Analyze the header of a paper to extract authors affiliations and e-mails
HeaderImpl	IMPORTANT: Never instantiate directly this class! HeaderImpl represents the header of a paper
HyphenWordsDictionary
ImporterBase	Base GATE processing resource to build importers from different data sources (PDF, XML schemas, etc.)
ImporterGROBID	From the XML mark-up generated by GROBID, this processing resource identifies all the textual contents.
ImporterJATS	From the JATS XML mark-up, this processing resource identifies all the textual contents, splits them into sentences with a customized REGEXP Sentence Splitter of ANNIE, and stores the new sentence annotations
ImporterPDFEXT	From the XML mark-up generated by PDFEXT, this processing resource identifies all the textual contents.
ImporterPDFX	From the XML mark-up generated by PDFX, this processing resource identifies all the textual contents and splits them into sentences with a customized REGEXP Sentence Splitter of ANNIE
InlineCitationSpotter	Validate the annotations of CandidateInlineCitation and CandidateInlineCitationMarker by generating the annotations InlineCitation and InlineCitationMarker
InstanceWeightGetterC	Get the name of the document (or a random integer if no name can be retrieved)
Institution	Interface to access the metadata of an institution (university, institute, etc.).
InstitutionImpl	IMPORTANT: Never instantiate directly this class! Institution representing an institution, usually mentioned in the header of the document as the affiliation of the authors
InternalProcessingException	Exceptions related to some internal data processing issue.
IntersectingAnnotationBooleanCountC	Get the number of intersecting annotations from a given annotation set, of a given type and if feature name is not null, with a given value for a boolean feature
IntersectingAnnotationStringCountC	Get the number of intersecting annotations from a given annotation set, of a given type and optionally with a specific feature that has a specific String value.
IntersectingAnnotationTextC	Get the text of intersecting annotations (separated by a space) from a given annotation set, of a given type and optionally with a specific feature that has a specific String value.
IntersectingGroupsOfCitationCountC	Get the number of intersecting annotations from a given annotation set, of a given type and optionally with a specific feature that has a specific String value.
InvalidParameterException	Exception related to invalid parameters passed to methods.
JATSloader	Interface to access the JATS XML importing methods of Dr. Inventor.
JATSloaderImpl	IMPORTANT: Never instantiate directly this class! Implementation of the JATS loading methods of Dr Inventor.
JSONgenerator	Generates JSON representations (serializations) of the core data structures that characterize a scientific document.
LangENUM	Language of paper text excerpt
LanguageDetector	Detect language of annotations of set and type defined in the input settings
Levenshtein	Levenshtein string distance
LexRanker	An Implementation of the LexRank algorithm described in the paper "LexRank: Graph-based Centrality as Salience in Text Summarization", Erkan & Radev '04, with some of our own modifications.
LexRankResults<T>	A dumb container class that holds results from the LexRank algorithm.
LexRankSummarizer	LexRank summarizer
MapUtil
MapValueComparator
MateParser	Citation-aware Mate-tools parser (pos tagger, lemmatizer, dep parser and semantic role labeller) REF: https://code.google.com/p/mate-tools/
MetaAnnotator	Refine and unify Meta-annotations (projects, funding agencies, ontologies, etc.) added by the related GATE Application.
MetaEntityTypeENUM	Type of meta-entity identified by the meta-entity annotator.
ModuleConfig	Class to configure how a scientific publication will be processed by the Dr. Inventor library.
NgramsC	Generate skipgrams from the text or lemmatized text of a sentence
NullPrintStream	Stream redirection utility
NumberENUM
NumberOfDependencyRelationsByTypeC	Count the number of dependency relations, optionally by type.
ObjectGenerator	Generates objects of the library starting from the original GATE document annotations
PDFextConn	Converting papers in PDF format to XML by means of PDFext (http://pdfext.taln.upf.edu/).
PDFEXTloaderImpl	IMPORTANT: Never instantiate directly this class! Implementation of the PDF loading methods of Dr Inventor.
PDFEXTparser
PDFEXTparser.State
PDFEXTresult
PDFEXTserver
PDFextStatic
PDFloader	Interface to access the PDF importing methods of Dr. Inventor.
PDFPaperParser	Collection of utilities to mine data from PDF documents
PDFproxyConn	Converting papers in PDF format to XML by means of PDFX (http://pdfx.cs.man.ac.uk/).
PDFtoTextConvMethod	Available types of PDF-to-text converters
PDFXConn	Converting papers in PDF format to XML by means of PDFX (http://pdfx.cs.man.ac.uk/).
PDFXloaderImpl	IMPORTANT: Never instantiate directly this class! Implementation of the PDF loading methods of Dr Inventor.
PersonENUM
PlainTextLoader	Interface to access the plain text importing methods of Dr. Inventor.
PlainTextLoaderImpl	IMPORTANT: Never instantiate directly this class! Implementation of the plain text importing methods of Dr Inventor.
PosENUM
PredicateNominativeSieve
PriorPolarityENUM
PronounSieve	Co-reference by pronoun spotting.
PropertyManager	Class to manage the configuration properties of the Dr. Inventor library.
PubIdENUM	Type of publication ID
RDFparse	Collection of utility methods to generate an RDF dataset that represents the contents of a document
RegexpMatcher
RelativePositionOfFirstMatchOfAnnotationC	Get the sentence relative position of the first match of a certain annotation
RelativePositionOfFirstMatchOfWordsInListC	Get the sentence relative position of the first match of the words in the annotation against a list of words provided by the constructor
RelativePronounSieve	Co-reference by relative pronoun spotting.
RelaxedMatchSieve	Co-reference by relaxed string match spotting.
ReplaceImage	Utility class to replace images in PDF files in order to decrease the final size of the same file.
ResourceAccessException	Exception related to the access of files and resources.
RhetoricalClassENUM	The set of rhetorical classes that can be assigned to each sentence
RhetoricalClassGetterC	Get the text of the annotation.
RhetoricalClassifier	Associate to each sentence a rhetorical class
RhetoricalGSGetterC	Get the GS rhetorical annotation of the sentence
RhetoricalGSGetterC.CLASS_TYPE_RHET
Section	Section of a paper
SectionImpl	IMPORTANT: Never instantiate directly this class! Section of a paper.
SectionNumberC	Get a Double from 0 to 100 that reflects the position of the annotation (central offset ID) with respect to the section the sentence belongs to.
SectionTypeC	Get the type of the section this sentence is in
Sentence	Interface to access the sentence of a document together with its descriptive features.
SentenceImpl	IMPORTANT: Never instantiate directly this class! Sentence of a document together with its descriptive features.
SentenceJSON
SentenceSelectorENUM	The type of sentences to select.
SentGraphTypeENUM	Types of graph generation approaches
Sieve	Generic co-reference sieve.
SieveTypeEnum
Similar<T>	An interface describing things that can have similarity measures.
SimLangENUM	List of available languages to compute text similarity.
SkipgramsC	Generate skipgrams from the text or lemmatized text of a sentence
SourceENUM
SpanishParser	EXPERIMENTAL!!!
SpotlightUtil	Collection of utility methods to access the DBpedia Spotlight Disambiguation & Entity Linking service.
StaticLists
StopWordList	Loader of stop word lists from external files
StopWords
SubjectivityElem
SubjectivityReader
SubjectivityTypeENUM
Summarizer	Generic summarizer.
SummaryTypeENUM	Types of summarization approaches
TermAnnotator	Annotate sequences of tokens as terms on the basis of the following set of POS sequence patterns that identify these terms: [JN]N <-- Best one [NV]J?N+ J+N+ [JN]ID?N+ [JN].*?N
TFIDFVectorWiki	Utility class to compute TF-IDF vectors from textual excerpts.
TitleSimSummarizer	Title similarity summarizer
Token	Interface to access the tokens of a sentence of a document together with their descriptive features.
TokenFilterInterface	Interface to define custom token filters
TokenFromDependencyRelationsC	Generate a string with the following info: DEPrelNAME_FROMlemma_TOlemma
TokenImpl	IMPORTANT: Never instantiate directly this class! Token of a sentence of a document together with its descriptive features.
TreeMaxDepthOfDependencyRelationsC	Count the max depth of the dependency relation tree
TripleJSON
Util
Util	Collection of comparison and sort utility methods.
Util	Co-reference spotting utilities.
UtilPDFX	Class including static methods to normalize the XML files generated by PDFX.
WikipediaLemmaPOSfFrequency	Utility class to retrieve lemma document frequency in Wikipedia.
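
Several rows above describe small, self-contained computations: the 0 to 100 position features (AnnotationPositionInDocumentC, AnnotationPositionInSectionC), the Levenshtein string distance, and skipgram generation (SkipgramsC). The Java sketch below shows one plausible reading of each description. It is not the library's code: the class name FeatureSketches, the method names and signatures, the clamping behaviour, and the underscore-joined skip-bigram format are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketches only; hypothetical names, not part of the Dr. Inventor API. */
final class FeatureSketches {

    /**
     * Position of an annotation's central offset, as a percentage (0 to 100)
     * of the enclosing span (the whole document or a single section).
     */
    static double relativePosition(long annStart, long annEnd, long spanStart, long spanEnd) {
        if (spanEnd <= spanStart) {
            throw new IllegalArgumentException("enclosing span must be non-empty");
        }
        double center = (annStart + annEnd) / 2.0;
        double pct = 100.0 * (center - spanStart) / (spanEnd - spanStart);
        // Assumption: clamp values for annotations that fall outside the span.
        return Math.max(0.0, Math.min(100.0, pct));
    }

    /** Classic Levenshtein edit distance, computed with two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows so prev always holds the latest row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Token pairs separated by at most maxSkip intervening tokens, in sentence order. */
    static List<String> skipBigrams(List<String> tokens, int maxSkip) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            for (int j = i + 1; j < tokens.size() && j - i - 1 <= maxSkip; j++) {
                out.add(tokens.get(i) + "_" + tokens.get(j));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(relativePosition(40, 60, 0, 200));
        System.out.println(levenshtein("kitten", "sitting"));
        System.out.println(skipBigrams(Arrays.asList("we", "propose", "a", "method"), 1));
    }
}
```

With maxSkip set to 0 the last method reduces to ordinary bigrams, which is one way the NgramsC and SkipgramsC rows could relate. The listing does not say how the library itself parameterizes either feature.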