Interface Document

    • Method Detail

      • loadXML

        void loadXML​(String absoluteFilePath)
              throws DRIexception
        Load the XML string-serialized contents of the document (UTF-8) from a file, by specifying the file's absolute path
        Parameters:
        absoluteFilePath - the absolute path of the file with the XML string-serialized contents of the document to load
        Throws:
        DRIexception
      • loadXML

        void loadXML​(File file)
              throws DRIexception
        Load the XML string-serialized contents of the document (UTF-8) from a file
        Parameters:
        file - the file with the XML string-serialized contents of the document to load
        Throws:
        DRIexception
      • loadXMLString

        void loadXMLString​(String XMLStringContents)
                    throws DRIexception
        Load the XML string-serialized contents of the document from a string (UTF-8 char encoding)
        Parameters:
        XMLStringContents - the String with the XML serialized contents to load
        Throws:
        InternalProcessingException
        DRIexception
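
The three load methods above differ only in where the UTF-8 XML comes from. The following is a minimal sketch of how they might relate, not the library's implementation: `DRIexception` and `Document` are stand-ins that mirror only the signatures listed above, and `InMemoryDocument`, `loadFromPath`, and `roundTrip` are hypothetical helpers added so the example runs on its own.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LoadXmlSketch {

    // Stand-in for the library's checked exception (assumption: the real class is not shown here).
    static class DRIexception extends Exception {
        DRIexception(String message, Throwable cause) { super(message, cause); }
    }

    // Stand-in exposing only the three load methods documented above.
    interface Document {
        void loadXML(String absoluteFilePath) throws DRIexception;
        void loadXML(File file) throws DRIexception;
        void loadXMLString(String XMLStringContents) throws DRIexception;
    }

    // Hypothetical in-memory implementation, present only to make the sketch runnable.
    static class InMemoryDocument implements Document {
        String xml;

        public void loadXML(String absoluteFilePath) throws DRIexception {
            loadXML(new File(absoluteFilePath));
        }

        public void loadXML(File file) throws DRIexception {
            try {
                // Decode explicitly as UTF-8 instead of relying on the platform default charset.
                byte[] bytes = Files.readAllBytes(file.toPath());
                loadXMLString(new String(bytes, StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new DRIexception("Cannot read " + file, e);
            }
        }

        public void loadXMLString(String XMLStringContents) {
            this.xml = XMLStringContents;
        }
    }

    // Resolve a possibly relative path to an absolute one before calling loadXML(String).
    static void loadFromPath(Document doc, Path path) throws DRIexception {
        doc.loadXML(path.toAbsolutePath().toString());
    }

    // Write XML to a temporary file as UTF-8, load it back, and return what the document holds.
    static String roundTrip(String xml) {
        try {
            Path tmp = Files.createTempFile("doc", ".xml");
            try {
                Files.write(tmp, xml.getBytes(StandardCharsets.UTF_8));
                InMemoryDocument doc = new InMemoryDocument();
                loadFromPath(doc, tmp);
                return doc.xml;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException | DRIexception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("<doc>caf\u00e9</doc>"));
    }
}
```

Against the real interface, a caller would presumably wrap each load call in a `try`/`catch` for `DRIexception`, as `roundTrip` does here.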
      • extractSentences

        List<Sentence> extractSentences​(SentenceSelectorENUM sentenceSel)
                                 throws InternalProcessingException
        Load the list of sentences of the document, ordered by their occurrence in the document. If sentences have not been extracted, the first time this method is executed the document text is split into sentences.
        Parameters:
        sentenceSel - the type of sentence to select
        Returns:
        the list of sentences in document order
        Throws:
        InternalProcessingException
      • extractTerminology

        List<CandidateTermOcc> extractTerminology()
                                           throws DRIexception
        Load the list of terms extracted from the document. If the terminology has not been extracted from the document, the first time this method is executed relevant terms are extracted from the document.
        Returns:
        the list of candidate term occurrences extracted from the document
        Throws:
        DRIexception
      • extractSentenceGraph

        DependencyGraph extractSentenceGraph​(int sentenceId,
                                             SentGraphTypeENUM graphType)
                                      throws DRIexception
        Get the graph representing a sentence. The id of the sentence can be obtained from the sentences returned by extractSentences(). NB: an experimental sentence graph merging approach is implemented.
        Parameters:
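        sentenceId - the id of the sentence whose graph is requested
        graphType - the type of sentence graph to build
        Returns:
        the graph representing the sentence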
        Throws:
        DRIexception
      • extractDocumentGraph

        DependencyGraph extractDocumentGraph​(SentenceSelectorENUM sentenceSel)
                                      throws DRIexception
        Get the graph representing a portion of a document. The nodes of the graph are merged by relying on co-reference chains.
        Parameters:
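        sentenceSel - the type of sentence to include in the graph
        Returns:
        the graph representing the selected portion of the document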
        Throws:
        DRIexception
      • resetDocumentExtractionData

        void resetDocumentExtractionData()
                                  throws InternalProcessingException
        This method deletes all data extracted from the original document, including sentences, terminology, citations, etc. After it is called on a Document object, the next access to sentences, terminology, citations, etc. triggers a new extraction instead of reading the output of a previous one.
        Throws:
        InternalProcessingException
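
The extract-on-first-access behaviour and the effect of a reset can be pictured with a small cache. This sketch illustrates the pattern only and makes no claim about how the library stores its data: `LazyExtractionSketch`, its naive regex sentence splitter, and `getExtractionRuns` are hypothetical, and sentences are modelled as plain strings rather than `Sentence` objects.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LazyExtractionSketch {
    private final String text;
    private List<String> sentences;   // null means "not extracted yet"
    private int extractionRuns = 0;   // counts how often the splitter actually ran

    LazyExtractionSketch(String text) {
        this.text = text;
    }

    // The first call splits the text; later calls return the cached list.
    List<String> extractSentences() {
        if (sentences == null) {
            extractionRuns++;
            // Naive splitter: break on whitespace that follows '.', '!' or '?'.
            sentences = Collections.unmodifiableList(
                    Arrays.asList(text.trim().split("(?<=[.!?])\\s+")));
        }
        return sentences;
    }

    // Dropping the cache forces the next access to extract again.
    void resetDocumentExtractionData() {
        sentences = null;
    }

    int getExtractionRuns() {
        return extractionRuns;
    }

    public static void main(String[] args) {
        LazyExtractionSketch doc = new LazyExtractionSketch("First sentence. Second one!");
        doc.extractSentences();              // runs the splitter
        doc.extractSentences();              // served from the cache
        System.out.println(doc.getExtractionRuns());
        doc.resetDocumentExtractionData();
        doc.extractSentences();              // runs the splitter again
        System.out.println(doc.getExtractionRuns());
    }
}
```

The practical consequence is that a reset trades memory for time: extraction results are recomputed on the next access, so it is best reserved for cases where the cached results are no longer wanted.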
      • cleanUp

        void cleanUp()
              throws InternalProcessingException
        Call this method only WHEN YOU ARE SURE YOU WILL NO LONGER USE THE DOCUMENT. This method clears all the document data structures, making the memory they occupy eligible for garbage collection. Note that if you try to access or call methods of the document after calling this method, an exception will be raised stating that the resource has already been closed and its data cleaned.
        Throws:
        InternalProcessingException
      • isCleanUp

        boolean isCleanUp()
                   throws InternalProcessingException
        Check whether the document data structures have been cleaned by calling the cleanUp() method. A cleaned-up document can no longer be used; if you try to access or call methods of the document after calling cleanUp(), an exception will be raised stating that the resource has already been closed and its data cleaned.
        Returns:
        true if the document data structures have been cleaned.
        Throws:
        InternalProcessingException
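
One way to honour the cleanUp() contract is to release the document in a `finally` block and treat it as unusable afterwards. The sketch below is an assumption-laden illustration: `StubDocument` and its `size()` method are hypothetical, `InternalProcessingException` is a local stand-in for the library's exception, and the stub's `cleanUp()` and `isCleanUp()` omit the `throws` clause of the real signatures for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CleanUpSketch {

    // Local stand-in for the library's exception type (assumption).
    static class InternalProcessingException extends Exception {
        InternalProcessingException(String message) { super(message); }
    }

    // Hypothetical document stub that rejects access once it has been cleaned.
    static class StubDocument {
        private List<String> data = new ArrayList<>(Arrays.asList("sentence 1", "sentence 2"));
        private boolean cleaned = false;

        int size() throws InternalProcessingException {
            if (cleaned) {
                throw new InternalProcessingException("document already cleaned up");
            }
            return data.size();
        }

        void cleanUp() {
            data = null;      // drop the reference so the list can be garbage collected
            cleaned = true;
        }

        boolean isCleanUp() {
            return cleaned;
        }
    }

    // Use the document, always clean it up, then show that later access is rejected.
    static String demo() {
        StubDocument doc = new StubDocument();
        StringBuilder log = new StringBuilder();
        try {
            log.append("size=").append(doc.size());
        } catch (InternalProcessingException e) {
            log.append("unexpected");
        } finally {
            doc.cleanUp();    // runs whether or not processing succeeded
        }
        log.append(" cleaned=").append(doc.isCleanUp());
        try {
            doc.size();
            log.append(" no-error");
        } catch (InternalProcessingException e) {
            log.append(" rejected");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Checking `isCleanUp()` before touching a document that may be shared between components avoids relying on the exception for control flow.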