Class PDFproxyConn


  • public class PDFproxyConn
    extends Object
    Converts papers in PDF format to XML by means of PDFX (http://pdfx.cs.man.ac.uk/). Also provides utility methods to compress the images in those PDF files.
    • Constructor Summary

      Constructors 
      Constructor Description
      PDFproxyConn()  
    • Method Summary

      Modifier and Type Method Description
      static int convertFilesAndStore(String fileOrDirFullPath, String tags, boolean enablePDFcompression, float compressionFactor, boolean recursiveDir, int timeout)
      Convert a PDF file, or recursively all PDF files in a directory, by means of PDFX (http://pdfx.cs.man.ac.uk/).
      static byte[] pdfCompress(byte[] inputPDF, float compressionFactor, boolean greyImages)
      Compress the images included in a PDF file in order to reduce the file size.
      static String processPDF(byte[] inputBytes, String tags, int timeout)
      Get a PDF file (max 5 MB) as a byte array and transform it into an annotated XML file by means of the PDFX Web Service (http://pdfx.cs.man.ac.uk/).
      static Map<String,String> processPDF(String inputFilePath, String tags, int timeout)
      Get a PDF file (max 5 MB) by its path and transform it into an annotated XML file by means of the PDFX Web Service (http://pdfx.cs.man.ac.uk/).
    • Field Detail

      • useProxy

        public static boolean useProxy
      • proxyScheme

        public static String proxyScheme
      • proxyHost

        public static String proxyHost
      • proxyPort

        public static Integer proxyPort
    • Constructor Detail

      • PDFproxyConn

        public PDFproxyConn()
    • Method Detail

      • pdfCompress

        public static byte[] pdfCompress(byte[] inputPDF,
                                         float compressionFactor,
                                         boolean greyImages)
        Compress the images included in a PDF file in order to reduce the file size. The PDF is passed in and returned as a byte array.
        Parameters:
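        inputPDF - the contents of the PDF file as a byte array
        compressionFactor - the compression factor to apply to the images
        greyImages - whether to convert the images to greyscale
        Returns:
        the PDF with compressed images, as a byte array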
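        A minimal sketch of how a caller might guard the result of pdfCompress, assuming that a null, empty, or larger result should be discarded in favour of the original bytes (the documentation above does not say how failures are reported). The class PdfCompressSketch, the interface ImageCompressor, and the method compressIfSmaller are hypothetical names introduced only for this illustration; ImageCompressor simply mirrors the documented pdfCompress signature so that the sketch compiles on its own.

```java
class PdfCompressSketch {

    /** Hypothetical stand-in with the same parameter list as PDFproxyConn.pdfCompress. */
    @FunctionalInterface
    interface ImageCompressor {
        byte[] compress(byte[] inputPDF, float compressionFactor, boolean greyImages);
    }

    /**
     * Runs the compressor and returns its output only when that output is
     * non-empty and strictly smaller than the input; otherwise the original
     * array is returned unchanged.
     */
    static byte[] compressIfSmaller(byte[] original, float compressionFactor,
                                    boolean greyImages, ImageCompressor compressor) {
        byte[] compressed = compressor.compress(original, compressionFactor, greyImages);
        // Assumption: null or empty output means the compression did not succeed.
        boolean usable = compressed != null
                && compressed.length > 0
                && compressed.length < original.length;
        return usable ? compressed : original;
    }
}
```

        With the library on the classpath, a call could look like compressIfSmaller(bytes, 0.5f, false, PDFproxyConn::pdfCompress), provided pdfCompress declares no checked exceptions. The value 0.5f is arbitrary here; the valid range of compressionFactor is not documented on this page.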
      • processPDF

        public static Map<String,String> processPDF(String inputFilePath,
                                                    String tags,
                                                    int timeout)
        Get a PDF file (max 5 MB) by its path and transform it into an annotated XML file by means of the PDFX Web Service (http://pdfx.cs.man.ac.uk/).
        Parameters:
        inputFilePath - path of the PDF file to process
        timeout - set the socket timeout in milliseconds
        Returns:
      • processPDF

        public static String processPDF(byte[] inputBytes,
                                        String tags,
                                        int timeout)
        Get a PDF file (max 5 MB) as a byte array and transform it into an annotated XML file by means of the PDFX Web Service (http://pdfx.cs.man.ac.uk/).
        Parameters:
        inputBytes - input byte array
        timeout - set the socket timeout in milliseconds
        Returns:
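        Because the service accepts PDF files of at most 5 MB, a caller may want to check the byte array before sending it. The sketch below is one way to do that; it assumes the limit means 5 * 1024 * 1024 bytes, which this page does not confirm. PdfxSizeGuard, MAX_PDFX_BYTES, and fitsPdfxLimit are hypothetical names, not part of PDFproxyConn.

```java
class PdfxSizeGuard {

    // Assumption: the documented 5 MB limit is read as 5 * 1024 * 1024 bytes.
    static final long MAX_PDFX_BYTES = 5L * 1024 * 1024;

    /** Returns true when the array is non-null, non-empty, and within the assumed limit. */
    static boolean fitsPdfxLimit(byte[] pdfBytes) {
        return pdfBytes != null
                && pdfBytes.length > 0
                && pdfBytes.length <= MAX_PDFX_BYTES;
    }
}
```

        A caller could invoke processPDF(inputBytes, tags, timeout) only when fitsPdfxLimit(inputBytes) is true, and otherwise try pdfCompress first to bring the file under the limit.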
      • convertFilesAndStore

        public static int convertFilesAndStore(String fileOrDirFullPath,
                                               String tags,
                                               boolean enablePDFcompression,
                                               float compressionFactor,
                                               boolean recursiveDir,
                                               int timeout)
        Convert a PDF file, or recursively all PDF files in a directory, by means of PDFX (http://pdfx.cs.man.ac.uk/). Compression of the images in the PDF files can be enabled, together with a compression factor.
        Parameters:
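        fileOrDirFullPath - full path of a PDF file or of a directory containing PDF files
        tags -
        enablePDFcompression - whether to enable compression of the images in the PDF files
        compressionFactor - the compression factor to use when compression is enabled
        recursiveDir - whether to process the PDF files in subdirectories recursively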
        Returns:
        The number of PDF files correctly converted and stored
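
        To illustrate what selecting "a PDF file or recursively all PDF files in a directory" can look like, here is a sketch of the file discovery step using java.nio.file.Files.walk. It is not the implementation of convertFilesAndStore. PdfFileFinder and findPdfFiles are hypothetical names, and two behaviours are assumptions: that recursiveDir set to false limits the search to direct children, and that the .pdf extension is matched case-insensitively.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PdfFileFinder {

    /**
     * Returns the PDF files found at the given path. If the path is a regular
     * file, the result holds that file when its name ends in ".pdf". If it is a
     * directory, direct children are scanned, and subdirectories as well when
     * recursiveDir is true.
     */
    static List<Path> findPdfFiles(Path fileOrDir, boolean recursiveDir) {
        // Depth 1 visits the start path and its direct children only.
        int maxDepth = recursiveDir ? Integer.MAX_VALUE : 1;
        try (Stream<Path> paths = Files.walk(fileOrDir, maxDepth)) {
            return paths
                    .filter(Files::isRegularFile)
                    // Assumption: extension matching ignores case.
                    .filter(p -> p.getFileName().toString()
                            .toLowerCase(Locale.ROOT).endsWith(".pdf"))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            // A missing or unreadable path surfaces as an unchecked exception.
            throw new UncheckedIOException(e);
        }
    }
}
```

        The size of the returned list gives an upper bound on the value that convertFilesAndStore could return for the same path, since only correctly converted and stored files are counted.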