Pdfbox to image

  java

Order

If you use soffice’s command to transfer pdf to image, you will only be able to transfer one page by default. This is a bit laborious. Therefore, you do not intend to follow soffice’s plan and use pdfbox instead.

maven

        <dependency>
            <groupId>org.apache.pdfbox</groupId>
            <artifactId>pdfbox</artifactId>
            <version>2.0.4</version>
        </dependency>

        <dependency>
            <groupId>org.apache.pdfbox</groupId>
            <artifactId>pdfbox-tools</artifactId>
            <version>2.0.4</version>
        </dependency>

Conversion

public static List<BufferedImage> convertToImage(File file) throws IOException {
        PDDocument document = PDDocument.load(file);
        PDFRenderer pdfRenderer = new PDFRenderer(document);
        List<BufferedImage> bufferedImageList = new ArrayList<>();

        for (int page = 0;page<document.getNumberOfPages();page++){
            BufferedImage img = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);
            bufferedImageList.add(img);
        }
        document.close();

        return bufferedImageList;
    }

concat

public static BufferedImage concat(BufferedImage[] images) throws IOException {
        int heightTotal = 0;
        for(int j = 0; j < images.length; j++) {
            heightTotal += images[j].getHeight();
        }

        int heightCurr = 0;
        BufferedImage concatImage = new BufferedImage(images[0].getWidth(), heightTotal, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2d = concatImage.createGraphics();
        for(int j = 0; j < images.length; j++) {
            g2d.drawImage(images[j], 0, heightCurr, null);
            heightCurr += images[j].getHeight();
        }
        g2d.dispose();

        return concatImage;
}

Summary

In this way, we have basically completed the task. The disadvantage is that the performance is too low and needs to be optimized.