Convert Word DOCX Document to Scanned PDF using Java | Save Word Pages into Images | Convert Protected Images to PDF

I want to convert a DOCX document into a scanned PDF document, and I need to be able to set the resolution of that scanned PDF.
I know how to convert a DOCX to a text-based PDF, but I want it converted to a scanned PDF instead.
How can I do this?

@PrachiDongre,

Can you please clarify what exactly you mean by “scanned PDF document” and “resolution of the scanned PDF document”?

Aspose.Words provides the PdfCompliance enumeration to support converting a document to various PDF format standards (such as PDF 1.7, PDF 1.5, etc.). It can be specified via the PdfSaveOptions class when rendering. The default value is Pdf17.

If you are unable to convert DOCX to PDF (for example, you see an exception), please ZIP and upload the input Word document that reproduces the problem, together with the relevant piece of source code, here for testing. We will then investigate the issue on our end and provide you with more information.

test.pdf (21.0 KB)

I have attached an example of a scanned PDF.
A scanned PDF is one from which text cannot be selected or copied.
By resolution I mean DPI (dots per inch).

@PrachiDongre,

You can first render every page of the Word document to an image in memory and then place those images into a new PDF by using Aspose.Words. Please try the following code:

Document document = new Document("E:\\Temp\\input.docx");

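// Render each page to its own in-memory JPEG.
ArrayList images = new ArrayList();
ImageSaveOptions options = new ImageSaveOptions(SaveFormat.Jpeg);
options.JpegQuality = 100;
// Uncomment to render at 300 DPI instead of the default resolution.
// options.Resolution = 300;
options.PageCount = 1;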
for (int i = 0; i < document.PageCount; i++)
{
    options.PageIndex = i;
    MemoryStream stream = new MemoryStream();
    document.Save(stream, options);
    stream.Position = 0;

    images.Add(stream);
}

Document finalDoc = new Document();
finalDoc.RemoveAllChildren();
foreach (MemoryStream stream in images)
{
    using (Image image = Image.FromStream(stream))
    {
        Document imageDoc = new Document();
        DocumentBuilder builder = new DocumentBuilder(imageDoc);

        PageSetup ps = builder.PageSetup;
        ps.PageWidth = ConvertUtil.PixelToPoint(image.Width, image.HorizontalResolution);
        ps.PageHeight = ConvertUtil.PixelToPoint(image.Height, image.VerticalResolution);

        // Insert the image into the document and position it at the top left corner of the page.
        builder.InsertImage(
            image,
            RelativeHorizontalPosition.Page,
            0,
            RelativeVerticalPosition.Page,
            0,
            ps.PageWidth,
            ps.PageHeight,
            WrapType.None);

        finalDoc.AppendDocument(imageDoc, ImportFormatMode.KeepSourceFormatting);
    }
}

finalDoc.Save("E:\\Temp\\20.4.pdf"); 
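
The page size in the code above comes from converting the image's pixel dimensions to points at the image's own resolution. As a rough illustration of that arithmetic, here is a minimal sketch in plain Java (the language this thread is ultimately about). It assumes the usual 72 points per inch; the class and method names are hypothetical and are not part of Aspose.Words.

```java
public class PixelPointSketch {
    // Assumption: 1 inch = 72 points, the usual typographic convention.
    static final double POINTS_PER_INCH = 72.0;

    // Physical size in points of a pixel length rendered at the given DPI.
    static double pixelToPoint(double pixels, double dpi) {
        return pixels * POINTS_PER_INCH / dpi;
    }

    // Pixel length needed to cover a size in points at the given DPI.
    static long pointToPixel(double points, double dpi) {
        return Math.round(points * dpi / POINTS_PER_INCH);
    }

    public static void main(String[] args) {
        double letterWidthPt = 612.0; // 8.5 inches expressed in points
        for (double dpi : new double[] {96, 150, 300}) {
            long px = pointToPixel(letterWidthPt, dpi);
            // The pixel count grows with DPI, but the physical page width does not.
            System.out.println(dpi + " dpi: " + px + " px wide, "
                    + pixelToPoint(px, dpi) + " pt");
        }
    }
}
```

The point of the sketch is that a higher DPI produces more pixels for the same physical page, so the conversion back to points has to use the same DPI that was used for rendering; otherwise the PDF pages come out too large or too small.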

Hope this helps.

Yes, this was helpful. I am writing this code in Java but could not find an equivalent for MemoryStream that works here. I tried using ByteArrayOutputStream and ByteArrayInputStream, but that did not work for this code.

@PrachiDongre,

You can build logic on the following Java code to get the desired output:

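import com.aspose.words.*;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.ArrayList;
import javax.imageio.ImageIO;

Document document = new Document("E:\\Temp\\input.docx");

// Render each page to its own in-memory JPEG.
ArrayList images = new ArrayList();
ImageSaveOptions options = new ImageSaveOptions(SaveFormat.JPEG);
options.setJpegQuality(100);
// Uncomment to render at 300 DPI instead of the default 96 DPI.
// options.setResolution(300);
options.setPageCount(1);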
for (int i = 0; i < document.getPageCount(); i++) {
    options.setPageIndex(i);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    document.save(baos, options);

    images.add(baos);
}

Document finalDoc = new Document();
finalDoc.removeAllChildren();
for (int i = 0; i < images.size(); i++) {
    ByteArrayOutputStream baos = (ByteArrayOutputStream) images.get(i);
    InputStream inputStream = new ByteArrayInputStream(baos.toByteArray());
    BufferedImage image = ImageIO.read(inputStream);

    Document imageDoc = new Document();
    DocumentBuilder builder = new DocumentBuilder(imageDoc);

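    PageSetup ps = builder.getPageSetup();
    // pixelToPoint(pixels) assumes 96 DPI. If you render at another resolution (e.g. 300),
    // pass it as the second argument: ConvertUtil.pixelToPoint(image.getWidth(), 300).
    ps.setPageWidth(ConvertUtil.pixelToPoint(image.getWidth()));
    ps.setPageHeight(ConvertUtil.pixelToPoint(image.getHeight()));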

    // Insert the image into the document and position it at the top left corner of the page.
    builder.insertImage(
            image,
            RelativeHorizontalPosition.PAGE,
            0,
            RelativeVerticalPosition.PAGE,
            0,
            ps.getPageWidth(),
            ps.getPageHeight(),
            WrapType.NONE);

    finalDoc.appendDocument(imageDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
}

finalDoc.save("E:\\Temp\\awjava20.4.pdf"); 
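
On the MemoryStream question specifically: the code above writes each page into a ByteArrayOutputStream and later wraps its bytes in a ByteArrayInputStream for reading. The sketch below isolates just that round trip using only the JDK, so it can be checked without Aspose.Words. The class name and the `renderBlankPage` helper are hypothetical; the helper merely stands in for `document.save(baos, options)`.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

public class StreamRoundTripSketch {
    // Hypothetical stand-in for document.save(baos, options):
    // encodes a blank RGB image as JPEG into an in-memory buffer.
    static ByteArrayOutputStream renderBlankPage(int width, int height) {
        BufferedImage page = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(page, "jpg", baos)) {
                throw new IOException("No JPEG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos;
    }

    // Unlike MemoryStream, a ByteArrayOutputStream cannot be rewound and read;
    // copy its bytes into a ByteArrayInputStream instead.
    static BufferedImage readBack(ByteArrayOutputStream baos) {
        try (InputStream in = new ByteArrayInputStream(baos.toByteArray())) {
            BufferedImage image = ImageIO.read(in);
            if (image == null) {
                throw new IOException("No registered ImageIO reader could decode the buffer");
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BufferedImage decoded = readBack(renderBlankPage(816, 1056));
        System.out.println(decoded.getWidth() + " x " + decoded.getHeight());
    }
}
```

A common reason this pattern appears not to work is that ImageIO.read returns null rather than throwing when it cannot decode the data, which then surfaces later as a NullPointerException; the explicit null check above makes that case visible.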

Hope this helps.