Tagged PDF disabled after docx to pdf conversion

Hi,

When I convert a DOCX document to PDF using this code snippet:

Aspose.Words.Document docx = new Aspose.Words.Document(@"C:\Testdoc.docx");
var saveOptions = new Aspose.Words.Saving.PdfSaveOptions
{ SaveFormat = Aspose.Words.SaveFormat.Pdf };
docx.Save(@"C:\Testdoc.pdf", saveOptions);

The resulting PDF has “Tagged PDF: No” (when looking at the generated PDF's properties in Aspose).
When doing the same conversion in Word, it is possible to choose whether Tagged PDF should be enabled or not. By default it is enabled.
Tagged PDF has been supported by the PDF format since version 1.4. Aspose generates version 1.5, so it should be possible.

The question is: which API setting should be provided (maybe on SaveOptions?) to enable Tagged PDF?

/Bartek

@jhas

Please use the PdfSaveOptions.ExportDocumentStructure property as shown below to get the “Tagged PDF” output.

Document doc = new Document(MyDir + "input.docx");
PdfSaveOptions pdfSaveOptions = new PdfSaveOptions();
pdfSaveOptions.ExportDocumentStructure = true;
doc.Save(MyDir + "19.5.pdf", pdfSaveOptions);

Hi @tahir.manzoor,

Thanks, your suggestion works for PDF conversion. However, if I have to merge a couple of Tagged PDFs (after the conversion) using the following code:

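void MergePdfs(List<Stream> streams, Stream outputStream)
{
    // Target document that collects the pages of every input PDF.
    var output = new Aspose.Pdf.Document();

    foreach (var stream in streams)
    {
        var doc = new Aspose.Pdf.Document(stream);

        // Append all pages of the current document to the output.
        output.Pages.Add(doc.Pages);
    }

    output.Save(outputStream);
}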

The result is not a tagged PDF. Do you have a solution for this? I had a look at the Aspose.PDF.Saving.SaveOptions class, but it seems of no help here.

Thanks in advance!

@jhas

To ensure a timely and accurate response, please attach the following resources here for testing:

  • Your input Word and PDF documents.
  • Please create a standalone console application (source code without compilation errors) that helps us reproduce your problem on our end, and attach it here for testing.

As soon as you have these pieces of information ready, we will start investigating your issue and provide you with more information. Thanks for your cooperation.

PS: To attach these resources, please zip and upload them.

Aspose.PDF: 18.3.0.0
Aspose.Words: 18.4.0.0

The attached console app runs in trial mode, but the outcome is the same when I set the license as well.

The app takes one file, testDocument.docx. It’s a simple Word document. Please adjust the path to the document in the source code of the attached app.

The app produces 3 files:

  • taggedPDF1.pdf - tagged PDF: Yes
  • taggedPDF2.pdf - tagged PDF: Yes
  • merged.pdf - tagged PDF: No

Our expectation was that if we open and close PDFs that are already tagged, the outcome would also be tagged. Is there any way to achieve that?

Also, we would like to be able to produce a merged Tagged PDF from content that is only partially tagged, e.g. three PDFs that are tagged and one that is not. Is this possible?

@jhas

We could not find any attached project or sample file with your post. Would you please make sure to upload them using the upload button in the post editor? In case of larger files, please upload them to Google Drive or Dropbox and share the link with us. We will test the scenario in our environment and address it accordingly.

You’re right, the file was too large to upload here. I put it on Dropbox: AsposeTaggedPDFTestConsoleApp.zip

@jhas

Thank you for sharing the requested data.

We have worked with the data you shared and have been able to reproduce the issue in our environment. A ticket with ID PDFNET-46554 has been logged in our issue management system for further investigation and resolution. The ticket ID has been linked with this thread so that you will receive a notification as soon as the ticket is resolved.

We are sorry for the inconvenience.

Hi @Farhan.Raza,

Thanks. Is there any way you could prioritize this? We have a customer waiting for this functionality.

What is the expected time frame for investigation/resolution?

@jhas

Please note that the ticket has been logged under the free support model and will be investigated on a first-come, first-served basis. Therefore, it may take some months to resolve. As soon as we have a definite update or an ETA regarding ticket resolution, we will let you know.

Moreover, we also offer Paid Support, where issues are investigated with higher priority. Customers with a paid support subscription report their issues there, and those issues are investigated urgently. If your reported issue is a blocker, please consider subscribing to Paid Support. For further information, please visit Paid Support FAQs.

Hello Farhan,

Could you please tell me the status of this ticket?

Thanks,
Maciej

@jhas

We are afraid it is still pending owing to previously logged and critical tickets. We will let you know as soon as an update is available in this regard.

Is there any update on this one? I am facing the same issue: when merging two tagged PDFs, the tagging is lost after saving the merged PDF.

@Venkateshwaran

We are afraid that the earlier logged ticket has not yet been resolved due to other pending issues in the queue. Nevertheless, we will surely inform you as soon as we make some progress towards ticket resolution. Please be patient and allow us some time.

We apologize for the inconvenience.