PDF-to-HTML: Paragraph lines break into individual tags

Requirement - Convert a PDF into an HTML page.
Issue - Paragraph text does not come out as a single element. It breaks at every line, and the respective style is applied to each line's individual element. We want a single element that carries all the required styles for that paragraph.
Tech - .NET, Aspose.PDF

@yuvraj.kale

Can you please share your sample source PDF file for our reference along with the code snippet that you are using for conversion? We will test the scenario in our environment and address it accordingly.

Requirement Example - pdf-to-html-conversion-requirement.png (304.8 KB)

Additional information about the source file - This PDF is layered and was exported from Adobe InDesign. The ‘paragraph’ referred to above is a single text element placed on the ‘description’ layer. Our requirement is to treat this ‘paragraph’ as a single element and convert it into HTML with a single div/span tag for the entire ‘paragraph’ text content.

Source PDF file - sampleLayout.pdf (75.8 KB)

Code Snippet - `// layered PDF--------------------------
Aspose.Pdf.Document doc = new Aspose.Pdf.Document(Input);
// Instantiate HTML SaveOptions object
HtmlSaveOptions htmlOptions = new HtmlSaveOptions();

// Specify to render PDF document layers separately in output HTML
htmlOptions.ConvertMarkedContentToLayers = true;
htmlOptions.FixedLayout = true;
htmlOptions.UseZOrder = true;

// Save the document
doc.Save(Output + "LayersRendering_out.html", htmlOptions);`
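
To make the requested output concrete, here is a minimal sketch of the merge step: several per-line strings go in, and one `<p>` element comes out. It is not part of the conversion above and not an Aspose API. `ParagraphMerger`, `ToParagraphHtml`, and the `cssClass` parameter are hypothetical names used only for illustration. The sketch assumes the per-line text has already been collected by some other means, for example by parsing the generated HTML or from a text extraction that groups lines by paragraph.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

public static class ParagraphMerger
{
    // Hypothetical helper: merges per-line strings into one <p> element.
    // Styling is assumed to come from a single CSS class on that element.
    public static string ToParagraphHtml(IEnumerable<string> lines, string cssClass)
    {
        // Trim each line and drop empty ones so the join does not produce double spaces.
        IEnumerable<string> cleaned = lines
            .Select(line => line.Trim())
            .Where(line => line.Length > 0);

        // Join with a single space, then HTML-encode so '<' and '&' stay literal text.
        string text = WebUtility.HtmlEncode(string.Join(" ", cleaned));
        string cls = WebUtility.HtmlEncode(cssClass);

        return $"<p class=\"{cls}\">{text}</p>";
    }
}
```

This only covers the text and one shared class. The absolute positioning that `FixedLayout = true` produces is not preserved, so the merged element would need its own placement rules.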

@yuvraj.kale

An issue has been logged as PDFNET-53056 in our issue tracking system for further investigation of this scenario. We will look into the details and keep you posted on the status of the fix. Please be patient and give us some time.

We are sorry for the inconvenience.

Hi Team,

Any update on this issue?

Thanks,

@yuvraj.kale

The ticket was logged in our issue tracking system only recently and is pending investigation. We will investigate and resolve it on a first-come, first-served basis and let you know as soon as it is resolved. Please be patient and give us some time.

We are sorry for the inconvenience.