Text split into multiple spans after converting to HTML

Hi,

my test document contains two text boxes, each containing a phrase. When I convert the document to HTML, the first word of each phrase is split into multiple span elements.

The resulting HTML:

<div class="stl_03 stl_04">
	<div class="stl_01" style="left:1.6667em;top: 0.6559em; "><span class="stl_05 stl_06 stl_07">Evaluation Only. Created with Aspose.PDF. Copyright 2002-2018 Aspose Pty Ltd. &nbsp;</span></div>
	<div class="stl_01" style="left:8.3268em;top: 6.9276em; z-index:2; "><span class="stl_08 stl_09 stl_10">W</span><span class="stl_08 stl_09 stl_07">a</span><span class="stl_08 stl_09 stl_11" style="word-spacing:0.0053em;">arde in tekstvak 1 &nbsp;</span></div>
	<div class="stl_01" style="left:8.3268em;top: 16.4355em; z-index:22; "><span class="stl_08 stl_09 stl_10">W</span><span class="stl_08 stl_09 stl_07">a</span><span class="stl_08 stl_09 stl_11" style="word-spacing:0.0053em;">arde in tekstvak 2 &nbsp;</span></div>
</div>

The document is structured like this (extracted with Adobe Acrobat):

<Document xml:lang="nl-NL">
<Article>
<Artikel>
<NormalParagraphStyle>Waarde in tekstvak 1</NormalParagraphStyle>
</Artikel>
<Artikel>
<NormalParagraphStyle>Waarde in tekstvak 2</NormalParagraphStyle>
</Artikel>
</Article>
</Document>

The code I used to convert the document to HTML:

using Aspose.Pdf;

// Load the source PDF and save it as a single HTML file (no per-page split).
Document pdfDoc = new Document(sourcePdfPath);
HtmlSaveOptions htmlSaveOptions = new HtmlSaveOptions();
htmlSaveOptions.SplitIntoPages = false;
pdfDoc.Save(targetHtmlPath, htmlSaveOptions);

Can I use the SDK to ensure that each phrase is converted to a single span element?

Kind regards,

Stefaan

@stefaan.vandewinkel

Thanks for contacting support.

Would you please share your sample PDF document with us? We will test the scenario in our environment and address it accordingly.

Test document.pdf (7.1 KB)

This document was generated by exporting an InDesign document to PDF (for print).

@stefaan.vandewinkel

Thanks for sharing the sample PDF document.

We tested the scenario in our environment using Aspose.PDF for .NET 19.1 and were able to reproduce the issue you described. We have logged it in our issue tracking system as PDFNET-46021. We will look into the details and keep you posted on the status of the fix. Please be patient and give us a little time.

We are sorry for the inconvenience.
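
Until PDFNET-46021 is resolved, one possible stopgap is to post-process the generated HTML yourself. The sketch below is not an Aspose.PDF feature: SpanMerger and MergeAdjacentSpans are hypothetical names, and it uses only the .NET regular expression classes. It assumes the spans contain nothing but text and sit directly next to each other, as in the output posted above. Only the first span's attributes are kept, so the classes of the later spans (stl_07, stl_11) and the word-spacing style are dropped, which may slightly change the rendered spacing.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class SpanMerger
{
    // A run of two or more directly adjacent <span> elements that contain only text.
    private static readonly Regex SpanRun = new Regex(
        @"(?:<span\b[^>]*>[^<]*</span>){2,}",
        RegexOptions.IgnoreCase);

    // A single text-only <span>, capturing its opening tag (1) and its text (2).
    private static readonly Regex SingleSpan = new Regex(
        @"(<span\b[^>]*>)([^<]*)</span>",
        RegexOptions.IgnoreCase);

    // Assumption: spans are never nested and are not separated by whitespace.
    public static string MergeAdjacentSpans(string html)
    {
        return SpanRun.Replace(html, run =>
        {
            MatchCollection spans = SingleSpan.Matches(run.Value);
            var text = new StringBuilder();
            foreach (Match span in spans)
            {
                text.Append(span.Groups[2].Value);
            }
            // Keep the first span's opening tag; attributes of later spans are dropped.
            return spans[0].Groups[1].Value + text.ToString() + "</span>";
        });
    }
}
```

It could be applied after pdfDoc.Save by reading targetHtmlPath with File.ReadAllText, passing the text through SpanMerger.MergeAdjacentSpans, and writing the result back with File.WriteAllText. Divs that already hold a single span, such as the evaluation watermark line, are left unchanged.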