ArialSpan.pdf (867.0 KB)
ArialSpan_Issue.pdf (585.6 KB)
I am converting these two PDF pages to HTML. When I convert the first PDF (ArialSpan.pdf), I get the correct result, which has only one span for the one line of text:
<div class="stl_ stl_02">
<div class="stl_03">
<img src="_files/img_01.png" alt="" class="stl_04" />
</div>
<div class="stl_view">
<div class="stl_05 stl_06">
<div class="stl_01 stl_07" style="left:29.6462em;top:83.5988em;"><span class="stl_08 stl_09 stl_10" style="word-spacing:-0em;">The tin tips. </span></div>
</div>
</div>
</div>
When I convert ArialSpan_Issue.pdf, the single line is broken into multiple spans, one per character:
<div class="stl_ stl_02">
<div class="stl_03">
<img src="_files/img_01.png" alt="" class="stl_04" />
</div>
<div class="stl_view">
<div class="stl_05 stl_06">
<div class="stl_01 stl_07" style="left:39.2406em;top:82.8005em;"><span class="stl_08 stl_09 stl_10">A</span></div>
<div class="stl_01 stl_07" style="left:45.5379em;top:82.9104em;"><span class="stl_08 stl_09 stl_10">t</span></div>
<div class="stl_01 stl_07" style="left:47.3896em;top:82.9427em;"><span class="stl_08 stl_09 stl_10">i</span></div>
<div class="stl_01 stl_07" style="left:48.8704em;top:82.9686em;"><span class="stl_08 stl_09 stl_10">n</span></div>
<div class="stl_01 stl_07" style="left:52.5771em;top:83.0333em;"><span class="stl_08 stl_09 stl_10">.</span></div>
</div>
</div>
</div>
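One detail in the second output that may be relevant: the top offsets grow slightly from one character to the next, so the characters do not share a baseline. Below is a minimal sketch that estimates the tilt of the line from those offsets. The values are copied by hand from the markup above, and the class and method names (BaselineSlope, AngleDegrees) are made up for illustration. Whether this tilt is what makes the converter emit one span per character is only an assumption on my side.

```csharp
using System;

public static class BaselineSlope
{
    // (left, top) offsets in em, copied by hand from the ArialSpan_Issue.pdf output.
    public static readonly (double Left, double Top)[] Glyphs =
    {
        (39.2406, 82.8005), // A
        (45.5379, 82.9104), // t
        (47.3896, 82.9427), // i
        (48.8704, 82.9686), // n
        (52.5771, 83.0333), // .
    };

    // Angle of the straight line through the first and last glyph origins, in degrees.
    // Both offsets are in em, so the ratio is unitless. In HTML, "top" grows downwards,
    // so a positive angle means the text slopes down towards the right.
    public static double AngleDegrees((double Left, double Top)[] glyphs)
    {
        var first = glyphs[0];
        var last = glyphs[glyphs.Length - 1];
        return Math.Atan2(last.Top - first.Top, last.Left - first.Left) * 180.0 / Math.PI;
    }

    public static void Main()
    {
        Console.WriteLine($"Baseline angle: {AngleDegrees(Glyphs):F2} degrees");
    }
}
```

For the five characters above this comes out at roughly 1 degree, while the first PDF produces a single div with a single top value for the whole line.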
Can you please advise why the behavior is not consistent?
Technology: .NET
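HtmlSaveOptions htmlOptions = new HtmlSaveOptions();
// Write each PDF page to its own HTML page, with absolutely positioned content.
htmlOptions.SplitIntoPages = true;
htmlOptions.FixedLayout = true;
// Save fonts as TTF and merge raster images into a PNG page background.
htmlOptions.FontSavingMode = HtmlSaveOptions.FontSavingModes.AlwaysSaveAsTTF;
htmlOptions.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
// Emit only the body content, without the surrounding html/head markup.
htmlOptions.HtmlMarkupGenerationMode = HtmlSaveOptions.HtmlMarkupGenerationModes.WriteOnlyBodyContent;
// Keep transparent text in the output.
htmlOptions.SaveTransparentTexts = true;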