PDF to HTML - Text selection and SVG

Hi Aspose team,

I’ve been evaluating a number of different PDF to HTML tools. Aspose.PDF has been working pretty well except for one major issue.

Some parts of the text in the HTML output are not selectable or highlightable. It appears that some of the text gets embedded into the SVG instead of being emitted as HTML text nodes. Is there any way to control this? I’ve been looking at BuildVu and they seem to handle this better. Here are some example screenshots.

aspose.png

apose-svg.png

This screenshot shows the SVG generated by BuildVu, so I know it’s at least possible to not embed this text in the SVG.

buildvu.png
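
For anyone who wants to check their own output for the same symptom, here is a rough sketch that lists the text found inside SVG `<text>` elements. It uses only the Python standard library. The function name `svg_text_runs` and the sample markup are made up for illustration, and it assumes the converter writes the text as `<text>`/`<tspan>` elements; if the text has been converted to glyph outlines (paths), this will find nothing.

```python
import xml.etree.ElementTree as ET

# Standard SVG namespace as it appears in ElementTree tag names.
SVG_NS = "{http://www.w3.org/2000/svg}"


def svg_text_runs(svg_source: str) -> list[str]:
    """Return the non-empty text content of every <text> element in an SVG string.

    Hypothetical helper for illustration; not part of any converter's API.
    """
    root = ET.fromstring(svg_source)
    runs = []
    for elem in root.iter():
        # Match <text> with or without the SVG namespace.
        if elem.tag in (SVG_NS + "text", "text"):
            # itertext() also collects text from nested <tspan> children.
            content = "".join(elem.itertext()).strip()
            if content:
                runs.append(content)
    return runs


# Made-up sample standing in for a converter's SVG output.
sample = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<path d="M0 0L10 10"/>'
    '<text x="5" y="5"><tspan>Hello</tspan> <tspan>world</tspan></text>'
    '</svg>'
)

print(svg_text_runs(sample))
```

Any strings this reports are text that lives in the SVG layer rather than in HTML text nodes, which is the behavior described above.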

Thanks!

@walterbyersmackin,

Thanks for contacting support.

Can you please share the source file along with the generated result and sample code so that we may help you out?

3016752.9781575653846-36.pdf (355.2 KB)

Sure thing. Here’s a single page. I have other examples. It appears most of the problems we have with Aspose are due to text being embedded into the SVG.

Here’s another page, this one showing a problem with vertical text selection.

3314100.ckla_gk_u9_tg-9.pdf (88.9 KB)

@walterbyersmackin,

I have tested the source file you shared using Aspose.PDF and was unable to reproduce the issue. I have attached my generated result for your reference. Can you please share your source code along with the generated result so that we can investigate further?

36.zip (2.6 KB)

@Adnan.Ahmad Can you share all the files for your test conversion? I only see the html file in the zip.

I’ve attached the program I’m using and the output it’s generating. I’m using the version of Aspose.PDF that’s on NuGet.

3016752.9781575653846-36.zip (202.9 KB)
Program.zip (534 Bytes)