HtmlFragment throws ArrayIndexOutOfBoundsException on a specific Unicode character sequence

Aspose.PDF throws an error on some of the HTML templates I am trying to convert to PDF. After digging into the problem, I noticed that it is triggered by a certain sequence of Unicode characters placed consecutively (even when they sit in different tags). Below is a very simple example that reproduces the issue, even on the latest Aspose.PDF version:

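import com.aspose.pdf.Document;
import com.aspose.pdf.HtmlFragment;

var doc = new Document();
var page = doc.getPages().add();
// U+2009 (thin space), then U+202A ... U+202C (bidi embedding controls) wrapped around "a"
page.getParagraphs().add(new HtmlFragment("<span>\u2009\u202Aa\u202C</span>"));

// The exception is thrown here, while the paragraphs are processed
doc.save("test.pdf");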

Exception:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 4
	at com.aspose.pdf.internal.l42h.le.lf(Unknown Source)
	at com.aspose.pdf.internal.l42h.le.lI(Unknown Source)
	at com.aspose.pdf.internal.l42h.le.lI(Unknown Source)
	at com.aspose.pdf.internal.l42h.le.lI(Unknown Source)
	at com.aspose.pdf.internal.l42h.le.lI(Unknown Source)
	at com.aspose.pdf.internal.l42h.le.lI(Unknown Source)
	at com.aspose.pdf.internal.l42h.lt.lI(Unknown Source)
	at com.aspose.pdf.internal.l42h.lt.lj(Unknown Source)
	at com.aspose.pdf.internal.l42h.lj.lI(Unknown Source)
	at com.aspose.pdf.internal.l42h.lj.lI(Unknown Source)
	at com.aspose.pdf.internal.l46u.l4if.l2n(Unknown Source)
	at com.aspose.pdf.internal.l42u.lb.lf(Unknown Source)
	at com.aspose.pdf.internal.l44if.lk.l0p(Unknown Source)
	at com.aspose.pdf.internal.l43u.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43u.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43u.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l42v.lj.lI(Unknown Source)
	at com.aspose.pdf.internal.l42p.lI.lI(Unknown Source)
	at com.aspose.pdf.internal.l51l.lI.lI(Unknown Source)
	at com.aspose.pdf.internal.l43l.lt.lI(Unknown Source)
	at com.aspose.pdf.internal.l43l.lf.lj(Unknown Source)
	at com.aspose.pdf.internal.html.collections.lj.lj(Unknown Source)
	at com.aspose.pdf.internal.html.collections.lj.hasNext(Unknown Source)
	at com.aspose.pdf.internal.l51l.lI.lI(Unknown Source)
	at com.aspose.pdf.internal.html.rendering.HtmlRenderer.render(Unknown Source)
	at com.aspose.pdf.internal.html.rendering.HtmlRenderer.render(Unknown Source)
	at com.aspose.pdf.internal.html.rendering.Renderer.render(Unknown Source)
	at com.aspose.pdf.internal.html.rendering.Renderer.render(Unknown Source)
	at com.aspose.pdf.l7f.lI(Unknown Source)
	at com.aspose.pdf.HtmlFragment.lI(Unknown Source)
	at com.aspose.pdf.FormattedFragment.lI(Unknown Source)
	at com.aspose.pdf.l13v.lI(Unknown Source)
	at com.aspose.pdf.l13v.le(Unknown Source)
	at com.aspose.pdf.Page.lf(Unknown Source)
	at com.aspose.pdf.Page.lc(Unknown Source)
	at com.aspose.pdf.ADocument.processParagraphs(Unknown Source)
	at com.aspose.pdf.Document.processParagraphs(Unknown Source)
	at com.aspose.pdf.ADocument.lf(Unknown Source)
	at com.aspose.pdf.ADocument.lf(Unknown Source)
	at com.aspose.pdf.ADocument.save(Unknown Source)
	at com.aspose.pdf.Document.save(Unknown Source)

Aspose.PDF version: 24.1

If I add a space after “\u2009”, it works. If I remove the “a” after “\u202A”, it also works.
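In case it helps anyone hitting the same thing, here is a rough sketch of how the first observation could be turned into a temporary workaround: pre-process the HTML string and insert a regular space after every "\u2009" that is followed by "\u202A", optionally with tags in between. The class and method names (ThinSpaceWorkaround, padThinSpace) are made up for illustration, the tag-skipping regex is my own assumption, and I have only confirmed the space trick on the minimal example above, so treat it as a stopgap rather than a fix.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.PDF: it only rewrites the HTML string
// before it is handed to HtmlFragment.
public final class ThinSpaceWorkaround {

    // Matches U+2009 (thin space) when it is followed by U+202A
    // (left-to-right embedding), allowing any number of tags in between.
    private static final Pattern THIN_SPACE_BEFORE_LRE =
            Pattern.compile("\u2009(?=(?:<[^>]*>)*\u202A)");

    // Inserts a regular space right after each matching thin space.
    static String padThinSpace(String html) {
        return THIN_SPACE_BEFORE_LRE.matcher(html).replaceAll("\u2009 ");
    }

    public static void main(String[] args) {
        String patched = padThinSpace("<span>\u2009\u202Aa\u202C</span>");
        // Print the code points so the invisible characters can be inspected.
        patched.codePoints().forEach(cp -> System.out.printf("U+%04X ", cp));
        System.out.println();
    }
}
```

The patched string would then be passed to new HtmlFragment(...) instead of the original. Note that the extra space changes the rendered output slightly, so this may not be acceptable for every template.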

Can you please advise what’s wrong?

@samuelmartinucci

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFJAVA-43560

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.