Khmer line wrapping

Hi. I’m currently trying to render a PDF with Khmer in it. The text is rendering once I change to Khmer OS but, when using an HtmlFragment, the text isn’t wrapping unless a space is included, which you don’t get in Khmer. Any ideas?



I’m using v10.4 of Aspose PDF.NET



Attached is an example rendering.



The code where the HtmlFragment is generated is below:



string wrap = "

0 ? “padding-right:” + rightPadding.ToString() + "mm; " : “”) + “color:rgb(” + color.ToRgb().R + “,” + color.ToRgb().G + “,” + color.ToRgb().B + “); text-align:” + textAlign + “; line-height:” + lineHeight + “; font-size:{1}pt;font-weight:{3};">{0}
”;



HtmlFragment newText = new HtmlFragment(string.Format(wrap, text, fontSize.ToString(), fontName, (bold ? "bold" : "normal")));



newText.HorizontalAlignment = (textAlign == "right" ? HorizontalAlignment.Right : HorizontalAlignment.Left);



container.Add(newText);
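
One workaround that might be worth trying is to give the renderer explicit break opportunities by inserting zero-width spaces (U+200B) into the Khmer text before it goes into the HtmlFragment. The sketch below is only that: the class and method names are made up for illustration, nothing in this thread confirms that Aspose's HTML layout honours U+200B, and the breaks fall at syllable starts rather than true word boundaries, so a line may still break mid-word.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of Aspose.Pdf.
public static class KhmerBreakHints
{
    private const char ZeroWidthSpace = '\u200B';
    // Khmer sign coeng: turns the following consonant into a subscript form.
    private const char Coeng = '\u17D2';

    // True for Khmer consonants and independent vowels (U+1780..U+17B3),
    // i.e. the characters that can begin a syllable.
    private static bool IsKhmerBase(char c)
    {
        return c >= '\u1780' && c <= '\u17B3';
    }

    // Inserts U+200B before each Khmer base character, except at the start
    // of the string, directly after a coeng, or where a break already exists.
    public static string InsertBreakHints(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        var sb = new StringBuilder(text.Length * 2);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (i > 0 && IsKhmerBase(c))
            {
                char prev = text[i - 1];
                bool afterCoeng = prev == Coeng;
                bool afterBreak = char.IsWhiteSpace(prev) || prev == ZeroWidthSpace;
                if (!afterCoeng && !afterBreak) sb.Append(ZeroWidthSpace);
            }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

It would be applied to the text only, for example `string.Format(wrap, KhmerBreakHints.InsertBreakHints(text), ...)`, so the surrounding markup is left untouched.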



Thanks.

Hi Neil,


Thanks for contacting support.

Can you please share a sample project so that we can test the scenario in our environment? We are sorry for the inconvenience.

I’ve attached some Khmer text.



In terms of code, my project is very much part of a much larger project that I can’t upload. To reproduce it, add an HtmlFragment to a FloatingBox which is 50mm wide. The text for the HTML element should be the attached, surrounded by the following:



##ATTACHED FILE##





Thanks.

Hi Neil,


Thanks for sharing the resource file.

I tried to replicate the issue using the following code snippet but was unable to reproduce the problem. In my output, the Khmer content appears as small blocks. Can you please take a look at the following code snippet and help us reproduce the issue? We are sorry for the delay and inconvenience.

[C#]

// Load source PDF document

Aspose.Pdf.Document doc = new Aspose.Pdf.Document();

doc.Pages.Add();

System.IO.TextReader tr = new StreamReader("c:/pdftest/Khmer+Text.txt");

Aspose.Pdf.HtmlFragment html = new HtmlFragment(String.Format(tr.ReadToEnd(), 12, "Arial Unicode MS", "bold"));

Aspose.Pdf.FloatingBox floatbox = new Aspose.Pdf.FloatingBox(25, 25);

floatbox.Paragraphs.Add(html);

doc.Pages[1].Paragraphs.Add(floatbox);

// Save updated document containing table object

doc.Save("c:/pdftest/FloatingBox_with_table.pdf");

Hi,



Attached are the updated txt file I used + the resulting PDF. The updated code snippet is below (the font name is different):



// Load source PDF document

Aspose.Pdf.Document doc = new Aspose.Pdf.Document();

doc.Pages.Add();

System.IO.TextReader tr = new StreamReader(@"D:\Temp\Khmer Text.txt");

Aspose.Pdf.HtmlFragment html = new HtmlFragment(String.Format(tr.ReadToEnd(), 12, "Khmer UI", "bold"));

Aspose.Pdf.FloatingBox floatbox = new Aspose.Pdf.FloatingBox(80, 250);

floatbox.BackgroundColor = Aspose.Pdf.Color.Yellow;

floatbox.Paragraphs.Add(html);

doc.Pages[1].Paragraphs.Add(floatbox);

// Save updated document containing table object

doc.Save(@"D:\Temp\FloatingBox_with_table.pdf");



Thanks,

Also, I’ve updated the DLL to v10.7.0 to rule out this being a problem which has subsequently been resolved.

It also seems that certain characters aren’t rendering properly. I’ve updated Khmer Text.txt (attached) to include an example. You’ll see that the last line of Khmer has a dotted circle in it instead of appearing as a marker on the preceding character as it does in the text file (when viewed in Notepad on Windows 7).
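
A dotted circle is usually what a text shaper draws when it finds a combining mark without a base character it considers valid, so it may help to check exactly which code points are in the file and in what order. The following is a rough sketch for doing that with the standard .NET library only; the class and method names are made up for illustration, and it assumes the input contains no unpaired surrogates (`char.ConvertToUtf32` throws on those).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical diagnostic helper, not part of Aspose.Pdf.
public static class CodePointDump
{
    // Returns one entry per code point, e.g. "U+17D2 NonSpacingMark".
    public static List<string> Describe(string text)
    {
        var lines = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            int codePoint = char.ConvertToUtf32(text, i);
            UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(text, i);
            lines.Add(string.Format("U+{0:X4} {1}", codePoint, category));

            // Skip the low half of a surrogate pair; it was consumed above.
            if (char.IsHighSurrogate(text[i])) i++;
        }
        return lines;
    }
}
```

Running `CodePointDump.Describe` over the last Khmer line of the text file and printing the result would show whether the mark directly follows its base consonant in the source. If it does, the problem is more likely in how the PDF is shaped than in the text itself.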

n.docherty:
Also, I've updated the DLL to v10.7.0 to rule out this being a problem which has subsequently been resolved.
Hi Neil,

Thanks for sharing the feedback.

From your above statement, it appears that your issue related to line wrap for Khmer text is resolved after upgrading to version 10.7.0. Please correct me if you are still facing the same issue.

n.docherty:
It also seems that certain characters aren’t rendering properly. I’ve updated Khmer Text.txt (attached) to include an example. You’ll see that the last line of Khmer has a dotted circle in it instead of appearing as a marker on the preceding character as it does in the text file (when viewed in Notepad on Windows 7).
Hi Neil,

Thanks for sharing the details.

I have tested the scenario and I am able to reproduce the same problem. I have logged it in our issue tracking system as PDFNEWNET-39248. We will investigate this issue in detail and will keep you updated on the status of a correction.

We apologize for the inconvenience.

I can confirm that it is still an issue with v10.7.0. I was merely stating I had updated the DLL and that I’d confirmed the problem still existed with the latest version of the DLL.

Thanks.

n.docherty:
I can confirm that it is still an issue with v10.7.0. I was merely stating I had updated the DLL and that I’d confirmed the problem still existed with the latest version of the DLL.

Thanks.
Hi Neil,


Thanks for sharing the feedback.

At the beginning of this thread, you stated that Khmer text was not being wrapped. In the PDF file generated on my end, the text is wrapped onto subsequent lines, but the Khmer text is not rendered properly. For your reference, I have attached the resultant file generated on my end.

[C#]

Aspose.Pdf.Document doc = new Aspose.Pdf.Document();

doc.Pages.Add();

System.IO.TextReader tr = new StreamReader(@"c:\pdftest\Khmer+Text (2).txt");

Aspose.Pdf.HtmlFragment html = new HtmlFragment(String.Format(tr.ReadToEnd(), 12, "Khmer UI", "bold"));

Aspose.Pdf.FloatingBox floatbox = new Aspose.Pdf.FloatingBox(80, 250);

floatbox.BackgroundColor = Aspose.Pdf.Color.Yellow;

floatbox.Paragraphs.Add(html);

doc.Pages[1].Paragraphs.Add(floatbox);

// Save updated document containing table object

doc.Save(@"c:\pdftest\MyTest_FloatingBox_with_table.pdf");

Hi. Is there any update on this? I’m getting chased about getting Khmer PDFs working and some kind of timeframe would be useful.

Thanks.

Hi Neil,


Thanks for your patience.

The issue reported earlier is still pending for review as the team has been busy fixing other previously reported issues. However your concerns have been shared with product team and as soon as we have some definite updates regarding its resolution, we will let you know.

This is how the Khmer word for address is supposed to look
image.png (26.8 KB)
This is how it looks in the PDF
Khmer From PDF.png (3.1 KB)

Notice it is just plain wrong

Here is the PDF in question: temp7.pdf (47.2 KB)

4 years and still no fix is very disappointing. Burmese is similarly unreliable, as it is a similar script (i.e. multiple characters form a single visual element).

Conjunctive letters are awesome :)

@BenWillson

We have replied to you in the other topic you created. Kindly follow up in that thread.

@n.docherty

We are afraid PDFNET-39248 is still unresolved owing to other priority tickets in the queue. However, we have recorded your concerns and will try to schedule it soon. We will let you know once an update is available in this regard.