Word to EPUB with Persian and Arabic Text

Hi
I was evaluating Aspose.Words and Aspose.Pdf for converting a Word file with Persian and Arabic text to a PDF and an EPUB file.

Converting to PDF is OK, but converting to EPUB did not work: all non-English characters were wrong and mixed up.

First I tried converting directly from Word to EPUB, which did not work.
Then I tried converting Word to PDF and PDF to EPUB, which did not work either.

Here is the simple code that I tested:

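// Step 1: convert the Word document to PDF with Aspose.Words.
Aspose.Words.Document wdoc = new Aspose.Words.Document(path);
wdoc.Save("1.pdf");

// Step 2: convert the PDF to EPUB with Aspose.Pdf (EpubSaveOptions is the Aspose.Pdf class).
EpubSaveOptions options = new EpubSaveOptions();
options.ContentRecognitionMode = EpubSaveOptions.RecognitionMode.Fixed;
Aspose.Pdf.Document pdf = new Aspose.Pdf.Document("1.pdf");
pdf.Save("1.epub", options);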

Hi there,

Thanks for your inquiry. We have created a sample Word document that contains Persian and Arabic text, and the text does not display correctly in the output EPUB. Please check the attached image for details. We have attached the input and output documents to this post for your reference.

We have logged this problem in our issue tracking system as WORDSNET-13376. You will be notified via this forum thread once this issue is resolved. We apologize for the inconvenience.

If you are facing a different issue, please share your input Word document here for testing purposes. We will investigate the issue on our side and provide you with more information.

Hi there,

Please use the HtmlSaveOptions.ExportFontResources property, which specifies whether font resources should be exported to HTML, MHTML or EPUB. Set this property to true to get the required output. We are closing WORDSNET-13376 as ‘Not a bug’.

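Document doc = new Document(MyDir + "in.docx");
HtmlSaveOptions so = new HtmlSaveOptions();
// Export the fonts used by the document along with the output.
so.ExportFontResources = true;
// Write EPUB instead of the default HTML output.
so.SaveFormat = SaveFormat.Epub;
doc.Save(MyDir + "Out.epub", so);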

Now it is showing the fonts,
but two things are still displayed wrongly:
all non-English characters are separated from each other,
and RTL lines are shown as LTR lines.
Those are the problems I found.

Hi there,

Thanks for your inquiry. To ensure a timely and accurate response, please attach the following resources here for testing:

  • Your input Word document
  • Your output EPUB file that shows the undesired behavior

As soon as you have these resources ready, we’ll start investigating your issue and provide you with more information. Thanks for your cooperation.

PS: To attach these resources, please zip them and click the ‘Reply’ button, which will bring you to the reply page. At the bottom of that page you can include attachments with your post by clicking the ‘Add/Update’ button.

OK.

I prepared a simple Word file with English and non-English words.

However, you could easily create one yourself.

It contains 5 lines with different formats and styles: RTL, LTR, bold and so on.

After converting to EPUB, as you can see, all non-English words are split into separate characters,

all lines are LTR,

and the fonts are not the same as in word.docx.

Here is the code:

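Document wdoc = new Document(file);
HtmlSaveOptions so = new HtmlSaveOptions();
// Export the fonts used by the document along with the output.
so.ExportFontResources = true;
// Write EPUB instead of the default HTML output.
so.SaveFormat = SaveFormat.Epub;
wdoc.Save("word converted.epub", so);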

Hi there,

Thanks for sharing the document. We will inform you via this forum thread once this issue is resolved. We apologize for the inconvenience.

Hi there,

We have noticed that FBReader and Sony Reader for PC show the generated EPUB correctly. However, Adobe Digital Editions (ADE) 2 and 3 can’t render some characters and have some problems with RTL text. This seems to be an issue with ADE.

Could you please share which EPUB readers you are using at your end?
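
To help tell a reader problem from a file problem, it can be useful to look inside the generated EPUB, which is an ordinary ZIP archive. The following is only a rough sketch that uses the standard System.IO.Compression classes, not any Aspose API. The EpubInspector class and its Inspect method are hypothetical names made up for this example, and the "rtl" search is a crude text match, not a real check of the markup.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper (not part of Aspose): lists the font files in an EPUB
// and reports whether each HTML/CSS file mentions "rtl" anywhere.
public static class EpubInspector
{
    public static List<string> Inspect(string epubPath)
    {
        var report = new List<string>();
        using (ZipArchive archive = ZipFile.OpenRead(epubPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                string ext = Path.GetExtension(entry.FullName).ToLowerInvariant();

                // Font files that were packaged into the EPUB.
                if (ext == ".ttf" || ext == ".otf" || ext == ".woff")
                {
                    report.Add("font: " + entry.FullName);
                }

                // Crude test: does the markup or stylesheet contain "rtl" at all?
                if (ext == ".html" || ext == ".xhtml" || ext == ".css")
                {
                    using (var reader = new StreamReader(entry.Open()))
                    {
                        string text = reader.ReadToEnd();
                        bool mentionsRtl =
                            text.IndexOf("rtl", StringComparison.OrdinalIgnoreCase) >= 0;
                        report.Add(entry.FullName + ": mentions rtl = " + mentionsRtl);
                    }
                }
            }
        }
        return report;
    }

    public static void Main(string[] args)
    {
        // Assumes the file name used earlier in this thread when no argument is given.
        string path = args.Length > 0 ? args[0] : "word converted.epub";
        foreach (string line in Inspect(path))
        {
            Console.WriteLine(line);
        }
    }
}
```

If the fonts are listed and the content files do mention RTL direction, the file itself probably carries the right information and the difference is more likely in how a particular reader renders it.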

Thank you.
I've tested with ADE.
So do you have any plans for ADE?

Hi there,

Thanks for your feedback. This issue seems to be related to ADE. However, we will update you via this forum thread as soon as any update on this issue is available. Thanks for your patience.