Hello,
Please try converting this sample CHM to other formats, starting with PDF: chm.zip (207.4 KB)
The saved PDF’s title is taken from the last HTML page inside the CHM file. I think it would be much better to take the PDF’s title from the CHM’s own title:
The above applies to all other formats that can have Titles, like Html itself.
See the top bar in the chm:
When converted to other formats, the bar is not rendered correctly: it is defaced and also wraps onto 2 lines!
I’ve run into the black background for transparent images in the conversion again; I will post an update.
As you can see, the style of the 1st page inside the CHM is different from the other pages, but the following pages’ style is applied to the 1st page too. This is not serious; if you think it is time-consuming or might break other parts, simply disregard it.
OK, this one is surely a bug: load a CHM and get the title (I have no idea whether BuiltInDocumentProperties.Title is the only correct way to get the document title; please confirm whether there are other properties too):
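' SourceFile is the path to the .chm file; LoadOptions is an Aspose.Words load options instance.
Dim MyDocument As Words.Document = New Words.Document(SourceFile, LoadOptions)
' Expected: the CHM's own title. Actual: the title of the last HTML page inside the CHM.
MsgBox(MyDocument.BuiltInDocumentProperties.Title)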
This returns the title of the last HTML page inside the CHM file, which is wrong.
It must return the actual title of the CHM help itself, as shown above: title.png (64.8 KB)
@australian.dev.nerds
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): WORDSNET-25800,WORDSNET-25801
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
@australian.dev.nerds
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): WORDSNET-25802
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
I have opened ticket WORDSNET-25800 for the issue with the title.
I have opened ticket WORDSNET-25801 for the bar text outline effect issue.
I could not reproduce this issue on CHM->HTML->RTF conversion. Could you please share the code to reproduce the issue?
I have opened ticket WORDSNET-25801 for the first page style issue.
I cannot reproduce the RTF size issue. On my side, both the CHM->RTF and CHM->HTML->RTF outputs are 16 MB. Could you please share the code to reproduce the issue?
Still, I am wondering why it produces a Word file larger than 512 MB that even my high-end rig won’t open. That seems to be a Word format limit; should Aspose.Words exceed that limit?
@australian.dev.nerds There is no file size limit for MS Word documents. The only limit is the resources available on your machine, and how MS Word handles large documents can differ from machine to machine. There are recommendations not to use huge documents. A normal MS Word document is about 100-200 pages. Larger documents might cause issues in MS Word, again depending on the resources available on the machine.
Thanks, but this case has less than 10 pages!
Anyway, please confirm the CHM->HTML->RTF conversion issue by running the sample. As a side note, if you find the root cause of the strangely large size, please let me know.
@australian.dev.nerds The issue is caused by the images in your document. RTF is not a compact document format, and images stored in RTF take up a large amount of space. In your HTML save options you specify a high resolution for images, which increases the RTF document size.
@australian.dev.nerds The problem occurs because RtfSaveOptions.SaveImagesAsWmf is set to true in your code. If you set it to false, the images look correct.
Thanks. If I permanently set RtfSaveOptions.SaveImagesAsWmf to false, it won’t be compatible with WordPad, right?
About the large size issue: the ImageResolution property alone was not the root cause; most of the size came from RtfSaveOptions.ExportImagesForOldReaders (160 MB vs 13 MB).
Do you recommend permanently setting ExportImagesForOldReaders to False? The default is True.
@australian.dev.nerds The RtfSaveOptions.SaveImagesAsWmf option might help to avoid WordPad warning messages. But even without this option, the RTF should be compatible with WordPad.
When RtfSaveOptions.ExportImagesForOldReaders is set, each image is written into the RTF document twice, which is why the size increases dramatically. Whether to set this option depends on your needs. If you do not need to support old/simplified RTF readers, you can disable this option.
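For reference, the two options discussed above could be set roughly as follows. This is only a minimal sketch: the file names are hypothetical placeholders, and whether to disable either option depends on which RTF readers you need to support.

```vb
Imports Aspose.Words
Imports Aspose.Words.Saving

Module RtfOptionsSketch
    Sub Main()
        ' Hypothetical input path; replace with your own CHM file.
        Dim doc As New Document("sample.chm")

        Dim options As New RtfSaveOptions()
        ' Write each image only once instead of adding a second copy
        ' for old/simplified RTF readers (the default is True).
        options.ExportImagesForOldReaders = False
        ' Do not convert images to WMF on save.
        options.SaveImagesAsWmf = False

        ' Hypothetical output path.
        doc.Save("sample.rtf", options)
    End Sub
End Module
```

If WordPad shows warnings with the resulting file, SaveImagesAsWmf can be switched back to True independently of ExportImagesForOldReaders.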
@australian.dev.nerds The problem with the bar text occurs because the IE-specific CSS property filter: progid:DXImageTransform.Microsoft.Glow(color= 'Blue' , Strength= '2') is used in the CHM. This feature is now deprecated, and our development team is not going to work on supporting it in the near future. The ticket WORDSNET-25801 will be closed as “Won’t Fix”.
oops, ancient -ms-filter DXImageTransform.Microsoft.Glow
Yep, agreed, it’s not wise to support it; I thought it was HTML5.
OK, just one thing remains from the CHM conversion issues: the TITLE
The saved PDF’s title is taken from the last HTML page inside the CHM file. I think it would be much better to take the PDF’s title from the CHM’s own title:
The above applies to all other formats that can have Titles, like Html itself.