Free Support Forum - aspose.com

Save a Word file into HTML format: issue with duplicate pages

Hi


I use Aspose.Words 17.6 to save a Word file as HTML pages, one file per page, using the page-splitting classes from GitHub.
In the result, I found that one extra copy of the same page is generated.
Please check the files in the attachment.

P.S. here is my code:

import com.aspose.words.Document;
import com.aspose.words.HtmlSaveOptions;
import com.aspose.words.LayoutCollector;
import org.apache.commons.io.IOUtils;
import org.junit.Test;

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

// DocumentPageSplitter is the helper class taken from the examples on GitHub.

@Test
public void testForAspose() throws Exception {
    String fileName = "linyanjun_3.docx";
    Document doc = new Document("input/" + fileName);

    // HTML export options applied to every page.
    HtmlSaveOptions saveOp = new HtmlSaveOptions();
    saveOp.setExportImagesAsBase64(true);
    saveOp.setExportTextInputFormFieldAsText(false);
    saveOp.setExportTocPageNumbers(true);
    saveOp.setExportPageSetup(true);
    saveOp.setExportDocumentProperties(true);
    saveOp.setExportRelativeFontSize(false);
    saveOp.setUpdateFields(true);

    // Build the page layout so the splitter can map nodes to pages.
    LayoutCollector layoutCollector = new LayoutCollector(doc);
    doc.updatePageLayout();
    DocumentPageSplitter splitter = new DocumentPageSplitter(layoutCollector);

    String outputPath = "output/docx";
    String dirName = fileName;

    // mkdirs() also creates missing parent directories.
    File outputDir = new File(outputPath + "/" + dirName + "/");
    if (!outputDir.exists())
        outputDir.mkdirs();

    ByteArrayOutputStream output = new ByteArrayOutputStream();
    for (int page = 1; page <= doc.getPageCount(); page++) {
        System.out.println("page:" + page);
        Document pageDoc = splitter.getDocumentOfPage(page);

        // Render the page to memory, then write it to <page>.html.
        output.reset();
        pageDoc.save(output, saveOp);
        try (OutputStream file = new FileOutputStream(new File(outputDir, page + ".html"))) {
            IOUtils.write(output.toByteArray(), file);
        }
    }
}

Hi there,

Thanks for your inquiry.

Please note that Aspose.Words requires TrueType fonts when rendering documents to fixed-page formats (JPEG, PNG, PDF or XPS). You are using the PageSplitter utility, and your issue is related to page layout. Please install the fonts used by the document on the machine where you run this utility. For details, please refer to the following article:

How Aspose.Words Uses True Type Fonts

If the problem persists, please share the following fonts here for testing. We will investigate the issue and provide you with more information.

  • DFKai-SB
  • 新細明體
  • 標楷體
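
As a rough pre-check before sharing the files, the sketch below reports which of the fonts listed above the JVM cannot see. It is only an approximation: it queries the standard AWT font list, not the font search that Aspose.Words itself performs. The class name `FontCheck` and the method `missingFonts` are hypothetical, and the two Chinese font names are written as Unicode escapes so the source stays ASCII.

```java
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

class FontCheck {
    /** Returns the required names that do not appear in the installed list (case-insensitive). */
    static List<String> missingFonts(List<String> required, List<String> installed) {
        Set<String> known = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        known.addAll(installed);
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!known.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // The three fonts from the list above; the last two are Unicode-escaped.
        List<String> required = Arrays.asList(
                "DFKai-SB",
                "\u65B0\u7D30\u660E\u9AD4",
                "\u6A19\u6977\u9AD4");

        // Family names are localized, so collect both the English and the
        // Traditional Chinese spellings that the JVM reports.
        GraphicsEnvironment ge = GraphicsEnvironment.getLocalGraphicsEnvironment();
        List<String> installed = new ArrayList<>();
        installed.addAll(Arrays.asList(ge.getAvailableFontFamilyNames(Locale.ENGLISH)));
        installed.addAll(Arrays.asList(ge.getAvailableFontFamilyNames(Locale.TRADITIONAL_CHINESE)));

        System.out.println("Missing fonts: " + missingFonts(required, installed));
    }
}
```

A font reported as missing here may still be installed under a different family name, so an empty result is the more useful signal.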

Hi

These fonts are pre-installed on my Windows machine, and the problem still exists.
Please check the font files:

fonts.001.zip (3 MB)
fonts.002.zip (3 MB)
fonts.003.zip (3 MB)
fonts.004.zip (3 MB)
fonts.005.zip (3 MB)
fonts.006.zip (657.9 KB)

To open them, please rename each zip file to the form “*.zip.001”.

@ChengHuang,

Thanks for sharing the fonts. Unfortunately, we are unable to extract the shared zip files. Could you please re-attach the correct ZIP files? Thanks for your cooperation.

Hi
Please check these files again.

(rename it to fonts.zip.001)
fonts.001.zip (3 MB)

(rename it to fonts.zip.002)
fonts.002.zip (3 MB)

(rename it to fonts.zip.003)
fonts.003.zip (3 MB)

(rename it to fonts.zip.004)
fonts.004.zip (3 MB)

(rename it to fonts.zip.005)
fonts.005.zip (3 MB)

(rename it to fonts.zip.006)
fonts.006.zip (657.9 KB)

Then use 7-Zip or a similar tool to open fonts.zip.001.
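
The renaming described above can also be scripted. The following is a minimal sketch, assuming the six downloaded files sit in one directory and keep the uploaded names (fonts.001.zip to fonts.006.zip); the class name `SplitArchiveNames` and the method `toVolumeName` are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitArchiveNames {
    // Matches "<base>.<three digits>.zip", e.g. "fonts.001.zip".
    private static final Pattern UPLOADED = Pattern.compile("^(.+)\\.(\\d{3})\\.zip$");

    /** Maps "fonts.001.zip" to "fonts.zip.001"; other names are returned unchanged. */
    static String toVolumeName(String uploadedName) {
        Matcher m = UPLOADED.matcher(uploadedName);
        return m.matches() ? m.group(1) + ".zip." + m.group(2) : uploadedName;
    }

    public static void main(String[] args) throws IOException {
        // Directory holding the downloads; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");

        // Collect the candidates first so renaming does not disturb the listing.
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "fonts.*.zip")) {
            for (Path file : files) {
                parts.add(file);
            }
        }

        for (Path part : parts) {
            String target = toVolumeName(part.getFileName().toString());
            Files.move(part, part.resolveSibling(target));
            System.out.println(part.getFileName() + " -> " + target);
        }
    }
}
```

After running it, fonts.zip.001 can be opened as described, with the other parts alongside it in the same directory.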

@ChengHuang,

Thanks for sharing the details. We have tested the scenario and reproduced the issue on our side. We have logged this problem in our issue tracking system as WORDSNET-15707. You will be notified via this forum thread once the issue is resolved.

We apologize for the inconvenience.

@ChengHuang,

The issues you found earlier (filed as WORDSNET-15707) have been fixed in the Aspose.Words for .NET 17.9 update and the Aspose.Words for Java 17.9 update.