Empty checkboxes in DOCM files are ticked after processing the files through Aspose.Words 22.4

Empty checkboxes in DOCM files are ticked after processing the files through Aspose.Words 22.4.
What we know so far:

  • The bug was introduced in Aspose.Words 15.1.
  • Builds prior to Aspose.Words 15.1 (e.g. Aspose.Words 14.12) do not automatically check those checkboxes.
  • Very old builds (e.g. Aspose.Words 10.0 or 11.0) are not able to load this file at all and report it as corrupt.

We created a console application that reproduces the issue (https://files.axiomint.com/external/fa374ce289ca51a1017a26783b76f8891d7cea9e91039de6046f9e91c3df46a6):

  • The test file is in the “DOCM File” folder.
  • After running the application, it will produce “BEFORE” and “AFTER” files in the “bin” folder. The “BEFORE” file does not have these checkboxes ticked. The “AFTER” file does.
  • The project runs with Aspose.Words 22.5 (the latest build). I’ve also included the Aspose.Words 15.1 and Aspose.Words 14.12 DLLs in the “Aspose DLL” folder. You can rename either of those to “Aspose.Words.DLL” and see that the issue is reproducible with Aspose.Words 15.1 but not with Aspose.Words 14.12, meaning that the bug was introduced in the Aspose.Words 15.1 release.
  • DLLs included in the project are trial versions of the DLLs. However, we also reproduced the issue with the licensed version of Aspose.Words 22.4 DLL.

@vikram.venugopal This is not a bug. By default, Aspose.Words updates structured document tags upon saving the document. If you inspect your input document, you will see the following representation of the problematic checkbox:

<w:sdt>
	<w:sdtPr>
		<w:rPr>
			<w:rFonts w:ascii="Arial" w:hAnsi="Arial"/>
			<w:sz w:val="24"/>
			<w:szCs w:val="24"/>
		</w:rPr>
		<w:id w:val="-1708708501"/>
		<w14:checkbox>
			<w14:checked w14:val="1"/>
			<w14:checkedState w14:val="2612" w14:font="MS Gothic"/>
			<w14:uncheckedState w14:val="2610" w14:font="MS Gothic"/>
		</w14:checkbox>
	</w:sdtPr>
	<w:sdtEndPr/>
	<w:sdtContent>
		<w:r w:rsidR="005225D2">
			<w:rPr>
				<w:rFonts w:ascii="MS Gothic" w:eastAsia="MS Gothic" w:hAnsi="MS Gothic" w:hint="eastAsia"/>
				<w:sz w:val="24"/>
				<w:szCs w:val="24"/>
			</w:rPr>
			<w:t>☐</w:t>
		</w:r>
	</w:sdtContent>
</w:sdt>

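As you can see, it is marked as checked (<w14:checked w14:val="1"/>), but its displayed content is an unchecked box (<w:t>☐</w:t>). When the tag is updated on save, the displayed content is brought in line with the checked state, so the box appears ticked in the output.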
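To find out which checkboxes in a file carry this kind of mismatch, the main document part can be scanned directly, without Aspose.Words. The following is only a minimal sketch in Python's standard library. The function names are made up for illustration, and the fallback glyphs (2612 for checked, 2610 for unchecked) and the treatment of a missing w14:checked element as unchecked are assumptions, not something confirmed in this thread.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespaces used by the checkbox markup shown above.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W14 = "http://schemas.microsoft.com/office/word/2010/wordml"


def load_document_xml(path):
    """Read the main document part from a DOCX/DOCM package (a ZIP archive)."""
    with zipfile.ZipFile(path) as package:
        return package.read("word/document.xml")


def _glyph(box, tag, default_hex):
    """Return the character for w14:checkedState / w14:uncheckedState.

    The w14:val attribute holds a hexadecimal code point. The default is an
    assumption used only when the element or attribute is absent.
    """
    state = box.find(f"{{{W14}}}{tag}")
    code = default_hex if state is None else state.get(f"{{{W14}}}val", default_hex)
    return chr(int(code, 16))


def find_inconsistent_checkboxes(document_xml):
    """List (id, checked_flag, shown_text) for checkboxes whose text disagrees with the flag."""
    mismatches = []
    root = ET.fromstring(document_xml)
    for sdt in root.iter(f"{{{W}}}sdt"):
        props = sdt.find(f"{{{W}}}sdtPr")
        box = props.find(f"{{{W14}}}checkbox") if props is not None else None
        content = sdt.find(f"{{{W}}}sdtContent")
        if box is None or content is None:
            continue  # not a checkbox content control

        # Assumption: a missing w14:checked element means "unchecked".
        checked = box.find(f"{{{W14}}}checked")
        flag = checked is not None and checked.get(f"{{{W14}}}val") in ("1", "true")

        if flag:
            expected = _glyph(box, "checkedState", "2612")
        else:
            expected = _glyph(box, "uncheckedState", "2610")

        # Text currently stored in the control, i.e. what is displayed.
        shown = "".join(t.text or "" for t in content.iter(f"{{{W}}}t"))
        if shown != expected:
            id_element = props.find(f"{{{W}}}id")
            sdt_id = id_element.get(f"{{{W}}}val") if id_element is not None else None
            mismatches.append((sdt_id, flag, shown))
    return mismatches
```

Calling find_inconsistent_checkboxes(load_document_xml(path)) on the input file should list the checkbox shown above, since its flag says checked while its text is the unchecked glyph. Checkboxes stored in headers, footers or other parts are not covered by this sketch.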

You can disable updating of structured document tag content using the SaveOptions.UpdateSdtContent property. See the following code:

Document doc = new Document(@"C:\Temp\in.docm");
OoxmlSaveOptions opt = new OoxmlSaveOptions();
opt.SaveFormat = SaveFormat.Docm;
// Keep the existing content of structured document tags as stored in the file.
opt.UpdateSdtContent = false;
doc.Save(@"C:\Temp\out.docm", opt);

Thanks for the great explanation, Alexey.
Now we can see that the file itself contains ambiguous data.
Where can we find more information about the behavior and ramifications of the SaveOptions.UpdateSdtContent setting? We use Aspose to update some text and hyperlinks inside the files. Can disabling this setting result in certain text and hyperlinks not being updated? Can structured document tags contain any of those?
Or, in short, what are the possible side effects of setting UpdateSdtContent to false?

@vahem @vikram.venugopal SaveOptions.UpdateSdtContent affects only structured document tags, so disabling this option will not affect other objects in the document.
Hyperlinks in an MS Word document are represented by fields, and SaveOptions.UpdateSdtContent does not affect fields. So you can safely use it in your code.

A post was split to a new topic: Disable updating content controls in office documents (Word, Excel, PowerPoint etc.)