Hi, I am facing an issue when converting documents from HTML to Word (DOCX) with the Aspose.Words for .NET NuGet package (version 19.11.0). The following error is being logged for hundreds of documents:
Message: The document appears to be corrupted and cannot be loaded.,
InnerException: Specified argument was out of the range of valid values.
Parameter name: distanceFromText,
StackTrace: at Aspose.Words.Document.(Stream , LoadOptions )
at Aspose.Words.Document.(Stream , LoadOptions )
I am using the following code to convert documents:
var htmlLoadOptions = new HtmlLoadOptions();
htmlLoadOptions.PreferredControlType = HtmlControlType.StructuredDocumentTag;
// Set the encoding
htmlLoadOptions.Encoding = Encoding.UTF8;
// Create an instance of a class implementing IWarningCallback, which collects any warnings produced during document load.
var callback = new HandleDocumentWarnings();
// Assign the callback to the load options. In this case we are loading HTML,
// so the callback is set on the HtmlLoadOptions instance created above.
htmlLoadOptions.WarningCallback = callback;
// Load the HTML document into memory
var document = new Document(info.File, htmlLoadOptions);
// Prevent every table row from breaking across pages
foreach (Table table in document.GetChildNodes(NodeType.Table, true))
{
    foreach (Row row in table.Rows)
    {
        row.RowFormat.AllowBreakAcrossPages = false;
    }
}
info.File.Close();
// Convert the document to a different format and save to stream
var streamResult = new MemoryStream();
var options = SaveOptions.CreateSaveOptions(SaveFormat.Docx);
options.TempFolder = WorkingDirectory;
document.Save(streamResult, options);
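The HandleDocumentWarnings class used above is not included in this post. A minimal sketch of such a callback, assuming it does nothing more than collect the warnings Aspose.Words reports (the Warnings property name is hypothetical and chosen only for illustration), might look like this:

```csharp
using System.Collections.Generic;
using Aspose.Words;

// Hypothetical sketch of the HandleDocumentWarnings class referenced above;
// the real class is not shown in this post and may differ.
public class HandleDocumentWarnings : IWarningCallback
{
    // Warnings collected so far (property name is an assumption).
    public List<WarningInfo> Warnings { get; } = new List<WarningInfo>();

    // Called by Aspose.Words for each potential issue it reports.
    public void Warning(WarningInfo info)
    {
        Warnings.Add(info);
    }
}
```

With a callback along these lines, the collected entries (for example each WarningInfo's WarningType and Description) could be written to the log next to the exception details, which may help narrow down which part of the HTML triggers the failure.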
I have reason to believe that the HTML files being loaded are not corrupt, and it is hard for us to verify the validity of each file when there are hundreds or thousands of them to convert.
Can you please check whether this is a bug in your component or a problem with the data? A sample is attached for your reference.
Thanks in advance.
aspose_error_details.zip (89.4 KB)