Hi there,
I am trying to open a Word document and save it as a rich text (RTF) file. I am using the same code that I have used for months on standard .doc files produced by humans; it has processed around 12,000 documents without a failure.
The new file was produced by a ‘word printer’ (essentially an export from a medical system) and is machine generated: everything in the input document has been placed in text boxes and absolutely positioned. When I process it, the output is garbled, as shown in the files below:
Input.zip (13.1 KB)
Output.zip (30.5 KB)
The code I’m using to create the RTF is a fairly straightforward node-by-node clone into a template document, as follows:
//Aspose Licensing
License lic = new License();
lic.SetLicense(new MemoryStream(Properties.Resources.Aspose_Words));
Document _wordDoc = new Document(Path.Combine(SourceFilePath, FileName), new LoadOptions(LoadFormat.Docx, string.Empty, string.Empty));
LoadOptions _lo = new LoadOptions();
_lo.LoadFormat = LoadFormat.Doc; //***
MemoryStream _template = new MemoryStream(Properties.Resources.QIRTFTemplate);
Document _rtfDoc = new Document(_template, _lo);
DocumentBuilder _db = new DocumentBuilder(_rtfDoc);
_db.Font.Name = m_StandardFontName;
_db.Font.Size = m_StandardFontSize;
Node _insertAfterNode = _db.CurrentParagraph;
// Make sure that the node is either a paragraph or table.
if (_insertAfterNode.NodeType != NodeType.Paragraph && _insertAfterNode.NodeType != NodeType.Table)
    throw new ArgumentException("The destination node should be either a paragraph or table.");
// We will be inserting into the parent of the destination paragraph.
CompositeNode dstStory = _insertAfterNode.ParentNode;
// This object will be translating styles and lists during the import.
NodeImporter _importer = new NodeImporter(_wordDoc, _insertAfterNode.Document, ImportFormatMode.KeepSourceFormatting);
// Loop through all sections in the source document.
foreach (Section _srcSection in _wordDoc.Sections)
{
    // Loop through all block level nodes (paragraphs and tables) in the body of the section.
    foreach (Node _srcNode in _srcSection.Body)
    {
        // This creates a clone of the node, suitable for insertion into the destination document.
        Node _newNode = _importer.ImportNode(_srcNode, true);
        // Insert new node after the reference node.
        dstStory.InsertAfter(_newNode, _insertAfterNode);
        _insertAfterNode = _newNode;
    }
}
//Take the first part of the filename and set the extension to rtf for the new file name
string _newFileName = string.Format("{0}.rtf", FileName.Split('.')[0]);
_rtfDoc.Save(Path.Combine(OutputFilePath, _newFileName), SaveFormat.Rtf);
Any help would be great.
Thanks,
Jason