Data from Word document is not displayed correctly in Aspose (custom XML in header)

Hi,

I am having trouble loading a document with Aspose.Words. The document contains custom XML that does not seem to be loaded properly with the document.
Our document is created in Word, and we are using Aspose to convert it to PDF.

In the header, the data we have changed is not displayed when the document is loaded with Aspose. Instead, it displays the data we had before changing the document.

Word is able to load the document correctly, but in Aspose our new data is missing.
We’ve seen from the warning callback that some things are not supported, but nothing that looks relevant to our problem:

Import of element ‘hdrShapeDefaults’ is not supported in Docx format by Aspose.Words.
Import of element ‘rsids’ is not supported in Docx format by Aspose.Words.
Import of element ‘doNotAutoCompressPictures’ is not supported in Docx format by Aspose.Words.
Import of element ‘shapeDefaults’ is not supported in Docx format by Aspose.Words.
Import of element ‘decimalSymbol’ is not supported in Docx format by Aspose.Words.
Import of element ‘listSeparator’ is not supported in Docx format by Aspose.Words.
Import of element ‘docId’ is not supported in Docx format by Aspose.Words.
Tag with name ‘effectStyleLst’ is not supported.
Tag with name ‘objectDefaults’ is not supported.
Tag with name ‘extraClrSchemeLst’ is not supported.

Here is our current code for loading the document:

var loadOptions = new Aspose.Words.LoadOptions();
loadOptions.LoadFormat = Aspose.Words.LoadFormat.Docx;
loadOptions.WarningCallback = new WarningCallbackClass();
var doc = new Aspose.Words.Document(stream, loadOptions);

Hi Christian,

Thanks for your inquiry. I suggest upgrading to the latest version, Aspose.Words for .NET 13.11.0, and seeing how it goes on your side. I hope this helps. If the problem remains, please attach your Word/PDF documents here for testing. I will investigate the issue on my side and provide you with more information.

Best regards,

Hi,

I updated the DLL to the new version (13.11.0), but sadly this did not solve the problem.
I am attaching the Word document that exhibits the problem.

The problem seems to exist no matter what format the document is saved as, indicating that the problem lies within Aspose.Words’ loading of the docx.

Specifically, the header contains fields bound to custom XML. In the Word XML, the fields seem to contain both the data-binding information and some “default content”. It is the “default content” that is displayed by Aspose, while MS Word displays the data-bound content.
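
To illustrate the mismatch described above, here is a minimal sketch that uses only System.Xml (no Aspose or Open XML SDK calls). It assumes a content control whose w:dataBinding carries w:prefixMappings and w:xpath attributes, and it compares the text cached inside the control with the value the binding points to. The class name, the `urn:example:meta` namespace, and the sample XML are hypothetical and not taken from the attached document.

```csharp
using System;
using System.Xml;

static class DataBindingSketch
{
    const string W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the value the content control is bound to, or null if it cannot be resolved.
    public static string ResolveBoundValue(XmlElement sdt, XmlDocument customXml)
    {
        var ns = new XmlNamespaceManager(sdt.OwnerDocument.NameTable);
        ns.AddNamespace("w", W);
        var binding = (XmlElement)sdt.SelectSingleNode("w:sdtPr/w:dataBinding", ns);
        if (binding == null) return null;

        // prefixMappings is a space-separated list of xmlns:prefix='uri' declarations.
        var storeNs = new XmlNamespaceManager(customXml.NameTable);
        var mappings = binding.GetAttribute("prefixMappings", W)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var mapping in mappings)
        {
            int eq = mapping.IndexOf('=');
            if (!mapping.StartsWith("xmlns:", StringComparison.Ordinal) || eq < 0) continue;
            storeNs.AddNamespace(mapping.Substring(6, eq - 6),
                                 mapping.Substring(eq + 1).Trim('\'', '"'));
        }

        var xpath = binding.GetAttribute("xpath", W);
        if (xpath.Length == 0) return null;
        var node = customXml.SelectSingleNode(xpath, storeNs);
        return node == null ? null : node.InnerText;
    }

    static void Main()
    {
        // Hypothetical header fragment: the cached ("default") text is stale.
        var header = new XmlDocument();
        header.LoadXml(
            "<w:sdt xmlns:w='" + W + "'>" +
            "<w:sdtPr><w:dataBinding w:prefixMappings=\"xmlns:ns0='urn:example:meta'\"" +
            " w:xpath='/ns0:meta/ns0:title'/></w:sdtPr>" +
            "<w:sdtContent><w:p><w:r><w:t>Old title</w:t></w:r></w:p></w:sdtContent>" +
            "</w:sdt>");

        // Hypothetical custom XML part holding the current value.
        var store = new XmlDocument();
        store.LoadXml("<meta xmlns='urn:example:meta'><title>New title</title></meta>");

        var cached = header.DocumentElement.GetElementsByTagName("t", W)[0].InnerText;
        var bound = ResolveBoundValue(header.DocumentElement, store);
        Console.WriteLine("cached: " + cached); // the text a reader that ignores the binding shows
        Console.WriteLine("bound:  " + bound);  // the text a reader that honours the binding shows
    }
}
```

In this sketch the two printed values differ, which matches the symptom: one reader shows the cached text, the other the bound value.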

Furthermore, we had the exact same problem back in 2007 with an earlier version of Aspose.Words (I was working for a different company then), but at that time it was solved by an update (I also found another forum thread on this when googling). However, the problem seems to have resurfaced in a later update.

Thank you for helping!

Hi Christian,

Thanks for the additional information. I am afraid I could not see any issue with the output document generated by Aspose.Words. Could you please clarify where the issue is? I have generated and attached a .docx file here for your reference. This document was generated using Aspose.Words for .NET 13.11.0 on my side. I used the following simple code to generate it:

Document doc = new Document(@"C:\Temp\ruh5.docx");
doc.Save(@"C:\Temp\out.docx");

Please also see the attached screenshot, which shows that when out.docx is opened with Microsoft Word 2013, all content controls display the data correctly. Please attach the output .docx that shows the undesired behavior here for our reference. Also, which Aspose.Words version previously worked without this problem on your side?

Best regards,

Hi again,

I’m attaching three documents: one docx that was saved before Aspose is involved in the process, one docx that was saved after loading the stream into an Aspose document in Docx format (as shown in our code above), and the PDF result from saving our document as PDF with Aspose.

As you can see, the headers in the docx-files are not the same as what we see in the pdf, and this is essentially our problem.

We are using the newest versions of Aspose.Words (13.11.0) and Aspose.Pdf (13.10.0).

Thank you for the quick replies, I hope this clarifies the issue a bit.

Hi Christian,

Thanks for the additional information. I tested the scenario and managed to reproduce the same problem on my side when converting “savedWithoutAspose.docx” to PDF format using Aspose.Words for .NET 13.11.0. I have logged this problem in our issue tracking system as WORDSNET-9411. Our development team will look further into the details of this problem, and we will keep you updated on the status of the fix. We apologize for the inconvenience.

Best regards,

Thanks :)

Hi,

Any update on this? We have a deadline coming up and we’re trying to get an overview of the situation.

Thanks!

Hi Christian,

Thanks for your inquiry. Unfortunately, your issue is not resolved yet. Support for such SDT controls bound to custom XML data is not implemented yet. We will inform you via this thread as soon as this issue is resolved. We apologize for the inconvenience.

Best regards,

Thank you!

However - not implemented yet? This was working with previous versions of Aspose.Words…

/C

Hi Christian,

Thanks for your inquiry. Could you please tell us which Aspose.Words version previously worked without this problem on your side?

Best regards,

Hi,

Unfortunately, I cannot remember which version this was. However, we wrote some demo code for a temporary workaround. It seems to work, and adds support for data-bound SDT controls. Perhaps it will help:

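// Namespaces used by the workaround; needs references to WindowsBase and the Open XML SDK.
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Packaging;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml;
using DocumentFormat.OpenXml.Packaging;

private void ImplementWorkaround(Stream stream)
{
    try
    {
        var customPartsMap = new Dictionary<string, CustomXml>();

        // Disposing the package and document flushes the edited headers back to the stream.
        using (var package = Package.Open(stream))
        using (var doc = WordprocessingDocument.Open(package))
        {
            foreach (var header in doc.MainDocumentPart.HeaderParts)
            {
                using (var hStream = header.GetStream())
                {
                    var document = new XmlDocument();
                    document.Load(hStream); // read the header XML before querying it
                    var nsmgr = new XmlNamespaceManager(document.NameTable);

                    // Utility.WORD_NAMESSPACES is our own prefix-to-URI map (not shown); it must include "w".
                    foreach (var pair in Utility.WORD_NAMESSPACES)
                        nsmgr.AddNamespace(pair.Key, pair.Value);

                    var nodes = document.SelectNodes("//w:sdt", nsmgr);
                    if (nodes.Count == 0) continue;

                    foreach (XmlNode sdt in nodes)
                    {
                        var dataBinding = sdt.SelectSingleNode("w:sdtPr/w:dataBinding", nsmgr);
                        if (dataBinding == null) continue; // not a data-bound control

                        var custom = CustomXml.ForDataBinding(dataBinding);
                        if (!customPartsMap.ContainsKey(custom.Namespace))
                        {
                            custom.Ensure(doc);
                            customPartsMap.Add(custom.Namespace, custom);
                        }

                        // Overwrite the cached text with the value from the custom XML part.
                        var textNode = sdt.SelectSingleNode("w:sdtContent//w:t", nsmgr);
                        if (textNode != null)
                            textNode.InnerText = customPartsMap[custom.Namespace].GetValue(dataBinding);
                    }

                    // Write the modified XML back over the part and trim any leftover bytes.
                    hStream.Position = 0;
                    document.Save(hStream);
                    hStream.SetLength(hStream.Position);
                    hStream.Flush();
                }
            }
        }
    }
    catch (Exception exception)
    {
        System.Diagnostics.Debug.WriteLine(exception);
    }
    finally
    {
        // Rewind so the caller can load the patched document from the same stream.
        stream.Position = 0;
    }
}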

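internal class CustomXml
{
    private const string WordNs = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Matches a single mapping of the form xmlns:prefix='namespace-uri'.
    private static readonly Regex TARDIS = new Regex("^xmlns:(?<prefix>[^=]+)='(?<namespace>[^']+)'$");
    public XmlDocument Document { get; set; }
    public XmlNamespaceManager Manager { get; set; }
    public string Prefix { get; set; }
    public string Namespace { get; set; }

    internal CustomXml(string ns, string prefix)
    {
        this.Namespace = ns;
        this.Prefix = prefix;
    }

    internal void Ensure(WordprocessingDocument doc)
    {
        // Find the custom XML part whose properties reference our namespace.
        var customXmlPart = doc.MainDocumentPart.CustomXmlParts.First(
            p => p.CustomXmlPropertiesPart != null &&
                 p.CustomXmlPropertiesPart.DataStoreItem.SchemaReferences != null &&
                 p.CustomXmlPropertiesPart.DataStoreItem.SchemaReferences
                     .Elements<DocumentFormat.OpenXml.CustomXmlDataProperties.SchemaReference>()
                     .Any(s => s.Uri != null && s.Uri.Value == Namespace));

        using (var stream = customXmlPart.GetStream())
        {
            Document = new XmlDocument();
            Document.Load(stream); // load the custom XML so GetValue can query it
            Manager = new XmlNamespaceManager(Document.NameTable);
            Manager.AddNamespace(Prefix, Namespace);
        }
    }

    internal string GetValue(XmlNode node)
    {
        // w:xpath is namespace-qualified, so look it up by local name and namespace.
        var target = Document.SelectSingleNode(node.Attributes["xpath", WordNs].Value, Manager);
        return target == null ? string.Empty : target.InnerText;
    }

    internal static CustomXml ForDataBinding(XmlNode node)
    {
        var match = TARDIS.Match(node.Attributes["prefixMappings", WordNs].Value);
        return new CustomXml(match.Groups["namespace"].Value, match.Groups["prefix"].Value);
    }
}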

Hi Christian,

It’s great that you were able to find a workaround. Rest assured, we will notify you as soon as your issue is resolved in Aspose.Words. We apologize for the inconvenience.

Best regards,

The issues you have found earlier (filed as WORDSNET-9411) have been fixed in this .NET update and this Java update.
