Aspose.Words issues related to the PDF converter

Hey Tom:
You reported to us that there are three outstanding issues related to Aspose.Words when converting DOC to PDF (please see the attached doc from you).
Would you please provide an update on them, especially the first one? Did you log a ticket for it?
Thanks,
Dong.

Hello!
Thank you for your inquiry.
#4171 – not fixed yet. You can suggest altering formatting.
#4166 – also not fixed. You can call doc.AcceptAllRevisions() as a workaround.
Some heading’s attributes are not set properly – I don’t remember this one. It would be very strange if the issue was reported but wasn’t logged. Please refer to the original thread or attach a document that reproduces this problem.
Regards,

Here is the doc for which the error was reported. The date March 20 is wrong after conversion.
Also, I have a question:
When Aspose.Pdf converts DOC to PDF, which PDF version does it produce? Is this configurable?
Thanks,
Dong.

Any update on them? Please make sure the heading error is also tracked.
Thanks,
Dong.

Hi
Unfortunately, these issues are not resolved yet. We will notify you as soon as they are.
Best regards.

Hello Dong!
Here is an update for you.
I’m sorry for the misunderstanding. We need some clarification on the last document you sent. You mean that the “heading error” is the incorrect date in the first table, don’t you? This date is represented by a DATE field:
{ DATE \@ "MMMM d, yyyy" }
MS Word updates fields when it opens a document. That’s why you always see the current date here. In contrast, Aspose.Words doesn’t update fields on open or conversion, and there is no way to update a DATE field on request. So the field is output with its last saved result, which is most probably the date when you last saved the document with MS Word.
This known issue was registered as #3292. I have linked it to this thread so we won’t forget to notify you of progress. Note that updating fields in general is a complex task. That’s why it hasn’t been solved yet and why I can’t promise any time frame for a fix.
Here is the MS knowledge base article where a workaround is described:
https://support.microsoft.com/en-us/topic/the-filename-field-does-not-automatically-update-when-you-open-a-document-in-word-de2bfb95-d990-1ced-a618-5ac0a2ec1be4
See “Method 2: Create a macro to automatically update the field.”
If you mean anything different please let us know.
About the other issues:
#4171 - LeftIndent is missing for outlined paragraphs.
This is a technically severe issue. I can advise not using indentation for outlined paragraphs. And I can help you with document refactoring if you’d like me to.
#4166 - Deleted document elements are converted to several formats.
This issue won’t be fixed. It was closed by our team leader with the following disposition:
“At the moment this is by design. I’m closing the issue.”
You can use a simple workaround. Call this method before you convert to PDF:
doc.AcceptAllRevisions();
In general, document quality could be much improved if you take my advice on refactoring. I saw many other layout issues in your documents. At the least, we can avoid repeated paragraph breaks by putting page breaks where appropriate. This will eliminate issues with illogical pagination. Next, I would prefer table formatting over heavy use of tabs. A table layout can guarantee proper horizontal alignment. What’s your opinion? We are ready to help you at every step.
Regards,

I wanted to follow up on issue #3292: when saving or printing via Aspose.Words, the DATE fields that MS Word would update automatically when opening the document do not get updated.
I have attached a sample document.
Thanks,
Humberto

Hi

Thanks for your request. You can try using the following code to update DATE fields in your document:

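// Requires: using System; using System.Text.RegularExpressions;
//           using Aspose.Words; using Aspose.Words.Fields;
public void Test121()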
{
    Document doc = new Document(@"Test121\Check+Letter.rtf");
    UpdateDateFields(doc);
    doc.Save(@"Test121\out.doc");
}
private void UpdateDateFields(Document doc)
{
    DocumentBuilder builder = new DocumentBuilder(doc);
    // Get collection of FieldStart nodes
    NodeCollection starts = doc.GetChildNodes(NodeType.FieldStart, true);
    // Loop through FieldStart nodes
    foreach (FieldStart start in starts)
    {
        // Check whether current field start is start of DATE field
        if (start.FieldType == FieldType.FieldDate)
        {
            // We should get field code
            string fieldCode = string.Empty;
            Node currentNode = start.NextSibling;
            // Get Field code
            while (currentNode.NodeType != NodeType.FieldSeparator)
            {
                if (currentNode.NodeType == NodeType.Run)
                    fieldCode += (currentNode as Run).Text;
                currentNode = currentNode.NextSibling;
            }
            currentNode = currentNode.NextSibling;
            // Remove the current field value
            while (currentNode.NodeType != NodeType.FieldEnd)
            {
                Node nextNode = currentNode.NextSibling;
                currentNode.Remove();
                currentNode = nextNode;
            }
            // We should get the date format from the field code
            // The structure of the DATE field is the following:
            // DATE \@ "MMMM d, yyyy" 
            // We need only the date format, captured by the named group "format"
            Regex regex = new Regex("\\@\\s+\"(?<format>[^\"]+)\"");
            Match match = regex.Match(fieldCode);
            string dateFormat = match.Groups["format"].Value;
            // Get current date
            string currentDate = DateTime.Now.ToString(dateFormat);
            // Move the DocumentBuilder cursor to the field end and insert the new field value
            builder.MoveTo(currentNode);
            builder.Write(currentDate);
        }
    }
}

I hope this could help you.
Best regards.
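
The format-extraction step used in the code above can be tried on its own, without Aspose.Words. The following is only a minimal sketch: the class and method names are made up for illustration, the regex is a slightly stricter variant that also matches the literal backslash of the \@ switch, and it assumes the picture string in the field code is also a valid .NET date format. That holds for "MMMM d, yyyy", but not for every Word picture (for example, Word writes AM/PM where .NET expects tt).

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DateFieldFormatDemo
{
    // Hypothetical helper: returns the picture string that follows the \@ switch
    // in a DATE field code, or null when the field code has no such switch.
    public static string ExtractDateFormat(string fieldCode)
    {
        // Matches: backslash, '@', optional whitespace, then a quoted format
        // captured in the named group "format".
        Match match = Regex.Match(fieldCode, @"\\@\s*""(?<format>[^""]+)""");
        return match.Success ? match.Groups["format"].Value : null;
    }

    public static void Main()
    {
        // Field code as it appears between FieldStart and FieldSeparator.
        string fieldCode = " DATE \\@ \"MMMM d, yyyy\" ";
        string format = ExtractDateFormat(fieldCode);

        // A fixed date and the invariant culture keep the output reproducible.
        DateTime sample = new DateTime(2009, 5, 8);
        Console.WriteLine(format);                                          // MMMM d, yyyy
        Console.WriteLine(sample.ToString(format, CultureInfo.InvariantCulture)); // May 8, 2009
    }
}
```

Returning null when the switch is absent lets the caller decide what to do with a plain { DATE } field instead of silently formatting with an empty string.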

Alexey,
Thanks for your prompt reply. Is it possible to get that workaround in VB?
Humberto

Sure, here is the same code in VB.

' Requires: Imports System.Text.RegularExpressions and Imports Aspose.Words
Public Sub Test007()
    Dim doc As Document = New Document("C:\Temp\Check+Letter.rtf")
    UpdateDateFields(doc)
    doc.Save("C:\Temp\out.doc")
End Sub

Private Sub UpdateDateFields(ByVal doc As Document)
    Dim builder As DocumentBuilder = New DocumentBuilder(doc)
    'Get collection of FieldStart nodes
    Dim starts As NodeCollection = doc.GetChildNodes(NodeType.FieldStart, True)
    'Loop through FieldStart nodes
    For Each start As Fields.FieldStart In starts
        'Check whether the current field start is the start of a DATE field
        If (start.FieldType = Fields.FieldType.FieldDate) Then
            'We should get the field code
            Dim fieldCode As String = String.Empty
            Dim currentNode As Node = start.NextSibling
            'Get the field code
            While (currentNode.NodeType <> NodeType.FieldSeparator)
                If (currentNode.NodeType = NodeType.Run) Then
                    fieldCode = fieldCode & DirectCast(currentNode, Run).Text
                End If
                currentNode = currentNode.NextSibling
            End While
            currentNode = currentNode.NextSibling
            'Remove the current field value
            While (currentNode.NodeType <> NodeType.FieldEnd)
                Dim nextNode As Node = currentNode.NextSibling
                currentNode.Remove()
                currentNode = nextNode
            End While
            'We should get the date format from the field code
            'The structure of the DATE field is the following:
            ' DATE \@ "MMMM d, yyyy" 
            'We need only the date format, captured by the named group "format"
            Dim regex As Regex = New Regex("\@\s+""(?<format>[^""]+)""")
            Dim match As Match = regex.Match(fieldCode)
            Dim dateFormat As String = match.Groups("format").Value
            'Get current date
            Dim currentDate As String = DateTime.Now.ToString(dateFormat)
            'Move the DocumentBuilder cursor to the field end and insert the new field value
            builder.MoveTo(currentNode)
            builder.Write(currentDate)
        End If
    Next
End Sub

Best regards.

Alexey,
Yes that did it, I modified it so it also works with time fields.
Thanks !!!
Humberto

Alexey,
As I said before, the workaround you gave resolves the date values issue, but in some circumstances the date value now loses its formatting. I am enclosing one template with the correct formatting and also a file that was run through the process, which in some cases loses the formatting.
Is there a setting I have to use to preserve the formatting?
Please help.
Humberto

Hi

Thanks for your request. As far as I can see, the date format is the same in both documents. It shows “may 8, 2009” as expected. Note that when you open a document in MS Word, it automatically updates DATE fields, and depending on the culture you may see a different format. On my side it looks like “май 8, 2009”.
Please clarify what you mean by “date value now loses formatting”. You can attach screenshots to demonstrate the issue.
Best regards.

Alexey,
I understand about the culture setting, but that is not what I meant; sorry if I miscommunicated. By losing the format I meant this: say the date field is in the Courier New font and I check the “Preserve formatting during updates” box in the Field edit window when writing the template. After the workaround runs during the merge process, that field turns into Times New Roman 12.
I have enclosed two screenshots: one from the template, which shows the fonts as they should be, and one generated by the merging process, which clearly shows that the font is not what it was originally (see the green highlight). These two screenshots are from the documents previously submitted.
Thanks,
Humberto

Hi Humberto,

Thank you for your explanation. Now I understand what you mean. This occurs because we completely remove the field value. I modified the code; now all formatting should be preserved.

' Requires: Imports System.Text.RegularExpressions and Imports Aspose.Words
Private Sub UpdateDateFields(ByVal doc As Document)
    Dim builder As DocumentBuilder = New DocumentBuilder(doc)
    'Get collection of FieldStart nodes
    Dim starts As NodeCollection = doc.GetChildNodes(NodeType.FieldStart, True)
    'Loop through FieldStart nodes
    For Each start As Fields.FieldStart In starts
        'Check whether the current field start is the start of a DATE field
        If (start.FieldType = Fields.FieldType.FieldDate) Then
            'We should get the field code
            Dim fieldCode As String = String.Empty
            Dim currentNode As Node = start.NextSibling
            'Get the field code
            While (currentNode.NodeType <> NodeType.FieldSeparator)
                If (currentNode.NodeType = NodeType.Run) Then
                    fieldCode = fieldCode & DirectCast(currentNode, Run).Text
                End If
                currentNode = currentNode.NextSibling
            End While
            'Node that represents the displayed text of the field;
            'it is needed to preserve the original formatting.
            'Explicitly reset to Nothing, because a VB local declared inside a loop
            'otherwise keeps its value from the previous iteration
            Dim fieldVal As Node = Nothing
            currentNode = currentNode.NextSibling
            'Clear the current field value, keeping Run nodes so their formatting survives
            While (currentNode.NodeType <> NodeType.FieldEnd)
                Dim nextNode As Node = currentNode.NextSibling
                If (currentNode.NodeType = NodeType.Run) Then
                    DirectCast(currentNode, Run).Text = String.Empty
                    fieldVal = currentNode
                Else
                    currentNode.Remove()
                End If
                currentNode = nextNode
            End While
            'We should get the date format from the field code
            'The structure of the DATE field is the following:
            ' DATE \@ "MMMM d, yyyy" 
            'We need only the date format, captured by the named group "format"
            Dim regex As Regex = New Regex("\@\s+""(?<format>[^""]+)""")
            Dim match As Match = regex.Match(fieldCode)
            Dim dateFormat As String = match.Groups("format").Value
            'Get current date
            Dim currentDate As String = DateTime.Now.ToString(dateFormat)
            'Move the DocumentBuilder cursor to the kept Run (or to the field end) and insert the new field value
            If (fieldVal Is Nothing) Then
                builder.MoveTo(currentNode)
            Else
                builder.MoveTo(fieldVal)
            End If
            builder.Write(currentDate)
        End If
    Next
End Sub

Please let me know in case of any issues.
Best regards.

Alexey,
I tested the additional code you provided and it works great!
Once again, Thanks!
Humberto

Use Aspose.Words.Document.UpdateFields before saving a document. It works starting with Aspose.Words for .NET 7.0. Also use the Aspose.Words direct to PDF conversion method instead of Aspose.Words + Aspose.Pdf conversion.
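
As a rough sketch of that advice, the snippet below updates fields and then saves straight to PDF with Aspose.Words alone. The class name, method name and file paths are placeholders invented for this example, and the AcceptAllRevisions call is only the workaround mentioned earlier in the thread for #4166; drop it if tracked changes are not a concern. It assumes a version of Aspose.Words that supports both UpdateFields and direct PDF output.

```csharp
using Aspose.Words;

public static class ConvertToPdfSketch
{
    // Hypothetical helper: inputPath and outputPath are placeholders.
    public static void Convert(string inputPath, string outputPath)
    {
        Document doc = new Document(inputPath);

        // Workaround from earlier in the thread (#4166): accept tracked changes
        // so that deleted content is not rendered in the output.
        doc.AcceptAllRevisions();

        // Refresh field results (such as DATE) before rendering.
        doc.UpdateFields();

        // Save directly to PDF instead of going through Aspose.Pdf.
        doc.Save(outputPath, SaveFormat.Pdf);
    }
}
```

A call such as ConvertToPdfSketch.Convert(@"C:\Temp\in.doc", @"C:\Temp\out.pdf") would then replace the manual UpdateDateFields workaround, provided the installed version updates the fields your documents use.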