Free Support Forum - aspose.com

Removing Corrupted DocProperties from a document

Hi,

In the attached document, you can see that a couple of the document properties have been corrupted.

They have the text: Error! Unknown document property name.

Is it possible to scan the document and remove these properties?

Cheers

Paul

Hi

Thanks for your request. Please try using the following code:

// Requires: using System.Text.RegularExpressions; using Aspose.Words; using Aspose.Words.Fields;

// Open the document.
Document doc = new Document("YYYY_0_1.doc");

// Take a snapshot of all FieldStart nodes so that removing nodes does not disturb the loop.
Node[] fieldStarts = doc.GetChildNodes(NodeType.FieldStart, true).ToArray();

// The property name is the second token of the field code,
// e.g. DOCPROPERTY  MyProp  \* MERGEFORMAT (trailing switches are optional).
Regex regex = new Regex(@"\s*\S+\s+(?<propname>\S+)");

foreach (FieldStart fieldStart in fieldStarts)
{
    if (fieldStart.FieldType != FieldType.FieldDocProperty)
        continue;

    // Collect the field code: the text of all runs between FieldStart and FieldSeparator.
    string fieldCode = string.Empty;
    Node currentNode = fieldStart;
    while (currentNode != null && currentNode.NodeType != NodeType.FieldSeparator)
    {
        if (currentNode.NodeType == NodeType.Run)
            fieldCode += ((Run)currentNode).Text;
        currentNode = currentNode.NextSibling;
    }

    // Extract the property name; skip the field if the code has an unexpected shape.
    Match match = regex.Match(fieldCode);
    if (!match.Success)
        continue;
    string propertyName = match.Groups["propname"].Value;

    // Remove the field if the property exists in neither BuiltInDocumentProperties
    // nor CustomDocumentProperties.
    if (doc.BuiltInDocumentProperties[propertyName] == null && doc.CustomDocumentProperties[propertyName] == null)
    {
        // Remove every node from FieldStart up to and including FieldEnd.
        currentNode = fieldStart;
        while (currentNode != null && currentNode.NodeType != NodeType.FieldEnd)
        {
            Node next = currentNode.NextSibling;
            currentNode.Remove();
            currentNode = next;
        }
        if (currentNode != null)
            currentNode.Remove();
    }
}

doc.Save("Out.doc");
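
One thing to watch for: if a property name contains spaces, Word normally wraps it in quotes inside the field code, and a pattern that takes the second whitespace-delimited token will only pick up the first word. Below is a minimal sketch of a more tolerant parser that uses only System.Text.RegularExpressions. The class name DocPropertyFieldCode and the method GetPropertyName are hypothetical helpers made up for illustration, not part of Aspose.Words, and the pattern assumes the property name is the first argument after the DOCPROPERTY keyword.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not an Aspose.Words API): extracts the property name
// from the text of a DOCPROPERTY field code.
public static class DocPropertyFieldCode
{
    // Field keyword, then either a quoted name (may contain spaces)
    // or an unquoted name (a single token). Switches after the name are ignored.
    private static readonly Regex FieldCodePattern = new Regex(
        @"^\s*DOCPROPERTY\s+(?:""(?<propname>[^""]+)""|(?<propname>\S+))",
        RegexOptions.IgnoreCase);

    // Returns the property name, or null when the text is not a DOCPROPERTY field code.
    public static string GetPropertyName(string fieldCode)
    {
        Match match = FieldCodePattern.Match(fieldCode ?? string.Empty);
        return match.Success ? match.Groups["propname"].Value : null;
    }
}
```

With this helper, GetPropertyName(@" DOCPROPERTY  Author  \* MERGEFORMAT ") gives Author, GetPropertyName(" DOCPROPERTY \"Project Name\" ") gives Project Name, and a non-matching code such as " PAGE " gives null, so the calling loop can skip that field instead of removing it.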

Best regards,

Excellent.

Thanks a million for that.