
# GetText adding a Form Feed (ASCII 12)?

I have a process which opens a Word doc using DocumentBuilder, gets the contents using GetText() and updates a database record.
I am seeing an extra form feed character in the updated data. If I watch this in debug, I can see that there is an ASCII 12 as the last character. If I open the Word doc in binary mode in TextPad, there are two “0D” values immediately following my last visible character.
Here is the meat of my code:

string[] fileEntries = Directory.GetFiles(txtChargeLangPath.Text, "*_LANGUAGE.DOC");
foreach (string fileName in fileEntries)
{
    langDoc = new Document(fileName);
    chargeLanguage = langDoc.GetText();

    // Inspect the last character returned by GetText()
    lastChar = chargeLanguage[chargeLanguage.Length - 1];
    lastCharInt = lastChar;
    MessageBox.Show(lastCharInt.ToString()); // Shows "12"

    // Drop the last character before writing the text to the database
    myUpdateCommand.CommandText = "Update tblCsCharge Set ChargeLanguage = '" + chargeLanguage.Substring(0, chargeLanguage.Length - 1) + "' Where FileNumber = '" + fileNumber + "' and CountNumber = " + countNumber;
    myUpdateCommand.ExecuteNonQuery();
}


Hi
Thanks for your request. The “\f” character (ASCII 12) is the section break character. GetText returns the text with all Microsoft Word control characters, including field codes. If you need the text without the MS Word control characters, use the ToTxt method instead:

Document doc = new Document(@"C:\Temp\in.doc");
string text = doc.ToTxt();
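
If you need to keep using GetText, a minimal sketch of trimming the trailing control characters yourself might look like the following. The class `WordTextCleanup` and the helper `TrimTrailingControlChars` are hypothetical names, not part of Aspose.Words, and the set of characters trimmed (form feed, carriage return, line feed) is an assumption based on the characters mentioned in this thread.

```csharp
using System;

public static class WordTextCleanup
{
    // Assumed set of trailing control characters: form feed (12),
    // carriage return (13) and line feed (10). Adjust as needed.
    private static readonly char[] TrailingControlChars = { '\f', '\r', '\n' };

    // Hypothetical helper (not an Aspose.Words API): removes any run of the
    // characters above from the end of the string and leaves the rest untouched.
    public static string TrimTrailingControlChars(string text)
    {
        if (text == null)
        {
            throw new ArgumentNullException(nameof(text));
        }
        return text.TrimEnd(TrailingControlChars);
    }
}
```

With this in place, the value from the original loop could be cleaned with `chargeLanguage = WordTextCleanup.TrimTrailingControlChars(langDoc.GetText());`. Note that this only handles the end of the string. Control characters inside the text, such as field codes, are left as they are.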


Hope this helps.
Best regards.