Free Support Forum - aspose.com

Read Formatted Text from Bookmark to Database

Hi

I have a document with several bookmarks. Each bookmark may contain either plain text or formatted text (HTML). The text for each bookmark is pulled from a database, and then I download the document. I made several changes, such as changing the font color and font size and underlining text. I then extract each bookmark's text back to the database, updating the column that the previous value for that bookmark came from. After the update, when I load the document again with the updated text, I get only plain text instead of formatted text.

For example, I have 3 bookmarks in a document:
Name: Vinoth, whose bookmark.Text is Vinoth
City: Chennai, whose bookmark.Text is Chennai
Country: India, whose bookmark.Text is India
I used DocumentBuilder.MoveToBookmark(Bookmark.Name) and InsertHtml(Bookmark.Text) to produce the document with formatted text, and then I downloaded the document.

Now I update the formatted text in the downloaded document as follows:
Name: Vinoth - font size has changed
City: Chennai - font fore color has changed and underline is removed
Country: India - font style has changed

After making these changes, I write all the bookmark texts back to the database using C# code. I can only save each bookmark's text as plain text. When I load the document again with the updated text, the resulting document contains only plain text:
Name: Vinoth
City: Chennai
Country: India
Could you tell me how to get the updated formatted text from a bookmark using C#?

Hi Allan,


Thanks for your inquiry. You can extract Bookmark’s content into a temporary Document and then get HTML representation of that content by using the following code:

string text = doc.FirstSection.Body.ToString(SaveFormat.Html);

I hope this helps.

Best regards,

Thank you for your reply,


string text = doc.FirstSection.Body.ToString(SaveFormat.Html);

reads all the text into a single string variable. I need each bookmark's text in its own string variable.

Need solution soon…

Hi Allan,


Thanks for your inquiry. A complete Bookmark in a Microsoft Word document consists of a bookmark start character and a bookmark end character. As mentioned earlier, you need to extract the content enclosed between the bookmark start and bookmark end markers into a temporary Document and then get the HTML representation of that document. I hope this helps.

Best regards,

Thanks for reply,


It would be helpful if you could provide a code snippet in C#.

Regards
Allan

Hi Allan,


Thanks for your inquiry. The code mentioned in these articles is already in .NET (C#). If we can help you with anything else, please feel free to ask.

Best regards,

Hi:
Thanks again for your reply. However, I am getting a little frustrated because my manager is putting a lot of pressure on me and I have still not found (or received) an answer to my question. The documentation and all the examples (including the pointers you have provided) clearly show how to retrieve plain text or insert formatted text into a bookmark. However, I can find nothing on how to read formatted text from a bookmark when it contains:

  1. New bookmark
  2. Hyperlink
  3. End of a paragraph

Going through your examples and documentation, I finally stumbled upon this link


which briefly mentions the problem I am having, but offers no solution (at least none that I can see). Am I missing something here? Perhaps you can still help.

I have attached a file describing the problem I am having.
Any help you can provide is appreciated.

Hi Allan,


Thanks for your inquiry. I have attached a sample project here for your reference. I hope this helps in getting the formatted HTML string representation of a particular Bookmark’s content. Please let me know if I can be of any further assistance.

Best regards,

Thank you very much for your help. The sample program works fine for me and I get the formatted text for all bookmarks. However, there are two remaining issues.

1. The start of each bookmark moves to the next line when the bookmark is filled with the formatted text. How can I remove the paragraph nodes to avoid this?
2. If the bookmarks are inside a table, I can’t read them. Attempting to do so throws a null reference error, since your code reads the text from paragraph nodes rather than table nodes.
Kindly let me know where I need to change the sample program in order to read bookmarks inside a table node.
Please find the attachment for your reference.

Hi Allan,


Thanks for your inquiry. After an initial test with Aspose.Words 14.7.0, I was unable to reproduce these issues on my side. I suggest you upgrade to the latest version of Aspose.Words. You can download it from the following link. I hope this helps.
http://www.aspose.com/community/files/51/.net-components/aspose.words-for-.net/default.aspx

Best regards,

Hi,

Thanks for your reply. As you suggested, I’ve upgraded my Aspose version. After that, I could read the text of bookmarks that are inside a table. But another issue has come up, as follows.

I am creating a data table with bookmark names as column names and the bookmark text as the value of each respective column. I noticed that the text returned for every bookmark inside a table is the text of the last row of the document’s table.
The bookmarks Age, Gender and Street all return only the bookmark text of Street. I added another row to the end of the table, created another bookmark inside that row and processed the document again, and I got the same issue:
every bookmark inside the table returns the bookmark text of the last row of the table.
Besides, it reads the entire text inside a row, including the labels.
I can’t follow the logic of your sample project. Maybe I need to make some changes to your sample code to get the proper result, but I am stuck on working out how the code operates. Kindly let me know where I can edit the code to solve the issue described above.

Hi Allan,


Thanks for your inquiry. It would be great if you could create a simple, standalone, runnable console application that helps us reproduce the same problem on our end and attach it here for testing. As soon as you have this application ready, we’ll start further investigation into your issue and provide you with more information. Please also attach the Word document you are seeing this problem with. Thanks for your cooperation.

Best regards,

Hi:

Attached please find the console project that shows the problem.
As you can see, the data in bookmarks outside the table is OK.
As soon as a bookmark is inside the table, the same text “software” is read from all the bookmarks.
I am sure I am doing something wrong here.
Your help is appreciated.
Thanks

Hi Allan,


Thanks for your inquiry. We are working on your query and will get back to you soon.

Best regards,

Hi,

I have been waiting for your reply. Kindly respond as soon as possible.

Regards
Allan.

Hi Allan,


Thanks for being patient. Please allow us some time to investigate this issue. We will reply to you as soon as we can.

Best regards,

Hi Allan,


Thanks for being patient. You can use the following simple code to get the HTML representation of a Bookmark’s content:

    string Basepath = @"Documents";
    Document doc = new Document(Basepath + "BMDemo_Edit.doc");
    BookmarkCollection bmCollection = doc.Range.Bookmarks;
    DataTable dtUnitCurriculum = new DataTable();

    foreach (Bookmark bm in bmCollection)
    {
        ArrayList nodes = ExtractContent1(bm.BookmarkStart, bm.BookmarkEnd);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nodes.Count; i++)
        {
            Node node = (Node)nodes[i];
            if (node.IsComposite)
            {
                sb.Append(node.ToString(SaveFormat.Html));
                i = nodes.IndexOf(((CompositeNode)node).LastChild) + 1;
                continue;
            }
            sb.Append(node.ToString(SaveFormat.Html));
        }
        string bookmarkName = bm.Name;
        string bookmarkHtml = sb.ToString();
    }

    public static ArrayList ExtractContent1(Node startNode, Node endNode)
    {
        ArrayList nodes = new ArrayList();
        for (Node node = startNode; node != null && node != endNode; node = node.NextPreOrder(node.Document))
        {
            nodes.Add(node);
        }
        return nodes;
    }


I hope this helps.

Best regards,

This code works fine. Thank you for your reply.

Hi, there is another issue I have encountered recently. The code works fine for single-line text inside a bookmark. But if the bookmark contains a paragraph, the loop runs indefinitely when I try to read the paragraph text. I have attached a file that has a bookmark with paragraph text. I need a solution that reads both paragraph text and single-line text. Your help will be appreciated.


Regards
Allan
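
One possible explanation for the endless loop reported above, offered as a sketch and not a confirmed diagnosis: the posted loop sets i = nodes.IndexOf(LastChild) + 1 for every composite node. When a bookmark ends inside a later paragraph, that paragraph's last child lies after the bookmark end and was never added to the list, so IndexOf returns -1 and i is reset to 0. The example below reproduces the index arithmetic without Aspose.Words. SimpleNode, LoopDemo and all member names are hypothetical stand-ins invented for illustration; only the shape of the loop is taken from the thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a document node; this is NOT the Aspose.Words Node class.
class SimpleNode
{
    public string Name;
    public SimpleNode Parent;
    public List<SimpleNode> Children = new List<SimpleNode>();

    public SimpleNode(string name, params SimpleNode[] children)
    {
        Name = name;
        foreach (SimpleNode child in children)
        {
            child.Parent = this;
            Children.Add(child);
        }
    }

    public bool IsComposite { get { return Children.Count > 0; } }

    public SimpleNode LastChild
    {
        get { return Children.Count > 0 ? Children[Children.Count - 1] : null; }
    }

    // True when 'other' is an ancestor of this node.
    public bool IsDescendantOf(SimpleNode other)
    {
        for (SimpleNode p = Parent; p != null; p = p.Parent)
            if (p == other) return true;
        return false;
    }
}

static class LoopDemo
{
    // Same index arithmetic as the posted loop, plus a step cap so the demo halts.
    public static bool OriginalLoopTerminates(List<SimpleNode> nodes, int maxSteps)
    {
        int steps = 0;
        for (int i = 0; i < nodes.Count; i++)
        {
            if (++steps > maxSteps) return false; // gave up: the loop keeps restarting
            SimpleNode node = nodes[i];
            if (node.IsComposite)
            {
                // IndexOf returns -1 when LastChild is not in the list, so i becomes 0.
                i = nodes.IndexOf(node.LastChild) + 1;
                continue;
            }
        }
        return true;
    }

    // Alternative: visit each entry once, then skip its descendants that were also collected.
    public static List<string> VisitTopLevel(List<SimpleNode> nodes)
    {
        var visited = new List<string>();
        int i = 0;
        while (i < nodes.Count)
        {
            SimpleNode node = nodes[i];
            visited.Add(node.Name);
            i++;
            while (i < nodes.Count && nodes[i].IsDescendantOf(node)) i++;
        }
        return visited;
    }

    // Pre-order list for a bookmark that starts in paragraph 1 and ends inside paragraph 2.
    public static List<SimpleNode> BuildSample()
    {
        var bookmarkStart = new SimpleNode("bookmarkStart");
        var run1 = new SimpleNode("run1");
        new SimpleNode("para1", bookmarkStart, run1); // sets the parents of its children
        var run2a = new SimpleNode("run2a");
        var bookmarkEnd = new SimpleNode("bookmarkEnd");
        var run2b = new SimpleNode("run2b");
        var para2 = new SimpleNode("para2", run2a, bookmarkEnd, run2b);
        // The end node itself and everything after it are excluded, as in ExtractContent1.
        return new List<SimpleNode> { bookmarkStart, run1, para2, run2a };
    }

    static void Main()
    {
        List<SimpleNode> nodes = BuildSample();
        Console.WriteLine(OriginalLoopTerminates(nodes, 1000));   // False
        Console.WriteLine(string.Join(", ", VisitTopLevel(nodes))); // bookmarkStart, run1, para2
    }
}
```

If this is indeed the cause, the same descendant-skipping idea could be tried in the real loop, though that has not been verified against Aspose.Words here. Note also that serializing a whole composite node would still include any content that follows the bookmark end inside that node, so the result may need trimming.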