Free Support Forum - aspose.com

Extract text inside TextBox in docx files

Hi, we have DOCX files whose content sits inside text boxes. When we convert them to HTML we get no text content, because the text boxes are converted to pictures. Is there a way to resolve this? We are using the latest version of Aspose.Words for .NET for the Word-to-HTML conversion.

@Shishir_Khadka,

Thanks for your inquiry. Please use the HtmlSaveOptions.ExportTextBoxAsSvg property to save text boxes as SVG. When this property is set to true, Aspose.Words exports text boxes as inline <svg> elements. When it is false, text boxes are exported as <img> elements.
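As a minimal sketch of the setting described above (the file names input.docx and output.html are placeholders, not taken from this thread), the option can be passed to Document.Save like this:

```csharp
using Aspose.Words;
using Aspose.Words.Saving;

class SaveTextBoxesAsSvg
{
    static void Main()
    {
        // Placeholder input path; replace with your own document.
        Document doc = new Document("input.docx");

        // Export text boxes as inline <svg> elements instead of <img> elements.
        HtmlSaveOptions options = new HtmlSaveOptions();
        options.ExportTextBoxAsSvg = true;

        // Placeholder output path.
        doc.Save("output.html", options);
    }
}
```

This only changes how the text boxes are rendered in the HTML output; it assumes the Aspose.Words for .NET package is referenced by the project.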

Hi @tahir.manzoor,
I have attached a sample Word document with a text box, along with the expected output. We tried setting ExportTextBoxAsSvg to true, but we want to extract the text from the text box, which we are not able to do from the SVG element.
Please let me know if there is a way to extract the text from the SVG, or directly from the Word document without converting it to HTML.
text_box.zip (65.0 KB)

@Shishir_Khadka,

Thanks for sharing the details. Please use the following code example to get the text of a Shape node (text box). Hope this helps you.

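// Requires: using Aspose.Words; using Aspose.Words.Drawing; using Aspose.Words.Saving;
// MyDir is the folder that contains the input document.
Document doc = new Document(MyDir + "sample_doc_txt_box.docx");
// Get the first Shape node in the document (the text box).
Shape shape = (Shape)doc.GetChild(NodeType.Shape, 0, true);

// Convert the shape's content to plain text.
TxtSaveOptions options = new TxtSaveOptions();
options.ExportHeadersFooters = false;
string txt = shape.ToString(options);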
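
The snippet above reads only the first shape. If a document contains several text boxes, the same idea can be extended to all of them. The following is a rough sketch, not code from this thread: the class name and file path are hypothetical, and it assumes every text box of interest is exposed as a Shape node whose ShapeType is TextBox.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Drawing;

class TextBoxTextExtractor
{
    static void Main()
    {
        // Hypothetical path; point this at your own document.
        Document doc = new Document("sample_doc_txt_box.docx");

        // Walk every Shape node in the document, including nested ones.
        foreach (Shape shape in doc.GetChildNodes(NodeType.Shape, true))
        {
            // Skip pictures and other shapes that are not text boxes.
            if (shape.ShapeType != ShapeType.TextBox)
                continue;

            // Convert the shape's content to plain text.
            string text = shape.ToString(SaveFormat.Text).Trim();
            Console.WriteLine(text);
        }
    }
}
```

Node.ToString(SaveFormat.Text) is used here instead of TxtSaveOptions so that the sketch does not depend on version-specific save options; remove the ShapeType check if other shapes in your documents also carry text.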