Run.toTxt() throws an exception

krvss · February 13, 2012, 7:52am

I get this problem on every document I process, but you can reproduce it with one of the sample files from your Java package. For this post I used SalesInvoiceDemo.doc (docx documents fail too).

Write a simple tree walk that visits each node of the document and calls toTxt() if the node is a Run, like this:

public void traverseAllNodes(CompositeNode parentNode) throws Exception
{
    for (Node childNode = parentNode.getFirstChild(); childNode != null; childNode = childNode.getNextSibling())
    {
        Run run = (Run) childNode;
        String runStr = run.toTxt();
        if (childNode.isComposite())
            traverseAllNodes((CompositeNode)childNode);
    }
}

Calling run.toTxt() will eventually (sometimes on the first iteration) throw an exception:

java.util.EmptyStackException
at java.util.Stack.peek(Stack.java:85)
at com.aspose.words.awm.visitRun(TxtWriter.java: 109)
at com.aspose.words.Run.accept(Run.java: 89)
at com.aspose.words.awm.v(TxtWriter.java: 68)
at com.aspose.words.Document.a(Document.java: 1378)
at com.aspose.words.Node.toTxt(Node.java: 588)

For the document I've mentioned, the exception is thrown at the run with the text "PAGE \* MERGEFORMAT".

The strange thing is that Document.toTxt() works fine.

Aspose.Words version: 11.0.0

tahir.manzoor · February 13, 2012, 11:30am

Hi Stanislav,

Thanks for your query. It would be great if you could share what you want to achieve with Aspose.Words. The code you shared will not work because you are casting each childNode to Run, which is incorrect. Please see the different types of nodes in the document explorer.
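
The cast problem described above is usually avoided by checking the node type before casting. What follows is only a minimal sketch of that pattern: SketchNode, SketchRun, and RunCollector are hypothetical stand-ins invented for illustration, not Aspose.Words classes, so the names and fields are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a document node; NOT an Aspose.Words class.
class SketchNode {
    final List<SketchNode> children = new ArrayList<>();

    SketchNode add(SketchNode child) {
        children.add(child);
        return this;
    }
}

// Hypothetical leaf node carrying text, playing the role of a Run.
class SketchRun extends SketchNode {
    final String text;

    SketchRun(String text) {
        this.text = text;
    }
}

final class RunCollector {
    // Walks the tree depth-first and collects text only from nodes that really are runs.
    static void collectRunText(SketchNode parent, List<String> out) {
        for (SketchNode child : parent.children) {
            if (child instanceof SketchRun) {
                // The cast happens only after the type check, so it cannot fail.
                out.add(((SketchRun) child).text);
            }
            if (!child.children.isEmpty()) {
                // Recurse into nodes that have children of their own.
                collectRunText(child, out);
            }
        }
    }
}
```

With the real library the same shape would apply, with the type check done on whatever node-type test the API offers before casting to Run.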

alexey.noskov · February 13, 2012, 2:51pm

Hi
Thanks for your request. Yes, we are aware of this issue. We will let you know once this problem is resolved.
You can use Run.getText() instead of Run.toTxt() while you are waiting for a fix.
Best regards,

krvss · February 13, 2012, 3:39pm

Tahir,

Yes, I know this exact code is not going to work, but it is not my actual code, just an example to show you how to reproduce the problem.

Thanks,
Stanislav

krvss · February 13, 2012, 3:41pm

Thanks Alexey!

I’d like to use toTxt() to get the run text without control characters, but for now I have a workaround: I use getText() and strip them with a regex.
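
For anyone wanting to do the same, here is a minimal sketch of that kind of workaround. The exact regex is not shown in this thread, so the pattern below is an assumption: it strips every ASCII control character, and RunTextCleaner and stripControlChars are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

final class RunTextCleaner {
    // \p{Cntrl} matches ASCII control characters (0x00-0x1F and 0x7F).
    // Assumption: all of them are unwanted, including tabs and line breaks.
    private static final Pattern CONTROL_CHARS = Pattern.compile("\\p{Cntrl}");

    // Returns the given text with every control character removed.
    static String stripControlChars(String runText) {
        return CONTROL_CHARS.matcher(runText).replaceAll("");
    }
}
```

It would be called on the result of getText(), for example RunTextCleaner.stripControlChars(run.getText()). If tabs or line breaks need to survive, the character class would have to be narrowed accordingly.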

adam.skelton · February 13, 2012, 11:27pm

Hi there,

Thanks for this additional information.

Normally Run nodes don’t contain any control characters. There are a few exceptions, such as the line break character, but from memory these also appear when using ToTxt.

Thanks,

aspose.notifier · March 1, 2012, 12:54am

The issues you have found earlier (filed as WORDSNET-4812) have been fixed in this .NET update and this Java update.

This message was posted using Notification2Forum from Downloads module by aspose.notifier.