How does it work

Dear forum,

Some time ago we bought Aspose for .NET, but we are seeing some rather strange behaviour.

We use this library to read a Word document and save some of its content in a database table.

It all appeared to work correctly, but then I found a Word document that caused errors. When we compared the two documents in Word, we saw no difference.

While debugging I saw that one document had my text in a single run inside a paragraph, but the other had multiple runs in a paragraph. I thought a run was just one line of text, but this varies: sometimes it is a whole line and sometimes it is just a word or a couple of words. I don’t see any difference or special characters in Word.

In the debugger I also see that one document uses \r\n for every line and the other uses only \r.

What i also find strange is that a line like:

[Ict distributeur]

is split into two Runs. Run 1 is [I and Run 2 is ct distributeur]

I hope you understand my problem.

With kind regards,

Hans Jacobs

Communited

Hi,

Thanks for your request. Runs in MS Word documents represent text with different formatting. However, multiple Runs can also represent text with the same formatting. Usually this occurs when a document has been edited multiple times in MS Word.

There is a JoinRunsWithSameFormatting method, which concatenates runs with the same formatting. You can try calling this method.
https://reference.aspose.com/words/net/aspose.words/document/joinrunswithsameformatting/

Best regards.

You mean this?

Document doc = new Document(fullFilePath);
// JoinRunsWithSameFormatting returns the number of joins performed.
int joined = doc.JoinRunsWithSameFormatting();
foreach (Paragraph par in doc.Sections[0].Body.Paragraphs)
{
    // your code
}

This returns 0, so no runs are joined.

Hi

Thanks for your request. Could you please attach your document here for testing? I will check the issue and provide you more information.

Best regards.

I added both files.

We developed our code with Actebis in mind but then we saw that Carta works differently.

With kind regards,

Hans Jacobs
Communited

Hi

Thanks for your inquiry. You are right, in the document “teksten tp Carta.doc” there are no Runs to concatenate. However, in the document “Actebis.doc” there are a lot of Runs that can be concatenated. Here is the code I used for testing:

Document doc = new Document(@"Test001\Actebis.doc");
int joined = doc.JoinRunsWithSameFormatting();
doc.Save(@"Test001\out.doc");
Console.WriteLine(joined); // returns 105.

Also, you can see the difference if you open both documents in DocumentExplorer (an Aspose.Words demo application). I attached a screenshot to show you what I mean.

Best regards.

I think this is not a solution for us. I will look into it later. You will hear from me again.

Hi

Could you please explain your requirements? How do you use Runs in your application? Why is it so important for you that the text is represented by only one Run?

Maybe there is another way to achieve what you need without using Runs. If so, I will be glad to help you find the correct approach.

Best regards.

Here i am again!

I just analyzed the difference between the two documents.

The Actebis document certainly has runs that could be concatenated, but that was not the problem. Actebis is the document our code was built for. It has a paragraph for each carriage return, so these are the real paragraphs.

e.g:

[Ict distributeur] (searchword)

Actebis Computers B.V. is een ICT distributeur voor de Nederlandse, Consumer Electronics- en retailmarkt. Met ongeveer 70 medewerkers bedient Actebis de wederverkopers vanuit de sales- & marketingorganisatie in Nieuwegein en haar magazijn in Utrecht. Sinds 1994 kan men bij Actebis terecht voor onder andere componenten, complete pc’s, netwerk- en retailproducten. Samen met Actebis Duitsland, Frankrijk, Denemarken, Noorwegen, Zweden en Oostenrijk vormt ze de Actebis Groep.

Your Partner for Success

These are 3 paragraphs plus 3 paragraphs for the empty lines.

In Carta, paragraph[1] contains the complete text of one searchword.

But in paragraph[4] I expected the complete text of searchword two. Instead, that text is partly in paragraph 4 and continues in paragraphs 5, 6 and so on.

PS: the text between brackets is the searchword.

Hi

Thank you for the additional information. I suppose you have confused Paragraph and Run. Please see the Aspose.Words DOM documentation for the difference:
https://docs.aspose.com/words/net/aspose-words-document-object-model/

Runs represent pieces of text inside paragraphs. Runs are used because Paragraphs can contain text with different formatting.
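
To make the distinction concrete, here is a minimal sketch that prints the number of runs in each paragraph next to the paragraph's full text. It assumes the Aspose.Words for .NET package is referenced, and the file name is only a placeholder. The point is that the paragraph text comes out the same however Word happened to split it into runs.

```csharp
using System;
using Aspose.Words;

class RunsVersusParagraphs
{
    static void Main()
    {
        // "Actebis.doc" is a placeholder path; use your own document.
        Document doc = new Document("Actebis.doc");

        foreach (Paragraph par in doc.FirstSection.Body.Paragraphs)
        {
            // A paragraph may hold one run or many, depending on how it was edited.
            int runCount = par.Runs.Count;

            // GetText() returns the text of all child nodes plus the paragraph's
            // end character, so trim it before use.
            string text = par.GetText().Trim();

            Console.WriteLine("{0} run(s): {1}", runCount, text);
        }
    }
}
```

If the code only needs the text, reading it at paragraph level like this avoids depending on how many runs a paragraph contains.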

Best regards.

I’m now using a regular expression on the complete text. It looks like it is working!
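
The actual expression was not posted, so the following is only a sketch of how such an approach might look. It assumes each searchword is written between square brackets and that its text runs until the next opening bracket or the end of the document. The class name SearchwordParser and the pattern are hypothetical, and the pattern would cut a text short if the text itself contained a "[".

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class SearchwordParser
{
    // "[searchword]" followed by everything up to the next "[" or the end of the text.
    private static readonly Regex EntryPattern =
        new Regex(@"\[(?<key>[^\]]+)\]\s*(?<body>[^\[]*)");

    public static Dictionary<string, string> Parse(string documentText)
    {
        var entries = new Dictionary<string, string>();
        foreach (Match m in EntryPattern.Matches(documentText))
        {
            // If a searchword occurs twice, the later text overwrites the earlier one.
            entries[m.Groups["key"].Value.Trim()] = m.Groups["body"].Value.Trim();
        }
        return entries;
    }
}
```

With Aspose.Words the input could be the result of doc.GetText(), which returns the whole document as one string, so it no longer matters whether a searchword's text spans one paragraph or several.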

But I see another problem. The output contains “This document was truncated here because it was created using Aspose.Words in Evaluation Mode.” We have the registered version, though. How can we fix this?

Hi

Thanks for your inquiry. Please make sure that you have applied the license as described here:
https://docs.aspose.com/words/net/licensing/

Also, please check the points described in the “Licensing” section of the FAQ:
https://forum.aspose.com/t/2711
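
As a rough illustration of the first link, the license is typically applied once at startup, before any Document is created. The sketch below assumes the license file is named Aspose.Words.lic and can be found by the application; both file names are placeholders.

```csharp
using System;
using Aspose.Words;

class ApplyLicense
{
    static void Main()
    {
        // Apply the license once, before creating any Document.
        // "Aspose.Words.lic" is a placeholder for your own license file.
        License license = new License();
        license.SetLicense("Aspose.Words.lic");

        // "Actebis.doc" is a placeholder path.
        Document doc = new Document("Actebis.doc");
        Console.WriteLine(doc.GetText());
    }
}
```

If the evaluation message still appears, it is worth checking that this code really runs before the first document is loaded.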

Hope this helps.
Best regards,

Thank you very much!

With kind regards,

Hans Jacobs
Communited