Some time ago we bought Aspose.NET, but we are seeing some rather strange behavior.
We use this library to read a Word document and save some of its content in a database table.
It all seemed to work correctly, but then I found a Word document that caused errors. When we compared the two Word documents in Word, we saw no difference.
While debugging, I saw that one document had my text in a single run inside a paragraph, while the other had multiple runs per paragraph. I thought a run was just one line of text, but it varies: sometimes it is a whole line, and sometimes it is just a word or a couple of words. I don’t see any difference or special characters in Word.
In the debugger I also see that one document uses \r\n for every line and the other uses only \r.
What I also find strange is that a line like:
[Ict distributeur]
is split into two Runs: Run 1 is "[I" and Run 2 is "ct distributeur]".
Thanks for your request. Runs in MS Word documents represent text with different formatting. However, multiple Runs can also represent text with the same formatting. Usually this occurs when you edit a document multiple times in MS Word.
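To illustrate why the split into Runs need not matter when only the text is needed, here is a minimal sketch in plain C#. It does not use the Aspose.Words API: the strings stand in for the text of each Run in a paragraph, and `RunText`, `JoinRuns` and `ParagraphText` are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;

public static class RunText
{
    // Concatenates the text fragments of one paragraph in document order.
    // Each fragment stands in for the text of one Run (assumption: the
    // caller has already collected the Run texts as plain strings).
    public static string JoinRuns(IEnumerable<string> runTexts)
    {
        return string.Concat(runTexts);
    }

    // Same as JoinRuns, but also strips a trailing paragraph break,
    // whether it was stored as "\r\n" or as a single "\r".
    public static string ParagraphText(IEnumerable<string> runTexts)
    {
        return JoinRuns(runTexts).TrimEnd('\r', '\n');
    }
}
```

With the fragments from the example above, `RunText.JoinRuns(new[] { "[I", "ct distributeur]" })` gives back the original line, so code that compares or searches the joined text does not depend on how Word split it.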
Thanks for your inquiry. You are right, in the document “teksten tp Carta.doc” there are no Runs to concatenate. However, in the document “Actebis.doc” there are a lot of Runs that can be concatenated. Here is the code I used for testing:
Document doc = new Document(@"Test001\Actebis.doc");
int joined = doc.JoinRunsWithSameFormatting();
doc.Save(@"Test001\out.doc");
Console.WriteLine(joined); // number of joins performed; prints 105 for this document
Also, you can see the difference if you open both documents in DocumentExplorer (the Aspose.Words demo application). I attached a screenshot to show you what I mean.
Could you please explain your requirements? How do you use Runs in your application? Why is it so important for you that the text is represented by only one Run?
Maybe there is another way to achieve what you need without using Runs. If so, I will be glad to help you find the right way to do it.
I just analyzed the difference between the two documents.
The Actebis document certainly has runs that could be concatenated, but that was not the problem. Actebis is the document our code was built for. It has a paragraph for each carriage return, so these are real paragraphs.
E.g.:
[Ict distributeur] (searchword)
Actebis Computers B.V. is een ICT distributeur voor de Nederlandse, Consumer Electronics- en retailmarkt. Met ongeveer 70 medewerkers bedient Actebis de wederverkopers vanuit de sales- & marketingorganisatie in Nieuwegein en haar magazijn in Utrecht. Sinds 1994 kan men bij Actebis terecht voor onder andere componenten, complete pc’s, netwerk- en retailproducten. Samen met Actebis Duitsland, Frankrijk, Denemarken, Noorwegen, Zweden en Oostenrijk vormt ze de Actebis Groep.
Your Partner for Success
These are 3 paragraphs, plus 3 paragraphs for the empty lines.
In Carta, paragraph[1] contains the complete text of one search word.
But in paragraph[4] I expected the complete text of search word two. Instead, that text is partly in paragraph 4 and partly in paragraphs 5, 6, etc.
I’m using a regular expression on the complete text now. It looks like it is working!
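For anyone with the same problem, this is roughly what the regular expression approach can look like. It is only a sketch in plain C#, not our production code: it assumes each search word sits in square brackets at the start of a line and that its text runs until the next such line. The names `SearchWordParser` and `Parse` are made up for this example. Line breaks are normalized first, because one document used \r\n and the other only \r.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SearchWordParser
{
    // Assumed layout: "[search word]" at the start of a line, followed by
    // body text up to the next line that starts with "[" or the end of text.
    private static readonly Regex EntryPattern = new Regex(
        @"^\[(?<word>[^\]\n]+)\](?<body>.*?)(?=^\[|\z)",
        RegexOptions.Multiline | RegexOptions.Singleline);

    // Maps each search word to its body text, independent of how the
    // text was divided into paragraphs or runs.
    public static Dictionary<string, string> Parse(string completeText)
    {
        // Normalize "\r\n" and bare "\r" to "\n" so both documents look alike.
        string text = completeText.Replace("\r\n", "\n").Replace("\r", "\n");

        var result = new Dictionary<string, string>();
        foreach (Match m in EntryPattern.Matches(text))
        {
            // Later duplicates overwrite earlier ones (a choice made for this sketch).
            result[m.Groups["word"].Value.Trim()] = m.Groups["body"].Value.Trim();
        }
        return result;
    }
}
```

The input would be the complete text of the document as one string. A body line that itself starts with "[" would be treated as a new search word, so this only fits documents that follow the assumed layout.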
But I see another problem. The output contains “This document was truncated here because it was created using Aspose.Words in Evaluation Mode.” But we have the registered version. How can we change this?