Hi Team,
I need to extract all formatted content from a Word document that has track changes.
I am able to extract the revisions, but they do not return a description of the formatting, such as bold, italic, strikethrough…
I also need to get the page number from the code below.
Hi Tahir,
I have a question. I need to extract only the formatted content in a Word document, along with the track changes.
How can I extract it, for example whether the font was bold, italic, or highlighted?
I am able to extract the revisions, but they do not return the page number or the formatting type…
Please assist.
rev.getGroup().getText() does not exist.
The Revision object does not have a method named ‘getGroup()’.
In addition, I need to know what kind of formatting change was made, such as bold, italic, strikethrough…
Hi, please respond.
I need assistance on this.
I see there are classes like ParagraphFormat that can be used to check the format.
It would be better if it could return what type of changes were made during track changes.
Also, this line of code is consuming excessive time: LayoutCollector collector = new LayoutCollector(doc);
System.out.println(collector.getStartPageIndex(run));
It takes close to one second per iteration, that is, one second per revision, so 60 revisions would take a minute. I have around 40 revisions, which is
Thanks for the update. I already figured out that the instance should be placed outside the loop. My concern now is extracting what type of formatting was applied to the revised text.
a) I am unable to fetch it. I need to know what kind of formatting change was made, such as bold, italic, or strikethrough.
b) rev.getGroup().getText() does not exist.
The Revision object does not have a method named ‘getGroup()’.
c) The code below sometimes returns a wrong page number, one for which no page exists. How can this be resolved?
LayoutCollector collector = new LayoutCollector(doc);
System.out.println(collector.getStartPageIndex(run));
d) How do I use LayoutEnumerator to get the page index? I tried it and it always returned ‘1’. Please assist.
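On the earlier point about the slow loop: the fix mentioned above (creating the instance once, outside the loop) can be sketched in plain Java as follows. This is only an illustration; CostlyPageLookup is a hypothetical stand-in for an object that is expensive to construct, not an Aspose class, and its pageOf method is a toy lookup rather than real layout logic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an object that is expensive to construct
// (the role LayoutCollector plays in this thread). Not an Aspose class.
final class CostlyPageLookup {
    static int constructions = 0; // counts how often the expensive setup runs

    private final List<String> items;

    CostlyPageLookup(List<String> items) {
        constructions++;
        this.items = new ArrayList<>(items); // pretend this is the slow setup step
    }

    // Toy lookup: 1-based position of the item, or 0 when it is unknown.
    int pageOf(String item) {
        return items.indexOf(item) + 1;
    }
}

final class HoistOutOfLoop {
    // Builds the lookup once, then reuses it for every item in the loop.
    static List<Integer> pagesFor(List<String> items) {
        CostlyPageLookup lookup = new CostlyPageLookup(items); // outside the loop
        List<Integer> pages = new ArrayList<>();
        for (String item : items) {
            pages.add(lookup.pageOf(item));
        }
        return pages;
    }
}
```

With this shape the expensive constructor runs once per document instead of once per revision, which is where the per-iteration second was going.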
Hi, are you there? Please assist.
a) The page index is wrong.
b) How can I get the type of format change that was applied to a piece of content?
Hi Team,
I have the latest JAR.
I have a major request.
While extracting the track changes from the Revision objects, there are multiple entries for the same change.
I would like these to be consolidated, as they look like duplicate entries.
For instance, the document we are working on shows 2900 revisions in the Reviewing Pane;
however, Aspose extracts around 4000 entries against those 2900.
As I understand it, every single save action is being treated as a tracked change, hence Aspose returns a count of 4000.
I need logic to consolidate the revisions so that the count excludes duplicates.
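One possible shape for such consolidation logic, sketched in plain Java: merge consecutive entries that share the same author and revision type into a single entry. RevisionEntry is a hypothetical data holder filled from whatever was extracted, not an Aspose type, and the assumption that adjacent entries with the same author and type belong to one change may not match how MS Word counts revisions in every case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical plain-data view of one extracted revision; not an Aspose type.
final class RevisionEntry {
    final String author;
    final String type; // e.g. "INSERTION", "DELETION", "FORMAT_CHANGE"
    final String text;

    RevisionEntry(String author, String type, String text) {
        this.author = author;
        this.type = type;
        this.text = text;
    }
}

final class RevisionConsolidator {
    // Merges each run of consecutive entries that share author and type
    // into a single entry whose text is the concatenation of the run.
    static List<RevisionEntry> mergeAdjacent(List<RevisionEntry> entries) {
        List<RevisionEntry> merged = new ArrayList<>();
        for (RevisionEntry e : entries) {
            int last = merged.size() - 1;
            if (last >= 0
                    && merged.get(last).author.equals(e.author)
                    && merged.get(last).type.equals(e.type)) {
                RevisionEntry prev = merged.get(last);
                merged.set(last, new RevisionEntry(prev.author, prev.type, prev.text + e.text));
            } else {
                merged.add(e);
            }
        }
        return merged;
    }
}
```

For example, two adjacent insertions by the same author with the texts "Hello " and "world" collapse into one entry with the text "Hello world", while a following deletion stays separate.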
Hi Awais,
As reported earlier, LayoutCollector is returning wrong page numbers, including numbers for pages that do not exist.
It is not just returning a wrong number; it somehow returns a page number that does not exist in the document at all.
Please assist.
I took the latest JAR, aspose-words-19.7-jdk17.jar, and it still returns page numbers that do not exist.
Please ZIP and upload your input Word document (you are getting this problem with) here for testing. We will then investigate the issues on our end and provide you more information.
Hi Team,
I am afraid we will be unable to share the confidential document, as it belongs to our client.
There are 3 concerns:
1. Page number: it returns a wrong page number in a few instances, and returns a few page numbers that do not exist.
2. Track changes (revisions) are duplicated, or multiple entries for the same or similar changes are returned. Below is the code snippet: RevisionCollection revisionCollections = doc.getRevisions();
3. We need to consolidate these multiple entries into a single revision, as the extracted data is reviewed for the final edition of the document, and the multiple entries are causing issues in reaching a final conclusion on the document.
Please assist on how to consolidate revisions with multiple identical entries into a single entry. The Reviewing Pane shows a smaller number of revisions, whereas extracting from the revision collection returns about 1.5 times as many.
Please assist on how to get through this.
Regarding 1 & 2, as requested earlier, please ZIP and upload your input Word document (you are getting this problem with) here for testing. Unfortunately, it is difficult to say what the problem is without the document. We need your document to reproduce the problem on our end. Please note that it is safe to attach files in the forum. If you attach your document here, only you and Aspose staff members can download it. You can also remove any sensitive information by replacing it with dummy data instead.
Regarding 3, please provide the following resources here for testing:
Your simplified input Word document
Aspose.Words 19.7 generated output document showing the undesired behavior
Your expected document showing the correct output. You can create expected document by using MS Word. Please also list the steps that you performed in MS Word to create the expected document.
As soon as you get these pieces of information ready, we will start investigation into your issue and provide you more information. Thanks for your cooperation.
Hi Team, I am attaching the document herewith. Kindly assist.
I am pretty sure Aspose is returning duplicates when extracting revisions. On further investigation, most of the duplicates appear to be caused by Run nodes. That is, I checked the parent node of each revision, and for every revision entry whose parent node is a Run, there is another entry without a parent node. Even after filtering out all revisions with a Run as the parent node, I still see duplicates.
The page numbers are also incorrect for documents with a larger number of pages, beyond 100.
The attached document shows 29 revisions; however, Aspose returns 206 revisions. On excluding Runs, it returns 96 revisions. Why is there this much difference?
Further, the page number issue could not be reproduced with the dummy document I am sharing. Maybe you could try it at your end.
JavaIntro-CommentsAndTC.zip (187.3 KB)
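To see where the extra entries come from, the parent-node check described above could be turned into a simple tally. The sketch below is plain Java over a list of labels; how each label is obtained from a revision is left out, and the label values ("RUN", "PARAGRAPH") and the "NONE" bucket for entries without a parent node are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RevisionTally {
    // Counts how many entries fall under each parent-node label.
    // A null label (no parent node) is counted under "NONE".
    static Map<String, Integer> countByParent(List<String> parentLabels) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String label : parentLabels) {
            String key = (label == null) ? "NONE" : label;
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

Comparing these per-label counts with the number shown in the Reviewing Pane may help narrow down which kind of entry accounts for the difference.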
MS Word 2019 says that there are 8 pages and 16 revision groups in your Word document. The following Aspose.Words for Java 19.7 code returns 252 revisions, 13 revision groups and 10 pages.
Regarding WORDSNET-18983, it seems you are expecting Aspose.Words layout to match “Simple Markup” MS Word review option. The attached screenshot in my previous post shows that “Simple Markup” is chosen in MS Word.
This option is not stored in the document. The option is a viewing option in MS Word and it may affect the document layout and the number of pages in the document as displayed in MS Word. This is not the default reviewing option in MS Word, so the default Aspose.Words layout options do not match it.
In order to emulate MS Word simple markup in the Aspose.Words layout, the following options should be set on Document.LayoutOptions.RevisionOptions before updating the layout or requesting the page count:
import com.aspose.words.Document;
// Configure the revision options before the layout is built (getPageCount triggers it).
Document doc = new Document("E:\\Temp\\JavaIntro-CommentsAndTC.docx");
doc.getLayoutOptions().getRevisionOptions().setShowRevisionMarks(false);
doc.getLayoutOptions().getRevisionOptions().setShowRevisionBars(true);
doc.getLayoutOptions().getRevisionOptions().setShowOriginalRevision(false);
System.out.println(doc.getPageCount()); // shows page count as 8
So, please use the layout options to emulate MS Word simple markup. Hope this helps.