We are trying to extract comments from a Word document.
The document has around 15 comments, but the code below returns only 8. Can someone correct the code so that all 15 comments are retrieved?
comment_nodes = doc.get_child_nodes(aw.NodeType.COMMENT, True)
comment_data = []
for node in comment_nodes:
    comment = node.as_comment()
    if comment is None:
        continue
    parent_para = comment.get_ancestor(aw.Paragraph)
    if parent_para is None:
        continue
@tesapsoe
Cause
Your code collects the comment nodes correctly, but only a subset of them reaches your results. The likely reason is the parent-paragraph check: any comment for which the ancestor lookup returns None is skipped. This can happen when a comment is not directly associated with a paragraph in the main body of the document, or when the comments are structured in an unusual way.
Solution
To ensure you retrieve all comments, you can simplify your code by directly collecting all comments without checking for their parent paragraphs. Here’s an updated version of your code that should help you extract all comments:
Code
import aspose.words as aw

# Assuming 'doc' is your Document object
comment_nodes = doc.get_child_nodes(aw.NodeType.COMMENT, True)
comment_data = []
for node in comment_nodes:
    comment = node.as_comment()
    if comment is not None:
        # In the Python API, author and date_time are properties, not getter methods
        comment_data.append({
            'author': comment.author,
            'date': comment.date_time,
            'text': comment.to_string(aw.SaveFormat.TEXT).strip()
        })

# Print all comments
for comment in comment_data:
    print(f"Author: {comment['author']}, Date: {comment['date']}, Comment: {comment['text']}")
This code collects all comments without filtering based on their parent paragraphs, which should help you retrieve all 15 comments.
Next Steps
- Run the updated code to see if it retrieves all comments.
- If you still encounter issues, check if the comments are properly formatted in the Word document.
- Ensure that the document is not corrupted and that all comments are visible in the Word application.
Please proceed with these steps and let me know if you need further assistance!
Thanks for the code.
We are looking for the comments together with the paragraph(s) they are written on,
so we need output of this sort:
comments:comments_text
author:author_name
para:paragraph/s.text
@tesapsoe Could you please provide the input file you are using to process the comments?
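In the meantime, here is a minimal sketch of the requested comments/author/para output. It is not verified against your file. It assumes that the paragraph holding each comment node is the one you want, so a comment whose range spans several paragraphs would only report that one paragraph. The function names collect_comments_with_paragraphs and format_record are illustrative helpers, not part of Aspose.Words. Comments without a parent paragraph are kept with an empty para value instead of being skipped, so none are dropped.

```python
def collect_comments_with_paragraphs(doc):
    """Return one record per comment, paired with the text of its anchor paragraph."""
    # Imported inside the function so format_record below can be used without Aspose.Words.
    import aspose.words as aw

    records = []
    for node in doc.get_child_nodes(aw.NodeType.COMMENT, True):
        comment = node.as_comment()
        if comment is None:
            continue
        # Assumption: the paragraph containing the comment node is the commented paragraph.
        para = comment.get_ancestor(aw.NodeType.PARAGRAPH)
        records.append({
            "comments": comment.to_string(aw.SaveFormat.TEXT).strip(),
            "author": comment.author,
            # Keep the comment even when no paragraph is found.
            "para": para.to_string(aw.SaveFormat.TEXT).strip() if para is not None else "",
        })
    return records


def format_record(record):
    """Render one record in the comments/author/para layout requested above."""
    return "\n".join(f"{key}:{record[key]}" for key in ("comments", "author", "para"))
```

Usage would look like this, with "input.docx" standing in for your own file:

```python
import aspose.words as aw

doc = aw.Document("input.docx")  # placeholder file name
for record in collect_comments_with_paragraphs(doc):
    print(format_record(record))
    print()
```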