I am currently working on a project where I need to convert Word documents (doc/docx) to Markdown using Python and Aspose. However, I am facing an issue with extracting the coordinates of images within the document. By default, when converting a Word document to Markdown, images are converted to a format like ![](res_md/Images/filename.001.png).
Is there a way to extract the coordinates of these images? For instance, if I split a document into pages and save them as PNG images and individual doc/docx files, can I retrieve the coordinates of the images within those pages?
I would appreciate any guidance or suggestions on how to achieve this. Thanks in advance!
To extract the coordinates of images when converting Word documents to Markdown using Python and Aspose, you will need to follow a few steps, as the Aspose.Words library does not directly provide image coordinates during the Markdown conversion process.
1. Load the document: First, load your Word document using Aspose.Words.
2. Extract images: Iterate through the document’s shapes to find images. Each shape exposes its position and size, which can be used to determine the coordinates.
3. Convert to Markdown: After extracting the image coordinates, convert the document to Markdown format.
Here’s a code example demonstrating how to achieve this:
import aspose.words as aw
# Load the Word document
doc = aw.Document("your_document.docx")
# Prepare to store image coordinates
image_coordinates = []
# Iterate through all shapes in the document
for node in doc.get_child_nodes(aw.NodeType.SHAPE, True):
    # get_child_nodes returns generic nodes; cast each one to Shape first
    shape = node.as_shape()
    if shape.has_image:
        # left/top are offsets relative to the shape's anchor settings,
        # not absolute page coordinates (inline shapes usually report 0)
        image_coordinates.append({
            "x": shape.left,
            "y": shape.top,
            "width": shape.width,
            "height": shape.height,
        })
# Convert the document to Markdown
markdown_options = aw.saving.MarkdownSaveOptions()
doc.save("output.md", markdown_options)
# Output the image coordinates
for coord in image_coordinates:
    print(f"Image at (x: {coord['x']}, y: {coord['y']}, width: {coord['width']}, height: {coord['height']})")
Explanation:
The code loads a Word document and iterates through all shapes to find images.
For each image, it retrieves the position and dimensions reported by the shape and stores them in a list.
Finally, it saves the document in Markdown format.
This approach lets you extract the coordinates of images before or after converting the document to Markdown. If you need to split the document into pages and save them as PNG images, you can use the ImageSaveOptions class (with its page_set property) to render individual pages, but you will need to handle the image extraction separately.
If you have further questions or need additional assistance, feel free to ask!
import aspose.words as aw

doc = aw.Document("C:\\Temp\\in.docx")
# Use LayoutCollector and LayoutEnumerator to calculate coordinates of the shapes.
collector = aw.layout.LayoutCollector(doc)
enumerator = aw.layout.LayoutEnumerator(doc)
# Get the shapes.
shapes = doc.get_child_nodes(aw.NodeType.SHAPE, True)
for s in shapes:
    shape = s.as_shape()
    # Skip shapes from header/footer.
    if shape.get_ancestor(aw.NodeType.BODY) is None:
        continue
    # Process only top-level shapes.
    if not shape.is_top_level:
        continue
    enumerator.set_current(collector, shape)
    # Process only the first page.
    if enumerator.page_index > 1:
        break
    rect = enumerator.rectangle
    print(f"X={rect.left}; Y={rect.top}; Width={rect.width}; Height={rect.height}")
I tried your code and noticed one thing: when a single-page document is rendered to an image, the coordinates have to be scaled by a factor of about 1.333 before they line up with the image. Here is a sketch of how I draw them.
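from PIL import Image, ImageDraw

# 1.333 matches 96 / 72, i.e. points (1/72 inch) converted to pixels at 96 DPI
# (assuming the page image was rendered at 96 DPI).
SCALE = 96 / 72
img = Image.open(img_path)  # img_path: path to the rendered page PNG
draw = ImageDraw.Draw(img)
for line in coordinates:  # each line holds "x y w h" separated by spaces
    x, y, w, h = map(float, line.split())
    draw.rectangle([SCALE * x, SCALE * y, SCALE * (x + w), SCALE * (y + h)], outline="red", width=2)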
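The 1.333 factor is what you would expect if the layout rectangles are in points (1/72 inch) and the page image is rendered at 96 DPI. A minimal sketch of that conversion for an arbitrary resolution follows; the helper name points_to_pixels and the dpi parameter are made up for illustration, and the 96 DPI default is an assumption about how the page was rendered.

```python
def points_to_pixels(rect, dpi=96.0):
    """Convert an (x, y, width, height) rectangle from points to a pixel box.

    Assumes the input is in points (1/72 inch) and that the page image was
    rendered at `dpi`. Returns [left, top, right, bottom], the order that
    Pillow's ImageDraw.rectangle expects.
    """
    x, y, w, h = rect

    def to_px(value):
        # Multiply before dividing so common values convert exactly.
        return value * dpi / 72.0

    return [to_px(x), to_px(y), to_px(x + w), to_px(y + h)]


# A rectangle 1 inch from the left edge and 0.5 inch from the top,
# 2 inches wide and 1 inch tall.
box = points_to_pixels((72.0, 36.0, 144.0, 72.0))
print(box)
```

If the page is rendered at a different resolution, pass that value as dpi instead of hard-coding 1.33333.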
Lastly, I have one more question: if I want to convert a single-page PPT/PPTX to Markdown, how can I extract the coordinates of images within the PPT/PPTX? Would the code be similar to this one?
My goal is to embed these images directly into a Markdown document using their coordinates rather than inserting image paths. I would like to replace the image paths with the coordinates to achieve this.
Could you please provide guidance on how to map the coordinates obtained from your API to the Markdown text? Specifically, I need to understand how to use these coordinates to embed images within the Markdown content.
Your assistance in this matter would be greatly appreciated. @alexey.noskov
@kyrieqi I am afraid there is no direct way to map between images written to Markdown and shapes in the original MS Word document. Also, as far as I know, the Markdown syntax for images doesn’t allow you to specify the size and position of images.
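Since there is no direct mapping, any pairing has to be a heuristic done outside the library. The sketch below assumes (unverified) that image references appear in the Markdown in the same order as the rectangles were collected. It appends an HTML comment with the coordinates after each reference, because the image syntax itself has no place for them. The helper annotate_images and the comment format are hypothetical, not part of Aspose.Words.

```python
import re

# Matches Markdown image references such as ![](res_md/Images/filename.001.png).
IMAGE_REF = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")


def annotate_images(markdown, rects):
    """Append a coordinates comment after each image reference.

    Hypothetical helper: it pairs the n-th image reference with the n-th
    rectangle, which is only correct if both are in the same order.
    """
    remaining = iter(rects)

    def add_comment(match):
        rect = next(remaining, None)
        if rect is None:
            # More image references than rectangles: leave the rest untouched.
            return match.group(0)
        x, y, w, h = rect
        return f"{match.group(0)}<!-- x={x} y={y} w={w} h={h} -->"

    return IMAGE_REF.sub(add_comment, markdown)


text = "Intro\n\n![](res_md/Images/filename.001.png)\n\nOutro"
print(annotate_images(text, [(72.0, 90.5, 200.0, 150.0)]))
```

If the counts differ, or the document contains images in headers, footers, or groups, the order-based pairing will be off, so it is worth checking the result by eye.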
Thank you very much for your prompt and detailed response.
I will proceed with the implementation based on your recommendations and will reach out if I have any further questions.
Wishing you a pleasant day ahead.