Convert DOC DOCX to PDF using Java | Retain Japanese Text Vertical Direction Rotation | Get Characters Width Height

Dear asad.ali,

This is Shin at XLsoft.

The customer is trying to output PDF from a Word document using Aspose.Words for Java.

I have received five additional questions from the customer.
I apologize for the repeated requests, but could you please look into them?

The source code for verification and the verification result are attached.
2020-04-15_2225.zip (1.2 MB)

1. Is there a way to get the width and height of a character in Aspose.Words?
※ They want to get this information in the state where the font and character spacing have been applied.
→ Sample: No sample because the API is unknown

※ The intended result is not always returned.
→ Sample: ApplyTextScalingWithFixedWidth (Sample that applies horizontal ratio to the character string. It was set to fit within the width of the text box)

2. Is there a way to detect overflow (characters overflowing, or being cut off at the frame edge) in a text box?
※ They want to detect when the text box is full of characters.
When multiple text boxes are linked, the overflowing characters flow into the next text box, so they assume that there is at least an internal mechanism for this judgment.
→ Sample: WdCheckOverflow (text box that overflows)

→ Execution result: WdCheckOverflowOutput1.pdf

3. If a text box is placed and TextBoxWrapMode.NONE is specified, the width of the Shape is automatically expanded up to “page width minus the total of the left and right margins”.
However, if the characters do not fit within that width, they are wrapped.
Is there a way to prevent wrapping when the text does not fit?
→ Sample: WdApplyTextWrap (sample for text wrap settings)

→ Execution result: WdApplyTextWrapOutput2.pdf

4. When a character string is placed with ShapeType.TEXT_PLAIN_TEXT, some characters are not output. Is there a workaround? Is it possible to detect characters that are not output?

So far, the full-width characters “$ = ~ |” are affected; the same string is output correctly with ShapeType.TEXT_BOX.
Is there any limitation for ShapeType.TEXT_PLAIN_TEXT?
※ Characters other than the above may not be output depending on the specified font.
However, even with that font, they are output with ShapeType.TEXT_BOX.
→ Sample: WdOutlineText
→ Execution result: WdOutlineTextOutput1.pdf

5. When a text box is placed with ShapeType.TEXT_BOX and the text is made vertical, applying a horizontal ratio rotates only the full-width characters. Is there a workaround?
→ Sample: WdApplyTextOrientation (output result of process2)
→ Execution result: WdApplyTextOrientationOutput2.pdf

For vertical writing, can the ratio be applied in the vertical direction?

Please continue to help us.

Best regards, Shin

@xlsoftkk

Thanks for posting your inquiry.

Your inquiry is related to Aspose.Words, so it has been moved to the appropriate category, where you will be assisted accordingly.

Dear asad.ali,

This is Shin at XLsoft.

Thank you for moving the thread.
I would appreciate your continued help with this issue.

Best regards, Shin

@xlsoftkk,

Please also attach the source input Word documents (the ones these PDF files were generated from) and the expected PDF files showing the desired behavior here for further testing. You can create the expected PDF files by using MS Word on your end.

Generally, Aspose.Words mimics the behavior of MS Word, i.e. if you convert a Word file to PDF by using Aspose.Words, the output will look similar to what MS Word would have produced.

So, you do not have to adjust the content yourself. Aspose.Words should render it correctly when converting Word to PDF.

Dear asad.ali,

This is Shin at XLsoft.

Thank you for contacting us.
I will ask the customer. Please wait a moment.

Best regards, Shin

Dear awais.hafeez,

This is Shin at XLsoft.

I have obtained the expected PDF files showing the desired behavior.
Please refer to the following. ※ These were made with MS Word.

Question 1.
OutlineText.pdf (136.6 KB)

Question 2.
TextNoWrap.pdf (83.9 KB)

Question 3.
TextOrientation.pdf (80.2 KB)

Question 4.
expected.pdf (318.6 KB)

Question 5.
TextWidthAndHeight.pdf (83.1 KB)

However, there is no input Word document.
This is because the data is created using the Aspose library.

Could I ask for your support again?

Best regards, Shin

@xlsoftkk,

We are checking these scenarios and will get back to you soon.

Dear awais.hafeez,

This is Shin at XLsoft.

Thank you for following up.
We would appreciate your continued support.

Best regards, Shin

Dear awais.hafeez,

This is Shin at XLsoft.

I’m sorry to bother you while you are busy.
Has there been any progress on the matter you are investigating?

Best regards, Shin

@xlsoftkk,

One way to get character height and width is as follows.
Sample input Word document: Character Height Width Test.zip (9.4 KB)

// Requires: import com.aspose.words.*;
Document doc = new Document("E:\\temp\\Character Height Width Test.docx");
DocumentBuilder builder = new DocumentBuilder(doc);

// Step 1: split every Run so that each Run holds exactly one character.
Node[] runs = doc.getChildNodes(NodeType.RUN, true).toArray();
for (int i = 0; i < runs.length; i++) {
    Run run = (Run) runs[i];
    int length = run.getText().length();

    Run currentNode = run;
    for (int x = 1; x < length; x++) {
        currentNode = SplitRun(currentNode, 1);
    }
}

// Step 2: wrap each single-character Run in its own bookmark ("bm_0", "bm_1", ...).
NodeCollection smallRuns = doc.getChildNodes(NodeType.RUN, true);
for (int i = 0; i < smallRuns.getCount(); i++) {
    Run run = (Run) smallRuns.get(i);
    builder.moveTo(run);
    builder.startBookmark("bm_" + i);
    BookmarkEnd end = builder.endBookmark("bm_" + i);
    // Move the bookmark end so that it follows the Run it encloses.
    run.getParentNode().insertAfter(end, run);
}

// Step 3: build the page layout and read the rectangle of each bookmarked character.
LayoutCollector collector = new LayoutCollector(doc);
LayoutEnumerator enumerator = new LayoutEnumerator(doc);

double totalWidth = 0;
for (Bookmark bm : doc.getRange().getBookmarks()) {
    if (bm.getName().startsWith("bm_")) {
        // Position the enumerator on the bookmark start, then step to the next layout entity.
        enumerator.setCurrent(collector.getEntity(bm.getBookmarkStart()));
        enumerator.moveNext();

        String width = String.format("%.2f", enumerator.getRectangle().getWidth());
        String height = String.format("%.2f", enumerator.getRectangle().getHeight());

        System.out.println(enumerator.getText() + " has Width=" + width + " and Height=" + height);

        totalWidth += enumerator.getRectangle().getWidth();
    }
}

System.out.println("Total Width = " + totalWidth);

// Splits a Run at the given position: the original keeps the text before the position,
// and the returned clone (inserted right after it) holds the remainder.
private static Run SplitRun(Run run, int position) throws Exception {
    Run afterRun = (Run) run.deepClone(true);
    afterRun.setText(run.getText().substring(position));
    run.setText(run.getText().substring(0, position));
    run.getParentNode().insertAfter(afterRun, run);
    return afterRun;
}

Regarding 2 & 3, you can build logic on top of the above code to meet this requirement. You can compare the Shape’s width with the total width of the characters in a line to make the decision.
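The comparison suggested for 2 & 3 could be sketched roughly as follows. This is only a minimal illustration in plain Java, not part of the Aspose.Words API: the class OverflowCheck, its methods, and the availableWidth/availableHeight parameters are hypothetical names. It assumes that you have already collected one rectangle per character (for example from enumerator.getRectangle() in the code above), that characters on the same line share the same top edge, and that the available size is something you derive yourself, for instance from the Shape's width and height minus the text box's internal margins.

```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class OverflowCheck {

    // Sums character widths per line. Characters are assigned to a line by their
    // top edge, rounded to 0.01 pt (assumption: one common top edge per line).
    static Map<Long, Double> lineWidths(List<Rectangle2D> charRects) {
        Map<Long, Double> widths = new TreeMap<>();
        for (Rectangle2D r : charRects) {
            long lineKey = Math.round(r.getY() * 100);
            widths.merge(lineKey, r.getWidth(), Double::sum);
        }
        return widths;
    }

    // True if any line is wider than the available width.
    static boolean overflowsWidth(List<Rectangle2D> charRects, double availableWidth) {
        for (double w : lineWidths(charRects).values()) {
            if (w > availableWidth) {
                return true;
            }
        }
        return false;
    }

    // True if the span from the topmost to the bottommost character edge
    // is taller than the available height.
    static boolean overflowsHeight(List<Rectangle2D> charRects, double availableHeight) {
        if (charRects.isEmpty()) {
            return false;
        }
        double top = Double.MAX_VALUE;
        double bottom = -Double.MAX_VALUE;
        for (Rectangle2D r : charRects) {
            top = Math.min(top, r.getY());
            bottom = Math.max(bottom, r.getY() + r.getHeight());
        }
        return (bottom - top) > availableHeight;
    }

    public static void main(String[] args) {
        // Two characters on the first line, one on the second (made-up values, in points).
        List<Rectangle2D> rects = Arrays.asList(
                new Rectangle2D.Double(0, 0, 10, 12),
                new Rectangle2D.Double(10, 0, 10, 12),
                new Rectangle2D.Double(0, 12, 10, 12));

        System.out.println(overflowsWidth(rects, 15.0));  // true: first line is 20 pt wide
        System.out.println(overflowsHeight(rects, 24.0)); // false: text is exactly 24 pt tall
    }
}
```

Grouping by the top edge is a simplification; lines that mix font sizes would probably need a tolerance or a baseline-based grouping instead.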

Regarding 4 & 5, we will get back to you soon. Thanks for being patient.

Dear awais.hafeez,

This is Shin at XLsoft.

Thank you for your reply.
We will tell the customer about 1, 2 & 3.

Please let me know if there is an update regarding 4 & 5.

Best regards, Shin

Dear awais.hafeez,

This is Shin at XLsoft.

They were able to detect the overflow by using the answers to No. 1 and No. 2.

However, detection worked only when no character or paragraph attributes were applied, or when attributes that transform the characters themselves, such as Scaling, were applied. It did not work when the values of CharacterSpacing or LineSpacing were changed.
(Even if you adjust the character spacing and line spacing so that all characters fit within the frame, the reported width and height of the characters stay the same, so an overflow is still detected.)

Could you tell me how to get the width and height that reflect the values of attributes such as character spacing and line spacing?

Here is the information needed for the investigation.
AsposeForJavaQA-2020-05-15.zip (1.2 MB)


・sample code:src/pmx0035/2/WdCheckOverflow.java
・execution result:/actual/OverflowWdCheckOverflowOutput_200.0_100.0_LineSpacing10.0_2.pdf 、/actual/Overflow/WdCheckOverflowOutput_200.0_110.0_CharacterSpacing5.0_2.pdf
・expected value:expected/pdf/SampleComparePdfResult1.pdf


Thank you for your continued support.

Best regards, Shin

@xlsoftkk,

Regarding 4, after an initial test with the latest (20.5) version of Aspose.Words for Java, we were unable to reproduce this issue on our end. Please see the output PDF document that we generated on our end by using the following simple Java code:

Java code:

Document document = new Document();
DocumentBuilder builder = new DocumentBuilder(document);

Paragraph paragraph = (Paragraph) document.getFirstSection().getBody().appendChild(new Paragraph(document));
int joinStyle = JoinStyle.MITER;

// Create the shapes
createShape(document, builder, paragraph, Color.white, Color.orange, 10.0, 0, joinStyle);
createShape(document, builder, paragraph, Color.white, Color.black, 5.0, 1, joinStyle);
createShape(document, builder, paragraph, Color.cyan, Color.cyan, 0, 2, joinStyle);

document.save("E:\\temp\\2020-04-15_2225\\awjava-20.5.docx");
document.save("E:\\temp\\2020-04-15_2225\\awjava-20.5.pdf");

public static Shape createShape(Document document, DocumentBuilder builder, Paragraph paragraph,
                          Color fillColor, Color strokeColor, double strokeWeight, int zOrder, int joinStyle) throws Exception {

    String str = "WabABcde\nあいうえお\n漢字\n!\"#$%&'()=~|'";

    // Create the shape
//        Shape shape = builder.insertShape(ShapeType.TEXT_BOX, 150, 50);
    Shape shape = new Shape(document, ShapeType.TEXT_PLAIN_TEXT);
//        shape.setWrapType(WrapType.INLINE);
    shape.setAllowOverlap(true);
    paragraph.appendChild(shape);

    shape.setLeft(50.0);
    shape.setTop(50.0);
    shape.setWidth(150);
    shape.setHeight(50);

    shape.setFillColor(fillColor);
    shape.setStrokeColor(strokeColor);
    shape.setStrokeWeight(strokeWeight);
    shape.setZOrder(zOrder);

    com.aspose.words.Stroke stroke = shape.getStroke();
    stroke.setJoinStyle(joinStyle);

    TextPath textPath = shape.getTextPath();
    textPath.setSize(30);
    textPath.setText(str);
    textPath.setFontFamily("Meiryo UI Bold");

    return shape;
}

So, please upgrade to the latest 20.5 version of Aspose.Words for Java. Hope this helps.

@xlsoftkk,

We tested this scenario and have managed to reproduce the same problem on our end. For the sake of correction, we have logged this problem in our issue tracking system. The ID of this issue is WORDSNET-20452. We will further look into the details of this problem and will keep you updated on the status of correction. We apologize for your inconvenience.

We are also working on your most recent query and will get back to you soon.

Dear awais.hafeez,

This is Shin at XLsoft.

Thank you for your answer.
I will tell the trial user about 4 & 5.

Please contact me if there is an update.

Best regards, Shin

Dear awais.hafeez,

This is Shin at XLsoft.

The user contacted me again.
The user will continue to verify how Aspose behaves.

I would like to ask for your support with the question I sent several days ago: how to get the width and height that reflect the values of attributes such as character spacing and line spacing.

Best regards, Shin

@xlsoftkk,

Thanks for being patient. Can you please summarize the outstanding (unresolved) problem? Please provide again only the related source Word document (if any), Aspose.Words’ generated output document showing the undesired behavior, the expected document showing the desired output and related code file here for testing. Thanks for your cooperation.

Dear awais.hafeez,

This is Shin at XLsoft.

Thank you for your support.
Apart from question 5, the unanswered question is as follows.

They seemed to be able to detect the overflow using the answers they received for No. 1 and No. 2.

However, detection worked only when no character or paragraph attributes were applied, or when attributes that transform the characters themselves, such as Scaling, were applied. It did not work when the values of CharacterSpacing or LineSpacing were changed.
(Even if you adjust the character spacing and line spacing so that all characters fit within the frame, the reported width and height of the characters stay the same, so an overflow is still detected.)

Could you tell me how to get the width and height that reflect the values of attributes such as character spacing and line spacing?

Here is the information needed for the investigation.
AsposeForJavaQA-2020-05-15.zip (1.2 MB)


・sample code:src/pmx0035/2/WdCheckOverflow.java
・execution result:/actual/OverflowWdCheckOverflowOutput_200.0_100.0_LineSpacing10.0_2.pdf 、/actual/Overflow/WdCheckOverflowOutput_200.0_110.0_CharacterSpacing5.0_2.pdf
・expected value:expected/pdf/SampleComparePdfResult1.pdf


Thank you for your continued support.

Best regards, Shin

@xlsoftkk,

I am afraid we are unable to locate this file (SampleComparePdfResult1.pdf) in the “AsposeForJavaQA-2020-05-15.zip” archive. Please provide the expected PDF file here for our reference. Thanks for your cooperation.

Dear awais.hafeez,

This is Shin at XLsoft.

I’m sorry. I made a mistake.
“expected value:expected/pdf/SampleComparePdfResult1.pdf” was not included in the request from the customer. (There was no expected PDF.)

Would it be difficult to continue the investigation without the PDF?

Best regards, Shin