Java: Chinese text set with TextFragment.setText appears garbled

654280938 · December 30, 2018, 9:05am

// Here is my code
var pdfDocument = new Document(filePath);
ParagraphAbsorber absorber = new ParagraphAbsorber();
absorber.visit(pdfDocument);
for (PageMarkup markup : absorber.getPageMarkups()) {
    var list = markup.getTextFragments();
    for (TextFragment textFragment : list) {
        // Replace every fragment with the same Chinese sample string
        textFragment.setText("604年肝动脉解剖变异选择性腹腔");
    }
}
pdfDocument.save(filePath + "Translation_ok.pdf");
test.pdf (269.1 KB)
test.pdfTranslation_ok.pdf (374.3 KB)

Farhan.Raza · December 30, 2018, 7:45pm

@654280938

Thank you for contacting support.

We have checked the files you shared, and the problem appears to be font-related. Please note that if you use any font other than the 14 core fonts supported by Adobe Reader, then you must embed the font description when generating the PDF file. If the font information is not embedded, Adobe Reader will take the font from the operating system if it is installed there; otherwise it will construct a substitute font according to the font descriptor in the PDF.

You may specify the font while replacing text, as follows:

textFragment.getTextState().setFont(FontRepository.findFont("XYZ"));

For further information about embedding fonts, please see the documentation article "Embedding Fonts while creating PDF".
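As a rough illustration of the "14 core fonts" rule described above, the sketch below checks a font name against the 14 standard PDF font names. The class `StandardPdfFonts` and the method `needsEmbedding` are hypothetical helpers made up for this example; they are not part of Aspose.PDF, and the font name "STSong" is only an assumed example of a Chinese font name.

```java
import java.util.Set;

// Hypothetical helper, not part of Aspose.PDF: reports whether a font name
// falls outside the 14 standard PDF fonts and therefore should be embedded.
class StandardPdfFonts {
    // PostScript names of the 14 standard PDF fonts.
    private static final Set<String> STANDARD_14 = Set.of(
        "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
        "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
        "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
        "Symbol", "ZapfDingbats");

    // True when the font is not one of the standard 14, so a viewer cannot
    // be assumed to supply it on its own.
    static boolean needsEmbedding(String postScriptName) {
        return !STANDARD_14.contains(postScriptName);
    }

    public static void main(String[] args) {
        System.out.println(needsEmbedding("Helvetica")); // standard font
        System.out.println(needsEmbedding("STSong"));    // assumed Chinese font name
    }
}
```

None of the 14 standard fonts covers Chinese characters, so any font used for Chinese text falls into the "must embed" case.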

654280938 · December 31, 2018, 10:42am

Hello!
When I use textFragment.getTextState().setFont(FontRepository.findFont("XYZ")); the font XYZ is not found. Where can I download the font XYZ?

Farhan.Raza · December 31, 2018, 6:34pm

@654280938

Please note that XYZ has been mentioned as an example font name. You may replace XYZ with any font name that can display Chinese characters.
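To decide which replacement strings actually need a font that can display Chinese characters, a plain-Java check for Han characters may help. This is only a sketch using the standard library; the class `HanText` and the method `containsHan` are hypothetical names for this example and are not part of Aspose.PDF.

```java
// Hypothetical helper, not part of Aspose.PDF: detects whether a string
// contains Han (Chinese) characters and so needs a CJK-capable font.
class HanText {
    // Scans the string by Unicode code point and reports whether any
    // code point belongs to the Han script.
    static boolean containsHan(String text) {
        return text.codePoints()
                   .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    public static void main(String[] args) {
        System.out.println(containsHan("604年肝动脉解剖变异选择性腹腔"));
        System.out.println(containsHan("Translated text"));
    }
}
```

If a candidate font is available to the JVM as a `java.awt.Font`, its `canDisplayUpTo(String)` method returns -1 when the font has glyphs for every character in the string, which is one way to verify a font before passing its name to `FontRepository.findFont`.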

654280938 · January 1, 2019, 2:01am

Hello!
Thank you for your help. I have solved the problem with the Chinese text.

654280938 · January 1, 2019, 3:17am

Hello!
After I replace text using textFragment.setText, the formatting is completely messed up.
test.pdf (269.1 KB)
test.pdfTranslation_ok.pdf (617.5 KB)

Farhan.Raza · January 1, 2019, 2:12pm

@654280938

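If possible, please share the code snippet and the corresponding font file. We will try to reproduce and investigate the issue in our environment.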

654280938 · January 2, 2019, 1:17am

var filePath = "test.pdf";
// getLicense is my own helper; it returns the loaded Document
var pdfDocument = getLicense(filePath);
ParagraphAbsorber absorber = new ParagraphAbsorber();
absorber.visit(pdfDocument);
for (PageMarkup markup : absorber.getPageMarkups()) {
    markup.setMulticolumnParagraphsAllowed(true);
    var list = markup.getTextFragments();
    for (TextFragment textFragment : list) {
        textFragment.getTextState().setFont(FontRepository.openFont("STSONG.TTF"));
        // The translation code is too long to include here; substitute any Chinese text
        textFragment.setText("Translated text");
    }
}
pdfDocument.save(filePath + "Translation_ok.pdf");
STSONG.zip (7.1 MB)

Farhan.Raza · January 2, 2019, 9:31am

@654280938

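We generated the attached PDF document using the code snippet below. It extracts the paragraphs and replaces their text, so the generated PDF document reflects the expected behavior of the API. Could you please elaborate on your requirements and concerns so that we can investigate further and help you?
Translation_ok.pdf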

Document pdfDocument = new Document(dataDir + "test_654.pdf");
ParagraphAbsorber absorber = new ParagraphAbsorber();
absorber.visit(pdfDocument);
for (PageMarkup markup : absorber.getPageMarkups()) 
{
    markup.setMulticolumnParagraphsAllowed(true);
    List list = markup.getTextFragments();
    for (Object object : list) 
    {
        TextFragment textFragment = (TextFragment) object;
        textFragment.getTextState().setFont(FontRepository.openFont("STSONG.TTF"));
        // The translation code is too long to include here; substitute any Chinese text
        textFragment.setText("年肝动脉解剖变异选择性腹腔");
    }
}
pdfDocument.save(dataDir + "Translation_ok.pdf");

You may also refer to the documentation article "Replace Text in a PDF Document".

654280938 · January 2, 2019, 11:07am

But the formatting is all messed up, isn't it? Shouldn't text replacement automatically rearrange the content of the page? I used version 18.9 and didn't have this.
image.png (15.3 KB)
This paragraph goes forward

Farhan.Raza · January 2, 2019, 8:59pm

@654280938

Please note that each paragraph is extracted and replaced individually, so the text is not rearranged in this case. We generated PDF documents with both Aspose.PDF for Java 18.9 and Aspose.PDF for Java 18.12 and could not see any difference between them. Kindly share your generated files so that we can see the difference you describe. Moreover, the documentation article suggested earlier explains how to rearrange page contents with TextFragment, if that suits your requirements.

Translation_ok_18.12.pdf
Translation_ok_18.9.pdf