How to add encoding type

JamesGuo · October 9, 2019, 5:28am

Mac OS 10.14.4
java version “1.8.0_181”
Java™ SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot™ 64-Bit Server VM (build 25.181-b13, mixed mode)
Aspose: aspose-pdf-19.9

5421e2d679d54cca81a3e50c6217ed2b.pdf (41.0 KB)

Reading text from the PDF fails with “Resource file GB-EUC-H not found in assembly”.

I found a GB-EUC-H file on GitHub (https://github.com/euske/pdfminer/raw/d6fd7e76b272e88dca67f7fabead397601ec6fce/pdfminer/cmap/GB-EUC-H.pickle.gz).

How can I use it?

java.lang.IllegalStateException: Resource file GB-EUC-H not found in assembly
at com.aspose.pdf.internal.l0n.ld.lI(Unknown Source)
at com.aspose.pdf.internal.l4t.lv.lf(Unknown Source)
at com.aspose.pdf.internal.l4t.lv.lI(Unknown Source)
at com.aspose.pdf.internal.l4u.lt.lI(Unknown Source)
at com.aspose.pdf.internal.l4u.lt.lI(Unknown Source)
at com.aspose.pdf.internal.l4u.lt.lI(Unknown Source)
at com.aspose.pdf.internal.l4h.lI.ld(Unknown Source)
at com.aspose.pdf.internal.l4h.lk.(Unknown Source)
at com.aspose.pdf.internal.l4h.lI.(Unknown Source)
at com.aspose.pdf.internal.l4h.lj.(Unknown Source)
at com.aspose.pdf.internal.l4u.lk.lI(Unknown Source)
at com.aspose.pdf.internal.l4n.l1h.l0if(Unknown Source)
at com.aspose.pdf.internal.l5l.ld.lj(Unknown Source)
at com.aspose.pdf.internal.l5l.lv.lb(Unknown Source)
at com.aspose.pdf.internal.l5l.lv.(Unknown Source)
at com.aspose.pdf.internal.l5l.ld.lI(Unknown Source)
at com.aspose.pdf.internal.l5l.ly.lI(Unknown Source)
at com.aspose.pdf.internal.l5l.ly.lb(Unknown Source)
at com.aspose.pdf.internal.l5l.ly.lu(Unknown Source)
at com.aspose.pdf.internal.l5l.ly.ly(Unknown Source)
at com.aspose.pdf.internal.l5l.l0t.lI(Unknown Source)
at com.aspose.pdf.internal.l5l.l0t.lI(Unknown Source)
at com.aspose.pdf.internal.l5l.l0t.le(Unknown Source)
at com.aspose.pdf.internal.l5l.l0t.(Unknown Source)
at com.aspose.pdf.internal.l5l.l0t.(Unknown Source)
at com.aspose.pdf.TextFragmentAbsorber.visit(Unknown Source)
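
For context on the name in the error: GB-EUC-H is one of the predefined CMap names in the PDF specification, covering the GB 2312-80 character set in EUC-CN encoding with horizontal writing. The sketch below only illustrates what that encoding looks like by decoding a few EUC-CN bytes with the JDK's own `GB2312` charset. It does not load a CMap into Aspose.PDF and is not a fix for the exception; the class and method names are made up for illustration, and it assumes a JDK that ships the `GB2312` charset.

```java
import java.nio.charset.Charset;

public class GbEucDemo {

    /** Decodes EUC-CN (GB 2312) encoded bytes into a Java String. */
    public static String decodeEucCn(byte[] bytes) {
        // "GB2312" is the JDK charset name for EUC-CN encoded GB 2312 text.
        return new String(bytes, Charset.forName("GB2312"));
    }

    public static void main(String[] args) {
        // Two double-byte characters as they would appear in EUC-CN.
        byte[] raw = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};
        System.out.println(decodeEucCn(raw));
    }
}
```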

Farhan.Raza · October 9, 2019, 5:34pm

@JamesGuo

Thank you for contacting support.

Could you please also share the code snippet you are using, so that we can try to reproduce and investigate the issue in our environment?

JamesGuo · October 10, 2019, 2:01am

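import com.aspose.pdf.Document;
import com.aspose.pdf.Page;
import com.aspose.pdf.TextFragmentAbsorber;
// Load the PDF and absorb the text fragments of its first page (pages are 1-based).
Document pdfDocument = new Document(file);
Page itemPage = pdfDocument.getPages().get_Item(1);
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber();
itemPage.accept(textFragmentAbsorber);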

Farhan.Raza · October 10, 2019, 12:53pm

@JamesGuo

The API should not throw an exception while extracting text. We have logged a ticket with ID PDFJAVA-38911 in our issue management system so that it can be fixed. The ticket ID has been linked with this thread, so you will receive a notification as soon as the ticket is resolved.

We are sorry for the inconvenience.

aspose.notifier · November 4, 2019, 4:44pm

The issues you have found earlier (filed as PDFJAVA-38911) have been fixed in Aspose.PDF for Java 19.10.
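
For anyone who cannot move to 19.10 right away, one possible stopgap is to guard the extraction call so that a single affected file does not abort a whole batch. The following is a minimal sketch under that assumption, not an official workaround: `GuardedExtraction` and `extractOrFallback` are hypothetical names, and the Aspose call is replaced by a stand-in that throws the same exception type and message as in the stack trace above.

```java
import java.util.function.Supplier;

public class GuardedExtraction {

    /**
     * Runs an extraction step and returns the fallback value if the step
     * fails with IllegalStateException, the exception type reported above.
     */
    public static String extractOrFallback(Supplier<String> extraction, String fallback) {
        try {
            return extraction.get();
        } catch (IllegalStateException e) {
            // Log the failure and continue instead of aborting the run.
            System.err.println("Text extraction failed: " + e.getMessage());
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the real extraction code; it simulates the reported failure.
        Supplier<String> failing = () -> {
            throw new IllegalStateException("Resource file GB-EUC-H not found in assembly");
        };
        String text = extractOrFallback(failing, "");
        System.out.println("Extracted " + text.length() + " characters");
    }
}
```

In real code the supplier would wrap the extraction snippet posted earlier in this thread. The guard only skips the affected file; the text itself still cannot be read until the library is upgraded.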