How to add encoding type


#1

Mac OS 10.14.4
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
Aspose: aspose-pdf-19.9

5421e2d679d54cca81a3e50c6217ed2b.pdf (41.0 KB)

Reading text from the attached PDF fails with "Resource file GB-EUC-H not found in assembly".

I found a GB-EUC-H file on GitHub (https://github.com/euske/pdfminer/raw/d6fd7e76b272e88dca67f7fabead397601ec6fce/pdfminer/cmap/GB-EUC-H.pickle.gz).

How can I use it with Aspose.PDF?

java.lang.IllegalStateException: Resource file GB-EUC-H not found in assembly
at com.aspose.pdf.internal.l0n.ld.lI(Unknown Source)
at com.aspose.pdf.internal.l4t.lv.lf(Unknown Source)
at com.aspose.pdf.internal.l4t.lv.lI(Unknown Source)
at com.aspose.pdf.internal.l4u.lt.lI(Unknown Source)
at com.aspose.pdf.internal.l4u.lt.lI(Unknown Source)
at com.aspose.pdf.internal.l4u.lt.lI(Unknown Source)
at com.aspose.pdf.internal.l4h.lI.ld(Unknown Source)
at com.aspose.pdf.internal.l4h.lk.<init>(Unknown Source)
at com.aspose.pdf.internal.l4h.lI.<init>(Unknown Source)
at com.aspose.pdf.internal.l4h.lj.<init>(Unknown Source)
at com.aspose.pdf.internal.l4u.lk.lI(Unknown Source)
at com.aspose.pdf.internal.l4n.l1h.l0if(Unknown Source)
at com.aspose.pdf.internal.l5l.ld.lj(Unknown Source)
at com.aspose.pdf.internal.l5l.lv.lb(Unknown Source)
at com.aspose.pdf.internal.l5l.lv.<init>(Unknown Source)
at com.aspose.pdf.internal.l5l.ld.lI(Unknown Source)
at com.aspose.pdf.internal.l5l.ly.lI(Unknown Source)
at com.aspose.pdf.internal.l5l.ly.lb(Unknown Source)
at com.aspose.pdf.internal.l5l.ly.lu(Unknown Source)
at com.aspose.pdf.internal.l5l.ly.ly(Unknown Source)
at com.aspose.pdf.internal.l5l.l0t.lI(Unknown Source)
at com.aspose.pdf.internal.l5l.l0t.lI(Unknown Source)
at com.aspose.pdf.internal.l5l.l0t.le(Unknown Source)
at com.aspose.pdf.internal.l5l.l0t.<init>(Unknown Source)
at com.aspose.pdf.internal.l5l.l0t.<init>(Unknown Source)
at com.aspose.pdf.TextFragmentAbsorber.visit(Unknown Source)


#2

@JamesGuo

Thank you for contacting support.

Would you please also share the code snippet you are using, so that we can try to reproduce and investigate the issue in our environment?


#3

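// Load the PDF (file refers to the attached document)
Document pdfDocument = new Document(file);
// Take the first page (pages are 1-indexed)
Page itemPage = pdfDocument.getPages().get_Item(1);
// Absorb the text fragments on that page; the exception is raised from this call
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber();
itemPage.accept(textFragmentAbsorber);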


#4

@JamesGuo

The API should not throw an exception while extracting text. We have logged a ticket with ID PDFJAVA-38911 in our issue management system so that this can be corrected. The ticket ID has been linked to this thread, so you will receive a notification as soon as the ticket is resolved.

We are sorry for the inconvenience.
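
Until the ticket is resolved, one possible stopgap is to guard the extraction call so that a failing page does not abort the rest of the processing. The following is only a minimal sketch, not an Aspose recommendation: `GuardedExtraction` and `extractOrDefault` are hypothetical names introduced for illustration, and the Aspose.PDF calls appear only in a comment so that the sketch runs without the library. It does not make the text of the affected page readable; it only keeps the caller running.

```java
import java.util.function.Supplier;

// Hypothetical helper class, not part of Aspose.PDF.
class GuardedExtraction {

    // Runs an extraction step and returns the fallback value if the step
    // throws IllegalStateException (the exception type in the stack trace).
    public static String extractOrDefault(Supplier<String> extraction, String fallback) {
        try {
            return extraction.get();
        } catch (IllegalStateException e) {
            System.err.println("Text extraction failed: " + e.getMessage());
            return fallback;
        }
    }

    public static void main(String[] args) {
        // With Aspose.PDF on the classpath, the supplier would wrap the calls
        // from the snippet in post #3 (assumed usage, not verified here):
        //   () -> { itemPage.accept(textFragmentAbsorber);
        //           return textFragmentAbsorber.getText(); }
        // A stand-in supplier simulates the failure instead.
        String text = extractOrDefault(() -> {
            throw new IllegalStateException("Resource file GB-EUC-H not found in assembly");
        }, "");
        System.out.println("Extracted text: [" + text + "]");
    }
}
```

The fallback here is an empty string; a caller that needs to tell "no text" apart from "extraction failed" could pass a distinct marker value or return an `Optional<String>` instead.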