Extracting text from a PDF into a txt file throws an error. Sample code:
private static void ExtractText() throws IOException
{
    String storageDir = Environment.getExternalStorageDirectory().getAbsolutePath();
    String inputPath = storageDir + "/page_1.pdf";
    String outputPath = storageDir + "/extractedtext.txt";
    // Open the PDF document
    com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document(inputPath);
    // Create a TextAbsorber object to extract text
    com.aspose.pdf.TextAbsorber textAbsorber = new com.aspose.pdf.TextAbsorber();
    // Accept the absorber for all the pages
    pdfDocument.getPages().accept(textAbsorber);
    // To extract from a single page instead:
    // pdfDocument.getPages().get_Item(2).accept(textAbsorber);
    // Get the extracted text
    String extractedText = textAbsorber.getText();
    // Write the text to the output file; try-with-resources closes the writer
    try (java.io.FileWriter writer = new java.io.FileWriter(new java.io.File(outputPath))) {
        writer.write(extractedText);
    }
}
The error is as follows:
2022-03-08 10:09:52.835 24218-24218/com.aspose.pdf.examples E/AndroidRuntime: FATAL EXCEPTION: main
Process: com.aspose.pdf.examples, PID: 24218
java.lang.RuntimeException: Unable to start activity ComponentInfo{com.aspose.pdf.examples/com.aspose.pdf.examples.MainActivity}: class com.aspose.pdf.engine.io.serialization.PdfSerializationException: Culture Name: zh-CN-#Hans is not a supported culture
com.aspose.pdf.engine.data.PdfArray$z1.deserialize(Unknown Source:185)
com.aspose.pdf.engine.io.serialization.PdfSerializer.deserialize(Unknown Source:44)
com.aspose.pdf.engine.data.PdfDictionary$z1.deserialize(Unknown Source:220)
com.aspose.pdf.engine.io.serialization.PdfSerializer.deserialize(Unknown Source:44)
com.aspose.pdf.engine.data.PdfTrailer.m1(Unknown Source:255)
com.aspose.pdf.engine.data.PdfTrailer.m1(Unknown Source:230)
com.aspose.pdf.engine.data.PdfTrailer$XrefSerializer.deserialize(Unknown Source:11)
com.aspose.pdf.engine.io.serialization.PdfSerializer.deserialize(Unknown Source:44)
com.aspose.pdf.engine.io.PdfReader.m1021(Unknown Source:368)
com.aspose.pdf.engine.io.PdfReader.<init>(Unknown Source:197)
com.aspose.pdf.engine.io.PdfReader.<init>(Unknown Source:9)
com.aspose.pdf.internal.p41.z1.m289(Unknown Source:2)
com.aspose.pdf.engine.io.PdfFile.<init>(Unknown Source:3)
com.aspose.pdf.internal.p41.z1.m291(Unknown Source:2)
com.aspose.pdf.engine.PdfDocument.open(Unknown Source:7)
com.aspose.pdf.engine.PdfDocument.<init>(Unknown Source:12)
com.aspose.pdf.ADocument.init(Unknown Source:12)
com.aspose.pdf.ADocument.<init>(Unknown Source:46)
Is extracting Chinese content supported?
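
For context, the culture name in the exception (zh-CN-#Hans) resembles what Java prints for a locale that carries a script subtag: such a locale's toString() is "zh_CN_#Hans". The sketch below only reproduces that string and shows how to build a script-free copy of the locale. The class name LocaleScriptDemo and the helper stripScript are hypothetical names used for illustration. Whether calling Locale.setDefault with the script-free locale before opening the document avoids the exception is an assumption on my part, not something confirmed in this thread.

```java
import java.util.Locale;

public class LocaleScriptDemo {
    // Hypothetical helper: returns a copy of the locale that keeps only
    // language and region, dropping the script subtag (for example "Hans").
    public static Locale stripScript(Locale locale) {
        return new Locale.Builder()
                .setLanguage(locale.getLanguage())
                .setRegion(locale.getCountry())
                .build();
    }

    public static void main(String[] args) {
        // Simulate a device set to Simplified Chinese with an explicit script.
        Locale deviceLocale = new Locale.Builder()
                .setLanguage("zh")
                .setRegion("CN")
                .setScript("Hans")
                .build();
        System.out.println(deviceLocale);   // prints zh_CN_#Hans

        Locale plainLocale = stripScript(deviceLocale);
        System.out.println(plainLocale);    // prints zh_CN

        // Assumption (unverified): setting the default locale to the
        // script-free copy before creating com.aspose.pdf.Document
        // might sidestep the "not a supported culture" check.
        Locale.setDefault(plainLocale);
    }
}
```

If this matches the cause, the call to Locale.setDefault would go at the start of ExtractText(), before the Document is constructed. Note that changing the default locale affects the whole process, so it may be worth restoring the original value afterwards.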