Convert PDF to DOCX using Java - Conversion Problem with IDENTİTY-H Type Font

Hi, I am trying to convert a PDF to DOCX, but the save step throws the exception below. What could be the problem?
I am using aspose.pdf-21.1.jar.

Exception in thread "main" java.lang.IllegalStateException: Resource file IDENTİTY-H not found in assembly
at com.aspose.pdf.internal.l0h.ld.lI(Unknown Source)
at com.aspose.pdf.internal.l4l.lv.lf(Unknown Source)
at com.aspose.pdf.internal.l4l.lv.lI(Unknown Source)
at com.aspose.pdf.internal.l4p.lt.lI(Unknown Source)
at com.aspose.pdf.internal.l4p.lt.lI(Unknown Source)
at com.aspose.pdf.internal.l4p.lt.lI(Unknown Source)
at com.aspose.pdf.internal.l4j.lI.ld(Unknown Source)
at com.aspose.pdf.internal.l4j.lk.(Unknown Source)
at com.aspose.pdf.internal.l4j.lI.(Unknown Source)
at com.aspose.pdf.internal.l4j.lj.(Unknown Source)
at com.aspose.pdf.internal.l4p.lk.lI(Unknown Source)
at com.aspose.pdf.internal.l4y.l1h.l0if(Unknown Source)
at com.aspose.pdf.internal.l3p.lI.lI(Unknown Source)
at com.aspose.pdf.internal.l3p.lb.lI(Unknown Source)
at com.aspose.pdf.internal.l2u.lu.lj(Unknown Source)
at com.aspose.pdf.internal.l2u.lu.lb(Unknown Source)
at com.aspose.pdf.internal.l2u.lu.lf(Unknown Source)
at com.aspose.pdf.internal.l2u.lu.lf(Unknown Source)
at com.aspose.pdf.l12y.lI(Unknown Source)
at com.aspose.pdf.l12y.lb(Unknown Source)
at com.aspose.pdf.ApsUsingConverter.lI(Unknown Source)
at com.aspose.pdf.ApsUsingConverter.lf(Unknown Source)
at com.aspose.pdf.l4v.lI(Unknown Source)
at com.aspose.pdf.ADocument.lj(Unknown Source)
at com.aspose.pdf.ADocument.lI(Unknown Source)
at com.aspose.pdf.Document.lI(Unknown Source)
at com.aspose.pdf.ADocument.lI(Unknown Source)
at com.aspose.pdf.ADocument.save(Unknown Source)
at com.aspose.pdf.Document.save(Unknown Source)
at TestAsposePdf2Docx.main(TestAsposePdf2Docx.java:30)

@ahmetkarabatak09

Please share a sample source PDF document with us along with sample code snippet that you are using so that we can test the scenario in our environment and address it accordingly.

text2.pdf (85.4 KB)
My sample code:

import com.aspose.pdf.DocSaveOptions;
import com.aspose.pdf.Document;

// Load the source PDF
Document document = new Document("d:/Temp/text2.pdf");

// Create DocSaveOptions object
DocSaveOptions saveOption = new DocSaveOptions();
// Set the recognition mode as Flow (disabled in this test)
//saveOption.setMode(DocSaveOptions.RecognitionMode.Flow);
// Set the Horizontal proximity as 2.5 (disabled in this test)
//saveOption.setRelativeHorizontalProximity(2.5f);
// Enable the value to recognize bullets during conversion process (disabled in this test)
//saveOption.setRecognizeBullets(true);
// Write DOCX output
saveOption.setFormat(DocSaveOptions.DocFormat.DocX);
// Save the resultant DOCX file; this is the call that throws
document.save("d:/Temp/resultantflow.docx", saveOption);

@ahmetkarabatak09

We tested the scenario using Aspose.PDF for Java 21.2 and did not notice any issue. For your reference, the output DOCX is attached. Please try the latest version of the API and let us know if the issue still persists.

Sample_21.2.zip (92.4 KB)

Thank you for the reply.
I found that the Java default locale affects the conversion. My PDF contains fonts that use the "Identity-H" encoding. I think that somewhere in your Java code the encoding name is converted to upper case, and under my Java locale the result is "IDENTİTY-H". When I changed the Java locale to English, the problem went away and the document was converted to DOCX successfully.
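The workaround described above can be sketched as a small helper that switches the default locale only for the duration of the conversion and then restores it. This is a minimal sketch, not part of the Aspose API: the class name `LocaleWorkaround` and the method `withDefaultLocale` are hypothetical, and the upper-casing call stands in for the real `document.save(...)` call.

```java
import java.util.Locale;
import java.util.function.Supplier;

final class LocaleWorkaround {

    // Runs the action with the JVM default locale temporarily set to
    // `temporary`, then restores the previous default even if the action throws.
    static <T> T withDefaultLocale(Locale temporary, Supplier<T> action) {
        Locale previous = Locale.getDefault();
        Locale.setDefault(temporary);
        try {
            return action.get();
        } finally {
            Locale.setDefault(previous);
        }
    }

    public static void main(String[] args) {
        // Simulate a machine whose default locale is Turkish.
        Locale.setDefault(Locale.forLanguageTag("tr-TR"));

        // In real use the lambda would wrap the conversion (assumption),
        // for example a call to document.save(...).
        String upper = withDefaultLocale(Locale.ENGLISH, () -> "Identity-H".toUpperCase());

        System.out.println(upper);
        System.out.println(Locale.getDefault().toLanguageTag());
    }
}
```

Note that `Locale.setDefault` changes state for the whole JVM, so other threads see the temporary locale while the action runs. In a multi-threaded application this is a stopgap rather than a proper fix.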

@ahmetkarabatak09

What are your current locale settings? Please share the details with us so that we can try to replicate the issue in our environment and address it accordingly.

Our Java locale is tr_TR.

@ahmetkarabatak09

We set the locale as follows before converting the PDF to DOCX with version 21.2 of the API and did not face any issue:

Locale.setDefault(new Locale("tr_TR"));

Could you please share how you changed your locale settings? We will again test the scenario in our environment and address it accordingly.
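One possible reason the issue did not reproduce with the line above: the single-argument `Locale` constructor treats the whole string as a language code, so `new Locale("tr_TR")` may not yield the Turkish locale at all. The sketch below compares it with `Locale.forLanguageTag("tr-TR")`. The class name `TurkishLocaleCheck` and the helper `upper` are hypothetical names used only for illustration.

```java
import java.util.Locale;

final class TurkishLocaleCheck {

    // Upper-cases the text under an explicit locale so the result does not
    // depend on the JVM default.
    static String upper(String text, Locale locale) {
        return text.toUpperCase(locale);
    }

    @SuppressWarnings("deprecation") // Locale constructors are deprecated in newer JDKs
    public static void main(String[] args) {
        // The whole string "tr_TR" is taken as the language code; no country is set.
        Locale singleArg = new Locale("tr_TR");
        // Language "tr" with country "TR".
        Locale turkish = Locale.forLanguageTag("tr-TR");

        System.out.println(singleArg.getLanguage() + " / " + singleArg.getCountry());
        System.out.println(turkish.getLanguage() + " / " + turkish.getCountry());

        // Only the second call applies the Turkish dotted-I casing rule.
        System.out.println(upper("Identity-H", singleArg));
        System.out.println(upper("Identity-H", turkish));
    }
}
```

Testing with `Locale.setDefault(Locale.forLanguageTag("tr-TR"))` or `new Locale("tr", "TR")` should therefore be closer to the environment described in this thread.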

Here is a test program and its output on my Java installation:

import java.util.Locale;

public class LocaleTest {

	public static void main(String[] args) {
		String encodingText = "Identity-H";
		System.out.println(encodingText.toUpperCase());
		
		Locale.setDefault(Locale.ENGLISH);
		
		System.out.println(encodingText.toUpperCase());
	}
}

That code prints:

IDENTİTY-H
IDENTITY-H

Maybe your operating system and Java installation do not support tr_TR (the Turkish locale). In Turkish, capitalizing the character "i" gives "İ" (a dotted capital I), not "I".
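The usual way to make this kind of case conversion independent of the machine's locale is to pass an explicit locale such as `Locale.ROOT` to `toUpperCase`. The sketch below shows the difference. It is illustrative only: the class `EncodingNames` and the method `normalize` are hypothetical, and nothing here is taken from the Aspose source code.

```java
import java.util.Locale;

final class EncodingNames {

    // Normalizes an encoding name for lookups without consulting the JVM
    // default locale.
    static String normalize(String encodingName) {
        return encodingName.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Simulate a machine whose default locale is Turkish.
        Locale.setDefault(Locale.forLanguageTag("tr-TR"));

        // No-argument toUpperCase() follows the default locale (Turkish here).
        System.out.println("Identity-H".toUpperCase());

        // Explicit Locale.ROOT gives the same result on every machine.
        System.out.println(normalize("Identity-H"));
    }
}
```

For identifiers such as encoding names, resource keys, or file extensions, the explicit-locale overload avoids surprises on Turkish, Azerbaijani, and Lithuanian systems, where casing rules differ from the locale-neutral ones.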

@ahmetkarabatak09

We need to investigate further whether this issue is related to specific locale settings. For the sake of further analysis, an investigation ticket, PDFJAVA-40268, has been logged in our issue tracking system. We will look into its details and keep you posted on the status of its resolution. Please give us some time.

We are sorry for the inconvenience.

The issues you have found earlier (filed as PDFJAVA-40268) have been fixed in Aspose.PDF for Java 21.4.