How to extract text from .mht and .xps files in Java

Hello Team,

I am trying to extract text from .mht and .xps files in Java, but I am having trouble extracting the text.

Please help me out …

@Rahul_Sharma1,

You can use Aspose.Words for Java to convert an MHTML file to text format, and the Aspose.PDF for Java API to convert an XPS file to text format.

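import com.aspose.words.Document;
import com.aspose.words.SaveFormat;

Document doc = new Document("C:\\Temp\\source.mhtml");
// Option 1: get the text as a String
// String extractedText = doc.toString(SaveFormat.TEXT);
// Option 2: save to a TXT file (the format is inferred from the .txt extension)
doc.save("C:\\Temp\\awjava-21.6.txt");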

The following articles will be helpful for converting XPS to text format:
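As background on what XPS text extraction involves, here is a minimal library-free sketch. It does not use the Aspose.PDF API; it relies only on the JDK and on the fact that an XPS file is a ZIP package whose page parts (`.fpage`) store text in the `UnicodeString` attribute of `Glyphs` elements. The class name `XpsTextSketch` and the method `extractText` are hypothetical names chosen for illustration. It assumes pages are stored as whole parts, reads them in archive order (which may differ from page order), and skips glyph runs that carry no `UnicodeString`.

```java
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of any Aspose product.
public class XpsTextSketch {

    // Returns the text of every Glyphs run found in the package, one run per line.
    public static String extractText(String xpsPath) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations, since the input may be untrusted.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        StringBuilder text = new StringBuilder();
        try (ZipFile zip = new ZipFile(xpsPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // FixedPage parts conventionally use the .fpage extension.
                if (!entry.getName().toLowerCase().endsWith(".fpage")) {
                    continue;
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    NodeList glyphs = builder.parse(in)
                            .getElementsByTagNameNS("*", "Glyphs");
                    for (int i = 0; i < glyphs.getLength(); i++) {
                        String run = ((Element) glyphs.item(i)).getAttribute("UnicodeString");
                        // A leading "{}" is an escape prefix, not part of the text.
                        if (run.startsWith("{}")) {
                            run = run.substring(2);
                        }
                        if (!run.isEmpty()) {
                            text.append(run).append('\n');
                        }
                    }
                }
            }
        }
        return text.toString();
    }
}
```

Calling `XpsTextSketch.extractText("C:\\Temp\\source.xps")` (a hypothetical path) returns the glyph runs as separate lines. Reconstructing reading order, spacing, and paragraphs from glyph positions is the part a dedicated library handles for you.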

Please let us know if you need more information; we are always glad to help you.

Thank you …
