We have several clients that import our MS Word “DOC” files into their EMR system. Their import process extracts the text from the MS Word file using a legacy proprietary program. This program extracts the text incorrectly because of the NULL characters in the file. This is something we have no control over.
Our clients’ import process works correctly when we use MS Word to convert the DOCX to DOC. This is because the DOC file that MS Word produces doesn’t contain those NULL characters from the Unicode text.
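To illustrate what we assume is happening, here is a minimal Python sketch. We have not seen the importer's code, so `naive_extract` is a hypothetical stand-in for a program that treats every byte as one 8-bit character, and the sample text is made up. It only shows that plain ASCII text stored as 16-bit Unicode (UTF-16LE) carries one NULL byte per character, while the same text in an 8-bit code page (Windows-1252 here) carries none.

```python
# Illustration only: the same text as raw bytes in two encodings.
sample = "Patient: John Doe"  # hypothetical sample text

utf16_bytes = sample.encode("utf-16-le")  # 2 bytes per character
cp1252_bytes = sample.encode("cp1252")    # 1 byte per character


def count_nulls(data: bytes) -> int:
    """Count the 0x00 bytes in a byte string."""
    return data.count(b"\x00")


def naive_extract(data: bytes) -> str:
    """Hypothetical extractor: maps every byte to one 8-bit character."""
    return data.decode("latin-1")


print("NULL bytes in UTF-16LE text:", count_nulls(utf16_bytes))
print("NULL bytes in 8-bit text:   ", count_nulls(cp1252_bytes))
print("Naive read of UTF-16LE text:", repr(naive_extract(utf16_bytes)))
print("Naive read of 8-bit text:   ", repr(naive_extract(cp1252_bytes)))
```

This only demonstrates the byte-level difference between the two encodings. It is not a way to patch a real DOC file.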
We cannot use any format other than MS Word DOC because of the legacy proprietary program that our clients use. I wish there were an alternative, but there isn’t.
Is there a way we could change that Unicode text to remove the NULL characters? Perhaps by directly modifying the bytes of the file?
Any workaround would be greatly appreciated until an option is added to the Aspose.Words API.
Thanks