Invalid characters converted to question marks

We are using a CSV upload to add users. One of the requirements is to strip invalid characters (those not in the Windows-1252 character set) from any text input fields during file processing. For example, "Test象形字String" should result in "TestString". However, when setting the file encoding using the following:

TxtLoadOptions loadOptions = new TxtLoadOptions(LoadFormat.CSV);
loadOptions.Encoding = Encoding.GetEncoding(1252);

all invalid characters are automatically converted to question marks. So the above string would be read from the file as "Test???String". A question mark is a valid Windows-1252 character and we can't assume a question mark is always invalid, as it can be valid for certain inputs.

Is there a way to stop this automatic conversion?



Hi,


Thanks for providing us some details.

Please provide us with your original CSV file so we can test your issue with the corresponding encoding on our end.

Thank you.

Hi Amjad,


Thanks for the response. Please find my CSV file attached.

Kind regards,
Paul

Hi,


This is probably your template (input) CSV file. When I opened it in Notepad, I saw “???” between the strings; see the screenshot here:
http://prntscr.com/b348mx

Thank you.

Oh dear. I see this was my error. It turns out that Excel doesn't preserve these characters when it saves the CSV file and converts them to question marks itself. I never closed and reopened the file during my testing, so the foreign (Chinese) characters always displayed correctly. After reopening the file, I can see them as question marks.
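
For anyone who lands here with the same requirement: once the file is saved in an encoding that actually keeps the characters (UTF-8, for example) and loaded with a matching loadOptions.Encoding, the stripping itself could be done on each string after loading. Below is a minimal sketch that relies only on the standard .NET encoder fallback mechanism. Windows1252Filter and StripNonWindows1252 are hypothetical names of my own, not part of Aspose.Cells, and I have not run this against the actual upload code.

```csharp
using System;
using System.Text;

static class Windows1252Filter
{
    // Hypothetical helper (not an Aspose.Cells API): drops every character
    // that cannot be encoded in Windows-1252 instead of replacing it with '?'.
    public static string StripNonWindows1252(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        // Required on .NET Core / .NET 5+, where code page 1252 is not
        // registered by default. On older .NET Framework versions this
        // line can be removed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // An empty replacement string makes the encoder silently skip
        // characters that have no Windows-1252 representation.
        Encoding cp1252 = Encoding.GetEncoding(
            1252,
            new EncoderReplacementFallback(string.Empty),
            new DecoderReplacementFallback(string.Empty));

        // Round-trip through bytes: only encodable characters survive.
        return cp1252.GetString(cp1252.GetBytes(input));
    }

    static void Main()
    {
        // Example string from the original question.
        Console.WriteLine(StripNonWindows1252("Test象形字String"));
    }
}
```

With the example string from my first post this should give "TestString". A literal question mark typed by the user is left alone, because it is a valid Windows-1252 character and the encoder never needs to substitute anything for it.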


Please close this request.
Thanks,
Paul

Hi Paul,


Thank you for the confirmation. Please feel free to contact us again if you need further assistance with Aspose APIs.