Hi, I am evaluating whether Aspose works for my use case. Thank you in advance for your help!
I want to:
identify a date in a Word document using regex
convert the date into a data format
This is a sample of the document I work with:
MyAddress
MyAddress
MyAddress
2EE
09 November 2020
Reference: 1234
RE: Lorem ipsum
TO WHOM IT MAY CONCERN
This is the regex to extract the date:
(?<=2EE\n\n)(\b\d.{0,2}\s\w*\s\d{4})
So far I have managed to extract the date and replace it with another string, following examples online.
What I want to do however is to get the date string and parse it into a date.
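For reference, the extraction step on its own can be sketched in plain Java as follows. This is only an illustration, not Aspose code: the class name DateRegexDemo, the method findDate and the sample string are made up for the example, and the lookbehind is relaxed to accept either \r or \n, on the assumption that the text pulled from the document may use either character for paragraph breaks.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DateRegexDemo {
    // The pattern from the post, with the lookbehind relaxed to accept \r or \n
    // (an assumption about how paragraph breaks appear in the extracted text).
    static final Pattern DATE_AFTER_MARKER =
            Pattern.compile("(?<=2EE[\\r\\n]{2})(\\b\\d.{0,2}\\s\\w*\\s\\d{4})");

    // Returns the first date-like string that follows the "2EE" line, or null if none is found.
    static String findDate(String text) {
        Matcher matcher = DATE_AFTER_MARKER.matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample text mimicking the layout of the document above.
        String sample = "MyAddress\n2EE\n\n09 November 2020\nReference: 1234\n";
        System.out.println(findDate(sample)); // prints "09 November 2020"
    }
}
```

The result is still a plain string at this point; turning it into a date is a separate step.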
In your code, you are replacing the date with the text ‘hello!’.
Could you please share some more detail about your requirement?
The date may be plain text or a date field in the MS Word document. If it is text, you can find it in the document and replace it with the desired content using the Range.Replace method.
However, if your date is a field, we suggest using the Range.Fields property to get the field collection. Iterate over this collection and find the date field using the Field.Type property. You can then move the cursor to the date field, insert your desired content, and remove the date field.
Could you please ZIP and attach your input and expected output Word documents? We will then provide you with more information about your query.
Apologies, I made a typo in my question. I meant:
I want to grab a date string from my Word document using regex, and convert that string in Java into a DATE format.
I have attached an example input document as requested.
At the moment the only working solution I found works this way:
get all text from document as a string
String fullText = document.getRange().getText();
created a reusable method to extract text with regex (not using Aspose, just plain Java)
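import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TextExtractor {
    public TextExtractor() {
    }

    // Returns the first match of regex in text, or null if there is no match.
    public String extract(String text, String regex) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(text);
        if (matcher.find()) {
            return matcher.group();
        }
        return null;
    }
}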
created a method that takes my date string and converts it to date format
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class ConvertDate {
    public ConvertDate() {
    }

    public Date convert(String dateString) throws ParseException {
        TextExtractor textExtractor = new TextExtractor();
        // Pull out the day, month name and year separately.
        String day = textExtractor.extract(dateString, "\\b\\d{1,2}(?=[^0-9]{1})");
        String month = textExtractor.extract(dateString, "\\b[A-Z]+[a-z]+");
        String year = textExtractor.extract(dateString, "\\b\\d{4}\\b");
        String concatenated = day + " " + month + " " + year;
        // Locale.ENGLISH so that month names parse regardless of the JVM's default locale.
        return new SimpleDateFormat("dd MMMM yyyy", Locale.ENGLISH).parse(concatenated);
    }
}
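The conversion step alone could probably be shortened with java.time instead of three separate regex calls. The following is only a sketch under a few assumptions: the date always has the shape "day month-name year" in English, an ordinal suffix such as "th" may follow the day, and a LocalDate is acceptable instead of java.util.Date. The class name DateStringParser is made up for the example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class DateStringParser {
    // "d" accepts one- or two-digit days; Locale.ENGLISH pins the month names to English.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("d MMMM uuuu", Locale.ENGLISH);

    // Parses strings such as "09 November 2020" or "9th November 2020".
    // Throws java.time.format.DateTimeParseException if the text has a different shape.
    static LocalDate parse(String dateString) {
        // Drop an ordinal suffix directly after the day number (assumed input variation).
        String cleaned = dateString.trim().replaceFirst("(?<=\\d)(st|nd|rd|th)\\b", "");
        return LocalDate.parse(cleaned, FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(parse("09 November 2020")); // prints 2020-11-09
    }
}
```

If a java.util.Date is still required downstream, the LocalDate can be converted afterwards; the sketch leaves that out.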
I am fairly sure there must be a better way with Aspose. Let me know if you have any suggestions,
thank you!