Hi,
I would like to know whether I can read the anchors of an HTML file (in Java). That is to say, can I use Aspose to check where each link in the file points…
Thanks,
Alex
Hi Alex,
// Requires: import com.aspose.words.*;
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
// Insert a hyperlink pointing to Aspose's website.
builder.insertHtml("<a href=\"http://www.aspose.com/\">Home</a>");
// Print the address and the displayed text of every hyperlink field.
for (Field field : (Iterable<Field>) doc.getRange().getFields()) {
    if (field.getStart().getFieldType() == FieldType.FIELD_HYPERLINK) {
        FieldHyperlink link = (FieldHyperlink) field;
        System.out.println("---------------");
        System.out.println(link.getAddress());
        System.out.println(link.getResult());
    }
}
Hi,
Thanks for your reply. I tested your code on my file, and this is the output:
---------------
Adr page_0.html
Res (Provide a document title)
---------------
Adr page_1.html
Res Introduction
---------------
Adr page_2.html
Res Follow-up of the evolutions
---------------
Adr page_3.html
Res 1 Scope
---------------
Adr page_3.html
Res 1.1 Identification
---------------
Adr page_3.html
Res 1.2 System overview
---------------
Adr page_3.html
Res 1.3 Document overview […]
However, in my .html file the links also contain an element reference that I would like to recover, for example “page_3.html#Element_223323332…”…
.html file:
Hi Alex,
Hi,
Thanks for your request.
So, are you saying that it is impossible to recover all the information contained in a hyperlink?
Actually, I tested on a Word document, and there I am able to recover the hyperlink information:
(console output)
Paragraph (Style Normal, java.awt.Color[r=0,g=0,b=0]) : HYPERLINK \l “Element_4040009190” Configure Availability Of Services a
FieldStart (22)
FieldSeparator (23)
FieldEnd (24)
However, with HTML input, I can’t recover the “#Element_999198187” part even though it is written in the HTML code (href=“page_3.html#Element_999198187”). How can I fix it?
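For reference, the part being asked about is the fragment of the href, i.e. everything after the "#". The sketch below is plain Java and does not use Aspose at all; it only shows how such an href string splits into the page address and the element anchor once you have the full value. The class name HrefParts and its two methods are made up for this illustration. In Aspose.Words itself, FieldHyperlink also has a getSubAddress() method that corresponds to the \l switch seen in the Word output above, so it may be worth checking whether that is populated for HTML input; that is an assumption, not something confirmed in this thread.

```java
// Hypothetical helper (not part of Aspose.Words): splits an href such as
// "page_3.html#Element_999198187" into its address and its fragment.
public class HrefParts {

    // Returns the part before '#', or the whole href when there is no '#'.
    static String address(String href) {
        int hash = href.indexOf('#');
        return hash < 0 ? href : href.substring(0, hash);
    }

    // Returns the part after '#', or an empty string when there is no '#'.
    static String fragment(String href) {
        int hash = href.indexOf('#');
        return hash < 0 ? "" : href.substring(hash + 1);
    }

    public static void main(String[] args) {
        String href = "page_3.html#Element_999198187"; // value quoted in the post above
        System.out.println("Adr " + address(href));    // prints "Adr page_3.html"
        System.out.println("Anchor " + fragment(href)); // prints "Anchor Element_999198187"
    }
}
```

This only helps if the full href string is available in the first place, for example when reading the HTML source directly.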
Hi,
Thanks for your help.
Please, find my html file in attachment.
Hi,
Thanks for your advice, I fixed my issue!
I would like to know whether it is possible to manage the metadata of an HTML file… “Document.CustomDocumentProperties” and “Document.BuiltInDocumentProperties” allow me to manage Word metadata, but when I tried the same code on an HTML file, it could not find the built-in and custom properties… Is this expected?
Regards,
Hi Alex,