Preserve custom data during HTML import

Hi there,


I am looking for a way to preserve custom information during HTML import into Aspose.Words.
The imported content will be processed further on the server side, depending on that information (repeaters, footnotes, …).

The idea is similar to the information Aspose stores for Word-HTML-Word round trips:

<span style="-aw-bookmark-end:_Toc426635787"/>

Is there a way to get this information after import? Or do I have to parse and modify the HTML before importing it?

- Aspose-Version: 15.6.0 / 15.7.0 (evaluation)
String html = "" +

heading no 1 with some text, no style info

\n+
\n+

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas condimentum tortor quis tortor tincidunt, a luctus orci consequat. Pellentesque cursus justo in ullamcorper congue. Sed auctor, est non mattis maximus, sem felis aliquet justo, tincidunt efficitur tellus mi quis purus. Mauris facilisis rhoncus sapien, sit amet molestie mi dignissim sed. Integer fringilla lectus nisl. Sed faucibus molestie nibh ut blandit. Integer tempus lorem at nisl laoreet pretium eget ut sapien. Sed porta neque eu magna suscipit ullamcorper. Sed non nulla et arcu blandit interdum et a justo. Proin eu lectus tortor. Duis molestie, massa eu euismod tincidunt, felis ex convallis erat, semper sollicitudin sem massa ac libero. Phasellus viverra enim ac velit pulvinar, eget eleifend felis ultricies. Sed in lacinia metus. Nullam laoreet dui sit amet pharetra condimentum. Integer eget urna ex. Praesent a consequat diam, non blandit ipsum.

\n+
\n+

Some other Text in new Line

\n+
\n+

Subheading no (heading 2)

\n+
\n+
    \n+
    \t
  1. List opt to repeat
  2. \n+
    \n+ "";

    Document doc = new Document();
    DocumentBuilder builder = new DocumentBuilder(doc);
    builder.insertHtml(html, true);

    // Now how can I find this information on a node?
    Thanks,
    Alexander

    Hi Alexander,

    Thanks for your inquiry. Aspose.Words does not preserve such custom styles in its DOM. Could you please share the following details for our reference? We will then provide more information.

    • Some details about your requirements
    • What you want to achieve by preserving the custom styles
    • The file format in which you want to save the final document

    As soon as you send us this information, we'll start investigating your issue.

    Hi Tair,


    Thank you for your response.

    Our basic idea is to generate documents for our customer. The content of each document may be defined in multiple building blocks.
    Those building blocks can contain markers which will later be processed by our software.
    Some example functionality:
    - Provide data (e.g. replace $firstName with customer first name)
    - Repeaters (e.g. define one table row or list element, our software will multiply the nodes according to data)
    - Includes (import a second building block at the current position, e.g. an address)
    - Other expressions
    - …
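To make the first two marker types more concrete, here is a minimal plain-Java sketch that works on strings rather than on the Aspose.Words DOM. The class `MarkerResolver`, the exact `$name` syntax rule, and the `repeat` helper are assumptions made for illustration only; they are not part of Aspose.Words or of our actual software.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not an Aspose.Words class.
final class MarkerResolver {
    // Assumed marker syntax: '$' followed by an identifier, e.g. $firstName.
    private static final Pattern MARKER =
            Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    // Replaces every known $marker in the text with its value from the data map.
    static String resolve(String text, Map<String, String> data) {
        Matcher m = MARKER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = data.get(m.group(1));
            // Unknown markers are left untouched so a later pass can report them.
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Repeater: one template (e.g. the text of a list item) is resolved once per data row.
    static List<String> repeat(String template, List<Map<String, String>> rows) {
        List<String> result = new ArrayList<>();
        for (Map<String, String> row : rows) {
            result.add(resolve(template, row));
        }
        return result;
    }
}
```

In the real pipeline the same logic would have to run on document nodes (paragraphs, table rows, list items) instead of strings, which is why we need to find the markers again after the import.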

    Since our customer prefers HTML5 solutions wherever possible, we want to let them define those parts in a web application.
    However, for more advanced purposes, defining the document parts in Word (DOCX) or maybe OpenOffice (ODT) will be inevitable.

    To avoid redundant code, it seems best to convert those input formats into the same model (e.g. the Aspose.Words DOM) and continue from there.

    The final document will usually be in ODT, PDF or DOCX. Output to HTML is currently not the main objective. Also, there will be no round-trip scenario for the final document (at least not in the near future).

    I hope you can understand the general idea and can provide some ideas.

    Thanks,
    Alexander

    Hi Alexander,

    Thanks for your inquiry. Please note that Aspose.Words mimics the behavior of MS Word. It does not preserve such custom properties in its DOM.

    In your case, I suggest that you bookmark the content, e.g. bookmark the text "List opt to repeat" with the name 'customrepeat5'. Aspose.Words does support bookmarks, so you can load a document containing bookmarks into the Aspose.Words DOM and act on them according to your requirements.
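One way to apply this to HTML input is to rewrite the markup before passing it to `insertHtml`. The sketch below is plain Java and rests on several assumptions: that the custom information sits in a hypothetical `data-repeat="N"` attribute, that a named anchor (`<a name="...">`) is imported as a bookmark (please verify this against your Aspose.Words version), and that the marked elements are simple and not nested. `RepeatMarkerPreprocessor` is an illustrative name, not an Aspose.Words class.

```java
import java.util.regex.Pattern;

// Hypothetical pre-processing step; not part of Aspose.Words.
final class RepeatMarkerPreprocessor {
    // Matches a non-nested element that carries the assumed data-repeat="N" attribute.
    // Regex-based HTML handling is fragile; a real HTML parser is safer for production use.
    private static final Pattern REPEAT_ELEMENT = Pattern.compile(
            "<(\\w+)\\b([^>]*?)\\s+data-repeat=\"(\\d+)\"([^>]*)>(.*?)</\\1>",
            Pattern.DOTALL);

    // Drops the custom attribute and wraps the element content in a named anchor,
    // so the information travels in a construct the importer is assumed to keep.
    static String toNamedAnchors(String html) {
        return REPEAT_ELEMENT.matcher(html).replaceAll(
                "<$1$2$4><a name=\"customrepeat$3\">$5</a></$1>");
    }
}
```

With this in place, the call in the original snippet would become `builder.insertHtml(RepeatMarkerPreprocessor.toNamedAnchors(html), true);`.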

    Please see the following documentation for reference:
    http://www.aspose.com/docs/display/wordsjava/Aspose.Words+Document+Object+Model
    http://www.aspose.com/docs/display/wordsjava/Using+DocumentBuilder+to+Modify+a+Document+Easily
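Once the document is loaded, the bookmark names can carry the custom information back to your code. The following is a small sketch of decoding a name such as 'customrepeat5'; the naming convention `custom<kind><number>` and the class `MarkerName` are assumptions for illustration. The names themselves would come from `Bookmark.getName()` while iterating `doc.getRange().getBookmarks()`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value type describing one decoded marker; not an Aspose.Words class.
final class MarkerName {
    // Assumed convention: "custom" + lowercase kind + numeric id, e.g. customrepeat5.
    private static final Pattern CONVENTION =
            Pattern.compile("custom([a-z]+)(\\d{1,9})");

    final String kind;
    final int id;

    private MarkerName(String kind, int id) {
        this.kind = kind;
        this.id = id;
    }

    // Returns the decoded marker, or empty for bookmarks that do not follow the
    // convention (for example the _Toc... bookmarks that Word generates).
    static Optional<MarkerName> parse(String bookmarkName) {
        Matcher m = CONVENTION.matcher(bookmarkName);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new MarkerName(m.group(1), Integer.parseInt(m.group(2))));
    }
}
```

Bookmarks that do not match are simply skipped, so ordinary bookmarks in DOCX or ODT building blocks would not interfere with the marker processing.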

    Hope this helps you. Please let us know if you have any more queries.