How do you want document properties exported to and imported from HTML?

At the moment, when exporting to HTML, Aspose.Words writes only the built-in document properties Subject, Keywords and Title as META elements.

There is a request from a customer to export all custom document properties to HTML. It is a good request and we are going to implement this feature. But before we do this, I would like to give you an opportunity to comment on this feature.

How do you want built-in and custom document properties exported/imported to HTML in Aspose.Words?

There are two ways I see at the moment (if you have other ideas, let me know).

1. Do it the “HTML” way:

All built-in and custom document properties are simply exported as META elements.

Do you always want to see ALL built-in document properties exported? There are quite a few of them (about 30): https://reference.aspose.com/words/net/aspose.words.properties/builtindocumentproperties. I guess exporting them all as META elements, plus all custom properties, could be a bit too much. Another drawback is that there will be no easy way to distinguish between built-in and custom document properties except by name. This could lead to issues: essentially, two property collections will be merged into a single collection of META elements, so if a custom document property has the same name as a built-in one, one of them will have to be overridden.
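To make the name clash concrete, here is a minimal sketch in Python (not Aspose.Words code). The function name `export_as_meta` and the sample property values are assumptions made up for illustration; the sketch only shows what happens when two collections are flattened into one list of META elements.

```python
from html import escape

def export_as_meta(builtin, custom):
    """Option 1: write both property collections as one list of META elements."""
    merged = dict(builtin)
    merged.update(custom)  # on a name clash the custom value replaces the built-in one
    return [
        '<meta name="{}" content="{}">'.format(escape(name), escape(str(value)))
        for name, value in merged.items()
    ]

# Hypothetical sample data: both collections contain a property called "Title".
builtin_props = {"Title": "Sales report", "Subject": "Rebates"}
custom_props = {"Title": "My own title", "FileName": "SalesDeal.doc"}

meta_lines = export_as_meta(builtin_props, custom_props)
print("\n".join(meta_lines))
```

Only three META elements come out of four properties, and the built-in Title is lost. Which of the two wins is an arbitrary choice in this sketch.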

2. Do the Microsoft Word way and export to HTML like this:

my name
some template.dot
q

SalesDeal.doc
Create Sales Deal Rebates
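For readers who have not looked at HTML saved by Microsoft Word, the sketch below generates a block in that general shape: an `<xml>` island inside a conditional comment, with one section for built-in and one for custom properties. It is Python, not Aspose.Words code. The values are the ones quoted above, but the element names I pair them with (Author, Template, LastAuthor, FileName, Description) are my assumptions, as is the function name `export_as_mso`.

```python
from xml.sax.saxutils import escape

def export_as_mso(builtin, custom):
    """Option 2: emit properties as a Word-style XML island (a sketch only)."""
    lines = ["<!--[if gte mso 9]><xml>", " <o:DocumentProperties>"]
    for name, value in builtin.items():
        lines.append("  <o:{0}>{1}</o:{0}>".format(name, escape(value)))
    lines.append(" </o:DocumentProperties>")
    lines.append(" <o:CustomDocumentProperties>")
    for name, value in custom.items():
        # Every custom value is treated as a string here; real documents
        # also carry numbers, dates and booleans.
        lines.append('  <o:{0} dt:dt="string">{1}</o:{0}>'.format(name, escape(value)))
    lines.append(" </o:CustomDocumentProperties>")
    lines.append("</xml><![endif]-->")
    return "\n".join(lines)

# Values from the post; the property names are assumed for illustration.
builtin_props = {"Author": "my name", "Template": "some template.dot", "LastAuthor": "q"}
custom_props = {"FileName": "SalesDeal.doc", "Description": "Create Sales Deal Rebates"}

mso_block = export_as_mso(builtin_props, custom_props)
print(mso_block)
```

Because the two collections live in separate elements, a custom property can share a name with a built-in one without either being lost. The sketch does not handle property names that are not valid XML element names (for example, names containing spaces).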

Note that I want export and import of document properties in Aspose.Words to agree on the same format. Looking forward to your responses.

Hello colleagues!

I think we can parameterize the property output on export and try to read both forms on import.

Currently we have one Boolean option in SaveOptions: HtmlExportDocumentProperties. If it is true, all built-in and custom properties are output to HTML in the Microsoft Word manner. At the same time, some meta declarations are output unconditionally. We can change the option to an enumeration with values like None, Meta, Mso and MetaAndMso. In that case we don’t have to retain the Boolean switch, so we can reuse the same name. The option is rarely used, so this won’t be a severe breaking change. As before, some meta declarations will be output unconditionally.
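A rough sketch of how such an enumeration could drive the writer, in Python rather than the actual .NET code. The member names follow the values proposed above; the helper `forms_to_write` is hypothetical, and the meta declarations that are always written are deliberately left out.

```python
from enum import Enum

class HtmlExportDocumentProperties(Enum):
    """Proposed replacement for the Boolean switch (names as suggested above)."""
    NONE = "None"
    META = "Meta"
    MSO = "Mso"
    META_AND_MSO = "MetaAndMso"

def forms_to_write(mode):
    """Return (write_meta, write_mso) for the given export mode."""
    modes = HtmlExportDocumentProperties
    write_meta = mode in (modes.META, modes.META_AND_MSO)
    write_mso = mode in (modes.MSO, modes.META_AND_MSO)
    return write_meta, write_mso

for mode in HtmlExportDocumentProperties:
    print(mode.value, forms_to_write(mode))
```

In this sketch, the old `true` corresponds to `MSO` and the old `false` to `NONE`, which is one possible migration path for existing callers.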

We can distinguish built-in and custom properties by a prefix right in the property names. For instance, we can output built-in properties without any prefix, since they are more frequent, and prepend “Custom:” to the names of custom properties.
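A minimal sketch of the prefix idea in Python, showing that the name survives a round trip through export and import. The helper names `to_meta_name` and `from_meta_name` are made up for illustration.

```python
CUSTOM_PREFIX = "Custom:"

def to_meta_name(name, is_custom):
    """Export side: built-in names stay as they are, custom names get the prefix."""
    return CUSTOM_PREFIX + name if is_custom else name

def from_meta_name(meta_name):
    """Import side: return (property name, is_custom)."""
    if meta_name.startswith(CUSTOM_PREFIX):
        return meta_name[len(CUSTOM_PREFIX):], True
    return meta_name, False

print(to_meta_name("Title", is_custom=False))
print(to_meta_name("Title", is_custom=True))
print(from_meta_name("Custom:FileName"))
```

One open point with this scheme: a built-in name can never start with the prefix, but arbitrary META elements written by other tools could, so import would treat those as custom properties.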

Another way is to use the scheme attribute. According to the HTML standard, it helps user agents interpret metadata values, and it can be any recognizable hint. Again, built-in properties would be output without a scheme, and custom ones would be marked with one.
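Here is a sketch of the import side of this variant, using Python's standard `html.parser`. In HTML 4.01 the META attribute is spelled `scheme`. The marker value `custom` and the class name `MetaCollector` are assumptions; the specification does not prescribe any particular scheme value.

```python
from html.parser import HTMLParser

CUSTOM_SCHEME = "custom"  # hypothetical marker value for custom properties

class MetaCollector(HTMLParser):
    """Sort META elements into built-in and custom properties by scheme."""

    def __init__(self):
        super().__init__()
        self.builtin = {}
        self.custom = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if "name" not in attributes or "content" not in attributes:
            return  # e.g. http-equiv declarations are not properties
        is_custom = attributes.get("scheme") == CUSTOM_SCHEME
        target = self.custom if is_custom else self.builtin
        target[attributes["name"]] = attributes["content"]

sample_html = (
    '<meta name="Title" content="Sales report">\n'
    '<meta name="Title" scheme="custom" content="My own title">\n'
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
)

collector = MetaCollector()
collector.feed(sample_html)
print(collector.builtin)
print(collector.custom)
```

Unlike the prefix approach, the property name itself is left untouched, so a built-in and a custom property with the same name can coexist.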

As in the current implementation, we won’t output empty properties. We could also let the user specify which particular properties to output (a collection of names), but I think this is overkill.

When importing HTML, we should try to interpret both META elements and MSO properties. Since the MSO technique is more specific, it should take priority over metadata.
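The priority rule can be sketched as a simple merge in Python. It assumes both forms have already been parsed into name/value dictionaries; the function name `merge_imported` and the sample values are hypothetical.

```python
def merge_imported(meta_props, mso_props):
    """Combine properties read from both forms; MSO values win on conflict."""
    merged = dict(meta_props)
    merged.update(mso_props)  # MSO is the more specific source, so it overrides
    return merged

# Hypothetical parse results from one HTML file that contains both forms.
from_meta = {"Title": "Title from META", "Keywords": "sales, rebates"}
from_mso = {"Title": "Title from MSO", "Author": "my name"}

imported = merge_imported(from_meta, from_mso)
print(imported)
```

Properties that appear in only one form are kept, so a file exported with only META or only MSO output is read the same way.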

Here is a useful link to the relevant chapter of the HTML 4.01 Specification:

http://www.w3.org/TR/html4/struct/global.html#didx-meta_data

Regards,