Copying formatted text with styles between documents - open for your comments

Copying styled text between documents was a problem in Aspose.Word for some time. Aspose.Word 3.0.2 solves it: when you call Document.ImportNode, the appropriate styles are imported into the destination document, in the same way that MS Word does when you paste text from another document.

The process goes like this:

  1. Built-in styles are matched using a locale-independent style identifier. User-defined styles are matched using the case-sensitive style name.

  2. If a matching style already exists in the destination document, the imported nodes are updated to reference that style.

  3. If a matching style is not found in the destination document, the style (and all styles referenced by it) is copied into the destination document and the imported nodes are updated to reference the new style.
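
The three steps above can be sketched as a small model. This is only an illustration, not the Aspose.Word implementation: the `Style` class, the `builtin_id` field, and the `find_match` and `import_style` helpers are hypothetical names made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Style:
    name: str                      # visible name; may be localized for built-in styles
    builtin_id: Optional[int]      # hypothetical locale-independent id; None for user styles
    formatting: Dict[str, object] = field(default_factory=dict)
    based_on: Optional["Style"] = None


def find_match(style, dst_styles):
    """Step 1: built-ins match by identifier, user styles by case-sensitive name."""
    for candidate in dst_styles:
        if style.builtin_id is not None:
            if candidate.builtin_id == style.builtin_id:
                return candidate
        elif candidate.builtin_id is None and candidate.name == style.name:
            return candidate
    return None


def import_style(style, dst_styles):
    """Return the destination style that an imported node should reference."""
    match = find_match(style, dst_styles)
    if match is not None:
        return match               # step 2: reuse the existing destination style
    # Step 3: copy the style, importing the style it is based on first.
    base = import_style(style.based_on, dst_styles) if style.based_on else None
    copy = Style(style.name, style.builtin_id, dict(style.formatting), base)
    dst_styles.append(copy)
    return copy


dst = [Style("Normal", 0, {"font": "Arial"})]
src_normal = Style("Standard", 0, {"font": "Times New Roman"})  # localized "Normal"
src_quote = Style("MyQuote", None, {"italic": True}, based_on=src_normal)

imported = import_style(src_quote, dst)
print(imported.name, "is based on", imported.based_on.name)
```

In this model "MyQuote" is copied, but it ends up based on the destination's Normal (Arial) rather than the source's, which is exactly why the imported text can look different.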

---------------

I see a problem with the above approach:

Sometimes users don’t care about the styles and just want the text copied from one document to another to look exactly as it did in the original document. This will not always be the case, because existing styles in the destination document could be different.

The problem is further aggravated by the fact that even a blank document already contains Normal, Heading 1, Heading 2 and Heading 3 styles. This means that if any of those (quite frequently used) styles are different between the two documents you are likely to have your text look different in the destination document.

Ideally, I would like to provide an option that specifies “copy formatted text and preserve full formatting”, but I’m not sure how this should work from the user’s point of view.

Should Aspose.Word disregard styles and make all formatting direct on the copied text? Or should Aspose.Word rename all styles that match up and create a new set of styles in the destination document (for example, if Normal is defined, it will create Normal_2 and so on)?

Let me know if you have any ideas.

I think being able to do both would be good, or you could go the way Word does it and offer 4 different options.

  1. “Keep Source Formatting”
    a. Apply direct formatting if there is a clash between styles with the same name, and copy new styles.
  2. “Use Destination Styles”
    a. Use the destination document styles and copy new styles.
  3. “Match Destination formatting”
    a. Apply direct formatting to the copied text then apply current position formatting to the copied text.
  4. “Keep Text Only”
    a. Remove all formatting from current text.

I tried in Word 2003 and this seems to be how they deal with this problem.
That’s my two cents.
Hope that helps
Toby

Found this resource describing them a little better than me.
http://www.shaunakelly.com/word/styles/HowPasteOptionsWorks.html

Toby

I think Toby’s suggestion would work well for most instances. The biggest thing we have run into so far is one of the standard headings (e.g. Heading 3) meaning different things in different documents. I think renaming the styles and then copying them into the new document is an adequate solution if we want to preserve the formatting from the source document.

John

Roman,

If you were to take this approach, how soon would this functionality be available? I am asking because one of our customers is really affected by this issue and we would like to be able to give them a timeline for resolution.

Thanks,

John

Roman,

Do you have a solution for this in development right now? If so, when can we expect it?

Thanks,

John

Thanks for the ideas. I think I will try to provide just two options: the current behavior, which does not override destination styles (it matches one of the MS Word options), and a new option to copy with all formatting.

I’m still thinking about how to implement this copy-with-all-formatting option. Should it copy the required styles into the destination document and rename them to avoid conflicts, or should it just “expand” all styled formatting into direct formatting so that styles don’t need to be copied?

I don’t want to promise, but I hope to fit it into the next 3-4 weeks or so.

My thoughts would be to create new styles in the destination document, so if the destination document is later edited the user can still use styles to format text.

It is easier for me to do, but the problem is that if the user copies from document A to document B, say, a dozen times, new styles will be generated a dozen times.

Well, if you used some sort of naming convention like Heading 1-1, you could check those styles and only create a new one if there isn’t already a style named by your convention that has the same properties:

Doc1
Heading 1 = 10pt Times New Roman Bold

Doc2
Heading 1 = 10pt Courier New

Say we insert documents in the following order (Doc1, Doc2, Doc1, Doc2) into a document with no sections.

Inserting Doc1 will just automatically copy Heading 1 to the document.
Inserting Doc2 will see that Heading 1 exists and that it is different from its own formatting of Heading 1, so it will create Heading 1-1 in the final document.
Inserting Doc1 again will see that Heading 1 exists and check the formatting of Heading 1. Since they are the same, it will use the Heading 1 already in the document.
Inserting Doc2 again will check Heading 1 and see that it is different. It will then check Heading 1-1, find that it is the same, and use Heading 1-1 instead of creating new styles in the document.
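
The walkthrough above could be sketched roughly like this. It is a simplified model, not Aspose.Word code: styles are plain dictionaries of formatting attributes, and `resolve_style` is a hypothetical helper name.

```python
def resolve_style(name, formatting, dst_styles):
    """Return the name of the destination style to use for an imported style.

    dst_styles maps style name -> formatting dict and is updated in place.
    """
    candidate, suffix = name, 0
    while candidate in dst_styles:
        if dst_styles[candidate] == formatting:
            return candidate                  # identical style already present: reuse it
        suffix += 1
        candidate = f"{name}-{suffix}"        # try "Heading 1-1", "Heading 1-2", ...
    dst_styles[candidate] = dict(formatting)  # no identical style found: create one
    return candidate


doc1_h1 = {"size": 10, "font": "Times New Roman", "bold": True}
doc2_h1 = {"size": 10, "font": "Courier New"}

final_styles = {}
used = [resolve_style("Heading 1", fmt, final_styles)
        for fmt in (doc1_h1, doc2_h1, doc1_h1, doc2_h1)]
print(used)  # ['Heading 1', 'Heading 1-1', 'Heading 1', 'Heading 1-1']
```

Only two styles end up in the final document no matter how many times Doc1 and Doc2 are inserted. The sketch compares formatting dictionaries directly, which sidesteps the harder question of what makes two real styles identical.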

What do you think of this type of approach?

-John

Thanks, I was thinking along similar lines. This approach relies on the ability to recognize identical styles so duplicate new styles will not be created if you import from say document B into document A numerous times. But I found it would be pretty hard to figure out when two styles can be considered identical.

Let’s say we need to come up with some hash or checksum of a style. What fields of the style should be included in the checksum? It is okay to hash the style type and all formatting attributes of the style, but we also need to include some identifier of the style in the hash. We cannot hash by the style name, since it might change in the destination document (Heading 1 becomes Heading 1-1, for example), so we probably need to include the “original” style name in the hash. Also, the checksum of all based-on styles needs to be included. Maybe even the checksum of the NextParagraphStyle should be included (and all its based-on styles?).
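
As a rough illustration of such a checksum, here is a minimal sketch. The `Style` class and its fields, including `original_name`, are hypothetical and chosen only for the example. NextParagraphStyle is left out on purpose: a style can name itself as its next style, so including it would need cycle handling.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Style:
    name: str                      # current name, e.g. "Heading 1-1" after an import
    original_name: str             # name the style had in its source document
    style_type: str                # e.g. "paragraph" or "character"
    formatting: Dict[str, object] = field(default_factory=dict)
    based_on: Optional["Style"] = None


def style_checksum(style):
    """Hash type, formatting, original name, and the based-on chain.

    The current name is deliberately excluded because it may have been changed.
    """
    payload = {
        "type": style.style_type,
        "original_name": style.original_name,
        "formatting": style.formatting,
        "based_on": style_checksum(style.based_on) if style.based_on else None,
    }
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


normal = Style("Normal", "Normal", "paragraph", {"font": "Arial", "size": 10})
h1_src = Style("Heading 1", "Heading 1", "paragraph", {"bold": True}, based_on=normal)
h1_renamed = Style("Heading 1-1", "Heading 1", "paragraph", {"bold": True}, based_on=normal)
h1_other = Style("Heading 1", "Heading 1", "paragraph", {"bold": False}, based_on=normal)

print(style_checksum(h1_src) == style_checksum(h1_renamed))
print(style_checksum(h1_src) == style_checksum(h1_other))
```

The renamed copy hashes the same as the source style, while a style with different formatting, or one based on a different parent, does not.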

So I decided to take a simpler approach for now and implement an import option that expands all style formatting on the imported nodes into direct formatting. There will be two options when importing nodes:

Use Destination Styles - the way it works now. Imports styles only if they don’t exist in the destination document. Text might appear differently in the destination document if the styles are different.

Keep Source Formatting - the new option. Also imports styles if they don’t exist in the destination document, but if the style exists, expands the style formatting into direct formatting so the text appears exactly like in the original document. This will be available in 3.2 in the next few days.
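
The difference between the two options can be shown with a toy model. This is a sketch under simplifying assumptions, not the Aspose.Word implementation: styles and direct formatting are plain dictionaries, every style is assumed to define the same set of attributes, and `import_paragraph` and `effective_formatting` are hypothetical names.

```python
def effective_formatting(style_formatting, direct_formatting):
    """Direct formatting on a node overrides what its style defines."""
    return {**style_formatting, **direct_formatting}


def import_paragraph(para, src_styles, dst_styles, keep_source_formatting):
    """Return a copy of para (a dict) prepared for the destination document."""
    name = para["style"]
    imported = {"text": para["text"], "style": name, "direct": dict(para["direct"])}
    if name not in dst_styles:
        dst_styles[name] = dict(src_styles[name])   # both modes copy a missing style
    elif keep_source_formatting:
        # The style already exists: expand the source style into direct formatting.
        imported["direct"] = effective_formatting(src_styles[name], para["direct"])
    return imported


src_styles = {"Heading 1": {"font": "Times New Roman", "size": 10, "bold": True}}
dst_styles = {"Heading 1": {"font": "Courier New", "size": 10, "bold": False}}
para = {"text": "Intro", "style": "Heading 1", "direct": {}}

for mode in (False, True):
    p = import_paragraph(para, src_styles, dst_styles, keep_source_formatting=mode)
    print(mode, effective_formatting(dst_styles[p["style"]], p["direct"]))
```

With Use Destination Styles (False) the paragraph picks up Courier New from the destination. With Keep Source Formatting (True) it keeps Times New Roman bold, at the cost of carrying that formatting directly on the text.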

In the future, I might come back and implement that more complex option we’ve just been talking about:
Keep Source Formatting Smart - Copies in such a way that imported text appears exactly like in the original document, yet does not expand into direct formatting and does not create tons of new styles.

Thanks for the update, Roman. Will this be implemented on the Document.ImportNode function?

Hi,

We have released Aspose.Word 3.2.

  • Added the "Keep Source Formatting" option for copying content between documents. It ensures that the content looks exactly as it did in the original document.

    Hi Roman,
    What about the development plans for “Keep Source Formatting Smart”?

    Thanks.

    Hi Praneeth,

    Thanks for your inquiry. I have logged this feature request as WORDSNET-11032 in our issue tracking system. You will be notified via this forum thread once this feature is available.

    Moreover, we had already logged a related feature request as WORDSNET-4173: a new import format mode that makes a copy of a style only when the conflicting styles are actually different. Hopefully, this feature will be available in the near future.

    We apologize for the inconvenience.

    Hi Praneeth,

    PraneethS:
    Hi Roman,
    What about the development plans for “Keep Source Formatting Smart”?

    Could you please share some details about your requirements related to ‘Keep Source Formatting Smart’? We need these details to implement the feature. Thanks in advance for your cooperation.

    The issues you have found earlier (filed as WORDSNET-4173) have been fixed in this .NET update and this Java update.

    This message was posted using Notification2Forum from Downloads module by aspose.notifier.

    Hi Praneeth,

    PraneethS:
    What about the development plans for “Keep Source Formatting Smart”?

    Please use the latest version, Aspose.Words for Java 14.12.0, and use KeepDifferentStyles as the ImportFormatMode. This import mode makes a copy of a style only when it actually differs from the conflicting style in the destination document.

    The issues you have found earlier (filed as WORDSNET-11032) have been fixed in this .NET update and this Java update.
