DOCX or PDF document generation does not work on Linux after deployment if the file contains an image or an image content placeholder

Hi,

I am using Amazon Linux 2 and have deployed a .NET Core 3.1 application on it. All documents are generated fine except those that contain an image (PNG/JPG/GIF). The Linux build is a completely separate version, deployed for some of our clients.

@poonammishra Could you please attach your source and output documents along with simple code that will allow us to reproduce the problem?

PDFoutputFile.jpg (20.4 KB)
MacInput File.docx (48.7 KB)
Attached is the .docx input file with a content placeholder, which should be replaced with an image in the PDF output file.
Code snippet used:

public MemoryStream DoChangesForConversionToPdf(XmlDocument inputXml, MemoryStream memDocx)
{
    using (WordprocessingDocument wDoc = WordprocessingDocument.Open(memDocx, true))
    {
        foreach (var cc in wDoc.ContentControls().ToList())
        {
            var xpath = cc.XPath();
            if (xpath == "") continue;
            if (cc.Tag() == "Image") //Replace Image placeholders with actual images provided in input Xml
                cc.ReplaceImagePlaceholder(wDoc, inputXml);
        }
    }
    memDocx.Position = 0;
    return memDocx;
}

internal MemoryStream DoChangesForImages(XmlDocument inputXml, MemoryStream memDocx)
{
    using (WordprocessingDocument wDoc = WordprocessingDocument.Open(memDocx, true))
    {
        foreach (var cc in from cc in wDoc.ContentControls()
                                            .Where(cc => cc.Tag() == "Image").ToList()
                           let xpath = cc.XPath()
                           where xpath != ""
                           select cc)
        {
            cc.ReplaceImagePlaceholder(wDoc, inputXml);
        }
    }
    memDocx.Position = 0;
    return memDocx;
}

@poonammishra Thank you for the additional information. As far as I can see, you do not use Aspose.Words in this code. It looks like you are using the Open XML SDK to manipulate your documents, so it is not quite clear how the problem is related to Aspose.Words.

Yes, we are using Aspose.Words.
aspose.word.jpg (54.8 KB)

@poonammishra According to the code, Aspose.Words is used to convert the MS Word document to PDF. Does the DOCX generated by your application on Linux contain the image? If I understand correctly, the DOCX document is generated or modified by the Open XML SDK.

Yes, the document on Linux has the content placeholder, and we can assign the image file to this control. Yes, we use the Open XML SDK.

@poonammishra Please attach the DOCX document produced by the Open XML SDK (after the image is inserted) in the Linux environment, before it is processed by Aspose.Words. I will check the conversion on my side and let you know how it goes.

Hi Support Team,

It is the same file, but without the image in it. Everything else is the same. Attached is the file PDFoutputFile.jpg (20.4 KB)

The file does not have the icon picture in the left corner.

@poonammishra Could you please attach the actual DOCX document (not a screenshot) produced by the Open XML SDK (after the image is inserted) in the Linux environment, before it is processed by Aspose.Words?
Does the image exist in the DOCX document before it is processed by Aspose.Words?

Hi Support Team,

After updating the package and installing fontconfig on the Amazon Linux machine, as you suggested, I was able to generate the .docx file correctly. But after convertToPDF, the generated PDF file contains neither the image content placeholder nor the inserted image.

Hence the PDF file does not render the image. Kindly help me with this.

@poonammishra Could you please attach the actual (correct) DOCX document generated on your side and the PDF document that demonstrates the problem?
Also, please make sure that the SkiaSharp.NativeAssets.Linux package is referenced by your application.

Yes, SkiaSharp.NativeAssets.Linux is referenced in the application.
ADR_2447_copy.docx (36.7 KB)
pdfgenerated.jpg (69.7 KB)
The barcode is an image present in the .docx file, but when the same file is converted to PDF, the barcode image is missing from the PDF file.

@poonammishra Thank you for additional information. Unfortunately, I cannot reproduce the problem on my side. I have tested your scenario in Docker:

FROM amazonlinux:2 AS base

# Install .NET6 SDK
RUN rpm -Uvh https://packages.microsoft.com/config/centos/7/packages-microsoft-prod.rpm
RUN yum install dotnet-sdk-6.0 -y

WORKDIR /src
COPY ["TestNet6.csproj", "."]
RUN dotnet restore "./TestNet6.csproj"
COPY . .
WORKDIR "/src/."
RUN dotnet build "TestNet6.csproj" -c Release -o /app/build
RUN dotnet publish "TestNet6.csproj" -c Release -r linux-x64 --no-self-contained -o /app/publish

ENTRYPOINT ["dotnet", "/app/publish/TestNet6.dll"]

Here is .csproj file:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
    <CopyLocalLockFileAssemblies>true</CopyLocalLockFileAssemblies>
  </PropertyGroup>
  
   <PropertyGroup>
     <InvariantGlobalization>false</InvariantGlobalization>
   </PropertyGroup>

  <ItemGroup>
	  <PackageReference Include="Aspose.Words" Version="23.4.0" />
	  <PackageReference Include="SkiaSharp.NativeAssets.Linux.NoDependencies" Version="2.80.3" />
  </ItemGroup>

</Project>

Here is the test code:

using Aspose.Words;
using System;
using System.Diagnostics;

namespace Aspose.NetCore.TestRunner
{
    class Program
    {
        static void Main(string[] args)
        {
            License lic = new License();
            lic.SetLicense("/temp/Aspose.Words.NET.lic");

            Console.WriteLine("Test started.");

            // Time the load and the DOCX-to-PDF conversion.
            Stopwatch sw = Stopwatch.StartNew();

            Document doc = new Document("/temp/in.docx");
            doc.Save(@"/temp/out.pdf");

            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds / 1000d);

            Console.WriteLine("Done");
        }
    }
}

I ran the test using the following commands:

docker build -f Dockerfile_amazon -t awtest .

docker run --mount type=bind,source=C:\Temp,target=/temp --rm awtest from Docker

The output PDF document looks correct: out.pdf (22.4 KB)

Hi Support Team,

I was able to generate the PDF document after following all the steps mentioned above and installing SkiaSharp. However, there is a slight difference in the documents generated on the Linux machine: the bullet points are not rendered correctly. Screenshots are attached: pdfonLinux.jpg (55.7 KB)
pdfonwindows.jpg (32.7 KB)

Kindly help me with this, since the issue is critical and the timelines are tight.

@poonammishra This looks like a known peculiarity - Windows “Symbol” font (which is used for bullets) is a symbolic font (like “Webdings”, “Wingdings”, etc.) which uses Unicode PUA. MacOS or Linux “Symbol” font on the other hand is a proper Unicode font (for example Greek characters are in the U+0370…U+03FF Greek and Coptic block). So these fonts are incompatible and Mac/Linux “Symbol” font cannot be used instead of Windows “Symbol” without additional actions. In your particular case, it looks like, the bullet is represented as U+2022, but in Windows “Symbol” it is PUA U+F0B7 (or U+00B7 which also can be used in MS Word for symbolic fonts). So you should change U+2022 character to U+00B7:

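using System.Collections.Generic;
using System.Linq;
using Aspose.Words;

Document doc = new Document(@"C:\Temp\in.docx");

// Collect all runs formatted with the "Symbol" font.
List<Run> items = doc.GetChildNodes(NodeType.Run, true).Cast<Run>()
    .Where(r => r.Font.Name == "Symbol").ToList();

foreach (Run r in items)
{
    // Replace the Unicode bullet (U+2022) with U+00B7,
    // which Windows "Symbol" renders as a bullet.
    if (r.Text == "\u2022")
        r.Text = "\u00b7";
}

doc.Save(@"C:\Temp\out.pdf");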

Alternatively, you can use “Symbol” from Windows in your Linux environment.
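
If you go the font route, a minimal sketch of pointing Aspose.Words at an additional fonts folder could look like the following. The folder /usr/share/fonts/windows is only an assumed location where the Windows "Symbol" font file has been copied, and the input and output paths are placeholders; adjust them to your environment.

```csharp
using Aspose.Words;
using Aspose.Words.Fonts;

class Program
{
    static void Main()
    {
        // Keep the system fonts and add one extra folder.
        // "/usr/share/fonts/windows" is a hypothetical path.
        FontSettings fontSettings = new FontSettings();
        fontSettings.SetFontsSources(new FontSourceBase[]
        {
            new SystemFontSource(),
            new FolderFontSource("/usr/share/fonts/windows", true) // true = scan subfolders
        });

        Document doc = new Document("/temp/in.docx");
        doc.FontSettings = fontSettings;
        doc.Save("/temp/out.pdf");
    }
}
```

With the font available this way, the U+2022 replacement above should not be needed, but please verify this with your own documents.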

Hi Support Team,

Thanks for your help. There is one more issue that I need help with: there are discrepancies between the PDFs generated on Windows and on Linux.
1. Font issue: the PDF generated on Windows uses a different font than the one generated on Linux. Please refer to the attachments.
2. Footer issue: on Linux the footer breaks into two lines instead of one, as you can see when comparing the attached files.
I have a feeling that the PDF generated on Linux is somehow shrunk, so the footer wraps onto a second line.
Kindly help me with this; the timelines are critical and the client is waiting for this to be done.
pdfgeneratedonLinuxfontissue.jpg (70.6 KB)
pdfgeneratedonwindowsfont.jpg (58.3 KB)
pdfgeneratedonwindowsfooter.jpg (98.7 KB)
pdfgeneratedonLinuxfooter break issue.jpg (72.1 KB)

Let me know if you have any further questions.

@poonammishra Most likely the problem occurs because the fonts used in your input document are not available on the machine where the document is processed. If Aspose.Words cannot find a font used in the document, the font is substituted. This can lead to a font mismatch and to layout differences, because different fonts have different metrics. You can implement IWarningCallback to get notifications when font substitution is performed.
Please see our documentation to learn where Aspose.Words looks for fonts:
https://docs.aspose.com/words/net/specifying-truetype-fonts-location/
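
A minimal sketch of such a callback is shown below. The class name FontSubstitutionWarnings and the file paths are placeholders chosen for illustration; the callback simply prints each font substitution warning to the console during loading and saving.

```csharp
using System;
using Aspose.Words;

// Hypothetical class name; prints a line for every font substitution warning.
class FontSubstitutionWarnings : IWarningCallback
{
    public void Warning(WarningInfo info)
    {
        if (info.WarningType == WarningType.FontSubstitution)
            Console.WriteLine("Font substitution: " + info.Description);
    }
}

class Program
{
    static void Main()
    {
        Document doc = new Document("/temp/in.docx");

        // Assign the callback before saving so that warnings raised
        // while the layout is built are reported.
        doc.WarningCallback = new FontSubstitutionWarnings();

        doc.Save("/temp/out.pdf");
    }
}
```

If the console output on Linux lists substitutions that do not appear on Windows, the listed fonts are the ones to install or make available on the Linux machine.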

Thanks for your help. Why does the footer break into two lines instead of one? This is not what the client expects.

@poonammishra Most likely the cause is the same: font substitution. Since different fonts have different metrics, font substitution can cause layout differences.
Unfortunately, it is difficult to tell for sure what the problem is without your real documents. If the problem persists after installing the required fonts, please attach your input and output documents here for testing. We will check the issue and provide you with more information.