Our process stitches together hundreds of PDFs, some of which contain bullet points and a few other non-letter characters. Each PDF that does so includes its own subset of the SymbolMT font, so the stitched file ends up bloated with hundreds of these subsets.
The printers we use for our project tend to run out of memory fairly quickly, so we currently use iText to unembed these subsets and embed one full copy of the font from a TTF file embedded in our application. We are now trying to migrate to Aspose.Pdf, and I cannot find a way to do this.
To replace subsets of the SymbolMT font with a full embedded version in your PDFs using Aspose.Pdf, you can utilize the TextFragmentAbsorber class along with TextEditOptions. This allows you to search for text fragments that use the SymbolMT font and replace them with the full font from a TTF file.
Here’s a step-by-step approach you can follow:
- Load the PDF Document: Open the PDF file that you want to modify.
- Create a TextFragmentAbsorber: This will help you find all text fragments that use the SymbolMT font.
- Set TextEditOptions: Use the FontReplace option to specify that you want to replace the font.
- Iterate Through Text Fragments: For each text fragment found, set the font to the full version of the SymbolMT font.
- Save the Document: Finally, save the modified PDF.
Here’s a sample code snippet to illustrate this:
using Aspose.Pdf;
using Aspose.Pdf.Text;

public void ReplaceSymbolMTFont(string inputFilePath, string outputFilePath, string fullFontPath)
{
    // Load the PDF document
    Document document = new Document(inputFilePath);

    // Open the full font once from the TTF file. OpenFont takes a file path;
    // FindFont looks a font up by name and would not accept a path.
    Font fullFont = FontRepository.OpenFont(fullFontPath);

    // Create TextFragmentAbsorber to find all text fragments
    TextFragmentAbsorber absorber = new TextFragmentAbsorber();

    // Set text edit options so that fonts left unused after the replacement are removed
    TextEditOptions options = new TextEditOptions(TextEditOptions.FontReplace.RemoveUnusedFonts);
    absorber.TextEditOptions = options;

    // Accept the absorber for all pages
    document.Pages.Accept(absorber);

    // Iterate through all text fragments
    foreach (TextFragment textFragment in absorber.TextFragments)
    {
        // Check if the font is SymbolMT. Subset names usually carry a tag
        // prefix (for example "ABCDEF+SymbolMT"), so match with Contains.
        if (textFragment.TextState.Font.FontName.Contains("SymbolMT"))
        {
            // Replace with the full font
            textFragment.TextState.Font = fullFont;
        }
    }

    // Save the modified document
    document.Save(outputFilePath);
}
Make sure to replace fullFontPath with the actual path to your full SymbolMT TTF file. This approach should reduce the number of font subsets in your stitched PDFs and ease the memory load on your printers.
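One detail worth keeping in mind when matching on the font name: the PDF format names an embedded subset with a tag of six uppercase letters and a plus sign in front of the real name, for example ABCDEF+SymbolMT. The helper below is only a sketch of a stricter match than a plain substring test. It uses no Aspose.Pdf calls, and the names SubsetFontNames, StripSubsetTag and IsFont are made up for illustration.

```csharp
using System;

// Hypothetical helper (not part of Aspose.Pdf): compares font names while
// ignoring the "ABCDEF+" tag that PDF producers put in front of subset names.
public static class SubsetFontNames
{
    // Returns the font name without a leading subset tag, if one is present.
    public static string StripSubsetTag(string fontName)
    {
        if (fontName == null) throw new ArgumentNullException(nameof(fontName));

        // A subset tag is exactly six uppercase ASCII letters followed by '+'.
        if (fontName.Length > 7 && fontName[6] == '+')
        {
            bool allUpper = true;
            for (int i = 0; i < 6; i++)
            {
                if (fontName[i] < 'A' || fontName[i] > 'Z')
                {
                    allUpper = false;
                    break;
                }
            }
            if (allUpper) return fontName.Substring(7);
        }
        return fontName;
    }

    // True when fontName is baseName itself or a tagged subset of it.
    public static bool IsFont(string fontName, string baseName)
    {
        return string.Equals(StripSubsetTag(fontName), baseName, StringComparison.Ordinal);
    }
}
```

In the loop above, a condition such as SubsetFontNames.IsFont(textFragment.TextState.Font.FontName, "SymbolMT") could then stand in for the name check, assuming FontName returns the name with its subset tag.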
If you have any further questions or need additional assistance, please let me know!
Thank you for the blazing fast response. Let me try the code you’ve provided, and I will reply with the results.
EDIT: It didn’t quite work out of the box because of how I store the font file (as an embedded resource in the assembly), so here’s my version, which worked*
using System;
using System.Reflection;
using Aspose.Pdf;
using Aspose.Pdf.Text;

// Extension methods must live in a static class; this class name is illustrative.
public static class DocumentFontExtensions
{
    /// <summary>
    /// Replace existing font subsets with a provided font from an embedded resource file
    /// </summary>
    /// <typeparam name="T">A type defined in the assembly that contains the embedded font file</typeparam>
    /// <param name="document">Aspose.Pdf <see cref="Document"/> to process</param>
    /// <param name="fullFontPath">Full resource name of the embedded font file.</param>
    public static void ReplaceFontSubsets<T>(this Document document, String fullFontPath)
    {
        // Load the embedded resource font file from the assembly, once for the whole document
        var assembly = typeof(T).Assembly;
        var fontStream = assembly.GetManifestResourceStream(fullFontPath);
        if (fontStream == null)
        {
            throw new InvalidOperationException("Embedded font resource not found: " + fullFontPath);
        }
        var fullFont = FontRepository.OpenFont(fontStream, FontTypes.TTF);

        // Create TextFragmentAbsorber to find all text fragments
        TextFragmentAbsorber absorber = new TextFragmentAbsorber();

        // Set text edit options to replace fonts
        TextEditOptions options = new TextEditOptions(TextEditOptions.FontReplace.RemoveUnusedFonts);
        absorber.TextEditOptions = options;

        // Accept the absorber for all pages
        document.Pages.Accept(absorber);

        // Iterate through all text fragments
        foreach (TextFragment textFragment in absorber.TextFragments)
        {
            // Check if the font is SymbolMT (subset names carry a prefix, hence Contains)
            if (textFragment.TextState.Font.FontName.Contains("SymbolMT"))
            {
                // Replace with the full font
                textFragment.TextState.Font = fullFont;
            }
        }
    }
}
* The result still had the Symbol font as an embedded subset, but only once. That isn’t what I originally asked for, but for all intents and purposes it does what I wanted. It now also has one embedded subset of the Arial font, and I’m not sure where that came from. Again, nothing breaking, but it wasn’t there in the original.
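For anyone who wants to check which fonts end up in the output, listing them before and after the replacement may help track down where an unexpected subset (such as the Arial one) comes from. This is a minimal sketch, assuming Aspose.Pdf exposes Document.FontUtilities.GetAllFonts() and the Font properties IsEmbedded and IsSubset as its documentation describes. The class and method names FontReport and PrintFonts are illustrative and not part of the library.

```csharp
using System;
using Aspose.Pdf;

// Hypothetical diagnostic helper: prints every font the document references.
public static class FontReport
{
    public static void PrintFonts(string pdfPath)
    {
        using (Document document = new Document(pdfPath))
        {
            // Assumption: GetAllFonts() returns one entry per font object in the file,
            // so a font embedded as several subsets shows up several times.
            foreach (Aspose.Pdf.Text.Font font in document.FontUtilities.GetAllFonts())
            {
                Console.WriteLine(
                    $"{font.FontName}: embedded={font.IsEmbedded}, subset={font.IsSubset}");
            }
        }
    }
}
```

Running it on the stitched file before and after ReplaceFontSubsets should show whether the SymbolMT entries collapse to one and at which step the extra font appears.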
It looks like you were able to sort out the issue that you were facing. Please feel free to create a new topic in case you need any kind of assistance.