When an Excel chart containing Japanese text is converted to SVG using Aspose, some Japanese words are split into individual characters.
One pattern we have noticed that seems to trigger the issue is when Japanese text outside of the axis categories contains “()”.
Code Snippet
import java.io.FileInputStream;
import java.util.Properties;
import com.aspose.cells.ImageOrPrintOptions;
import com.aspose.cells.SaveFormat;
import com.aspose.cells.Workbook;
import com.aspose.cells.Worksheet;

public class AsposeSvgConverter {
    public static void main(String[] args) throws Exception {
        Properties props = System.getProperties();
        // Store fonts that you want to use, if any, at the following location: ${java.io.tmpdir}/aspose/fonts/
        props.setProperty("Aspose.Cells.FontDirExc", System.getProperty("java.io.tmpdir") + "/aspose/fonts");
        // Store excel files that you want to process at the following location: ${java.io.tmpdir}/aspose/sample.xlsx
        Workbook book = new Workbook(new FileInputStream(System.getProperty("java.io.tmpdir") + "/aspose/sample.xlsx"));
        ImageOrPrintOptions imgOptions = new ImageOrPrintOptions();
        imgOptions.setSaveFormat(SaveFormat.SVG);
        for (Object obj : book.getWorksheets()) {
            Worksheet sheet = (Worksheet) obj;
            // Assuming only one chart is present on each chart sheet
            com.aspose.cells.Chart chart = sheet.getCharts().get(0);
            chart.toImage("sample.svg", imgOptions);
        }
    }
}
Hi,
We are using version 17.3.0. When we tried version 17.4.0, we observed the following:
Words are still split into individual characters.
Font selection has now changed; earlier it was picking up MS P Gothic and now it uses MS Gothic (so this might be a different issue?).
We investigated further and have the following observations. The words get split into individual characters because of the presence of parentheses. There are two variations of each parenthesis: one from the ASCII set (code point 40 for the opening parenthesis) and a double byte (fullwidth) version used with Japanese text (code point 65288 for the opening parenthesis),
i.e.,
Character “(” Code point Single byte: 40
Character “（” Code point Double byte: 65288
Character “)” Code point Single byte: 41
Character “）” Code point Double byte: 65289
When we mix the single byte parentheses with Japanese characters, Aspose seems to split the word into individual characters. However, if we use the double byte version instead, it retains it as a single word. Using an English keyboard with the default settings, typing a parenthesis produces the single byte version by default.
public class CodePointDemo {
    public static void main(String[] args) {
        // The first two literals are the double byte (fullwidth) parentheses. There is no space
        // before “（”; depending on the font/application, double byte characters render with extra width.
        System.out.println("（".codePointAt(0));
        System.out.println("）".codePointAt(0));
        // The last two literals are the single byte (ASCII) parentheses.
        System.out.println("(".codePointAt(0));
        System.out.println(")".codePointAt(0));
    }
}
Output
65288
65289
40
41
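As a possible stopgap on our side, the observation above suggests replacing single byte parentheses with their double byte counterparts before conversion whenever a label also contains Japanese characters. The following is only a minimal sketch using the standard Java library; the class and method names are hypothetical, and whether this actually avoids the splitting in Aspose is based solely on what we observed, not on anything confirmed by the product team.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.Cells.
final class ParenthesisNormalizer {
    // Matches any Hiragana, Katakana, or Han (kanji) character.
    private static final Pattern JAPANESE =
            Pattern.compile("[\\p{IsHiragana}\\p{IsKatakana}\\p{IsHan}]");

    // True when the label contains an ASCII parenthesis and at least one Japanese character.
    static boolean mixesAsciiParensWithJapanese(String label) {
        boolean hasAsciiParen = label.indexOf('(') >= 0 || label.indexOf(')') >= 0;
        return hasAsciiParen && JAPANESE.matcher(label).find();
    }

    // Replaces ( and ) with U+FF08 and U+FF09, but only in mixed labels;
    // labels without Japanese characters are returned unchanged.
    static String toFullwidthParens(String label) {
        if (!mixesAsciiParensWithJapanese(label)) {
            return label;
        }
        return label.replace('(', '\uFF08').replace(')', '\uFF09');
    }

    public static void main(String[] args) {
        // Katakana "tesuto" followed by "(A)", written with escapes to avoid source encoding issues.
        String label = "\u30C6\u30B9\u30C8(A)";
        String fixed = toFullwidthParens(label);
        System.out.println(fixed.codePointAt(3)); // prints 65288
        System.out.println(toFullwidthParens("Total (USD)")); // unchanged: no Japanese characters
    }
}
```

Note that this changes the text that ends up in the SVG, so it is a workaround rather than a fix: the output no longer matches the Excel content character for character.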
Hi Amjad,
We tested using 17.4.2 but have the same observations. When you look at the SVG in IE, it looks as if the text is not split, but if you select the two different words and use Inspect Element, you will see a difference in the way the two words are treated: one remains a single piece of text and the other has been split into individual characters.
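To check this without a browser, the text runs in the generated SVG can be listed programmatically. The sketch below uses only the JDK's built-in XML parser and assumes that each piece of text is emitted as its own <text> element, which is what Inspect Element suggested in our case; the class name and the sample markup are hypothetical and are not actual Aspose output.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper for inspecting SVG output; not part of Aspose.Cells.
final class SvgTextInspector {
    // Returns the text content of every <text> element, in document order.
    static List<String> textRuns(String svg) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(svg.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagNameNS("*", "text");
        List<String> runs = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            runs.add(nodes.item(i).getTextContent());
        }
        return runs;
    }

    public static void main(String[] args) throws Exception {
        // Illustrative markup only: one intact word and one word split per character.
        String svg = "<svg xmlns='http://www.w3.org/2000/svg'>"
                + "<text>\u30C6\u30B9\u30C8\uFF08A\uFF09</text>"
                + "<text>\u30C6</text><text>\u30B9</text><text>\u30C8</text>"
                + "<text>(</text><text>A</text><text>)</text>"
                + "</svg>";
        for (String run : textRuns(svg)) {
            System.out.println(run.length() + " char(s): " + run);
        }
    }
}
```

A label that survives as one run shows up as a single multi-character entry, while a split label shows up as a sequence of one-character entries.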
Hi,
e.g
Sample code:
Workbook book = new Workbook("F:\\Files\\japanese word1\\test1.xlsx");
ImageOrPrintOptions imgOptions = new ImageOrPrintOptions();
imgOptions.setSaveFormat(SaveFormat.SVG);
for (Object obj : book.getWorksheets()) {
    Worksheet sheet = (Worksheet) obj;
    // Assuming only one chart is present on each chart sheet
    com.aspose.cells.Chart chart = sheet.getCharts().get(0);
    chart.toImage("F:\\Files\\japanese word1\\out1sample1.svg", imgOptions);
}
Hi,
some cases.
The behavior doesn’t seem consistent. Our observation, as described in our earlier post, is that the splitting happens when Aspose encounters a single byte parenthesis and not when it encounters a double byte one. Is there any reason for this?
Hi Amjad, can we check with the product team whether there is a way to override this behavior, so that the SVG always contains exactly the text that is in the Excel file?