What is the specific difference between writing text with TextFragment and writing text with TextSegment? Why do single characters overlap? See why.png (247.0 KB)
The following are all the files and code used:
Scan file: test.jpg (387.5 KB)
OCR JSON data file: ocr.d.json.zip (12.2 KB)
Use TextFragment to generate the PDF:
val doc = Document()
// Converts the ocr.d.json positions to Aspose.PDF positions
val pageData = Ocr.convert("ocr.d.json")
val page = doc.pages.add()
page.setPageSize(pageData.width, pageData.height) // ocr.d.json -> {width,height}
val textBuilder = TextBuilder(page)
for (textData in pageData.textFragments) {
    val text = TextFragment().also {
        it.text = textData.text // ocr.d.json -> prism_wordsInfo -> {word}
        // ocr.d.json -> prism_wordsInfo.pos
        // (left,top),(right,top),(right,bottom),(left,bottom)
        // Max(leftBottom.y - leftTop.y, rightBottom.y - rightTop.y)
        it.textState.fontSize = textData.fontSize
    }
    textBuilder.appendText(text)
}
doc.save("new-ocr-text-fragment.pdf")
This is the resulting file: new-ocr-text-fragment.pdf (147.9 KB)
Use TextSegment:
val textBuilder = TextBuilder(page)
for (textData in pageData.textFragments) {
    val tf = TextFragment()
    // ocr.d.json - prism_wordsInfo.charInfo
    for (segmentData in textData.segments) {
        tf.segments.add(TextSegment().also {
            it.text = segmentData.text
            // ocr.d.json - prism_wordsInfo.charInfo {x, pageHeight - y}
            it.position = Position(segmentData.position.x, segmentData.position.y)
            it.textState.font = FontRepository.findFont("Arial")
            // ocr.d.json - prism_wordsInfo.charInfo {h}
            it.textState.fontSize = segmentData.fontSize
        })
    }
    textBuilder.appendText(tf)
}
This is the resulting file: new-ocr-text-segment.pdf (147.8 KB)
@shiyajun
A TextFragment corresponds to the text between a pair of BT/ET operators, and a TextSegment is the text drawn by a single Tj operator inside that BT/ET block. BT is the operator that starts a block of text output, and ET marks the end of it. The text itself is shown between these operators by a text-showing operator (such as Tj or TJ). The main point is that one TextFragment contains several TextSegments.
Furthermore, the recommended approach for adding text with position attributes is to use TextFragments. However, we will also investigate this scenario in more detail. Would you kindly share the complete code snippet in which you read the .json file and build the pageData/textData objects? This would help us test the scenario accordingly.
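To make the BT/ET and Tj relationship concrete, here is a minimal Kotlin sketch that prints the kind of content-stream text a fragment with several segments roughly corresponds to. It is purely illustrative: `Seg` and `contentStreamFor` are hypothetical names, not part of Aspose.PDF, and the operators the library actually writes (for example, how it positions each segment) may differ.

```kotlin
// Hypothetical illustration only; this is not Aspose.PDF API.
data class Seg(val text: String, val x: Double, val y: Double)

// Escape the characters that are special inside a PDF literal string.
fun escapePdf(s: String): String =
    s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

// One fragment -> one BT/ET block; each segment -> one Tj operator.
fun contentStreamFor(fontName: String, fontSize: Double, segments: List<Seg>): String =
    buildString {
        appendLine("BT")                            // begin text object
        appendLine("/$fontName $fontSize Tf")       // select font and size
        for (s in segments) {
            appendLine("1 0 0 1 ${s.x} ${s.y} Tm")  // place the segment (assumed here)
            appendLine("(${escapePdf(s.text)}) Tj") // show the segment's text
        }
        append("ET")                                // end text object
    }

fun main() {
    val segs = listOf(Seg("A", 10.0, 20.0), Seg("B", 18.0, 20.0))
    println(contentStreamFor("F1", 12.0, segs))
}
```

In this sketch every segment carries its own position, so two segments whose x values are closer together than the glyph width at the chosen font size would be drawn on top of each other.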
Maven dependency:
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.62</version>
</dependency>
Call Ocr.convert("ocr.d.json") to get the data:
import com.alibaba.fastjson.JSON
import com.alibaba.fastjson.JSONObject
import java.io.File
import java.io.FileNotFoundException
import java.nio.file.Files
import java.nio.file.Paths

class Ocr {
    companion object {
        fun convert(fileName: String): PageBo {
            if (Files.notExists(Paths.get(fileName))) {
                throw FileNotFoundException("json file not found")
            }
            val json = JSON.parseObject(File(fileName).readText())
            val result = PageBo().also {
                it.height = json.getDouble("height")
                it.width = json.getDouble("width")
                it.content = json.getString("content")
            }
            json.getJSONArray("prism_wordsInfo").forEach { text ->
                val word = text as JSONObject
                val textFragment = createTextFragment(word, result)
                result.textFragments.add(textFragment)
            }
            return result
        }

        private fun createTextFragment(word: JSONObject, page: PageBo): TextFragmentBo {
            // Corner order: (left,top),(right,top),(right,bottom),(left,bottom)
            val positionArray = word.getJSONArray("pos")
            val leftTop = positionArray[0] as JSONObject
            val rightTop = positionArray[1] as JSONObject
            val rightBottom = positionArray[2] as JSONObject
            val leftBottom = positionArray[3] as JSONObject
            // llx,lly (y is flipped from the OCR top-left origin to the PDF bottom-left origin)
            val position = PositionBo().also { p ->
                p.x = leftBottom.getDouble("x")
                p.y = page.height - leftBottom.getDouble("y")
            }
            // Font size = the larger of the left-edge and right-edge heights
            val fontSize = (leftBottom.getDouble("y") - leftTop.getDouble("y"))
                .coerceAtLeast(rightBottom.getDouble("y") - rightTop.getDouble("y"))
                .toFloat()
            val segments = mutableListOf<SegmentBo>()
            word.getJSONArray("charInfo").forEach {
                val segment = it as JSONObject
                segments.add(createSegment(segment, page))
            }
            return TextFragmentBo().also {
                it.text = word.getString("word")
                it.position = position
                it.rotate = word.getIntValue("direction")
                it.segments = segments
                it.fontSize = fontSize
            }
        }

        private fun createSegment(segment: JSONObject, page: PageBo): SegmentBo {
            val position = PositionBo().also { p ->
                // llx,lly
                p.x = segment.getDouble("x")
                p.y = page.height - segment.getDouble("y")
            }
            return SegmentBo().also {
                it.position = position
                it.text = segment.getString("word")
                it.fontSize = segment.getFloat("h")
            }
        }
    }
}
class PageBo {
    var height: Double = 0.0
    var width: Double = 0.0
    var content: String = Strings.EMPTY
    var textFragments: MutableList<TextFragmentBo> = mutableListOf()
}

class TextFragmentBo {
    var text: String = Strings.EMPTY
    var fontSize: Float = 14F
    var rotate: Int = 0
    var position: PositionBo = PositionBo()
    var segments: MutableList<SegmentBo> = mutableListOf()
}

class SegmentBo {
    var text: String = Strings.EMPTY
    var fontSize: Float = 0F
    var position: PositionBo = PositionBo()
}

class PositionBo {
    var x: Double = 0.0
    var y: Double = 0.0
}
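For reference, the two calculations the conversion relies on (flipping y from the OCR top-left origin to the PDF bottom-left origin, and taking the taller edge of the word box as the font size) can be checked in isolation. This is only a sketch with made-up numbers; `Pt`, `toPdfY` and `boxHeight` are hypothetical helper names that do not appear in the code above.

```kotlin
// Hypothetical helpers that mirror the arithmetic in Ocr.createTextFragment.
data class Pt(val x: Double, val y: Double)

// OCR coordinates grow downward from the top edge; PDF coordinates grow upward
// from the bottom edge, so y is mirrored against the page height.
fun toPdfY(pageHeight: Double, ocrY: Double): Double = pageHeight - ocrY

// The larger of the left-edge and right-edge heights of the word box.
fun boxHeight(leftTop: Pt, rightTop: Pt, rightBottom: Pt, leftBottom: Pt): Double =
    maxOf(leftBottom.y - leftTop.y, rightBottom.y - rightTop.y)

fun main() {
    // Made-up sample values, not taken from ocr.d.json.
    val leftTop = Pt(100.0, 50.0)
    val rightTop = Pt(200.0, 52.0)
    val rightBottom = Pt(200.0, 74.0)
    val leftBottom = Pt(100.0, 70.0)

    println(toPdfY(842.0, leftBottom.y))                            // 772.0
    println(boxHeight(leftTop, rightTop, rightBottom, leftBottom))  // 22.0
}
```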
@shiyajun
We are looking into the scenario and will get back to you in a while. Please give us some time.