Hi all,
I am working on extracting text and its properties from a PDF file, using the iTextSharp library. Initially I took the font size directly from the Tf operator, but then I hit a case where Tf gave a font size of 1, so I went with the explanation at this link: http://stackoverflow.com/questions/9329709/how-to-calculate-font-size-from-pdf-in-php/26357764#26357764. Today I got a PDF where things are a little trickier. The following is an extract from the qpdf output:

BT
/TT2 1 Tf
0 19.98 -19.98 0 52.68 56.6906 Tm
2 Tr
0 G
0 J 0 j 0.4 w 1 M []0 d
0 g
0 Tc
0 Tw
( )Tj
0 -1.4414 TD
( )Tj
ET

Now, in this case, how do I determine the actual font size? Adobe Reader shows the text at a readable size; it is definitely more than 16. Any input will be appreciated. The PDF spec seems vague on this (my opinion, of course).
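
One way to read the stream above: the six numbers before Tm are the text matrix [a b c d e f], and with a Tf size of 1 the effective glyph height comes from the length of the (c, d) row, not from a alone. The sketch below is plain Python, not iTextSharp. The helper name `text_matrix_scale` is made up for illustration, and it assumes the current transformation matrix (CTM) is the identity, which the extract does not confirm.

```python
import math

def text_matrix_scale(a, b, c, d, e, f):
    """Return (horizontal_scale, vertical_scale, rotation_degrees) for a Tm matrix.

    Hypothetical helper: it ignores the CTM, so it is only accurate
    when no enclosing 'cm' operator has scaled the page content.
    """
    horizontal = math.hypot(a, b)   # length of the transformed x unit vector
    vertical = math.hypot(c, d)     # length of the transformed y unit vector
    rotation = math.degrees(math.atan2(b, a))
    return horizontal, vertical, rotation

tf_size = 1.0                                    # from "/TT2 1 Tf"
tm = (0, 19.98, -19.98, 0, 52.68, 56.6906)       # from the Tm line
horizontal, vertical, rotation = text_matrix_scale(*tm)
effective_size = tf_size * vertical

print(effective_size, rotation)
```

Under that assumption the text is rotated by 90 degrees, which is why a is 0, and the effective size works out to 19.98 rather than 0.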

Recommended Answers

Have you tried using BaseFont.GetDocumentFonts()?
Perhaps this could be useful: http://itextsharp.10939.n7.nabble.com/How-to-get-the-list-of-embedded-fonts-in-a-PDF-td3153.html

All 4 Replies

Thanks for the response, Oxiegen, but I don't think that answers my question. I am able to get the font face and various other properties, but the font size comes out as 0. The Tf operator being used is: /TT2 1 Tf
So I scaled the font size of 1 by the value at index 0 of the text matrix, which in this case is 0: 0 19.98 -19.98 0 52.68 56.6906 Tm
So the font size ends up being 0. But I know this is not correct, as Adobe Reader shows the text at a readable size. I want to know what set of transformations I have to apply to get the correct font size. Following is a snapshot of the information I have:

Thank you, Oxiegen, that was helpful. I used the ConvertHeightFromTextSpaceToUserSpace function to scale the font size.
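
For anyone landing here later, this is a rough sketch of the idea behind a text-space-to-user-space height conversion as I understand it: push a vertical segment of the given height through the text matrix multiplied by the CTM, then measure its length. It is plain Python and not the iTextSharp source. The function names are hypothetical, and the CTM values are assumptions, since the extract shows no cm operator.

```python
import math

def to_3x3(a, b, c, d, e, f):
    # PDF matrices act on row vectors: [x y 1] * M
    return [[a, b, 0.0], [c, d, 0.0], [e, f, 1.0]]

def matmul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transform_point(x, y, m):
    return (x * m[0][0] + y * m[1][0] + m[2][0],
            x * m[0][1] + y * m[1][1] + m[2][1])

def height_in_user_space(height, tm, ctm):
    """Length of the segment (0,0)-(0,height) after applying Tm, then the CTM."""
    text_to_user = matmul(to_3x3(*tm), to_3x3(*ctm))
    x0, y0 = transform_point(0.0, 0.0, text_to_user)
    x1, y1 = transform_point(0.0, height, text_to_user)
    return math.hypot(x1 - x0, y1 - y0)

tm = (0, 19.98, -19.98, 0, 52.68, 56.6906)   # from the Tm line in the question
identity_ctm = (1, 0, 0, 1, 0, 0)            # assumed: no page-level scaling
doubled_ctm = (2, 0, 0, 2, 0, 0)             # hypothetical CTM, for comparison only

size_identity = height_in_user_space(1.0, tm, identity_ctm)
size_doubled = height_in_user_space(1.0, tm, doubled_ctm)
print(size_identity, size_doubled)
```

Measuring a transformed segment instead of reading a single matrix entry keeps the result correct for rotated text like this one, and it picks up any scaling applied by the CTM as well.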
