I'm trying to implement an algorithm described here: http://pub.uni-bielefeld.de/luur/download?func=downloadFile&recordOId=2497720&fileOId=2525546

The purpose of that algorithm is to create a concept hierarchy based on a document corpus. For instance, a "Dog" concept would be a child of the "Animal" concept in such a hierarchy.

To the point: I've come across a piece of the implementation that I can't figure out. It refers to a mysterious (t1, m) and (t2, n). Though t1 and t2 clearly refer to terms (and are mentioned earlier as the tuple (t1, t2)), m and n are never explained. Anywhere. I do computer science as a hobby, so I've never been formally taught anything to do with algorithms. What am I missing? So, in summary, I know what (t1, t2) is, but not what (t1, m) or (t2, n) is.

Here is a link to a picture of the piece that I am talking about: http://picpaste.com/Screen_Shot_2014-01-24_at_8.36.23_PM-l2Kem9mF.png If you want more context, look at the link at the start, which has the full paper. The part in question is on page 4, section 2.3, step 3a.

Also, in the next step, they introduce a lowercase h and start using (h, m) and (h, n).

I believe that H(t1) stands for a list of all the synonyms of t1 (or is a function to get all the synonyms of t1), so I know what that part is.

I would really appreciate your help, as I'm very lost.

All 2 Replies

If you want to understand this stuff, you need to study formal logic! Boolean predicate logic for sure! In my engineering studies in the mid-1960s I had to take a philosophy class as a requirement. I took the formal logic course (part of the philosophy dept), which has stood me in good stead for a 30+ year career in software engineering... :-) There are good links for learning this stuff on the internet, but you need to do some serious Google searching to find them. Start with Wikipedia: http://en.wikipedia.org/wiki/Predicate_logic

I believe that H(t1) stands for a list of all the synonyms of t1

According to the paper it stands for "a set of tuples (h, f), where h is a hypernym and f is the number of times the algorithm has found evidence for it".

So n stands for the number of times the algorithm has found evidence of h being a hypernym of t1 and m stands for the number of times the algorithm has found evidence of h being a hypernym of t2.
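To make that concrete, here is a minimal Python sketch of the data structure as I read the definition quoted above. It is not code from the paper: the terms, the counts, and the names H, evidence and common_hypernyms are all made up for illustration.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical evidence store: H[t] is a set of (hypernym, count) tuples,
# matching the quoted definition "a set of tuples (h, f)".
# The terms and counts below are invented for illustration only.
H: Dict[str, Set[Tuple[str, int]]] = {
    "dog": {("animal", 7), ("pet", 3)},
    "cat": {("animal", 5), ("pet", 4)},
    "animal": {("organism", 2)},
}


def evidence(term: str, hypernym: str) -> Optional[int]:
    """Return f if (hypernym, f) is in H(term), otherwise None."""
    for h, f in H.get(term, set()):
        if h == hypernym:
            return f
    return None


def common_hypernyms(t1: str, t2: str) -> Dict[str, Tuple[int, int]]:
    """Map each shared hypernym h to (n, m), where
    (h, n) is in H(t1) and (h, m) is in H(t2)."""
    counts1 = dict(H.get(t1, set()))  # hypernym -> count for t1
    counts2 = dict(H.get(t2, set()))  # hypernym -> count for t2
    shared = counts1.keys() & counts2.keys()
    return {h: (counts1[h], counts2[h]) for h in shared}
```

With this layout, a check like "(t1, m) is in H(t2)" becomes evidence(t2, t1), which returns the count m when t1 has been seen as a hypernym of t2 and None otherwise. The (h, n) and (h, m) pairs from the following step correspond to the entries returned by common_hypernyms(t1, t2).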

In general, whenever you're unsure what exactly something means, it's a good idea to go back to where it was defined and read the definition carefully. In this case, knowing that H produces a set of tuples, not just a set of hypernyms, was crucial to understanding what n and m are, so it's important to pay attention to those kinds of details.
