
I'm trying to implement an algorithm described here: http://pub.uni-bielefeld.de/luur/download?func=downloadFile&recordOId=2497720&fileOId=2525546

The purpose of that algorithm is to create a concept hierarchy based on a document corpus. For instance, a "Dog" concept would be a child of the "Animal" concept in such a hierarchy.

To the point: I've come across a piece of the implementation that I can't figure out. It refers to a mysterious (t1, m) and (t2, n). Though t1 and t2 clearly refer to terms (and are mentioned earlier as the tuple (t1, t2)), m and n are never explained. Anywhere. I do computer science as a hobby, so I've never been formally taught anything to do with algorithms. What am I missing? So, in summary, I know what (t1, t2) is, but not what (t1, m) or (t2, n) is.

Here is a link to a picture of the piece that I am talking about: http://picpaste.com/Screen_Shot_2014-01-24_at_8.36.23_PM-l2Kem9mF.png If you want more context, look at the link at the start, which has the full paper. The part in question is on page 4, part 2.3, step 3a.

Also, in the next step, they introduce a lowercase h and start using (h, m) and (h, n).

I believe that H(t1) stands in for a list of all the synonyms of t1 (or is a function to get all the synonyms of t1), so I know what that part is.

I would really appreciate your help as I'm very lost.

Edited by mgold: additional question

Last Post by sepp2k

If you want to understand this stuff, you need to study formal logic! Boolean predicate logic for sure! In my engineering studies in the mid-1960's I had to take a philosophy class as a requirement. I took the formal logic course (part of the philosophy dept), which has stood me in good stead for a 30+ year career in software engineering... :-) There are good links for learning this stuff on the internet, but you need to do some serious Google searching to find them. Start with Wikipedia: http://en.wikipedia.org/wiki/Predicate_logic


I believe that H(t1) stands in for a list of all the synonyms of t1

According to the paper it stands for "a set of tuples (h, f), where h is a hypernym and f is the number of times the algorithm has found evidence for it".

So n stands for the number of times the algorithm has found evidence of h being a hypernym of t1 and m stands for the number of times the algorithm has found evidence of h being a hypernym of t2.
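To make that concrete, here is a minimal Python sketch of how such an H could be represented. This is only an illustration and not the paper's implementation: the toy terms and counts, and the helper names evidence_count and common_hypernyms, are made up for the example.

```python
# Hypothetical toy data: H maps a term to a set of (hypernym, count) tuples,
# where count is how often evidence for that hypernym was found.
H = {
    "dog": {("animal", 3), ("pet", 1)},
    "cat": {("animal", 2)},
}

def evidence_count(H, term, hypernym):
    """Return f such that (hypernym, f) is in H[term], or None if there is none."""
    for h, f in H.get(term, set()):
        if h == hypernym:
            return f
    return None

def common_hypernyms(H, t1, t2):
    """Map each hypernym h shared by t1 and t2 to (n, m), where
    (h, n) is in H[t1] and (h, m) is in H[t2]."""
    counts1 = dict(H.get(t1, set()))  # hypernym -> count for t1
    counts2 = dict(H.get(t2, set()))  # hypernym -> count for t2
    shared = counts1.keys() & counts2.keys()
    return {h: (counts1[h], counts2[h]) for h in shared}
```

With this toy data, common_hypernyms(H, "dog", "cat") pairs "animal" with the two counts, which play the role of n and m above.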

In general whenever you're unsure what exactly something means, it's a good idea to go back to where it was defined and read the definition carefully. In this case knowing that H produces a set of tuples - not just a set of hypernyms - was crucial to understanding what n and m are, so it's important to pay attention to those kinds of details.
