In my previous articles, I explained how you could apply heuristic and statistical approaches for finding inter-annotator agreement between multiple annotators.

However, while applying those approaches, I found that measuring inter-annotator agreement on multi-label ranked data is difficult: traditional inter-annotator agreement techniques will almost always report no agreement or only slight agreement between annotators. Therefore, we need to post-process multi-label ranked annotations to obtain a more flexible degree of agreement between annotators.

After some research, I found a paper that proposes a simple yet intuitive postprocessing technique for multi-label ranked annotations. The paper can be found at this link.

I will not delve into the details of the paper. Instead, I will implement the proposed technique in Python. You can use the code in this article to post-process multilabel ranked annotations.

## The Dataset

I will not go into the details of the dataset used as a sample in this article since it has already been explained in my previous article on Finding Inter Annotator Agreement between three Annotators in Python.

At a high level, the dataset looks like this. Each row contains multilabel ranked annotations for a tweet.

- Rank 1 for the most likely emotion,
- Rank 2 for the second most likely emotion,
- Rank 3 for the third most likely emotion.
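To make the layout concrete, here is a small made-up dataframe in the same shape. The column names and rank values are hypothetical and are not taken from the actual dataset; unranked emotions are assumed to be stored as 0, as in the sample annotation used later in this article.

```python
import pandas as pd

# Hypothetical emotion columns and rank values, for illustration only.
sample_df = pd.DataFrame(
    [
        [1, 2, 0, 0],  # tweet 0: rank 1 = joy, rank 2 = anger
        [0, 1, 3, 2],  # tweet 1: rank 1 = anger, rank 2 = fear, rank 3 = sadness
        [0, 0, 0, 1],  # tweet 2: only fear is ranked
    ],
    columns=["joy", "anger", "sadness", "fear"],
)
print(sample_df)
```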

## Post-processing Multilabel Annotations

### Finding Degree of Presence of Ranks

The post-processing technique is mathematically explained in section 3.1 of the paper, under the heading **Manual annotation**. The formula for post-processing manual annotations finds the degree of presence of an emotion based on its rank value.

I will try to explain it in simple words. Suppose we have the following ranked annotation:

`[1, 2, 0, 0, 0, 0, 0, 0, 0]`

Here the first column (index 0) is assigned rank 1, while the second column (index 1) is assigned rank 2.

**Step 1:**

Find the emotions (column indices in this case) that have been assigned a rank. We call this list C. In the above example, the list C will be `[0, 1]`.

**Step 2:**

Find the rank values for C. We name these values `ranks`. The `ranks` list in our case will be `[1, 2]`.
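Steps 1 and 2 can be sketched in a few lines of plain Python on the sample annotation. The variable names are chosen for illustration, and a value of 0 is assumed to mean "not ranked".

```python
annotation = [1, 2, 0, 0, 0, 0, 0, 0, 0]

# Step 1: indices of the columns that received a rank (non-zero entries)
C = [idx for idx, rank in enumerate(annotation) if rank != 0]

# Step 2: the rank values found at those indices
ranks = [annotation[idx] for idx in C]

print(C)      # [0, 1]
print(ranks)  # [1, 2]
```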

**Step 3:**

Find the backward rank values, which according to the paper, are defined as:

1 + Card(C) - Original Rank Value.

For our sample annotation, the cardinality of C (Card(C)) is 2. Therefore, the backward rank values in our case are [2, 1].
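As a quick check of this step, the snippet below computes the backward ranks with NumPy. It uses the form 1 + Card(C) - rank, which is the one that reproduces the `[2, 1]` result for the sample annotation; the variable names are illustrative.

```python
import numpy as np

ranks = np.array([1, 2])   # rank values from Step 2
card_C = len(ranks)        # Card(C): number of ranked emotions

# Backward rank: the best rank (1) gets the largest value, the worst gets 1
backward_ranks = 1 + card_C - ranks
print(backward_ranks)  # [2 1]
```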

**Step 4:**

Find the final degree of presence by dividing the backward ranks by the sum of all ranks. For our sample annotation, the sum of the ranks is 1 + 2 = 3, so the degrees become [⅔, ⅓] ≈ [0.667, 0.333].

The post-processed annotation now looks like this:

`[0.667, 0.333, 0, 0, 0, 0, 0, 0, 0]`
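The last step, including writing the degrees back into the full-length annotation, might look like the following sketch. The inputs are the values worked out in the earlier steps, and the variable names are assumptions made for this example.

```python
import numpy as np

annotation = [1, 2, 0, 0, 0, 0, 0, 0, 0]
C = [0, 1]                          # ranked column indices (Step 1)
ranks = np.array([1, 2])            # rank values (Step 2)
backward_ranks = np.array([2, 1])   # backward ranks (Step 3)

# Step 4: normalise the backward ranks by the sum of the ranks
degree = backward_ranks / ranks.sum()

# Place each degree at its original column; unranked columns stay at 0
processed = [0.0] * len(annotation)
for position, column in enumerate(C):
    processed[column] = round(float(degree[position]), 3)

print(processed)
```

Because the backward ranks are just the original ranks in reverse order, the degrees of the ranked emotions in a row always sum to 1.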

I wrote a Python function `get_degree()` that implements the steps above on each row of a Pandas dataframe.

```
import numpy as np
import pandas as pd

def get_degree(annotation_df):
    processed_list = []
    for row in range(len(annotation_df)):
        annotations = annotation_df.iloc[row].tolist()
        # Step 1: indices of the columns that received a rank
        C = [i for i, e in enumerate(annotations) if e != 0]
        # Step 2: the rank values themselves
        ranks = [r for r in annotations if r]
        # Step 3: backward ranks, 1 + Card(C) - rank
        ranks_inv = 1 + len(C) - np.array(ranks)
        # Step 4: divide by the sum of the ranks
        degree = ranks_inv / sum(ranks)
        # Write the degrees back at their original column positions
        process_ranks = annotations.copy()
        for index, value in enumerate(C):
            process_ranks[value] = degree[index]
        process_ranks = [round(elem, 3) for elem in process_ranks]
        processed_list.append(process_ranks)
    processed_df = pd.DataFrame(processed_list, columns=annotation_df.columns)
    return processed_df
```

Let’s find the degree of presence of emotion based on rank using the `get_degree()` function defined in the above script.

Our input dataset looks like this.

The following script passes the above dataframe to the `get_degree()` function:

```
df1_processed = get_degree(file_50_1_SB)
df1_processed.head()
```

You can see from the above output that the emotion ranks are converted to their degree of presence.
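For larger dataframes, the same four steps can also be expressed without a Python-level loop. The sketch below is an alternative written for this article, not the implementation from the paper; the name `get_degree_vectorized` and the toy data are hypothetical, and rows with no ranked emotion are assumed to stay all zeros.

```python
import numpy as np
import pandas as pd

def get_degree_vectorized(annotation_df):
    ranks = annotation_df.to_numpy(dtype=float)
    ranked = ranks != 0                        # mask of ranked cells (Step 1)
    card = ranked.sum(axis=1, keepdims=True)   # Card(C) for each row
    # Step 3: backward ranks only where a rank exists, 0 elsewhere
    backward = np.where(ranked, 1 + card - ranks, 0.0)
    total = ranks.sum(axis=1, keepdims=True)   # sum of the ranks per row
    # Step 4: divide row-wise, leaving rows without any rank at 0
    degree = np.divide(backward, total,
                       out=np.zeros_like(backward), where=total != 0)
    return pd.DataFrame(degree.round(3),
                        index=annotation_df.index,
                        columns=annotation_df.columns)

# Toy input with made-up column names
toy_df = pd.DataFrame([[1, 2, 0], [0, 0, 0], [2, 1, 3]],
                      columns=["joy", "anger", "fear"])
toy_degrees = get_degree_vectorized(toy_df)
print(toy_degrees)
```

On the first toy row this gives the same 0.667 and 0.333 as the worked example, which makes it a convenient cross-check against the loop-based function.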

### Finding Final Annotation Values

In the paper, the authors have calculated the final annotation values by taking an average of the degree of presence in dataframes annotated by two different annotators.

The following script finds the mean degree of presence between two dataframes, where `df2_processed` is the output of `get_degree()` for the second annotator's dataframe:

```
df_mean_processed = (df1_processed + df2_processed)/2
df_mean_processed.head()
```
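To see what this averaging does, here is a tiny self-contained example with two made-up, already post-processed dataframes. The column names and values are hypothetical.

```python
import pandas as pd

cols = ["joy", "anger", "sadness"]
# Degrees of presence for the same tweet from two hypothetical annotators
annotator_1 = pd.DataFrame([[0.667, 0.333, 0.0]], columns=cols)
annotator_2 = pd.DataFrame([[1.0, 0.0, 0.0]], columns=cols)

# Element-wise mean of the two annotators
mean_df = (annotator_1 + annotator_2) / 2
print(mean_df)
```

Pandas aligns the two dataframes on their index and column labels before adding, so both should have the same rows and columns; any label missing from one of them would produce NaN in the result.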

The mean of two dataframes in my case looks like this:

Finally, the emotions having a mean value of greater than one are assigned a value of 1, while the rest of the emotions are discarded, i.e., assigned a value of zero.
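This final step can be sketched as a simple comparison against a cut-off. The helper name `binarize` and its `threshold` parameter are assumptions made for this example, and the 0.5 used below is only an illustrative value, not one taken from the paper. Since the mean degrees of presence lie between 0 and 1, it is worth checking the paper for the exact cut-off before applying this to real data.

```python
import pandas as pd

def binarize(mean_df, threshold):
    # 1 where the mean degree exceeds the cut-off, 0 everywhere else
    return (mean_df > threshold).astype(int)

# Made-up mean degrees of presence for one tweet
mean_df = pd.DataFrame([[0.8335, 0.1665, 0.0]],
                       columns=["joy", "anger", "sadness"])

final_df = binarize(mean_df, threshold=0.5)  # illustrative cut-off
print(final_df)
```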

I found this post-processing technique for multi-label ranked annotations very useful. I hope the function I implemented will be helpful for you in solving a similar problem.