Chinney 0 Newbie Poster

Hello everyone, hope you're having a great day :D

I have a project idea in mind, but I'm unsure how to go about it, or whether I can solve it at all.

I want to be able to classify messages into categories, where not all categories are known in advance. I want the system to be able to add new categories as it goes, and to classify new messages into these new categories. I am willing to take the system offline, retrain it on new data, and then redeploy it.

The environment I have in mind is a Twitch stream's live chat. Picture a bot that reads all incoming messages and assigns each one to a category. Not all messages need to be classified, but ones that relate to a specific topic should be. For example:

The message "Hi, how are you?" - Could be classified as a greeting, but ultimately I am not interested in such messages.

Whereas

The message "I'm really enjoying GTA right now" - Is of interest and we would want to assign this message to a category, likely a GTA category.

However, I can't possibly predict every category of interest up front, and this is where, for me, the complexity comes in.
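To make the problem a bit more concrete, here is a rough sketch of the behaviour I am imagining, not a proposed solution. Everything in it is an assumption on my part: the `OpenSetClassifier` name, the bag-of-words cosine similarity, and the 0.25 threshold are placeholders. Messages that don't match any known category well enough are set aside, and new categories get added during an offline step.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into simple word tokens.
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    # Cosine similarity between two word-count vectors (Counters).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class OpenSetClassifier:
    """Hypothetical sketch: known categories plus a pool of unmatched messages."""

    def __init__(self, threshold=0.25):
        self.threshold = threshold  # assumed cut-off, would need tuning
        self.categories = {}        # category name -> summed word counts
        self.unassigned = []        # messages that matched nothing well

    def add_category(self, name, examples):
        # Called offline: create or extend a category from example messages.
        counts = self.categories.setdefault(name, Counter())
        for msg in examples:
            counts.update(tokenize(msg))

    def classify(self, message):
        # Return the best-matching category, or None if nothing is close enough.
        vec = Counter(tokenize(message))
        best_name, best_score = None, 0.0
        for name, proto in self.categories.items():
            score = cosine(vec, proto)
            if score > best_score:
                best_name, best_score = name, score
        if best_score >= self.threshold:
            return best_name
        self.unassigned.append(message)
        return None


clf = OpenSetClassifier()
clf.add_category("GTA", ["I'm really enjoying GTA right now", "gta heists are fun"])
print(clf.classify("loving gta today"))
print(clf.classify("Hi, how are you?"))
```

The part this sketch does not solve is exactly the part I am stuck on: the `unassigned` pool mixes uninteresting chatter (greetings) with messages that belong to a category I haven't defined yet, and I don't know how to tell those apart automatically.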

If you have any ideas of possible methods to implement this, please let me know :D

A bot is already in place that can read all messages; I just need a Python script to classify them.
I am a BSc Comp Sci student and have studied an intro module in Natural Language Engineering.
I will be studying advanced NLE (Natural Language Engineering) in coming terms, and this project is, in a way, preparation for that.
I have experience with other machine learning methods such as Neural Networks, Genetic Algorithms, Hill Climbers, etc.