It's been quite a long time since I last played here. I don't know for sure whether I'll return to old activity levels eventually, but that's not what I'm posting about. What I'm here to talk about is the ability of computer programs to play forum mafia. The details are quite technical, so I'll spare you those and get straight to the results.
What it can do
Joke's on you, the results are also technical. This computer program (actually three of them, working together), henceforth titled Scumbot, reads the filters of the players and labels them either town or mafia. Assuming 25% of the other players are mafia (as it is from the perspective of a townie in a standard 13-person game) and the bot accuses 25% of people of being mafia, the bot's success rates are as follows:
Working with half a page of filter: 76% of townreads are correct, 29% of scumreads are correct (64% accurate overall)
With 1 page of filter: 77% on town, 33% on scum (66% overall)
With 2 pages: 78% on town, 33% on scum (67% overall)
With 4 pages: 78% on town, 34% on scum (67% overall)
With 8 pages: 79% on town, 37% on scum (69% overall)
All figures are accurate to within 1 percentage point. Accuracy metrics for more than 8 pages of filter are unreliable due to sample size issues.
Scumbot does not have godlike scumreading powers, but it is significantly better than blind guessing (which would have an accuracy of 75% on town, 25% on scum, and 62.5% overall). I don't know exactly how it compares to real players, but research I did last year indicates that about 33% of votes land on mafia by deadline. I do not as of yet have statistics for how often Scumbot's #1 scumread in a game would be mafia, but comparing a computer's reads in a vacuum to human votes in a game situation is flawed anyway. In conclusion, Scumbot's reads are decent but not groundbreaking compared to human players. As I continue to refine it, its reads may improve slightly, but it's doubtful that its scumread accuracy would improve by more than about 5 percentage points.
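For anyone wondering where the "overall" numbers come from, here is a minimal sketch of the arithmetic, assuming the overall figure is just the two read accuracies weighted by how often each read is made (75% townreads, 25% scumreads). The function name is made up for illustration and is not part of Scumbot.

```python
def overall_accuracy(town_read_acc, scum_read_acc, accuse_rate=0.25):
    """Weight each read's accuracy by how often that read is made.

    Assumption: the bot townreads (1 - accuse_rate) of players and
    scumreads the remaining accuse_rate.
    """
    return (1 - accuse_rate) * town_read_acc + accuse_rate * scum_read_acc


# Blind guessing: townreads are right 75% of the time, scumreads 25%.
baseline = overall_accuracy(0.75, 0.25)

# Half a page of filter: 76% on town, 29% on scum.
half_page = overall_accuracy(0.76, 0.29)

print(f"blind guessing: {baseline:.1%}")  # 62.5%
print(f"half a page:    {half_page:.0%}")  # 64%
```

The same formula gives the rest of the rows above, give or take the rounding in the inputs.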
How it works
The initial part of this process was downloading mass amounts of filters. The dataset was the filters of all players in all games from January 2013 to December 2016 (huge thanks to kita for compiling the database, it was a life saver). The full dataset was about 1500 filters and 267000 posts. Next, all posts were edited with some strategic regexing to be more consistent. All words were eventually replaced by numbers to convert the data into a csv format. Posts shorter than 50 words were padded and posts longer than 50 words were truncated to make all posts the same length. (More accuracy could potentially be eked out by raising the limit, but it would take so long for my computer to process it that it's not worth it at this stage.)
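To make the "words to numbers, pad or truncate to 50" step concrete, here is a rough sketch. It is not Scumbot's actual code: the tokenizing regex, the `<pad>` token, and the names `tokenize`, `build_vocab`, and `encode` are all assumptions for illustration.

```python
import re

PAD = "<pad>"   # hypothetical pad token, stored as id 0
POST_LEN = 50   # every post ends up exactly this many tokens long


def tokenize(post):
    """Split into lowercase words and single punctuation marks (assumed scheme)."""
    return re.findall(r"[a-z0-9']+|[^\sa-z0-9']", post.lower())


def build_vocab(posts):
    """Give every distinct token an integer id, in order of first appearance."""
    vocab = {PAD: 0}
    for post in posts:
        for tok in tokenize(post):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(post, vocab, length=POST_LEN):
    """Turn a post into exactly `length` ids: truncate long posts, pad short ones."""
    ids = [vocab[t] for t in tokenize(post) if t in vocab][:length]
    return ids + [vocab[PAD]] * (length - len(ids))


posts = ["Why are you voting me?", "town " * 80]
vocab = build_vocab(posts)
short_row = encode(posts[0], vocab)  # 6 real tokens followed by 44 pads
long_row = encode(posts[1], vocab)   # 80 tokens cut down to 50

# One csv line per post
print(",".join(map(str, short_row)))
```

Note that the question mark gets its own id here, which matters later since punctuation counts as a "word".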
The original plan was to use a recurrent neural network to evaluate every post, but differences are so small on a post-by-post level that this approach had little success. Instead, I converted each post into a row describing how often each word was used in the post. These rows were aggregated back into filters, so each row became the average number of times each word was used in the first 10, 20, 30, etc. posts of a given filter.
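The aggregation step can be sketched roughly like this, assuming each post is already a list of tokens and that one row is emitted after every 10th post. The names `post_counts` and `cumulative_rows` are hypothetical, and what happens to a filter shorter than 10 posts is my guess, not something stated above.

```python
from collections import Counter


def post_counts(tokens, tracked):
    """How many times each tracked word appears in one post."""
    counts = Counter(tokens)
    return [counts[w] for w in tracked]


def cumulative_rows(filter_posts, tracked, step=10):
    """Average per-post word counts over the first 10, 20, 30, ... posts.

    Assumption: posts past the last multiple of `step` are ignored, so a
    filter with fewer than `step` posts produces no rows at all.
    """
    rows = []
    totals = [0] * len(tracked)
    for i, tokens in enumerate(filter_posts, start=1):
        totals = [t + c for t, c in zip(totals, post_counts(tokens, tracked))]
        if i % step == 0:
            rows.append([t / i for t in totals])
    return rows


tracked = ["?", "town"]
# Toy filter: 10 posts with one "town" and one "?", then 10 posts of "? ?".
toy_filter = [["town", "?"]] * 10 + [["?", "?"]] * 10
rows = cumulative_rows(toy_filter, tracked)
print(rows)  # [[1.0, 1.0], [1.5, 0.5]]
```

Each row describes the filter so far, so a long filter contributes several rows of increasing length.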
These aggregated filters were then fed into a simple neural network (one hidden layer with 8 neurons). Extensive tweaking and study led me to isolate fifteen "words" that were useful in predicting players' alignments. Using totals of those 15 words, the machine would train on the dataset to learn how to predict new alignments.
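For the curious, a network of that shape (15 inputs, one hidden layer of 8 neurons, one output) fits in a few dozen lines of plain Python. This is only a sketch of the general technique, not Scumbot itself: the tanh and sigmoid activations, the learning rate, the plain SGD training loop, and the made-up toy data are all assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNet:
    """15 inputs -> 8 tanh hidden neurons -> 1 sigmoid output (P(scum))."""

    def __init__(self, n_in=15, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, y, lr=0.1):
        """One SGD step on cross-entropy loss for a single example."""
        hidden, out = self.forward(x)
        d_out = out - y  # gradient w.r.t. the output neuron's pre-activation
        for j, h in enumerate(hidden):
            d_hidden = d_out * self.w2[j] * (1 - h * h)  # uses the old w2[j]
            self.w2[j] -= lr * d_out * h
            self.b1[j] -= lr * d_hidden
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d_hidden * xi
        self.b2 -= lr * d_out


def log_loss(net, data):
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = net.predict(x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def toy_example(is_scum, rng):
    """Made-up data: scum overuse feature 0, town overuse feature 1."""
    x = [rng.random() * 0.2 for _ in range(15)]
    x[0 if is_scum else 1] += 1.0
    return x, 1.0 if is_scum else 0.0


rng = random.Random(1)
data = [toy_example(i % 2 == 0, rng) for i in range(40)]
net = TinyNet()
loss_before = log_loss(net, data)
for _ in range(200):
    for x, y in data:
        net.train_step(x, y)
loss_after = log_loss(net, data)
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

The toy data is far easier to separate than real filters, so the loss drop here says nothing about how well the real thing performs.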
A machine's guide to mafia
According to Scumbot, some words make people mafia and some words make people town. I use "words" lightly, because it also includes pad characters, punctuation marks, and BBCode.
The scummiest thing you can do is, in fact, ask a question. Question marks were consistently rated the best indicator that a player was scum. Other words that predict scumminess in a player are: even, like, day, what, the, and you. Semicolons are also scummy, but most of them occur due to incorrect parsing of images and gifs.
And it seems to hold true that you are what you post: The towniest word, according to Scumbot, is “town” itself. Further town-indicative words are they, probably, bad, and really. Periods and “pad” characters were also town-indicative. (A pad means the post was shorter than 50 words--brevity is apparently pro-town. Or maybe spamming is, depending on how you look at it.)
Each of these “words” allows a small insight into a player’s alignment. All fifteen of them together, however, allow for a medium-sized insight. The power of teamwork.
The future of forum mafia
Robots are not going to take over TL Mafia and drive out the human players. For one thing, Scumbot could be rendered totally useless by spamming the word “town” 50 times at the start of every post. It is also not even guaranteed to be as reliable on future games of mafia; the available dataset from 2014-2016 is only an approximation of the actual meta.
Furthermore, nobody else has the bot. I don’t plan on using it in games. (Obs threads, though, are fair game.) The main point of the creation of a mafia bot is to prove it can be done, and to see how far it can go. And to learn coding skills.
That’s all I really had to say. I hope you learned something. Have a nice day. I’ll be here to answer questions or to discuss the ethics of using bots in mafia.