New Twitter Algorithms

November 5, 2017

My Twitter block list had grown unmanageably large, and blocktogether.org could not remove blocks at anything like a reasonable rate to help me fix it. So I used my employer’s mighty-fine search engine to look for Go packages for the Twitter API, and found Anaconda.

I had to spend a little time checking back at the Twitter API pages, and reading the source code, but pretty quickly (spare time in one afternoon) I had a program put together to remove my old block list, printing it as it goes. I’m going to include two programs here in case anyone else wants a leg up to do something of their own, because I can only spend so much time on this. Maybe I should toss these back to the Anaconda author as examples.

This program gets your block list. If your list is long, it takes a while because of throttling:

package main

import (
	"fmt"
	"net/url"
	"github.com/ChimeraCoder/anaconda"
)

func main() {
	// Next three lines use secret strings from Twitter developer API.
	// Go there, follow your nose.  See in particular:
	// https://developer.twitter.com/en/docs/basics/authentication/guides/access-tokens
	anaconda.SetConsumerKey("your_consumer_key")
	anaconda.SetConsumerSecret("your_consumer_secret")
	api := anaconda.NewTwitterApi("your_access_token", "your_access_token_secret")
	// Debugging aid only: this prints your access token and secret.
	fmt.Println(*api.Credentials)

	v := url.Values{}

	cursor := "-1" // Initial cursor value
	for cursor != "0" {
		v.Set("cursor", cursor)
		v.Set("count", "5000") // 200 might be a better number
		result, err := api.GetBlocksList(v)

		if err != nil {
			fmt.Println("Err = ", err)
			return
		}
		fmt.Printf("#Users=%d\n", len(result.Users))
		for _, user := range result.Users {
			fmt.Printf("id=%s, name=%s\n", user.IdStr, user.Name)
		}
		cursor = result.Next_cursor_str
	}
}

This program undoes a block list supplied on standard input, one user id per line, printing each result as it goes. I had previously downloaded mine from blocktogether using a shell script someone provided as a workaround on the relevant blocktogether bug. Again, throttling will slow you down. I started it running last night, and it’s still running today (just had breakfast, started writing this):

package main

import (
	"bytes"
	"fmt"
	"io/ioutil"
	"net/url"
	"os"
	"strings"
	"github.com/ChimeraCoder/anaconda"
)

func main() {
	// Next three lines use secret strings obtained from Twitter developer API.
	// Go there, follow your nose.  See in particular:
	// https://developer.twitter.com/en/docs/basics/authentication/guides/access-tokens
	anaconda.SetConsumerKey("your_consumer_key")
	anaconda.SetConsumerSecret("your_consumer_secret")
	api := anaconda.NewTwitterApi("your_access_token", "your_access_token_secret")
	// Debugging aid only: this prints your access token and secret.
	fmt.Println(*api.Credentials)

	b, err := ioutil.ReadAll(os.Stdin)
	if err != nil {
		fmt.Printf("err=%v\n", err)
		return
	}
	lb := bytes.Split(b, []byte{'\n'})
	fmt.Printf("Number of lines = %d\n", len(lb))

	for i, bb := range lb {
		v := url.Values{}
		u := strings.TrimSpace(string(bb))
		if u == "" {
			continue
		}
		v.Set("user_id", u)
		user, err := api.Unblock(v)
		if err != nil {
			fmt.Printf("Unblock %s, err=%v\n", u, err)
		} else {
			fmt.Printf("Unblock %s ok, id=%s, name=%s, #%d\n", u, user.IdStr, user.Name, i)
		}
	}
}

The old block list had lots of people on it worth blocking, but also lots of people accidentally swept up in the huge pile of blocks. The plan for the new list is to create two sets of Twitter ids, “okay” and “vile”, and use those to obtain a smaller and more accurate block list.

Step one is to create the okay list; anyone that I follow is okay, anyone that those people follow is okay. Maybe I take that one iteration further; because these queries are rate-limited there’s a limit to how quickly I can form these sets.
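A minimal sketch of that expansion, assuming a hypothetical `friends` function that returns the ids one user follows (in a real program it would wrap a rate-limited API call; here it is a made-up in-memory map). The `depth` parameter is the number of iterations, and each id is visited at most once so no lookup is repeated within a run.

```go
package main

import "fmt"

// friendsFunc returns the ids that a user follows. Hypothetical: a real
// version would wrap a rate-limited Twitter API call.
type friendsFunc func(id string) []string

// okaySet collects everyone reachable from me within depth "follows" hops.
// depth=1 is the people I follow; depth=2 adds the people they follow.
func okaySet(me string, depth int, friends friendsFunc) map[string]bool {
	okay := map[string]bool{}
	seen := map[string]bool{me: true}
	frontier := []string{me}
	for d := 0; d < depth; d++ {
		var next []string
		for _, id := range frontier {
			for _, f := range friends(id) {
				if seen[f] {
					continue // already handled, do not look it up again
				}
				seen[f] = true
				okay[f] = true
				next = append(next, f)
			}
		}
		frontier = next
	}
	return okay
}

func main() {
	// Made-up follow graph for illustration.
	graph := map[string][]string{
		"me": {"a", "b"},
		"a":  {"c", "me"},
		"b":  {"c", "d"},
		"d":  {"e"},
	}
	friends := func(id string) []string { return graph[id] }
	fmt.Println(len(okaySet("me", 1, friends)), len(okaySet("me", 2, friends))) // prints "2 4"
}
```

Going one iteration further is just `depth = 3`, at the cost of one lookup for every id added at depth 2.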

Step two is to form the vile list.
That is anyone from the old block list whose name satisfies the following (case-insensitive) pattern:

'kek|deplorable|pepe|maga|gamer.*gate'
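In Go, the same test could look roughly like this. It is a sketch using the standard `regexp` package, with the `(?i)` flag for case-insensitivity; the function name `isVileName` and the sample names are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// vilePattern is the pattern from the post; (?i) makes it case-insensitive.
var vilePattern = regexp.MustCompile(`(?i)(?:kek|deplorable|pepe|maga|gamer.*gate)`)

// isVileName reports whether the pattern matches anywhere in a display name.
func isVileName(name string) bool {
	return vilePattern.MatchString(name)
}

func main() {
	// Made-up names for illustration.
	for _, name := range []string{"Deplorable Dave", "GaMeR_gate_4ever", "Jane Doe"} {
		fmt.Printf("%q vile=%v\n", name, isVileName(name))
	}
}
```

Because the pattern is unanchored, a name like "Magazine Reader" also matches on "maga", which is one reason to eyeball the resulting list before acting on it.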

I did a quick scan of the people on that list, and they’re all terrible. Anyone who would follow such a terrible person for any reason other than “what are the horrible crazy people saying today?” is not someone whose opinions I need to read, and they won’t listen to mine, and arguably by blocking them I will slightly reduce the noise on Twitter. For all I know they’re fake accounts intended to stir up trouble. So all those people get blocked. Assembling that list will take some time; at one probe per minute (which I think will usually get all of one user’s followers) it will take days. One sanity check is to see whether anyone on the “okay” list lands on the block list. I think the initial treatment is to not block them, but note the exception for manual correction.
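The sanity check amounts to splitting the candidates in two. A sketch, assuming the okay set is held as a map and the candidates are the follower ids collected for the vile accounts; the function name `partition` and the ids are hypothetical.

```go
package main

import "fmt"

// partition splits candidate ids into those to block and those that also
// appear in the okay set, which are held back for manual review.
func partition(candidates []string, okay map[string]bool) (block, review []string) {
	for _, id := range candidates {
		if okay[id] {
			review = append(review, id)
		} else {
			block = append(block, id)
		}
	}
	return block, review
}

func main() {
	// Made-up ids for illustration.
	okay := map[string]bool{"11": true, "12": true}
	followersOfVile := []string{"12", "20", "21"}
	block, review := partition(followersOfVile, okay)
	fmt.Println("block:", block)   // prints "block: [20 21]"
	fmt.Println("review:", review) // prints "review: [12]"
}
```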

That’s all I feel confident doing right now; I’ll watch for mistakes (both false positives and false negatives) and see if I can create more refined definitions of “okay” and “vile”. “Vile” is actually easy: just look at someone’s profile, and if it’s horrible, and what they tweet is so horrible that only a horrible person would follow them, then they’re vile. “Okay” is harder because I think it might be much larger and much vaguer; merely being someone I disagree with should not disqualify them from “okay”. The size is also an impediment because of rate-limiting; obviously I need to maintain a cache so I don’t wait to refetch information I already have.
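The cache can be as simple as a map in front of the slow lookup. A sketch, where `fetch` stands in for a hypothetical rate-limited call and the `followerCache` type is made up for illustration; since these runs take days, a real version would presumably also save the map to disk.

```go
package main

import "fmt"

// followerCache remembers results so each id costs at most one fetch.
type followerCache struct {
	fetch   func(id string) []string // hypothetical rate-limited lookup
	byID    map[string][]string
	fetches int // number of calls that actually reached fetch
}

func newFollowerCache(fetch func(id string) []string) *followerCache {
	return &followerCache{fetch: fetch, byID: map[string][]string{}}
}

// Followers returns the cached answer if present, else fetches and stores it.
func (c *followerCache) Followers(id string) []string {
	if ids, ok := c.byID[id]; ok {
		return ids
	}
	c.fetches++
	ids := c.fetch(id)
	c.byID[id] = ids
	return ids
}

func main() {
	// Stand-in for the real lookup; it just invents two follower ids.
	slow := func(id string) []string { return []string{id + "-f1", id + "-f2"} }
	c := newFollowerCache(slow)
	c.Followers("42")
	c.Followers("42") // served from the cache
	c.Followers("43")
	fmt.Println("fetches:", c.fetches) // prints "fetches: 2"
}
```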

Over time I expect I will discover more “vile” people, and so I need something into which I can just drop a name and have it automatically alert me if that person’s followers overlap the “okay” list in a big way, and otherwise just block all their followers. This is pretty much what the Twitter blockchain app does, but that lacks as comprehensive a definition of “okay”, and I lose track of the difference between the “vile” people and those who are merely blocked, so I’d like to keep this information myself.
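One way to pin down “in a big way” is a cutoff on the share of a new account’s followers that are already okay. This is only a sketch: the 10% threshold and the names `okayShare` and `decide` are assumptions for illustration, not something I have tuned.

```go
package main

import "fmt"

// alertThreshold is a hypothetical cutoff: above this share of "okay"
// followers, ask a human instead of blocking.
const alertThreshold = 0.10

// okayShare returns the fraction of followers that are in the okay set.
func okayShare(followers []string, okay map[string]bool) float64 {
	if len(followers) == 0 {
		return 0
	}
	hits := 0
	for _, id := range followers {
		if okay[id] {
			hits++
		}
	}
	return float64(hits) / float64(len(followers))
}

// decide returns "alert" when the overlap is large, otherwise "block".
func decide(followers []string, okay map[string]bool) string {
	if okayShare(followers, okay) > alertThreshold {
		return "alert"
	}
	return "block"
}

func main() {
	// Made-up ids for illustration.
	okay := map[string]bool{"1": true, "2": true}
	fmt.Println(decide([]string{"1", "2", "8", "9"}, okay)) // half are okay: prints "alert"
	fmt.Println(decide([]string{"7", "8", "9"}, okay))      // none are okay: prints "block"
}
```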

Two other programs that would be nice to write would implement time-limited block and mute. Muting, especially, is just a way to get someone who’s gone on some stupid rant out of your display so you don’t need to consciously ignore them (for example, if some otherwise sane person decides they want to rehash the 2016 Democratic primaries); they’ll eventually stop ranting, and normally they say worthwhile stuff, which is why you follow them. A time-limited block might be for when an otherwise sane person says something that really pisses you off temporarily.

And no, I’m not creating a bubble. I grew up a mile from a KKK bookstore, with plenty of racists and children of racists around, and I can read the news anytime I want to see what the Nazis and racists are up to and how President Very-Fine-People has excused their vileness this week. I use Twitter for my purposes, not somebody else’s.
