I tweeted 216K, and I’m not a robot

Zhao Feng
5 min read · Apr 14, 2021

Background

I have regularly downloaded my Twitter archive over the past decade, but I never thought to take a look. That changed recently when, in a school workshop called My Data, initiated by my supervisor Anya Ernest, I was encouraged to dive into the past to understand data, and myself, better.

Below is how I reviewed my Twitter history.

A quick stat of my account:

My workflow

As usual, I followed Twitter’s instructions for requesting an archive of my data and downloaded a zipped file. After I unzipped it, the file structure looked like this:

The structure of the unzipped archive file

README.txt provides a necessary introduction to the files and acts as a data dictionary. It also helped me narrow my work down to the .js files and the .html file.

The .html file gives me a quick glance at all my data with a single click.

Content shown through the HTML renderer

My next step was cleaning and exploring the data.

The tools and libraries I used in this project are:

  • JupyterLab
  • Pandas
  • NumPy
  • Matplotlib
  • seaborn
  • json

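To read the machine-readable JSON files with a .js extension, I wrote a function:

import json
import pandas as pd

def get_js_as_df(filename):
    with open(filename, 'r', encoding='utf-8') as file:
        raw_data = file.read()
    # Skip the JavaScript assignment ("... = ") in front of the JSON payload.
    json_data = json.loads(raw_data[raw_data.find('=') + 2:])
    return pd.json_normalize(json_data)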

My findings

During this phase, my focus narrowed to the three tweet.js files, which include all 216.8K tweets.
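Combining several tweet files into one list of records can be sketched as follows. This is a minimal sketch using only the standard library, not the exact code from my notebook. It assumes each file starts with a JavaScript assignment such as `window.YTD.tweet.part0 = ` followed by a JSON array, and that each record wraps its fields in a `tweet` key. The helper names `parse_ytd_js` and `combine_parts` and the shortened sample strings are made up for illustration.

```python
import json


def parse_ytd_js(text):
    """Strip the JavaScript assignment prefix and parse the JSON payload."""
    # Everything after the first '=' is assumed to be plain JSON.
    _, _, payload = text.partition("=")
    return json.loads(payload)


def combine_parts(texts):
    """Concatenate the records parsed from several archive parts."""
    records = []
    for text in texts:
        records.extend(parse_ytd_js(text))
    return records


# Hypothetical, heavily shortened stand-ins for the three tweet files.
part0 = 'window.YTD.tweet.part0 = [{"tweet": {"id": "1", "lang": "zh"}}]'
part1 = 'window.YTD.tweet.part1 = [{"tweet": {"id": "2", "lang": "ja"}}]'
part2 = 'window.YTD.tweet.part2 = [{"tweet": {"id": "3", "lang": "en"}}]'

tweets = combine_parts([part0, part1, part2])
print(len(tweets))
```

With real files, the strings would come from reading each file's contents. The combined list can then be handed to `pd.json_normalize` to get a single DataFrame.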

First, I reviewed which languages I tweeted in. The result turned out to be quite odd.

Tweet languages detected by Twitter
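A language tally like this one can be sketched in a few lines. This is a rough sketch rather than my original notebook code. It assumes each record stores Twitter's detected language code under `tweet` → `lang`, with `und` standing for an undetermined language, and the sample records are invented.

```python
from collections import Counter

# Hypothetical sample; real records would come from the tweet files.
tweets = [
    {"tweet": {"lang": "zh"}},
    {"tweet": {"lang": "zh"}},
    {"tweet": {"lang": "zh"}},
    {"tweet": {"lang": "ja"}},
    {"tweet": {"lang": "ja"}},
    {"tweet": {"lang": "en"}},
    {"tweet": {}},  # no language field; counted as undetermined here
]

# Count tweets per language code, falling back to "und" when it is missing.
lang_counts = Counter(record["tweet"].get("lang", "und") for record in tweets)

for lang, count in lang_counts.most_common():
    print(lang, count)
```

With pandas, the equivalent would be a `value_counts()` call on the language column of the normalized DataFrame.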

Undoubtedly, Chinese (China) is dominant, and it’s reasonable that English is in the top three. But it’s confusing that Japanese is my second most-tweeted language. Why? I’m neither a Japanese speaker nor a manga reader. How could I have created over 20,000 tweets in Japanese? The explanation sounds ridiculous but makes sense.

The majority of these Japanese tweets were created between 2010 and 2011, when I set the system language across my devices to Chinese (Traditional) and I intensively used replacing . Mixing two similar languages confused Twitter and also led to an extra 1,500+ tweets in an undetermined language.

The language mixing explains my biggest confusion, but it still can’t answer another question: how did I come to tweet in Welsh, Hindi, and so on? It’s insane! Here we have to address another reality of the time when these tweets were created. Before I settled down in Sweden, I relied heavily on a VPN. I believe these tweets reflect my struggle with internet accessibility.

Second, I wanted a glance at how many tweets I sent in a given period. The following three graphs show the distribution of all my tweets by year, by day, and by month.

Tweet year distribution
Tweet day distribution
Tweet distribution in month
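The counts behind these three distributions can be sketched by bucketing each tweet's timestamp at a different granularity. This is a minimal sketch, not my original plotting code. It assumes timestamps look like `Tue Mar 15 08:00:00 +0000 2011` (my reading of the `created_at` field), and both the helper `bucket_counts` and the sample timestamps are invented.

```python
from collections import Counter
from datetime import datetime

# Assumed layout of the created_at field, e.g. "Tue Mar 15 08:00:00 +0000 2011".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def bucket_counts(timestamps, bucket_format):
    """Count tweets per time bucket; bucket_format is a strftime pattern."""
    counts = Counter()
    for raw in timestamps:
        moment = datetime.strptime(raw, CREATED_AT_FORMAT)
        counts[moment.strftime(bucket_format)] += 1
    return counts


# Hypothetical timestamps for illustration.
timestamps = [
    "Tue Mar 15 08:00:00 +0000 2011",
    "Tue Mar 15 21:30:00 +0000 2011",
    "Mon Jul 04 12:00:00 +0000 2011",
    "Sat Jun 01 09:15:00 +0000 2019",
]

by_year = bucket_counts(timestamps, "%Y")
by_month = bucket_counts(timestamps, "%Y-%m")
by_day = bucket_counts(timestamps, "%Y-%m-%d")

print(by_year)
print(by_month)
print(by_day)
```

The same function serves all three charts; only the bucket pattern changes, which is what makes the granularity easy to experiment with.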

During my preparation for the My Data workshop, I learned that time granularity has a meaningful impact on storytelling. The year distribution shows my activity level over the last decade, but with such a wide range, I have a hard time finding a red thread to tell a story. The day distribution looks intense but is still hard to relate to. There are so many dots, and the story would become too messy if I included every piece.

The month distribution graph makes the storytelling easier. The contrast between the darkest blocks and the blanks raises questions. What happened? That curiosity gave me space to tell the story. Twitter has been my safe house since the first day I used it. Back in 2011, when I learned that my father had been diagnosed with cancer in March and lost him in the summer, I tweeted thousands of babbling messages to deal with my anger and my grief, while my followers and other random Twitter users tolerated and even encouraged this kind of nonsense. They followed my Swedish learning journey and witnessed my battle against racism in the language school. They offered me work opportunities as a partner, tech writer, and book editor. These people, not Twitter the app, gave me the feeling of belonging that I miss a lot. In 2019, after a long absence spent dealing with another grief alone, they were still there, welcoming and supportive, without questions.

Rewinding my tweet history is like time travelling, and sharing this journey is like emotional therapy. Before I started this small project, I never expected I could learn so much.

You might think this is the end. Wait, I want to share a bit more.

Remember, I mentioned earlier that I settled down in Sweden, which happened in 2016. Did the firewall-free environment boost my tweeting level? Here are some numbers.

On one of those darkest days, when I sent 561 tweets in 24 hours, I acted like a robot. I can’t believe that once upon a time I was allowed to do this. A decade of using Twitter has taught me the fundamental code of conduct for behaving well on the Internet.
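Numbers of this kind can be sketched as the average tweets per active day before and after a cutoff year, plus the busiest calendar day. This is an illustrative sketch with invented dates, not my original calculation. It counts per calendar day, which only approximates a rolling 24-hour window, and the helper `mean_per_active_day` is hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical tweet dates; real ones would be parsed from the archive.
tweet_dates = [
    date(2011, 3, 15), date(2011, 3, 15), date(2011, 3, 15),
    date(2015, 6, 1),
    date(2017, 2, 10), date(2017, 2, 10),
    date(2019, 9, 5),
]

# Tweets per calendar day, and the single busiest day.
daily_counts = Counter(tweet_dates)
busiest_day, busiest_count = daily_counts.most_common(1)[0]


def mean_per_active_day(counts, keep):
    """Average tweets per day over the days selected by keep(day)."""
    selected = [n for day, n in counts.items() if keep(day)]
    return sum(selected) / len(selected) if selected else 0.0


CUTOFF_YEAR = 2016  # the year I settled down in Sweden
before = mean_per_active_day(daily_counts, lambda d: d.year < CUTOFF_YEAR)
after = mean_per_active_day(daily_counts, lambda d: d.year >= CUTOFF_YEAR)

print(busiest_day, busiest_count)
print(before, after)
```

Averaging over active days only ignores the quiet stretches. Dividing by all calendar days in each period would be an equally reasonable choice and would tell a slightly different story.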

I still tweet actively. But no robotic behaviour any more. I know better ways.
