Analyzing cloud data lab 3

In my text analysis, I hope to see tweets covering President Trump. I expect a lot of recent tweets about his first address to Congress.

My corpus contains 396,197 total words and 42,955 unique words. Interestingly, the most common term is “02”, which appears 19,265 times; “alternativefacts” appears only 18,739 times. Terms that stand out, and that I expected to see, include “realdonaldtrump”, “trump”, “fake news”, “theresistance” and “nobannowall”. Some of the odd but common ones are “02”, “_u”, “u_” and “killerbee805”. I don’t see how any of these are useful in tweets about alternative facts or Trump’s administration.

02

rt

alternativefacts

https

t.co
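The totals and top terms above came out of a text-analysis tool, but the same kind of count can be sketched in a few lines of Python. This is only a rough illustration, not the tool's actual method: the `summarize` function, the simple tokenizer, and the two sample tweets are all made up for the example. Note that this tokenizer would split “t.co” into “t” and “co”, so its counts would not match the lab's numbers exactly.

```python
import re
from collections import Counter


def summarize(tweets):
    """Return (total_tokens, unique_tokens, counts) for a list of tweet strings."""
    tokens = []
    for tweet in tweets:
        # Assumption: a token is any run of lowercase letters, digits, or underscores.
        tokens.extend(re.findall(r"[a-z0-9_]+", tweet.lower()))
    counts = Counter(tokens)
    return len(tokens), len(counts), counts


# Hypothetical sample tweets, not taken from the lab's data set.
sample = [
    "RT @user: #alternativefacts https://t.co/abc",
    "#alternativefacts are not facts",
]

total, unique, counts = summarize(sample)
top_five = counts.most_common(5)  # the five most frequent tokens with their counts
```

Run on a real export of tweets, `total` and `unique` would correspond to the "total words" and "unique words" figures, and `top_five` to the list of most common terms.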

I included 95 words

It reveals a lot about the controversy surrounding #alternativefacts. I was expecting to see a lot about Trump’s decisions and actions. Some of the key words are “nobannowall”, “potus”, “noraids”, “resist”, “believe”, and “seanspicer”. These make sense, as they all have to do with Trump’s administration, and they confirm what I thought most of my data was going to be about. Five of my most common terms are “nobannowall”, “potus”, “noraids”, “resist”, and “seanspicer”. None of these were surprising, since they are common topics in conversations about Trump. Also, Sean Spicer tweets a lot about politics, so it makes sense that he would be a top word. Another interesting one was “impeachtrump”, which appeared 186 times.
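Removing unhelpful terms before looking at the top words can be sketched like this in Python. The `top_terms` function and the stop list are hypothetical and only for illustration; the three counts are the ones reported in this post, and the lab's real stop list is not reproduced here.

```python
from collections import Counter


def top_terms(counts, stop_words, n=5):
    """Return the n most common terms after dropping anything in stop_words."""
    kept = Counter({word: c for word, c in counts.items() if word not in stop_words})
    return kept.most_common(n)


# Counts reported in this post; every other term is left out of the example.
counts = Counter({"02": 19265, "alternativefacts": 18739, "impeachtrump": 186})

# Hypothetical stop list: terms that say nothing about the topic.
stop_words = {"02", "rt", "https", "t.co"}

filtered = top_terms(counts, stop_words)
```

With “02” on the stop list, it drops out of the results and the remaining terms move up, which is the same effect as deleting words from a word cloud.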

Screen Shot 2017-03-01 at 9.27.59 PM

 

I chose “nobannowall”, “impeachtrump”, and “resist” because I think they will produce the most interesting and controversial data. I am most interested in “impeachtrump” because I want to see what everyone is saying about it and why, since there are currently no grounds for impeachment.

Screen Shot 2017-03-01 at 9.37.19 PM

http://voyant-tools.org/?corpus=c1a41e49aa06b78c7eba9e39aa0b4df7&stopList=keywords-702dc271899e3005adb694ee462c1f0b&panels=cirrus,reader,trends,summary,contexts

I chose an article from The Hill titled “The House should start impeachment against Trump now” because I think the controversy surrounding him is intriguing. I would also like to read valid reasons for arguing for impeachment, not just angry liberals ranting on social media.

 

Screen Shot 2017-03-01 at 9.56.00 PM

 

Everything relates to the American aspect because almost all of the data concerns Trump, and he is the President of the United States. Most of the new key words are about things Trump is either trying to change or already changing, including healthcare, energy, and business. I think this brings a more American aspect to the data because it deals with American policies that are being affected and changed in one way or another. Although most of the data from Twitter covers similar things, this is a more formal way of viewing it.

I think that Tufte’s approach is best. He says that superior methods are more precise, and thus more credible. I think the last part of the lab demonstrates this best. The data scraped from Twitter was not as serious or credible as the data scraped from The Hill article. This was easy to see when comparing the two word clouds. The one from Twitter was filled with bizarre usernames and irrelevant words, while the word cloud from the article was filled with words connected directly to Trump and his administration. It shows that more data is not always better data. Yau seems to believe that the more data the better, but this lab showed that superior methods of collecting data give a better depiction of the actual data.

 

2 thoughts on “Analyzing cloud data lab 3”

  1. I understand why most of the words you deleted were there, yet “02” is still interesting. I wonder if it was just a piece of code or if there was any significance to it. In any event, this post definitely conveys a message, and I enjoyed reading your opinion on the volume of tweets you saw being posted in attempts to bash the new administration. There will typically be disagreement between the two parties, but in this case it is more extreme considering his unusual approach to policy. There is definitely merit to saying that many who are simply outraged provide nothing but leftist jargon to feel they are being productive. There are values on both sides that I can understand, yet it seems, just like all else, they are diminished by those who don’t fully understand their own views. This is a sure connection to Crawford’s critical questions of data, and I thought it was wise to include that last point.

  2. I appreciate your reflection on the data scraped from Twitter in comparison to the data from your article. I definitely would suggest keeping this in mind as you continue to track your hashtag, since it could help you refine your thinking about your own results. I think considering the unique nature of Twitter as opposed to other word-based platforms is important, and I will keep that in mind for my own research.
