The Social Network Innovation Lab has been exploring the differing pace of life in American cities. The field of pace of life studies offers a variety of metrics for measuring the speed or pace of life in populated areas, and these measures can be positively correlated with the population of each area. We have been exploring whether usage of the micro-blogging service Twitter shares characteristics with established pace of life metrics, and thus may qualify as a pace of life metric in its own right.
This type of research requires a large volume of data about Twitter use in urban areas. For our research, we developed methods to collect geo-located tweets (tweets that contain information about the location from which they were sent) from 50 of the most populous U.S. cities according to U.S. Census statistics[i]. Geo-locating one’s tweets is an opt-in feature on Twitter, and this biases our data; nevertheless, we posit that our data is roughly proportional to U.S. urban Twitter activity. We began collecting tweets in mid-January 2012. Our collection streams capture on average 450,000 tweets a day, and to date we have collected over 120,000,000 tweets.
As a first approach to considering how this data might reflect the pace of life within cities, we examined all tweets collected during April 2012. From this data, we counted how many tweets were collected from each city and then compared these counts to the latest U.S. Census data for each city. This data was then plotted on a log-log chart (seen above). This process allows us to begin to see how the Twitter data we have been collecting corresponds with city populations.
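The comparison described above can be sketched as a straight-line fit in log-log space. This is only a minimal illustration under stated assumptions, not the Lab's code: the function name `fit_log_log` is hypothetical, an ordinary least-squares fit is assumed, and the sample numbers are made up rather than taken from the collected data.

```python
import math

def fit_log_log(populations, tweet_counts):
    """Least-squares fit of log10(tweets) = slope * log10(population) + intercept.

    Hypothetical helper for illustration; the source does not say how
    (or whether) a trend line was fitted.
    """
    xs = [math.log10(p) for p in populations]
    ys = [math.log10(t) for t in tweet_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def log_residual(population, tweet_count, slope, intercept):
    """Positive when a city tweets more than the fitted trend predicts."""
    predicted = slope * math.log10(population) + intercept
    return math.log10(tweet_count) - predicted

# Made-up values chosen only to exercise the code, NOT the Lab's data.
populations = [100_000, 1_000_000, 10_000_000]
tweet_counts = [1_000, 10_000, 100_000]
slope, intercept = fit_log_log(populations, tweet_counts)
```

A city whose `log_residual` is positive sits above the trend line on the chart, which is one way to make "more tweets than expected for its population" concrete.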
As would be expected, larger cities generally return a higher volume of tweets, with a clear linear trend on the log-log chart. However, as the graph above illustrates, this correlation has a fairly high variance. What is interesting to notice is the range in tweets collected from different cities with similar populations. For example, while Atlanta, Miami, Cleveland, New Orleans, Minneapolis, and Raleigh have similarly sized populations, their volumes of Twitter activity differ markedly. This suggests that there may be another population demographic or city attribute that correlates more closely with the volume of tweets associated with each city.
One general observation is that major media “hub cities” and cities with large metropolitan areas, such as Los Angeles, New York, Chicago, and Atlanta, tend to exceed the general trend of the plot. These are cities in which the media and entertainment industries have a stronger presence in people’s everyday lives than in other cities. It could be hypothesized that, because of the strong media focus in these larger metro cities, Twitter may be more culturally relevant and popular among individuals living there.
Another possibility is the effect that ethnic and racial diversity might have on Twitter usage in certain cities. The graph above may suggest that some of the most diverse cities in the US also tend to have the highest volume of Twitter usage when compared to similarly sized cities with less diversity. For example, the graph above illustrates Atlanta’s noteworthy tweet frequency for its population. African-Americans have a “rich” history in Atlanta and in the 1990s, the city “gained the largest number of African-Americans of any US metropolitan area” (McConnell 2011). Twitter literature suggests that African-Americans tend to tweet more than whites (Smith 2011; Smith and Brenner 2012). Our initial results suggest the possibility that cities with large percentages of African-American residents also tend to be the ones with high Twitter volume. The caveat, of course, is that many confounding variables have not been examined at this early stage, and this hypothesis is very preliminary.
In addition to the potential influence of racial diversity, age (and, relatedly, market penetration) might affect the Twitter volume seen in different cities. Twitter, like many other internet-based social media platforms, tends to have a youth-skewed membership. This would suggest that cities with “younger” populations may also have higher Twitter activity as a whole. Younger cities[ii] such as San Francisco, Boston, and New York appeared to have more tweets than most older cities with similar population sizes.
Though these are interesting preliminary findings, it is important to note that these are aggregate numbers covering entire cities and the tweets they generate. It is not surprising that trends within these cities would be highly correlated with significant demographic trends and city attributes when taken in aggregate. Our next steps will be to study how cities may affect the behavior of individuals (particularly through exploring metrics like speed and rate) as opposed to aggregate counts and volumes. We are currently exploring this data to determine the distribution of tweet rates by individual users, to see whether the average rate among users in each city shares a statistically significant relationship with city population.
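As a rough sketch of the per-user analysis described above, the snippet below groups tweets by city and user and converts each user's count into a daily rate. The input format (one `(city, user_id)` pair per tweet), the function name `user_rates_by_city`, and the sample records are all assumptions made for illustration; the Lab's actual data structures are not described in the text.

```python
from collections import Counter, defaultdict
from statistics import mean

def user_rates_by_city(tweets, days):
    """Return {city: sorted list of tweets-per-day rates, one per user}.

    `tweets` is an iterable of (city, user_id) pairs, one pair per tweet,
    collected over `days` days (hypothetical input format).
    """
    counts = defaultdict(Counter)
    for city, user_id in tweets:
        counts[city][user_id] += 1
    return {
        city: sorted(n / days for n in per_user.values())
        for city, per_user in counts.items()
    }

# Made-up records, used only to exercise the code.
sample = [("A", "u1"), ("A", "u1"), ("A", "u2"), ("B", "u3")]
rates = user_rates_by_city(sample, days=2)
average_rate = {city: mean(r) for city, r in rates.items()}
```

Keeping the full list of per-user rates, rather than only the mean, preserves the distribution within each city, which is what the next stage of the analysis is meant to examine.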
Lab members involved in this project: Alexander Pensavalle, Alexander Gross, and Dhiraj Murthy.
References:
McConnell, E. D. 2011 ‘An “incredible number of Latinos and Asians:” Media representations of racial and ethnic population change in Atlanta, Georgia’, Latino Studies 9(2-3): 177-197.
Smith, A. 2011 ‘Technology Use by People of Color: Overview of Pew Internet Project Research’ Pew Internet and American Life Project, Washington DC: Pew foundation.
Smith, A. and Brenner, J. 2012 ‘Twitter Use 2012’ Pew Internet and American Life Project, Washington DC: Pew foundation.
The 2012 Presidential Election has been widely discussed on Twitter. The candidates, news media, citizen journalists, and voters have been participating in the discourse. Speeches, other major events, and prominent gaffes have yielded interesting patterns. We have developed a tool to visualize Twitter buzz around candidates from the 50 most populous American cities. Please visit the Election 2012 Twitter Visualization Tool, here. The visualization tool has been optimized for the Safari browser (and is known to have some issues in other browsers). This tool was conceptualized and developed by Professor Dhiraj Murthy, lab programmer Alexander Gross, and undergraduate student researcher Stephanie Bond. This research builds upon our previous work on Twitter and Health and our GOP Primary Visualization Tool.
The goal of our research is to explore urban American responses to the 2012 presidential candidates on Twitter. In order to create a representative sample of tweets from urban centers in the United States, we collected tweets from Twitter by location. We took the 50 most populous American cities according to the U.S. Census and instructed Twitter to send us tweets that were within 7-12km of the locations of these cities.
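To make the radius criterion concrete, the sketch below checks whether a tweet's coordinates fall within a given distance of a city's reference point using the haversine great-circle formula. The text says the filtering was requested from Twitter itself, so this client-side check is an assumption for illustration only; the function names and the example coordinates (an approximate point in New York City) are likewise hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def within_city(tweet_lat, tweet_lon, city_lat, city_lon, radius_km):
    """True if the tweet lies within radius_km of the city's reference point."""
    return haversine_km(tweet_lat, tweet_lon, city_lat, city_lon) <= radius_km

# Approximate reference point for New York City (illustrative assumption).
NYC = (40.7128, -74.0060)
near = within_city(40.75, -74.0, NYC[0], NYC[1], radius_km=10)
far = within_city(41.0, -74.0060, NYC[0], NYC[1], radius_km=10)
```

The radius is passed as a parameter because the text gives a range (7-12km) rather than a single value per city.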
Our software collects these geo-located tweets and uses the data to chart the relative buzz surrounding candidates in the 2012 presidential election. The tool charts the relative popularity of each candidate as measured by the number of tweets which we have collected over the last 24 hours and identified with a particular candidate. For a tweet to be counted as referring to a particular candidate, it must contain either the candidate’s first and last name separated by a space (e.g. “Mitt Romney”) or the candidate’s official campaign Twitter account name (e.g. @mittromney). A single mention as reported by the chart’s dynamic legend is equivalent to one tweet which contains one of the candidate names. Tweets which contain more than one candidate name will be counted as mentions for both candidates. These stringent rules prevent unnecessary over-counting of tweets for a candidate. Although the tweet counts in our visualization are lower because of this, the data collected is very robust: all tweets visualized do refer to Obama or Romney.
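The matching rule described above can be sketched as follows. This is an illustration under assumptions, not the tool's implementation: the rule table `CANDIDATES` and the function name are hypothetical, case-insensitive substring matching is assumed because the text does not specify case handling, and the Obama account name is supplied here as an assumption since the text only gives the Romney example.

```python
# Hypothetical rule table: full name plus campaign account for each candidate.
CANDIDATES = {
    "Mitt Romney": ("Mitt Romney", "@mittromney"),
    "Barack Obama": ("Barack Obama", "@barackobama"),  # handle is an assumption
}

def mentioned_candidates(text):
    """Return the set of candidates a tweet counts as a mention for.

    A candidate matches if the tweet contains the full name (first and
    last name separated by a space) or the account name. Matching is
    case-insensitive, which is an assumption of this sketch.
    """
    lowered = text.lower()
    found = set()
    for candidate, (full_name, handle) in CANDIDATES.items():
        if full_name.lower() in lowered or handle.lower() in lowered:
            found.add(candidate)
    return found
```

Under this rule a tweet containing only a surname is not counted, while a tweet naming both candidates counts once for each, in line with the description above.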
Stay tuned for new feature releases and updates. We are making this Election 2012 Visualization Tool available so others can explore the data we have been collecting as the race is being decided. If you experience any problems or bugs with this tool, please leave a comment at this page and we will endeavor to resolve them quickly.
The Social Network Innovation Lab is pleased to announce the beta launch of its new tool to visualize the activity of users on a popular virtual life science community of practice. Amongst other things, the community includes virtual spaces to promote the creation of life science-oriented virtual organizations. The community (including usernames) has been anonymized.
The tool’s visualization engine uses the Google Earth API to plot and explore the location-based activity of users in this community. BioViz was developed by Professor Dhiraj Murthy, lab programmer Alexander Gross, and undergraduate student research fellows Alex Takata and Macgill Eldredge.
The tool is still under development in the Lab, and is released now in its beta form with some known bugs. The current tool visualizes activity within the virtual community from a topic-specific standpoint, and allows these results to be compared against a broad range of other potentially relevant location-based demographic data.
In order to share the online visualization tool we have made the tool available to the public. BioViz can be found here. If you experience any problems or bugs with this tool please leave a comment at this page and we will endeavor to resolve them quickly.
BioViz is a research output supported by NSF Grant #1025428*
* Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
The 2012 Republican Primary has been widely discussed on Twitter. The candidates, news media, citizen journalists, and voters have been participating in the discourse. Various events, including Primary debates and candidates pulling out of the race, have led to interesting patterns on Twitter. For example, we have seen Ron Paul mentioned at the highest frequency during several debates; polling by media organizations at the time indicated a lagging rather than leading position for Paul. Our research examines urban American responses to Republican Primary candidates on Twitter. We have developed a tool to visualize tweets from the most populous American cities. Please visit the 2012 GOP Primary Twitter Visualization Tool, here. The visualization tool has been optimized for the Safari browser (and is known to have some issues in other browsers). This tool was conceptualized and developed by Professor Dhiraj Murthy, lab programmer Alexander Gross, and undergraduate student researcher Stephanie Bond. This research builds upon our previous work on Twitter and Health.
The goal of our research is to explore urban American responses to Republican primary candidates on Twitter. In order to create a representative sample of tweets from urban centers in the United States, we collected tweets from Twitter by location. We took the 50 most populous American cities according to the U.S. Census and instructed Twitter to send us tweets that were within 7-12km of the locations of these cities.
Our software collects these geo-located tweets and uses the data to chart the relative popularity of candidates in the 2012 GOP Primary Race. The tool charts the relative popularity of each primary candidate as measured by the number of tweets which we have collected over the last 24 hours and identified with a particular candidate. For a tweet to be counted as referring to a particular candidate, it must contain either the candidate’s first and last name separated by a space (e.g. “Herman Cain”) or the candidate’s official campaign Twitter account name (e.g. @THEHermanCain). A single mention as reported by the chart’s dynamic legend is equivalent to one tweet which contains one of the candidate names. Tweets which contain more than one candidate name will be counted as mentions for both candidates. These stringent rules prevent, for example, tweets which include ‘Perry’ but refer to the American pop singer Katy Perry from being counted as mentions of Rick Perry. Although the tweet counts in our visualization are lower because of this, the data collected is very robust: all tweets visualized do refer to the primary candidates.
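One way to picture the 24-hour measure is as a rolling window over timestamped mentions. The sketch below is an assumption-laden illustration rather than the tool's code: the function name, the `(timestamp, candidate)` input format, and the choice to express relative popularity as each candidate's share of mentions in the window are all hypothetical, and the sample mentions are made up.

```python
from collections import Counter
from datetime import datetime, timedelta

def relative_popularity(mentions, now, window=timedelta(hours=24)):
    """Share of mentions per candidate within the window ending at `now`.

    `mentions` is an iterable of (timestamp, candidate) pairs, one pair
    per counted mention (hypothetical input format).
    """
    cutoff = now - window
    counts = Counter(c for ts, c in mentions if cutoff < ts <= now)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {candidate: n / total for candidate, n in counts.items()}

# Made-up mentions, used only to exercise the code.
now = datetime(2012, 1, 10, 12, 0)
sample = [
    (now - timedelta(hours=1), "Ron Paul"),
    (now - timedelta(hours=2), "Ron Paul"),
    (now - timedelta(hours=3), "Herman Cain"),
    (now - timedelta(hours=30), "Rick Perry"),  # outside the 24-hour window
]
shares = relative_popularity(sample, now)
```

Mentions older than the window drop out of the totals, so the chart reflects recent activity rather than the cumulative count.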
The tool is still under development in our Lab. We are making this 2012 GOP Primary Visualization Tool available in its beta form so others can explore the data we have been collecting as the race is being decided. If you experience any problems or bugs with this tool, please leave a comment at this page and we will endeavor to resolve them quickly.
The Social Network Innovation Lab is pleased to announce the beta launch of its new tool to visualize cancer-related tweets on Twitter. This visualization tool represents a next step for the Lab building on its previous work on the Cancer Keyword Visualization Tool. The tool was developed as part of the lab’s current research on Analyzing Health Networks on Twitter and uses the Google Earth API to plot and explore a growing dataset of cancer related tweets.
For the last year the Lab has used a set of software scripts, initially developed by Professor Dhiraj Murthy and student researcher Scott Longwell, to build a dataset of almost 500,000 tweets that contain the following cancer-related health keywords: ‘chemo’, ‘mammogram’, ‘lymphoma’, and ‘melanoma’. The tool allows for the exploration of who is tweeting about cancer and where in the world they are located.
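The keyword selection can be illustrated with a small matcher over the four keywords listed above. This is a sketch under assumptions and not the Lab's scripts: the function name is hypothetical, and case-insensitive whole-word matching is assumed because the text does not say how keywords were matched.

```python
import re

KEYWORDS = ("chemo", "mammogram", "lymphoma", "melanoma")

# Whole-word, case-insensitive matching is an assumption of this sketch.
_PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)

def matched_keywords(text):
    """Return the set of cancer keywords (lowercased) found in a tweet."""
    return {match.lower() for match in _PATTERN.findall(text)}
```

With whole-word matching, a tweet containing only "chemotherapy" would not match "chemo"; a substring rule would be the alternative if such tweets should be included.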
This tool is still being developed in the Lab, and is released now in its beta form with some known bugs. The current tool visualizes tweets from a static collection of tweets captured over the last year with identified locations. In the near future we hope to improve the tool’s accuracy in locating tweets, make it a real-time visualization tool, explore the most recent cancer tweets, and allow users to perform temporal queries on the data.
In order to share the online visualization tool we have made the tool available to the public. The CancerTweets Twitter visualization tool can be found here. If you experience any problems or bugs with this tool please leave a comment on this page and we will endeavor to resolve them quickly.
This project is funded by a three year grant from the National Science Foundation’s Office of CyberInfrastructure (Award #1025428)*. This project explores the question of why Virtual Organization Breeding Environments (VBEs), virtual spaces which are responsible for encouraging the formation of virtual organizations (VOs), often do not use social networking site technology (SNS) and whether this cyberinfrastructure could promote the development of VOs. This research will examine empirically two pioneering instances of the use of SNS in life science VBEs to assess the ways that they are utilized and to understand their potential to foster trust and social cohesion between potential VO team members, crucial prerequisites for VO success.
* Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
This is a pilot project which has attempted to understand and visualize health-related social networks on Twitter. We are currently building a longitudinal data set based on selected cancer networks on Twitter. The aim of this pilot is to test the feasibility of various modes of collecting data regarding cancer networks on Twitter using Social Network Analysis (SNA).
One topic we are currently exploring for this project is who is tweeting about cancer and where in the world they are located. Using a special set of software scripts developed by student researcher Scott Longwell, the Lab has built a dataset of several hundred thousand cancer-related tweets indexed by cancer keywords and location. We are just beginning to look closely at this data to see what it can tell us about how to identify cancer-specific networks and hubs within the Twitter social network.
In order to share some of this data we have created an online visualization tool which allows users to begin to explore some of this data themselves. This tool can be found here.
This project examines Twitter’s influence on social communication and virtual communities. A core aspect of this project is the development of a trust-based algorithm on Twitter which incorporates tweet text, followers, followed, and list activity amongst other data.