This lesson is in the early stages of development (Alpha version)

Harvesting Twitter Data with twarc: Documentation

Key Points

Introduction
  • Twitter is a microblogging platform that allows data collection from its API.

  • Twarc is a Python application and library that allows users to programmatically collect and archive Tweets.

Getting familiar with JupyterLab
  • Navigating Python in a JupyterLab environment

  • Configuring an application to work with an API

  • Arranging a directory structure and loading libraries

Anatomy of a tweet: structure of a tweet as JSONL
  • Tweets arrive as JSON Lines (JSONL), a very common format.

  • We can use online viewers to get a human-readable look at JSONL.

  • Tweets come with a great deal of associated data.
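As a minimal sketch of the JSONL format, each line of the file is one complete JSON document. The sample below is a simplified, hypothetical stand-in for a real tweet payload, which carries many more fields (entities, metrics, user data, and so on):

```python
import json

# Each line of a JSONL file is a complete JSON document.
# These two lines are simplified stand-ins for real tweet payloads.
jsonl_data = """\
{"id": "1", "text": "Hello twarc!", "author_id": "100"}
{"id": "2", "text": "JSONL is one JSON object per line", "author_id": "101"}
"""

# Parse the archive line by line into a list of dictionaries.
tweets = [json.loads(line) for line in jsonl_data.splitlines()]
for tweet in tweets:
    print(tweet["id"], tweet["text"])
```

Because every line stands alone, JSONL archives can be processed one tweet at a time without loading the whole file into memory.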

The Twitter Public API
  • There are many online sources of Twitter data

  • Utilities and plugins come with twarc to help us out

  • A consistent ‘harvest > convert > examine’ pipeline will help us work with our data
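The ‘convert’ step of that pipeline can be sketched in plain Python: after harvesting tweets to JSONL (for example with twarc2), pull a few fields into CSV for examination. The field names below are a simplified assumption, not the full tweet schema:

```python
import csv
import io
import json

# Hypothetical harvested JSONL, reduced to three fields for illustration.
harvested = """\
{"id": "1", "text": "first tweet", "created_at": "2022-01-01T00:00:00Z"}
{"id": "2", "text": "second tweet", "created_at": "2022-01-02T00:00:00Z"}
"""

# Convert: write selected fields out as CSV, ready to examine in a
# spreadsheet or with pandas.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["id", "created_at", "text"])
writer.writeheader()
for line in harvested.splitlines():
    tweet = json.loads(line)
    writer.writerow({k: tweet[k] for k in ["id", "created_at", "text"]})

print(out.getvalue())
```

twarc also ships a csv plugin that does this conversion for you; the sketch above just shows what the step involves.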

Ethics and Twitter
Search and Stream
  • Search: collect pre-existing tweets that satisfy parameters

  • Stream: collect Tweets that satisfy parameters, as they are posted

Plugins and Searches
  • twarc2 has plug-ins that need separate installation

  • The network plug-in shows us how tweeters are related to each other

TextBlob Sentiment Analysis
  • TextBlob provides a range of sentiment-analysis functions (e.g. polarity and subjectivity scores) that we should learn

Data Management
  • Dehydrating your tweets (reducing them to a list of tweet IDs) makes datasets easier to share and re-use
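Dehydration reduces a JSONL archive to bare tweet IDs, which can be shared publicly; others can later rehydrate the IDs back into full tweets through the API. A minimal sketch of the idea, using hypothetical sample data (twarc itself provides a `twarc2 dehydrate` command for this):

```python
import json

# A tiny hypothetical tweet archive in JSONL form.
archive = """\
{"id": "1357", "text": "some tweet"}
{"id": "2468", "text": "another tweet"}
"""

# Dehydrate: keep only the tweet IDs, one per line.
ids = [json.loads(line)["id"] for line in archive.splitlines()]
print("\n".join(ids))
```

Sharing only IDs keeps datasets small and respects users' ability to delete their tweets, since deleted tweets simply fail to rehydrate.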

Don't Map Twitter
  • Determining where a tweet was posted is imprecise at best.

  • At best, it’s a proxy for ‘aboutness’

  • Proceed with Caution and Respect Humans’ Privacy

Documentation

Check out Documenting the Now’s extensive twarc documentation. The UCSB Library has also created a guide to using twarc with the v.1 API.

Glossary