twarc2

Collect data from the Twitter V2 API.

Usage:

twarc2 [OPTIONS] COMMAND [ARGS]...

Options:

  --consumer-key TEXT         Twitter app consumer key (aka "App Key")
  --consumer-secret TEXT      Twitter app consumer secret (aka "App Secret")
  --access-token TEXT         Twitter app access token for user
                              authentication.
  --access-token-secret TEXT  Twitter app access token secret for user
                              authentication.
  --bearer-token TEXT         Twitter app access bearer token.
  --app-auth / --user-auth    Use application authentication or user
                              authentication. Some rate limits are higher with
                              user authentication, but not all endpoints are
                              supported.  [default: app-auth]
  -l, --log TEXT
  --verbose
  --metadata / --no-metadata  Include/don't include metadata about when and
                              how data was collected.  [default: metadata]
  --config FILE               Read configuration from FILE.
  --help                      Show this message and exit.

compliance-job

Create, retrieve and list batch compliance jobs for Tweets and Users.

Usage:

twarc2 compliance-job [OPTIONS] COMMAND [ARGS]...

Options:

  --help  Show this message and exit.

create

Create a new compliance job and upload tweet or user IDs.

Usage:

twarc2 compliance-job create [OPTIONS] {tweets|users} INFILE [OUTFILE]

Options:

  --job-name TEXT     A name or tag to help identify the job.
  --wait / --no-wait  Wait for the job to finish and download the results.
                      Wait by default.
  --hide-progress     Hide the Progress bar. Default: show progress.
  --help              Show this message and exit.

download

Download the compliance job with the specified ID.

Usage:

twarc2 compliance-job download [OPTIONS] JOB [OUTFILE]

Options:

  --wait / --no-wait  Wait for the job to finish and download the results.
                      Wait by default.
  --hide-progress     Hide the Progress bar. Default: show progress.
  --help              Show this message and exit.

get

Returns status and download information about the job ID.

Usage:

twarc2 compliance-job get [OPTIONS] JOB

Options:

  --verbose      Show all URLs and metadata.
  --json-output  Return the raw json content from the API.
  --help         Show this message and exit.

list

Returns a list of compliance jobs by job type and status.

Usage:

twarc2 compliance-job list [OPTIONS] [[tweets|users]]

Options:

  --status [created|in_progress|complete|failed]
                                  Filter by job status. Only one of 'created',
                                  'in_progress', 'complete', 'failed' can be
                                  specified. If not set, returns all.
  --verbose                       Show all URLs and metadata.
  --json-output                   Return the raw json content from the API.
  --help                          Show this message and exit.

configure

Set up your Twitter app keys.

Usage:

twarc2 configure [OPTIONS]

Options:

  --help  Show this message and exit.

conversation

Retrieve a conversation thread using the tweet id.

Usage:

twarc2 conversation [OPTIONS] TWEET_ID [OUTFILE]

Options:

  --since-id INTEGER              Match tweets sent after tweet id
  --until-id INTEGER              Match tweets sent prior to tweet id
  --start-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Match tweets created after UTC time (ISO
                                  8601/RFC 3339), e.g.  2021-01-01T12:31:04
  --end-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Match tweets sent before UTC time (ISO
                                  8601/RFC 3339)
  --archive                       Search the full archive (requires Academic
                                  Research track)
  --limit INTEGER                 Maximum number of tweets to save
  --max-results INTEGER           Maximum number of tweets per API response
  --hide-progress                 Hide the Progress bar. Default: show
                                  progress, unless using pipes.
  --help                          Show this message and exit.
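
The --start-time and --end-time options accept two timestamp formats, both read as UTC. A minimal Python sketch of what those format strings accept is shown below. The helper name parse_utc_time is hypothetical and is not part of twarc; it only illustrates the two patterns listed in the help text.

```python
from datetime import datetime, timezone

# The two patterns listed in the help text for --start-time / --end-time.
ACCEPTED_FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%S")


def parse_utc_time(value: str) -> datetime:
    """Hypothetical helper: parse a timestamp in one of the accepted formats as UTC."""
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # try the next accepted format
    raise ValueError(f"{value!r} matches none of {ACCEPTED_FORMATS}")


print(parse_utc_time("2021-01-01T12:31:04").isoformat())  # 2021-01-01T12:31:04+00:00
print(parse_utc_time("2021-01-01").isoformat())           # 2021-01-01T00:00:00+00:00
```

A date-only value is taken here as midnight UTC at the start of that day, which is how strptime fills in the missing time fields.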

conversations

Fetch the full conversation threads that the input tweets are a part of. Alternatively, the input can be a line-oriented file of conversation ids.

Usage:

twarc2 conversations [OPTIONS] [INFILE] [OUTFILE]

Options:

  --limit INTEGER               Maximum number of tweets to return
  --conversation-limit INTEGER  Maximum number of tweets to return per-
                                conversation
  --archive                     Use the Academic Research project track access
                                to the full archive
  --hide-progress               Hide the Progress bar. Default: show progress,
                                unless using pipes.
  --help                        Show this message and exit.

counts

Return counts of tweets matching a query.

Usage:

twarc2 counts [OPTIONS] QUERY [OUTFILE]

Options:

  --since-id INTEGER              Count tweets sent after tweet id
  --until-id INTEGER              Count tweets sent prior to tweet id
  --start-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Count tweets created after UTC time (ISO
                                  8601/RFC 3339), e.g.  2021-01-01T12:31:04
  --end-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Count tweets sent before UTC time (ISO
                                  8601/RFC 3339)
  --archive                       Count using the full archive (requires
                                  Academic Research track)
  --granularity [day|hour|minute]
                                  Aggregation level for counts. Can be one of:
                                  day, hour, minute. Default is hour.
  --limit INTEGER                 Maximum number of days of results to save
                                  (minimum is 30 days)
  --text                          Output the counts as human readable text
  --csv                           Output counts as CSV
  --hide-progress                 Hide the Progress bar. Default: show
                                  progress, unless using pipes.
  --help                          Show this message and exit.
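
To illustrate what the --granularity aggregation levels mean, the sketch below buckets a handful of made-up timestamps by day, hour, or minute. The counts command gets its numbers from the Twitter API rather than computing them locally, so this is only an illustration of the bucketing idea; the function name and sample data are assumptions.

```python
from collections import Counter
from datetime import datetime

# strftime patterns that truncate a timestamp to each granularity.
BUCKET_FORMATS = {
    "day": "%Y-%m-%d",
    "hour": "%Y-%m-%dT%H",
    "minute": "%Y-%m-%dT%H:%M",
}


def bucket_counts(timestamps, granularity="hour"):
    """Hypothetical helper: count timestamps per day, hour, or minute bucket."""
    fmt = BUCKET_FORMATS[granularity]
    return Counter(ts.strftime(fmt) for ts in timestamps)


# Made-up sample data, for illustration only.
sample = [
    datetime(2021, 1, 1, 12, 5),
    datetime(2021, 1, 1, 12, 40),
    datetime(2021, 1, 1, 13, 0),
    datetime(2021, 1, 2, 9, 15),
]

for level in ("day", "hour", "minute"):
    print(level, dict(bucket_counts(sample, level)))
```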

dehydrate

Extract tweet or user IDs from a dataset.

Usage:

twarc2 dehydrate [OPTIONS] [INFILE] [OUTFILE]

Options:

  --id-type [tweets|users]  IDs to extract - either 'tweets' or 'users'.
  --hide-progress           Hide the Progress bar. Default: show progress,
                            unless using pipes.
  --help                    Show this message and exit.
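
As a rough illustration of dehydration, the sketch below pulls tweet IDs out of JSONL input. It assumes each line is either a single tweet object with a top-level "id" or an API response whose "data" field holds one tweet or a list of tweets. The function name is hypothetical, and twarc's own implementation may handle more cases, including user IDs.

```python
import json


def extract_tweet_ids(lines):
    """Hypothetical helper: yield tweet IDs from JSONL lines (assumed layouts only)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: a response wraps tweets in "data"; a bare tweet has no "data" key.
        data = record.get("data", record)
        tweets = data if isinstance(data, list) else [data]
        for tweet in tweets:
            yield tweet["id"]


# Made-up sample input, for illustration only.
sample_lines = [
    '{"data": [{"id": "1", "text": "a"}, {"id": "2", "text": "b"}]}',
    '{"id": "3", "text": "c"}',
]
print(list(extract_tweet_ids(sample_lines)))  # ['1', '2', '3']
```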

flatten

"Flatten" tweets, or move expansions inline with tweet objects and ensure that each line of output is a single tweet.

Usage:

twarc2 flatten [OPTIONS] [INFILE] [OUTFILE]

Options:

  --hide-progress  Hide the Progress bar. Default: show progress, unless using
                   pipes.
  --help           Show this message and exit.
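
The idea of moving expansions inline can be sketched in Python as follows. The example assumes a Twitter API V2 style response with tweets under "data" and expanded user objects under "includes.users", and it inlines only the author. The function name and the "author" key are assumptions for illustration; the real command covers more expansion types.

```python
import json


def flatten_response(response):
    """Hypothetical helper: return one dict per tweet with its author inlined."""
    users = {u["id"]: u for u in response.get("includes", {}).get("users", [])}
    flattened = []
    for tweet in response.get("data", []):
        tweet = dict(tweet)  # copy so the input response is left unchanged
        author = users.get(tweet.get("author_id"))
        if author is not None:
            tweet["author"] = author  # assumed key name
        flattened.append(tweet)
    return flattened


# Made-up sample response, for illustration only.
sample_response = {
    "data": [
        {"id": "1", "author_id": "10", "text": "hello"},
        {"id": "2", "author_id": "11", "text": "world"},
    ],
    "includes": {"users": [{"id": "10", "username": "alice"}]},
}

# One tweet per output line, as in line-oriented JSON.
for tweet in flatten_response(sample_response):
    print(json.dumps(tweet))
```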

followers

Get the followers for a given user.

Usage:

twarc2 followers [OPTIONS] USER [OUTFILE]

Options:

  --limit INTEGER  Maximum number of followers to save. Increments of 1000.
  --hide-progress  Hide the Progress bar. Default: show progress
  --help           Show this message and exit.

following

Get the users that a given user is following.

Usage:

twarc2 following [OPTIONS] USER [OUTFILE]

Options:

  --limit INTEGER  Maximum number of friends to save. Increments of 1000.
  --hide-progress  Hide the Progress bar. Default: show progress
  --help           Show this message and exit.

hydrate

Hydrate tweet ids: fetch the full tweet data for each tweet id in the input.

Usage:

twarc2 hydrate [OPTIONS] [INFILE] [OUTFILE]

Options:

  --hide-progress  Hide the Progress bar. Default: show progress, unless using
                   pipes.
  --help           Show this message and exit.

mentions

Retrieve up to 800 of the most recent tweets mentioning the given user.

Usage:

twarc2 mentions [OPTIONS] USER_ID [OUTFILE]

Options:

  --since-id INTEGER              Match tweets sent after tweet id
  --until-id INTEGER              Match tweets sent prior to tweet id
  --start-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Match tweets created after time (ISO
                                  8601/RFC 3339), e.g.  2021-01-01T12:31:04
  --end-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Match tweets sent before time (ISO 8601/RFC
                                  3339)
  --hide-progress                 Hide the Progress bar. Default: show
                                  progress
  --help                          Show this message and exit.

sample

Fetch tweets from the sample stream.

Usage:

twarc2 sample [OPTIONS] [OUTFILE]

Options:

  --limit INTEGER  Maximum number of tweets to save
  --help           Show this message and exit.

search

Search for tweets.

Usage:

twarc2 search [OPTIONS] QUERY [OUTFILE]

Options:

  --since-id INTEGER              Match tweets sent after tweet id
  --until-id INTEGER              Match tweets sent prior to tweet id
  --start-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Match tweets created after UTC time (ISO
                                  8601/RFC 3339), e.g.  2021-01-01T12:31:04
  --end-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Match tweets sent before UTC time (ISO
                                  8601/RFC 3339)
  --archive                       Search the full archive (requires Academic
                                  Research track). Defaults to searching the
                                  entire twitter archive if --start-time is
                                  not specified.
  --limit INTEGER                 Maximum number of tweets to save
  --max-results INTEGER           Maximum number of tweets per API response
  --hide-progress                 Hide the Progress bar. Default: show
                                  progress, unless using pipes.
  --help                          Show this message and exit.

searches

Execute each search in the input file, one at a time.

The infile must be a file containing one query per line. Each line will be passed through directly to the Twitter API: unlike the timelines command, quotes will not be removed.

Input queries will be deduplicated - if the same literal query is present in the file, it will still only be run once.

It is recommended that this command first be run with --counts-only, to check that each of the queries is retrieving the volume of tweets expected, and to avoid consuming quota unnecessarily.
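
The deduplication of literal queries described above can be sketched as an order-preserving exact-match filter. The function name is hypothetical, and skipping blank lines is an assumption rather than documented behaviour.

```python
def dedupe_queries(lines):
    """Hypothetical helper: keep the first occurrence of each literal query, in order."""
    queries = (line.rstrip("\n") for line in lines)
    # dict.fromkeys preserves insertion order, so the first occurrence wins.
    return list(dict.fromkeys(q for q in queries if q.strip()))


print(dedupe_queries(["cats\n", "dogs\n", "cats\n"]))  # ['cats', 'dogs']
```

Because the match is literal, "cats" and "Cats" would count as different queries in this sketch.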

Usage:

twarc2 searches [OPTIONS] [INFILE] [OUTFILE]

Options:

  --since-id INTEGER              Match tweets sent after tweet id
  --until-id INTEGER              Match tweets sent prior to tweet id
  --start-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Match tweets created after UTC time (ISO
                                  8601/RFC 3339), e.g.  2021-01-01T12:31:04
  --end-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Match tweets sent before UTC time (ISO
                                  8601/RFC 3339)
  --archive                       Search the full archive (requires Academic
                                  Research track). Defaults to searching the
                                  entire twitter archive if --start-time is
                                  not specified.
  --limit INTEGER                 Maximum number of tweets to save *per
                                  search*, ignored if --counts-only is
                                  specified.
  --hide-progress                 Hide the Progress bar. Default: show
                                  progress, unless using pipes.
  --counts-only                   Only retrieve counts of tweets matching the
                                  search, not the tweets themselves. outfile
                                  will be a CSV containing the counts for all
                                  of the queries in the input file.
  --combine-queries               Merge consecutive queries into a single OR
                                  query. For example, if the three rows in
                                  your file are: banana, apple, pear then a
                                  single query ((banana) OR (apple) OR (pear))
                                  will be issued.
  --granularity [day|hour|minute]
                                  Aggregation level for counts (only used when
                                  --counts-only is used). Can be one of: day,
                                  hour, minute. Default is day.
  --help                          Show this message and exit.
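
The --combine-queries example can be reproduced with a short Python sketch. The function name is hypothetical. The sketch merges every query it is given and does not model how the command decides which consecutive queries to group together.

```python
def combine_queries(queries):
    """Hypothetical helper: wrap each query in parentheses and join them with OR."""
    return "(" + " OR ".join(f"({q})" for q in queries) + ")"


print(combine_queries(["banana", "apple", "pear"]))
# ((banana) OR (apple) OR (pear))
```

Wrapping each row in its own parentheses keeps a row that already contains operators from changing the meaning of its neighbours.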

stream

Fetch tweets from the live stream.

Usage:

twarc2 stream [OPTIONS] [OUTFILE]

Options:

  --limit INTEGER  Maximum number of tweets to return
  --help           Show this message and exit.

stream-rules

List, add and delete rules for your stream.

Usage:

twarc2 stream-rules [OPTIONS] COMMAND [ARGS]...

Options:

  --help  Show this message and exit.

add

Create a new stream rule to match a value. Rules can be grouped with optional tags.

Usage:

twarc2 stream-rules add [OPTIONS] VALUE

Options:

  --tag TEXT  a tag to help identify the rule
  --help      Show this message and exit.

delete

Delete the stream rule that matches a given value.

Usage:

twarc2 stream-rules delete [OPTIONS] VALUE

Options:

  --help  Show this message and exit.

delete-all

Delete all stream rules!

Usage:

twarc2 stream-rules delete-all [OPTIONS]

Options:

  --help  Show this message and exit.

list

List all the active stream rules.

Usage:

twarc2 stream-rules list [OPTIONS]

Options:

  --help  Show this message and exit.

timeline

Retrieve recent tweets for the given user.

Usage:

twarc2 timeline [OPTIONS] USER_ID [OUTFILE]

Options:

  --limit INTEGER                 Maximum number of tweets to return
  --since-id INTEGER              Match tweets sent after tweet id
  --until-id INTEGER              Match tweets sent prior to tweet id
  --exclude-retweets              Exclude retweets from timeline
  --exclude-replies               Exclude replies from timeline
  --start-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Match tweets created after time (ISO
                                  8601/RFC 3339), e.g.  2021-01-01T12:31:04
  --end-time [%Y-%m-%d|%Y-%m-%dT%H:%M:%S]
                                  Match tweets sent before time (ISO 8601/RFC
                                  3339)
  --use-search                    Use the search/all API endpoint which is not
                                  limited to the last 3200 tweets, but
                                  requires Academic Product Track access.
  --hide-progress                 Hide the Progress bar. Default: show
                                  progress, unless using pipes.
  --help                          Show this message and exit.

timelines

Fetch the timelines of every user in an input source of tweets. If the input is a line-oriented text file of user ids or usernames, those will be used instead.

The infile can be:

- A file containing one user id per line (either quoted or unquoted)
- A JSONL file containing tweets collected in the Twitter API V2 format
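
For the first input form, a user id may appear quoted or unquoted on its line. A minimal sketch of normalising such lines is shown below. The function name is hypothetical, and the exact quoting rules twarc applies are not stated here, so matching single or double quotes are assumed.

```python
def clean_user_id(line):
    """Hypothetical helper: strip whitespace and one pair of matching surrounding quotes."""
    value = line.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        value = value[1:-1]
    return value


print(clean_user_id('"12345"\n'))  # 12345
print(clean_user_id("12345\n"))    # 12345
```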

Usage:

twarc2 timelines [OPTIONS] [INFILE] [OUTFILE]

Options:

  --limit INTEGER           Maximum number of tweets to return
  --timeline-limit INTEGER  Maximum number of tweets to return per-timeline
  --use-search              Use the search/all API endpoint which is not
                            limited to the last 3200 tweets, but requires
                            Academic Product Track access.
  --exclude-retweets        Exclude retweets from timeline
  --exclude-replies         Exclude replies from timeline
  --hide-progress           Hide the Progress bar. Default: show progress,
                            unless using pipes.
  --help                    Show this message and exit.

tweet

Look up a tweet using its tweet id or URL.

Usage:

twarc2 tweet [OPTIONS] TWEET_ID [OUTFILE]

Options:

  --pretty  Pretty print the JSON
  --help    Show this message and exit.

users

Get data for user ids or usernames.

Usage:

twarc2 users [OPTIONS] [INFILE] [OUTFILE]

Options:

  --usernames
  --hide-progress  Hide the Progress bar. Default: show progress, unless using
                   pipes.
  --help           Show this message and exit.

version

Return the version of twarc that is installed.

Usage:

twarc2 version [OPTIONS]

Options:

  --help  Show this message and exit.