This site contains Twitter JSONL and media related to:
This data is provided "as is" for researchers, journalists, historians, and data hoarders.
Files are located here: https://echoarchive.org/files/
Mirror the full archive with wget (resumes partial downloads and skips the directory index pages):
wget -m -np -c -U FAQ -R "index.html*" "https://echoarchive.org/files/"
Search for a term in a specific JSONL.ZST file:
find ./ -type f -name "{{ file_name }}.zst" -exec zstd --long=28 -d -c {} \; | grep "{{ search_term }}" | less
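Because grep matches anywhere in the raw JSON line, a term can also hit user names, URLs, or other metadata. The following is a minimal sketch of restricting the match to one field with jq, assuming jq is installed and that each record stores the tweet body in a top-level "text" field; both the field name and the helper name search_text_field are assumptions, so check a sample record before relying on it.

```shell
# Hypothetical helper: print only the records whose "text" field contains TERM.
# Usage: search_text_field FILE.zst TERM
# Assumes jq is installed and that records use a top-level "text" field.
search_text_field() {
    zstd --long=28 -d -c "$1" |
        jq -c --arg term "$2" \
            'select(((.text? // "") | tostring) | contains($term))'
}
```

The match is a case-sensitive substring test. Records without a "text" field are skipped rather than treated as errors.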
Search an entire directory:
find ./ -type f -name "*.zst" -exec zstd --long=28 -d -c {} \; | grep "{{ search_term }}" | less
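Note: You may need to increase the --long value (up to 31 on 64-bit systems) for larger files, for example if zstd reports that a file requires too much memory to decompress.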
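The directory-wide search prints every match in one stream, so it does not show which file a line came from. A minimal sketch of one way around that, using the same zstd and grep options, is to decompress the files one at a time and prefix each match with its path. The function name search_zst is hypothetical, and grep -F is used so the term is matched as a literal string rather than a regular expression.

```shell
# Hypothetical helper: search every .zst file under DIR for a literal TERM
# and prefix each matching line with the path of the file it came from.
# Usage: search_zst DIR TERM
search_zst() {
    find "$1" -type f -name "*.zst" -exec sh -c '
        term=$1; shift
        # The remaining arguments are the file paths supplied by find.
        for file do
            zstd --long=28 -d -c "$file" | grep -F -- "$term" |
            while IFS= read -r line; do
                printf "%s: %s\n" "$file" "$line"
            done
        done
    ' sh "$2" {} +
}
```

For example, search_zst ./ "{{ search_term }}" | less pages through the labelled results. File names containing spaces are handled because find passes each path as a separate argument.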
twitter/
├── COVID_Tweets_2020_01-05
│   ├── 2020-01
│   ├── 2020-02
│   ├── 2020-03
│   ├── 2020-04
│   └── 2020-05
├── history
├── ukraine
│   ├── images
│   │   └── urls
│   ├── jsonl
│   ├── users
│   └── video
│       ├── contact_sheets
│       ├── sheet_videos
│       └── urls
└── various
Mirror of Tweet JL: The largest collection of tweets available in JSONL format.
Original Source: The Eye - Twitter Archive
There are probably better ways to iterate and search through the data. I am not a data analyst. If you want to contact me, please email: vid.archive9@gmail.com