pip3 install praw (used to benefit from its pagination and rate limiting).
- In order to use the Reddit API, you need to register a custom 'personal script' app and get the
client_id and client_secret parameters. See more here.
- In order to access a user's personal data (e.g. saved posts/comments), the Reddit API also requires the
username and password parameters (yes, your Reddit password).
- It might be convenient to dump these in a file, e.g.
  client_id = ...
  client_secret = ...
  username = ...
  password = ...
  and pass it via rexport.py --secrets /path/to/reddit_secrets.py. That way you type less and have control over where you keep your plaintext Reddit password.
Alternatively, you can pass auth arguments directly, e.g.
rexport.py --username <user> --password <password> --client_id <client_id> --client_secret <client_secret>. However, this is prone to leaking your password in shell history.
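Since the secrets file is just a handful of Python assignments, it can also be read from your own code. The snippet below is a minimal sketch of one way to do that with the standard library; `load_secrets` and `AUTH_KEYS` are names made up for this example, and this is not necessarily how rexport.py itself handles `--secrets`.

```python
import runpy

# The four values the secrets file is expected to define.
AUTH_KEYS = ("client_id", "client_secret", "username", "password")


def load_secrets(path):
    """Run a secrets file made of plain assignments and return only the auth values.

    Note: runpy.run_path executes the file, so only point it at files you trust.
    """
    namespace = runpy.run_path(path)
    missing = [key for key in AUTH_KEYS if key not in namespace]
    if missing:
        raise KeyError(f"missing from {path}: {', '.join(missing)}")
    return {key: namespace[key] for key in AUTH_KEYS}
```

The returned dict holds exactly the four auth values, so nothing else defined in the file leaks into whatever you pass it to.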
You can also import the script and call the
get_json function directly to get the raw JSON.
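If you do call get_json from your own code, a quick way to see what came back is to count the entries per section. This is only a sketch: it assumes the raw JSON is a dict whose top-level values are mostly lists, and `summarize` is a made-up helper. The keyword arguments in the commented usage are also an assumption (that they mirror the secrets file), so check the script before relying on them.

```python
def summarize(raw):
    """Return {section name: number of entries} for each list-valued top-level key."""
    return {key: len(value) for key, value in raw.items() if isinstance(value, list)}


# Hypothetical usage (not run here; needs rexport.py on your import path):
#
#   import rexport
#   raw = rexport.get_json(client_id=..., client_secret=...,
#                          username=..., password=...)  # assumed keyword arguments
#   print(summarize(raw))
```

Non-list values are skipped on purpose, so the summary only covers things that can be counted.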
WARNING: the Reddit API limits your queries to 1000 entries. It's highly recommended to back up regularly and keep old versions. An easy way to achieve this is a command like:
./rexport.py --secrets secrets.py >reddit-$(date -I).json
- your data is there, there are just no resources to serve it
- perhaps you can request all of your data under GDPR? I haven't tried that personally though.
- pushshift can potentially
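If you run backups from Python rather than the shell, the dated file name from the command above (reddit-$(date -I).json) is easy to reproduce. A minimal sketch, where `backup_path` and `write_backup` are names invented for this example and `raw` is whatever JSON-serializable data you got from the export:

```python
import json
from datetime import date
from pathlib import Path


def backup_path(directory, day=None):
    """Build <directory>/reddit-YYYY-MM-DD.json, like reddit-$(date -I).json."""
    day = day or date.today()
    return Path(directory) / f"reddit-{day.isoformat()}.json"


def write_backup(raw, directory, day=None):
    """Write raw export data to a dated file and return its path.

    Like the shell redirect, this overwrites a backup made earlier the same day.
    """
    path = backup_path(directory, day)
    path.write_text(json.dumps(raw, ensure_ascii=False), encoding="utf-8")
    return path
```

One file per day means older exports stay around, which is what you want given the 1000 entry limit.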
See ./output.json; it contains example data of the kind you might find in your own export. I've cleaned it up a bit, since there are lots of different fields, many of which are probably not relevant. However, this is pretty API-dependent and changes all the time, so better check the Reddit API documentation if you are looking for something specific.