this post was submitted on 20 Jun 2023

This is a followup to my previous PSA, https://kbin.social/m/RedditMigration/t/47320/PSA-If-you-have-more-than-1000-posts-more-than

That one warned you about the problem, which seems to have caught a lot of people unawares and caused a lot of confusion in general. (Thank you reddit!!! /s)

Now here's what you can do about it.

If you happen to know that you're only slightly over the limit in one category, odds are good that one of the modified versions of Power Delete Suite will work for you. (You want one with at least an additional 5.1s delay - the original PDS fails to delete/overwrite a lot of stuff because it tries too fast and reddit rejects the changes and imposes a cooldown period.)
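
To make the pacing idea concrete, here is a minimal sketch in Python. It is not PDS's actual code (PDS runs in the browser), and `action` is a hypothetical placeholder for whatever overwrite or delete call your tool makes. The only detail taken from above is the 5.1s delay between attempts.

```python
import time
from typing import Callable, Iterable, List


def process_with_delay(
    items: Iterable[str],
    action: Callable[[str], None],
    delay_seconds: float = 5.1,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call action(item) for each item, pausing between calls.

    Returns the items whose action raised, so they can be retried later.
    `action` is a hypothetical stand-in for the real overwrite/delete call.
    """
    failed: List[str] = []
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay_seconds)  # wait before every call except the first
        try:
            action(item)
        except Exception:
            failed.append(item)  # remember it instead of aborting the whole run
    return failed
```

Collecting failures rather than stopping means one rejected request doesn't cost you the rest of the run; you can feed the returned list back in for a second pass.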

PDS will delete your comments and submissions (posts), and it checks all four categories - new, hot, top, and controversial. So if you had, say, only 1001 or 1002, odds are good that the last few missing from new would show up in top (if old and upvoted a lot) or controversial (if old and downvoted a lot), and thus you'd still get everything wiped.

In theory you could get really lucky here. Say you have exactly 3000 comments - your newest 1000 have never been upvoted or downvoted, your oldest 1000 have all been heavily downvoted, and your middle 1000 have been heavily upvoted. Then PDS would get the newest, the most controversial 1000 - which happen to be the oldest, and the top 1000 - which happen to be the middle. All 3000 comments gone.
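
The lucky and unlucky cases can be checked with a small simulation. This is only a sketch under loud assumptions: "controversial" is approximated as "lowest score first" (reddit's real ranking is more involved), ties are assumed to go to the newer comment, and "hot" is left out entirely. The names `reachable_ids` and `make_comments` are made up for this illustration.

```python
from typing import Dict, List, Set

LISTING_CAP = 1000  # each listing stops after about this many items


def reachable_ids(comments: List[Dict], cap: int = LISTING_CAP) -> Set[int]:
    """Union of the ids visible through the capped new/top/controversial listings."""
    newest = sorted(comments, key=lambda c: c["created"], reverse=True)[:cap]
    # Assumption: ties in score are broken in favour of the newer comment.
    top = sorted(comments, key=lambda c: (c["score"], c["created"]), reverse=True)[:cap]
    # Assumption: "controversial" is simplified to "most downvoted first".
    controversial = sorted(comments, key=lambda c: (c["score"], -c["created"]))[:cap]
    return {c["id"] for listing in (newest, top, controversial) for c in listing}


def make_comments(scores: List[int]) -> List[Dict]:
    """One comment per score, oldest first; the index doubles as id and timestamp."""
    return [{"id": i, "created": i, "score": s} for i, s in enumerate(scores)]


# Lucky: oldest 1000 heavily downvoted, middle 1000 heavily upvoted, newest 1000 untouched.
lucky = make_comments([-50] * 1000 + [50] * 1000 + [1] * 1000)
# Unlucky: 1500 comments that nobody ever voted on.
unlucky = make_comments([1] * 1500)

print(len(reachable_ids(lucky)), "of", len(lucky))      # 3000 of 3000
print(len(reachable_ids(unlucky)), "of", len(unlucky))  # 1000 of 1500
```

In the unlucky case all three listings return the same 1000 comments, so the oldest 500 are never reachable no matter which category the tool checks.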

But what to do if you have A LOT beyond the limit? Say more than 4000? Also keep in mind, I was just shy of 1500 comments, and I wasn't so lucky - this method didn't get everything for me, despite not even hitting 2000.

A common tip is to search for your reddit self on google. This works really well to find more stuff but won't catch everything. I speak from experience here.

The right answer is to do a GDPR or CCPA request to reddit, and then they will have to give you your data, your complete history. You can then use this to wipe everything clean.

However, folks who started this around or after the blackout aren't getting timely responses. (As recently as the end of this past April, the data came back within a few hours. Now, nobody I know of who has put in a request has heard back yet.)

I'm skeptical that we'll get timely responses here.

So the other right answer used to be to use the Pushshift API instead (hat-tip to 1chemistdown - https://kbin.social/u/1chemistdown ). That API archived a copy of reddit - and unlike reddit, it didn't have the 1000 indexing limit. So you could find even older content just by searching yourself using that API.

The bad news is that this no longer works, as Pushshift was forced to shut down by reddit, and they won't be reopening normal API access for regular users.

The good news is that before all this went down, Pushshift published torrents of their raw data, covering everything up to the end of 2022. You can just download this and then extract your own post and comment history.

The Pushshift torrent is 1.66TB, compressed. But the Pushshift team also released really good scripts so it's easy to extract your own info without having to uncompress the full archive like you traditionally would with a zip file. Instead, their scripts uncompress a small segment at a time in RAM and process that, so you just need enough space to store the compressed files and enough compute power to process it. (Spoiler alert - most modern laptops have enough compute power. My cheap <$1k laptop did.)
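
Here is a rough sketch of the "small segment at a time" idea, to show why RAM isn't the bottleneck. This is not the Pushshift team's script. It assumes the decompressed data is one JSON object per line with an `author` field, and the function names are made up. It works on any binary stream, so the demo below uses an in-memory one.

```python
import io
import json
from typing import BinaryIO, Dict, Iterator


def iter_lines(stream: BinaryIO, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield newline-delimited lines while reading only one chunk at a time."""
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Everything before the last newline is complete; keep the tail for later.
        *complete, pending = (pending + chunk).split(b"\n")
        for line in complete:
            if line:
                yield line
    if pending:
        yield pending  # final line without a trailing newline


def iter_user_records(
    stream: BinaryIO, username: str, chunk_size: int = 1 << 20
) -> Iterator[Dict]:
    """Yield only the records whose author matches username (case-insensitive)."""
    wanted = username.lower()
    for line in iter_lines(stream, chunk_size):
        try:
            record = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        # Assumption: each record is an object carrying an "author" field.
        if isinstance(record, dict) and str(record.get("author", "")).lower() == wanted:
            yield record


# Demo on an in-memory stream standing in for the decompressed data.
sample = io.BytesIO(b'{"author": "alice", "id": "a1"}\n{"author": "bob", "id": "b1"}\n')
matches = list(iter_user_records(sample, "alice"))
```

For the real files you would pass in a decompressing stream instead of `io.BytesIO`. Assuming the third-party `zstandard` package, something like `zstandard.ZstdDecompressor(max_window_size=2**31).stream_reader(open(path, "rb"))` should work as the `stream` argument, but treat that as an assumption and check it against the scripts linked in my other article.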

Even better - you can selectively download specific files from a torrent. Pushshift's torrent is broken up into a bunch of files, two each per sub (one for comments and one for submissions). So if you know which subs you frequented, you can just pick and download those instead of having to download the full archive.
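
If you have a long list of subs, picking the files by hand gets tedious. Below is a small sketch that filters a torrent's file listing down to the two files per sub. The naming pattern `<sub>_comments.zst` / `<sub>_submissions.zst` is an assumption for illustration, so check it against the actual listing in your torrent client first. `files_for_subs` is a made-up name.

```python
from typing import Iterable, List

# Assumption: each sub has exactly these two files, named with these suffixes.
SUFFIXES = ("_comments.zst", "_submissions.zst")


def files_for_subs(torrent_files: Iterable[str], subs: Iterable[str]) -> List[str]:
    """Return the paths from torrent_files that belong to one of the given subs."""
    wanted = {sub.lower() for sub in subs}
    selected: List[str] = []
    for path in torrent_files:
        base = path.rsplit("/", 1)[-1]  # drop any directory part
        for suffix in SUFFIXES:
            if base.endswith(suffix) and base[: -len(suffix)].lower() in wanted:
                selected.append(path)
    return selected
```

Matching on the whole sub name (rather than a prefix) avoids accidentally pulling in a similarly named sub.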

So grab your reddit data from the Pushshift torrent, use their scripts to print out the complete list of every comment and every post that you have ever created on reddit, and then feed that to something like the reddit-migration script or a modified shreddit script to completely wipe everything. 100% of everything.
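
As a sketch of the hand-off step: reddit identifies a comment as `t1_<id>` and a submission as `t3_<id>`, so the extracted records can be turned into one deduplicated worklist. What input the reddit-migration script or a modified shreddit actually expects isn't covered here, so treat this as a generic intermediate format. `build_worklist` is a made-up name, and the `id` field is assumed to be present on each record.

```python
from typing import Dict, Iterable, List

COMMENT_PREFIX = "t1_"     # reddit's type prefix for comments
SUBMISSION_PREFIX = "t3_"  # reddit's type prefix for submissions (posts)


def build_worklist(comments: Iterable[Dict], submissions: Iterable[Dict]) -> List[str]:
    """Turn extracted records into a deduplicated list of reddit fullnames."""
    seen = set()
    worklist: List[str] = []
    for prefix, records in ((COMMENT_PREFIX, comments), (SUBMISSION_PREFIX, submissions)):
        for record in records:
            fullname = prefix + record["id"]  # assumption: every record has an "id"
            if fullname not in seen:  # the same item may appear more than once
                seen.add(fullname)
                worklist.append(fullname)
    return worklist
```

Comparing the length of this list against what your profile page shows is a quick sanity check on how much the 1000 limit was hiding.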

No need to depend on the benevolence of reddit to hand you your history back first. (For those in the EU or California, I'm sure reddit will get back to you eventually, but you'll forgive me for being more skeptical when these powerful laws don't apply to me.)

The specific details on where to grab everything, including links to the torrent, the various github scripts, and even the patches that I wrote, are in my other article - https://kbin.social/m/RedditMigration/t/59451/Finally-Managed-to-erase-all-1477-of-my-comments

TL;DR - you can grab your entire reddit history from a torrent published by pushshift up to the end of 2022, and use that data to overwrite/delete your content beyond the 1000 index limit on reddit.

top 5 comments
[–] fossilesque@mander.xyz 4 points 1 year ago* (last edited 1 year ago)

Bless this post.

Interestingly, I just tried to delete all of my gilded posts by hand and reddit kept booting me off.

280k karma, 10-year account. Bye, Felicia.

[–] Eggyhead@kbin.social 2 points 1 year ago (2 children)

Could you set your VPN to Europe and make a request for your history?

[–] abff08f4813c@kbin.social 1 points 1 year ago

Brilliant! This would probably work. If pressed, I can just say that I moved recently or something. Heck, even if it ends up in court, if I'm using a VPN to transmit my content through a specific country (say Ireland or France), maybe that'd be enough to count, since arguably my data is now being processed in the EU!

[–] dan@upvote.au 0 points 1 year ago* (last edited 1 year ago) (1 children)

You shouldn't have to. GDPR applies for all EU citizens regardless of the country they're currently living in, and I doubt they're checking that people that submit GDPR requests are actually European.

[–] abff08f4813c@kbin.social 1 points 1 year ago

Hmm, interesting point. I ultimately decided against it - reddit isn't fully complying at the moment (see https://kbin.social/m/RedditMigration/t/99466/Reddit-violates-CCPA ) and if it comes down to a fight, then I'd still have no recourse. reddit might not check, but the ICO or whoever I try to appeal to for relief probably would.