this post was submitted on 18 Jun 2023
59 points (100.0% liked)

Reddit Migration

### About Community

Tracking and helping #redditmigration to Kbin and the Fediverse. Say hello to the decentralized and open future. To see the latest Reddit blackout info, see here: https://reddark.untone.uk/

founded 1 year ago
 

I just spent all day today fighting with reddit, trying to get all my comments deleted/overwritten: https://kbin.social/m/RedditMigration/t/45417/Anyone-have-experience-with-deleting-comments-to-see-older-comments#entry-comment-190482

It's not just me, someone else reported the same, though using a different tool: https://kbin.social/m/RedditMigration/t/46805/Strange-phenomenon-while-deleting-my-comments

Basically, reddit has the most ridiculous api ever! A 1000 limit on viewing .. well basically anything. Try to go further back, and you can't.

The tools, scripts and websites we are using to delete are hitting that limit and can't go past it. My own reddit account is only 5 years old and I hit this. I imagine many folks did too: ex-redditors who had 12 or 17 year old accounts, you probably didn't get everything on your way out.

Unless of course, you had a data retrieval request made to reddit, and reddit responded with your data. Only then are tools like shreddit and websites like shreddit.com able to completely wipe out your history. Or else you knew about this somehow already and used an external manager like eternity - https://github.com/jc9108/eternity - to save a copy of your posts before they got lost to the 1k limit.

Worst of all, it's explained that deleting items does not rebuild the list - so you can't see the older stuff by deleting newer stuff.

I'm hoping that private/public transition is an exception to this and it'll rebuild my lists when that happens. Maybe then I can go far back enough to delete everything.

Edit: Nope, someone confirmed in a comment below that this doesn't happen.

Also, it looks like pushshift is not an option: it was shut down last month (https://old.reddit.com/r/pushshift/comments/13mhuzq/api_has_been_taken_down/), and under the new deal regular users won't be able to use it when it opens up for business again. Only approved moderators can (and likely only for approved reasons), if I'm understanding https://old.reddit.com/r/pushshift/comments/13w6j20/advancing_communityled_moderation_an_update_on/ correctly.

top 42 comments
[–] gk99@kbin.social 6 points 1 year ago* (last edited 1 year ago) (2 children)

Honestly, what's really important to me is that my top stuff was deleted, rather than all the other horseshit I posted day to day on reddit. As much as possible would be good, but what I really want is to leave behind holes in major discussions, turning the place into Emmentaler. Thankfully, this is my third account wipe so I'm not that worried.

Edit: Plus I haven't actually deleted my account yet and continually run a delete script every time I see comments come back. Hell, I cleared it out early today manually at work while I had free time.

[–] H_Interlinked@kbin.social 1 points 1 year ago (1 children)

That's what I just got done doing. Submissions sorted by Top, deleted. Comments sorted by Top, deleted.

[–] abff08f4813c@kbin.social 2 points 1 year ago

I've used all four methods - new, top, hot, controversial - and each one brought up new stuff that got missed by another category. After that I used google - and that brought more stuff that got missed by all four. I see why at this point most people would say, "good enough" and call it a day.
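The observation above, that each sort order surfaces items the others miss, suggests collecting IDs from every listing and de-duplicating them before deleting. Here is a minimal sketch on simulated data. The cap of 1000 is the figure from this thread; `LISTING_CAP`, `capped_listing`, `union_of_listings` and the made-up scores are assumptions for illustration only, not anything Reddit or a deletion tool provides.

```python
import random

LISTING_CAP = 1000  # per-listing cap as described in this thread


def capped_listing(items, key, reverse=True, cap=LISTING_CAP):
    """Return the IDs one sorted listing would expose, truncated at the cap."""
    ordered = sorted(items, key=key, reverse=reverse)
    return [item["id"] for item in ordered[:cap]]


def union_of_listings(listings):
    """Merge several ID lists, keeping first-seen order and dropping duplicates."""
    seen = set()
    merged = []
    for ids in listings:
        for item_id in ids:
            if item_id not in seen:
                seen.add(item_id)
                merged.append(item_id)
    return merged


# Simulated account: 3000 comments with made-up timestamps and scores.
rng = random.Random(0)
comments = [
    {"id": f"c{i}", "created": i, "score": rng.randint(-50, 500)}
    for i in range(3000)
]

by_new = capped_listing(comments, key=lambda c: c["created"])
by_top = capped_listing(comments, key=lambda c: c["score"])
reachable = union_of_listings([by_new, by_top])

# Each listing is capped, the union is larger than either, but still incomplete.
print(len(by_new), len(by_top), len(reachable), len(comments))
```

In this toy model the union reaches more than any single listing but still falls short of the full history, which matches what was reported above.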

[–] abff08f4813c@kbin.social 1 points 1 year ago

This is a good strategy.

[–] arquebus_x@kbin.social 6 points 1 year ago (2 children)

I used Redact and afterwards, before I deleted my accounts, I checked to see if there were any posts/comments still showing up under my user profile and I didn't see any. Account was over a decade old. Not sure if it really did kill everything, but even if not, it was close enough for my needs (a middle finger to the Reddit-man).

[–] trk@kbin.social 2 points 1 year ago (1 children)

If you search...

"your username" site:reddit.com

... you will probably find lots of comments that weren't deleted, but don't show up in your profile.

[–] abff08f4813c@kbin.social 1 points 1 year ago

Already did that successfully. But I lack confidence that the search engine found everything.

[–] abff08f4813c@kbin.social 2 points 1 year ago (2 children)

Cool. As long as the goal is accomplished.

That said the 1000 limit applies to your profile too. It will look empty but you can still have things that got missed, unfortunately.

[–] Anomander@kbin.social 3 points 1 year ago (2 children)

AFAIK, or at least as Reddit has said in the past, the 1K limit should roll backwards as you delete recent content from it. It's a display limit to prevent data usage through scraping, not a hard limit on the database.

[–] ono@lemmy.ca 1 points 1 year ago (1 children)

> It’s a display limit to prevent data usage through scraping, not a hard limit on the database.

It's neither. It's an index limit. The older messages are still in the database, but they won't appear in listings, because those are built from indexes.

[–] abff08f4813c@kbin.social 1 points 1 year ago

Yeah, that's how it looks to me as well. Thus far I've not seen anything older come back into my indexes after deleting.

[–] abff08f4813c@kbin.social 1 points 1 year ago

Can you find the cite for that? What I found from hackernews (and saw something similar on a subreddit that was supposed to be from a reddit dev itself) says differently. More like an index limit than a display limit.

[–] 2muchcaffeine4u@lemmy.fmhy.ml 0 points 1 year ago (1 children)

I've noticed this by googling my username and finding reddit comments still present that don't show up on my profile.

[–] abff08f4813c@kbin.social 1 points 1 year ago

Yeah, I did the same and then handled those manually. (I could have probably just grabbed the URLs and found a script to scrub them but I had few enough that it was probably faster to do it by hand in my case.)
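For anyone with too many search hits to handle by hand, the "grab the URLs" step could look roughly like this. It is a sketch that assumes the usual permalink shape `/r/<sub>/comments/<post_id>/<slug>/<comment_id>/`; the `PERMALINK` pattern, the `extract_ids` helper and the example URLs are all hypothetical.

```python
import re

# Assumed permalink shape: /r/<sub>/comments/<post_id>/<slug>/<comment_id>/
PERMALINK = re.compile(
    r"reddit\.com/r/(?P<sub>[^/]+)/comments/(?P<post>[a-z0-9]+)"
    r"(?:/[^/?#]*)?(?:/(?P<comment>[a-z0-9]+))?"
)


def extract_ids(urls):
    """Split search-result URLs into comment IDs and post IDs, de-duplicated."""
    comments, posts = [], []
    for url in urls:
        match = PERMALINK.search(url)
        if not match:
            continue  # not a Reddit permalink, skip it
        if match.group("comment"):
            comments.append(match.group("comment"))
        else:
            posts.append(match.group("post"))
    # dict.fromkeys keeps the first occurrence and preserves order
    return list(dict.fromkeys(comments)), list(dict.fromkeys(posts))


# Made-up example URLs, as they might be copied from a search engine.
found = [
    "https://www.reddit.com/r/example/comments/abc123/some_title/def456/",
    "https://old.reddit.com/r/example/comments/abc123/some_title/def456/?context=3",
    "https://www.reddit.com/r/example/comments/xyz789/another_title/",
    "https://example.com/not-reddit",
]

comment_ids, post_ids = extract_ids(found)
print(comment_ids, post_ids)
```

The resulting IDs could then be handed to whatever script does the overwriting.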

[–] terath@kbin.social 5 points 1 year ago (2 children)

Others have been reporting that their deleted or shredded comments are restored the next day. Are yours actually staying deleted up to the 1000 limit?

[–] abff08f4813c@kbin.social 4 points 1 year ago

I checked and so far nothing has come back that I deleted/overwrote. But it's like 24 hours or something so maybe still too soon to be sure.

Only stuff past the limit and stuff on subs that are still private/going public after I ran, so far.

[–] Anomander@kbin.social 4 points 1 year ago

I can certainly attest that Reddit is not restoring 100% of those edit-for-purge cases. A couple of users have run those in communities I mod, and the edits have remained in effect a few days later.

[–] abff08f4813c@kbin.social 4 points 1 year ago* (last edited 1 year ago)

Does anyone have experience with the pushshift API?

Looks like that might be a way forward: https://stackoverflow.com/questions/59533629/praw-how-to-get-a-reddit-user-total-number-of-submissions-when-it-is-greater-th

Edit: looks like this is out. Pushshift was shut down last month (https://old.reddit.com/r/pushshift/comments/13mhuzq/api_has_been_taken_down/), and under the new deal regular users won't be able to use it when it opens up for business again. Only approved moderators can (and likely only for approved reasons), if I'm understanding https://old.reddit.com/r/pushshift/comments/13w6j20/advancing_communityled_moderation_an_update_on/ correctly.

[–] earthling@kbin.social 3 points 1 year ago* (last edited 1 year ago) (1 children)

I think I may be running into the index thing. I can easily find old, unedited comments of mine by using site:reddit.com "username".

My next move is to request my data from reddit which, as I understand it, should contain a list of comments in .json. I then plan on iterating through those and using PRAW to edit all of my comments going back 14 years. Then I'll delete my account.
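A rough sketch of that plan, under stated assumptions: the export is read here as a CSV with an `id` column (if it turns out to be JSON, only the loader changes), and `load_comment_ids`, `overwrite_then_delete` and `FakeComment` are hypothetical names. The PRAW wiring appears only in a comment so the block runs without credentials or network access.

```python
import csv
import io


def load_comment_ids(csv_file):
    """Read comment IDs from an export; assumes a header row with an 'id' column."""
    return [row["id"] for row in csv.DictReader(csv_file) if row.get("id")]


def overwrite_then_delete(comment_ids, get_comment, placeholder="[removed by author]"):
    """Edit each comment to a placeholder, then delete it. Returns the failures."""
    failed = []
    for comment_id in comment_ids:
        try:
            comment = get_comment(comment_id)
            comment.edit(placeholder)
            comment.delete()
        except Exception as exc:  # keep going and report failures at the end
            failed.append((comment_id, repr(exc)))
    return failed


# With PRAW this might be wired up roughly as follows (placeholder credentials):
#   import praw
#   reddit = praw.Reddit(client_id="...", client_secret="...", username="...",
#                        password="...", user_agent="comment wiper by u/yourname")
#   with open("comments.csv", newline="", encoding="utf-8") as fh:
#       failed = overwrite_then_delete(load_comment_ids(fh), reddit.comment)


class FakeComment:
    """Stand-in for a PRAW comment object, used only for this dry run."""

    log = []

    def __init__(self, comment_id):
        self.id = comment_id

    def edit(self, body):
        FakeComment.log.append(("edit", self.id, body))

    def delete(self):
        FakeComment.log.append(("delete", self.id))


export = io.StringIO("id,body\nabc123,hello\ndef456,world\n")  # made-up export
failed = overwrite_then_delete(load_comment_ids(export), FakeComment)
print(FakeComment.log, failed)
```

Passing the lookup function in as `get_comment` keeps the loop testable against a fake before it is pointed at a real account.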

[–] abff08f4813c@kbin.social 1 points 1 year ago (1 children)

This was my plan as well, but I'm worried that the request will only be answered after July 1st, and maybe the tools will break with the API changes that happen then.

[–] earthling@kbin.social 2 points 1 year ago (1 children)

Will that affect even small-time users like us who hardly ever use the API? That would kill things like the conversionbot, remindme, etc too.

[–] abff08f4813c@kbin.social 3 points 1 year ago

> That would kill things like the conversionbot, remindme, etc too.

Well, yeah.

> Will that affect even small-time users like us who hardly ever use the API?

Maybe not. But we'd need to keep to under 100 API calls a day. So - say we get our archives from reddit in July, and then we manage to do some finagling to filter out the stuff already redacted by shreddit/redact.dev/whatever that we are doing now.

Say we then have 33 comments left to redact. Or 22 comments, 11 posts. One api call to retrieve info about it (including content), one api call to edit to overwrite, and one api call to delete. That puts us at 99 api calls.

I guess someone could modify shreddit so that when running on the archive, it does 100 api calls max, then sleeps whatever time period required, then wakes up with the limit reset. Might work, just take longer. But we'll have to see.
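The arithmetic and the sleep-and-resume idea above can be sketched like this. The 100-call budget and the three calls per item are the figures assumed in this comment, not confirmed limits, and `items_per_window` and `run_in_windows` are hypothetical helpers rather than part of shreddit.

```python
import time

CALLS_PER_ITEM = 3   # fetch, overwrite, delete (as counted above)
DAILY_BUDGET = 100   # budget assumed in this comment, not a confirmed limit


def items_per_window(budget=DAILY_BUDGET, calls_per_item=CALLS_PER_ITEM):
    """How many items fit in one window without exceeding the call budget."""
    return budget // calls_per_item


def run_in_windows(items, process, budget=DAILY_BUDGET,
                   calls_per_item=CALLS_PER_ITEM,
                   window_seconds=24 * 60 * 60, sleep=time.sleep):
    """Process items in batches, sleeping one full window between batches."""
    batch_size = items_per_window(budget, calls_per_item)
    if batch_size < 1:
        raise ValueError("budget is too small to process even one item")
    windows = 0
    for start in range(0, len(items), batch_size):
        if start > 0:
            sleep(window_seconds)  # wait for the budget to reset
        for item in items[start:start + batch_size]:
            process(item)
        windows += 1
    return windows


# Dry run: record the sleeps instead of actually waiting a day.
naps = []
done = []
windows = run_in_windows(list(range(70)), done.append, sleep=naps.append)
print(items_per_window(), windows, len(naps))  # prints "33 3 2"
```

With these numbers, 70 leftover items would take three windows, so the job still finishes, just slowly.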

[–] 1chemistdown@kbin.social 2 points 1 year ago (1 children)
[–] abff08f4813c@kbin.social 2 points 1 year ago* (last edited 1 year ago) (2 children)

I might need some help with this. When I try to run the script, it just tells me that there's a 404 not found error trying to connect to pushshift, then it keeps retrying until it hits a limit and gives up.

Someone else on this thread mentioned that pushshift was down, seems to be backed up by https://old.reddit.com/r/pushshift/comments/13mhuzq/api_has_been_taken_down/ and https://stats.uptimerobot.com/l8RZDu1gBG

[–] 1chemistdown@kbin.social 1 points 1 year ago

I have not gotten to the point of running everything yet because I need u/iamthatis to finish the transfer of my pixel pal off the Apollo app. There are a few other ones that also use pushshift api to clear your user info, and that is the only way to clear your stuff. ¯\_(ツ)_/¯

[–] 1chemistdown@kbin.social 1 points 1 year ago (1 children)

Just successfully deleted and overwrote all comments and posts using https://github.com/j0be/PowerDeleteSuite

I went to r/pushshift and their sticky post has a link to request deletion of everything from pushshift.

Have not deleted all accounts yet. I’m waiting to see about porting pixel pal from the Apollo app, and my main is a good communication tool. Deleted a one-off account that didn’t have an email associated with it. The rest I’m using as a test to see if comments/posts return. They’ll be deleted soon.

[–] abff08f4813c@kbin.social 3 points 1 year ago (1 children)

I tried PDS from that same link. It didn't do the job - I checked immediately afterwards and I had a bunch of stuff that it missed visible on my profile page.

Even when it works perfectly, it seems like it wouldn't get everything - your profile only shows 1000 at most (in my case I see about eight hundred something since some of my more recent comments are hidden behind blackedout subs but it would be one thousand otherwise). And PDS just deletes from your profile, so if you can't see it there, then PDS can't get it.

[–] 1chemistdown@kbin.social 2 points 1 year ago (1 children)

My largest account was 10 years old, with over 3000 comments and over 500 posts. It missed one post out of everything. That got missed when I closed my laptop to grab the peeing 3 year old. When I opened it up, it restarted but reported the error on that one.

It does not delete saved posts/comments, hidden, upvoted, and downvoted.

[–] abff08f4813c@kbin.social 3 points 1 year ago (1 children)

500 posts is under the limit so that makes sense.

For the comments, since the oldest ones don’t show up in your profile anywhere, how did you verify that PDS deleted them?

[–] 1chemistdown@kbin.social 2 points 1 year ago (1 children)

I have not deleted the account. I can login. I’m awaiting my pushshift confirmation and going through manually scrubbing saved posts and comments. Saving some recipes and other shit. I’m keeping an eye out for the reinstatement of comments if it happens.

[–] abff08f4813c@kbin.social 1 points 1 year ago (1 children)

> I’m awaiting my pushshift confirmation

Can you elaborate more on this? If there's a way to get pushshift access, even if just temporarily and for a short time, this would be really great! I could finish running the PSAW script and wipe everything now.

> I have not deleted the account. I can login.

I know, you said this.

> I’m keeping an eye out for the reinstatement of comments if it happens.

Sorry to be a bit thick here. This makes it sound like you didn't have a way to verify the older 2000 comments are gone and were just assuming that they are gone because your profile doesn't show anything.

Again, sorry to be a bit pushy here. It's just that, if PDS can really do this, bypass the 1000 index limits, then I am willing to give it another try (and maybe share some fixes since it seems to be broken on my browser).

But I'd rather not waste my time if that's not the case. So a confirmation (about the ability of PDS to bypass the 1k limit) would be super helpful!

After plan B (pushshift) failed (due to pushshift being down), my plan C also failed (I saw the pushshift torrent on archive.org but it's too huge for me to grab).

But I found a plan D - my earliest comments were restricted to just three or four relatively small subs across six months, and someone posted on how to download just the dumps for specific subs in specific year-months.

https://news.ycombinator.com/item?id=36038684
https://academictorrents.com/details/c398a571976c78d346c325bd75c47b82edf6124e/tech&filelist=1

So I'm going to try and download the relevant files (much smaller, only teens of MB compressed), search for my own comments, and feed them to a script for overwriting. With this I think I'll have even my oldest comments covered.
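The "search for my own comments" step might look something like this. It is a sketch that assumes each dump file has already been decompressed to newline-delimited JSON with `author` and `id` fields; `find_by_author` and the sample records are made up for illustration.

```python
import json


def find_by_author(lines, author):
    """Yield records whose 'author' matches, ignoring case and unparseable lines."""
    wanted = author.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip damaged lines rather than aborting the scan
        if isinstance(record, dict) and str(record.get("author", "")).lower() == wanted:
            yield record


# Made-up sample lines standing in for one decompressed dump file.
sample_lines = [
    '{"id": "c1", "author": "alice", "subreddit": "example", "body": "hi"}',
    '{"id": "c2", "author": "bob", "subreddit": "example", "body": "yo"}',
    "not json at all",
    '{"id": "c3", "author": "Alice", "subreddit": "example", "body": "again"}',
]

mine = [record["id"] for record in find_by_author(sample_lines, "alice")]
print(mine)
```

Because the function takes any iterable of lines, an open file handle works as input, so a dump never needs to be loaded into memory in full.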

> and going through manually scrubbing saved posts and comments. Saving some recipes and other shit.

How many of these did you have? I never saved anything (didn't really know about the feature until after the blackout) but supposedly it has a 1000 index limit as well.

[–] 1chemistdown@kbin.social 1 points 1 year ago (1 children)

Go to the r/pushshift sub and the stickied post for deleting your pushshift content. Follow the directions in there.

I had pds copy and report all comments before overwriting them and have a csv file of all the comments deleted. PDS ran for 45 minutes on my largest account.

[–] abff08f4813c@kbin.social 1 points 1 year ago

I think there's been a misunderstanding - you're just trying to delete your data from pushshift itself by following https://teddit.adminforge.de/r/pushshift/comments/10yj803/removal_request_form_please_put_your_removal/ right?

I thought maybe you had a confirmation to access their API directly. Oh well.

BTW, you can run a line count on the CSV from PDS. If you really had 3000 comments and 500 posts, the CSV should have at least 3500 lines (one per post/comment). It will probably have a lot more, as PDS uses quotes to make multi-line CSV records. But if you have fewer, then that's a red flag that PDS might not have saved or erased everything.
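Since quoted fields can span several lines, counting parsed records is a tighter check than counting raw lines. A small sketch using Python's `csv` module follows; the column names and sample rows are made up, as the actual PDS export layout is not shown in this thread.

```python
import csv
import io


def count_lines_and_records(text, has_header=True):
    """Return (raw line count, parsed CSV record count excluding the header)."""
    line_count = text.count("\n") + (0 if text.endswith("\n") or not text else 1)
    records = list(csv.reader(io.StringIO(text, newline="")))
    record_count = len(records) - (1 if has_header and records else 0)
    return line_count, record_count


# Made-up sample: the first comment contains a line break inside quotes.
sample = (
    "id,body\n"
    'c1,"first line\nsecond line"\n'
    "c2,single line\n"
)

print(count_lines_and_records(sample))  # prints "(4, 2)"
```

Here the file has four raw lines but only two comments, so it is the record count that should be compared against the expected 3500.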

For my much smaller account, PDS also ran for longer - over an hour. But maybe internet speeds have something to do with that as well.

[–] ono@lemmy.ca 1 points 1 year ago* (last edited 1 year ago) (2 children)

> reddit has the most ridiculous api ever! A 1000 limit on viewing …

It's not an API limit. It's that they only index the most recent 1000 items. That applies to your comments, your posts, I think even posts within a subreddit. The limit applies separately to each listing, so sorting by new might find a different 1000 than sorting by controversial.

As you discovered, they don't re-index old messages when you delete new ones. But you can still reach your older posts if you can find them some other way than a listing, like in search results.
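A toy model of that behaviour, purely for illustration: the listing is a capped snapshot that is not backfilled after deletions, while lookups by ID go straight to the store. `ToyListingIndex` is hypothetical and says nothing about how Reddit is actually implemented.

```python
class ToyListingIndex:
    """Toy model: a listing index built once and never refilled after deletions."""

    def __init__(self, all_ids, cap=1000):
        self.store = {item_id: "visible" for item_id in all_ids}  # the "database"
        self.index = list(all_ids[:cap])  # newest-first snapshot, capped

    def listing(self):
        """IDs still shown in the listing: indexed and not deleted."""
        return [i for i in self.index if self.store[i] == "visible"]

    def delete(self, item_id):
        """Mark an item deleted; the index is NOT backfilled with older items."""
        self.store[item_id] = "deleted"

    def fetch_by_id(self, item_id):
        """Direct lookup bypasses the index, like following a search-result link."""
        return self.store.get(item_id)


ids = [f"c{i}" for i in range(2500, 0, -1)]  # newest first: c2500 ... c1
toy = ToyListingIndex(ids)

for item_id in toy.listing():
    toy.delete(item_id)

print(len(toy.listing()))      # prints 0: the profile looks empty
print(toy.fetch_by_id("c1"))   # prints "visible": the oldest comment is still there
```

In this model an empty profile does not mean an empty history, which is why finding the older IDs through search or a data request matters.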

Pushshift was a good way to find your old messages, but it stopped working for me when Reddit cut off their access. I think the best way now is to make that data request.

[–] abff08f4813c@kbin.social 2 points 1 year ago

Yeah, that's what I meant - index limit.

Wish you had been around earlier - I specifically asked this before deleting my newest comments, and mostly got the impression that this would make my older ones show up. So far, 24 hours and nothing.

Someone else on this thread suggested a script using pushshift's API to find older stuff and then delete thru reddit's api. I'm going to take a look - hopefully that still works (the ones past the 1k limit are more than five years old so hopefully aren't affected by the cutoff, seeing as pushshift must have its own database).

[–] abff08f4813c@kbin.social 2 points 1 year ago

Ok, looks like the pushshift shutdown broke the script. I still have one other shot - my comments are from Dec 2017 to May 2018, so I think they would hopefully be included in this dump, available as a torrent from archive.org - https://archive.org/details/pushshift_reddit_200506_to_202212

Jeez, that dump is going to be huge. Wish they'd say how big it was. I question whether I have a big enough disk to hold all of reddit up to 2022...

[–] crshbndct@kbin.social 1 points 1 year ago (1 children)

I've heard good things about redact.dev? I used to wipe my whole account about once a fortnight anyway, so it was no biggie for me.

[–] abff08f4813c@kbin.social 2 points 1 year ago* (last edited 1 year ago)

If you read thru what I linked, you'll find another poster who used redact.dev and got caught by the exact same thing. Even redact.dev isn't immune to the 1000 limit, unfortunately.

[–] MrComradeTaco@lemmy.fmhy.ml 1 points 1 year ago (1 children)

I have successfully used shreddit for that porpoise. 👍👍

[–] abff08f4813c@kbin.social 2 points 1 year ago (1 children)

Which one, from github or shreddit.com? Both the version you install from github, and the website, state that they can only delete everything if you have the data retrieval response from reddit. Otherwise they just delete the latest.

Some folks were smart and have been running shreddit once a month since they got on reddit, which makes sure that they get everything (unless you are someone who can make 1000 comments in a month or something - but even in that case you just need to run shreddit more often).

[–] MrComradeTaco@lemmy.fmhy.ml 1 points 1 year ago (1 children)

I used the webpage version and it worked fine.

[–] abff08f4813c@kbin.social 1 points 1 year ago

You probably had less than 1000 public posts and less than 1000 public comments then.

But hey, as long as it worked, amirite?

Sadly the evidence is mounting that it may not work for me.
