How to performantly bulk remove dormant users from a forum?

  • zipit@community.nodebb.org wrote:
    #1

    Hey everyone,

    I am trying to clean dormant and spam users out of our forum. We have roughly 60000 accounts (sic!), of which about 56000 are spam accounts with no posts at all.

    I have written a small Python script which reaches into our MongoDB database and identifies ‘invalid’ accounts based on a handful of criteria, such as the user having no posts, URLs in the user's profile, and more. With it I can quite accurately separate spam from legit accounts. The problem is that when I just delete these documents and their directly related documents (e.g., for user:100 also user:100:emails, user:100:settings, …) in the Mongo database, I end up with a NodeBB instance that looks functional at first glance, but secondary data has not been updated, as NodeBB does not seem to be very atomic. For example, the users list on the dummy forum now has countless empty pages: the users are gone, but whatever feeds that user list has not been updated. I already rebuilt the forum, but this did not change anything.
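A minimal sketch of the kind of criteria check described above, not the actual script. The field names `postcount`, `website`, `aboutme`, and `signature` are assumptions about what the user documents contain, and the rule "no posts and a URL somewhere in the profile" is only one possible combination of the criteria mentioned.

```python
import re

# Very rough URL detector; an assumption, tune it for your own data.
URL_PATTERN = re.compile(r"https?://|www\.", re.IGNORECASE)

# Hypothetical profile fields to scan for links.
PROFILE_FIELDS = ("website", "aboutme", "signature")


def looks_like_spam(user: dict) -> bool:
    """Return True if the user has no posts and a URL in the profile."""
    # Counters may be stored as strings or numbers, so normalize to int.
    post_count = int(user.get("postcount", 0) or 0)
    if post_count > 0:
        return False
    profile_text = " ".join(str(user.get(field) or "") for field in PROFILE_FIELDS)
    return bool(URL_PATTERN.search(profile_text))
```

Keeping the check as a pure function over one document makes it easy to dry-run it against an export of the user documents before anything gets deleted.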

    I also had a look at the Write API. I have not (yet) got bulk user account deletion to work, but when I use the endpoint /api/v3/users/{uid}, my script's progress looks like this: Processing users: 1%| 320/56329 [11:13<32:23:18, 2.08s/user]. That is, NodeBB takes about 2 seconds to delete a single user account, which adds up to more than a day of processing time in total. I cannot be the first one with this problem, right? I did not find any solutions. I also found /nodebb/src/api/users.js:processDeletion and the lower-level nodebb/src/user/delete.js:User.deleteAccount, but it is not clear to me which database documents I have to delete and update.
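The "more than a day" figure follows directly from the numbers in the progress bar. A small sketch of the arithmetic, assuming the rate of 2.08 s per user stays constant and deletions run strictly one after another (the function name is made up for illustration):

```python
from datetime import timedelta


def estimated_duration(total_users: int, seconds_per_user: float) -> timedelta:
    """Estimate the total runtime for sequential, constant-rate deletions."""
    return timedelta(seconds=total_users * seconds_per_user)


# Numbers taken from the progress bar: 56329 users at 2.08 s/user.
total = estimated_duration(56329, 2.08)
hours = total.total_seconds() / 3600
```

With these inputs `hours` comes out at roughly 32.5, so the full run would take about a day and a third.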

    Cheers,
    zipit

      julian@community.nodebb.org wrote:
      #2

      zipit, if the accounts have no actual content, you can just call .deleteAccount, as that’s more lightweight.

      User deletion takes so long because of all those cross-referenced sets. There are probably opportunities for optimization there.
