Hey folks! Just realized something that makes Lemmy different from Reddit. Because of the federation, your votes are not technically anonymous on Lemmy. At least, I think.

Although there’s no UI to look at a user’s voting history yet, one could conceivably be built by an instance. Perhaps coincidentally, I hear there are instances out there populated mostly by bots?

  • UrbenLegend@lemmy.ml
    English
    arrow-up
    48
    ·
    2 years ago

    From a technical standpoint, it’s not different from Reddit. The only difference here is that normal people can host their own instances, whereas Reddit is only hosted by the company and they can keep it under wraps.

    • o_o@programming.devOP
      English
      arrow-up
      24
      arrow-down
      1
      ·
      2 years ago

      Agreed from a technical standpoint.

      But the implications are still interesting. One might (big might) trust Reddit as an organization not to use this data for evil, but with federation, there’s nothing stopping an instance from simply releasing all users’ voting history to be public.

      Of course, my instance didn’t even ask for an email to sign up, so my entire account is anonymous that way.

      I wonder if there are technical ways to federate votes anonymously?

      • UrbenLegend@lemmy.ml
        English
        arrow-up
        14
        ·
        2 years ago

        Yeah, I wonder how you can federate anonymously while still maintaining defenses against vote manipulation.

        • Zagorath@aussie.zone
          English
          arrow-up
          5
          ·
          2 years ago

          I think you could probably do something like have the votes be reported in aggregate by the instance.

          Any individual instance admin could use defences against vote manipulation by their own users, and other instances’ admins could use defences against one particular instance being widely used for vote manipulation.
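A minimal sketch of what "reported in aggregate" could look like on the sending side, assuming each instance keeps its own users' votes locally and federates only per-post totals. The function name `aggregate_votes` and the tuple layout are hypothetical and are not part of Lemmy.

```python
from collections import defaultdict


def aggregate_votes(local_votes):
    """Collapse per-user votes into per-post totals.

    local_votes: iterable of (username, post_id, score), score is +1 or -1.
    A later vote by the same user on the same post replaces the earlier one.
    """
    latest = {}
    for username, post_id, score in local_votes:
        # Deduplicate locally, where identities are known anyway.
        latest[(username, post_id)] = score

    totals = defaultdict(lambda: {"up": 0, "down": 0})
    for (_, post_id), score in latest.items():
        totals[post_id]["up" if score > 0 else "down"] += 1
    return dict(totals)


votes = [
    ("alice", "post-1", 1),
    ("bob", "post-1", 1),
    ("alice", "post-1", -1),  # alice changes her vote
    ("carol", "post-2", 1),
]
report = aggregate_votes(votes)
print(report)
```

Only `report` would leave the instance in this sketch; the usernames never do.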

          • UrbenLegend@lemmy.ml
            English
            arrow-up
            3
            ·
            2 years ago

            I know some privacy oriented services (Brave Browser comes to mind) aggregate telemetry data like that to preserve privacy. Perhaps something like that is possible for Lemmy as well.

          • hare_ware@pawb.social
            English
            arrow-up
            1
            ·
            2 years ago

            Someone could just run a rogue instance and host all their bots on there, hiding them from everyone else.

            • Zagorath@aussie.zone
              English
              arrow-up
              2
              ·
              2 years ago

              Right, but that’s where defederation comes in. Good-faith admins can detect manipulation by their own users and selectively ban them, while bad-faith admins running a server full of brigaders can be defederated if, for example, other admins detect anomalous patterns coming from that instance.
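One very rough way to picture "anomalous patterns coming from that instance", assuming instances report per-post upvote counts: flag any instance whose count dwarfs the median across instances. The name `flag_suspicious_instances` and the `factor` and `min_votes` thresholds are made up for illustration. A real check would at least need to account for instance size, since a large honest instance would trip this one too.

```python
from statistics import median


def flag_suspicious_instances(reported_upvotes, factor=10, min_votes=20):
    """Return instances whose reported upvotes dwarf the median across instances.

    reported_upvotes: dict of instance name -> upvotes reported for one post.
    factor and min_votes are arbitrary illustrative thresholds.
    """
    if not reported_upvotes:
        return []
    typical = median(reported_upvotes.values())
    return sorted(
        name
        for name, count in reported_upvotes.items()
        if count >= min_votes and count > factor * max(typical, 1)
    )


reports = {"a.example": 12, "b.example": 8, "c.example": 15, "bots.example": 900}
suspects = flag_suspicious_instances(reports)
print(suspects)
```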

      • WalrusDragonOnABike@kbin.social
        arrow-up
        10
        ·
        edit-2
        2 years ago

        but with federation, there’s nothing stopping an instance from simply releasing all users’ voting history to be public.

        Which kbin.social does.

      • JackbyDev@programming.dev
        English
        arrow-up
        9
        ·
        2 years ago

        Maybe you could hash the user and post together somehow, so that the result is hashed but also unique per post. If you only hashed the username, then the user’s entire voting history would be known if the hash were reversed.
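A sketch of the "hash the user and post together" idea, assuming SHA-256 and a made-up `vote_token` helper (nothing here is Lemmy's actual federation format). The same user gets a different token on every post, but the same token if they vote on the same post again.

```python
import hashlib


def vote_token(username: str, post_id: str) -> str:
    """Hash the voter and the post together so the token is unique per post."""
    # The NUL separator avoids ambiguity between ("ab", "c") and ("a", "bc").
    data = f"{username}\x00{post_id}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


t1 = vote_token("alice@example.social", "post-1")
t2 = vote_token("alice@example.social", "post-2")
t3 = vote_token("alice@example.social", "post-1")

print(t1 != t2)  # different posts give unlinkable-looking tokens
print(t1 == t3)  # same user and post give the same token
```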

        • o_o@programming.devOP
          English
          arrow-up
          8
          arrow-down
          1
          ·
          2 years ago

          Could be hashed and salted, with a random salt.

          The trouble is, then, that it’s harder to disallow users from voting multiple times if the voting user isn’t on the post’s home instance.
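To make the trouble concrete, here is a small sketch assuming a fresh random salt per vote and a hypothetical `salted_vote_token` helper. Two votes by the same user on the same post produce unrelated tokens, so a remote instance that only sees tokens has nothing to compare.

```python
import hashlib
import secrets


def salted_vote_token(username: str, post_id: str) -> str:
    """Hash voter and post with a fresh random salt (illustrative only)."""
    salt = secrets.token_bytes(16)  # new salt on every call
    data = salt + f"{username}\x00{post_id}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


first = salted_vote_token("alice@example.social", "post-1")
second = salted_vote_token("alice@example.social", "post-1")

# A receiving instance trying to reject repeat votes by token alone:
seen = {first}
duplicate_detected = second in seen
print(duplicate_detected)
```

The repeat vote goes undetected unless whoever checks also knows the salt, which puts the identity back in their hands.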

            • o_o@programming.devOP
              English
              arrow-up
              4
              ·
              2 years ago

              Yes, true, the current system does allow that. But the current system also doesn’t allow users to accidentally vote twice (and it remembers your vote). This is the feature I think would be more challenging to implement if we were to hash and salt the user’s ID.

        • kevincox@lemmy.ml
          English
          arrow-up
          5
          arrow-down
          1
          ·
          2 years ago

          Hashing can’t effectively protect known values. If you want to know if someone voted for a post you can just hash their username and post ID. This is trivial and cheap.

          If you want to know who voted on a post, you just collect every username you can find and hash each one. It isn’t super cheap, but it isn’t very expensive either. There are only about 8 billion people on the planet, and many bitcoin rigs can calculate that many hashes in seconds. Sure, you can use a more expensive hash, and there may be more accounts than people, but it will remain feasible.

          This is the same reason you can’t hash phone numbers in a meaningful way.
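A toy version of that enumeration, assuming votes are federated as an unsalted SHA-256 of username plus post ID. The `vote_token` and `who_voted` helpers and the example usernames are hypothetical. Anyone holding a list of candidate usernames can recompute the tokens and match them.

```python
import hashlib


def vote_token(username: str, post_id: str) -> str:
    """Unsalted hash of voter and post, as in the scheme being attacked."""
    data = f"{username}\x00{post_id}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def who_voted(observed_tokens, known_usernames, post_id):
    """Recover voters by hashing every known username and comparing."""
    observed = set(observed_tokens)
    return [u for u in known_usernames if vote_token(u, post_id) in observed]


# Tokens as they might be federated for one post.
federated = [
    vote_token("alice@example.social", "post-1"),
    vote_token("carol@example.social", "post-1"),
]
# Usernames are public, so an attacker can build this list by scraping.
directory = ["alice@example.social", "bob@example.social", "carol@example.social"]

recovered = who_voted(federated, directory, "post-1")
print(recovered)
```

The cost is one hash per candidate username per post, which is why a slower hash only raises the price rather than closing the hole.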

          The best option is probably for the instance to just report counts, and you have to trust it. If an instance is noticed to be inflating votes, you stop counting its votes. People can work together to create blocklists of known cheating instances. Your own instance would still know how you voted, but at least that stays with an instance you trust instead of being federated publicly.
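On the receiving side, "stop counting its votes" could be as simple as the sketch below, assuming each instance reports one upvote count per post and admins share a blocklist. The `tally` function and the instance names are placeholders.

```python
def tally(reports, blocklist=frozenset()):
    """Sum per-instance upvote counts, ignoring blocklisted instances."""
    return sum(
        count for instance, count in reports.items() if instance not in blocklist
    )


reports = {"a.example": 12, "b.example": 8, "c.example": 15, "bots.example": 900}
shared_blocklist = {"bots.example"}

raw_total = tally(reports)
trusted_total = tally(reports, shared_blocklist)
print(raw_total, trusted_total)
```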

          • JackbyDev@programming.dev
            English
            arrow-up
            0
            arrow-down
            1
            ·
            2 years ago

            Nah, if you can properly hash a password such that it doesn’t match the same properly hashed password from a different website then you can properly hash usernames in this case such that others couldn’t reverse it or put in the same input and get the same output you created. The technology is there. It’s more of a question if it’s really worth it. At least for now I’m not concerned with a malicious admin leaking someone’s vote history.

            https://en.wikipedia.org/wiki/Salt_(cryptography)

            • kevincox@lemmy.ml
              English
              arrow-up
              2
              ·
              2 years ago

              No, hashing passwords is a different case because you know who the user is, so you can use a unique salt. The password itself is also high entropy. For this use case you can have at best a per-post salt.

              Think about it. The task that you are asking for is to quickly check if a user has voted for a post to prevent duplicates. So literally the operation you want is the same as the one you are trying to prevent. If you can enumerate users, then you can by definition check if they have voted for a post.
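The point that the duplicate check and the privacy leak are one and the same operation can be sketched like this, assuming a per-post salt that every checking instance has to know. The salt value and the `vote_token` and `has_voted` helpers are invented for the example.

```python
import hashlib
import hmac

# Assumption: the per-post salt must be shared with every instance that
# needs to reject duplicate votes, so it cannot be kept secret.
post_salt = b"public-salt-for-post-1"


def vote_token(salt: bytes, username: str) -> str:
    """Keyed hash of the username under the post's salt."""
    return hmac.new(salt, username.encode("utf-8"), hashlib.sha256).hexdigest()


stored_tokens = {vote_token(post_salt, "alice@example.social")}


def has_voted(username: str) -> bool:
    """Used to reject duplicate votes, and equally usable to probe a user."""
    return vote_token(post_salt, username) in stored_tokens


print(has_voted("alice@example.social"))
print(has_voted("bob@example.social"))
```

Whoever is able to call `has_voted` to block a second vote can call it with any username they are curious about.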

      • CorrodedCranium@lemmy.fmhy.ml
        English
        arrow-up
        3
        arrow-down
        2
        ·
        2 years ago

        But the implications are still interesting. One might (big might) trust Reddit as an organization not to use this data for evil, but with federation, there’s nothing stopping an instance from simply releasing all users’ voting history to be public.

        Another potential privacy issue is that deleted content stays on the server, and I believe it’s similar with posted images.

        • o_o@programming.devOP
          English
          arrow-up
          6
          arrow-down
          1
          ·
          2 years ago

          I think this issue is overblown. Instances of Lemmy might run modified code and choose to save things that the user intended to delete, of course, but the default setup of Lemmy seems reasonable to me in terms of how it treats deletion.

          Currently it keeps deleted posts forever to allow users to un-delete if they choose, but deleting your account clears everything. And I believe there’s work in progress to discard deleted posts after 30 days. Details here: https://github.com/LemmyNet/lemmy/issues/2977
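For illustration only, a 30-day retention rule like the one described might look something like this. This is not taken from the Lemmy codebase or the linked issue; the `deleted_at` field, the `purge_deleted` name, and the dict layout are assumptions.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)


def purge_deleted(posts, now):
    """Keep live posts and recently deleted ones; drop the rest.

    posts: list of dicts with a "deleted_at" key (None if not deleted).
    """
    return [
        p
        for p in posts
        if p["deleted_at"] is None or now - p["deleted_at"] <= RETENTION
    ]


now = datetime(2023, 7, 31, tzinfo=timezone.utc)
posts = [
    {"id": 1, "deleted_at": None},
    {"id": 2, "deleted_at": now - timedelta(days=5)},   # still restorable
    {"id": 3, "deleted_at": now - timedelta(days=45)},  # past retention
]
kept_ids = [p["id"] for p in purge_deleted(posts, now)]
print(kept_ids)
```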

          • CorrodedCranium@lemmy.fmhy.ml
            English
            arrow-up
            3
            ·
            edit-2
            2 years ago

            Thank you for pointing this out. I was looking into privacy in relation to Lemmy and came across this post where I got the wrong idea I guess. I couldn’t find much else online at the time

            And I believe there’s work in progress to discard deleted posts after 30 days.

            That would be a nice addition

          • sinnerdotbin@lemmy.ca
            English
            arrow-up
            2
            arrow-down
            1
            ·
            edit-2
            2 years ago

            This keeps on being asserted but it is far from true. If defederation happens or your local goes offline, posts/comment history/profile/votes will remain on other widely used instances and out of your control.

            A large instance has already defederated with 2 other larger instances. If you run a personal instance I feel it will become very, very common to be locked out of managing your data.

            You can expect defederation to happen all the time as that is a deliberate part of the open federated model.

            And that is to say nothing about federation simply breaking sometimes.

            I already have been locked out of content that exists on other instances that will remain forever and I’ve only been around a short while. I don’t care personally, but people keep asserting this claim that only bad actors or scrapers will dupe your data. Federated data is very different than a non-federated copy for many reasons and that matters to some people. Everyone should understand deleting your account, or modifying your content will often not remove your content outside your instance, and many people engage outside their local. It will likely exist in federated, Lemmy searchable form forever in some capacity (in the current iteration anyway).

            Not trying to spread FUD, but if we want to maintain users they have to be educated as they will find out eventually and not be happy.

            I have some working drafts on policies for admins to help them navigate and explain their responsibilities to their users.

            It is a bit of a weird read outside of the context, but this is an optional primer I have drafted that will hopefully help explain the distinctions:

            https://github.com/BanzooIO/federated_policies_and_tos/blob/main/optional-privacy-policy-intro.md

            • o_o@programming.devOP
              English
              arrow-up
              4
              ·
              edit-2
              2 years ago

              Yes, that’s a fair point. Just because you send a “I have deleted this message” signal out into the universe doesn’t mean that everyone will receive or obey it.

              I assumed that was understood.

              But that’s very different from instances intentionally and malevolently keeping data despite indicating to users that it was deleted, which is what I think folks’ privacy concerns are about.

              EDIT: What I mean is that the federation model is inherently non-private in a certain sense (but in the same sense that someone could take a screenshot of your Reddit comment and your deleting your comment won’t delete their copy). But Lemmy is not egregiously misusing data.

              • sinnerdotbin@lemmy.ca
                English
                arrow-up
                2
                arrow-down
                1
                ·
                edit-2
                2 years ago

                This is largely assumed by someone like you or me who understands the implications. I am finding it evident that a lot of people are not aware.

                There is also a distinction between a potential screenshot, a scrape or archive no one visits, and a federated copy on a widely used instance you have lost access to.

                I edited my comment above to include a project I am working on to hopefully help admins get this across and educate users on how to appropriately engage to their comfort level.

                • o_o@programming.devOP
                  English
                  arrow-up
                  4
                  ·
                  2 years ago

                  I appreciate your commitment to this privacy consideration. I personally don’t think it’s the hill I’d prefer to die on, but I welcome your contributions.

                  • sinnerdotbin@lemmy.ca
                    English
                    arrow-up
                    1
                    arrow-down
                    1
                    ·
                    2 years ago

                    Thanks! I’m for mass adoption and want admins to succeed. That starts with keeping users educated (and admins covered).

    • Venator@kbin.social
      arrow-up
      1
      ·
      2 years ago

      That’s not really true, since on reddit only the one host can see the votes, as opposed to anyone who is willing to put the effort in.

      • UrbenLegend@lemmy.ml
        arrow-up
        1
        arrow-down
        1
        ·
        2 years ago

        That’s exactly what I meant when I said:

        whereas Reddit is only hosted by the company and they can keep it under wraps.