How do you feel about the massive influx of users?

    • hglman@lemmy.ml
      arrow-up
      3
      ·
      1 year ago

      That's certainly not the right kind of storage system for a site like this.

    • federico3@lemmy.ml
      arrow-up
      2
      ·
      1 year ago

      Indeed, PostgreSQL is not designed for large-scale horizontal sharding with eventual consistency. ClickHouse, meanwhile, is designed for OLAP workloads, which likely makes it even less suitable.

      Regardless of database choice, Lemmy is still centralized. Discussion groups are cached across instances but not truly distributed. This is the big blocker.

    • 777@lemmy.ml
      arrow-up
      2
      ·
      1 year ago

      I think a pluggable storage backend is probably the best move. For example, any cloud-hosted instance could use a native document store such as DynamoDB, which is often quite cheap or free for small use cases.
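To make the "pluggable backend" idea concrete, here is a rough sketch of what such a seam could look like. Everything in it (`StorageBackend`, `InMemoryBackend`, `post_reply`, the `Comment` fields) is a made-up name for illustration, not Lemmy's actual code or API. A PostgreSQL or DynamoDB backend would just be another class with the same two methods.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Comment:
    thread_id: str
    comment_id: str
    author: str
    body: str


class StorageBackend(Protocol):
    """Hypothetical interface (an assumption, not Lemmy's real API)."""

    def put_comment(self, comment: Comment) -> None: ...

    def get_thread(self, thread_id: str) -> list[Comment]: ...


class InMemoryBackend:
    """Toy backend; a real one would talk to a database instead of a dict."""

    def __init__(self) -> None:
        self._threads: dict[str, list[Comment]] = {}

    def put_comment(self, comment: Comment) -> None:
        self._threads.setdefault(comment.thread_id, []).append(comment)

    def get_thread(self, thread_id: str) -> list[Comment]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self._threads.get(thread_id, []))


def post_reply(store: StorageBackend, comment: Comment) -> int:
    """Application code sees only the interface, never the concrete backend."""
    store.put_comment(comment)
    return len(store.get_thread(comment.thread_id))
```

The catch is that the interface has to stay small enough for every backend to implement efficiently, and listing, ranking and search tend to be where that breaks down.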

      • bobaduk@lemmy.world
        arrow-up
        2
        ·
        1 year ago

        Bit of a pain to store in Dynamo, though. You’d need to write a bunch of different views, I think.

        One comment thread makes sense as a partition, but listing threads is going to be awkward, and search is basically a no-no.

        • AbominableSlinky@lemmy.world
          arrow-up
          2
          ·
          1 year ago

          Not necessarily a pain, you just have to model the data very differently in something like DynamoDB. Those views are secondary indexes.
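As a rough illustration of "thread as partition, views as secondary indexes", here is an in-memory stand-in rather than real DynamoDB calls. The key formats (`THREAD#…`, `COMMENT#…`, `USER#…`) and the `CommentTable` class are assumptions made up for this sketch. In real DynamoDB the service maintains a secondary index for you; here it is written by hand so the second copy of the data is visible.

```python
from collections import defaultdict


class CommentTable:
    """In-memory model of a base table plus one by-author secondary index."""

    def __init__(self) -> None:
        self._by_thread = defaultdict(dict)  # partition key -> {sort key: item}
        self._by_author = defaultdict(dict)  # index partition key -> {sort key: item}

    def put(self, thread_id: str, comment_id: str, author: str,
            created: str, body: str) -> None:
        item = {
            "pk": f"THREAD#{thread_id}",
            "sk": f"COMMENT#{comment_id}",
            "author": author,
            "created": created,  # ISO 8601 string, so it sorts chronologically
            "body": body,
        }
        self._by_thread[item["pk"]][item["sk"]] = item
        # The "view": the same item stored again under author/time keys.
        self._by_author[f"USER#{author}"][f"{created}#{comment_id}"] = item

    def query_thread(self, thread_id: str) -> list:
        part = self._by_thread.get(f"THREAD#{thread_id}", {})
        return [part[sk] for sk in sorted(part)]

    def query_author(self, author: str) -> list:
        part = self._by_author.get(f"USER#{author}", {})
        return [part[sk] for sk in sorted(part)]
```

Each new access pattern means another index like `_by_author`, which is where the modelling effort goes.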

          Search, though, you’re right. You’d be running Elasticsearch alongside it, and the cost and complexity start to go up. Or just abandon having a functional search entirely, like Reddit did…

          • bobaduk@lemmy.world
            arrow-up
            1
            ·
            1 year ago

            Yeah, but you need an index for each thread, some kind of time-partitioned thread index for each community, and the same again across all communities.

            Then you need to query all comments or posts by user, so that’s another index, then you need some way of querying for hot, or controversial or what have you.
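A sketch of why the community listing and "hot" are the fiddly parts. The month-sized buckets, the key format and the decay formula below are all assumptions picked for illustration. In particular, `hot_score` is a generic time-decay function, not Lemmy's actual ranking. Because the score changes as a post ages, it cannot be stored as a fixed sort key, so this version computes it at read time over the newest buckets.

```python
from datetime import datetime, timezone


def bucket_key(community: str, created: datetime) -> str:
    """Partition key for a community's threads created in a given month."""
    return f"COMMUNITY#{community}#{created:%Y-%m}"


def recent_buckets(community: str, now: datetime, months: int) -> list:
    """Keys for the newest `months` buckets, newest first."""
    keys = []
    year, month = now.year, now.month
    for _ in range(months):
        keys.append(f"COMMUNITY#{community}#{year:04d}-{month:02d}")
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return keys


def hot_score(votes: int, created: datetime, now: datetime) -> float:
    """Illustrative decay: more votes rank higher, older posts rank lower."""
    age_hours = (now - created).total_seconds() / 3600
    return votes / (age_hours + 2) ** 1.5
```

A "hot" listing would then query the buckets from `recent_buckets`, score each thread with `hot_score`, and sort in application code.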

            It’s doable, but fiddly. Tempted to have a go though!

            • 777@lemmy.ml
              arrow-up
              1
              ·
              1 year ago

              I just mentioned Dynamo as an idea without thinking about it too much.

              Dynamo works well for one- and two-dimensional data structures, but for more complex things you probably want a regular database. I expect it could be done efficiently, but not at a good cost or without tons of technical difficulty.