I might take over one of those one-year-free hosted Lemmy instances on my server infrastructure, but I have read several times now that Lemmy’s image-hosting system, pict-rs, can use a lot of storage quickly.

The server I could run this on is limited to 32 GB of SSD storage, with no easy way to expand it.

Is there some way to limit image storage use and automatically prune old images that are not user or community icons and such?

  • Aode (He/They)@lemmy.ml · 2 years ago

    pict-rs doesn’t keep track of how often it serves different images, so there’s not a good metric for pruning old images. That said, 0.4 will introduce functionality for cleaning up processed images (e.g. resizes/thumbnails), removing their files & metadata. If they are viewed again, they will be re-generated.
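
    The cleanup-and-regenerate pattern described here can be sketched generically. This is only an illustration of the idea, not pict-rs’s actual code; all names are invented:

```python
import os
import time

def prune_processed(cache_dir: str, max_age_seconds: float) -> list[str]:
    """Delete derived images (resizes/thumbnails) older than max_age_seconds.
    Originals are untouched; derived files can be regenerated on the next request."""
    removed = []
    now = time.time()
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if os.path.isfile(path) and now - os.path.getmtime(path) > max_age_seconds:
            os.remove(path)
            removed.append(name)
    return removed

def serve_thumbnail(cache_dir: str, key: str, regenerate) -> bytes:
    """Return a cached thumbnail, regenerating it from the original if pruned."""
    path = os.path.join(cache_dir, key)
    if not os.path.exists(path):
        data = regenerate(key)  # recompute from the original image
        with open(path, "wb") as f:
            f.write(data)
    with open(path, "rb") as f:
        return f.read()
```

    The key property is that pruning a derived file never loses data: the original stays put, and the thumbnail is recomputed on the next view.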

    0.4 will also include the ability to scale down images on upload, rather than storing the original resolution. This is not yet implemented, but it’s on my roadmap.
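
    Scaling down on upload usually means clamping the longest side while preserving the aspect ratio. A minimal stdlib sketch of just the size calculation (again a generic illustration, not pict-rs code):

```python
def clamp_dimensions(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Return new (width, height) with the longest side clamped to max_side,
    preserving aspect ratio. Images already small enough are left untouched."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # round() keeps the result close to the true aspect ratio;
    # max(..., 1) avoids degenerate zero-pixel dimensions
    return max(round(width * scale), 1), max(round(height * scale), 1)
```

    A 4000×3000 upload clamped to 1000 pixels comes out as 1000×750, which is a fraction of the original file size once re-encoded.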

    All this said, it is already possible to use pict-rs with S3-compatible object storage rather than block storage. That’s a good option if your hosting provider offers it.
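
    For reference, an S3-compatible backend in pict-rs is configured with something along these lines; every key name below is an assumption, so verify against the pict-rs documentation for your version:

```toml
# Hypothetical pict-rs object-storage configuration sketch.
# Key names are assumptions; check the docs for your pict-rs version.
[store]
type = "object_storage"
endpoint = "https://s3.example.com"  # any S3-compatible endpoint
bucket_name = "pictrs-media"
region = "us-east-1"
access_key = "..."
secret_key = "..."
```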

    • poVoq@lemmy.ml (OP) · 2 years ago

      That sounds promising. Any idea when 0.4 will be released?

      Object storage on large cloud providers is not an option for me for various reasons (privacy, legal, etc.).

      • Aode (He/They)@lemmy.ml · 2 years ago

        I can only say “when it’s ready.” I think most of what I want to include in 0.4 is there, but I don’t have a ton of time to work on it currently. I might see if I can get my last feature changes in this weekend, then it will be a matter of ensuring the 0.3 -> 0.4 upgrade is smooth, and that storage migration is solid

        • Aode (He/They)@lemmy.ml · 2 years ago

          Update on this: I got the feature work done this weekend, so now I’ll be testing it a bunch for upgrades and storage migrations

    • suspended@lemmy.ml · 2 years ago

      Hello there! I am one of the administrators at Beehaw. If I’m reading and understanding your comment correctly, then this could solve our most pressing problem of running out of server disk space.

      Is there a time-frame when you expect to have pict-rs 0.4 available?

    • Echedenyan@lemmy.ml · 2 years ago

      Is deduplication supported, i.e. re-using images already in storage when newly uploaded images share the same hash?
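
      Hash-based deduplication is usually done with content addressing: each blob is stored under its digest, and an upload whose hash already exists just gains a new reference. A toy sketch of the idea (not pict-rs’s actual implementation):

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical uploads share one stored blob."""

    def __init__(self):
        self.blobs = {}    # sha256 hex digest -> bytes
        self.aliases = {}  # upload name -> digest

    def upload(self, name: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:  # only store new content once
            self.blobs[digest] = data
        self.aliases[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        return self.blobs[self.aliases[name]]
```

      Two users uploading the same meme then cost one copy on disk instead of two.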

  • Gaywallet (they/it)@beehaw.org · 2 years ago

    We would be very interested in a better way to limit this as well: some kind of age or size limits, or automatic pruning, would be wonderful.

  • nutomic@lemmy.ml · 2 years ago

    Storage requirements depend entirely on the number of images that users upload. In the case of slrpnk.net, there is currently 1.6 GB of pict-rs data. You can also use S3 storage, or something like sshfs to mount remote storage.
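
    If you want to check how much space your own pict-rs data directory actually uses, a small stdlib sketch (the path in the comment is only an example):

```python
import os

def dir_size_bytes(root: str) -> int:
    """Total size of all regular files under root, including subdirectories."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total

# Example (path is an assumption; use your actual pict-rs volume):
# print(dir_size_bytes("/var/lib/pict-rs") / 1e9, "GB")
```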

    • poVoq@lemmy.ml (OP) · 2 years ago

      How much of that is cached from federated instances, though? I can hardly imagine that a low-traffic community like that has uploaded 1.6 GB of its own images already. If it is mostly cached, then that can increase very quickly as new users subscribe to additional communities on other servers.

      • nutomic@lemmy.ml · 2 years ago

        There is no caching; images from other instances are loaded directly from the remote server by your browser.

        • poVoq@lemmy.ml (OP) · 2 years ago

          I see, well that is one less risk then. I guess with automatic down-scaling in pict-rs 0.4 it will be mostly solved, as there will not be a bunch of 5 MB direct uploads.

          Edit: well, thumbnails at least are definitely cached, and larger images too; I just tested it on slrpnk.net.

          Edit 2: odd, but not all of them. Something is strange… Ah, I think I know what is happening: actual user uploads do not get cached, but images from linked websites do, even if the origin is a federated instance. But those website images are usually quite well optimized.

      • Dessalines@lemmy.ml (mod) · 2 years ago

        Now you know why there needs to be decentralized picture hosting that works for the web, in the same way torrents do for even larger data like video.

        You have tons of servers needlessly hosting the exact same pictures while sharing none of the hosting costs.

        • poVoq@lemmy.ml (OP) · 2 years ago

          That was the original idea of IPFS, no? Except that they have now pivoted to trying to sell you Filecoin :(

          • Aode (He/They)@lemmy.ml · 2 years ago

            I’m not against including an ipfs layer in pict-rs, but the complexity would go way up. Federating an image between lemmy servers would require sending the ipfs uri between servers via activitypub, and then each receiving server sending that uri to pict-rs. pict-rs would then need to decide, on each server, if the ipfs-stored image matches that server’s image requirements (max filesize, max dimensions, etc), and if it does, then that pict-rs server would request to pin the image. I don’t know exactly how ipfs pinning works, but ideally it would only be stored locally if it isn’t already stored in at least N other locations. If the remote image doesn’t match the local server’s configuration, it could either be rejected or downloaded & processed (resized etc).

            Serving ipfs-stored images that aren’t replicated locally might also be slow, but I won’t know for sure unless I actually try building this out.
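
            The decision flow described above could be sketched like this; it is entirely hypothetical, and none of these names exist in pict-rs:

```python
from dataclasses import dataclass

@dataclass
class MediaLimits:
    max_filesize: int   # bytes
    max_dimension: int  # pixels, longest side

@dataclass
class RemoteImage:
    ipfs_uri: str
    filesize: int
    width: int
    height: int

def decide_action(image: RemoteImage, limits: MediaLimits,
                  replica_count: int, min_replicas: int = 3) -> str:
    """What a receiving server might do with a federated ipfs-stored image:
    'pin' it locally, 'skip' if enough replicas already exist, or
    'reprocess' (download and resize) when it violates local limits."""
    within_limits = (image.filesize <= limits.max_filesize
                     and max(image.width, image.height) <= limits.max_dimension)
    if not within_limits:
        return "reprocess"  # or reject outright, per server policy
    if replica_count >= min_replicas:
        return "skip"       # already stored in enough other locations
    return "pin"
```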

          • Dessalines@lemmy.ml (mod) · 2 years ago

            Yeah I think so, but I have no idea how “trust” works in IPFS.

            In torrents, you have to explicitly be seeding that torrent: if you don’t want to seed the file(s), you remove the torrent. With IPFS I think people can just throw whatever in there.