A few days ago I discovered that whenever you follow a fediverse account on an instance other than your own, the server you are on mirrors the original content of every toot/post. I think this is a huge issue, because every fediverse server/instance could then quickly get overloaded with content from other servers! Am I right? I’d always thought that toots/posts from other instances were just “empty shells” that retrieve the content on demand.
Lemmy only mirrors text, which takes almost no space at all. What does take a lot of space is media, so mirroring that can fill up your disk really fast. In that case you need to delete old cached media, which can easily be automated.
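To illustrate the "easily be automated" part, here is a minimal sketch of a cache cleanup you could run from cron. It is not how Lemmy or Mastodon do it internally; the function name `prune_media_cache`, the cache path in the usage comment, and the 30-day cutoff are all made up for the example.

```python
import time
from pathlib import Path
from typing import Optional


def prune_media_cache(cache_dir: str, max_age_days: float,
                      now: Optional[float] = None) -> int:
    """Delete cached files last modified more than max_age_days ago.

    Returns the number of bytes freed. The directory layout is an
    assumption: any file below cache_dir counts as cached remote media.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    freed = 0
    # Collect the paths first so files are not deleted while still walking the tree.
    for path in list(Path(cache_dir).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            freed += path.stat().st_size
            path.unlink()
    return freed


# Hypothetical usage (the path is an assumption, not a real default):
# freed = prune_media_cache("/var/lib/myinstance/media-cache", max_age_days=30)
```

Deleting by modification time is the simplest policy. A real server would probably also want to refetch media on demand when someone opens an old post.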
I was following some Birdsitelive accounts from Mastodon. Admins regularly block every Birdsitelive instance they see being used there.
I once made a tool that indexed every image posted on the fediverse. It ran from January 2019 to about August 2019. I don’t remember the exact numbers, but everything together (the original full-size image, the thumbnail, and the metadata including the author idkey, profile, and avatar) came to under a few gigs in total. It was indexing something like 2-8 images a second (it was highly variable).
Of course, the fediverse is bigger now. I wonder what it would be.
No, you are not right. It’s not a problem. Pleroma doesn’t even download the media; Mastodon does, but it’s not that much. Each person follows just a few hundred accounts, and they don’t produce that much content. The fediverse is pretty large already and it hasn’t become a problem.
The problem is quite similar to Matrix, which also mirrors all chats. Several large Matrix servers were shut down because their ever-growing databases reached hundreds of gigabytes and became too expensive to operate.
I think the fediverse has the advantage that it does not depend on an over-engineered distributed database the way Matrix does (that design makes it much harder to delete old stuff), but mirroring does still limit the size and interconnectivity an ActivityPub server can grow to.
The biggest servers actually have an advantage, because the overlap in what their users follow would be huge, so each remote post only has to be stored once. But yeah, having lots of images and short videos does result in big amounts of data.
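A toy illustration of the overlap point, assuming a server stores a remote account's posts once no matter how many local users follow it. The user names and handles are invented, and `mirrored_accounts` is just a name for this sketch.

```python
from typing import Dict, Set


def mirrored_accounts(follows_by_user: Dict[str, Set[str]]) -> Set[str]:
    """Remote accounts the server has to mirror: the union of all follow lists."""
    return set().union(*follows_by_user.values())


# Invented example data: three local users with overlapping follows.
follows = {
    "alice": {"news@a.example", "cats@b.example"},
    "bob": {"news@a.example", "cats@b.example", "art@c.example"},
    "carol": {"news@a.example"},
}

total_follows = sum(len(f) for f in follows.values())
unique_accounts = len(mirrored_accounts(follows))
print(total_follows, unique_accounts)  # 6 3
```

Six follow relationships only cost three mirrored accounts here. On a server with thousands of users the gap between the two numbers would presumably be much larger.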
I was following some Birdsitelive accounts. The admins of the Mastodon instance I’m on regularly block every Birdsitelive instance they see being used; they told me the reason is the overload of content.
How many users does your instance have?
It’s mastodon.uno, currently at 21,300 users.
I don’t think that’s a good reason to block Birdsitelive.
This is the answer I got from my admin: «toots are stored locally for a simple reason: let’s say a toot goes viral, then hundreds of instances would have to download the message with photos or video from the unfortunate person’s server, which could collapse. Keeping a copy locally saves a lot of stress on the servers.» What do you think?
That’s what I said. But that still doesn’t mean every toot ever is stored, just those that are from users that are being followed. Still not a good reason. Do they want you to use their server or not??
Isn’t it much more efficient to link (rather than to mirror)? What am I missing here?
This is exactly my doubt too.
This is the answer I got from my admin: «toots are stored locally for a simple reason: let’s say a toot goes viral, then hundreds of instances would have to download the message with photos or video from the unfortunate person’s server, which could collapse. Keeping a copy locally saves a lot of stress on the servers.»
What do you think?
Makes sense for viral content, but most content isn’t viral.
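Some back-of-the-envelope arithmetic for the link vs. mirror question, as a rough sketch. It assumes that with linking every viewer hits the origin server, while with mirroring each instance fetches the post once and serves its own copy. The instance and viewer counts are invented, and real federation is more involved than this.

```python
def origin_fetches(instances: int, viewers_per_instance: int, mirrored: bool) -> int:
    """Requests that reach the origin server for a single post."""
    if mirrored:
        # One fetch per instance; local viewers are served from the local copy.
        return instances
    # Linked: every viewer on every instance loads the post from the origin.
    return instances * viewers_per_instance


# A viral post: 300 instances with 50 viewers each (made-up numbers).
print(origin_fetches(300, 50, mirrored=False), origin_fetches(300, 50, mirrored=True))
# 15000 300

# An ordinary post: 2 instances with 1 viewer each.
print(origin_fetches(2, 1, mirrored=False), origin_fetches(2, 1, mirrored=True))
# 2 2
```

So under these assumptions mirroring only pays off when a post has many viewers per instance. For the typical post it saves the origin nothing, and every receiving instance still pays the storage cost.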
Epicyon has a maximum number of remote/federated posts, and above some configurable number will delete or archive the oldest ones (unless they are DMs). This helps to keep the storage space within a finite upper bound and prevent running out of disk space.
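A minimal sketch of that kind of cap. This is not Epicyon's actual code; `RemotePost`, `enforce_post_limit`, and `max_posts` are names made up here. It also assumes DMs do not count toward the limit, which the description above does not specify.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RemotePost:
    post_id: str
    published: float  # Unix timestamp
    is_dm: bool = False


def enforce_post_limit(posts: List[RemotePost],
                       max_posts: int) -> Tuple[List[RemotePost], List[RemotePost]]:
    """Split posts into (kept, expired).

    DMs are always kept. Of the other posts, only the newest max_posts
    are kept; older ones are returned as expired so the caller can
    delete or archive them.
    """
    kept: List[RemotePost] = []
    expired: List[RemotePost] = []
    remaining = max_posts
    # Walk from newest to oldest so the oldest posts are the ones that expire.
    for post in sorted(posts, key=lambda p: p.published, reverse=True):
        if post.is_dm:
            kept.append(post)
        elif remaining > 0:
            kept.append(post)
            remaining -= 1
        else:
            expired.append(post)
    return kept, expired
```

Returning the expired posts rather than deleting them inside the function leaves the delete-or-archive choice to the caller.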