this post was submitted on 17 Jul 2023
It'll happen if Lemmy gets big enough. I only worry about search engines getting tangled in the natural duplication of Lemmy posts.
Like, if a web crawler sees a Beehaw post, and then sees Lemmy.ml's mirrored page of that same post, could it just show up as two different results? Could it work against the SEO, in that it gets marked as "duplicate" or "spam" content in some way?
The ideal solution is that the page has a canonical tag, telling search engines what the main URL for the content is: https://ahrefs.com/blog/canonical-tags/. I don't know if Lemmy already does this, nor do I know how well canonical tags work cross-domain as I've only ever used them for content on the same domain.
I checked and it does. This post's canonical is:
<link data-inferno-helmet="true" rel="canonical" href="https://merv.news/post/26663">
Weirdly it uses OP's instance, in this case merv.news. Shouldn't it be the instance where it was posted?
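If anyone wants to check the canonical on other pages, here is a minimal sketch using only Python's standard library. It assumes you already have the page's HTML as a string. `CanonicalFinder` and `find_canonical` are names I made up for illustration, not anything from Lemmy itself.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only look at <link> tags, and stop after the first match.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens; valueless attributes are None.
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# The tag quoted above, wrapped in a <head> for the example.
page = '<head><link data-inferno-helmet="true" rel="canonical" href="https://merv.news/post/26663"></head>'
print(find_canonical(page))
```

Running it against the same post as served by different instances should show whether they all point at the same URL.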
Canonical tags were added in 0.18.2.
I would think it's because users only interact with their own instance. They would need to post it to their instance first before it can be forwarded to the appropriate community's instance.
If/when Lemmy and other federated services grow to the point where that's an issue in major search engines, said search engines should be smart enough to group and/or suppress mirrored results.
You can see that sort of thing in Google now for major sites like Reddit and StackOverflow, though it's more along the lines of "the same question in a different post".
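To make the "group mirrored results" idea concrete, here is a rough sketch of grouping pages by their declared canonical URL. This is only an assumption about how grouping could work, not how any real search engine does it. `group_by_canonical` and the two example.* URLs are invented for illustration.

```python
from collections import defaultdict


def group_by_canonical(results):
    """Group crawled URLs under the canonical URL each page declares.

    `results` is a list of (url, canonical) pairs; canonical may be None
    when a page declares no canonical, in which case the page stands alone.
    """
    groups = defaultdict(list)
    for url, canonical in results:
        groups[canonical or url].append(url)
    return dict(groups)


# Hypothetical crawl: two mirrors of one post, plus an unrelated page.
results = [
    ("https://merv.news/post/26663", "https://merv.news/post/26663"),
    ("https://lemmy.example/post/111", "https://merv.news/post/26663"),
    ("https://other.example/post/5", None),
]

for canonical, urls in group_by_canonical(results).items():
    print(canonical, "<-", urls)
```

A search engine could then show one result per group and treat the rest as mirrors.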
You can also, in the interim, just pick an instance and add
site:lemmy.world
or whatever instead of just "lemmy".
It might help it, as well. I believe the Yandex source code leak detailed their algorithm's SEO techniques. Might be a good lead.