Proposal: Delete old manuals from our site

IMHO the old pages should stay available but stop being indexed after some point. It’s quite likely that search engines already penalize us for the (heavily) duplicated content. That means we could disallow robots from indexing anything that’s not supported anymore.


I actually do this in my PR for some of the manuals which are still relevant: 1.15 - 1.20. The rest are moved into PDFs, and there is a redirect PR in foreman-infra, also with a custom 404 as a safety net. So it’s a combination, the best of both approaches.

Oh, I must have missed that then, I’ll recheck the PRs :stuck_out_tongue: Good job :slight_smile:

Yeah, I thought about robots.txt, but I’ve learned that it is no longer recommended for this and that using an HTML meta tag is preferred. So I am adding those tags in the PR instead.
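For anyone following along, here is a minimal sketch of the meta-tag approach. The tag itself is the standard robots `noindex` meta tag; the helper names and the version cutoff are hypothetical and only illustrate the idea, they are not taken from the PR.

```python
# Standard robots meta tag asking search engines not to index the page.
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def parse_version(version):
    """Turn "1.19" into (1, 19) so that 1.5 sorts before 1.19."""
    return tuple(int(part) for part in version.split("."))


def robots_meta_for(version, oldest_indexed):
    """Return the noindex tag for manuals older than `oldest_indexed`.

    `oldest_indexed` is a hypothetical cutoff; the thread does not fix one.
    """
    if parse_version(version) < parse_version(oldest_indexed):
        return NOINDEX_TAG
    return ""


# Example: with a cutoff of 1.19, the 1.5 manual gets the tag, 1.19 does not.
print(robots_meta_for("1.5", "1.19"))
print(repr(robots_meta_for("1.19", "1.19")))
```

Comparing versions as integer tuples rather than strings matters here, because a plain string comparison would sort "1.5" after "1.19".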

Still have 1.7 running

Great to hear that, you should be able to download the manual in PDF format.

Hello,

I would like to draw attention to the PR. Greg started the review, but as he’s now busy with other things, I need someone to review and merge it:

https://github.com/theforeman/theforeman.org/pull/1256

We really need to do something about the old manuals; these are just copies of the same content over and over again. It’s about 2 GB of data which we probably don’t need to keep online. In the PR, I suggest creating tarball backups of the old versions and providing a sane custom 404 error page explaining what happened and where to find the contents.

Busy yes, but still paying some attention :wink:

What little support I had for this idea was based on the concept that hosting was expensive. Now that our bandwidth usage is much much lower (thanks Evgeni!) and disk space is cheap, I see no reason to make life difficult for users. Why is hosting 2 GB of files difficult?

I agree not indexing them is good, we don’t want them in search engines. But it should be no harder for a user to get the 1.5 manual than the 1.19 manual. Can we just add them to robots.txt and be done with it?
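As a rough illustration of the robots.txt route, the sketch below generates `Disallow` rules for a list of old versions and checks them with Python's standard `urllib.robotparser`. The `/manuals/<version>/` path layout and the helper names are assumptions made for the example, not something confirmed in this thread.

```python
from urllib.robotparser import RobotFileParser


def robots_txt_for(old_versions):
    """Build robots.txt content that disallows crawling of old manuals.

    Assumes manuals live under /manuals/<version>/ (hypothetical layout).
    """
    lines = ["User-agent: *"]
    lines += [f"Disallow: /manuals/{version}/" for version in old_versions]
    return "\n".join(lines) + "\n"


def is_crawlable(robots_txt, path):
    """Check whether a generic crawler may fetch `path` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", path)


rules = robots_txt_for(["1.5", "1.7"])
print(rules)
print(is_crawlable(rules, "/manuals/1.5/index.html"))
print(is_crawlable(rules, "/manuals/1.19/index.html"))
```

Note that `Disallow` only stops crawling. A page blocked this way can still end up in search results if other sites link to it, which is one commonly cited reason for preferring a `noindex` meta tag.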


The main reason is not space on our server but the fact that it takes about a minute on my beefy system to build the manuals (or the site) locally. This is crazy, I feel like I’m back 20 years ago, waiting for a C++ project to build… :wink:

“Disk space is cheap” is really not an argument against cleaning things up. Having to grab a PDF, however, is a valid point. I still think it is worth cleaning up.

Then we’re back to the disagreement we had on Jan 28th - that, in my view, 1 minute of your time is categorically not more important than the ease-of-access of our users to correct documentation. If even 10 users struggle for an extra 6 seconds to find their docs, that is already a full minute lost, and with any more users the net result is more time wasted by the community as a whole.

This is just sheer impatience. Put the needs of users (all users, not just those on the latest version) first, and go get yourself a coffee.

(NB, still a +1 to no-indexing for old manuals, that’s a great idea)

Well, then I am going to redo my patch and only add the stop-index meta tags.


I see that discovery has been merged; however, I am going to open a new PR that only adds stop-index flags to core and all other plugins. I am leaving it to plugin maintainers to decide whether they want to remove old manuals and keep them as PDF files, as I did for discovery. Core stays as it is.


https://github.com/theforeman/theforeman.org/pull/1375