Proposal: Delete old manuals from our site

That was my thought too, but it’s client-side redirection, for both the <meta> refresh and the JS redirect methods. I don’t know whether search engines also respect these and treat the new page as the natural successor for all the incoming links.
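For reference, the two client-side methods in question look roughly like this (the target URL is just a placeholder, not our real layout):

```html
<!-- Client-side redirect via a meta refresh tag (0-second delay) -->
<meta http-equiv="refresh" content="0; url=https://example.org/manuals/latest/">

<!-- The equivalent JS redirect; replace() also keeps the old URL out of history -->
<script>
  window.location.replace("https://example.org/manuals/latest/");
</script>
```

Both run in the visitor’s browser rather than on the server, which is exactly why it’s unclear how much link equity search engines carry over compared to a proper HTTP 301.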

Ok thanks. I hadn’t realized we deploy our site ourselves :slight_smile:

Cool, are you fine with keeping the last 5 versions of the plugin/core manuals? The remaining 4 would include a NOINDEX meta tag, so at least search engines learn quickly to offer the stable one.
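Concretely, the tag I have in mind would go into the `<head>` of each old manual page, something like:

```html
<!-- Ask crawlers not to index this page; "follow" still lets them
     traverse its links to the pages we do want indexed -->
<meta name="robots" content="noindex, follow">
```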

One more thing: do you prefer committing .htaccess directly to the git codebase? I do; I kind of assume we are running Apache httpd, but I could be wrong again.

We are running Apache. https://github.com/theforeman/foreman-infra/blob/master/puppet/modules/web/templates/web.conf.erb already includes some rewrite rules.


Thanks.

So I’d like to reach agreement first, because rebasing those PRs with deleted content won’t be easy, I guess. I am going to open a PR deleting all manuals except the last 5 versions, for both core and plugins. All old versions (except the stable one) will have a “noindex” meta tag to prevent search engines from indexing them. And for all deleted pages I will create redirects in the Apache configuration.
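A sketch of the kind of Apache rules I mean, with hypothetical paths (the real rules would go into the foreman-infra config linked above):

```apache
# Permanently redirect one deleted manual to the stable one
# (paths are illustrative examples, not the final rules)
Redirect 301 /manuals/1.5/index.html /manuals/latest/index.html

# Or pattern-based via mod_rewrite, covering a whole range at once
RewriteEngine On
RewriteRule ^manuals/1\.[0-9]/(.*)$ /manuals/latest/$1 [R=301,L]
```

A 301 also tells search engines the move is permanent, which should help them transfer the old pages’ ranking to the target.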

@tbrisker @Gwmngilfen

I’d rather not see that content deleted entirely. We do offer the packages right back to 1.0, and if a user needs to do an upgrade from an ancient version, they’ll want to see the upgrade notes / known issues for each version as they follow the upgrade path.

That said, I do agree the search index issue is annoying. Could we move older manuals to a PDF download? Then the people who really need them can still get them, but it’s de facto not indexable :slight_smile:

+1 on the rest of this, especially redirects in Apache (awkward, but what can you do?). I just don’t want to make it harder for users of old versions to upgrade - otherwise they won’t upgrade :stuck_out_tongue:

Ok, how about moving them out to some /legacy/ directory? No Markdown, just pure HTML, if that’s possible. But the number of files in a git checkout won’t go down. Maybe tarballs would do it!

Edit: I like tarballs on our downloads.tf.org site with a link from our docs.
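Creating those tarballs could be as simple as something like this (the directory layout is a made-up example standing in for a real manual checkout):

```shell
# Bundle an old manual directory into a gzipped tarball.
# /tmp/manuals/1.5 is a stand-in for a real per-version manual directory.
mkdir -p /tmp/manuals/1.5
echo '<html>old manual</html>' > /tmp/manuals/1.5/index.html

# -C changes into the parent dir so the archive contains clean "1.5/..." paths
tar -czf /tmp/manual-1.5.tar.gz -C /tmp/manuals 1.5

# List the archive contents to verify
tar -tzf /tmp/manual-1.5.tar.gz
```

One tarball per retired version would then be uploaded to downloads.tf.org and linked from the docs page.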

PDFs would be a good idea I think

Unfortunately, our manuals are not in a format that is easy to generate PDFs from. I mean, if there is a volunteer, let’s do it. Here are my next steps; speak up now or never:

  • I’m going to create HTML tarballs from all old manuals up to version 1.14
  • Versions 1.14-1.19 and the nightly directory stay untouched, but I will prevent search engines from indexing them
  • The stable version (1.20) stays as-is.

Here is the PR; I’m calling for help with review.

Actually, I think I found an easy way to generate PDFs: the browser’s Print to PDF works fine for a page. So I will be providing PDFs then.

One question though: where to put those PDF files? Do we do regular backups of downloads.tf.org? Can I simply create a folder there, @ekohl or @Gwmngilfen?

We don’t at the moment, but we could; the backup system is all in Puppet anyway.

I’ll review the PDFs once you have them, they need to be of a high quality to replace the website manuals. I’m still not convinced that shaving a few 10s of seconds off the compile time of the site is a good excuse for deleting user-facing content, so they’d better be good :wink:

I created them using the print feature of a web browser; our formatting switches over to a printer-friendly style and it looks just fine. See for yourself when you do a print preview in your browser.

I expect to cut the generation time roughly in half. We need to start some day; the status quo is not a solution. If you have numbers showing that we still have users on some particular release, say 1.10+, I have no problem moving the cutoff.

Correct me if my maths is wrong, but half of one minute is 30s, which is indeed “tens of seconds”.

I cannot prove usage numbers, as the RSS widget that we use to get an idea of version stats only went live in 1.17. However, I don’t think it matters - you’re suggesting using compile time (something that only really affects core devs) as a reason to delete content, which has no dependence on compile time. To put that another way - from the user perspective, they lose something (content) for no gain (because they aren’t compiling the site).

I’ll note that I don’t see issues with inotify, and compile time for jekyll serve is currently ~56s. I think that’s fine. If the build were broken it’d be different, but it isn’t.

I realise I previously backed some of this in my earlier post, but the more I think about it, the more this feels like convenience for the few over hardship for the many. I will still back this, but only once I’m satisfied that the end user can still get at this content easily. Ideally that means the version switcher on the manual page should offer the PDFs when an old version is selected - I don’t want users having to hunt for them, or come into the IRC channel to ask where they are. It should be obvious from the manual page where to get them. I’ll go add that comment to the PR now :stuck_out_tongue:

IMHO the old pages should stay available but not be indexed after some point. It’s quite likely that search engines already give penalties for the (heavily) duplicated content. That means we could disallow robots from indexing anything that’s not supported anymore.
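Disallowing robots for unsupported versions could be a robots.txt along these lines (version paths are examples, one Disallow line per retired version):

```
# robots.txt: keep crawlers out of unsupported manual versions
User-agent: *
Disallow: /manuals/1.5/
Disallow: /manuals/1.6/
```

Worth noting that Disallow only blocks crawling, which is not quite the same as de-indexing URLs the engine already knows about.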


I actually do this in my PR for some of the manuals that are still relevant: 1.15 - 1.20. The rest is moved into PDFs, and there is also a redirect PR in foreman-infra, with a custom 404 as a safety net. So it’s a combination, the best of both approaches.
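For the safety-net part, a custom 404 in Apache can be as simple as (the document path is a hypothetical example):

```apache
# Serve a custom error page for any manual URL the redirects miss;
# that page can point visitors at the stable manual and the PDF archive
ErrorDocument 404 /404.html
```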

Oh, I must have missed that then, I’ll recheck the PRs :stuck_out_tongue: Good job :slight_smile:

Yeah, I thought about robots.txt, but I’ve learned that it is no longer recommended for this and that the HTML meta tag is preferred. So I am adding those tags in the PR instead.

Still have 1.7 running

Great to hear; you should be able to download the manual in PDF format.