Proposal: Delete old manuals from our site


#1

Hello,

Generating our site is super slow (a minute on my beefy PC). It has gotten to the point where you need to raise the kernel's inotify watcher limit, otherwise you get:

FATAL: Listen error: unable to monitor directories for changes.
Visit https://github.com/guard/listen/wiki/Increasing-the-amount-of-inotify-watchers for info on how to fix this.
rake aborted!

While working on a PR that blacklists old manuals from search engines (https://github.com/theforeman/theforeman.org/pull/1256), I figured it would be beneficial to simply delete old manuals (core + plugins) from the site completely, or replace them with a stub or redirect.

Opinions?


#2

Kind of funny, but this would help us with Katello 3.1 vs. 3.10 confusion in the docs.

As a policy, we support the last two releases. I think we should go ahead and make a similar policy for the docs as well. Maybe the last 5 or 10 releases? Katello rarely gets bug reports from 10 releases back, but 6 isn’t uncommon (e.g. we still get some 3.4 bug reports).
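
To make the "last N releases" idea concrete, here is a minimal Python sketch of picking the newest N versions. It is only an illustration, not how the site is built today: the release list is made up, and `version_key` and `last_n_releases` are hypothetical helper names. It compares versions numerically, which is exactly where a plain string sort trips over 3.1 vs. 3.10.

```python
def version_key(version: str) -> tuple[int, ...]:
    """Turn '3.10' into (3, 10) so it sorts after '3.9', not next to '3.1'."""
    return tuple(int(part) for part in version.split("."))


def last_n_releases(versions: list[str], n: int = 5) -> list[str]:
    """Return the n newest versions, newest first."""
    return sorted(versions, key=version_key, reverse=True)[:n]


if __name__ == "__main__":
    # Hypothetical release list, for illustration only.
    releases = ["3.1", "3.2", "3.4", "3.9", "3.10", "3.11"]
    print(sorted(releases))              # ['3.1', '3.10', '3.11', '3.2', '3.4', '3.9']
    print(last_n_releases(releases, 5))  # ['3.11', '3.10', '3.9', '3.4', '3.2']
```

Anything not returned by `last_n_releases` would be a candidate for removal or archiving.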

I also wonder if we’ve done any research on making Jekyll faster (I don’t see a -j flag or anything).


#3

I agree on that. The docs should only cover versions in active support, plus maybe 2-3 versions back.


#4

Disappearing links are always annoying and bad for search engines. Luckily there’s a solution: redirects. We probably need to implement this in Apache since it’s a static site.


#5

This would work: https://help.github.com/articles/redirects-on-github-pages/

So let’s just agree on the last N releases? I like 5, that’s fair.


#6

Except the manual isn’t really a GitHub Pages site; it’s a Jekyll site we deploy on our own server (not that this prevents us from doing redirects).


#7

That was my thought too, but both the <meta> and JS redirect methods are client-side redirection. I don’t know whether search engines respect this and treat the new page as the natural target for all the incoming links.
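
For reference, a client-side redirect stub of the kind discussed here could be generated roughly like this. This is a sketch only: `redirect_stub` is a hypothetical helper and the target path in the usage line is an assumed example, not a real URL on our site. The canonical link is included as a hint to crawlers; whether they treat it like a server-side 301 is exactly the open question.

```python
from html import escape


def redirect_stub(target_url: str) -> str:
    """Build a minimal HTML page that forwards the browser to target_url."""
    url = escape(target_url, quote=True)  # keep the URL safe inside attributes
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        '<meta charset="utf-8">\n'
        f'<link rel="canonical" href="{url}">\n'
        f'<meta http-equiv="refresh" content="0; url={url}">\n'
        "</head>\n<body>\n"
        f'<p>This manual has moved to <a href="{url}">{url}</a>.</p>\n'
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    # Assumed target path, for illustration only.
    print(redirect_stub("/manuals/latest/index.html"))
```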


#8

Ok thanks. I hadn’t realized we deploy our site ourselves :slight_smile:

Cool, are you guys fine with keeping the last 5 versions of the plugin/core manuals? The 4 non-stable ones would include a NOINDEX meta tag for Google, so that search engines quickly learn to offer the stable one.
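
A minimal sketch of the NOINDEX rule, in Python for illustration. The version numbers are made up and `robots_meta` is a hypothetical helper; in practice this would more likely live in a Jekyll layout. The tag itself is the standard robots meta tag.

```python
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def robots_meta(version: str, stable: str) -> str:
    """Return the tag to put in <head> for a manual version ('' for stable)."""
    return "" if version == stable else NOINDEX_TAG


if __name__ == "__main__":
    # Hypothetical version numbers, for illustration only.
    for v in ["1.20", "1.19", "1.18", "1.17", "1.16"]:
        print(v, robots_meta(v, stable="1.20") or "(indexable)")
```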


#9

One more thing: do you prefer committing .htaccess directly to the git codebase? I do. I kind of assume we are running Apache httpd, but I could be wrong again.


#10

We are running Apache. https://github.com/theforeman/foreman-infra/blob/master/puppet/modules/web/templates/web.conf.erb already includes some rewrite rules.


#11

Thanks.

So I would like to come to an agreement first, because rebasing those PRs with deleted stuff won’t be easy, I guess. I am going to do a PR with all manuals deleted except the last 5 versions, for both core and plugins. All kept versions except the stable one will have a “noindex” meta tag to prevent search engines from indexing them. And for all deleted pages I will create redirects in the Apache configuration.
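
The redirect part could look roughly like the Python sketch below, which prints one mod_alias `RedirectMatch` line per deleted version. Everything specific here is an assumption for illustration: the `/manuals/` prefix, the `/manuals/latest/` target, the version list, and the `redirect_rules` helper name. Redirecting every page of an old manual to a single target is the simplest option, not necessarily the final one.

```python
import re


def redirect_rules(deleted_versions, target="/manuals/latest/", prefix="/manuals/"):
    """Yield one Apache RedirectMatch (301) line per deleted manual version."""
    for version in deleted_versions:
        # Escape the dot so "1.4" does not also match "1x4"; the optional
        # "(/.*)?" group covers the directory itself and every page below it.
        pattern = "^" + re.escape(prefix + version) + "(/.*)?$"
        yield f'RedirectMatch 301 "{pattern}" "{target}"'


if __name__ == "__main__":
    # Hypothetical list of deleted versions, for illustration only.
    for rule in redirect_rules(["1.4", "1.5"]):
        print(rule)
```

The generated lines could go either into an .htaccess file or into the vhost template in foreman-infra.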

@tbrisker @Gwmngilfen


#12

I’d rather not see that content deleted entirely. We do offer the packages right back to 1.0, and if a user needs to do an upgrade from an ancient version, they’ll want to see the upgrade notes / known issues for each version as they follow the upgrade path.

That said, I do agree the search index issue is annoying. Could we move older manuals to a PDF download? Then the people who really need them can still get them, but it’s de facto not indexable :slight_smile:

+1 on the rest of this, especially redirects in Apache (awkward, but what can you do?). I just don’t want to make it harder for users of old versions to upgrade - otherwise they won’t upgrade :stuck_out_tongue:


#13

Ok, how about moving them out to some /legacy/ directory? No markdown, just pure HTML if that’s possible. But the number of files in a git checkout won’t go down. Maybe tarballs would do it!

Edit: I like tarballs on our downloads.tf.org site with a link from our docs.
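
Packing a rendered manual into a tarball is a few lines with Python's standard `tarfile` module. This is just a sketch: `archive_manual` is a hypothetical helper and the directory names in the usage line are assumed, not our real layout.

```python
import tarfile
from pathlib import Path


def archive_manual(manual_dir: Path, out_dir: Path) -> Path:
    """Pack one rendered manual directory into <out_dir>/<name>.tar.gz."""
    out_dir.mkdir(parents=True, exist_ok=True)
    tarball = out_dir / f"{manual_dir.name}.tar.gz"
    with tarfile.open(str(tarball), "w:gz") as tar:
        # arcname keeps paths inside the archive relative, e.g. "1.4/index.html".
        tar.add(str(manual_dir), arcname=manual_dir.name)
    return tarball


if __name__ == "__main__":
    # Assumed paths, for illustration only.
    print(archive_manual(Path("_site/manuals/1.4"), Path("legacy-tarballs")))
```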


#14

PDFs would be a good idea, I think.