RFC - Cleanup old WIKI pages

Many times while searching for something related to Foreman, I have ended up at the Foreman wiki.
This is not ideal for our users, especially those who are just starting with Foreman, because the wiki pages are usually in one of these three states:

  • Outdated, a lot of articles are from 2012
  • 404 - links in page are dead
  • Moved to theforeman.org

Feel free to browse the Wiki by yourself and see the content, here are some examples:

Hitting these pages is time-consuming and doesn’t provide any useful information, so I’d like to start a discussion about what we can do with the wiki and how to clean it up properly.

What we can do:

I’m not sure what options we have with Redmine. From what I’ve heard, redirecting wiki pages to a different site is not possible, so I’m open to any suggestions / ideas.

One last thing to mention: I’ve only been looking at the Foreman project, but we shouldn’t forget to check all other projects/plugins and ensure that they don’t have the same problem.

We absolutely need to fix this, good find. Ideally, removing the content completely and providing links to up-to-date sites/documents so people coming from Google searches will find it. This feels like the easiest thing to do.

Shall we set up some kind of “Fix Our Wiki Friday” event where we would meet, brainstorm, and fix these issues together?

A certain :+1: for cleaning up the wiki.

I would start with a redirect to the current content. For example, there are some infra docs at https://github.com/theforeman/foreman-infra/tree/master/docs. So it can be as simple as replacing the whole page with a “this content is now at NEW LOCATION” text. The Foreman Architecture - Foreman page is IMHO a good example. We could also include [MOVED] in the title.

Looking at https://projects.theforeman.org/projects/foreman/wiki/index we can at least say that the various team sprints can probably be ignored or even deleted. Perhaps @Marek_Hulan can weigh in on that.

@lstejska if you want to work on this and lack permissions, I’d be happy to help out with that.

Yeah, not a bad idea. I had actually planned it for myself as a perfect Friday task :slight_smile:

Ideally, removing the content completely and providing links to up-to-date sites/documents so people coming from Google searches will find it. This feels like the easiest thing to do.

@ares had a good point about the history kept for each wiki article. Even if we delete the content & say "hi, content is moved here: " there are still x historical versions that search engines can reach, for example the history of the Tutorials article.

I would start with a redirect to the current content

Not sure if my account is missing permissions, but I can’t see any option for a redirect while editing the wiki articles.

So it can be as simple as replacing the whole page with a “this content is now at NEW LOCATION” text. We could also include [MOVED] in the title.

[MOVED] in the title is a good idea, we should do that.
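A minimal sketch of what such a "moved" stub could look like if it were generated by a script rather than typed by hand. It assumes the wiki uses Textile markup (so `h1.` is a heading) and that the page would be updated through Redmine's REST API with a `wiki_page` payload containing `text` and `comments`. The helper name `moved_stub` and the example URL are hypothetical, and nothing here has been tried against our Redmine instance.

```python
import json


def moved_stub(title: str, new_url: str) -> dict:
    """Build a payload whose text only points readers at the new location.

    Assumption: the payload shape follows Redmine's wiki page REST API.
    """
    text = (
        f"h1. [MOVED] {title}\n\n"          # Textile heading (assumed markup)
        f"This content is now at {new_url}\n"
    )
    return {
        "wiki_page": {
            "text": text,
            "comments": "Replace outdated page with a pointer to the new location",
        }
    }


# Hypothetical example values, for illustration only.
payload = moved_stub("Foreman Architecture", "https://example.org/new-location")
print(json.dumps(payload, indent=2))
```

This only builds the replacement text. It does not touch the page history, so older versions would still be reachable.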

One easy thing that we could do is to stop search engines from indexing our wiki pages.
I assume that Redmine is running behind Apache, so we could add a noindex header to all wiki pages, something like this:

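<IfModule mod_headers.c>
    # This pattern only matches /wiki at the URL root; it would need adjusting
    # for Redmine paths such as /projects/foreman/wiki/...
    <If "%{THE_REQUEST} =~ m#\s/+wiki/?[?\s]#">
        # Ask crawlers not to index, cache, or show snippets for matching pages
        Header set X-Robots-Tag "noindex, noarchive, nosnippet"
    </If>
</IfModule>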

(note: the code above was shamelessly copy-pasted from Stack Overflow without any actual testing, it is just a POC)
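If a header like this were deployed, a quick way to verify it would be to look at the response headers of a wiki page. A rough sketch of the check itself, assuming the headers are available as a plain dict; `is_noindexed` is a hypothetical helper name, and user-agent-specific values such as `googlebot: noindex` are not handled.

```python
def is_noindexed(headers: dict[str, str]) -> bool:
    """Return True if an X-Robots-Tag header carries a noindex directive."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-robots-tag":
            directives = {d.strip().lower() for d in value.split(",")}
            # "none" is shorthand for "noindex, nofollow".
            if "noindex" in directives or "none" in directives:
                return True
    return False


print(is_noindexed({"X-Robots-Tag": "noindex, noarchive, nosnippet"}))
print(is_noindexed({"Content-Type": "text/html"}))
```

The dict could come from something like `dict(response.headers)` after fetching a page with `urllib.request.urlopen`.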

Add me to the CC then!

We can create a list of pages that can be deleted as well, like Ewoud mentioned, and then someone from the infra team could delete them for us in a batch or with a script. I am assuming these are either files or database records with some name.
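A rough sketch of how such a candidate list could be built. It assumes the page list is available as JSON in the shape that Redmine's wiki index REST endpoint returns (a `wiki_pages` array with `title` and `updated_on`). The sample data, the cutoff date, and the helper name `stale_pages` are made up for illustration, and the result is only a list for humans to review, not something to delete blindly.

```python
from datetime import datetime, timezone


def stale_pages(index: dict, cutoff: datetime) -> list[str]:
    """Return titles of pages last updated before the cutoff, sorted by name."""
    titles = []
    for page in index.get("wiki_pages", []):
        # Timestamps are assumed to be ISO 8601 with a trailing "Z" for UTC.
        updated = datetime.fromisoformat(page["updated_on"].replace("Z", "+00:00"))
        if updated < cutoff:
            titles.append(page["title"])
    return sorted(titles)


# Hypothetical sample data, not real wiki pages.
sample = {
    "wiki_pages": [
        {"title": "Old_Sprint_Report", "updated_on": "2012-05-01T10:00:00Z"},
        {"title": "Recent_Page", "updated_on": "2019-03-01T10:00:00Z"},
    ]
}
cutoff = datetime(2015, 1, 1, tzinfo=timezone.utc)
print(stale_pages(sample, cutoff))
```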

No idea if Redmine is behind Apache, but noindex is a good idea for sure.

All Wiki pages:

These both look empty, but if you look at all pages (Foreman Maintain - Foreman & Hammer CLI - Foreman) there are still a few left. They are mostly sprint reports, but the List of Plugins - Hammer CLI - Foreman page still looks relevant. If we have a new home for that, I’d be happy to disable the wiki modules for those projects.

Another potential solution: we can use GitHub topics. That way https://github.com/topics/hammer-cli-plugin becomes a dynamic list. I’ve added the hammer-cli-plugin topic to the repos on that page where I had access. The Katello and ATIX plugins are thus missing, and probably more repos.

Note that Foreman itself also links to some wiki pages. We can’t just turn those off. It needs a patch to Foreman. Ideally we’d have a redirect once we turn the wiki off.

I can at least see that https://projects.theforeman.org/projects/foreman/wiki/Mail_Notifications is linked but there may be more.
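To find the remaining links, one option would be to scan the Foreman source for wiki URLs. A minimal sketch, assuming the links appear as full `projects.theforeman.org` URLs in the files. The regular expression and the helper name `find_wiki_links` are my own, and page names containing dots would be cut short by this pattern.

```python
import re

# Matches URLs like https://projects.theforeman.org/projects/<project>/wiki/<Page>
WIKI_LINK = re.compile(
    r"https?://projects\.theforeman\.org/projects/[\w-]+/wiki/[\w%-]+"
)


def find_wiki_links(text: str) -> list[str]:
    """Return the unique wiki URLs found in the given text, sorted."""
    return sorted(set(WIKI_LINK.findall(text)))


# Hypothetical line of source code, for illustration only.
sample_source = (
    'help_url = "https://projects.theforeman.org/projects/foreman/wiki/Mail_Notifications"'
)
print(find_wiki_links(sample_source))
```

Running this over each file in a checkout would give a list of pages that need a patch or a redirect before the wiki is turned off.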