Discussion: anonymous stats reporting

(apologies for cross-posting, this is relevant to both lists)

Hi all,

Recently the issue of reporting some stats back to the project has
come up [1], and as this is a contentious issue, I thought I'd bring
it up on the lists to get feedback.

Anonymous stat reporting isn't new (I can't recall how long popcon has
been in Debian, but it's a while) and provides valuable data to the
developers about how to prioritize workload. Against that, there is
always a privacy concern to be handled, which is what makes this a
treacherous area.

What follows are just my thoughts, to start the discussion:

It seems to me that if we have:

  • opt-in
  • anonymised
  • aggregated

data, then privacy concerns should be mitigated. The opt-in is
especially important; I'd even say it needs to be more effort than a
Setting, and probably something like enabling a cronjob. For
anonymising, whatever we set up to receive incoming stats would drop
things like IP address immediately, and just add the useful data to a
store.
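
To make the "drop the IP, keep only the useful data" idea concrete, here is a
minimal sketch of what a receiver could do. Everything in it is an assumption
for illustration: the `ingest` function, the `ALLOWED_FIELDS` whitelist and the
in-memory `Counter` store are hypothetical, not an existing Foreman or
foreman-infra API.

```python
from collections import Counter

# Hypothetical whitelist: only these keys from a report are ever stored.
ALLOWED_FIELDS = ("foreman_version", "compute_resources")


def ingest(store, remote_ip, report):
    """Fold one incoming report into aggregate counters.

    remote_ip is accepted only to show that it is deliberately ignored:
    it is never written to the store.
    """
    del remote_ip  # dropped immediately, nothing below can use it
    for field in ALLOWED_FIELDS:
        value = report.get(field)
        if value is None:
            continue
        # Treat scalars and lists the same way: one count per item.
        values = value if isinstance(value, list) else [value]
        for item in values:
            store[(field, str(item))] += 1
    return store


store = Counter()
ingest(store, "192.0.2.10", {
    "foreman_version": "1.9.2",
    "compute_resources": ["libvirt", "ec2"],
    "hostname": "secret.example.com",  # not whitelisted, so never stored
})
ingest(store, "192.0.2.11", {
    "foreman_version": "1.9.2",
    "compute_resources": ["libvirt"],
})
```

Because the store only ever holds counts per (field, value) pair, there is no
per-installation record left to de-anonymise afterwards.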

For data, I can see uses for:

  • Foreman version
  • Proxy version
  • Plugins enabled and versions
  • Compute resources enabled

Basically, all the stuff on the About page :). For an example in pseudo-JSON:

{
  "stats": {
    "foreman_version": "1.9.2",
    "proxy_versions": ["1.9.2", "1.8.4"],
    "compute_resources": ["libvirt", "ec2"],
    "plugins": {
      "discovery": "4.0.0"
    }
  }
}

We might also want to look at (a) proxy plugins, and (b) hammer
plugins, but that could be tricky.
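
As a rough illustration of the sending side, the payload above could be
assembled along these lines. This is only a sketch: `build_report` is a
hypothetical helper, its arguments are supplied by the caller rather than read
from a real Foreman instance, and de-duplicating the proxy versions is my own
assumption (so the report does not reveal how many proxies run each version).

```python
import json


def build_report(foreman_version, proxy_versions, compute_resources, plugins):
    """Assemble the stats payload from values supplied by the caller."""
    return {
        "stats": {
            "foreman_version": foreman_version,
            # De-duplicate and sort so the output is stable between runs.
            "proxy_versions": sorted(set(proxy_versions)),
            "compute_resources": sorted(set(compute_resources)),
            "plugins": dict(plugins),
        }
    }


report = build_report(
    foreman_version="1.9.2",
    proxy_versions=["1.9.2", "1.8.4", "1.9.2"],
    compute_resources=["libvirt", "ec2"],
    plugins={"discovery": "4.0.0"},
)
print(json.dumps(report, indent=2, sort_keys=True))
```

Printing the report before sending it would also let an admin see exactly what
leaves the machine, which fits the opt-in approach.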

Obviously all this would be developed in the open (PRs to foreman-core
for sending, foreman-infra for the receiver). I'd also like to display
the current data via something (possibly on the website or a custom
redmine plugin) - if it's properly anonymised, then we can share it.

We'd like to know your thoughts!

Greg

··· -- [1] http://theforeman.org/issues/11898

> We'd like to know your thoughts!

As a sub-opt-in option, we could also notify users about security
updates (by email or via warning bubbles or other means).

··· -- Later, Lukas #lzap Zapletal

> (apologies for cross-posting, this is relevant to both lists)
>
> Hi all,
>
> Recently the issue of reporting some stats back to the project has
> come up [1] and as this is a contentious issue, I thought I'd bring
> it up on the lists, to get feedback.
>
> Anonymous stat reporting isn't new (I can't recall how long popcon has
> been in Debian, but it's a while) and provides valuable data to the
> developers about how to prioritize workload. Against that, there is
> always a privacy concern to be handled, which is what makes this a
> treacherous area.
>
> What follows are just my thoughts, to start the discussion:
>
> It seems to me that if we have:
>
> * opt-in
> * anonymised
> * aggregated
>
> data, then privacy concerns should be mitigated. The opt-in is
> especially important; I'd even say it needs to be more effort than a
> Setting, and probably something like enabling a cronjob. For
> anonymising, whatever we set up to receive incoming stats would drop
> things like IP address immediately, and just add the useful data to a
> store.

My first thought was opt-in so I fully agree that should be an absolute
minimum.

> For data, I can see uses for:
>
> * Foreman version
> * Proxy version
> * Plugins enabled and versions
> * Compute resources enabled
>
> Basically, all the stuff on the About page :). For an example in pseudo-JSON:
>
> {
>   "stats": {
>     "foreman_version": "1.9.2",
>     "proxy_versions": ["1.9.2", "1.8.4"],
>     "compute_resources": ["libvirt", "ec2"],
>     "plugins": {
>       "discovery": "4.0.0"
>     }
>   }
> }
>
> We might also want to look at (a) proxy plugins, and (b) hammer
> plugins, but that could be tricky.

Optionally the number of hosts (on a logarithmic scale rounded to 1
decimal to anonymise). Having an idea of the average install size could
be very helpful though I suspect the most interesting installations
wouldn't call home in the first place.
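
A quick worked example of that rounding, as a sketch only: the function name
and the choice to report nothing for an install with no hosts are assumptions
of mine.

```python
import math


def host_count_bucket(host_count):
    """Return log10 of the host count, rounded to 1 decimal place."""
    if host_count < 1:
        return None  # assumption: an install with no hosts reports no size
    return round(math.log10(host_count), 1)


# Installs with 1234 and 1300 hosts fall into the same bucket, so the
# exact size cannot be recovered from the reported value.
small = host_count_bucket(100)
medium_a = host_count_bucket(1234)
medium_b = host_count_bucket(1300)
```

With one decimal, each reported value covers a range of roughly +/-12% around
its centre (for example about 1122 to 1413 hosts for a value of 3.1), which
still gives a usable picture of install sizes.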

··· On Tue, Oct 06, 2015 at 12:23:30PM +0100, Greg Sutcliffe wrote:

> Obviously all this would be developed in the open (PRs to foreman-core
> for sending, foreman-infra for the receiver). I’d also like to display
> the current data via something (possibly on the website or a custom
> redmine plugin) - if it’s properly anonymised, then we can share it.
>
> We’d like to know your thoughts!
>
> Greg
>
> [1] Feature #11898: As a user, I would like to see if a new update of Foreman is out. - Foreman

I think that was the intent of the original request, but the issue is that
this becomes a sort of reporting: a Foreman installation (either the browser
that connects to Foreman, or the Foreman server itself) connects to a service
(that the Foreman team maintains) and checks whether there is a newer version
than the one currently running.
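
The comparison itself is the easy part. A minimal sketch, assuming purely
numeric dotted version strings like "1.9.2" (the function names are
hypothetical, and how the latest version is fetched is left out on purpose,
since that request is exactly the call-home step under discussion):

```python
def parse_version(text):
    """Turn '1.9.2' into (1, 9, 2); assumes numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def update_available(running, latest):
    """True if 'latest' is strictly newer than the running version."""
    return parse_version(latest) > parse_version(running)


newer = update_available("1.9.2", "1.10.0")
same = update_available("1.9.2", "1.9.2")
older = update_available("1.9.2", "1.8.4")
```

Comparing integer tuples rather than raw strings matters here: as plain
strings, "1.10.0" would sort before "1.9.2".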

That by itself will provide (very basic) anonymous usage information (e.g.
IP addresses of foreman / browsers) and potentially the plugins and their
versions.

Once you start thinking of this, it could grow very fast (e.g. one idea was
to create a plugin repository / forge, for users to view potential plugins
and install easily).

I believe Greg is trying to figure out how people feel about potentially
having their foreman server(s) or browsers call home every now and then
before we start building infrastructure to support it.

Ohad

··· On Tue, Oct 6, 2015 at 2:36 PM, Lukas Zapletal wrote:

> > We’d like to know your thoughts!
>
> As a sub-opt-in option, we could also notify users about security
> updates (by email or via warning bubbles or other means).

Exactly, thanks for clarifying. It's true that we could solve the
original ticket with no kind of phone-home behaviour, but if people
are ok with it, then we might as well solve the whole problem in one
go.

Greg

··· On 6 October 2015 at 12:42, Ohad Levy wrote:

> I believe Greg is trying to figure out how people feel about potentially
> having their foreman server(s) or browsers call home every now and then
> before we start building infrastructure to support it.

> > I believe Greg is trying to figure out how people feel about potentially
> > having their foreman server(s) or browsers call home every now and then
> > before we start building infrastructure to support it.
>
> Exactly, thanks for clarifying. It's true that we could solve the
> original ticket with no kind of phone-home behaviour, but if people
> are ok with it, then we might as well solve the whole problem in one
> go.

Sure, I was just throwing in an idea, got that.

··· -- Later, Lukas #lzap Zapletal