Consolidating The Console

Lately I’ve started working on the console and the parts supporting it. A big stream of ideas rushes in when I think about what the next additions could be. I hope this can be the place where we discuss possible other features, needs, hopes, dreams & aspirations.

So far, these are the topics I’m currently either working on or would love to get some feedback on regarding wants & needs. Of course, as always, I’m open to discussing things I haven’t covered in this introduction:

  • VNC: remove VNC from the javascript/vendors folder so that we always have a chance to use the latest and greatest (and, IMO, a big plus: the refactor increases our coverage)
  • SPICE: the obvious runner-up; change from the (outdated) Ruby gem to an npm package, which probably requires a new package to consume the latest and greatest code. The latest official release from the spice-html5 team, however, is 2 years old; I’m trying to get in contact with that community to better understand the status of the project and how releases will roll out, etc. It seems there are contributions from time to time.
  • websockify: currently several ports need to be open to access the console remotely. So far I get push-back from my IT department on this (not unreasonable IMO), and it also means more work if we’re behind a firewall. It would be great if we adapted this to sit behind a reverse proxy which passes through to websockify’s token-based target selection. I think we’d get away from the port-selection process and the security issues that come with the current implementation.
    Will draw out a more detailed design of this implementation, based on what (I think) I’ve learned so far
  • websockify: remove the code from javascript/vendors and move to an OS package which provides this instead
  • react: of course we’d like to refactor this further into a React component; there’s already somebody working on a React console, perhaps this can be used for our needs as well
  • SSH: if we have a console package, we could probably enhance it to also support plain-old SSH
  • UX: I’d also like to discuss additional UX-improvement opportunities:
    • “in your face”-kinda animations & logging to directly show what’s going on in terms of status and connectivity (…aka the creative fun stuff :stuck_out_tongue:) - I have some ideas here, so I’ll probably follow this up with some mock-ups
    • automatically reconnect on connection loss
    • copy-paste of clipboard content (as far as I can tell, NoVNC supports it)
    • popping out the console as an overlay - again, I’ll try to draw a mock-up
    • possibility to support multi-monitor set-ups
  • General: wouldn’t it also be great if we could configure whether a server has support for VNC? Currently the console is supported for a few compute resources, but some bare-metal systems provide a VNC endpoint as well (it could be that this only works under very specific conditions, but it might be worth further investigation).

PS: this is my first RFC post, so I’m not sure if I’m doing this to expectation? Or should I create separate RFCs for these? Some of these are probably direct action points that could be handled in the issue tracker instead; but as the title says, I wanted to consolidate everything I have in my head so far regarding the console.


But this reverse proxy would need to be a websocket-capable reverse proxy. Keep in mind we don’t want to introduce another component into the stack, or at least be prepared for some resistance at all levels.

Can you elaborate please? I’m not sure what you mean; VNC/Spice are very different from the SSH protocol. Note there is some work around Cockpit, which already provides an SSH-console-over-websockets kind of service.

That is very thin ice to be on; it’s incompatible and differs from vendor to vendor. I think it is not worth the effort. What could be interesting, however, is an IPMI console over SSH; there is some standard implemented in OpenIPMI.

I was actually thinking here to try and use mod_proxy_wstunnel, as I would also like to just use the tools that are available right now :slight_smile:
Please note that I’m still figuring out all the details but, as said, I will try to draw this out in a schematic to better communicate the idea; a rough sketch of the kind of configuration I have in mind is below.
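
To make the reverse-proxy idea a little more concrete, here is a minimal sketch of what such a vhost snippet could look like with mod_proxy_wstunnel. The /webconsole/ path and the local websockify port 6080 are placeholders I made up for illustration, not a final design:

```apache
# Hypothetical snippet for the Foreman vhost; requires mod_proxy and
# mod_proxy_wstunnel to be loaded. Path and port are placeholders.
<Location /webconsole/>
    # Hand websocket connections arriving on the existing HTTPS port over to a
    # single local websockify instance; target selection then happens via
    # websockify's token-based lookup instead of one open port per console.
    ProxyPass ws://127.0.0.1:6080/
    ProxyPassReverse ws://127.0.0.1:6080/
</Location>
```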

Sorry, I didn’t mean to put them in the same bin of protocol types; rather, functionality-wise they serve a similar purpose: I can remotely control my systems. In my case, for instance, most OSes are text-based with no GUI. In such a case VNC/Spice adds a lot of overhead and, quite frankly, isn’t very responsive. If we could have an option somewhere to tell the system that we prefer a text-based console, that would be nice. I haven’t used Cockpit yet, but will definitely try it out! If that fits the use case, then this can be considered redundant and not needed.

Absolutely agree, I think that would probably fit my use-cases better as well. In my preliminary research I also found a lot of differences between vendors.

Anyway, thank you for your valuable insights and input :slight_smile:

Looks like it’s in RHEL7, that’s good. Installer changes ahead tho :slight_smile:

I am not sure if this is a good approach. VNC/Spice serves as the remote console for VM or cloud compute resources. It’s there so you can get into the BIOS or do things in GRUB2 before SSH actually starts. It’s a complementary tool; it cannot be replaced.

However, I understand the need for having SSH in the WebUI; I’d love to see that for ad-hoc management. This needs the Foreman server to have a direct connection to managed hosts, or some kind of SSH proxy (hop) in between. Take this into consideration when designing this feature.

Alright, so, finally I can come back with an update on where I’ve got to so far. Any reviews on the subject are more than welcome.

First, the overview of the parts that will need changing/implementation (I put them in yellow, enclosed by a dotted rectangle)

To make this a bit more consumable, I’ve also drawn a sequence diagram

I’ve also tried to make a breakdown:

  1. (POC) Use the file-based token plugin and mock the front-end to use a different token (see the sketch below this list)
    • Learn: is the token-based concept understood?
    • Learn: does ws_proxy also support WSS out of the box?
  2. A. Create a token plugin for websockify that reads out HOST:PORT for a compute resource
    • Python → Foreman: how will we read out this information? Needs investigation
  2. B. If plan A fails, or if the benefits are not greater than the pain: write the tokens to a file and use the standard file-based plugin
    • Although it would work, it seems redundant to write down all these connection tokens (and even manage the file’s lifecycle?)
  3. Adapt the Apache installer module to include ws_proxy and configure it correctly
    • Shifts the dependency on websockify to Apache
  4. We’ll also need to create a package (deb/rpm) and add this as a dependency to the Apache installation
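
For the POC in step 1 (referenced above), websockify’s stock file-based token plugin should be enough. A rough sketch of how that could look, with made-up tokens, hosts and paths, and assuming the TokenFile plugin and CLI flags behave the way I currently understand them:

```
# /etc/websockify/tokens.cfg – one line per console: "<token>: <host>:<port>"
vm-42-console: 192.0.2.10:5901
vm-43-console: 192.0.2.11:5902
```

```
# Single websockify instance that picks the VNC target from the token in the
# websocket URL (e.g. .../websockify?token=vm-42-console)
websockify --token-plugin TokenFile --token-source /etc/websockify/tokens.cfg 6080
```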

Last but not least, I’ve tried to list the benefits that I believe this effort will have

  • Single entry point → piggyback on a single HTTPS port!
  • Out-of-the-box WSS-support by re-using existing SSL-certs
  • No more scanning for available TCP-ports etc
  • Removed logical console limit (was 30, already a nice and high number but now logically unconstrained)
  • Completely decoupled life-cycle for websockify (run it as a service?)
  • Implied: easier to get latest greatest websockify & patches

For sure, the security aspects might be a bit naive on my end; I’d really like it if someone could take a look at this approach and tell me whether I’m proposing a security hole of biblical proportions (for instance, because I’d piggyback on the HTTPS endpoint of Apache to provide TLS).

Very interested to hear what you guys think; nevertheless, I’ll already start on the PoC as it doesn’t seem to be a lot of work to get that working (in a dramatically rough form, nonetheless).

Cheers!


Not a full reply yet, but there is also a Ruby implementation of websockify. I strongly advise using that instead of the Python one to keep the stack consistent. We could even integrate this into the smart-proxy, which would be super awesome.

Oh, and have you checked out what ManageIQ does? I think they implemented the wsproxy as part of their rails app. That also sounds nice, but I don’t know if that plays nice with mod_passenger.


I am with Timo: if you are going to tackle this, make sure there is no better way of adopting something in Ruby.

Second, @ekohl reported he is playing with the idea of ditching Apache with Passenger for some more lightweight solution (I think he had Puma in mind) with Nginx.

Also, it is worth making sure the communication towards the hypervisor goes through foreman-proxy. Currently it runs on WEBrick, but we are thinking of running it on Puma. That would allow websockets too (I don’t think WEBrick supports them).

Yeah, if we go for Puma instead of mod_proxy we should be able to use websockets. That way we can easily implement the websocket → VNC proxy in our Rails app. I think that would be the best option. Just take a look at the links I shared above and you should have all the building blocks right at your disposal.

Hey everyone,
Thanks for the great feedback.

Based on this I did some further reading into both ManageIQ and the websockify project, and I have come to the following conclusion: because websockify’s Ruby implementation is ancient, we’d have to rewrite a very big portion of it to handle single-entry-point-to-multi-TCP-endpoint (SEPTMTE for short). This also seems to be the path taken by ManageIQ (looking at their code, at least).

The idea would be fairly similar to before: a token results in a lookup of a connection to a specific endpoint. Each time a WS is opened using the specified token, a new TCP connection is created to the remote endpoint; if the WS connection is closed, so is the accompanying TCP connection. It looks pretty straightforward, and the transceiver from websockify could serve as inspiration for the things we need to consider. To top this off, tests will be written to verify its behaviour. Security-wise, I think I can actually borrow a bit from the notifications part. These are also sent via websocket and are probably protected by the user credentials passed along. If that is the case, we’re protected by that design and the token-based lookup could be as simple as passing a UID. As far as I understood from looking at ManageIQ, it seems to create a secret for each request, which I would rather not do as it could get messy to keep everything cleaned up.

On the other notes: I’m a huge fan of the work being done to make this work with Nginx; I love what the team has been doing and this is most definitely the right way forward. As far as my experience goes, I don’t believe this to be an impediment for the above-mentioned work; everything will work just the same (but will look smoother, data-wise).

Last but not least: I see that it’s important for the remote hypervisor to be connected via the smart proxy. I love this idea, but I don’t see this connection in the code? AFAIK this is currently done by routing the data to the subnet where the hypervisor sits. Perhaps there’s something I am overlooking, but I’m guessing we’d at least need to set the subnet for a compute resource so I can figure out which smart proxy to send my data to, in order for it to be proxied to the remote hypervisor, no?
In any case, I very much agree that that is the way to go (using the smart proxy as a ws proxy to the remote hypervisor). I would, however, ask for this task to be split up as follows:

  • The current task: write ws proxy
  • New task: re-use the ws proxy in the smart-proxy project (and probably set the subnet/smart proxy per compute resource)

Again, it could very well be that I have not understood part of the current design of Foreman and that my last “new task” is already available by some means I am not aware of.

If this is understood and agreed upon, and no other questions or requests for more info are needed at this point, I can probably start work on this at the beginning of next month. I will create a PR tagged as WIP, so that it can be reviewed as we go. Perhaps new questions will come up or improvements will pop up.

(sorry for the long posts guys)


Would it be possible to use websockify as a separate daemon with its own port? We can reverse-proxy it in Apache(/nginx) if needed. I’d prefer using an existing project over writing the code ourselves if it suits our needs. Ideally it would understand a clustered setup without sticky sessions. This may need a small plugin in websockify, or it could mean websockify is insufficient. Just looking for the trade-off here.

Doing it over the smart proxy could mean a double proxy, since the Foreman <-> proxy communication could happen on a separate (non-routed) subnet. Given that in the current setup Foreman talks directly to the virtualization layer and the proxies know nothing about it, I’d consider this out of scope and leave it for a future improvement, as you propose.

I know, I brought up the smart-proxy myself, but you are totally right. I was just thinking that we could use the smart-proxy to work around the issue that we cannot use websockets with mod_passenger (read: in core). But if we switch to Puma, the limitation is gone. So I’d really make this part of the Rails app in core.

The downside is that we have to teach the daemon to talk to Foreman’s database to validate the tokens, it’ll use its own logging, … That’s why I vote for making this part of the Rails app.

Actually, we don’t use websockets in Foreman core at all. But if you make the proxy part of the Rails project you should be able to use Foreman’s permission system for the websocket endpoint.

Oh, you’re right; I don’t know where I got the idea that the notifications were done via WS. So I’ll have to read up on what method I could use here to secure our websocket access, so that only authorized people who know the token can access the console.

Currently the API requests go straight through; we would love to change that but we are not finding the time. It’s unfortunate, but it is what it is. However, new functionality should be written in a proper way :wink:

Now, you are thinking right - in order to support this, you would need to create a new smart-proxy module that reports a new “feature”. Once the new feature is reported to Foreman, for each subnet you would be able to define which smart proxy is assigned to that feature, and you can use this before making a connection.

The smart-proxy module can either be a “dummy” (providing no functionality) or, better, be responsible for making the connection, or at least for spawning an external program that does the same.

So what is the final proposal you would like to implement? Keep using websockify.py? Rewrite the ManageIQ implementation? I am not sure.

I second this: a separate project, properly packaged in RPM/DEB as a daemon, is the ideal solution for us. We can still send them patches or bugfixes, but we don’t want to own any of it in our codebase.

I disagree; we are not doing it right there, so this is not a good example to follow. We will be happy if the POC is without the smart-proxy, of course, but this should not be a reason to give up at the starting line.

For sure, this was the original course of action. However, websockify is written in Python (from what I understood, this is not something we’d like?). From there, it was proposed to use the Ruby implementation, but a lot of features are missing, specifically in the token-based proxy area. Currently, this would imply that we’d have to rewrite a big part of the Ruby implementation (in addition to writing a custom lookup plugin).

It is possible nevertheless, but perhaps my way of implementing it (which would probably involve dependency-injecting a lookup object) is not in line with expectations? This might also make the security aspect of validating the login a bit harder (but this is currently just guessing, as I haven’t done an in-depth review of the possibilities).

This depends on what we want; the bottom line, however, is always a rewrite of the Ruby websockify implementation. If I wanted to dependency-inject some lookup plugin, then I’d have to fork the process to run it. But if we want to daemonize, we’d probably need to write a different plugin which can access Foreman via some API call (I’m not sure if there’s already an API call which can return the console’s connection details). Another possibility (which seems dangerous, IMO) is to simply pass an object with these connection details to a very simple lookup plugin which uses it to connect directly (which could thus open up connections to random ports on random hosts… which breaks a big security rule).

Which brings me back to the original issue: if there were a way to access the connection details in a decoupled way (REST API preferred), does it matter if we’re using Python as a 3rd-party component? Perhaps the breakdown is easier like this:

  • Create an API for retrieving the console connection details (if it doesn’t exist already; the web UI knows them, so probably there is a public API available)
  • Create a Foreman token plugin for websockify (easiest if we simply used the Python version: less effort and more community); we could probably even push this back to the websockify project → this would be nice as the API is already secured by certs, so the plugin would probably only have to know the API URL + the certs to use for authorization (see the sketch below this list)
  • (this remains the same in any case): pass-through websocket from the webserver to the internal websocket proxy
  • (stage 2 IMO): a console feature in the smart-proxy, for correctly relaying information. This is an effort on its own, but, complexity-wise, I’d first like to implement the things mentioned above before doing this one. I completely agree, however, that we shouldn’t shy away from doing the right thing; I’m just trying to split this up into consumable pieces of work :slight_smile:
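
To make the second bullet a bit more tangible, here is a minimal sketch of what such a Foreman lookup plugin for websockify could look like. The /api/consoles/<token> endpoint, the JSON field names and the certificate paths are assumptions on my end purely for illustration; only the BasePlugin/lookup shape comes from websockify’s token-plugin interface as I understand it:

```python
# Hypothetical Foreman token plugin for websockify (sketch, not an existing API).
# websockify would be started with something like:
#   websockify --token-plugin foreman_token.ForemanToken \
#              --token-source https://foreman.example.com 6080
import json
import ssl
import urllib.request

from websockify.token_plugins import BasePlugin


class ForemanToken(BasePlugin):
    """Resolve a console token to (host, port) via a (hypothetical) Foreman API."""

    def lookup(self, token):
        # The --token-source value is available as self.source (the API base URL).
        url = "%s/api/consoles/%s" % (self.source, token)

        # Authenticate with client certificates, as the existing APIs do;
        # the paths below are placeholders.
        ctx = ssl.create_default_context(cafile="/etc/foreman/ca.pem")
        ctx.load_cert_chain("/etc/foreman/client_cert.pem",
                            "/etc/foreman/client_key.pem")

        try:
            with urllib.request.urlopen(url, context=ctx) as response:
                data = json.loads(response.read().decode("utf-8"))
        except Exception:
            return None  # unknown or expired token: reject the connection

        # websockify expects a (host, port) tuple, or None to refuse.
        return (data["host"], data["port"])
```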

The latter proposal strikes my fancy, as the benefits are greatest if we use the existing websockify implementation as a 3rd-party component (for which the language doesn’t matter, IMO).

Very curious about what you guys think :slight_smile:


The Foreman project does not suffer from the NIH problem; we are quite reasonable. It looks like websockify worked fine in the past, and as far as I can tell it already supports token plugins today; it is configurable, so you can easily provide your own. Therefore I see no reason to maintain our own Ruby implementation; that could easily get out of control (websockify is already 800 commits by 50 contributors).

I am not sure that pushing a Foreman token plugin upstream brings much to the websockify community; I think it can easily live in our own git repo. There are already some implementations using JSON; maybe we can reuse their example plugins.

I agree that the smart-proxy component is not a must, but just keep this in mind when designing things.

That’s what I hoped, actually. Although keeping the lookup plugin in our git repository is fine by me, do keep in mind that it is written in Python, and as I recall we didn’t want to add yet another language to the mix? Perhaps I’m missing something here, but all my later proposals were based on this specific fact. :slight_smile: Or was this more to be considered an “ideally don’t add Python, but if there’s really no other (economical) way…” statement?

So, to consolidate this (probably misinterpreted on my end) fact (deviations from the previous list in bold):

  • Create an API for retrieving the console connection details (if it doesn’t exist already; the web UI knows them, so probably there is a public API available)
  • Create a Foreman token plugin for websockify (easiest if we simply used the Python version: less effort and more community); which we store in the foreman repo
  • (this remains the same in any case): pass-through websocket from the webserver to the internal websocket proxy
  • (stage 2 IMO): a console feature in the smart-proxy, for correctly relaying information. This is an effort on its own, but, complexity-wise, I’d first like to implement the things mentioned above before doing this one. I completely agree, however, that we shouldn’t shy away from doing the right thing; I’m just trying to split this up into consumable pieces of work :slight_smile: Probably best if the plugin lives here in the future, then.

[UPDATE]
Looking at the current API, there is in fact an API call for returning the vm_compute_attributes for each host. The only thing that appears to be missing is that (at least in oVirt) we can start a console from the compute resource directly, for which the previous API (logically) wouldn’t work.

A second thing I’ve noticed is that the api/hosts API is scoped by the previously selected organization/location, meaning that if I change the location in the UI, the results here will be different. IMO it’s a bit curious that this wasn’t solved with a filter query instead; nevertheless, this is the current state of affairs as far as I can tell.

Provided that the API is what it is and we shouldn’t & mustn’t break the current behaviour, perhaps it’s easiest to just focus my attention on the ComputeResource API and read out the equivalent of vm_compute_attributes there. This way I at least overcome the case where there is no host, and unify the lookup method a bit.

This immediately also tells me the lookup key will have to be a composite, along the lines of compute_resource_id/unique_identifier_within_compute_source.
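
To illustrate the composite key idea, the token could simply be split on the first separator; the format and names below are only illustrative:

```python
# Hypothetical composite console token: "<compute_resource_id>/<vm_identifier>"
def parse_console_token(token):
    """Split a composite token into the compute resource id and the VM's
    unique identifier within that compute resource."""
    compute_resource_id, _, vm_identifier = token.partition("/")
    if not compute_resource_id or not vm_identifier:
        raise ValueError("malformed console token: %r" % token)
    return compute_resource_id, vm_identifier

# e.g. parse_console_token("3/2b5c1a7e-0f34-4c29-9d1a-000000000000")
#      -> ("3", "2b5c1a7e-0f34-4c29-9d1a-000000000000")
```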

Just FYI, the ManageIQ folks were not happy with the Ruby implementation, mostly due to the limit on how many concurrently open connections to the database the websocket process introduced (exhausting the pool size), because each Ruby WS server could handle fewer clients. They were talking about using https://github.com/anycable/anycable instead.

Alright guys, time for a (much needed) update! It took me some time to wrap my head around a solution that is, in my opinion, sound and ought to be scalable for the future as well.

I’ve re-evaluated all input and this is my game-plan

  1. Add a token plugin to websockify that supports JWT tokens, as this suits our case greatly (again, IMO). I would go for asymmetric signing here to validate that the data has not been altered along the way. A big pro is that there’s no need for a feedback loop to Foreman to request connection details for the compute resource anymore. Even better, it’s not specific to Foreman in any way => I’m finishing up on this one already; tests are written and the code is almost done… because JWT has a lot of library support :stuck_out_tongue: (see the sketch after this list)
  2. After this is merged, I’d probably need some help from you guys to update the official CentOS RPMs to reflect these changes. The pro here is that we kick out the code copying and let both projects have their fun, instead of carrying around an older copy of the websockify code!
  3. After this is ready: set up foreman-proxy to allow WS connections and forward them to a websockify daemon on the smart-proxy, getting the data closer to the endpoint (I think this is a valuable step in the right direction of not having to expose our compute resource).
  4. Finally, we only need to proxy WS data from Foreman towards the correct smart-proxy; this would be done the same way we configure other proxy functionality. Ideally, by then, we’d have Nginx as our webserver, which has good WS support. I expect installer changes here anyway to configure this one.
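
For reference, a minimal sketch of what the JWT token plugin from step 1 could look like, assuming PyJWT with asymmetric (RS256) verification; the claim names (host, port) and the key path are my own assumptions for illustration, not the final implementation:

```python
# Sketch of a JWT-based websockify token plugin (assumed claim names and paths).
# Foreman would sign a short-lived token with its private key; websockify only
# needs the matching public key, so no feedback loop to Foreman is required.
import jwt  # PyJWT

from websockify.token_plugins import BasePlugin


class JWTToken(BasePlugin):
    """Resolve a console target from a signed JWT instead of a lookup table."""

    def lookup(self, token):
        # --token-source points at the public key used to verify the signature.
        with open(self.source) as key_file:
            public_key = key_file.read()

        try:
            claims = jwt.decode(token, public_key, algorithms=["RS256"])
        except jwt.InvalidTokenError:
            return None  # tampered with, expired, or simply not ours

        # The signed payload carries the target directly, e.g.
        # {"host": "192.0.2.10", "port": 5901, "exp": ...}
        return (claims["host"], claims["port"])
```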

Before continuing down the rabbit hole: let me know what you think about this one (or what could/should be improved)! :slight_smile: