Smart Proxy: Future Design, Scaling and Use Cases

lzap · January 15, 2021, 3:57pm

This is indeed a good point, something I have never considered myself. By the way, to this day we have never finished the SELinux policy for the proxy. I started the effort and the policy does exist, but it’s not installed by default: too much work remained to finish it, and I ran into some technical issues along the way.

Okay, I am gonna be harsh. I’d say it should use a language and stack that are performance friendly, not Ruby. Let me explain. Today, I spent a good two hours on IRC chatting with lero about his performance issues with facts. It turned out that they had some custom facts, way too many of them, which were all being uploaded into Foreman, and the fact endpoint was choking. The reason was Ruby dealing with large hash tables while processing the data, as we do a ton of stuff before it even goes into the database.

Even if we rewrite our fact parsing code to be more efficient, it will not help, because the moment a large JSON document hits a Ruby on Rails endpoint, Rails parses it, and that operation alone is slow. I wish Foreman had a component running on the network edge (where the smart proxy runs today), written in, let’s say, an approachable language with a highly concurrent, high-performance HTTP stack, where all of this processing could be offloaded. (*)

It’s not just facts, you can put a new bullet on your list: I am currently working on the Optimized Reports Storage RFC - a prototype and a new plugin with a brand new, more efficient way of storing reports. And this week @Marek_Hulan had a great idea - what if we offload the processing of incoming reports to the smart proxy (or the new “dumb” smart proxy @ekohl mentioned above)? Reports coming from various sources need to be transformed into some reasonable common format, the number of warning, info and error messages must be counted, and a summary must be built before the report can be stored in the database. All of this, again, can be done on the proxy side, granted that for Puppet I am going to change the JSON format to be more efficient, without those log/resource/message hashes.
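As a rough sketch of the counting and summary step described above, this is how it might look in Go on the proxy side. The `Report`, `LogEntry` and `Summary` types and their JSON field names are made up for illustration; they are not the actual Puppet report format nor the format proposed in the RFC.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// LogEntry is a hypothetical, simplified log line from an incoming report.
type LogEntry struct {
	Level   string `json:"level"`
	Message string `json:"message"`
}

// Report is a hypothetical common format that source-specific reports
// would be transformed into before summarizing.
type Report struct {
	Host string     `json:"host"`
	Logs []LogEntry `json:"logs"`
}

// Summary holds per-level message counts that could be stored with the report.
type Summary struct {
	Host    string `json:"host"`
	Info    int    `json:"info"`
	Warning int    `json:"warning"`
	Error   int    `json:"error"`
	Other   int    `json:"other"`
}

// Summarize decodes a report and counts its log messages by level.
// Level names are matched case-insensitively; unknown levels go to Other.
func Summarize(raw []byte) (Summary, error) {
	var r Report
	if err := json.Unmarshal(raw, &r); err != nil {
		return Summary{}, fmt.Errorf("decoding report: %w", err)
	}
	s := Summary{Host: r.Host}
	for _, l := range r.Logs {
		switch strings.ToLower(l.Level) {
		case "info":
			s.Info++
		case "warning":
			s.Warning++
		case "error":
			s.Error++
		default:
			s.Other++
		}
	}
	return s, nil
}

func main() {
	// Made-up sample report.
	raw := []byte(`{"host":"web01","logs":[
		{"level":"info","message":"applied catalog"},
		{"level":"warning","message":"deprecated option"},
		{"level":"error","message":"service failed to start"}]}`)
	s, err := Summarize(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", s)
}
```

With something like this running on the proxy, Foreman itself would only need to store the precomputed summary alongside the report body.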

My point is, Ewoud mentioned an interesting aspect: if we want to increase security, a good solution would be to break the smart proxy into two separate processes and wrap everything with SELinux, specifically the client-facing process. And if we started a new process, we could intentionally build it with Performance First in mind.

(*) Thus I am thinking Golang