RFC: Changing how we handle Webpack Building

In general though, the relative path shouldn’t change if it’s a top level module though? e.g. lodash/map.js will always be at lodash/map.js unless the lodash project itself changes the publish structure?

That is correct, but if the dependency tree changes that may cause a module that was a sub-dependency to become a top level dependency or vice versa due to de-duping. Also, the exact de-duping may vary between node/npm versions, so a module that is top level on one system may be a sub-module on another.

This idea is based on my limited knowledge, so hoping that folks can steer this in the right direction. Could we move to a model where plugins assume that a “module” exists named foremanVendor and all requires for thirdparty modules are requested through the foremanVendor module?

If something like foremanVendor.lodash was a constant I could rely on as a plugin, then I wouldn’t have to care about the relative paths or dependencies shifting inside vendor.js (I think). This seems sorta what an alias in webpack is designed for. Similar to how all of the React is reference-able using the foremanReact alias.

Excuse my asking of obvious questions, but what problem does the node modules directory structure changing cause? It seems like if you had a package-lock.json or a way to reference package versions you could recreate the bundle.

What is the idea behind this? To ensure our specified packages are always in the same hierarchy in the node_modules folder?

I think alias is more for code that you have locally that you want to reference in imports, but don’t want to package and ship a new npm package for. I would think putting third-party code in an alias is an anti-pattern, but this being the wild west of javascript, I could be wrong :man_shrugging:

As I mentioned, there are several different causes leading to different structures - changes in module versions (which package lock may solve) is just one of them. I’m not sure that running with the same pacakge lock would yield the same structure e.g. on different npm versions. The problem it causes is in the identifiers assigned to modules during the webpack compilation - which are their path relative to node_modules folder.

Ok, gotcha, I’m trying to wrap my mind around doing this in a packaging context so thanks for your patience :slight_smile:

If I understand correctly, the idea is if we are doing this new way of packaging, we want to just be able to rebuild vendor.js and keep the foreman/plugin bundles the same. So if you recompile vendor.js because a package was added and the node_modules directory structure changes, you break all the plugin’s bundles since the references are no longer valid.

That is correct right now since the hash of vendor.js will change when anything in it changes. If we drop the hash, this will be semi correct - sometimes it would still work (i.e. when the path doesn’t change) and other times it will break without us knowing, which is why the hash was introduced. The goal is indeed to allow changes in core node modules without requiring all plugins to be rebuilt.

One part I’m still confused on is how the directory structure of node_modules affects the vendor.js bundle. I can walk through a scenario to illustrate my confusion.

  • wepback creates bundled files for foreman, katello, and third-party. I’m assuming we dropped the hashes here, so lets call them bundle.js, katello.js, and vendor.js
  • some-package is added to foreman’s package.json
  • npm i is ran and node_modules structure is updated to include some-package but maybe there are some byproducts of folders being rearranged. The node_modules directory structure is different now.
  • I create a new vendor.js using webpack, but I keep the old katello.js and bundle.js. These are the files being served.

Wouldn’t the existing bundle.js and katello.js still be able to reference vendor.js correctly? As long as no packages were removed, it should have the same packages in it with some-package added. I guess you would need to ensure the existing package’s versions are correct or compatible with the old bundles, but that seems like it could be done with pinning the versions.

What I am wondering is how node_modules directory structure actually affects the bundles? It seems like webpack will ensure the packages in node_modules are available as modules in vendor.js that can be imported elsewhere. This would mean the directory structure of node_modules is inconsequential.

According to the docs, webpack uses a manifest to find where modules are located. If this is manifest.json in our case, the contents shouldn’t change if the hashes are removed.

I could be missing something obvious, just trying to get a better idea of the issue, thanks for the explanations so far.

ok, I think I (partially) answered my question right after sending it (as is always the case).

I looked into the bundled files and see numerous references to third-party paths inside node_modules, i.e node_modules/jquery-flot/jquery.flot.time.js, so the concerns make more sense now.

I’m still not totally clear on how this works client-side, what is node_modules referencing when the bundles get to the browser?

I looked into this more and had some off-thread discussions to clear up my confusion. Also, @tbrisker’s medium article helped explain a lot of background of our webpack usage and the current issue we are trying to solve. (thanks for writing that!)

I came up with a potential solution to making new vendor.js bundles compatible with other existing bundles if they have don’t have breaking changes.

I created a plugin (basing it of @tbrisker’s existing one) that creates a hash based on plugin name, author, major, and minor versions. This is used as the module identifier. The patch (or z) version is excluded from creating the hash, so updating z versions won’t create new hashes. Only major or minor breaking changes will update the module identifiers.

This way, if you have a bundle.js that was build with this plugin, it will keep the same references to the vendor.js file as long as packages didn’t break x or y versions. So you could do the following:

  • build vendor.js files with updated z-versions and use an old bundle.js files with them
  • build new bundle.js files and use old vendor.js files with them.

I created an example repo with a webpack config that resembles ours. The plugin can be found in this file and included steps to test the concept in the README.

I also found webpack has a way to create a records.json file, which maps the identifiers to modules, and can be used to persist this across builds. I enabled this in the repo and use it in the instructions. This might be something we find useful.

I’m not sure if this will fit all of our criteria and I haven’t tested it with actual foreman builds, but hopefully its a step in the right direction.

2 Likes

This sounds like a better way of ensuring the module ids are consistent and we don’t have to care about the module structure. TBH, I think it might be better to keep the module name and version as plain text instead of hash so it’s human readable in case something breaks.
There is still the issue of how we know that a certain plugin is compatible with a certain vendor file - right now, we use a hash of the vendor file to detect changes, but this is quite overkill and leads to having to rebuild every plugin every time something in the file changes, usually with no need. I’m not sure I have a good way of fixing this, but perhaps we should at least change the hash to be a hash of the list of modules in vendor.js?

1 Like

TBH: I Would really like to see foreman’s webpack code wrapped in an npm
package (even if its not published) but that plugins can depend on it, imho
this will avoid all of the extra workarounds we have in place to ensure
vendor file consistency.

2 Likes

@tbrisker That was my thought behind the records.json file. This matches hashes to the actual modules. Perhaps this could be created and kept in the build somehow? I’m still learning the specifics around our building so others can tell me if this is realistic. This being said, I’m not opposed to plain-text identifiers either.

I was wondering about this as well. A hash based on the list of modules sounds like it could work. I don’t know what we would do with the bundle.js file though, as that won’t have a list of modules. I’ll look into it more when I get a chance.

I’ll do my best to get a PR open so we can see what this actually looks like with foreman + plugins and run through some build scenarios

@ohadlevy Sounds interesting, can you go into some details on how this would work?

Sure, instead of creating a webpack vendor file foreman would export all of the libraries in the same manner like patternfly-react expose a react-bootstrap component, lets take the React package as an example:

Today, in foreman package.json, React is a dependency, and in order for plugins to write React code, its also a dependency of the plugin, so you have two places where you depend on a package, and we magically hope that webpack will know to use the same react npm package.

further, it gets even more interesting, where you want to run react based tests, so without the react code, you can’t run tests (which makes sense) so you end up with a plugin node_modules that have another copy of react (regardless of the node_modules for foreman itself).

My suggestion is that instead, foreman will export all of the packages via some namespace lets call it ForemanPackage, and each plugin will depend on the npm package of foreman, in code it would look something like this (not tested):

import React from 'react';

export { default as React } from React;

and then in another place, e.g. a plugin which wants to use React:

Import React from 'ForemanPackage';
...

then you could simply delete your plugin dependency on React as a package, and whenever you run your tests, you know you are getting the version of React that is defined in foreman.

This will also (IMHO - needs to be tested) should eliminate the code duplication that happens today (if you run npm run analyze you can see that each plugin is copying foreman react code…

3 Likes

The link there is wrong, and I’ve missed the window to edit my post. The docs for webpack’s record file is here

This is what I was suggesting earlier (RFC: Changing how we handle Webpack Building) and glad to finally see a response echoing the idea!

I do think @John_Mitsch idea of switching to package detection based on name/version could help but it’d be nice to test this theory of the “meta” package. Which I think @John_Mitsch is going to do.

@ohadlevy Thanks for clarifying, I’ll test out the meta package idea in my sandbox repo, which will give us an idea of how webpack treats a package like this.

One clarifying question though, if this package is unpublished, how are we making it available? Should we use a method similar to the foremanReact package we make available now?

I think you can use a GitHub reference in your package.json file. As long
as you can point to a branch I think we can avoid publishing?

1 Like

AFAIK you can install directly from a github branch:
npm install --save username/repo#branch-name

1 Like

@ohadlevy @ehelms I created a branch in my sandbox repo to test out using a meta package to hold dependencies. It seems to fit our needs well.

I wrote up a walkthrough of the functionality in the README to show how webpack is treating the meta package in the bundles.

There are some details to consider and I’m not sure how this will fit in our build process, but this at least a good jumping off point for more discussion. Let me know your thoughts, hope this helps!

1 Like