Tackling the new npm@3 dependency tree
February 25, 2016
Until recently, Snyk’s CLI tool only supported npm@2. That all changed when we released snyk@1.9.0 and added full support for the new npm@3 directory structures.
We wanted to share some of the technical challenges involved and the new tooling that came out of the process.
What’s different about npm@3
With npm@2, node dependencies would be installed into the node_modules directory of each respective node package. For example, the directories for the request module look like this:
request/node_modules
├── aws-sign2
├── aws4
│   └── node_modules
│       └── lru-cache
│           └── test
├── bl
│   ├── node_modules
│   │   └── readable-stream
│   │       ├── doc
│   │       │   └── wg-meetings
│   │       └── node_modules
│   │           ├── core-util-is
...snipped
As you can see from the snippet above, the node_modules directory already appears a number of times. This isn’t so bad, but it can lead to a lot of duplication. Popular utilities like lodash can appear in a project many, many times!
npm@2 already does some work to de-duplicate this, but npm@3 goes further and flattens the directory structure wherever it can, in a bid to remove the duplication. The same request package looks like this with npm@3:
request/node_modules
├── ansi-regex
├── ansi-styles
├── asn1
│   └── lib
├── assert-plus
├── async
│   └── lib
├── aws-sign2
├── aws4
├── bl
│   └── test
...snipped
As you can see, it’s completely different, but importantly the way node requires modules is not affected at all.
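The reason require is unaffected is that Node looks for a package by walking up from the requiring file's directory and checking each node_modules directory on the way. The snippet below is a simplified sketch of that lookup order, not Node's actual implementation: the function name lookupDirs and the example path are illustrative, and it ignores NODE_PATH, global folders, and file or extension resolution.

```javascript
'use strict';
const path = require('path');

// List the node_modules directories that would be checked, nearest first,
// when a file inside `fromDir` calls require('some-package').
// POSIX-style paths are used so the output is the same on every platform.
function lookupDirs(fromDir) {
  const dirs = [];
  let dir = path.posix.resolve('/', fromDir);
  while (true) {
    // A directory that is itself called node_modules is skipped, so
    // node_modules/node_modules is never searched.
    if (path.posix.basename(dir) !== 'node_modules') {
      dirs.push(path.posix.join(dir, 'node_modules'));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return dirs;
}

// Code inside form-data that requires a dependency searches these, in order:
console.log(lookupDirs('/request/node_modules/form-data'));
// [ '/request/node_modules/form-data/node_modules',
//   '/request/node_modules',
//   '/node_modules' ]
```

A dependency nested under form-data/node_modules (npm@2 style) is found at the first entry, and the same dependency hoisted into request/node_modules (npm@3 style) is found at the second, so the requiring code does not need to change.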
How does this affect Snyk?
To start with, the CLI package walking logic needed a complete rewrite. Originally Snyk would walk your node_modules directory, then iterate through each sub-directory and build up a tree representation of your packages. Relatively simple really.
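A directory walk of this kind can be sketched roughly as follows. This is a minimal illustration and not Snyk's actual code: the function name readPhysicalTree and the shape of the returned object (name, version, declared, installed) are assumptions made for the example, and scoped packages and symlinks are ignored for brevity.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

// Read the package.json in `dir`, then recurse into dir/node_modules.
// The result mirrors what is on disk: `installed` holds whatever packages
// physically sit in this package's node_modules directory.
function readPhysicalTree(dir) {
  const pkg = JSON.parse(
    fs.readFileSync(path.join(dir, 'package.json'), 'utf8')
  );
  const node = {
    name: pkg.name,
    version: pkg.version,
    declared: Object.keys(pkg.dependencies || {}), // what it asks for
    installed: {},                                 // what sits beneath it
  };

  const modulesDir = path.join(dir, 'node_modules');
  if (!fs.existsSync(modulesDir)) return node;

  for (const entry of fs.readdirSync(modulesDir)) {
    // Skip .bin and other dot-directories; @scope folders are also
    // skipped in this sketch.
    if (entry.startsWith('.') || entry.startsWith('@')) continue;
    const childDir = path.join(modulesDir, entry);
    if (!fs.existsSync(path.join(childDir, 'package.json'))) continue;
    const child = readPhysicalTree(childDir);
    node.installed[child.name] = child;
  }
  return node;
}

// Usage (run from a project root that has a package.json):
//   console.log(JSON.stringify(readPhysicalTree(process.cwd()), null, 2));
```

With npm@2 the nesting of `installed` happens to match the dependency relationships. With npm@3 most packages end up as siblings under the root, which is why a walk like this is no longer enough on its own.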
Except now, the flat directory structure with npm@3 does not represent your package relationships at all. For example, the async package in the npm@3 listing above is actually a dependency of the form-data package, which in turn is a dependency of request. But you can’t see that from the file tree.
So Snyk’s package resolution has been completely rewritten and extracted out into a standalone module called snyk-resolve-deps (open source under an Apache 2 license).
This module is used inside the snyk CLI tool but can also be installed as a standalone CLI tool (installing it with npm install -g snyk-resolve-deps gives you a utility called snyk-resolve).
snyk-resolve-deps works in two stages. It first passes through the entire directory structure, building up the physical tree. This physical tree is then passed to the next stage, which creates a logical tree: the structure that represents where packages can be loaded from.
This means that both npm@2 and npm@3 directory structures are supported, and both produce a logical tree that looks like this:
❯ snyk-resolve
request@2.69.1
├── aws-sign2@0.6.0
├─┬ aws4@1.2.1
│ └── lru-cache@2.7.3
├─┬ bl@1.0.2
│ └─┬ readable-stream@2.0.5
│   ├── core-util-is@1.0.2
...snipped
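The physical-to-logical step can be sketched as below. This is an illustration of the idea only, not the snyk-resolve-deps implementation: the function names and the node shape (name, version, declared, installed) are assumptions for the example, and the two sample layouts are hand-written, with lru-cache assumed to be hoisted in the flat one. Each declared dependency is resolved the way require would find it: first among the packages installed directly beneath the dependent, then beneath each physical ancestor, nearest first.

```javascript
'use strict';

// Find `name` starting from the last package in `chain` and walking up.
// Returns the package found plus the physical ancestors of that package.
function resolveFrom(name, chain) {
  for (let i = chain.length - 1; i >= 0; i--) {
    const found = chain[i].installed[name];
    if (found) return { node: found, parents: chain.slice(0, i + 1) };
  }
  return null; // declared but not installed
}

// Build the logical tree: who depends on whom, regardless of disk layout.
function toLogicalTree(node, parents = [], visiting = new Set()) {
  const logical = { name: node.name, version: node.version, dependencies: {} };
  if (visiting.has(node)) return logical; // circular dependency: stop here
  const chain = parents.concat(node);
  const nextVisiting = new Set(visiting).add(node);
  for (const depName of node.declared) {
    const hit = resolveFrom(depName, chain);
    if (hit) {
      logical.dependencies[depName] =
        toLogicalTree(hit.node, hit.parents, nextVisiting);
    }
  }
  return logical;
}

const pkg = (name, version, declared, installed = {}) =>
  ({ name, version, declared, installed });

// npm@2 style: lru-cache is nested under aws4.
const nested = pkg('request', '2.69.1', ['aws4'], {
  aws4: pkg('aws4', '1.2.1', ['lru-cache'], {
    'lru-cache': pkg('lru-cache', '2.7.3', []),
  }),
});

// npm@3 style: lru-cache sits next to aws4 (assumed hoisted).
const flat = pkg('request', '2.69.1', ['aws4'], {
  aws4: pkg('aws4', '1.2.1', ['lru-cache']),
  'lru-cache': pkg('lru-cache', '2.7.3', []),
});

console.log(JSON.stringify(toLogicalTree(flat), null, 2));
```

Both layouts produce the same logical tree, with lru-cache appearing under aws4 and not as a direct dependency of request, even though the flat layout places it at the top level on disk.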
Ultimately, this means Snyk can now support your npm@3-installed projects just as well as npm@2 ones. We’re able to find the correct paths for patching and to report all the vulnerable paths accurately.
How is this different from ‘npm ls’?
If you’re familiar with npm’s tools, you will have heard of npm ls, which is useful for viewing these trees.
The big difference between snyk-resolve-deps and npm ls is that our tool shows the complete logical tree, including every route through which a package entered your project.
If we look at how the request package is included in the npm code (on the 2.x branch), we can see that npm ls tells us one story (of what’s available on disk):
Our own method of resolving tells a different story. This doesn’t mean that npm ls is wrong; rather, we want to know exactly what is loading request, and our own snyk-resolve-deps can give us that:
As you can see, the request module is depended upon by many more packages than you might initially think. If that package had a vulnerability, Snyk would now know every vulnerable path.
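Given a logical tree, listing every route to a package is a short recursive walk. The sketch below is illustrative only: the function name pathsTo, the tree shape (name, version, dependencies), and the sample project (my-app and some-lib, with made-up versions) are all hypothetical and are not taken from npm's real dependency tree.

```javascript
'use strict';

// Collect every dependency path from the root of a logical tree down to
// any occurrence of the package called `target`.
function pathsTo(tree, target, trail = []) {
  const here = trail.concat(`${tree.name}@${tree.version}`);
  // The root itself does not count as a route to the target.
  const paths = tree.name === target && trail.length > 0 ? [here] : [];
  for (const dep of Object.values(tree.dependencies)) {
    paths.push(...pathsTo(dep, target, here));
  }
  return paths;
}

// Hypothetical project that pulls in request both directly and through
// another dependency.
const request = { name: 'request', version: '2.69.1', dependencies: {} };
const project = {
  name: 'my-app',
  version: '1.0.0',
  dependencies: {
    request: request,
    'some-lib': {
      name: 'some-lib',
      version: '1.0.0',
      dependencies: { request: request },
    },
  },
};

for (const route of pathsTo(project, 'request')) {
  console.log(route.join(' > '));
}
// my-app@1.0.0 > request@2.69.1
// my-app@1.0.0 > some-lib@1.0.0 > request@2.69.1
```

On disk, a flattened install would hold a single copy of request here, but both routes matter when deciding which dependents are exposed to a vulnerability in it.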
snyk-resolve-deps
The package snyk-resolve-deps is available today, under an Apache 2 open source license. Once installed globally, it’s available under the alias of snyk-resolve.
The CLI utility has a number of filters and flags, including a --disk view (which reports very similarly to npm ls), and --filter X and --count X to filter and find occurrences of a specific dependency. You can find out more with snyk-resolve --help.
You can npm install -g snyk today and anonymously test any package or GitHub repo. With a free account, you can also start monitoring your projects for vulnerabilities.