Why is swc fast?
swc
is fast. Very fast. It's 18x faster than babel on a single-core benchmark, and on a parallel benchmark, it's 68x faster than babel on a 4 core (8 HT) machine. Why? Just because it's written in rust? No. Its fundamental design differs from any other tool.
#
The languageJust rewriting something in another language does not make it faster.
This is partially correct. Rewriting something in a language with an optimizing compiler can make difference. Some compilers including rustc, which uses llvm as a backend, are good at optimizing codes.
Edit: makes -> can make, created an optimization hints
section,
#
Optimization hintsSometimes, rustc cannot deduce that it's dead code even if it is dead. In swc, we give enough hint if it's the case.
For example, not all passes touch all of the ast. Many passes return the given ast node as-is. Especially, most passes have nothing to do with types. So those passes return the typings as-is.
So, it would be cool if those handlers are optimized out.
rustc
failed to optimize it out by default because it changes semantic, but we can help it.
See it on github.
Now, it's optimized out. However, what matters is allocations. Fold<Vec<T>>
is implemented as below.
move_map()
is a hack to prevent reallocation.
When Fold<T>
is an identity function, Fold<Vec<T>>
is cleary no-op.
Let's see the generated assembly for the code below. The link contains the actual implementation of move_map
.
Great. Although it's a bit longer than no-op, no allocation occurs and thus it's fast enough.
The same goes for the Box<T>
. See https://godbolt.org/z/5fDXQK if you want.
#
Design#
String cachingIdentifiers are inherently used multiple times. swc
utilizes string_cache from the servo project to cache strings. Common identifiers like Object
is stored as a constant.
#
Less scope analysisScope analysis is done only 2 ~ 4 times per file. One at the start of processing, one at the end of processing, one to strip type-only imports (typescript only), and the last one to transcompile it into the other type of modules. (cjs, amd, umd)
I'll explain how it works, and why it's fast.
First, before applying any other transformations, resolver
pass colors the identifiers.
Note that the symbol is not changed. ___
denotes the context number and it's not part of the symbol.
becomes
Then, other passes insert identifiers if required. There's a helper macro to create a private ident.
Let's suppose that a pass appended a private identifier named foo
to the top level.
It's now
and with expanded context number, it's
The last pass, named hygiene, removes context numbers and changes symbol to the appropriate one. It becomes
It means that passes between resolver
and hygiene
can inject identifiers as they want without any heavy operation.
For comparison, babel maintains the scope while transcompiling and uses it like scope.rename('require')
to ensure that the name does not conflict.
However, given the fact that babel does renaming directly instead of queueing, api like this means that all identifiers in the file are visited on each call. For common js module, babel does like
and it results in 5 scope analysis.
The basic idea of hygiene
is taken from the macro system of rustc. Note that no compiler works in this way and I call this approach identifier hygiene. (Please correct me if I'm wrong.)
Note that this will also help writing a fast bundler. While bundling modules, only a single invocation of the resolver pass is enough to distinguish identifiers from multiple modules. Hygiene pass is invoked only once at the end.
Edit: Added how babel works
#
No use of graph data structuresswc
does not use graph structures for some reason.
The most important one is the fact that rust is not a language with a garbage collector.
After parsing, we have ast node, which consists of some vectors.
To use a graph data structures like petgraph::DiGraph
, we need to destruct a vector and create a graph of nodes.
Also, as js code generator takes ast node, we have to reconstruct the nodes from graph structures.
This is an obvious overhead, and I wanted to avoid it.
Some graph traversal are possible with vectors, so I decided to go without graph data structures.
Edit: swc
uses dfs and bfs, so we can't say it does not do graph traversal
--