2025 August 10 - @zerotrickpony@messydesk.social
Typescript's duck-typing philosophy has many advantages, but sometimes it permits coercions that I'd rather have prevented at compile time. This is especially true when a subtype of a primitive type (like string or number) carries a semantic constraint that shouldn't permit coercion, even though it is implemented as a primitive. What techniques can we use to improve type safety of constrained primitive types?
Recently I wrote an application that works with a lot of absolute file path strings. These path strings are sprinkled all over the codebase, including in some inner loops where re-checking well-formedness and existence had a significant impact on the performance of an already-slow disk scanning tool. Something like:
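A minimal sketch of that situation, not the original tool's code. It swaps the real disk for a hypothetical in-memory table (FAKE_DISK) so it runs anywhere; checkValidDir, listdir, join, Counters, and the Item shape are illustrative stand-ins, and the validationCalls counter exists only to make the repeated checking visible:

```typescript
// Hypothetical stand-in for a directory entry (not a real library type).
interface Item {
  name(): string;
  isDirectory(): boolean;
}
const file = (n: string): Item => ({ name: () => n, isDirectory: () => false });
const dir = (n: string): Item => ({ name: () => n, isDirectory: () => true });

// Hypothetical in-memory "disk": directory path -> entries.
const FAKE_DISK: Record<string, Item[]> = {
  "/data": [file("a.txt"), dir("sub")],
  "/data/sub": [file("b.txt")],
};

let validationCalls = 0;

// Throws unless the string is an absolute path to a known directory.
function checkValidDir(path: string): void {
  validationCalls++;
  if (!path.startsWith("/") || !(path in FAKE_DISK)) {
    throw new Error(`not a valid directory: ${path}`);
  }
}

function listdir(path: string): Item[] {
  return FAKE_DISK[path] ?? [];
}

function join(dirPath: string, name: string): string {
  return `${dirPath}/${name}`;
}

class Counters {
  files = 0;
  countFile(_item: Item): void {
    this.files++;
  }
}

function recursiveScan(path: string, stats?: Counters): Counters {
  checkValidDir(path); // repeated at every level of the recursion
  stats = stats ?? new Counters();
  for (const item of listdir(path)) {
    if (item.isDirectory()) {
      recursiveScan(join(path, item.name()), stats);
    } else {
      stats.countFile(item);
    }
  }
  return stats;
}

const scanned = recursiveScan("/data");
```

In this sketch the check runs once per directory visited, even though only the very first path came from outside.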
Unfortunately recursiveScan is repeatedly validating that the path is well formed. This is wasted effort because I know that the user input was already sanitized, and join() on a known directory will always produce a valid directory path.
I want a Typescript type to encapsulate the idea of this validation being done already, so that downstream code can be assured that the file paths are valid at compile time. Maybe something like:
type DirPath = // ... ????

function parseUserInput(): DirPath {
  const str = getSomeInput();
  checkValidDir(str);
  return makeDirPathSomehow(str);
}

function recursiveScan(path: DirPath, stats?: Counters): Counters {
  // checkValidDir(path); // don't need this anymore
  stats = stats ?? new Counters();
  for (const item of listdir(path.toString())) {
    if (item.isDirectory()) {
      recursiveScan(join(path.toString(), item.name()), stats);
    } else {
      stats.countFile(item);
    }
  }
  return stats;
}

const dirpath: DirPath = parseUserInput();
const stats = recursiveScan(dirpath);
When recursiveScan is passed a valid DirPath, it should now be safe to omit the bounds checking on every recursion. How should DirPath be defined?
Naively, we could define DirPath simply as an alias to Typescript's primitive string type.
That's an obvious intended use of Typescript's alias feature, it avoids runtime performance overhead, and DirPaths can interoperate nicely with various file path operations like join().
But there is a snag:
// if DirPath is just an alias for string...
type DirPath = string;
// ...then this is NOT a compile error
recursiveScan(`invalid nonsense`, stats);
With this approach, any string will coerce to a DirPath, and we will not get the bounds checking assurance from the Typescript type checker. We could certainly assume that the DirPath alias is a sufficient signal to future maintainers that a constrained value is expected, but there is no enforcement. It is merely documentation, prone to mistakes.
And as a secondary annoyance, type analysis tooling like VSCode will show any variables and properties of alias DirPath as "string" in various hover cards and tooltips, instead of using the more descriptive name.
During a recent refactor of this code, this behavior of VSCode kept inducing me to re-check the definitions of my interfaces and worry that I had forgotten to fix the types of the fields. A bit annoying.
We could instead define DirPath as a full-fledged class, wrapping the string path data and perhaps also tracking some useful additional properties. This is a perfectly reasonable approach and is used by patterns like TypeID.
Here's how that could look:
// A wrapper object which encapsulates validity checking
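One way such a class might look, as a sketch under assumptions rather than a definitive implementation. The in-memory FAKE_DISK table and the Item shape are hypothetical stand-ins for real disk access; parse, join, and list are the helper names that the application code in this post calls:

```typescript
// Hypothetical stand-in for a directory entry (not a real library type).
interface Item {
  name(): string;
  isDirectory(): boolean;
}
const file = (n: string): Item => ({ name: () => n, isDirectory: () => false });
const dir = (n: string): Item => ({ name: () => n, isDirectory: () => true });

// Hypothetical in-memory "disk": directory path -> entries.
const FAKE_DISK: Record<string, Item[]> = {
  "/data": [file("a.txt"), dir("sub")],
  "/data/sub": [file("b.txt")],
};

class DirPath {
  private readonly path: string;

  // Private: the only ways to obtain a DirPath are parse() and join().
  private constructor(path: string) {
    this.path = path;
  }

  // Validates once, at construction time.
  static parse(str: string): DirPath {
    if (!str.startsWith("/") || !(str in FAKE_DISK)) {
      throw new Error(`not a valid directory: ${str}`);
    }
    return new DirPath(str);
  }

  // Assumes the caller passes the name of a child directory that list()
  // reported, so no re-validation is done here.
  join(name: string): DirPath {
    return new DirPath(`${this.path}/${name}`);
  }

  list(): Item[] {
    return FAKE_DISK[this.path] ?? [];
  }

  toString(): string {
    return this.path;
  }
}

const root = DirPath.parse("/data");
const sub = root.join("sub");
```

Because the class has a private member, a plain string is not assignable to DirPath, so passing `invalid nonsense` where a DirPath is expected is rejected by tsc. The cost is one object allocation per parse() or join() call.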
And now we can write application code that trusts that DirPaths are well-formed and already bounds checked by the Typescript type checker:
function parseUserInput(): DirPath {
  const str = getSomeInput();
  return DirPath.parse(str);
}

function recursiveScan(path: DirPath, stats?: Counters): Counters {
  stats = stats ?? new Counters();
  for (const item of path.list()) {
    if (item.isDirectory()) {
      recursiveScan(path.join(item.name()), stats);
    } else {
      stats.countFile(item);
    }
  }
  return stats;
}

// Validity checking is now encapsulated by DirPath
const dirpath: DirPath = parseUserInput();
const stats = recursiveScan(dirpath);

// and invalid strings give a nice typecheck error
recursiveScan(`invalid nonsense`, stats);
In this approach, Typescript will enforce agreement with the DirPath class throughout the code, and we can be assured that the bounds checking done at construction time need not be repeated in subsequent usage sites.
Unfortunately we'll have to write some helper functions like join to adapt our bespoke DirPath class to file path utilities that take strings. And we'll have the runtime overhead (speed, memory allocation) of the wrapper objects being created and referenced throughout the code.
Don't worry about the runtime performance cost of wrapper objects.
Well... what if we do want to worry about performance? I have some benchmark results on this below, but here is an idea:
Here's an approach that tsc will typecheck like a wrapper object, but has (almost) no runtime overhead:
// This type looks like a wrapper object during typechecking,
// but it's actually a primitive:
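A sketch of how such a type could be written, under stated assumptions. The protected dirPathBrand member is my own illustrative way of making plain strings unassignable to DirPath, and str() is a hypothetical helper for getting the raw string back; FAKE_DISK and Item are in-memory stand-ins for real disk access:

```typescript
// Hypothetical stand-in for a directory entry (not a real library type).
interface Item {
  name(): string;
  isDirectory(): boolean;
}
const file = (n: string): Item => ({ name: () => n, isDirectory: () => false });
const dir = (n: string): Item => ({ name: () => n, isDirectory: () => true });

// Hypothetical in-memory "disk": directory path -> entries.
const FAKE_DISK: Record<string, Item[]> = {
  "/data": [file("a.txt"), dir("sub")],
  "/data/sub": [file("b.txt")],
};

class DirPath {
  // Exists only for the type checker: a member that plain strings lack.
  // No instance of this class is ever constructed.
  protected readonly dirPathBrand!: never;
  private constructor() {}

  // Validates once, then hands back the same string under a different type.
  static parse(str: string): DirPath {
    if (!str.startsWith("/") || !(str in FAKE_DISK)) {
      throw new Error(`not a valid directory: ${str}`);
    }
    return str as unknown as DirPath;
  }

  // Assumes name is a child directory reported by list(); no re-validation.
  static join(path: DirPath, name: string): DirPath {
    return `${DirPath.str(path)}/${name}` as unknown as DirPath;
  }

  static list(path: DirPath): Item[] {
    return FAKE_DISK[DirPath.str(path)] ?? [];
  }

  // Unwraps back to the underlying primitive.
  static str(path: DirPath): string {
    return path as unknown as string;
  }
}

const root = DirPath.parse("/data");
const sub = DirPath.join(root, "sub");
const names = DirPath.list(sub).map((item) => item.name());
```

At runtime root and sub are ordinary strings, so nothing is allocated beyond the strings themselves. The type checker, however, sees a class with a member that strings do not have, and refuses a bare string where a DirPath is expected.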
In this approach, we abuse Typescript's coercion overrides to ask certain strings to be treated like DirPath classes during typechecking. Within the implementation of DirPath, we have various naughty coercions through as unknown so that we can appease Typescript's type checker, but no wrapper objects are ever actually allocated at runtime.
Application code looks nearly the same for a static wrapper as in the "wrapper objects" approach above, except that the helpers are static methods rather than member methods. Like:
function parseUserInput(): DirPath {
  const str = getSomeInput();
  return DirPath.parse(str);
}

function recursiveScan(path: DirPath, stats?: Counters): Counters {
  stats = stats ?? new Counters();
  for (const item of DirPath.list(path)) {
    if (item.isDirectory()) {
      // static helpers take the path as an explicit first argument
      recursiveScan(DirPath.join(path, item.name()), stats);
    } else {
      stats.countFile(item);
    }
  }
  return stats;
}

// Validity enforcement works the same as wrapper objects
const dirpath: DirPath = parseUserInput();
const stats = recursiveScan(dirpath);

// ...as do typecheck errors
recursiveScan(`invalid nonsense`, stats);
We still need to have some degree of runtime overhead to call wrapper functions like DirPath.join(), so this technique does not entirely avoid runtime overhead. But runtime allocation overhead is entirely avoided.
TLDR: wrapping string or number in objects costs a 1.2X to 5X slowdown or more, depending on environment. See below.
More details:
I ran some benchmarks on four Javascript VM environments, comparing the wrapper objects (B) approach to primitive types, and the static wrappers (C) approach to primitive types. Each bar represents the measured slowdown of that approach as compared to using primitive types:
Here a value of "1.0" indicates that the approach had no difference in its benchmark speed vs. the same code run on a primitive type. Both string and number primitives were compared to these two wrapper techniques. Wrapper objects imposed a performance penalty of between 1.2x and 5.0x or more, depending on the environment and workload.
The right side of the chart shows the same benchmarks except where a very large amount of memory was retained during the tests. This memory pressure made garbage collection overhead larger, and therefore shows a larger difference for the approaches which allocate objects.
(Note that these bars state relative performance of the techniques within each environment. It wasn't my intention to compare the absolute performance of the four environments. They did not perform similarly; the x64 machine was four times faster at the benchmarks than the Apple Silicon, and both of those environments were vastly faster than the rental cloud machine from Digital Ocean. For fun I also tested Firefox to see if Spidermonkey would perform differently from V8 on the same ARM processor.)
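For readers who want to try a comparison like this themselves, here is a deliberately tiny harness. It is not the benchmark behind the numbers above: BoxedCount, timeIt, and the iteration count are all illustrative assumptions, and a single short loop like this is easily distorted by JIT warm-up and optimization, so treat its output as a rough signal at best:

```typescript
// Wrapper-object flavor of a number: every operation allocates a new object.
class BoxedCount {
  readonly value: number;
  constructor(value: number) {
    this.value = value;
  }
  increment(): BoxedCount {
    return new BoxedCount(this.value + 1);
  }
}

// Times one run of fn. Returning fn's result keeps the loop from being
// trivially dead code.
function timeIt(fn: () => number): { ms: number; result: number } {
  const start = Date.now();
  const result = fn();
  return { ms: Date.now() - start, result };
}

const ITERATIONS = 1000000;

// Baseline: a plain primitive number.
const primitiveRun = timeIt(() => {
  let count = 0;
  for (let i = 0; i < ITERATIONS; i++) count = count + 1;
  return count;
});

// Same work through the wrapper object.
const boxedRun = timeIt(() => {
  let count = new BoxedCount(0);
  for (let i = 0; i < ITERATIONS; i++) count = count.increment();
  return count.value;
});

// Relative slowdown, where 1.0 would mean no measurable difference. The
// Math.max guards against dividing by zero when the baseline takes under 1 ms.
const slowdown = boxedRun.ms / Math.max(primitiveRun.ms, 1);
```

A more trustworthy measurement would repeat each run many times, discard warm-up iterations, and retain extra memory to reproduce the garbage collection pressure described above.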
This "static wrappers" pattern has good runtime performance, but is fairly clumsy. It has disadvantages like the as unknown coercions, which effectively turn off the type checker.
A future version of Typescript could offer more control over alias coercion behavior. As a strawman, imagine a new kind of type expression somewhere between an alias and a class, which typechecks like a class but passes through at runtime like a primitive. Something like:
// Magic class expression which makes a primitive at runtime,
// but prevents implicit widening
If this approach were possible, it would permit application code to use helpers with a more natural member method syntax, but still without incurring allocator costs:
function recursiveScan(path: DirPath, stats?: Counters): Counters {
  stats = stats ?? new Counters();
  for (const item of path.list()) {
    if (item.isDirectory()) {
      recursiveScan(path.join(item.name()), stats);
    } else {
      stats.countFile(item);
    }
  }
  return stats;
}
Since Typescript seems to prefer to be in the business of typechecking rather than compiling, this feature may never be offered in the core language. A code generator or preprocessor could potentially serve this purpose, but if it's not in the widely known language then it fails to avoid the "astonishment" problem.
It's also worth noting that other languages like Rust with more aggressive compilers may already have effective solutions to this problem, because encapsulated primitive types can often be flattened away during optimization.
Anyway, let me know if you had this need in your Typescript projects, and what you did about it! I'm always interested in learning more.