Exponential increase in compile time and memory usage #38339

Closed
squidfunk opened this issue May 5, 2020 · 38 comments
Labels: Needs Investigation (This issue needs a team member to investigate its status.)

squidfunk commented May 5, 2020

This bug report is a follow-up to this tweet, where I wrote about OOM crashes in 3.9.1-rc. @DanielRosenwasser asked me to report an issue, but I find it really hard to narrow it down.

I have a project with ~2k lines of code. It uses another internal package with ~18k lines of code. The problem is that the compiler doesn't terminate anymore. I then tried to compile a subset of the project (which is a recursive descent parser, BTW), which runs in under 6 seconds. If I add more and more files, it gets slower and slower, to the point where it runs into OOM. The interesting thing is that there's no correspondingly big bump in files, lines, types, instantiations, or symbols:

Files:             1588      1597      1600
Lines:            65199     80674     81049
Nodes:           278678    349628    350339
Identifiers:      90620     91032     91288
Symbols:          64556    465900    478162
Types:             3027    109800    134328
Instantiations:   31435    115295    222586
Memory used:    125785K   295562K   952063K
I/O read:         0.19s     0.13s     0.90s
I/O write:        0.02s     0.05s     0.06s
Parse time:       3.78s     3.27s     4.77s
Bind time:        0.80s     0.78s     1.07s
Check time:       0.96s     5.37s    94.43s
Emit time:        0.25s     0.74s     0.77s
Total time:       5.79s    10.15s   101.04s

Yes, there are twice as many instantiations, which I assume results from the use of rather large discriminated unions (e.g. when restructuring them), but the increase in memory is almost three times as high. The CPU profile when compiling looks like this:

[CPU profile screenshot]

My current guess is that the exponential increase is somehow related to Instantiations ⨉ Instantiations, as most of the time is spent in someTypeRelatedToType.

I tried breaking it down to a reproducible case, but as the project with which I'm experiencing this issue is not open source, stripping everything down takes too much time. I understand that this is not an actionable error report, but maybe there's an obvious reason why this recent change in behavior may be causing this. If not, feel free to close it.

TypeScript Version: 3.9.1-rc
Search Terms: crash, out of memory, oom

Code

Could not be provided

Expected behavior:

TypeScript compiles the code

Actual behavior:

TypeScript runs out of memory

UPDATE:

I managed to get all packages to build with 3.9.1 by removing mapped discriminated union types, i.e. functions with signatures like function<T extends FooKind>(kind: T): FooLike<T>, where FooLike<T> uses conditionals to select a specific type from a huge union with ~1k types. Build times now are:

package-c          3.8.3     3.9.1

Files:              1626      1628
Lines:             84005     84000
Nodes:            357333    357139
Identifiers:       93411     93314
Symbols:          488307    487405
Types:            128763    124300
Instantiations:        -    167777
Memory used:    1092025K   979008K
I/O read:          0.17s     0.33s
I/O write:         0.13s     0.22s
Parse time:        3.29s     5.44s
Bind time:         1.20s     1.64s
Check time:       60.46s    92.90s
Emit time:         2.41s     2.85s
Total time:       67.37s   102.82s

package-b          3.8.3     3.9.1

Files:              1588      1590
Lines:             64154     64149
Nodes:            276974    276780
Identifiers:       89681     89584
Symbols:           64608     64983
Types:              3484      3621
Instantiations:        -      9232
Memory used:     131259K   130999K
I/O read:          0.25s     0.22s
I/O write:         0.10s     0.08s
Parse time:        4.70s     4.62s
Bind time:         0.99s     1.14s
Check time:        0.99s     1.12s
Emit time:         1.16s     1.38s
Total time:        7.83s     8.26s

package-a          3.8.3     3.9.1

Files:              1577      1579
Lines:            115025    115020
Nodes:            329307    329113
Identifiers:       88729     88632
Symbols:          154484    151390
Types:             45980     45135
Instantiations:        -     36085
Memory used:     372082K   274693K
I/O read:          0.27s     1.54s
I/O write:         3.98s     3.98s
Parse time:        6.17s     7.64s
Bind time:         1.63s     1.77s
Check time:       21.55s     4.72s
Emit time:        13.94s    15.12s
Total time:       43.29s    29.25s
amcasey (Member) commented May 5, 2020

@squidfunk That's really interesting. When you were ramping up the amount of code included, did you have the option of adding things back in different orders or are the dependencies pretty strict? I'd be interested to know whether something's growing quadratically (or worse) and any straw could be the one to break the camel's back or whether there was something interesting in that final delta.

Another thing that might be interesting to try (though you may have already - your notes are really thorough) would be to (temporarily) use --max-old-space-size to increase the available memory and see if the whole thing compiles. From your description, it sounds like usage blows up so quickly that no amount of space would save it, but sometimes you see interesting things when the compilation runs to completion.

Also, which version of node are you running?

squidfunk (Author) commented May 6, 2020

Thanks for your quick response! I'm running Node 13.6.0. As you suggested, I tried increasing --max-old-space-size, to no avail (tried 1024 and 2048). However, I tried something different - but before I explain, some more detail on my use case:

As I said, I'm building a fully-typed recursive descent parser for the CSS grammar. For example, this is the type of the AST node for the border property:

export interface BorderPropertyValue {
  type: ValueType.PROPERTY
  kind: "border"
  data: Array<LineWidthValue | LineStyleValue | ColorValue>
}

Currently, I have around 1,000 of those interfaces, which I assemble into a huge union:

export type PropertyValue =
  ...
  | BorderPropertyValue
  | BorderBlockPropertyValue
  | BorderBlockColorPropertyValue
  ...

Now, as the parser is based on grammars (and the resulting automata), I'm doing something like this:

 export type PropertyValueLike<
   T extends PropertyValue["kind"],
   U = PropertyValue
 > =
   U extends { kind: T }
     ? U
     : never

This allows me to write functions like:

declare function parsePropertyValue<T extends PropertyKind>(
  kind: T, data: string
): PropertyValueLike<T> | undefined

... which is super awesome, as the following call will give me the correct type:

const border = parsePropertyValue("border", "1px solid red")
// => BorderPropertyValue | undefined

What I tried now: if I generate an interface with all values of PropertyKind as members pointing to their respective values, the code compiles and check time goes down to 17s:

export interface PropertyValueMap {
  ...
  ["border"]: BorderPropertyValue
  ["border-block"]: BorderBlockPropertyValue
  ["border-block-color"]: BorderBlockColorPropertyValue
  ...
}

export type PropertyLike<
  T extends PropertyKind
> = PropertyValueMap[T]
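
Usage then stays the same, but resolution becomes a single keyed lookup (a sketch; parsePropertyValueFromMap is a made-up name):

declare function parsePropertyValueFromMap<T extends PropertyKind>(
  kind: T, data: string
): PropertyLike<T> | undefined

const border2 = parsePropertyValueFromMap("border", "1px solid red")
// => BorderPropertyValue | undefined - one indexed access instead of
//    distributing a conditional over ~1,000 union members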

I don't really know how to proceed here, as I need mapped conditional types for derived types of PropertyValue, but I feel that the TypeScript compiler can't cope with this amount of complexity.

squidfunk (Author) commented May 6, 2020

One more interesting thing – if I nest calls with those mapped types, instantiations and check time increase to a huge degree. This could be the cause of the exponential increase we're seeing:

Case 1:

export function parseX1<T extends PropertyKind>(): PropertyValueLike<T> {
  return {} as PropertyValueLike<T>
}

export function parseX2<T extends PropertyKind>(): PropertyValueLike<T> {
  return {} as PropertyValueLike<T>
}

export function parseX3<T extends PropertyKind>(): PropertyValueLike<T> {
  return {} as PropertyValueLike<T>
}

export function parseX4<T extends PropertyKind>(): PropertyValueLike<T> {
  return {} as PropertyValueLike<T>
}

Case 2:

export function parseX1<T extends PropertyKind>(): PropertyValueLike<T> {
  return parseX2()
}

export function parseX2<T extends PropertyKind>(): PropertyValueLike<T> {
  return parseX3()
}

export function parseX3<T extends PropertyKind>(): PropertyValueLike<T> {
  return parseX4()
}

export function parseX4<T extends PropertyKind>(): PropertyValueLike<T> {
  return {} as PropertyValueLike<T>
}

Result:

                  Case 1    Case 2

Files:              1629      1629
Lines:             85292     85292
Nodes:            363476    363464
Identifiers:       94611     94608
Symbols:          495374    495370
Types:            151530    154508
Instantiations:   289778    328368
Memory used:    1007317K  1117923K
I/O read:          0.15s     0.16s
I/O write:         0.15s     0.15s
Parse time:        3.74s     3.17s
Bind time:         0.90s     0.91s
Check time:       51.16s   100.72s
Emit time:         2.22s     3.02s
Total time:       58.02s   107.82s

+38,590 instantiations! Is there a way to tell the compiler not to evaluate all possible cases and keep it generic, i.e. evaluate lazily?

squidfunk (Author) commented:

I'm sorry for flooding this topic, but I'm trying to provide more insights as I gain them. I brought check times way down on certain code paths by introducing what I'd call happy paths, i.e. another level of indirection:

type Resolve<
  T extends PropertyKind,
  U = PropertyValue
> = U extends { kind: T }
  ? U
  : never

export type PropertyValueLike<
  T extends PropertyKind
> = PropertyKind extends T
  ? PropertyValue
  : Resolve<T>

Now, if the type is not specified, the union type is just returned. This brings the number of instantiations from my previous comment way down.
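
To illustrate (a sketch; parse stands in for the real parser functions): the PropertyKind extends T check succeeds exactly when T is not narrowed, so the whole union is returned in one step instead of distributing Resolve over ~1,000 members:

declare function parse<T extends PropertyKind>(kind: T): PropertyValueLike<T>

declare const someKind: PropertyKind
const wide = parse(someKind)     // T = PropertyKind → happy path → PropertyValue
const narrow = parse("border")   // T = "border" → Resolve<"border"> → BorderPropertyValue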

squidfunk (Author) commented May 6, 2020

I continued iterating section by section and have brought check time down to 7 seconds for the complete project! However, there's still a very odd case:

Case 1:

export function parseDeclarationHack(
  data: Tokenizer | string
): Declaration | undefined {
  let declaration: Declaration | undefined
  return declaration
}

Case 2:

export function parseDeclarationHack(
  data: Tokenizer | string
): Declaration | undefined {
  let declaration: Declaration | undefined
  return typeof data === "string" 
    ? declaration
    : declaration
}

Result:

Files:             1629     1629
Lines:            84064    84064
Nodes:           357331   357327
Identifiers:      93383    93383
Symbols:         488675   488675
Types:           131501   131501
Instantiations:  218010   218010
Memory used:    337297K  685474K
I/O read:         0.19s    0.14s
I/O write:        0.13s    0.16s
Parse time:       3.92s    3.58s
Bind time:        0.97s    1.32s
Check time:       6.96s   29.04s
Emit time:        2.02s    1.99s
Total time:      13.88s   35.93s

Of course that condition is nonsensical, but I reduced it to this case, and I don't understand why a simple conditional that returns the exact same value in both branches increases check time from 7s to 30s. Also, note that the number of types and instantiations is the same.

Adding as any to any of the branches brings it down to 7s again. What the heck. So, I don't know how to move forward on this topic. It's super, super strange. If one of you wants to pair with me, I'm happy to assist via vidchat. As I said, I'm afraid I cannot share the code 😞
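
For reference, this is the escape hatch applied to Case 2 (one as any in either branch suffices):

export function parseDeclarationHack(
  data: Tokenizer | string
): Declaration | undefined {
  let declaration: Declaration | undefined
  return typeof data === "string"
    ? (declaration as any) // keeps the checker from relating the two
    : declaration          // branches against the huge union - back to ~7s
}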

squidfunk (Author) commented:

I managed to reproduce the exponential increase when branching using the Preact JSX typings:
https://github.com/squidfunk/typescript-issue-38339-repro

amcasey (Member) commented May 6, 2020

Wow, that's a lot of info! I'll do my best to respond in an orderly way.

I'm interested in PropertyValueMap and PropertyLike, but I'm blanking on where they fit into your original code. What am I missing?

Edit: Figured it out - it takes the place of PropertyValueLike. Yes, doing it as a lookup should be much faster than iterating over the whole union.

In your ParseX example, my (unconfirmed) guess is that you're actually seeing more laziness than you want - if the generics didn't prevent things from being evaluated eagerly, they wouldn't explode when they eventually are. Also, I realize your example is intentionally contrived, but there's no way to infer a type argument that only appears in the return type and I suspect that's making things worse than they would be in normal practice.

Edit: My mistake - the return type is inferred from the contextual type (forgot which language I was in 😉), probably very expensively.

Edit 2: If you specify the type arguments explicitly, does the second version of ParseX get faster?

Do you ever pass a second type argument to Resolve or is it always the default? Also, I'm not sure how we could generalize that fix. The compiler could probably recognize the case where the type argument is exactly the constraint type, but the conclusion (in this case, that you can just use PropertyValue) seems to depend on outside knowledge.

Your reduced repro is really fun. 😄

@amcasey
Copy link
Member

amcasey commented May 6, 2020

Just to manage expectations, my current read on the problem is that TS doesn't provide a way to efficiently express the behavior you want and changes to the expressiveness of a programming language take time. In the meantime, you may need to trade off between correctness and perf (or find clever workarounds, as you seem to have done).

We're still looking at it and there may just be a bug that we can fix, but I wanted to warn you that there might not be a satisfactory answer in the short-term.

squidfunk (Author) commented:

> Just to manage expectations, my current read on the problem is that TS doesn't provide a way to efficiently express the behavior you want and changes to the expressiveness of a programming language take time. In the meantime, you may need to trade off between correctness and perf (or find clever workarounds, as you seem to have done).

I understand completely. I don't have any expectations. I was asked to open an issue for my observations, so I did. I also wanted to provide my insights as I debugged the situation. I've been using TypeScript for 2.5 years now, and it's the first time I've run into these kinds of problems with the compiler.

> Edit: My mistake - the return type is inferred from the contextual type (forgot which language I was in 😉), probably very expensively.

But why? If the return type of parseX2 is exactly the same as in parseX1, which calls parseX2, and the type parameters have the same constraints, why isn't it just "handed down"? Why is it instantiated?

> Edit 2: If you specify the type arguments explicitly, does the second version of ParseX get faster?

Do you mean parseX2<T>() etc.? Actually, yes! It does seem to bring down the number of instantiations. Note that the big drop in compilation time is due to the two sections in the code which I found and where I fled to any, so 3.9.1 has big improvements overall:

                 3.8.3   3.9.1-rc
                   All     Case 1     Case 2      + <T>
                                  
Files:            1627       1629       1629       1629
Lines:           84094      84090      84090      84090
Nodes:          357108     356928     356916     356922
Identifiers:     93508      93414      93411      93414
Symbols:        488382     492630     491639     489657
Types:          145922     147336     145354     135456
Instantiations:      -     285166     282223     234763
Memory used:   616019K    348405K    340213K    332866K
I/O read:        0.15s      0.15s      0.26s      0.17s
I/O write:       0.14s      0.15s      0.14s      0.16s
Parse time:      3.28s      3.46s      5.87s      4.00s
Bind time:       0.96s      1.05s      1.26s      1.09s
Check time:     24.39s      6.82s      8.11s      6.40s
Emit time:       1.90s      1.99s      1.99s      2.01s
Total time:     30.52s     13.31s     17.24s     13.50s

> Do you ever pass a second type argument to Resolve or is it always the default? Also, I'm not sure how we could generalize that fix. The compiler could probably recognize the case where the type argument is exactly the constraint type, but the conclusion (in this case, that you can just use PropertyValue) seems to depend on outside knowledge.

Actually, no! The second argument is just there to enable the actual narrowing. If I didn't introduce it and used PropertyValue directly in place of U, the conditional wouldn't distribute, and everything would just resolve to PropertyValue. It has the nice property of defaulting to PropertyValue – the union type – when the type cannot be narrowed. However, this is now handled by the happy paths when the type parameters are not constrained.

> Your reduced repro is really fun. 😄

The repro code is complete garbage, but it clearly shows the problem with branching, which is very surprising to me and might also be to others.

> Just to manage expectations, my current read on the problem is that TS doesn't provide a way to efficiently express the behavior you want and changes to the expressiveness of a programming language take time. In the meantime, you may need to trade off between correctness and perf (or find clever workarounds, as you seem to have done).

That's exactly what I do, but I wasn't aware of the necessity before. No hard feelings, though. If it's not possible to solve, it's fine. After fixing those weird places where I had to use any as an escape hatch within a function (so no big deal), the places that needed fixing turned out to be very, very few. In fact, only two. However, it took quite some time to pin those situations down.

I'm using generics a lot. I love them. For my undertaking, I've developed an internal standard library with algorithms for trees, automata, graphs, etc., and they're all generic. This is also why I'm seeing this problem: I need to preserve typing when I call those generic functions.

> We're still looking at it and there may just be a bug that we can fix, but I wanted to warn you that there might not be a satisfactory answer in the short-term.

It would be amazing if the compiler could tell you where the hot spots in your code are, i.e. where it spends the check time. In my case it was the expansion and contraction of large unions. This would be a big, big improvement. My experience digging into this topic showed me that the tooling isn't quite there yet. I understand that this might not have been necessary in the past, but as more and more projects adopt generics, they may run into the same problems, especially when they upgrade to 3.9.1 and see the same behavior I was seeing. Think of packages like ramda or rxjs, which have extraordinary typings and are all generic, as they are utility libraries. So compiler hot spots - that'd be awesome!

Thanks for your time!

amcasey (Member) commented May 7, 2020

Disclaimer: I'm not intimately familiar with these parts of the TypeScript compiler, so you're getting my intuition based on other systems I've worked on.

> I understand completely. I don't have any expectations.

Users with no expectations and lots of details are my favorite!

> If the return type of parseX2 is exactly the same as in parseX1, which calls parseX2, and the type parameters have the same constraints, why isn't it just "handed down"?

That's not quite right - if parseX1<SomeType> calls parseX2<SomeType>, then they have the same return type. If parseX1<SomeType> calls parseX2 with no type argument, then it's only type inference (expensive) that tells us they're the same. That's why specifying the type argument helps so much.
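
A sketch of the difference, using the parseX functions from above:

export function parseX1<T extends PropertyKind>(): PropertyValueLike<T> {
  return parseX2<T>() // explicit: the return types are identical by construction
  // return parseX2() // implicit: inference has to relate two large
                      //   conditional types to prove they match
}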

> The repro code is complete garbage

Fun in the sense that it's a compact representation of the problem. I used to work with someone who called repros like this "jewels", because of how compact and elegant they are.

> However, it took quite some time to pin those situations down.

I'd love to hear more about how you pinned down the places you needed escape hatches. We're hoping to build a tool that will help users self-service in cases like this (i.e. where a general fix is unlikely in the short-term).

> I'm using generics a lot

Generics cause the compiler to check types more lazily, because a lot depends on how they are instantiated. I don't know if you've ever used Haskell, but laziness is great until it's not. At some point, keeping track of deferred work gets more expensive than just doing the work, but the compiler doesn't know that, so you need an out - ! in Haskell. I'm not aware of a comparable escape hatch in TS, but @DanielRosenwasser may have some ideas.

> It would be amazing if the compiler could tell you where the hot spots in your code are, i.e. where it spends the check time.

I've been looking for something like that (e.g. #37785), but caching and laziness make it surprisingly difficult to tie costs back to particular statements or libraries. That's why I'm really keen to hear more about how you found your hot spots.

amcasey (Member) commented May 7, 2020

@ahejlsberg This is the bug we were just talking about. @squidfunk might be able to tell you more about the intent of the code that was reduced to make that repro.

squidfunk (Author) commented May 8, 2020

> That's not quite right - if parseX1<SomeType> calls parseX2<SomeType>, then they have the same return type. If parseX1<SomeType> calls parseX2 with no type argument, then it's only type inference (expensive) that tells us they're the same. That's why specifying the type argument helps so much.

That makes sense. I guess I've got to check my code base for further potential gains.

> Fun in the sense that it's a compact representation of the problem. I used to work with someone who called repros like this "jewels", because of how compact and elegant they are.

By garbage I meant that it's nonsensical - but yes, it manages to reproduce the problem. I've struggled before to achieve that, but the Preact JSX typings seem to be sufficiently complex.

> I'd love to hear more about how you pinned down the places you needed escape hatches. We're hoping to build a tool that will help users self-service in cases like this (i.e. where a general fix is unlikely in the short-term).

I pretty much documented the process in this thread. When I got the OOM errors and my project refused to compile, I looked into how to debug tsc, skimmed the docs, and found the flags for diagnostics and CPU profile generation (awesome that they exist, BTW). The profiles showed that the time seemed to be spent type checking unions - and I knew I had enormous unions, because code completion had gotten worse over time and the time was spent in isRelatedToType. I've used the compiler API before to build a code generator (love it), so I knew some of the internals. Then I tried adding some console.log statements in tsc.js, but that didn't lead to any new insights.

As my project is recursive and there are some inevitable cycles due to the fact that I'm parsing a context-free grammar, I started by compiling only a very small subset of the files, best-guessing which functions could be potential bottlenecks in checking, and replacing them with (...args: any[]) => any stubs. When there was a drop in compilation time, I knew I was on the right track, so I reverted the functions and started commenting out stuff from their bodies. This is also how I found the ternary problem of #38399. That took many hours.
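
For illustration, a stub looked roughly like this (names made up):

// Original suspect function, temporarily disabled:
// export function parseValue(kind: PropertyKind, data: string): PropertyValue | undefined { ... }

// Untyped stand-in with the same name, so dependents still compile while
// the checker no longer has to relate the huge union types:
export const parseValue: (...args: any[]) => any = () => undefined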

That's basically what I did.

> Generics cause the compiler to check types more lazily, because a lot depends on how they are instantiated. I don't know if you've ever used Haskell, but laziness is great until it's not. At some point, keeping track of deferred work gets more expensive than just doing the work, but the compiler doesn't know that, so you need an out - ! in Haskell. I'm not aware of a comparable escape hatch in TS, but @DanielRosenwasser may have some ideas.

Never used it, but I'm quite curious. That deferring work can turn out to be more expensive in the end is quite intuitive, though. Every student knows that 😅

> I've been looking for something like that (e.g. #37785), but caching and laziness make it surprisingly difficult to tie costs back to particular statements or libraries. That's why I'm really keen to hear more about how you found your hot spots.

I've seen the compiler telling me "Union is too complex to represent" or something like that, and that is a very frustrating error, because it doesn't tell you how to solve it. In the end, using as any here (I feel dirty) solves it. However, I understand that diagnostics is probably an orthogonal problem to building a compiler with great performance.

squidfunk (Author) commented May 8, 2020

Another question: I have another rather small internal project with zero external dependencies. It uses the "es2017" and "esnext" lib definitions (set in tsconfig.json), and the only thing it actually uses from them is the Readonly mapped type. Yet, toggling skipLibCheck has a massive impact on compile time:

skipLibCheck      false    true

Files:              101     101
Lines:            34296   34296
Nodes:           157185  157185
Identifiers:      56737   56737
Symbols:          85308   40276
Types:            58262    1454
Instantiations:  436338    3544
Memory used:    123382K  76193K
I/O read:         0.03s   0.02s
I/O write:        0.02s   0.03s
Parse time:       1.21s   1.19s
Bind time:        0.63s   0.63s
Check time:      22.95s   0.26s
Emit time:        0.43s   0.44s
Total time:      25.23s   2.51s

Check time for 3.8.3 is at around 12s with skipLibCheck set to true.
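
For reference, the toggle being measured, as a minimal tsconfig.json sketch (lib entries from this comment; everything else omitted):

{
  "compilerOptions": {
    "lib": ["es2017", "esnext"],
    "skipLibCheck": true
  }
}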

@RyanCavanaugh RyanCavanaugh added the Needs Investigation This issue needs a team member to investigate its status. label May 8, 2020
@RyanCavanaugh RyanCavanaugh added this to the TypeScript 4.0 milestone May 8, 2020
amcasey (Member) commented May 9, 2020

> That's basically what I did.

I'm sorry you had to deal with that. I don't know whether it makes it better or worse, but that's more or less what we would have done ourselves, given a copy of your code. We'd really like to figure out a way to provide some profiling hints, but it's proved intractable so far.

Having said that, if you can spare the time, I'd love to know whether running this private build on your pre-improvement code would have highlighted the problems you figured out the hard way.

> I've seen the compiler telling me "Union is too complex to represent" or something like that, and that is a very frustrating error, because it doesn't tell you how to solve it

When there's a clear fix for an error we try to put it in the error message and/or provide a code fix. Unfortunately, I don't think there's a general purpose solution to this problem (though, as you point out, there might be a general purpose mitigation), but it might be interesting to post a discussion of how to approach the problem on the blog or in the handbook. @DanielRosenwasser?

> I understand that diagnostics is probably an orthogonal problem to building a compiler with great performance.

Error messages are one of the most important outputs of a compiler - if you think about it, most people can't type in error-free code on the first attempt, so nearly every successful compilation will be preceded by one or more failing compilations. As a result, producing clear, actionable diagnostics is really a top priority - especially for a compiler like tsc that leaves most input code untouched. So, yes, there's a balance we have to strike between running quickly and producing useful diagnostics (which tend to be expensive), but my personal feeling is that an unactionable diagnostic is basically a compiler bug.

> I have another rather small internal project with zero external dependencies

You can use --listFiles or --listFilesOnly to see a list of the files that were consumed. I think you'll find that a whole bunch of default library files were picked up.

> The only thing it actually uses from them is the Readonly mapped type. Yet, toggling skipLibCheck has a massive impact on compile time

--skipLibCheck causes all checking ("Check time" in the --extendedDiagnostics output) to be skipped for .d.ts files. If you don't pass that flag, then they're treated like any other input files - the whole thing is checked because the error could be anywhere. You might want to leave that checking on in CI, but, personally, I always disable it for my "inner loop".
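
For an inner-loop run that keeps the diagnostics output shown above, that boils down to something like (both are stock tsc flags):

tsc --extendedDiagnostics --skipLibCheck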

amcasey (Member) commented May 9, 2020

Oh, it's only tangentially related, but I wanted to plug this really nice blog post from @andrewbranch about another JSX.IntrinsicElements perf issue: https://blog.andrewbran.ch/polymorphic-react-components/

And, obviously, if he feels like jumping in with an alternative solution for this particular coding pattern, that would be great too. 😉

squidfunk (Author) commented:

> Oh, it's only tangentially related, but I wanted to plug this really nice blog post from @andrewbranch about another JSX.IntrinsicElements perf issue: https://blog.andrewbran.ch/polymorphic-react-components/

Great article! Thanks for sharing.

> Having said that, if you can spare the time, I'd love to know whether running this private build on your pre-improvement code would have highlighted the problems you figured out the hard way.

I removed the as any escape hatches; it compiles now, but I'm afraid the compiler still doesn't complain about the statements that introduce the complexity. It also uses 1.2GB of RAM, as opposed to 0.34GB after my fixes:

as any             with   removed
Files:             1582      1582
Lines:            83223     83223
Nodes:           354210    354231
Identifiers:      92745     92754
Symbols:         488402    487906
Types:           132101    146843
Instantiations:  170514    170215
Memory used:    338361K  1296494K
I/O read:         0.12s     0.14s
I/O write:        0.21s     0.24s
Parse time:       2.75s     3.03s
Bind time:        0.79s     0.87s
Check time:       5.98s    83.77s
Emit time:        2.85s     3.47s
Total time:      12.37s    91.14s

> When there's a clear fix for an error we try to put it in the error message and/or provide a code fix. Unfortunately, I don't think there's a general purpose solution to this problem (though, as you point out, there might be a general purpose mitigation), but it might be interesting to post a discussion of how to approach the problem on the blog or in the handbook. @DanielRosenwasser?

I think what we're looking at is the Instantiations² problem. From my knowledge of the TypeScript internals and the way the AST nodes are represented, I would've guessed that the compiler could detect the case when it's necessary to compare two unions and just do the math. For me, it's a union of ~1,000 types, so relating it to itself means on the order of 1,000,000 pairwise comparisons.

> You can use --listFiles or --listFilesOnly to see a list of the files that were consumed. I think you'll find that a whole bunch of default library files were picked up.

That's pretty handy. Indeed, a lot of default library files were picked up. However, only two files were added in 3.9.1: lib.esnext.string.d.ts and lib.esnext.promise.d.ts. If I remove esnext from compilerOptions.lib, compilation time drops by a few seconds, but it's still way worse than in 3.8.3:

               3.8.3    3.9.1  -esnext
Files:           102      104       92
Lines:         34563    34558    34096
Nodes:        158019   157825   157167
Identifiers:   57093    56996    56754
Symbols:       78449    85820    83352
Types:         54967    58376    56825
Instantiations:    -   436632   417334
Memory used: 118964K  133056K  130831K
I/O read:      0.02s    0.01s    0.02s
I/O write:     0.02s    0.02s    0.02s
Parse time:    0.82s    0.78s    0.82s
Bind time:     0.42s    0.46s    0.47s
Check time:    4.08s   13.80s   10.03s
Emit time:     0.29s    0.27s    0.36s
Total time:    5.61s   15.32s   11.69s

I'll see if I can reduce the issue to another reproducible case.

squidfunk (Author) commented:

After testing the --listFiles flag, I managed to isolate the problem. I have a monorepo that installs some dependencies in the root. One of these dependencies is @types/ramda. If I remove this dependency, compilation time drops, so it's definitely related to the ramda typings.

@types/ramda   without     with
Files:              92       95
Lines:           24714    33729
Nodes:           97467   165851
Identifiers:     35029    56504
Symbols:         34021    97727
Types:           10026    56313
Instantiations:  10639   343360
Memory used:    60292K  129163K
I/O read:        0.01s    0.01s
I/O write:       0.00s    0.00s
Parse time:      0.73s    0.85s
Bind time:       0.35s    0.53s
Check time:      1.11s   10.89s
Emit time:       0.06s    0.05s
Total time:      2.26s   12.33s

I understand that this is a downstream issue, but it may impact a lot of users. I'm not even using the dependency in the files that are compiled (only in other subpackages of the monorepo), but tsc picks it up as it's automatically resolved from @types.

AviVahl commented May 15, 2020

@squidfunk you can use the "types" field of tsconfig to limit it to only load specific @types packages. Bit of a hassle to manage, though.

amcasey (Member) commented May 18, 2020

> @squidfunk you can use the "types" field of tsconfig to limit it to only load specific @types packages. Bit of a hassle to manage, though.

I don't think that works for types that are included in a non-types npm package, but I could be mistaken.

andrewbranch (Member) commented:

It does! 😁

aecorredor commented:

@squidfunk +1 about @types/ramda. I notice exactly the same thing in my project (also a monorepo, with more than 2k lines). I'm wondering about using types in the tsconfig - wouldn't that mean that all ramda typings would be lost if the ramda types are excluded?

squidfunk (Author) commented:

@aecorredor I'm not defining the types in tsconfig.json, as it would mean you have to list all types explicitly; I'm relying on TypeScript's automatic resolution. And yes, if you didn't include @types/ramda, its typings would be missing/lost. However, you may try "skipLibCheck": true in tsconfig.json - it sped things up for me.

aecorredor commented Jun 23, 2020

@squidfunk unfortunately I already had "skipLibCheck": true in my project.

amcasey (Member) commented Jun 23, 2020

@aecorredor Do you have an example project we can play with?

aecorredor commented Jun 23, 2020

@amcasey it's a closed-source project. But let me see if I can put together a reproducible repo and send it to you guys. I'm swamped right now, but I'll do it as soon as I get a chance. By now I'm pretty sure it's related to the ramda types. It happens wherever I'm editing files where curry is being imported. If I remove the import, things go back to normal.

amcasey (Member) commented Jun 23, 2020

Thanks @aecorredor! I agree with your assessment that it's probably ramda, but that library has so much surface area that it's hard for us to pinpoint problems without a repro.

andrewbranch (Member) commented:

This is probably tangential to the problem at hand, but to clear up one point of possible confusion:

> I'm not defining the types in tsconfig.json, as it would mean you have to list all types explicitly.... And yes, if you didn't include @types/ramda, its typings would be missing/lost.

This is a very common misconception. When you set a types array, you don’t need to list all the types you’re using; you only need to list ones that you don’t reference any other way. For example, say your types array contains ramda and jest, and you have a source file like

import { compose } from 'ramda';

it('works', () => {
 // ...
});

Here, you’re explicitly referencing the ramda types, and implicitly referencing the jest types. Because you have an import from ramda, its entry in your types array is completely redundant—you can (and should, in my opinion) delete it. However, since you are not directly importing or triple-slash referencing (/// <reference types="jest" />) jest anywhere in your project, you do need to reference it in your types array for its globals to be visible in your program. Realistically, this usually means that you only need to list global-providing types in your types array.
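
A minimal tsconfig.json sketch of that setup (assuming jest is the only global-providing package; ramda comes in via the import):

{
  "compilerOptions": {
    "types": ["jest"]
  }
}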

As @squidfunk correctly pointed out, if you don’t specify types at all, the default behavior is to automatically include everything under node_modules/@types.

ahejlsberg (Member) commented:

@amcasey Anything you want me to do here?

amcasey (Member) commented Jun 30, 2020

@ahejlsberg This is loosely tracking the fact that ramda seems to have perf issues. We're hoping to hear back from @aecorredor with a repro. I can self-assign until it's more actionable.

@amcasey amcasey assigned amcasey and unassigned ahejlsberg Jun 30, 2020
aecorredor commented Jun 30, 2020

@ahejlsberg @aecorredor sorry guys, I've been swamped. Since the issue for us is happening in a big monorepo, coming up with a reproducible one won't be as easy as I thought. Also, just to make this clear, it's not an issue with ramda (since no types are shipped with the library) but with @types/ramda. Myself and others have confirmed that the issue goes away if we remove the types package. You can follow that convo here: #39284 (comment).

@amcasey amcasey closed this as completed Jun 30, 2020
@amcasey amcasey reopened this Jun 30, 2020
aecorredor commented:

@amcasey so, disappointing news: I spent a while trying to come up with a big enough monorepo for the problem to show up, and I just couldn't replicate it. Ours is maybe just too custom/too big, and has too many variables/configs that may be playing a role in causing the issue. I honestly don't have much more time to spend on coming up with the correct reproducible repo. What we're sure of is that it happens everywhere curry is imported, and that if we remove @types/ramda the issue goes away. Sorry I couldn't offer you guys a reproducible repo. If at some point we do pinpoint exactly where the issue is and can make a reproducible one, I'll report back. Thanks either way for being responsive in this thread.

amcasey (Member) commented Jul 1, 2020

@aecorredor Definitely disappointing, but I appreciate the effort. Thanks for following up!

aecorredor commented Jul 28, 2020

@amcasey someone might have found the repro that I couldn't: #39398

millsp (Contributor) commented Jul 30, 2020

I'll be working this year on more performant ways to write curry with overloads. In the meantime, if the actual Curry types cause problems, they should be reverted to what they used to be... But that will make some people scream! Or maybe TypeScript 4.0 will help fix these perf issues, since it introduces variadic tuple types (#39094), which are planned to be used in the upcoming ts-toolbelt@7.0.0. For info, this PR cut the type instantiation count by 60%.

Another thing I could do is try to further optimize the Curry types - not sure if that's possible, though. I would need a failing repro; I'm not able to reproduce the error at the moment. This would give me ground to work on optimizations... Thanks if someone can do this 👍

amcasey (Member) commented Jul 31, 2020

Woohoo! Does the update mentioned in #39398 work for you too, @squidfunk?

millsp (Contributor) commented Aug 2, 2020

So I've released another update for the Curry types, and it considerably improved performance. Until this last release, infer was not able to UnCurry such types because of their complexity (causing crashes). I think infer was revealing the very same problem this issue pointed out. Let me know!

k-yle (Contributor) commented Aug 3, 2020

@millsp wow, updating ts-toolbelt to 6.13.39 made tsc 10x faster! Our 35k SLOC project uses ramda extensively and previously took 160-180 seconds to compile, now just 17 seconds. Thank you!

squidfunk (Author) commented:

@amcasey – yes, it does 🚀 Check times dropped from 8.10s to 1.55s, which means I can disable skipLibCheck again. So for my part this issue can be closed, as there are now more specific issues. Thanks to all for the great collaboration on this topic!

Before

  • @types/ramda 0.27.6
  • ts-toolbelt 6.9.4
Files:                         148
Lines:                       37337
Nodes:                      175829
Identifiers:                 60365
Symbols:                    100265
Types:                       58395
Instantiations:             349483
Memory used:               134059K
Assignability cache size:    14438
Identity cache size:            19
Subtype cache size:             11
Strict subtype cache size:      24
I/O Read time:               0.02s
Parse time:                  0.42s
ResolveTypeReference time:   0.03s
ResolveModule time:          0.09s
Program time:                0.61s
Bind time:                   0.29s
Check time:                  8.10s
transformTime time:          0.09s
Source Map time:             0.01s
commentTime time:            0.03s
printTime time:              0.39s
Emit time:                   0.39s
I/O Write time:              0.03s
Total time:                  9.40s

After

  • @types/ramda 0.27.14
  • ts-toolbelt 6.13.39
Files:                         148
Lines:                       37468
Nodes:                      176001
Identifiers:                 60534
Symbols:                     93910
Types:                       41027
Instantiations:             123311
Memory used:               122212K
Assignability cache size:     7170
Identity cache size:            27
Subtype cache size:             11
Strict subtype cache size:      24
I/O Read time:               0.02s
Parse time:                  0.40s
ResolveTypeReference time:   0.02s
ResolveModule time:          0.09s
Program time:                0.58s
Bind time:                   0.29s
Check time:                  1.55s
transformTime time:          0.08s
Source Map time:             0.01s
commentTime time:            0.03s
printTime time:              0.33s
Emit time:                   0.33s
I/O Write time:              0.02s
Total time:                  2.75s

@amcasey amcasey closed this as completed Aug 3, 2020