@@ -505,6 +505,128 @@ control bits specified by the ELF AMD64 ABI.
The x87 floating-point control word is not used by Go on amd64.

+ ### arm64 architecture
+
+ The arm64 architecture uses R0 – R15 for integer arguments and results.
+
+ It uses F0 – F15 for floating-point arguments and results.
+
+ *Rationale*: 16 integer registers and 16 floating-point registers are
+ more than enough for passing arguments and results for practically all
+ functions (see Appendix). While more registers are available, using
+ them provides little benefit. Additionally, doing so would add
+ overhead on code paths where the number of arguments is not statically
+ known (e.g. reflect calls), and would consume more stack space when
+ only limited stack space is available to fit in the nosplit limit.
+
+ Registers R16 and R17 are permanent scratch registers. They are also
+ used as scratch registers by the linker (Go linker and external
+ linker) in trampolines.
+
+ Register R18 is reserved and never used. It is reserved for the OS
+ on some platforms (e.g. macOS).
+
+ Registers R19 – R25 are permanent scratch registers. In addition,
+ R27 is a permanent scratch register used by the assembler when
+ expanding instructions.
+
+ Floating-point registers F16 – F31 are also permanent scratch
+ registers.
+
+ Special-purpose registers are as follows:
+
+ | Register | Call meaning | Return meaning | Body meaning |
+ | --- | --- | --- | --- |
+ | RSP | Stack pointer | Same | Same |
+ | R30 | Link register | Same | Scratch (non-leaf functions) |
+ | R29 | Frame pointer | Same | Same |
+ | R28 | Current goroutine | Same | Same |
+ | R27 | Scratch | Scratch | Scratch |
+ | R26 | Closure context pointer | Scratch | Scratch |
+ | R18 | Reserved (not used) | Same | Same |
+ | ZR | Zero value | Same | Same |
+
+ *Rationale*: These register meanings are compatible with Go’s
+ stack-based calling convention.
+
+ *Rationale*: The link register, R30, holds the function return
+ address at function entry. For functions that have frames
+ (including most non-leaf functions), R30 is saved to the stack in the
+ function prologue and restored in the epilogue. Within the function
+ body, R30 can be used as a scratch register.
+
+ *Implementation note*: Registers with fixed meaning at calls but not
+ in function bodies must be initialized by "injected" calls such as
+ signal-based panics.
+
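The register roles described above can be summarized in a small lookup. This is a minimal sketch for illustration only: `intRegRole` and its role strings are hypothetical names, not part of the Go toolchain, and the mapping simply restates the text and table for R0 through R30.

```go
package main

import "fmt"

// intRegRole returns the role that the section above assigns to integer
// register Rn on arm64. It is a hypothetical helper for illustration and
// covers R0 through R30 only.
func intRegRole(n int) string {
	switch {
	case n < 0 || n > 30:
		return "invalid"
	case n <= 15:
		return "argument/result"
	case n <= 17:
		return "scratch (also used by linker trampolines)"
	case n == 18:
		return "reserved"
	case n <= 25:
		return "scratch"
	case n == 26:
		return "closure context pointer"
	case n == 27:
		return "scratch (assembler temporary)"
	case n == 28:
		return "current goroutine"
	case n == 29:
		return "frame pointer"
	default: // n == 30
		return "link register"
	}
}

func main() {
	for _, n := range []int{0, 15, 16, 18, 19, 26, 27, 28, 29, 30} {
		fmt.Printf("R%d: %s\n", n, intRegRole(n))
	}
}
```

The floating-point side needs no such table: F0 through F15 carry arguments and results, and F16 through F31 are scratch.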
+ #### Stack layout
+
+ The stack grows down, and the stack pointer, RSP, is always aligned to
+ 16 bytes.
+
+ *Rationale*: The arm64 architecture requires the stack pointer to be
+ 16-byte aligned.
+
+ A function's stack frame, after the frame is created, is laid out as
+ follows:
+
+     +------------------------------+
+     | ... locals ...               |
+     | ... outgoing arguments ...   |
+     | return PC                    | ← RSP points to
+     | frame pointer on entry       |
+     +------------------------------+ ↓ lower addresses
+
+ The "return PC" is loaded to the link register, R30, as part of the
+ arm64 `CALL` operation.
+
+ On entry, a function subtracts from RSP to open its stack frame, and
+ saves the values of R30 and R29 at the bottom of the frame.
+ Specifically, R30 is saved at 0(RSP) and R29 is saved at -8(RSP),
+ after RSP is updated.
+
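The prologue arithmetic can be sketched as follows. This is a simplified model, not the toolchain's actual frame-size computation: the `frame` type and `layout` function are hypothetical, and the sketch assumes that the 8-byte slot for the saved R30 is counted in the frame size while the saved R29 sits in the 8 bytes just below the updated RSP.

```go
package main

import "fmt"

// frame describes a hypothetical arm64 stack frame. Offsets are relative
// to RSP after the prologue has updated it.
type frame struct {
	size    int64 // bytes subtracted from RSP on entry
	savedLR int64 // offset of the saved R30 (return PC)
	savedFP int64 // offset of the saved R29 (frame pointer on entry)
}

// layout models a frame for a function that needs bodyBytes of locals and
// outgoing arguments. Assumption: the saved R30 slot is part of the frame,
// and the total is rounded up so that RSP stays 16-byte aligned.
func layout(bodyBytes int64) frame {
	const align = 16
	need := bodyBytes + 8 // 8 bytes for the saved R30 at 0(RSP)
	size := (need + align - 1) &^ (align - 1)
	return frame{size: size, savedLR: 0, savedFP: -8}
}

func main() {
	for _, n := range []int64{0, 8, 9, 24} {
		f := layout(n)
		fmt.Printf("body=%d size=%d R30 at %d(RSP) R29 at %d(RSP)\n",
			n, f.size, f.savedLR, f.savedFP)
	}
}
```

Rounding up with `&^` keeps the frame size a multiple of 16, so an RSP that is aligned at entry remains aligned after the subtraction.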
+ A leaf function that does not require any stack space may omit the
+ saved R30 and R29.
+
+ The Go ABI's use of R29 as a frame pointer register is compatible with
+ the arm64 architecture's requirements, so that Go can interoperate with
+ platform debuggers and profilers.
+
+ This stack layout is used by both register-based (ABIInternal) and
+ stack-based (ABI0) calling conventions.
+
+ #### Flags
+
+ The arithmetic status flags (NZCV) are treated like scratch registers
+ and are not preserved across calls.
+ All other bits in PSTATE are system flags and are not modified by Go.
+
+ The floating-point status register (FPSR) is treated like a scratch
+ register and is not preserved across calls.
+
+ At calls, the floating-point control register (FPCR) bits are always
+ set as follows:
+
+ | Flag | Bit | Value | Meaning |
+ | --- | --- | --- | --- |
+ | DN | 25 | 0 | Propagate NaN operands |
+ | FZ | 24 | 0 | Do not flush to zero |
+ | RC | 23/22 | 0 (RN) | Round to nearest, choose even if tied |
+ | IDE | 15 | 0 | Denormal operations trap disabled |
+ | IXE | 12 | 0 | Inexact trap disabled |
+ | UFE | 11 | 0 | Underflow trap disabled |
+ | OFE | 10 | 0 | Overflow trap disabled |
+ | DZE | 9 | 0 | Divide-by-zero trap disabled |
+ | IOE | 8 | 0 | Invalid operations trap disabled |
+ | NEP | 2 | 0 | Scalar operations do not affect higher elements in vector registers |
+ | AH | 1 | 0 | No alternate handling of denormal inputs |
+ | FIZ | 0 | 0 | Do not zero denormals |
+
+ *Rationale*: Having a fixed FPCR control configuration allows Go
+ functions to use floating-point and vector (SIMD) operations without
+ modifying or saving the FPCR.
+ Functions are allowed to modify it between calls (as long as they
+ restore it), but as of this writing Go code never does.
+
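Because every FPCR bit listed in the table is fixed at zero, checking a value against the ABI reduces to a single mask test. The sketch below is illustrative only: the constant names and `conformsAtCall` are hypothetical, and the FPCR value is passed in as a plain integer because reading the real register requires assembly.

```go
package main

import "fmt"

// Bit positions copied from the FPCR table above. The names are
// hypothetical and exist only for this sketch.
const (
	fpcrDN  = 1 << 25
	fpcrFZ  = 1 << 24
	fpcrRC  = 3 << 22 // two-bit rounding mode field, bits 23 and 22
	fpcrIDE = 1 << 15
	fpcrIXE = 1 << 12
	fpcrUFE = 1 << 11
	fpcrOFE = 1 << 10
	fpcrDZE = 1 << 9
	fpcrIOE = 1 << 8
	fpcrNEP = 1 << 2
	fpcrAH  = 1 << 1
	fpcrFIZ = 1 << 0
)

// fpcrFixedMask covers every bit that the table fixes at calls.
const fpcrFixedMask uint64 = fpcrDN | fpcrFZ | fpcrRC | fpcrIDE | fpcrIXE |
	fpcrUFE | fpcrOFE | fpcrDZE | fpcrIOE | fpcrNEP | fpcrAH | fpcrFIZ

// conformsAtCall reports whether all bits fixed by the table are zero in
// the given FPCR value. Bits outside the table are ignored.
func conformsAtCall(fpcr uint64) bool {
	return fpcr&fpcrFixedMask == 0
}

func main() {
	fmt.Printf("fixed mask: %#x\n", fpcrFixedMask)
	fmt.Println(conformsAtCall(0))       // all listed bits clear
	fmt.Println(conformsAtCall(fpcrFZ))  // flush-to-zero enabled
	fmt.Println(conformsAtCall(1 << 22)) // non-default rounding mode
}
```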
## Future directions

### Spill path improvements