This repository was archived by the owner on Jan 23, 2023. It is now read-only.

@sergign60 sergign60 commented Oct 19, 2016

This PR implements profiler ELT callbacks for AMD64 Linux. It was tested with the historical debugger.
The main differences from the corresponding Windows implementation:

  1. The arguments to profile callbacks are passed via the rdi & rsi registers (DONE)
  2. It should be taken into account that rax & rdx and xmm0 & xmm1 are the integer and float return registers (DONE)
  3. Two integer registers can be used for passing a struct argument by value (DONE)
  4. The struct _PROFILE_PLATFORM_SPECIFIC_DATA should be expanded to store the argument registers: xmm0-xmm7 & rdi, rsi, rdx, rcx, r8, r9 ( @sivarv: it is correct to assume that on amd64 (both unix and windows) the argument registers are intact (or untouched) at the point of generating the Enter callback) (DONE)
  5. The temporary registers r10 & r11 are used for temporary computation - for storing addresses of callbacks and inside them (DONE)
  6. There is a problem passing the original values of rdi & rsi into the profiling callback. This PR proposes using r12 & r13 for this
    @sivarv: I don't think this (regSet.AddMaskVars(REG_R10 | REG_R11 | REG_R12 | REG_R13);) is sufficient to have R10-R13 saved in prolog.
    Note that genProfilingEnterCallback() is called after OS prolog and zero init of frame. By the time we are generating Enter callback, we have already generated the method prolog that has saved all required callee saved regs.
  7. @BruceForstall: Can you please update Documentation/botr/clr-abi.md, specifically the "# Profiler Hooks" section, to be correct and to include information about the Linux AMD64 ABI? (DONE)

@sergign60
Author

CC @kvochko @Dmitri-Botcharnikov

@BruceForstall

There's a big "TODO" in codegenxarch.cpp, genReturn(), that it looks like you haven't addressed yet:

    // TODO-AMD64-Unix: If the profiler hook is implemented on *nix, make sure for 2 register returned structs
    //                  the RAX and RDX needs to be kept alive. Make the necessary changes in lowerxarch.cpp
    //                  in the handling of the GT_RETURN statement.
    //                  Such structs containing GC pointers need to be handled by calling gcInfo.gcMarkRegSetNpt
    //                  for the return registers containing GC refs.

@BruceForstall

cc @dotnet/jit-contrib

@sergign60
Author

@BruceForstall thanks for your comment!

Is it not enough to add push rax; push rdx; ... ; pop rdx; pop rax; into ProfileLeaveNaked for this TODO?

@BruceForstall

The issue isn't the values in registers getting preserved, it is proper reporting of the GC information for those registers.

If you look at Compiler::impFixupStructReturnType(), you can see how multi-reg returns of 8-15 byte structs are massaged:

        // In case of multi-reg struct return, we force IR to be one of the following:
        // GT_RETURN(lclvar) or GT_RETURN(call).  If op is anything other than a
        // lclvar or call, it is assigned to a temp to create: temp = op and GT_RETURN(tmp).

In genReturn(), I believe that under #ifdef FEATURE_UNIX_AMD64_STRUCT_PASSING, you would need to check if the return type is struct, and then, if so, check each eight-byte to see if it is a GC type, using something like varTypeIsGC(GetEightByteType(structDesc,...)). Then, call the gcMarkRegPtrVal() / gcMarkRegSetNpt() functions like is currently done for "normal" single-register returns.

@sergign60
Author

sergign60 commented Nov 9, 2016

@BruceForstall I've placed the following code in the CodeGen::genReturn function

#ifdef FEATURE_UNIX_AMD64_STRUCT_PASSING
        if (varTypeIsStruct(compiler->info.compRetType))
        {
            SYSTEMV_AMD64_CORINFO_STRUCT_REG_PASSING_DESCRIPTOR structDesc;
            CORINFO_CLASS_HANDLE retClsHnd = compiler->info.compMethodInfo->args.retTypeClass;
            assert(retClsHnd != NO_CLASS_HANDLE);
            compiler->eeGetSystemVAmd64PassStructInRegisterDescriptor(retClsHnd, &structDesc);
            for (unsigned int i = 0; i < structDesc.eightByteCount; i++)
            {
                if (varTypeIsGC(compiler->GetEightByteType(structDesc, i)))
                {
                    gcInfo.gcMarkRegPtrVal(REG_INTRET, compiler->GetEightByteType(structDesc, i));
                }
            }
            genProfilingLeaveCallback();
            for (unsigned int i = structDesc.eightByteCount; i > 0; i--)
            {
                if (varTypeIsGC(compiler->GetEightByteType(structDesc, i - 1)))
                {
                    gcInfo.gcMarkRegSetNpt(REG_INTRET);
                }
            }
            return;
        }
#endif // FEATURE_UNIX_AMD64_STRUCT_PASSING

but I got the following assertion

Regset after BB07 gcr=00000001 {rax}, byr=00000000 {}, regVars=00000000 {}
genCodeForBBList assert: nonVarPtrRegs 1 RBM_NONE 0

Assert failure(PID 5015 [0x00001397], Thread: 5015 [0x1397]): Assertion failed 'nonVarPtrRegs == RBM_NONE' in 'System.RuntimeTypeHandle:GetMetadataImport(ref):struct' (IL size 18)

    File: .. coreclr/src/jit/codegenlinear.cpp Line: 423

Could you give me a hint on how to do this the right way? Thanks in advance.

@sergign60
Author

sergign60 commented Nov 9, 2016

@BruceForstall as far as I can see, the current version of coreclr generates the following code for a method that returns a struct value:

G_M58419_IG08:
       488B45C0             mov      rax, gword ptr [rbp-40H]
       488B55C8             mov      rdx, qword ptr [rbp-38H]
       48BF501F3DCA357F0000 mov      rdi, 0x7F35CA3D1F50
       488D7510             lea      rsi, [rbp+10H]
       49BB005A8B43367F0000 mov      r11, 0x7F36438B5A00
       41FF13               call     qword ptr [r11]CORINFO_HELP_PROF_FCN_LEAVE
       90                   nop      

G_M58419_IG09:
       488D65F8             lea      rsp, [rbp-08H]
       415D                 pop      r13
       5D                   pop      rbp
       C3                   ret      

Could you give a sample of the code that should be generated for this case?

@sivarv
Member

sivarv commented Nov 9, 2016

@sergign60

On Unix, structs of size > 8 and <= 16 bytes are returned in two return registers, RAX/XMM0 and RDX/XMM1. Consider a struct with one or two gc-ref type fields being returned. The gc-ref could be in RAX or RDX. Whichever register contains a gc-ref needs to be explicitly reported.

In the code that you have posted above, you are iterating through each of the 8-bytes of the struct and always marking REG_INTRET (which is RAX) as gcptr or non-ptr. That could be the reason why you are hitting an assert. Here is the right way to do it if the method is returning a multi-reg return type struct:

#ifdef FEATURE_UNIX_AMD64_STRUCT_PASSING
if (compiler->compMethodReturnsMultiRegRetType())
{
    ReturnTypeDesc retTypeDesc;
    retTypeDesc.InitializeStructReturnType(compiler, varDsc->lvVerTypeInfo.GetClassHandle());

    gcInfo.gcMarkRegPtrVal(retTypeDesc.GetABIReturnReg(0), retTypeDesc.GetReturnRegType(0));
    gcInfo.gcMarkRegPtrVal(retTypeDesc.GetABIReturnReg(1), retTypeDesc.GetReturnRegType(1));

    genProfilingLeaveCallback();

    gcInfo.gcMarkRegSetNpt(retTypeDesc.GetABIReturnReg(0));
    gcInfo.gcMarkRegSetNpt(retTypeDesc.GetABIReturnReg(1));
}
#endif

Apart from the generated code, what you need to verify is that gcinfo is correctly reported for return regs containing gc-refs around the call to profile leave callback.
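
To make the reporting requirement concrete, here is a minimal, self-contained sketch of the idea. It does not use the JIT's real types: `GcRegSets`, `Reg`, `VarType`, `regMask` and `reportAroundLeave` are hypothetical stand-ins invented for illustration, and only the two integer return registers are modelled. Each return register is marked according to its own eightbyte type before the callback and cleared again afterwards.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for the JIT's register and type enums.
enum Reg : unsigned { Rax = 0, Rdx = 2 };
enum VarType { TYP_LONG, TYP_REF, TYP_BYREF };

using RegMask = std::uint64_t;

constexpr RegMask regMask(Reg r) { return RegMask{1} << r; }

// Toy model of the two live-pointer sets that the JIT log prints as "gcr" and "byr".
struct GcRegSets
{
    RegMask gcRefs = 0;
    RegMask byRefs = 0;

    // The registers in `mask` no longer hold GC pointers.
    void markRegSetNpt(RegMask mask)
    {
        gcRefs &= ~mask;
        byRefs &= ~mask;
    }

    // Register `r` now holds a value of type `t`; non-pointer types leave it unmarked.
    void markRegPtrVal(Reg r, VarType t)
    {
        markRegSetNpt(regMask(r));
        if (t == TYP_REF)
            gcRefs |= regMask(r);
        else if (t == TYP_BYREF)
            byRefs |= regMask(r);
    }
};

// Marks each return register by its own type, snapshots the sets as they would
// look across the (elided) leave callback, then clears both registers again.
GcRegSets reportAroundLeave(VarType type0, VarType type1, GcRegSets* duringCall)
{
    const Reg     retRegs[2]  = {Rax, Rdx};
    const VarType retTypes[2] = {type0, type1};

    GcRegSets sets;
    for (int i = 0; i < 2; ++i)
        sets.markRegPtrVal(retRegs[i], retTypes[i]);

    *duringCall = sets; // the profiler leave callback would be emitted here

    for (int i = 0; i < 2; ++i)
        sets.markRegSetNpt(regMask(retRegs[i]));
    return sets;
}
```

For a struct whose first eightbyte is an integer and whose second is an object reference, only rdx is reported as live across the callback, and nothing stays marked once the block ends.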

@sergign60
Author

@sivarv many thanks for your hint

unsigned __int8 offset1 = 0;
var_types type0 = TYP_UNKNOWN;
var_types type1 = TYP_UNKNOWN;
compiler->GetStructTypeOffset(structDesc, &type0, &type1, &offset0, &offset1);
Member

I see this code got duplicated. How about abstracting this into a Compiler::GetStructTypeOffset() overload whose first argument is a CORINFO_CLASS_HANDLE and that has out params to return the offset and type of the eightbytes of the struct?

genEmitHelperCall(CORINFO_HELP_PROF_FCN_ENTER, 0, EA_UNKNOWN, REG_ARG_2);
#else
genEmitHelperCall(CORINFO_HELP_PROF_FCN_ENTER, 0, EA_UNKNOWN, REG_R11);
#endif
Member

Is there a specific reason why we want to use REG_ARG_2 on amd64 Windows and REG_R11 on Unix?
Please note that without specifying a callTargetReg, genEmitHelperCall() would use REG_RAX.

#if FEATURE_VARARG
if (compiler->info.compIsVarArgs && varTypeIsFloating(loadType))
#if defined(UNIX_AMD64_ABI)
#ifdef FEATURE_UNIX_AMD64_STRUCT_PASSING
Member

I think FEATURE_UNIX_AMD64_STRUCT_PASSING #ifdef alone is sufficient here.

regNumber argReg = varDsc->lvArgReg;
getEmitter()->emitIns_S_R(ins_Store(storeType), emitTypeSize(storeType), argReg, varNum, 0);
#if defined(UNIX_AMD64_ABI)
#ifdef FEATURE_UNIX_AMD64_STRUCT_PASSING
Member

I think UNIX_AMD64_ABI is not required here.

genEmitHelperCall(helper, 0, EA_UNKNOWN, REG_ARG_2);
#else
genEmitHelperCall(helper, 0, EA_UNKNOWN, REG_R11);
#endif
Member

Any specific reason why we don't want to use default call target reg (rax)?

Author

@sivarv I guess that it would be better to use r10 & r11 for intermediate computations on Linux

@@ -1152,6 +1152,39 @@ void CodeGen::genReturn(GenTreePtr treeNode)
// Also, there is not much to be gained by materializing it as an explicit node.
if (compiler->compCurBB == compiler->genReturnBB)
{
#if defined(_TARGET_AMD64_)
#ifdef FEATURE_UNIX_AMD64_STRUCT_PASSING
Member

This #ifdef is redundant since FEATURE_UNIX_AMD64_STRUCT_PASSING is defined only on amd64.

I think what you mean is that the first #if is redundant, as it is implied by the #ifdef.


unsigned regCount = retTypeDesc.GetReturnRegCount();

if (varTypeIsGC(compiler->info.compRetType))
Member

This condition would always be false if the return type is a struct (i.e. TYP_STRUCT) returned in two regs.
This if-check is not required and should be deleted. This comment also applies to if-check on line 1176.

Author

@sergign60 sergign60 Nov 15, 2016

@sivarv I tried it, but I got the following assertion in codegenlinear.cpp:

System.RuntimeTypeHandle:GetMetadataImport(ref):struct: Regset after BB07 gcr=00000001 {rax}, byr=00000000 {}, regVars=00000000 {}

Assert failure(PID 27642 [0x00006bfa], Thread: 27642 [0x6bfa]): Assertion failed 'nonVarPtrRegs == RBM_NONE' in 'System.RuntimeTypeHandle:GetMetadataImport(ref):struct' (IL size 18)

    File: /home/signatov/TEST/historical_debugging/coreclr/src/jit/codegenlinear.cpp Line: 422
    Image: /home/signatov/TEST/historical_debugging/coreclr/bin/Product/Linux.x64.Debug/corerun

In my test case this assertion arises only on the fourth method where the code with gcMarkRegPtrVal was inserted. Three methods with this code

System.RuntimeTypeHandle:GetIntroducedMethods(ref):struct
IntroducedMethodEnumerator:GetEnumerator():struct:this
System.RuntimeMethodHandle:GetUtf8Name(long):struct

are compiled successfully before this assertion. Maybe you have some ideas?

Author

@sergign60 sergign60 Nov 15, 2016

@sivarv I inserted debug prints and see the following

     === System.RuntimeTypeHandle:GetMetadataImport(ref):struct: regCount 2
00000000    00000000
in GCInfo::gcMarkRegPtrVal type 0xe TYP_LONG 0xa TYP_REF 0xe TYP_BYREF 0xf
in GCInfo::gcMarkRegSetGCref
                            GC regs: 00000000 {} => 00000001 {rax}
                            Byref regs: (unchanged) 00000000 {}
00000001    00000000
in GCInfo::gcMarkRegPtrVal type 0xa TYP_LONG 0xa TYP_REF 0xe TYP_BYREF 0xf
in GCInfo::gcMarkRegSetNpt
                            GC regs: (unchanged) 00000001 {rax}
                            Byref regs: (unchanged) 00000000 {}
00000001    00000000
--------->>>>>>>>
in GCInfo::gcMarkRegSetNpt
                            GC regs: (unchanged) 00000001 {rax}
                            Byref regs: (unchanged) 00000000 {}
in GCInfo::gcMarkRegSetNpt
                            GC regs: (unchanged) 00000001 {rax}
                            Byref regs: (unchanged) 00000000 {}
System.RuntimeTypeHandle:GetMetadataImport(ref):struct: Regset after BB07 gcr=00000001 {rax}, byr=00000000 {}, regVars=00000000 {}

Assert failure(PID 12181 [0x00002f95], Thread: 12181 [0x2f95]): Assertion failed 'nonVarPtrRegs == RBM_NONE' in 'System.RuntimeTypeHandle:GetMetadataImport(ref):struct' (IL size 18)

    File: /home/signatov/TEST/historical_debugging/coreclr/src/jit/codegenlinear.cpp Line: 422
    Image: /home/signatov/TEST/historical_debugging/coreclr/bin/Product/Linux.x64.Debug/corerun

In the previous successfully compiled methods arg type is TYP_LONG.

Author

As far as I can see, the argument of gcMarkRegSetNpt is a regMaskTP, not a var_types, so I should replace

gcInfo.gcMarkRegSetNpt(var_types);

by

gcInfo.gcMarkRegSetNpt(genRegMask(var_types));

Am I right?
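
The debug output above, where gcMarkRegSetNpt leaves the GC set unchanged at `00000001 {rax}`, fits this reading. The following sketch shows why a raw register number passed where a bit mask is expected can silently clear nothing. It assumes that rax has register number 0, which matches the mask printed for `{rax}` in the log; `RegNum`, `toMask` and `clearNonPointer` are hypothetical names for illustration, with `toMask` playing the role that genRegMask plays in the comment above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register numbers (assumption: rax is register 0, rdx is register 2).
enum RegNum : unsigned { kRax = 0, kRdx = 2 };

using RegMask = std::uint64_t;

// Converts a register number into a one-bit mask.
constexpr RegMask toMask(RegNum r) { return RegMask{1} << r; }

// Removes every register in `mask` from the live GC set.
constexpr RegMask clearNonPointer(RegMask liveGc, RegMask mask)
{
    return liveGc & ~mask;
}

// An unscoped enum converts implicitly to an integer, so passing a register
// number where a mask is expected compiles without any diagnostic.
static_assert(clearNonPointer(toMask(kRax), kRax) == toMask(kRax),
              "register number 0 is an empty mask, so rax stays marked");
static_assert(clearNonPointer(toMask(kRax), toMask(kRax)) == 0,
              "an explicit mask clears rax as intended");
```

With a register number of 0 the mistake is invisible at compile time and only shows up later as a register that is still marked live, which is what the assertion on `nonVarPtrRegs` reports.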

@@ -7298,13 +7298,15 @@ void CodeGen::genProfilingEnterCallback(regNumber initReg, bool* pInitRegZeroed)
return;
}

#if defined(_TARGET_AMD64_) && !defined(UNIX_AMD64_ABI) // No profiling for System V systems yet.
#if defined(_TARGET_AMD64_)
unsigned varNum;
LclVarDsc* varDsc;

// Since the method needs to make a profiler callback, it should have out-going arg space allocated.
noway_assert(compiler->lvaOutgoingArgSpaceVar != BAD_VAR_NUM);

Shouldn't this first noway_assert also be under the following #if? I believe it's only defined under FEATURE_FIXED_OUT_ARGS which is only true for amd64.

gcInfo.gcMarkRegSetNpt(retTypeDesc.GetABIReturnReg(i));
}
}
return;
Member

@sivarv sivarv Nov 15, 2016

Returns in the middle of a method are very hard to notice and could lead to accidental bugs. Instead, I would suggest restructuring this logic as follows. Note that the logic below requires no #ifdefs.

e.g.

        ReturnTypeDesc retTypeDesc;
        if (compiler->compMethodReturnsMultiRegRetType())
        {
            CORINFO_CLASS_HANDLE retClsHnd = compiler->info.compMethodInfo->args.retTypeClass;
            retTypeDesc.InitializeStructReturnType(compiler, retClsHnd);
        }

        if (varTypeIsGC(compiler->info.compRetType))
        {
            gcInfo.gcMarkRegPtrVal(REG_INTRET, compiler->info.compRetType);
        }
        else if (compiler->compMethodReturnsMultiRegRetType())
        {
            for (unsigned i = 0; i < retTypeDesc.GetReturnRegCount(); ++i)
            {
                gcInfo.gcMarkRegPtrVal(retTypeDesc.GetABIReturnReg(i), retTypeDesc.GetReturnRegType(i));
            }
        }

        genProfilingLeaveCallback();

        if (varTypeIsGC(compiler->info.compRetType))
        {
            gcInfo.gcMarkRegSetNpt(REG_INTRET);
        }
        else if (compiler->compMethodReturnsMultiRegRetType())
        {
            for (unsigned i = 0; i < retTypeDesc.GetReturnRegCount(); ++i)
            {
                gcInfo.gcMarkRegSetNpt(retTypeDesc.GetABIReturnReg(i));
            }
        }

Author

@sivarv I think that is a good idea, thanks

@@ -7526,11 +7581,12 @@ void CodeGen::genProfilingLeaveCallback(unsigned helper /*= CORINFO_HELP_PROF_FC
// Need to save on to the stack level, since the helper call will pop the argument
unsigned saveStackLvl2 = genStackLevel;

#if defined(_TARGET_AMD64_) && !defined(UNIX_AMD64_ABI) // No profiling for System V systems yet.

#if defined(_TARGET_AMD64_)
// Since the method needs to make a profiler callback, it should have out-going arg space allocated.
noway_assert(compiler->lvaOutgoingArgSpaceVar != BAD_VAR_NUM);

Here too I think that this needs to be only under !UNIX_AMD64_ABI

Author

@CarolEidt sorry, but I didn't quite catch you. This code is for both Unix && Windows on AMD64.

// - integer argument registers (rcx, rdx, r8, r9)
// - floating point argument registers (xmm1-3)
// - volatile integer registers (r10, r11)
// - volatile floating point registers (xmm4-5)

You should add here something like:

  • upper halves of ymm registers on AVX (which are volatile)

Author

@CarolEidt ok, I'll insert it

push_argument_register rcx
push_argument_register rdx
push_nonvol_reg r10
push_nonvol_reg rax

How are these pushed registers represented in GC info? Or does the GC "know" about these?

Member

ELT profiler helpers are considered NO GC helpers. That is, a GC won't happen within the helper call.

Great - thanks for clarifying that.

@sivarv
Member

sivarv commented Nov 15, 2016

CC : @noahfalk - to review profiler helpers.

// Upon entry :
// rdi = clientInfo
// rsi = profiledRsp

Member

On windows these arguments are passed in rcx and rdx. Does Linux use a different 64bit ABI in general or is there a reason for the difference?

Member

Linux has different ABI for x64. It has 6 argument registers - RDI, RSI, RDX, RCX, R8 and R9

Author

@sergign60 sergign60 Nov 16, 2016

@noahfalk According to the Unix ABI, the first 6 integer or pointer arguments to a function are passed in registers. The first is placed in rdi, the second in rsi, the third in rdx, and then rcx, r8 and r9.
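
As a small illustration of the difference discussed here, the sketch below maps an integer or pointer argument index to the register that carries it under each calling convention. `Abi` and `intArgRegister` are hypothetical names introduced for this example; the register orders are those of the System V AMD64 ABI and the Windows x64 calling convention.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class Abi { SystemV, Windows };

// Returns the register that carries the index-th integer or pointer argument,
// or "stack" once the register arguments of that ABI are exhausted.
inline std::string intArgRegister(Abi abi, std::size_t index)
{
    static const char* const sysv[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    static const char* const win[]  = {"rcx", "rdx", "r8", "r9"};

    if (abi == Abi::SystemV)
        return index < 6 ? sysv[index] : "stack";
    return index < 4 ? win[index] : "stack";
}
```

For the two profiler callback arguments, `intArgRegister(Abi::SystemV, 0)` and `intArgRegister(Abi::SystemV, 1)` give rdi and rsi, while the same indices under `Abi::Windows` give rcx and rdx.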

Member

thanks, makes sense then

push_nonvol_reg r10
push_nonvol_reg rax

lea rax, [rsp + 0x10] // caller rsp
Member

I rarely deal with assembly, but it doesn't seem like this would calculate the caller's rsp correctly?

At this point in the function we have already pushed r15, r14, r13, r12, rbp, rbx, rsi, rdi, r10, and rax. Did those pushes generate code that doesn't update rsp?

Member

Right, this is not correct. It was this way in the Windows version of this function when only RAX was pushed on the stack.
@sergign60 what is the reason for pushing all the callee saved registers plus RSI, RDI and R10 when the Windows version doesn't do that? (I guess that's why the function name is ProfileEnterNaked.)
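
The arithmetic behind this exchange can be sketched as follows. It assumes 8-byte stack slots, that the return address is the only thing on the stack at function entry, and that nothing other than the listed pushes has moved rsp before the `lea`; `callerRspOffset` is a hypothetical helper, not a function from the PR.

```cpp
#include <cassert>

constexpr unsigned kSlotBytes = 8; // one 64-bit stack slot

// Offset from the current rsp to the caller's rsp (its value just before the
// call instruction): one slot for the return address plus one per pushed register.
constexpr unsigned callerRspOffset(unsigned pushedRegisters)
{
    return (pushedRegisters + 1) * kSlotBytes;
}

static_assert(callerRspOffset(1) == 0x10, "only rax pushed: [rsp + 0x10]");
static_assert(callerRspOffset(10) == 0x58, "ten registers pushed: [rsp + 0x58]");
```

The `0x10` in the quoted `lea` is therefore only right when a single register has been pushed; with the ten pushes listed in the question, the offset under these assumptions would be `0x58`.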

Author

@sergign60 sergign60 Nov 16, 2016

@noahfalk @janvorli I removed all redundant register savings, added only rdx saving in Leave because of struct return ABI



// setup ProfilePlatformSpecificData structure
xor r11, r11 // nullify r11
Member

Any reason not to use r8 so that we retain consistency with the windows implementation?

Author

@sergign60 sergign60 Nov 16, 2016

@noahfalk I guess that it would be better to use r10 & r11 for intermediate computations on Linux

Member

makes sense now that I understand the linux ABI difference

// Upon entry :
// rdi = clientInfo
// rsi = profiledRsp

Member

Same questions as above on ProfileEnter

// } PROFILE_PLATFORM_SPECIFIC_DATA, *PPROFILE_PLATFORM_SPECIFIC_DATA;
//
.equ SIZEOF_PROFILE_PLATFORM_SPECIFIC_DATA, 0x8*11 + 0x4*2 // includes fudge to make FP_SPILL right
.equ SIZEOF_OUTGOING_ARGUMENT_HOMES, 0x8*6
Member

This is not right. AMD64 ABI on Linux doesn't home arguments. On Windows, there are always 4 stack slots reserved for storing the 4 argument registers before the return address. So you can just remove this constant and also the OFFSETOF_PLATFORM_SPECIFIC_DATA below, since they are both zero.
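
A minimal sketch of the point being made, using hypothetical names (`CallAbi`, `argHomeBytes`): the Windows x64 convention has the caller reserve home space for its four register arguments, whereas the System V AMD64 ABI reserves none, so a Linux constant for outgoing argument homes would be zero.

```cpp
#include <cassert>

enum class CallAbi { SystemV, Windows };

// Bytes of home space the caller reserves on the stack for register arguments.
// Windows x64: four 8-byte slots. System V AMD64: none.
constexpr unsigned argHomeBytes(CallAbi abi)
{
    return abi == CallAbi::Windows ? 4 * 8 : 0;
}

static_assert(argHomeBytes(CallAbi::Windows) == 0x20, "four 8-byte home slots");
static_assert(argHomeBytes(CallAbi::SystemV) == 0, "no argument homes on Linux");
```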

Author

@sergign60 sergign60 Nov 16, 2016

@janvorli thanks for this note.

@sergign60
Author

@janvorli
Sorry I was ill last week.

@sergign60 sergign60 force-pushed the profelt branch 2 times, most recently from 95b6b51 to b4c7322 Compare April 2, 2017 14:47
@sergign60
Author

sergign60 commented Apr 2, 2017

@janvorli Now I get another set of failed tests, but the cause is not this implementation of the profiling callbacks.

For example,

coreclr/Tests/JIT/Regression/VS-ia64-JIT/V1.2-M01/b10827/b10827

fails (segmentation fault) with

COMPlus_JitELTHookEnabled=1
COMPlus_GCStress=0xC
COMPlus_TailcallStress=1

both with the profiler callback calls in place and with them commented out.

@sergign60 sergign60 force-pushed the profelt branch 3 times, most recently from ad9fa9a to 65eb9e6 Compare April 4, 2017 14:42
@sergign60
Author

#10857

@sergign60
Author

sergign60 commented Apr 10, 2017

@janvorli I've got the results on Windows. Please look at "[AMD64 Windows ELT] Some CoreCLR tests fail with ELT hooks enabled and GCStress & TailcallStress" (#10857).

@sergign60 sergign60 force-pushed the profelt branch 4 times, most recently from 24e48d9 to e180fe6 Compare April 14, 2017 09:18
@hseok-oh

@dotnet-bot test Ubuntu arm Cross Release Build

@danmoseley
Member

@janvorli please see @sergign60 note above.

@@ -7654,6 +7709,7 @@ void CodeGen::genProfilingEnterCallback(regNumber initReg, bool* pInitRegZeroed)
//
void CodeGen::genProfilingLeaveCallback(unsigned helper /*= CORINFO_HELP_PROF_FCN_LEAVE*/)
{

I'd rather remove such new empty lines

Author

@rartemev fixed

// UINT64 flt1;
// UINT64 flt2;
// UINT64 flt3;
// #if defined(UNIX_AMD64_ABI)

Is this source file used only for Unix? If so, I think it would be good to remove this #ifdef bracket.

mov [rsp + 0x88], rdx // -- struct rdx field
mov [rsp + 0x90], rcx // -- struct rcx field
mov [rsp + 0x98], r8 // -- struct r8 field
mov [rsp + 0xa0], r9 // -- struct r9 field

Do we really need to save those scratch registers (rdi-r9)?

Member

@rartemev these registers are not scratch registers, they are argument registers.
