This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Remove PAL FileAlignment restriction #10959

Merged

Conversation


@sdmaclea sdmaclea commented Apr 13, 2017

This change removes the requirement that crossgen native images on PAL systems have FileAlignment set to a multiple of the OS_PAGE_SIZE.

For arm64 with 64K pages, this saves three 64K pages of zero padding in the smallest native images. For larger native images the savings may be closer to 50% of that.

On 4K page platforms this only saves 8K for the smallest native images.
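
A back-of-the-envelope sketch of where that padding comes from, assuming made-up section sizes (this is illustrative arithmetic, not code from the PR): each section's raw data must start at a multiple of FileAlignment, so a page-sized alignment can cost up to one page of zero padding per section.

```cpp
#include <cstdint>
#include <cstdio>

// Round 'value' up to the next multiple of 'alignment' (a power of two).
static uint64_t AlignUp(uint64_t value, uint64_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

int main()
{
    // Hypothetical raw section sizes for a small native image.
    const uint64_t sections[] = { 0x3200, 0x1800, 0x400, 0x200 };
    const uint64_t alignments[] = { 0x200, 0x10000 };   // 512-byte vs. 64K FileAlignment

    for (uint64_t fileAlignment : alignments)
    {
        uint64_t padding = 0;
        for (uint64_t size : sections)
            padding += AlignUp(size, fileAlignment) - size;

        std::printf("FileAlignment 0x%llx -> %llu bytes of zero padding\n",
                    (unsigned long long)fileAlignment, (unsigned long long)padding);
    }
    return 0;
}
```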

@sdmaclea
Author

@jkotas @janvorli PTAL

@sdmaclea
Author

Ran 10870 tests on a 4K page [Arm64/Unix] platform; all tests passed.

@@ -55,7 +55,8 @@ void ZapWriter::Initialize()
m_FileAlignment = 0x200;
}

-#define SECTION_ALIGNMENT 0x1000
+#define SECTION_ALIGNMENT m_FileAlignment
Member

I think we want to do this for Unix (FEATURE_PAL) only.

Author

@jkotas Done

@sdmaclea
Author

Also ran tests on a 64K [Arm64/Unix] platform with the updated 64K patch. All tests passing.

@sdmaclea
Author

sdmaclea commented Apr 14, 2017

CentOS is seeing this failure during crossgen of System.Private.CoreLib.dll:

{0x857b-0xb42800} ASSERT [VIRTUAL] at /mnt/resource/j/workspace/dotnet_coreclr/master/debug_centos7.1_prtest/src/pal/src/map/virtual.cpp.2101: Expression: (allocationSize & VIRTUAL_PAGE_MASK) == 0

@sdmaclea
Author

The CentOS failure is actually in crossgen with the /CreatePerfMap option. I had not been running with that option in my testing.

@sdmaclea
Author

Looks like all/most "Ubuntu arm Cross Release Build" PR tests are failing

@@ -172,7 +183,9 @@ void PEImageLayout::ApplyBaseRelocations()
dwOldProtection = 0;
}

+IMAGE_SECTION_HEADER *pSection = RvaToSection(rva);
+USHORT fixup = fixupsCount ? VAL16(fixups[0]) : 0;
Member

It is hard to see whether the fixupsCount == 0 case works here. I think it would be better to handle it upfront and do nothing for it - add if (fixupsCount == 0) continue; right after the line that computes the fixupsCount.

It should not ever happen anyway.
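
A minimal, self-contained sketch of the guard being suggested; RelocBlock and ApplyBlocks are made-up stand-ins for illustration, not the actual peimagelayout.cpp code.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Made-up stand-in for a relocation block: just a list of 16-bit fixup entries.
struct RelocBlock
{
    std::vector<uint16_t> fixups;
};

// Sketch of the suggested structure: bail out of an empty block before touching fixups[0].
static void ApplyBlocks(const std::vector<RelocBlock>& blocks)
{
    for (const RelocBlock& block : blocks)
    {
        size_t fixupsCount = block.fixups.size();

        if (fixupsCount == 0)
            continue;   // the suggested guard: nothing to apply, and fixups[0] would be invalid

        uint16_t fixup = block.fixups[0];   // safe: fixupsCount != 0
        std::printf("first fixup of block: 0x%04x (of %zu)\n", (unsigned)fixup, fixupsCount);
    }
}

int main()
{
    ApplyBlocks({ { {0x3004, 0x3008} }, { {} }, { {0xA010} } });
    return 0;
}
```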

Author

@jkotas Sorry. Missed this request. Will do.

Author

@jkotas The assertions on lines 162 and 163 effectively guarantee fixupsCount is non-zero. I suspected it was not supposed to happen, but I was not sure; now that I see the asserts, I know it is designed not to happen.

Therefore I think it is OK to just simplify line 186. Otherwise I can move line 242 to 159 and add an if (fixupsCount == 0) continue, but that is unnecessary code complexity.

@jkotas
Member

jkotas commented Apr 15, 2017

How does this work when images generated before this change are loaded on a runtime with this change, and vice versa?

@sdmaclea
Author

@jkotas

> images generated before this change are loaded on a runtime with this change

There should effectively be no change. OSes with 4K pages should work correctly; OSes with 64K pages should continue to fail.

> vice versa

Currently I think these will all fail on *nix PAL systems.

If you wanted to support the vice versa combination, the PAL file alignment could be kept at 4K by reverting the changes to zapimage.cpp. If that were done, the changes to pedecoder.cpp and peimagelayout.cpp could also be reverted.

@hseok-oh

@dotnet-bot test Ubuntu arm Cross Release Build

-pvBaseAddress = mmap(addr, len, prot, flags, fd, offset);
+off_t adjust = offset & (VIRTUAL_PAGE_MASK);
+
+pvBaseAddress = mmap(static_cast<char *>(addr) - adjust, len + adjust, prot, flags, fd, offset - adjust);
Member

If I understand it correctly, when the addr is not aligned to the page size, we end up mapping some sections into the virtual address space multiple times, possibly with different protection settings. So if there were an R/W section next to an executable section, we would create a potential security hole that would allow an attacker to modify the executable code through the adjacent R/W mapping.

Member

Also, on amd64, where we use the executable allocator to ensure that the virtual address space where we map the image is located close to libcoreclr.so, this change breaks that functionality. When a section cannot be mapped at the requested address because the previous section mappings have consumed more virtual address space than expected, mmap can place it at a random address anywhere in the VA space. That would effectively kill the performance benefits we get from mapping close to libcoreclr.so.
To fix that, one way would be to augment the sectionBase in MAPMapPEFile for each call to MAPmmapAndRecord to reflect the expected mapping address based on the page size.

Author

@janvorli Regarding the suggested security hole: the mmap of the two pages with different protections should not create the security hole you describe. It will create two mappings, probably even backed by the same physical memory initially, but the "writable" one will be marked copy-on-write, so writes through it cannot modify the other mapping. This is because we asked for the mapping to be PRIVATE.

Author

@janvorli I do not think the amd64 issue is real either. We are using the MAP_FIXED flag, which is supposed to fail if we cannot mmap at the specified address.

Author

@sdmaclea sdmaclea Apr 18, 2017

The mmap documentation for MAP_PRIVATE is unclear about whether two PRIVATE mappings of the same file affect each other.

MAP_PRIVATE
        Create a private copy-on-write mapping.  Updates to the mapping are 
        not visible to other  processes  mapping  the same file, and are not 
        carried through to the underlying file.  It is unspecified whether 
        changes made to the file after the mmap() call are visible in the 
        mapped region.

For the amd64 issue, the mmap documentation is sufficiently clear.

MAP_FIXED
          Don't interpret addr as a hint: place the mapping at exactly that 
          address.  addr must be a multiple  of  the  page size.   If  the  
          memory region specified by addr and len overlaps pages of any 
          existing mapping(s), then the overlapped part of the existing 
          mapping(s) will be discarded.  If the specified address cannot be  
          used,  mmap()  will fail.  Because requiring a fixed address for a 
          mapping is less portable, the use of this option is discouraged.
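
A small standalone experiment, separate from the PR, that illustrates the copy-on-write isolation under discussion: a write through one MAP_PRIVATE mapping of a file is not visible through a second MAP_PRIVATE mapping of the same file.

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

int main()
{
    // Create a small temporary file with known contents.
    char path[] = "/tmp/map_private_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) { perror("mkstemp"); return 1; }

    const char data[] = "original";
    if (write(fd, data, sizeof(data)) != (ssize_t)sizeof(data)) { perror("write"); return 1; }

    // Map the same file twice; both mappings are MAP_PRIVATE.
    char *a = (char *)mmap(nullptr, sizeof(data), PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    char *b = (char *)mmap(nullptr, sizeof(data), PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) { perror("mmap"); return 1; }

    // Writing through 'a' triggers copy-on-write for 'a' only; 'b' and the file are untouched.
    memcpy(a, "modified", 8);

    printf("a: %s\n", a);   // prints "modified"
    printf("b: %s\n", b);   // still prints "original"

    munmap(a, sizeof(data));
    munmap(b, sizeof(data));
    close(fd);
    unlink(path);
    return 0;
}
```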

Member

@sdmaclea I am sorry, I missed the presence of the MAP_PRIVATE flag, so you are right and there is no security hole.
As for the amd64 issue, I also missed the MAP_FIXED flag, so you are right too that it is not causing the issue I was referring to. The mappings of multiple consecutive sections that are not page aligned would just overlap. But that brings another concern for all targets. What if these sections have different protection? Then it seems that after the file is mapped, all sections that map into a single virtual address space page would have the protection of the last section on the page. And what if one of the sections other than the last one was executable? Wouldn't that kill its executability, and vice versa? This issue would not reproduce in 100% of cases, since it depends on the actual sizes and order of the sections. Have you tried to run CoreFX tests with your change on amd64? I guess that would have a higher chance of reproducing the issue.

Author

@janvorli The intent is that two sections never map to the same virtual page. That is why the section virtual addresses are padded to the maximum page size in the zapWriter.cpp changes included in the PR.
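
A sketch of the layout rule being described: if section RVAs are aligned up to the largest page size any supported OS might use, no two sections can share a virtual page. The constant and section sizes below are illustrative assumptions, not values from zapwriter.cpp.

```cpp
#include <cstdint>
#include <cstdio>

// Illustrative assumption: the largest page size expected on any PAL platform (e.g. arm64 with 64K pages).
static const uint64_t MAX_OS_PAGE_SIZE = 0x10000;

// Round 'value' up to the next multiple of 'alignment' (a power of two).
static uint64_t AlignUp(uint64_t value, uint64_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

int main()
{
    // Hypothetical raw section sizes; each section's RVA is placed on a fresh
    // MAX_OS_PAGE_SIZE boundary so no two sections ever share a virtual page.
    const uint64_t sectionSizes[] = { 0x3200, 0x1800, 0x400 };

    uint64_t rva = MAX_OS_PAGE_SIZE;   // first section follows the headers
    for (uint64_t size : sectionSizes)
    {
        std::printf("section at RVA 0x%llx, size 0x%llx\n",
                    (unsigned long long)rva, (unsigned long long)size);
        rva = AlignUp(rva + size, MAX_OS_PAGE_SIZE);
    }
    return 0;
}
```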

Member

Ok, I've missed that again. I am sorry, I am probably having an off day today. But in that case, it seems it would be good to set the PAL_MAX_PAGE_SIZE / SECTION_ALIGNMENT differently for 32-bit architectures (probably the same as we do for non-PAL in your change) to reduce the virtual address space consumption.

Author

@janvorli That sounds reasonable. Do you have a particular guard name to recommend? We could just use the larger PAL_MAX_PAGE_SIZE for TARGET_ARM64.... I was using the generic PAL because of something @jkotas had said. I also liked the idea of this being in amd64 CI testing, since arm64 testing is so light.

Member

You can use BIT64 && FEATURE_PAL to enable it for ARM64 and AMD64, since the cross tools are built for the same bitness of host and target architecture. I guess that if we changed that in the future, we would introduce something like TARGET_BIT64.

@@ -2445,7 +2439,7 @@ void * MAPMapPEFile(HANDLE hFile)
// First try to reserve virtual memory using ExecutableAllcator. This allows all PE images to be
// near each other and close to the coreclr library which also allows the runtime to generate
// more efficient code (by avoiding usage of jump stubs).
-loadedBase = ReserveMemoryFromExecutableAllocator(pThread, virtualSize);
+loadedBase = ReserveMemoryFromExecutableAllocator(pThread, ((virtualSize-1) & ~VIRTUAL_PAGE_MASK) + VIRTUAL_PAGE_SIZE);
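
For anyone checking the arithmetic: assuming VIRTUAL_PAGE_MASK == VIRTUAL_PAGE_SIZE - 1 and virtualSize > 0, the new expression is the usual round-up-to-page-size. A quick standalone check (not PAL code):

```cpp
#include <cassert>
#include <cstdint>

int main()
{
    const uint64_t VIRTUAL_PAGE_SIZE = 0x1000;               // assume 4K pages for the check
    const uint64_t VIRTUAL_PAGE_MASK = VIRTUAL_PAGE_SIZE - 1;

    for (uint64_t virtualSize = 1; virtualSize <= 5 * VIRTUAL_PAGE_SIZE; virtualSize++)
    {
        uint64_t prExpression = ((virtualSize - 1) & ~VIRTUAL_PAGE_MASK) + VIRTUAL_PAGE_SIZE;
        uint64_t alignedUp    = (virtualSize + VIRTUAL_PAGE_MASK) & ~VIRTUAL_PAGE_MASK;
        assert(prExpression == alignedUp);   // both round virtualSize up to the next page boundary
    }
    return 0;
}
```
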
Member

This calculation is incorrect. The allocated size needs to be the size of the virtual address space range that is needed to map the whole executable file on the current system. That can be more than just the virtualSize aligned up to the page size (each section mapped with adjustment requires one more page of virtual memory)

Author

@janvorli virtualSize covers the whole virtual size of the file including the holes. It seems to be correct.

Member

@sdmaclea your explanation of the virtual address padding explains this as well.

@@ -55,7 +55,13 @@ void ZapWriter::Initialize()
m_FileAlignment = 0x200;
}

+#if defined(FEATURE_PAL) && defined(BIT64)
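
The rest of the guarded block is not visible in this hunk; one plausible shape, with the names and values below treated as assumptions rather than quotes from the change, is:

```cpp
// Sketch only; the actual constants in the PR may differ.
#if defined(FEATURE_PAL) && defined(BIT64)
// 64-bit PAL targets may run on OSes with pages as large as 64K (e.g. arm64),
// so pad sections out to the largest page size we expect to encounter.
#define PAL_MAX_PAGE_SIZE 0x10000
#define SECTION_ALIGNMENT PAL_MAX_PAGE_SIZE
#else
// Elsewhere, keep the historical 4K section alignment.
#define SECTION_ALIGNMENT 0x1000
#endif
```
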
Author

Looks like this change broke arm32

Author

I forced the file alignment back to 4K for 32 bit systems.

@sdmaclea
Author

@janvorli @jkotas I think I have responded to all feedback. All checks are passing.

I am working on #10981, but I am realizing the cleanup of VIRTUAL_PAGE_MASK and VIRTUAL_PAGE_SIZE will introduce merge conflicts with this PR.

I wanted to keep the patches separate to make them easier to review. I can remerge them if you think it is best.

I can put in a temporary workaround to avoid the merge conflict. I think I can make all the changes in #10981 except the ones to src/pal/src/map/map.cpp.

sdmaclea added a commit to sdmaclea/coreclr that referenced this pull request Apr 20, 2017

Also simplify previous section code
@sdmaclea
Author

@dotnet-bot
test Ubuntu arm Cross Release Build
test Tizen armel Cross Debug Build
test Windows_NT x64 Debug Build and Test

@sdmaclea
Author

Ubuntu arm Cross passed once and failed twice with segmentation faults in two different tests.

@dotnet-bot
test Ubuntu arm Cross Release Build

@gkhanna79
Member

@sdmaclea When do you expect to have this merged? I believe the changes you have made in this PR will help my PR (#11040), as the alignment issue you are fixing will let my change consume the IL assembly of CoreLib bound as an NI image (thanks to @janvorli, who helped investigate the issue on Friday), as opposed to forcing it to bind as an IL assembly.

@sdmaclea
Author

@gkhanna79 I am ready to have this merged.

@gkhanna79
Member

@janvorli @jkotas Any further comments before I merge this?

@jkotas
Member

jkotas commented Apr 24, 2017

LGTM

@gkhanna79 gkhanna79 merged commit 3f67146 into dotnet:master Apr 24, 2017
@sdmaclea sdmaclea deleted the PR-Remove-PAL-PE-FileAlignment-Restrictions branch April 24, 2017 18:01
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
sdmaclea added a commit to sdmaclea/coreclr that referenced this pull request Sep 29, 2017
* Remove PAL FileAlignment restriction

* Address PR dotnet#10959 feedback

* Fix amd64 crossgen error

* Respond to review feedback

* Fix arm32 regression

* Prepare to remove VIRTUAL_PAGE_* from map.cpp

Also simplify previous section code

* Rename function to GetVirtualPageSize()