Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Custom Marshalers in custom-ALC-loaded assemblies results in types loaded from crossing ALCs #21606

Merged
merged 18 commits into from
Jan 14, 2019

Conversation

jkoritzinsky
Copy link
Member

@jkoritzinsky jkoritzinsky commented Dec 19, 2018

When using custom AssemblyLoadContexts, if a dll is loaded into two or more custom ALCs that reference P/Invoke into a native dll that is not loaded into the default ALC, the marshalling subsystem will load the an ICustomMarshaler instance from the wrong ALC, leading to a type mismatch when marshalling. This is the underlying bug of #19654.

cc: @jeffschwMSFT @AaronRobinsonMSFT @sdmaclea

@jkoritzinsky
Copy link
Member Author

@dotnet-bot test this please

@jkoritzinsky
Copy link
Member Author

@sdmaclea I've applied your feedback to the test case. The bug still repros even without using a static field to store a singleton instance of the custom marshaler.

@jkoritzinsky
Copy link
Member Author

I've traced the bug to EEMarshalingData::GetCustomMarshalerHelper. The cache that we use for our CustomMarshalerHelper objects doesn't key the elements on the module or assembly. Additionally, the cache is stored on the BaseDomain object. So, if two custom marshalers have the same type name (as a string), cookie, and generic instantiation, we will mistake all of them for the first.

In the case here, we will always mistake the second for the first because they are from the same dll, so even if they are loaded into different domains, the type name and cookie will always be the same.

@janvorli
Copy link
Member

@jkoritzinsky it seems to me that if we moved that cache into LoaderAllocator instead, it would solve the issue, right?

@jkoritzinsky
Copy link
Member Author

jkoritzinsky commented Dec 20, 2018

That would definitely solve this bug. However, that wouldn't fix the bug where you have a two different custom marshaler types with the same name in different assemblies and don't use a MarshalCookie to distinguish them (the underlying bug).

In that case, the cache would only ever select the first-called custom marshaler. This case could happen if someone were to depend on two different libraries that have internal custom marshalers that happen to have the same name but operate on different types.

I have a repro of this locally (an extension of what is in this PR). I'll clean it up and push it out.

@jkoritzinsky
Copy link
Member Author

In any case, we should probably move the cache to the LoaderAllocator anyway so as to not have the TypeHandles for the custom marshalers block or break unloadability.

@janvorli
Copy link
Member

we should probably move the cache to the LoaderAllocator anyway

I agree.

…ixes the custom-marshaler conflict when using unloadable assembly contexts.
@jkoritzinsky
Copy link
Member Author

Moving the cache to LoaderAllocator fixed the bug when using an unloadable ALC. I've got a few ideas for the non-unloadable ALC and the different-assembly-same-type-name cases that I'll try out.

@jkoritzinsky
Copy link
Member Author

@dotnet-bot test Windows_NT x64 Release CoreFX Tests

@jkoritzinsky jkoritzinsky changed the title WIP: Custom Marshalers in custom-ALC-loaded assemblies results in types loaded from crossing ALCs Custom Marshalers in custom-ALC-loaded assemblies results in types loaded from crossing ALCs Jan 3, 2019
Copy link
Member

@AaronRobinsonMSFT AaronRobinsonMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I am good with this. I would like to see how you can provide some detailed guidance on the design of the tests and how it can be expanded.

@jkoritzinsky
Copy link
Member Author

I've added some comments to help give guidance on how to expand those tests if we want to. Let me know if you think they're good or if I should add something more/different.

@AaronRobinsonMSFT
Copy link
Member

Comments appear enough to understand intent. Last comment is about the inline usage. Other than that, LGTM.

@jkoritzinsky
Copy link
Member Author

@dotnet-bot test Ubuntu arm Cross Checked no_tiered_compilation_innerloop Build and Test
@dotnet-bot test Ubuntu arm Cross Checked Innerloop Build and Test

@ctaggart
Copy link

Any chance this is going to be backported to 2.x?

@AaronRobinsonMSFT
Copy link
Member

@ctaggart My gut reaction to back porting this change is still "no". I am seeing more people hitting these issues, but this PR does build on top of changes that were done in the .NET Core 3.0 time frame. This isn't to say the changes can't be untangled and the port made, but it wouldn't be simple due to the detangling and testing that would need to be done. I am willing to be convinced by @jkoritzinsky that this isn't too bad though.

Once 3.0 is release, will it be possible for you to move up to the latest version?

@bording
Copy link

bording commented Mar 20, 2019

@AaronRobinsonMSFT I believe the backporting request is coming from errors being hit in Source Link (dotnet/sourcelink#236) which relies on LibGit2Sharp, and LibGit2Sharp does use custom marshalers.

Would it at least be possible to verify that the source of the problem is this bug? If it instead turns out that there is a bug in LibGit2Sharp that could be fixed, that would be good to know.

I'd try to dig into this myself, but I don't even know where to begin when a potential CLR bug is involved!

@AaronRobinsonMSFT
Copy link
Member

@bording This issue definitely impacts dotnet/sourcelink#236 and should fix it. The problem with porting is purely the cost of the work.

I'd try to dig into this myself, but I don't even know where to begin when a potential CLR bug is involved!

A cherry-pick of the commit yielded enough of a diff that it indicates the port is going to take a bit of time since this area is sensitive enough that we would need to port the testing as well. If you wanted to take a shot at porting this to the release/2.2 branch I would be more than willing to review it and answer any question you have.

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…aded from crossing ALCs (dotnet/coreclr#21606)

* Create repro for dotnet/coreclrdotnet/coreclr#19654

* Update ICustomMarshaler.csproj

* Update ICustomMarshaler.csproj

* Clean up repro per feedback.

* Add test case for different assemblies with the same CustomMarshaler name.

* Move EEMarshalingData cache from AppDomain to LoaderAllocator. This fixes the custom-marshaler conflict when using unloadable assembly contexts.

* Internalize the LoaderHeap* parameter.

* Add the pointer to the requesting assembly to the hashtable key.

* Fix linux-musl build break.

* Move Crst out of FEATURE_COMINTEROP block.

* Make sure to copy over the assembly pointer to the key that's actually stored in the hash table.

* Add comment for m_invokingAssembly.

* Move all usages of EEMarshallingData to hang off the correct loader allocator instead of always the global one.

* Change to m_InteropDataCrst since this EEMarshallingData can be used in preemptive GC mode.

* Always init m_InteropDataCrst (since it's used by EEMarshallingData as well as COM).

* PR Feedback.

* Remove extraneous inlines.


Commit migrated from dotnet/coreclr@b465094
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants