
PathMatchingResourcePatternResolver finds duplicate resources for executable jars but not for executable wars [SPR-14936] #19503


Closed
spring-projects-issues opened this issue Nov 23, 2016 · 4 comments
Labels: in: core (Issues in core modules: aop, beans, core, context, expression); type: enhancement (A general enhancement)


@spring-projects-issues
Collaborator

spring-projects-issues commented Nov 23, 2016

Andy Wilkinson opened SPR-14936 and commented

PathMatchingResourcePatternResolver behaves differently depending on the file extension of a Spring Boot executable archive that's been launched with java -jar. If the archive is a .jar file, duplicate resources will be found, whereas if the archive is a .war file, they will not. This is due to the logic in addAllClassLoaderJarRoots that provides special treatment for .jar files.

I'll attach an application that reproduces the problem.

If you package and run it:

mvn clean package && java -jar duplicate-resources-0.0.1-SNAPSHOT.jar

You should see the following output:

jar:file:/Users/awilkinson/duplicate-resources-0.0.1-SNAPSHOT.jar!/BOOT-INF/classes!/a.zzz
jar:file:/Users/awilkinson/duplicate-resources-0.0.1-SNAPSHOT.jar!/BOOT-INF/classes!/nested/b.zzz
jar:file:/Users/awilkinson/duplicate-resources-0.0.1-SNAPSHOT.jar!/BOOT-INF/classes/a.zzz
jar:file:/Users/awilkinson/duplicate-resources-0.0.1-SNAPSHOT.jar!/BOOT-INF/classes/nested/b.zzz

Note that there are two URLs each for a.zzz and nested/b.zzz, one found via the nested BOOT-INF/classes "archive" and the other found via the jar root. We only want the entries found via the nested archive.

If you run it as a .war file:

cp duplicate-resources-0.0.1-SNAPSHOT.jar duplicate-resources-0.0.1-SNAPSHOT.war && java -jar duplicate-resources-0.0.1-SNAPSHOT.war

The duplicates are gone and we get the desired result:

jar:file:/Users/awilkinson/duplicate-resources-0.0.1-SNAPSHOT.war!/BOOT-INF/classes!/a.zzz
jar:file:/Users/awilkinson/duplicate-resources-0.0.1-SNAPSHOT.war!/BOOT-INF/classes!/nested/b.zzz

I'd like the behaviour to be consistent, irrespective of the file extension that's used for the archive passed to java -jar.

One final data point. If the archive is unpacked:

mkdir unpacked && cd unpacked && unzip ../duplicate-resources-0.0.1-SNAPSHOT.jar

And then run:

java -cp . org.springframework.boot.loader.JarLauncher

The duplicates do not occur:

file:/Users/awilkinson/unpacked/BOOT-INF/classes/a.zzz
file:/Users/awilkinson/unpacked/BOOT-INF/classes/nested/b.zzz

I suspect this is because the URLs are identical, i.e. they do not have the subtle / vs !/ difference. This may give us an avenue to explore for fixing the problem in Spring Boot, but I'd like this to be investigated on the Framework side too as the file extension-specific behaviour is rather surprising.
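To make the / vs !/ difference concrete, here is a minimal Java sketch that collapses the two URL spellings to a common key and keeps one URL per key, preferring the spelling that goes through the nested archive. The class and method names are hypothetical and the /tmp/app.jar paths are made up. This only illustrates the observation above; it is not what Spring Framework or Spring Boot does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class NestedUrlDedupSketch {

    // Treat "!/" and "/" as equivalent so both spellings map to the same key.
    static String key(String url) {
        return url.replace("!/", "/");
    }

    // Count "!/" separators; more of them means the URL goes through a nested archive.
    static int nesting(String url) {
        int count = 0;
        for (int i = url.indexOf("!/"); i >= 0; i = url.indexOf("!/", i + 2)) {
            count++;
        }
        return count;
    }

    // Keep one URL per key, preferring the most deeply nested spelling.
    static List<String> dedupe(List<String> urls) {
        Map<String, String> byKey = new LinkedHashMap<>();
        for (String url : urls) {
            byKey.merge(key(url), url, (a, b) -> nesting(b) > nesting(a) ? b : a);
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<String> found = Arrays.asList(
            "jar:file:/tmp/app.jar!/BOOT-INF/classes!/a.zzz",
            "jar:file:/tmp/app.jar!/BOOT-INF/classes/a.zzz");
        dedupe(found).forEach(System.out::println);
    }
}
```

A string-level normalization like this is fragile (a literal "!/" inside an entry name would confuse it), which is one reason producing identical URLs in the first place looks like the cleaner route.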


Affects: 4.3.4

Attachments:

Issue Links:

Referenced from: commits f16d453, b3e94dc

@spring-projects-issues
Collaborator Author

Juergen Hoeller commented

As far as I see, we're simply not considering .war files as searchable classpath archives, whereas anything ending with .jar seems like a candidate (even if it has nested archives). I guess we could search any kind of root archive there, just skipping classpath directories.

However, should we actually search other archives as well, simply to be consistent, even if we don't actually want to find those duplicates? Or should we explicitly ignore .jar files if they happen to have nested archive roots inside, avoiding the duplicates problem upfront? However, how exactly would we identify such to-be-ignored .jar files?

Since that part of PathMatchingResourcePatternResolver is only really there for Boot to perform searches in the very root of the classpath, I'm happy to refine it towards your purposes.

@spring-projects-issues
Collaborator Author

Andy Wilkinson commented

> I guess we could search any kind of root archive there, just skipping classpath directories.

An executable war is a jar file, just one with a different extension. Searching the root of any jar: URL irrespective of file extension makes sense to me as it's consistent. As you've noted, it doesn't help with the duplicate problem though.

> However, how exactly would we identify such to-be-ignored .jar files?

I think that's really the crux of the matter here. I've been wrestling with it for a while. If we had a good answer to that, whether or not .war files should be searched would become an unrelated problem as, presumably, the deduplication logic could work for any file extension.

I'd like to explore the avenue of removing the subtle !/ vs / difference from the URLs that Spring Boot creates.

> Since that part of PathMatchingResourcePatternResolver is only really there for Boot to perform searches in the very root of the classpath, I'm happy to refine it towards your purposes.

Thanks. Somewhat ironically, I don't think we need that behaviour for executable jars built with Spring Boot 1.4. However, it is still needed for a shaded jar and for anyone using Spring Boot 1.3.

I'll let you know how I get on with addressing this on the Spring Boot side.

@spring-projects-issues
Collaborator Author

Juergen Hoeller commented

The problem is that, coming out of URLClassLoader.getURLs() or java.class.path, we get the plain file locations of the archives, not a "jar:" URL pointing into a JAR file. So we can only identify them by their extension, or try to resolve them and check whether they point to an archive. We can certainly try the relaxed variant here and see whether we choke on anything.
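The two options mentioned here can be sketched with plain JDK calls: classify a classpath entry by its file extension, or try to open it as an archive and see whether that works. The class and method names below are hypothetical, and this is not the code in PathMatchingResourcePatternResolver.

```java
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipFile;

// Hypothetical helper, for illustration only.
class ClasspathEntrySketch {

    // Option 1: trust the extension. A .war (or any other suffix) is not recognized.
    static boolean looksLikeJarByExtension(String path) {
        return path.endsWith(".jar");
    }

    // Option 2: try to open the entry as a zip archive, whatever it is called.
    static boolean opensAsArchive(String path) {
        File file = new File(path);
        if (!file.isFile()) {
            return false; // directories and missing entries are not archives
        }
        try (ZipFile zip = new ZipFile(file)) {
            return true;
        } catch (IOException ex) {
            return false; // not a readable zip/jar
        }
    }

    public static void main(String[] args) {
        // java.class.path holds plain file locations, not "jar:" URLs.
        String classpath = System.getProperty("java.class.path", "");
        for (String entry : classpath.split(File.pathSeparator)) {
            System.out.println(entry
                + " byExtension=" + looksLikeJarByExtension(entry)
                + " opensAsArchive=" + opensAsArchive(entry));
        }
    }
}
```

Run against an executable archive renamed to .war, the first check would say "no" and the second "yes", which mirrors the inconsistency reported in this issue.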

@spring-projects-issues
Collaborator Author

Juergen Hoeller commented

Since we have defensive URL building and exists() checks there anyway, I've simply removed the specific condition for the jar extension. We're trying all such root URLs as jar files now, expecting the exists() to fail for a "jar:" URL that we're building if it is not actually a jar-like archive.
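As a rough illustration of that approach, the following Java sketch builds a "jar:" root URL for any classpath file and treats a failed connection as "not a jar-like archive". It uses only JDK classes as a stand-in; the class and method names are hypothetical, and the actual change in PathMatchingResourcePatternResolver relies on its own resource handling and exists() checks rather than this code.

```java
import java.io.File;
import java.io.IOException;
import java.net.JarURLConnection;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper, for illustration only.
class JarRootProbeSketch {

    // Build "jar:<file-url>!/" for any classpath file, regardless of its extension.
    static URL jarRootUrl(File file) throws IOException {
        return new URL("jar:" + file.toURI().toURL().toExternalForm() + "!/");
    }

    // Stand-in for an exists() check: opening the archive fails if it is not zip-formatted.
    static boolean jarRootExists(File file) {
        try {
            URLConnection con = jarRootUrl(file).openConnection();
            con.setUseCaches(false); // avoid keeping the probed file open in the JDK cache
            if (con instanceof JarURLConnection) {
                ((JarURLConnection) con).getJarFile().close();
                return true;
            }
            return false;
        } catch (IOException ex) {
            return false; // not a jar-like archive, or not readable
        }
    }

    public static void main(String[] args) {
        for (String arg : args) {
            System.out.println(arg + " -> " + jarRootExists(new File(arg)));
        }
    }
}
```

With a check of this kind, a .war (or any other suffix) that is really a jar is searched the same way as a .jar, while a file that is not an archive is skipped.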
