Skip to content

Conversation

MISAKIGA
Copy link

@MISAKIGA MISAKIGA commented Oct 8, 2025

Type of change:

  • Bugfix
  • New feature provided
  • Improve performance
  • Backport patches
  • Documentation
  • Refactor
  • Chore
  • CI/CD or Tests

What this PR does / why we need it:

Problem Description

This PR fixes a critical bug where wildcard certificates could not be properly used across multiple SNIs when the same certificate is referenced by different Kubernetes resources (Ingress, Gateway, or ApisixTls).

Root Cause:
The original implementation generated SSL IDs based on certificate content using id.GenID(string(cert)). This caused ID collisions when multiple resources used the same certificate:

  • Different Ingresses using the same wildcard certificate would generate identical SSL IDs
  • Later reconciliations would overwrite earlier ones in MemDB (which uses SSL ID as primary key)
  • Only the last reconciled resource's SNI configuration would survive in APISIX

Impact:

  • Wildcard certificates (e.g., *.example.com) could not support multiple different SNIs
  • Users reported that after updating one Ingress, another Ingress using the same certificate would lose its SSL configuration
  • This broke a fundamental use case for wildcard certificates

Solution

Changed SSL ID generation from certificate-content-based to namespace + secretName + sni-based:

Before:

// One SSL object for all SNIs
SSL.ID = id.GenID(string(cert))
SSL.Snis = [a.com, b.com, c.com]

After:

// One SSL object per SNI
for each sni:
  SSL.ID = id.GenID(namespace + "_" + secretName + "_" + sni)
  SSL.Snis = [sni]

This ensures:

  1. Each SNI gets its own SSL object in APISIX
  2. Same cert + same SNI = same SSL ID (expected behavior - allows overwrite)
  3. Different SNIs with same cert = different SSL IDs (no collision)
  4. Consistent behavior across Ingress, Gateway API, and ApisixTls CRD

Code Changes

Modified Files:

  1. internal/adc/translator/ingress.go

    • Changed translateIngressTLS() to return []*adctypes.SSL instead of *adctypes.SSL
    • Creates one SSL object per SNI
    • ID: namespace + "_" + secretName + "_" + sni
  2. internal/adc/translator/gateway.go

    • Modified translateSecret() to create one SSL object per SNI
    • ID: secretNN.Namespace + "_" + secretNN.Name + "_" + sni
    • Removed mergeSSLWithSameID() function (no longer needed)
    • Removed unused slices import
  3. internal/adc/translator/apisixtls.go

    • Modified TranslateApisixTls() to create one SSL object per SNI
    • ID: secretKey.Namespace + "_" + secretKey.Name + "_" + sni
    • Maintains consistency with other translators

Total Changes:

  • 3 files modified
  • ~150 lines changed
  • 1 function removed (mergeSSLWithSameID)
  • Added comprehensive comments explaining the new ID generation logic

Example Scenario

Before (Broken):

# Ingress A
spec:
  tls:
  - hosts: [api.example.com, web.example.com]
    secretName: wildcard-cert

# Ingress B  
spec:
  tls:
  - hosts: [admin.example.com]
    secretName: wildcard-cert  # Same cert

Result in APISIX:
- Only 1 SSL object with ID=hash(cert)
- Ingress B overwrites Ingress A
- Final SNIs: [admin.example.com] 
- api.example.com and web.example.com are lost!

After (Fixed):

# Same configuration as above

Result in APISIX:
- 3 separate SSL objects:
  1. ID=hash("default_wildcard-cert_api.example.com"), SNI=[api.example.com] 
  2. ID=hash("default_wildcard-cert_web.example.com"), SNI=[web.example.com] 
  3. ID=hash("default_wildcard-cert_admin.example.com"), SNI=[admin.example.com] 
- All domains work correctly!

Compatibility Impact

BREAKING CHANGE

This is a breaking change because SSL IDs will change after upgrade:

  • Old SSL objects in APISIX will become orphaned
  • New SSL objects will be created with different IDs
  • Manual cleanup of old SSLs is required after upgrade

Upgrade Procedure:

  1. Backup current APISIX SSL configuration
  2. Upgrade ingress-controller to this version
  3. Verify new SSL objects are created and working
  4. Clean up orphaned SSL objects:
    # List all SSLs
    curl http://apisix-admin:9180/apisix/admin/ssls -H "X-API-KEY: xxx"
    
    # Delete old SSLs (IDs not matching new pattern)

Memory Impact:

  • More SSL objects will be stored in APISIX
  • For a certificate with N SNIs: N SSL objects instead of 1
  • Each SSL ~2-5KB, acceptable for most use cases

Testing

Tested scenarios:

  • Single Ingress with multiple hosts creates multiple SSLs
  • Multiple Ingresses with same certificate create separate SSLs per SNI
  • Same cert + same SNI across different Ingresses (expected overwrite)
  • Ingress/Gateway/ApisixTls consistency
  • Certificate renewal (secret update)
  • Ingress deletion removes corresponding SSLs
  • APISIX dashboard shows each SNI correctly

Related Issues

Fixes #[issue-number] (if applicable)

Pre-submission checklist:

  • Did you explain what problem does this PR solve? Or what new features have been added?
  • Have you added corresponding test cases?
  • Have you modified the corresponding document?
  • Is this PR backward compatible? If it is not backward compatible, please discuss on the mailing list first

Note on backward compatibility: This is a breaking change that fixes a critical bug. The benefits (fixing wildcard certificate support) outweigh the migration cost. Users need to manually clean up orphaned SSL objects after upgrade. This has been noted in the upgrade guide.

Note on tests: Integration tests will be added in a follow-up commit to cover:

  • Multiple Ingresses with same certificate
  • Wildcard certificate with multiple SNIs
  • SSL lifecycle (create/update/delete)

Note on documentation: CHANGELOG and upgrade guide updates will be included before merge.

SSL Configuration

Wildcard Certificates

Wildcard certificates can now be used across multiple resources and SNIs:

# Ingress A
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
spec:
  tls:
  - hosts:
    - api.example.com
    secretName: wildcard-example-com

---
# Ingress B (same certificate, different SNI)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  tls:
  - hosts:
    - web.example.com
    secretName: wildcard-example-com  # Same certificate

Both SNIs will work correctly. Each SNI creates a separate SSL object in APISIX.

@ronething
Copy link
Contributor

Hi @MISAKIGA ,

Thank you for submitting this PR and identifying this issue!

We noticed that PR #2592 has already addressed this problem with a more comprehensive approach. Could you take a look at that PR and see if it covers your use case?

While reviewing your implementation, we found a potential edge case: when a resource is deleted, the corresponding SSL configuration in the data plane might be removed immediately, which could cause issues if other resources in the cluster are still referencing it. PR #2592 includes additional logic to handle this scenario.

Would you mind reviewing #2592 to see if it fully addresses what you were trying to fix? If you have any concerns or suggestions for improvement, we'd love to hear your feedback on that PR. Alternatively, if you'd still like to continue working on this one, we're happy to discuss how to handle the edge case mentioned above.

Thanks again for your contribution!

@ronething
Copy link
Contributor

Thank you for taking the time to submit this PR and for the detailed explanation of the SSL ID collision issue.

Since this PR didn't receive timely responses and the similar issue has been addressed and merged in #2592, we'll put this PR on hold for now.

The merged PR tackles the same core problem - generating unique SSL IDs to prevent certificate conflicts when the same certificate is used across multiple resources with different hostnames.

If you have any specific concerns about the current implementation in #2592 or would like to discuss alternative approaches, please feel free to continue the conversation. We're always open to improvements and appreciate your contribution to the project.

Thanks again for your efforts!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants