This issue was moved to a discussion.

MSSQL Exporter - Negative Serial Number Errors #729

Closed
jmbecker opened this issue Apr 21, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@jmbecker

Describe the bug
The MSSQL exporter runs without issue after being set up, then starts throwing "TLS Handshake failed" errors some time later.

To Reproduce
Steps to reproduce the behavior:

  1. Set up prometheus.exporter.mssql with a configuration like the following in Grafana Alloy:
     prometheus.exporter.mssql "standard_exporter" {
       connection_string = "sqlserver://user:pass@sql-server:1433?encrypt=true&trustservercertificate=true"
       query_config = local.file.mssql_standard.content
     }
  2. Metrics begin populating as expected.
  3. Wait several hours (up to 24).
  4. See invalid metric errors stating that a TLS handshake error has occurred:
ts=2025-04-21T14:22:09.549504734Z level=error msg="Invalid metric description." component_path=/ component_id=prometheus.exporter.mssql.standard_exporter err="[mssqlintegration,collector=mssql_standard,query=mssql_user_errors_total] TLS Handshake failed: tls: failed to parse certificate from server: x509: negative serial number"

Expected behavior
Once the MSSQL exporter has been set up, it should continue to work without interruption as long as the SQL Server instance being scraped is up and ready to receive queries. Since I'm trusting the server certificate, I expect TLS to keep working.

Configuration
We're using the exporter via Grafana Alloy inside a Kubernetes cluster. We deploy the Grafana Alloy collector with the flag --feature.community-components.enabled. We're using other prometheus-type components without issue.

Additional context

  • Both the standard exporter we have set up AND a custom exporter are having the same issue.
  • This problem seems to stem from a behavior change in Go 1.23+. This issue in the Go repository mentions that it will probably not be fixed. The workaround would be to add a godebug setting like so: godebug x509negativeserial=1
  • The crypto/x509 package calls out this change in behaviour here
  • Interestingly, scaling the MSSQL container down and back up, followed by changing the SQL scrape configuration appears to allow the scrape to work again (until it stops working).
@jmbecker jmbecker added the bug Something isn't working label Apr 21, 2025
@burningalchemist
Owner

Hey @jmbecker, thanks for reporting the issue.

As this problem originates on the Microsoft side (in our case, in the way the certificate is generated), it indeed won't be fixed in Go. It is a violation of RFC 5280 section 4.1.2.2, which states:

The serial number MUST be a positive integer assigned by the CA to each certificate.

sql_exporter uses the official microsoft/go-mssqldb driver, so the connection handling (including TLS) happens there; there is nothing custom on our end.

The issue was accepted and fixed by the Microsoft team, and the fix is reflected in this update: https://learn.microsoft.com/en-us/troubleshoot/sql/releases/sqlserver-2022/cumulativeupdate18#3867855. You may need to check for a new image with the fix applied.

Also, one of the solutions suggested by the community is to generate a new certificate and make sure the negative numbers aren't used (microsoft/mssql-docker#895 (comment)).
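
If you go the route of generating your own certificate, here is a minimal Go sketch of creating a self-signed one whose serial number is guaranteed to be positive. The helper names newSerial and selfSigned, the P-256 key, and the one-year validity are my assumptions for illustration only; installing the resulting certificate in SQL Server is a separate step not covered here.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newSerial (hypothetical helper) returns a random serial in [1, 2^127],
// which is always positive and well within the 20-octet limit of RFC 5280.
func newSerial() (*big.Int, error) {
	limit := new(big.Int).Lsh(big.NewInt(1), 127)
	n, err := rand.Int(rand.Reader, limit) // uniform in [0, 2^127)
	if err != nil {
		return nil, err
	}
	return n.Add(n, big.NewInt(1)), nil
}

// selfSigned (hypothetical helper) creates a self-signed server certificate
// for host and parses it back with crypto/x509 as a sanity check.
func selfSigned(host string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	serial, err := newSerial()
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: serial,
		Subject:      pkix.Name{CommonName: host},
		DNSNames:     []string{host},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(365 * 24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	// Passing tmpl as both template and parent makes the certificate self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := selfSigned("sql-server")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("serial=%s positive=%t\n", cert.SerialNumber, cert.SerialNumber.Sign() > 0)
}
```

The point of the sketch is only the serial number handling; any tool that produces a positive serial should work equally well.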

If none of the above works for you, the GODEBUG workaround can still be applied without any code changes. You need to pass the setting to your container as an environment variable, i.e. GODEBUG=x509negativeserial=1. Since you use Grafana Alloy (which is not affiliated with this project), that is the container to add it to. The Go runtime will pick up the setting after a restart.
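
If you want to confirm that the variable actually reached a process environment, a tiny check like the one below may help. It is only a sketch: godebugValue is a hypothetical helper of mine, it assumes the usual comma-separated name=value form of GODEBUG, and picking the last matching entry is a choice made in this sketch. It would have to run with the same environment as the Alloy container to tell you anything useful.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// godebugValue (hypothetical helper) looks up key in a GODEBUG-style string
// of comma-separated name=value pairs. If the key appears more than once,
// this sketch returns the last occurrence.
func godebugValue(godebug, key string) (string, bool) {
	val, found := "", false
	for _, kv := range strings.Split(godebug, ",") {
		k, v, ok := strings.Cut(strings.TrimSpace(kv), "=")
		if ok && k == key {
			val, found = v, true
		}
	}
	return val, found
}

func main() {
	v, ok := godebugValue(os.Getenv("GODEBUG"), "x509negativeserial")
	if !ok {
		fmt.Println("x509negativeserial is not set in GODEBUG")
		return
	}
	fmt.Printf("x509negativeserial=%s\n", v)
}
```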

Given that the problem was confirmed and solved by Microsoft, I don't plan to make any changes to the code. Moreover, such changes wouldn't have any effect on Grafana Alloy installations, as they use sql_exporter as an external library, so the runtime settings have to be handled there.

Since it's not a bug of sql_exporter, I'll move it to discussions. But I'm happy to assist you further in solving your problem, please let me know. 👍

Repository owner locked and limited conversation to collaborators Apr 22, 2025
@burningalchemist burningalchemist converted this issue into discussion #730 Apr 22, 2025
