Description
Component(s)
receiver/sqlquery
Is your feature request related to a problem? Please describe.
At Sumo Logic, we have customers who want to retrieve logs from SQL databases (MySQL, PostgreSQL, MS SQL Server, possibly others) into the OpenTelemetry Collector. The SQL Query receiver can currently create metrics based on SQL queries. It seems natural to extend this feature to logs as well.
Describe the solution you'd like
Given the following example that creates metrics, taken from the SQL Query receiver's README:
```yaml
receivers:
  sqlquery:
    driver: postgres
    datasource: "host=localhost port=5432 user=postgres password=s3cr3t sslmode=disable"
    queries:
      - sql: "select count(*) as count, genre from movie group by genre"
        metrics:
          - metric_name: movie.genres
            value_column: "count"
            attribute_columns: [ "genre" ]
            static_attributes:
              dbinstance: mydbinstance
```

the following configuration seems to make sense for logs as a starter:
```yaml
receivers:
  sqlquery:
    driver: postgres
    datasource: "host=localhost port=5432 user=postgres password=s3cr3t sslmode=disable"
    queries:
      - sql: "select * from logs_table"
        logs:
          - # properties to be defined
```

The features that could be considered for logs are:
- defining the columns for body, severity, timestamp etc. (similar to `value_column` for metrics)
- creating structured body from the result set
- keeping track of the rows in the table that were already read/processed, e.g. by storing the highest row ID in persistent storage
- keeping consistency with metrics, e.g. with `attribute_columns` and `static_attributes`
- other? "let me know in the comments" 🙂
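To make the list above more concrete, here is a rough sketch of how such a logs configuration might look. The property names `body_column`, `severity_column`, `timestamp_column`, `tracking_column` and `tracking_start_value`, the `$1` placeholder, and the column names in the query are all hypothetical and only meant to start the discussion. Only `attribute_columns` and `static_attributes` are borrowed from the existing metrics configuration.

```yaml
receivers:
  sqlquery:
    driver: postgres
    datasource: "host=localhost port=5432 user=postgres password=s3cr3t sslmode=disable"
    queries:
      # Hypothetical: "$1" would be bound to the last tracked value,
      # so that only rows not yet processed are returned.
      - sql: "select id, created_at, level, message, service from logs_table where id > $1 order by id"
        # Hypothetical: column whose highest value is remembered between runs.
        tracking_column: id
        # Hypothetical: value used on the first run, before anything is stored.
        tracking_start_value: "0"
        logs:
          - body_column: message          # hypothetical: column used as the log body
            severity_column: level        # hypothetical: column mapped to severity
            timestamp_column: created_at  # hypothetical: column mapped to the log timestamp
            # Kept consistent with the metrics configuration:
            attribute_columns: [ "service" ]
            static_attributes:
              dbinstance: mydbinstance
```

Where the tracked value would be persisted (for example in a storage extension) is left open in this sketch.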
I'd be happy to work on the implementation if this feature gets accepted.
Describe alternatives you've considered
I don't see any alternative other than creating a custom binary that does this and sends the data to the collector via OTLP.
Additional context
Here's a past proposal to achieve a similar goal by adding a new component:
I believe adding the functionality to the SQL Query receiver is the right approach, as it already does the same thing for metrics. Reusing its configuration and its code for connecting to DBs, running queries, etc. also makes sense.