Skip to content

multiple opentelemetry fixups #27

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 22, 2025
Merged

multiple opentelemetry fixups #27

merged 1 commit into from
Apr 22, 2025

Conversation

lukaslihotzki-f
Copy link

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct
    (run the linters)

@lukaslihotzki-f lukaslihotzki-f requested a review from a team as a code owner April 17, 2025 17:22
@lukaslihotzki-f lukaslihotzki-f changed the title refactor(opentelemetry): remove unused import multiple opentelemetry fixups Apr 17, 2025
@lukaslihotzki-f lukaslihotzki-f force-pushed the ll/opentelemetry branch 4 times, most recently from 5af0397 to 50e6b34 Compare April 17, 2025 19:00
Copy link

codecov bot commented Apr 17, 2025

Codecov Report

Attention: Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

Project coverage is 78.65%. Comparing base (7c4d1f7) to head (6197776).
Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
synapse/logging/opentracing.py 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #27      +/-   ##
==========================================
- Coverage   78.67%   78.65%   -0.03%     
==========================================
  Files         472      472              
  Lines       67635    67657      +22     
  Branches    10417    10421       +4     
==========================================
+ Hits        53215    53218       +3     
- Misses      10901    10919      +18     
- Partials     3519     3520       +1     
Files with missing lines Coverage Δ
synapse/config/logger.py 37.19% <ø> (ø)
synapse/logging/opentracing.py 70.11% <0.00%> (-3.34%) ⬇️

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1556930...6197776. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link

@jason-famedly jason-famedly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Largely, I'm ok with all these changes. Things are working again in CI, which was my major hang up.

Having said that, something is still not right in that I can not seem to get log data properly out consistently. Whenever Synapse hit's a /health request, it trips over itself and flushes thousand of log lines about a maximum recursion depth exceeded.

I think this may be as simple as changing

return None, None

at the first value to just be something 'not None', as that is what is tripping up opentelemetry. (As an option, at one of the call sites the user agent string is being produced and when it is 'not None' the just use -. Similar thing would be ok for me, as this seems to be only used for rendering text display like logs)

Probably a hundred of these

2025-04-18 17:15:41,906 - opentelemetry.attributes - 111 - WARNING - GET-0 - Invalid type NoneType for attribute 'requester' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types

then a repeating version of

2025-04-18 17:15:41,784 - twisted - 287 - CRITICAL - sentinel - 
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/twisted/web/server.py", line 213, in process
    self.render(resrc)
  File "/usr/local/lib/python3.12/site-packages/synapse/http/site.py", line 329, in render
    self._started_processing(servlet_name)
  File "/usr/local/lib/python3.12/site-packages/synapse/http/site.py", line 463, in _started_processing
    self.synapse_site.access_logger.debug(
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1527, in debug
    self._log(DEBUG, msg, args, **kwargs)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1684, in _log
    self.handle(record)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1700, in handle
    self.callHandlers(record)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1762, in callHandlers
    hdlr.handle(record)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1028, in handle
    self.emit(record)
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/sdk/_logs/_internal/__init__.py", line 562, in emit
    logger.emit(self._translate(record))
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/sdk/_logs/_internal/__init__.py", line 541, in _translate
    return LogRecord(
           ^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/sdk/_logs/_internal/__init__.py", line 198, in __init__
    "attributes": BoundedAttributes(
                  ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/attributes/__init__.py", line 170, in __init__
    self[key] = value
    ~~~~^^^^^
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/attributes/__init__.py", line 187, in __setitem__
    value = _clean_attribute(key, value, self.max_value_len)  # type: ignore
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/attributes/__init__.py", line 111, in _clean_attribute
    _logger.warning(
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1551, in warning
    self._log(WARNING, msg, args, **kwargs)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1684, in _log
    self.handle(record)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1700, in handle
    self.callHandlers(record)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1762, in callHandlers
    hdlr.handle(record)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1028, in handle
    self.emit(record)
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/sdk/_logs/_internal/__init__.py", line 562, in emit
    logger.emit(self._translate(record))
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/sdk/_logs/_internal/__init__.py", line 541, in _translate
    return LogRecord(
           ^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/sdk/_logs/_internal/__init__.py", line 198, in __init__
    "attributes": BoundedAttributes(
                  ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/attributes/__init__.py", line 170, in __init__
    self[key] = value
    ~~~~^^^^^
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/attributes/__init__.py", line 187, in __setitem__
    value = _clean_attribute(key, value, self.max_value_len)  # type: ignore
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/opentelemetry/attributes/__init__.py", line 111, in _clean_attribute
    _logger.warning(

Having said all that, I do not see any reason to keep this held up. Just....don't turn it on yet 😀
(Should probably get another opinion tho, just in case my eyes got crossed somewhere)

@lukaslihotzki-f
Copy link
Author

I have fixed the problem by not setting None attributes in e2601ff. I still don't know who is wrong here: Either synapse by setting None attributes on LogRecord, or opentelemetry.attributes by expecting that LogRecord attributes are not None.

https://docs.python.org/3/library/logging.html#logging.Logger.debug and https://docs.python.org/3/library/logging.html#logrecord-attributes specify nothing about the allowed types in LogRecord attributes, so I'd assume that any type can be used. I don't have read the entire documentation though. https://docs.python.org/3/library/logging.html#logging.Logger.debug suggests that LogRecord should always have the same keys present. Therefore, setting these attributes to "-" may be better than omitting them.

@lukaslihotzki-f
Copy link
Author

I still don't know who is wrong here: Either synapse by setting None attributes on LogRecord, or opentelemetry.attributes by expecting that LogRecord attributes are not None.

It was opentelemetry: open-telemetry/opentelemetry-python#4342

With the current main of opentelemetry, None attributes work fine. I have verified that the warning no longer appears with this version and replaced my custom fix with a version bump. Therefore, we have to wait for another release until we can switch to a released version of opentelemetry.

Copy link
Member

@nico-famedly nico-famedly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you squash this down before I approve it please? :)

@lukaslihotzki-f lukaslihotzki-f force-pushed the ll/opentelemetry branch 5 times, most recently from ce4eba0 to b624ff0 Compare April 22, 2025 11:35
@lukaslihotzki-f lukaslihotzki-f merged commit 6197776 into master Apr 22, 2025
54 of 55 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants