-
Notifications
You must be signed in to change notification settings - Fork 457
chore(llmobs): [MLOB-3042] improve llmobs span event #14600
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Bootstrap import analysisComparison of import times between this PR and base. SummaryThe average import time from this PR is: 272 ± 2 ms. The average import time from base is: 273 ± 2 ms. The import time difference between this PR and base is: -1.04 ± 0.1 ms. Import time breakdownThe following import paths have shrunk:
|
Performance SLOsComparing candidate nicole-cybul/improve-llmobs-span-event (16de31e) with baseline main (d5f9551) 📈 Performance Regressions (2 suites)📈 iastaspectsospath - 24/24✅ ospathbasename_aspectTime: ✅ 5.121µs (SLO: <10.000µs 📉 -48.8%) vs baseline: 📈 +21.6% Memory: ✅ 37.375MB (SLO: <39.000MB -4.2%) vs baseline: +4.8% ✅ ospathbasename_noaspectTime: ✅ 1.080µs (SLO: <10.000µs 📉 -89.2%) vs baseline: +1.3% Memory: ✅ 37.336MB (SLO: <39.000MB -4.3%) vs baseline: +4.7% ✅ ospathjoin_aspectTime: ✅ 7.366µs (SLO: <10.000µs 📉 -26.3%) vs baseline: 📈 +18.5% Memory: ✅ 37.375MB (SLO: <39.000MB -4.2%) vs baseline: +4.9% ✅ ospathjoin_noaspectTime: ✅ 2.308µs (SLO: <10.000µs 📉 -76.9%) vs baseline: +0.7% Memory: ✅ 37.336MB (SLO: <39.000MB -4.3%) vs baseline: +4.7% ✅ ospathnormcase_aspectTime: ✅ 3.509µs (SLO: <10.000µs 📉 -64.9%) vs baseline: +0.2% Memory: ✅ 37.395MB (SLO: <39.000MB -4.1%) vs baseline: +4.9% ✅ ospathnormcase_noaspectTime: ✅ 0.566µs (SLO: <10.000µs 📉 -94.3%) vs baseline: +0.4% Memory: ✅ 37.356MB (SLO: <39.000MB -4.2%) vs baseline: +4.7% ✅ ospathsplit_aspectTime: ✅ 5.785µs (SLO: <10.000µs 📉 -42.1%) vs baseline: 📈 +20.0% Memory: ✅ 37.356MB (SLO: <39.000MB -4.2%) vs baseline: +4.7% ✅ ospathsplit_noaspectTime: ✅ 1.587µs (SLO: <10.000µs 📉 -84.1%) vs baseline: +0.2% Memory: ✅ 37.395MB (SLO: <39.000MB -4.1%) vs baseline: +4.9% ✅ ospathsplitdrive_aspectTime: ✅ 4.234µs (SLO: <10.000µs 📉 -57.7%) vs baseline: 📈 +16.2% Memory: ✅ 37.395MB (SLO: <39.000MB -4.1%) vs baseline: +4.8% ✅ ospathsplitdrive_noaspectTime: ✅ 0.696µs (SLO: <10.000µs 📉 -93.0%) vs baseline: -0.5% Memory: ✅ 37.356MB (SLO: <39.000MB -4.2%) vs baseline: +4.7% ✅ ospathsplitext_aspectTime: ✅ 5.481µs (SLO: <10.000µs 📉 -45.2%) vs baseline: 📈 +20.3% Memory: ✅ 37.356MB (SLO: <39.000MB -4.2%) vs baseline: +4.7% ✅ ospathsplitext_noaspectTime: ✅ 1.381µs (SLO: <10.000µs 📉 -86.2%) vs baseline: ~same Memory: ✅ 37.375MB (SLO: <39.000MB -4.2%) vs baseline: +4.9% 📈 telemetryaddmetric - 30/30✅ 1-count-metric-1-timesTime: ✅ 3.292µs (SLO: <20.000µs 📉 -83.5%) vs baseline: +2.6% Memory: ✅ 31.497MB (SLO: <34.000MB -7.4%) vs baseline: +4.9% ✅ 1-count-metrics-100-timesTime: ✅ 218.780µs (SLO: <250.000µs 📉 -12.5%) vs baseline: ~same Memory: ✅ 31.379MB (SLO: <34.000MB -7.7%) vs baseline: +4.7% ✅ 1-distribution-metric-1-timesTime: ✅ 3.411µs (SLO: <20.000µs 📉 -82.9%) vs baseline: 📈 +13.2% Memory: ✅ 31.516MB (SLO: <34.000MB -7.3%) vs baseline: +4.9% ✅ 1-distribution-metrics-100-timesTime: ✅ 200.353µs (SLO: <220.000µs -8.9%) vs baseline: +1.7% Memory: ✅ 31.516MB (SLO: <34.000MB -7.3%) vs baseline: +5.0% ✅ 1-gauge-metric-1-timesTime: ✅ 2.176µs (SLO: <20.000µs 📉 -89.1%) vs baseline: +2.2% Memory: ✅ 31.418MB (SLO: <34.000MB -7.6%) vs baseline: +4.8% ✅ 1-gauge-metrics-100-timesTime: ✅ 124.562µs (SLO: <150.000µs 📉 -17.0%) vs baseline: -0.3% Memory: ✅ 31.418MB (SLO: <34.000MB -7.6%) vs baseline: +5.0% ✅ 1-rate-metric-1-timesTime: ✅ 3.192µs (SLO: <20.000µs 📉 -84.0%) vs baseline: -1.2% Memory: ✅ 31.516MB (SLO: <34.000MB -7.3%) vs baseline: +5.1% ✅ 1-rate-metrics-100-timesTime: ✅ 220.974µs (SLO: <250.000µs 📉 -11.6%) vs baseline: +1.2% Memory: ✅ 31.457MB (SLO: <34.000MB -7.5%) vs baseline: +4.9% ✅ 100-count-metrics-100-timesTime: ✅ 21.876ms (SLO: <23.500ms -6.9%) vs baseline: -0.2% Memory: ✅ 31.398MB (SLO: <34.000MB -7.7%) vs baseline: +4.5% ✅ 100-distribution-metrics-100-timesTime: ✅ 2.069ms (SLO: <2.250ms -8.1%) vs baseline: +2.6% Memory: ✅ 31.457MB (SLO: <34.000MB -7.5%) vs baseline: +5.1% ✅ 100-gauge-metrics-100-timesTime: ✅ 1.284ms (SLO: <1.550ms 📉 -17.1%) vs baseline: -0.1% Memory: ✅ 31.457MB (SLO: <34.000MB -7.5%) vs baseline: +5.1% ✅ 100-rate-metrics-100-timesTime: ✅ 2.209ms (SLO: <2.550ms 📉 -13.4%) vs baseline: -0.4% Memory: ✅ 31.497MB (SLO: <34.000MB -7.4%) vs baseline: +5.0% ✅ flush-1-metricTime: ✅ 4.346µs (SLO: <20.000µs 📉 -78.3%) vs baseline: +1.9% Memory: ✅ 31.733MB (SLO: <34.000MB -6.7%) vs baseline: +4.5% ✅ flush-100-metricsTime: ✅ 183.673µs (SLO: <250.000µs 📉 -26.5%) vs baseline: ~same Memory: ✅ 31.752MB (SLO: <34.000MB -6.6%) vs baseline: +4.9% ✅ flush-1000-metricsTime: ✅ 2.235ms (SLO: <2.500ms 📉 -10.6%) vs baseline: +0.3% Memory: ✅ 32.696MB (SLO: <34.500MB -5.2%) vs baseline: +5.2% 🟡 Near SLO Breach (5 suites)🟡 djangosimple - 26/26✅ appsecTime: ✅ 20.478ms (SLO: <22.300ms -8.2%) vs baseline: -0.4% Memory: ✅ 64.487MB (SLO: <66.000MB -2.3%) vs baseline: +4.8% ✅ exception-replay-enabledTime: ✅ 1.349ms (SLO: <1.450ms -6.9%) vs baseline: ~same Memory: ✅ 63.493MB (SLO: <66.000MB -3.8%) vs baseline: +4.8% ✅ iastTime: ✅ 20.503ms (SLO: <22.250ms -7.9%) vs baseline: ~same Memory: ✅ 64.460MB (SLO: <66.000MB -2.3%) vs baseline: +4.7% ✅ profilerTime: ✅ 15.211ms (SLO: <16.550ms -8.1%) vs baseline: +0.3% Memory: ✅ 52.927MB (SLO: <53.500MB 🟡 -1.1%) vs baseline: +4.9% ✅ span-code-originTime: ✅ 26.084ms (SLO: <28.200ms -7.5%) vs baseline: -0.2% Memory: ✅ 66.819MB (SLO: <68.500MB -2.5%) vs baseline: +4.8% ✅ tracerTime: ✅ 20.516ms (SLO: <21.750ms -5.7%) vs baseline: ~same Memory: ✅ 64.546MB (SLO: <66.000MB -2.2%) vs baseline: +4.9% ✅ tracer-and-profilerTime: ✅ 22.078ms (SLO: <23.500ms -6.1%) vs baseline: -0.4% Memory: ✅ 65.700MB (SLO: <67.000MB 🟡 -1.9%) vs baseline: +4.8% ✅ tracer-dont-create-db-spansTime: ✅ 19.314ms (SLO: <21.500ms 📉 -10.2%) vs baseline: -0.4% Memory: ✅ 64.546MB (SLO: <66.000MB -2.2%) vs baseline: +4.9% ✅ tracer-nativeTime: ✅ 20.481ms (SLO: <21.750ms -5.8%) vs baseline: -0.4% Memory: ✅ 65.746MB (SLO: <66.000MB 🟡 -0.4%) vs baseline: +4.8% ✅ tracer-no-cachesTime: ✅ 18.426ms (SLO: <19.650ms -6.2%) vs baseline: -0.2% Memory: ✅ 64.487MB (SLO: <66.000MB -2.3%) vs baseline: +4.9% ✅ tracer-no-databasesTime: ✅ 18.767ms (SLO: <20.100ms -6.6%) vs baseline: ~same Memory: ✅ 64.507MB (SLO: <66.000MB -2.3%) vs baseline: +4.8% ✅ tracer-no-middlewareTime: ✅ 20.213ms (SLO: <21.500ms -6.0%) vs baseline: +0.1% Memory: ✅ 64.507MB (SLO: <66.000MB -2.3%) vs baseline: +4.8% ✅ tracer-no-templatesTime: ✅ 20.338ms (SLO: <22.000ms -7.6%) vs baseline: ~same Memory: ✅ 64.527MB (SLO: <66.000MB -2.2%) vs baseline: +4.9% 🟡 errortrackingdjangosimple - 6/6✅ errortracking-enabled-allTime: ✅ 18.010ms (SLO: <19.850ms -9.3%) vs baseline: -0.5% Memory: ✅ 64.487MB (SLO: <65.500MB 🟡 -1.5%) vs baseline: +4.9% ✅ errortracking-enabled-userTime: ✅ 18.037ms (SLO: <19.400ms -7.0%) vs baseline: -0.5% Memory: ✅ 64.507MB (SLO: <65.500MB 🟡 -1.5%) vs baseline: +4.9% ✅ tracer-enabledTime: ✅ 18.083ms (SLO: <19.450ms -7.0%) vs baseline: -0.1% Memory: ✅ 64.482MB (SLO: <65.500MB 🟡 -1.6%) vs baseline: +4.9% 🟡 flasksimple - 17/17✅ appsec-getTime: ✅ 4.579ms (SLO: <4.750ms -3.6%) vs baseline: -0.6% Memory: ✅ 62.266MB (SLO: <64.500MB -3.5%) vs baseline: +4.8% ✅ appsec-postTime: ✅ 6.581ms (SLO: <6.750ms -2.5%) vs baseline: ~same Memory: ✅ 62.384MB (SLO: <64.500MB -3.3%) vs baseline: +4.8% ✅ appsec-telemetryTime: ✅ 4.581ms (SLO: <4.750ms -3.6%) vs baseline: -0.4% Memory: ✅ 62.344MB (SLO: <64.500MB -3.3%) vs baseline: +4.8% ✅ debuggerTime: ✅ 1.858ms (SLO: <2.000ms -7.1%) vs baseline: -0.2% Memory: ✅ 44.807MB (SLO: <45.000MB 🟡 -0.4%) vs baseline: +4.9% ✅ iast-getTime: ✅ 1.853ms (SLO: <2.000ms -7.4%) vs baseline: -0.5% Memory: ✅ 41.760MB (SLO: <49.000MB 📉 -14.8%) vs baseline: +5.0% ✅ profilerTime: ✅ 1.917ms (SLO: <2.100ms -8.7%) vs baseline: ~same Memory: ✅ 44.296MB (SLO: <46.500MB -4.7%) vs baseline: +4.5% ✅ tracerTime: ✅ 3.368ms (SLO: <3.650ms -7.7%) vs baseline: -0.3% Memory: ✅ 51.472MB (SLO: <53.500MB -3.8%) vs baseline: +4.7% ✅ tracer-nativeTime: ✅ 3.371ms (SLO: <3.650ms -7.7%) vs baseline: -0.5% Memory: ✅ 52.750MB (SLO: <53.500MB 🟡 -1.4%) vs baseline: +4.9% 🟡 flasksqli - 6/6✅ appsec-enabledTime: ✅ 3.953ms (SLO: <4.200ms -5.9%) vs baseline: -0.4% Memory: ✅ 62.627MB (SLO: <66.000MB -5.1%) vs baseline: +4.9% ✅ iast-enabledTime: ✅ 2.450ms (SLO: <2.800ms 📉 -12.5%) vs baseline: -0.2% Memory: ✅ 58.184MB (SLO: <59.000MB 🟡 -1.4%) vs baseline: +4.8% ✅ tracer-enabledTime: ✅ 2.077ms (SLO: <2.250ms -7.7%) vs baseline: -0.1% Memory: ✅ 51.347MB (SLO: <53.500MB -4.0%) vs baseline: +5.0% 🟡 otelspan - 22/22✅ add-eventTime: ✅ 45.146ms (SLO: <47.150ms -4.2%) vs baseline: +0.4% Memory: ✅ 44.638MB (SLO: <46.500MB -4.0%) vs baseline: +4.9% ✅ add-metricsTime: ✅ 323.214ms (SLO: <344.800ms -6.3%) vs baseline: +0.7% Memory: ✅ 553.378MB (SLO: <562.000MB 🟡 -1.5%) vs baseline: +4.8% ✅ add-tagsTime: ✅ 293.165ms (SLO: <314.000ms -6.6%) vs baseline: +0.3% Memory: ✅ 554.619MB (SLO: <563.500MB 🟡 -1.6%) vs baseline: +4.6% ✅ get-contextTime: ✅ 83.640ms (SLO: <92.350ms -9.4%) vs baseline: +1.1% Memory: ✅ 39.884MB (SLO: <46.500MB 📉 -14.2%) vs baseline: +5.0% ✅ is-recordingTime: ✅ 42.888ms (SLO: <44.500ms -3.6%) vs baseline: +0.3% Memory: ✅ 44.039MB (SLO: <46.500MB -5.3%) vs baseline: +4.9% ✅ record-exceptionTime: ✅ 61.653ms (SLO: <67.650ms -8.9%) vs baseline: +0.2% Memory: ✅ 40.121MB (SLO: <46.500MB 📉 -13.7%) vs baseline: +5.1% ✅ set-statusTime: ✅ 48.808ms (SLO: <50.400ms -3.2%) vs baseline: +0.2% Memory: ✅ 43.990MB (SLO: <46.500MB -5.4%) vs baseline: +4.7% ✅ startTime: ✅ 42.248ms (SLO: <43.450ms -2.8%) vs baseline: ~same Memory: ✅ 44.051MB (SLO: <46.500MB -5.3%) vs baseline: +4.8% ✅ start-finishTime: ✅ 82.863ms (SLO: <88.000ms -5.8%) vs baseline: ~same Memory: ✅ 33.895MB (SLO: <46.500MB 📉 -27.1%) vs baseline: +4.6% ✅ start-finish-telemetryTime: ✅ 84.550ms (SLO: <89.000ms -5.0%) vs baseline: ~same Memory: ✅ 34.013MB (SLO: <46.500MB 📉 -26.9%) vs baseline: +5.0% ✅ update-nameTime: ✅ 44.090ms (SLO: <45.150ms -2.3%) vs baseline: +0.2% Memory: ✅ 44.359MB (SLO: <46.500MB -4.6%) vs baseline: +4.9%
|
|
||
elif isinstance(content, list): | ||
for block in content: | ||
if _get_attr(block, "type", None) == "text": | ||
input_messages.append({"content": _get_attr(block, "text", ""), "role": role}) | ||
input_messages.append(Message(content=_get_attr(block, "text", ""), role=str(role))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a type hint here that _get_attr(block, "text", "")
is not necessarily a string. I tried coercing it with str()
but that makes the test_tools_sync_stream
fail because the expected content is a list. This makes me think I should update this line and the test to be consistent with the string format.
I am also confused though as to why this tool use is formatted as a text type in the anthropic stream to begin with though 🤔
The goal of this PR is to support better type checking for the
LLMObsSpanEvent
class to match what our backend expects for spans (here). I have tried to replace all of the instances where we are using regular dictionaries in favor of using theTypedDict
objects we have defined, resolving the static type hint errors that have popped up as a result. There is likely to be some places in the code that I have overlooked, but we can continue transitioning to the new format as time goes on.This PR should introduce no functional changes into the SDK.
Checklist
Reviewer Checklist