feat: Pretty print JSON for python. Fix missing regular expressions for masking out ephemeral information #215

Merged
merged 5 commits into from
Mar 22, 2022
Changes from 4 commits
2 changes: 1 addition & 1 deletion .github/workflows/build.yml
@@ -98,7 +98,7 @@ jobs:
pip install .[dev]

- name: Install Serverless Framework
- run: sudo yarn global add serverless@^2.72.2 --prefix /usr/local
+ run: sudo yarn global add serverless@^3.7.0 --prefix /usr/local
- name: Install Crossbuild Deps
run: |
sudo apt-get update --allow-releaseinfo-change --fix-missing
22 changes: 20 additions & 2 deletions scripts/run_integration_tests.sh
@@ -175,14 +175,15 @@ for handler_name in "${LAMBDA_HANDLERS[@]}"; do
# Replace invocation-specific data like timestamps and IDs with XXXX to normalize logs across executions
logs=$(
echo "$raw_logs" |
node parse-json.js |
Contributor

🙏

# Filter serverless cli errors
sed '/Serverless: Recoverable error occurred/d' |
# Remove RequestsDependencyWarning from botocore/vendored/requests/__init__.py
sed '/RequestsDependencyWarning/d' |
# Remove blank lines
sed '/^$/d' |
# Normalize Lambda runtime REPORT logs
- sed -E 's/(RequestId|TraceId|SegmentId|Duration|Memory Used|"e"): [a-z0-9\.\-]+/\1: XXXX/g' |
+ sed -E 's/(RequestId|TraceId|SegmentId|Duration|init|Memory Used|"e"): [a-z0-9\.\-]+/\1: XXXX/g' |
# Normalize HTTP headers
sed -E "s/(x-datadog-parent-id:|x-datadog-trace-id:|Content-Length:)[0-9]+/\1XXXX/g" |
# Remove Account ID
@@ -204,12 +205,29 @@ for handler_name in "${LAMBDA_HANDLERS[@]}"; do
sed -E "s/(\"span_id\"\: \")[A-Z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"parent_id\"\: \")[A-Z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"request_id\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"http.source_ip\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"http.user_agent\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"function_trigger.event_source_arn\"\: \")[A-Za-z0-9\/\.\:\-]+/\1XXXX/g" |
sed -E "s/(\"duration\"\: )[0-9\.\-]+/\1\"XXXX\"/g" |
sed -E "s/(\"start\"\: )[0-9\.\-]+/\1\"XXXX\"/g" |
sed -E "s/(\"system\.pid\"\: )[0-9\.\-]+/\1\"XXXX\"/g" |
sed -E "s/(\"runtime-id\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/([a-zA-Z0-9]+)(\.execute-api\.[a-z0-9\-]+\.amazonaws\.com)/XXXX\2/g" |
sed -E "s/(\"apiid\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"apiname\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"function_trigger.event_source_arn\"\: \")[a-z0-9\.\-\:]+/\1XXXX/g" |
sed -E "s/(\"event_id\"\: \")[a-zA-Z0-9\:\-]+/\1XXXX/g" |
sed -E "s/(\"message_id\"\: \")[a-zA-Z0-9\:\-]+/\1XXXX/g" |
sed -E "s/(\"request_id\"\:\ \")[a-zA-Z0-9\-\=]+/\1XXXX/g" |
sed -E "s/(\"connection_id\"\:\ \")[a-zA-Z0-9\-]+/\1XXXX/g" |
sed -E "s/(\"shardId\-)([0-9]+)\:([a-zA-Z0-9]+)[a-zA-Z0-9]/\1XXXX:XXXX/g" |
sed -E "s/(\"shardId\-)[0-9a-zA-Z]+/\1XXXX/g" |
sed -E "s/(\"datadog_lambda\"\: \")([0-9]+\.[0-9]+\.[0-9])/\1X.X.X/g" |
sed -E "s/(\"partition_key\"\:\ \")[a-zA-Z0-9\-]+/\1XXXX/g" |
sed -E "s/(\"object_etag\"\:\ \")[a-zA-Z0-9\-]+/\1XXXX/g" |
sed -E "s/(\"dd_trace\"\: \")([0-9]+\.[0-9]+\.[0-9])/\1X.X.X/g" |
# Parse out account ID in ARN
sed -E "s/([a-zA-Z0-9]+):([a-zA-Z0-9]+):([a-zA-Z0-9]+):([a-zA-Z0-9\-]+):([a-zA-Z0-9\-\:]+)/\1:\2:\3:\4:XXXX:\4/g" |
sed -E "/init complete at epoch/d" |
sed -E "/main started at epoch/d"
)
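
To see what a few of these substitutions do to a pretty-printed trace payload, here is a rough JavaScript equivalent of three of the expressions above (span_id, duration, start). It is only a sketch: the script itself uses sed, and the `maskEphemeral` helper and the sample span are made up for illustration. Only the three patterns are taken from the pipeline.

```javascript
'use strict'

// Hypothetical helper (not in the PR): JavaScript translations of three
// sed expressions from the pipeline above.
function maskEphemeral(text) {
  return text
    .replace(/("span_id": ")[A-Z0-9.\-]+/g, '$1XXXX')
    .replace(/("duration": )[0-9.\-]+/g, '$1"XXXX"')
    .replace(/("start": )[0-9.\-]+/g, '$1"XXXX"')
}

// Made-up span, pretty-printed the same way parse-json.js prints it.
const sample = JSON.stringify(
  { span_id: '7A1B2C3D', duration: 1234.5, start: 1647900000 },
  null,
  2
)

// Each of the three values is replaced by XXXX; the keys are left alone.
console.log(maskEphemeral(sample))
```

Note that the patterns expect a space after the colon, which is what JSON.stringify emits when it is given an indent argument. Compact JSON such as `{"span_id":"7A1B2C3D"}` would pass through these three patterns unmasked.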
@@ -240,7 +258,7 @@ set -e
if [ "$mismatch_found" = true ]; then
echo "FAILURE: A mismatch between new data and a snapshot was found and printed above."
echo "If the change is expected, generate new snapshots by running 'UPDATE_SNAPSHOTS=true DD_API_KEY=XXXX ./scripts/run_integration_tests.sh'"
- echo "Make sure https://httpstat.us/400/ is UP for `http_error` test case"
+ echo "Make sure https://httpstat.us/400/ is UP for 'http_error' test case"
exit 1
fi

17 changes: 17 additions & 0 deletions tests/integration/parse-json.js
@@ -0,0 +1,17 @@
'use strict'

var readline = require('readline');
var rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
terminal: false
});

rl.on('line', function(line){
try {
const obj = JSON.parse(line)
console.log(JSON.stringify(obj, null, 2))
} catch (e) {
console.log(line)
}
})
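
The per-line behaviour of this script can be checked in isolation. The following is a minimal sketch, not part of the PR: `prettyPrintLine` is a hypothetical helper that applies the same try/catch logic to a single string and returns the result instead of printing it. The two input lines are made up.

```javascript
'use strict'

// Hypothetical helper (not in the PR): same logic as the 'line' handler
// above, but returning the text instead of logging it.
function prettyPrintLine(line) {
  try {
    // Valid JSON is re-serialized with two-space indentation.
    return JSON.stringify(JSON.parse(line), null, 2)
  } catch (e) {
    // Anything that is not JSON passes through unchanged.
    return line
  }
}

// A JSON log line is expanded to one key per line.
console.log(prettyPrintLine('{"m":"metric","v":1}'))
// A plain-text log line is printed as-is.
console.log(prettyPrintLine('START RequestId: abc Version: $LATEST'))
```

With one key per line, a change in a single field shows up as a one-line difference in the snapshot rather than a change to one long line.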
1 change: 0 additions & 1 deletion tests/integration/serverless.yml
@@ -17,7 +17,6 @@ provider:
DD_TRACE_ENABLED: true
DD_API_KEY: ${env:DD_API_KEY}
DD_TRACE_MANAGED_SERVICES: true
- DD_CAPTURE_LAMBDA_PAYLOAD: true
Contributor

Why did you remove this?

Contributor Author

I inadvertently added it when I wrote the feature. It includes all headers and payload data, which are complex to mask out, so I've removed it.

I think my introduction of this is what led to the tests being in the state they're in now.

lambdaHashingVersion: 20201221
Contributor

Version 3 doesn't require us to use lambdaHashingVersion anymore. Should we remove it or update it? https://www.serverless.com/framework/docs/guides/upgrading-v3#lambda-hashing-algorithm

Contributor Author

Will remove, thanks!

timeout: 15
deploymentBucket:
1,397 changes: 1,334 additions & 63 deletions tests/integration/snapshots/logs/async-metrics_python36.log

Large diffs are not rendered by default.

1,397 changes: 1,334 additions & 63 deletions tests/integration/snapshots/logs/async-metrics_python37.log

Large diffs are not rendered by default.

1,397 changes: 1,334 additions & 63 deletions tests/integration/snapshots/logs/async-metrics_python38.log

Large diffs are not rendered by default.

1,397 changes: 1,334 additions & 63 deletions tests/integration/snapshots/logs/async-metrics_python39.log

Large diffs are not rendered by default.
