feat: Pretty print JSON for python. Fix missing regular expressions for masking out ephemeral information #215

Merged: 5 commits, Mar 22, 2022
2 changes: 1 addition & 1 deletion .github/workflows/build.yml
@@ -98,7 +98,7 @@ jobs:
pip install .[dev]

- name: Install Serverless Framework
- run: sudo yarn global add serverless@^2.72.2 --prefix /usr/local
+ run: sudo yarn global add serverless@^3.7.0 --prefix /usr/local
- name: Install Crossbuild Deps
run: |
sudo apt-get update --allow-releaseinfo-change --fix-missing
22 changes: 20 additions & 2 deletions scripts/run_integration_tests.sh
@@ -175,14 +175,15 @@ for handler_name in "${LAMBDA_HANDLERS[@]}"; do
# Replace invocation-specific data like timestamps and IDs with XXXX to normalize logs across executions
logs=$(
echo "$raw_logs" |
+ node parse-json.js |
Contributor
🙏

# Filter serverless cli errors
sed '/Serverless: Recoverable error occurred/d' |
# Remove RequestsDependencyWarning from botocore/vendored/requests/__init__.py
sed '/RequestsDependencyWarning/d' |
# Remove blank lines
sed '/^$/d' |
# Normalize Lambda runtime REPORT logs
- sed -E 's/(RequestId|TraceId|SegmentId|Duration|Memory Used|"e"): [a-z0-9\.\-]+/\1: XXXX/g' |
+ sed -E 's/(RequestId|TraceId|SegmentId|Duration|init|Memory Used|"e"): [a-z0-9\.\-]+/\1: XXXX/g' |
# Normalize HTTP headers
sed -E "s/(x-datadog-parent-id:|x-datadog-trace-id:|Content-Length:)[0-9]+/\1XXXX/g" |
# Remove Account ID
@@ -204,12 +205,29 @@ for handler_name in "${LAMBDA_HANDLERS[@]}"; do
sed -E "s/(\"span_id\"\: \")[A-Z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"parent_id\"\: \")[A-Z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"request_id\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"http.source_ip\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"http.user_agent\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"function_trigger.event_source_arn\"\: \")[A-Za-z0-9\/\.\:\-]+/\1XXXX/g" |
sed -E "s/(\"duration\"\: )[0-9\.\-]+/\1\"XXXX\"/g" |
sed -E "s/(\"start\"\: )[0-9\.\-]+/\1\"XXXX\"/g" |
sed -E "s/(\"system\.pid\"\: )[0-9\.\-]+/\1\"XXXX\"/g" |
sed -E "s/(\"runtime-id\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/([a-zA-Z0-9]+)(\.execute-api\.[a-z0-9\-]+\.amazonaws\.com)/XXXX\2/g" |
sed -E "s/(\"apiid\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"apiname\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
sed -E "s/(\"function_trigger.event_source_arn\"\: \")[a-z0-9\.\-\:]+/\1XXXX/g" |
sed -E "s/(\"event_id\"\: \")[a-zA-Z0-9\:\-]+/\1XXXX/g" |
sed -E "s/(\"message_id\"\: \")[a-zA-Z0-9\:\-]+/\1XXXX/g" |
sed -E "s/(\"request_id\"\:\ \")[a-zA-Z0-9\-\=]+/\1XXXX/g" |
sed -E "s/(\"connection_id\"\:\ \")[a-zA-Z0-9\-]+/\1XXXX/g" |
sed -E "s/(\"shardId\-)([0-9]+)\:([a-zA-Z0-9]+)[a-zA-Z0-9]/\1XXXX:XXXX/g" |
sed -E "s/(\"shardId\-)[0-9a-zA-Z]+/\1XXXX/g" |
sed -E "s/(\"datadog_lambda\"\: \")([0-9]+\.[0-9]+\.[0-9])/\1X.X.X/g" |
sed -E "s/(\"partition_key\"\:\ \")[a-zA-Z0-9\-]+/\1XXXX/g" |
sed -E "s/(\"object_etag\"\:\ \")[a-zA-Z0-9\-]+/\1XXXX/g" |
sed -E "s/(\"dd_trace\"\: \")([0-9]+\.[0-9]+\.[0-9])/\1X.X.X/g" |
# Parse out account ID in ARN
sed -E "s/([a-zA-Z0-9]+):([a-zA-Z0-9]+):([a-zA-Z0-9]+):([a-zA-Z0-9\-]+):([a-zA-Z0-9\-\:]+)/\1:\2:\3:\4:XXXX:\4/g" |
sed -E "/init complete at epoch/d" |
sed -E "/main started at epoch/d"
)
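To illustrate how the masking rules in this pipeline behave, here is a minimal sketch that runs four of them over a few made-up log lines. The sample values (request ID, duration, version string) are hypothetical and simplified; only the `sed` expressions are copied from the script. The JSON lines mimic the shape that two-space pretty-printing would produce.

```shell
#!/usr/bin/env bash
# Hypothetical, simplified log lines: one Lambda REPORT line and three
# pretty-printed JSON fields. All values are invented for illustration.
sample='REPORT RequestId: 8c1e1f0a-1b2c Duration: 12.34 ms Memory Used: 45 MB
  "request_id": "abc123-def456",
  "duration": 1234567,
  "dd_trace": "0.59.1",'

# The four sed expressions below are copied from run_integration_tests.sh.
masked=$(
    printf '%s\n' "$sample" |
        sed -E 's/(RequestId|TraceId|SegmentId|Duration|init|Memory Used|"e"): [a-z0-9\.\-]+/\1: XXXX/g' |
        sed -E "s/(\"request_id\"\: \")[a-z0-9\.\-]+/\1XXXX/g" |
        sed -E "s/(\"duration\"\: )[0-9\.\-]+/\1\"XXXX\"/g" |
        sed -E "s/(\"dd_trace\"\: \")([0-9]+\.[0-9]+\.[0-9])/\1X.X.X/g"
)

# Print the normalized lines; every ephemeral value is now XXXX or X.X.X.
printf '%s\n' "$masked"
```

One thing this sketch makes visible: the final `[0-9]` in the `dd_trace` rule matches a single digit, so a version with a multi-digit patch number would leave its trailing digits unmasked.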
@@ -240,7 +258,7 @@ set -e
if [ "$mismatch_found" = true ]; then
echo "FAILURE: A mismatch between new data and a snapshot was found and printed above."
echo "If the change is expected, generate new snapshots by running 'UPDATE_SNAPSHOTS=true DD_API_KEY=XXXX ./scripts/run_integration_tests.sh'"
- echo "Make sure https://httpstat.us/400/ is UP for `http_error` test case"
+ echo "Make sure https://httpstat.us/400/ is UP for 'http_error' test case"
exit 1
fi
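The quoting change in the last `echo` matters because, inside double quotes, the shell treats backticks as command substitution. A minimal sketch of the difference follows. The `http_error` function is a hypothetical stand-in, defined here only so the substitution produces visible output; the real script is not assumed to define any such command.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a command that happens to share the quoted name.
http_error() { printf 'SUBSTITUTED'; }

# Backticks inside double quotes run the enclosed command and splice in its output.
with_backticks="UP for `http_error` test case"

# Single quotes inside double quotes are ordinary characters.
with_quotes="UP for 'http_error' test case"

printf '%s\n' "$with_backticks" "$with_quotes"
```

When no command of that name exists, the backtick form would instead emit a "command not found" error and substitute an empty string, which is presumably why the message was switched to single quotes.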

17 changes: 17 additions & 0 deletions tests/integration/parse-json.js
@@ -0,0 +1,17 @@
'use strict'

var readline = require('readline');
var rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
  terminal: false
});

rl.on('line', function(line){
  try {
    const obj = JSON.parse(line)
    console.log(JSON.stringify(obj, null, 2))
  } catch (e) {
    console.log(line)
  }
})
8 changes: 0 additions & 8 deletions tests/integration/serverless.yml
@@ -1,12 +1,6 @@
# IAM permissions require service name to begin with 'integration-tests'
service: integration-tests-python

- # As of mid-August 2021, the Serverless framework does not support Python 3.9
- # and complains about an invalid configuration when we deploy Python 3.9 functions.
- # This option suppresses the warning.
- # Remove this when the Serverless framework supports Python 3.9.
- configValidationMode: off

provider:
name: aws
region: sa-east-1
@@ -17,8 +11,6 @@ provider:
DD_TRACE_ENABLED: true
DD_API_KEY: ${env:DD_API_KEY}
DD_TRACE_MANAGED_SERVICES: true
- DD_CAPTURE_LAMBDA_PAYLOAD: true
Contributor
Why did you remove this?

Contributor Author
I inadvertently added it when I wrote the feature, but it captures all headers and payload data, which is complex to mask out, so I have removed it.

I think my introduction of this is what led to the tests being in the state they're in now.

lambdaHashingVersion: 20201221
timeout: 15
deploymentBucket:
name: integration-tests-deployment-bucket
1,397 changes: 1,334 additions & 63 deletions tests/integration/snapshots/logs/async-metrics_python36.log

Large diffs are not rendered by default.

1,397 changes: 1,334 additions & 63 deletions tests/integration/snapshots/logs/async-metrics_python37.log

Large diffs are not rendered by default.

1,397 changes: 1,334 additions & 63 deletions tests/integration/snapshots/logs/async-metrics_python38.log

Large diffs are not rendered by default.

1,397 changes: 1,334 additions & 63 deletions tests/integration/snapshots/logs/async-metrics_python39.log

Large diffs are not rendered by default.
