Reducing the number of telemetry logs to reduce the impact of telemetry throttling on performance of Assessment and Install Updates jobs #211
This change reduces the number of telemetry logs to lessen the impact of telemetry throttling on the performance of Assessment and Install Updates jobs. The logs are reduced in two ways:
(a) Removing logs that are not required for debugging.
(b) Currently, a separate log is written for each line processed while extracting dependencies from the command output. We don't need a log for every line, so logs for lines that are not required for debugging are removed, and the lines that are required are combined into a single log instead of one log per line. The one drawback of a single log is that it may exceed the per-log character limit and be truncated. As per the analysis below, this should happen only when there are more than 20 dependent packages, which is a very rare case.
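The idea in (b) can be pictured with a minimal sketch. This is not the code in this PR: the helper name `log_dependency_extraction`, the `is_needed_for_debugging` predicate, and the `log` callback are hypothetical and only illustrate replacing per-line logs with one combined log.

```python
def log_dependency_extraction(lines, is_needed_for_debugging, log):
    """Emit one combined log for the lines worth keeping, instead of one log per line.

    All three parameters are hypothetical stand-ins:
      lines                   - per-line messages produced while extracting dependencies
      is_needed_for_debugging - predicate deciding whether a line is kept
      log                     - callable that writes a single telemetry log
    Returns the number of lines that were kept.
    """
    kept = [line.strip() for line in lines if is_needed_for_debugging(line)]
    if kept:
        # One telemetry event for the whole extraction, not one per line.
        log("\n".join(kept))
    return len(kept)
```

With this shape, the number of telemetry events per dependency extraction drops from one per line to at most one, at the cost of a longer message that may hit the per-log character limit.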
The limit of characters per log is TELEMETRY_MSG_SIZE_LIMIT_IN_CHARS - (TELEMETRY_BUFFER_FOR_DROPPED_COUNT_MSG_IN_CHARS + TELEMETRY_EVENT_COUNTER_MSG_SIZE_LIMIT_IN_CHARS) = 3072 - (25 + 15) = 3072 - 40 = 3032 characters
The output of the command to get dependencies produces two kinds of logs:
(i) LogType1: Logs with "Inapplicable line: " - Each takes around 100 characters, depending on the package name and version.
(ii) LogType2: Logs with "Dependency detected: " - Each takes around 50 characters, depending on the package name; the package version is not included in this line.
For each package in the output, there is one LogType1 and one LogType2 log, i.e. around 100 + 50 = 150 characters per package.
So, the number of packages whose details fit in a single log = (Character limit per log)/(Number of characters per package) = 3032/150 ≈ 20 packages
So, the log will be truncated only if there are more than 20 dependent packages, which is a very rare scenario.
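The estimate above can be reproduced with a short sketch. The three `TELEMETRY_*` names and their values are taken from the description above. The `APPROX_*` constants and both helper functions are hypothetical, and the per-log sizes are the rough figures quoted above, not measured values.

```python
# Values as stated in the description above.
TELEMETRY_MSG_SIZE_LIMIT_IN_CHARS = 3072
TELEMETRY_BUFFER_FOR_DROPPED_COUNT_MSG_IN_CHARS = 25
TELEMETRY_EVENT_COUNTER_MSG_SIZE_LIMIT_IN_CHARS = 15

# Rough per-package sizes (assumptions; they vary with package name and version).
APPROX_CHARS_LOG_TYPE_1 = 100  # "Inapplicable line: ..."
APPROX_CHARS_LOG_TYPE_2 = 50   # "Dependency detected: ..."


def usable_chars_per_log():
    """Characters left for the message after reserving the two buffers."""
    reserved = (TELEMETRY_BUFFER_FOR_DROPPED_COUNT_MSG_IN_CHARS
                + TELEMETRY_EVENT_COUNTER_MSG_SIZE_LIMIT_IN_CHARS)
    return TELEMETRY_MSG_SIZE_LIMIT_IN_CHARS - reserved


def max_packages_per_log(chars_per_package=APPROX_CHARS_LOG_TYPE_1 + APPROX_CHARS_LOG_TYPE_2):
    """Whole number of packages whose details fit in one log without truncation."""
    return usable_chars_per_log() // chars_per_package


print(usable_chars_per_log())   # 3032
print(max_packages_per_log())   # 20
```

Because the per-package figure is approximate, the threshold of 20 is an estimate; packages with unusually long names or versions would lower it.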
Testing completed:
Tested on two VMs with RHEL 7.8 Gen2 and 208 updates to install. On one VM the Install Updates job was run without the changes, and on the other it was run with the changes. The job took 139 minutes to complete without the changes and 87 minutes with them, an improvement of around (139-87)/139 = 37.41% in performance.
I also reviewed the resulting logs and confirmed they are still good for debugging.
In this first iteration, the change is made only for the YUM package manager. Once it has been reviewed, the same changes will be made for Apt and Zypper.