src/content/docs/synthetics/synthetic-monitoring/private-locations/job-manager-configuration.mdx
Lines changed: 59 additions & 52 deletions (+59 -52)
@@ -1917,7 +1917,10 @@ FROM SyntheticCheck SELECT rate(uniqueCount(id), 1 minute) AS 'job rate per minu
 This query calculates the average per-minute growth of the `jobManagerHeavyweightJobs` queue on a time series chart. A line above zero indicates the queue is growing, while a line below zero means it's shrinking.
 
 ```nrql
-FROM SyntheticsPrivateLocationStatus SELECT derivative(jobManagerHeavyweightJobs, 1 minute) AS 'queue growth rate per minute' WHERE name = 'YOUR_PRIVATE_LOCATION' TIMESERIES SINCE 1 day ago
+FROM SyntheticsPrivateLocationStatus
+SELECT derivative(jobManagerHeavyweightJobs, 1 minute) AS 'queue growth rate per minute'
+WHERE name = 'YOUR_PRIVATE_LOCATION'
+TIMESERIES SINCE 1 day ago
 ```
 
 <Callout variant="tip">
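As a rough illustration of what the growth-rate query in this hunk reports, the sketch below estimates a per-minute slope from a handful of queue-size samples using an ordinary least-squares fit. The sample values and the function name `queue_growth_per_minute` are hypothetical, and the fit only approximates what `derivative()` returns; it is not New Relic's implementation.

```python
from typing import Sequence, Tuple


def queue_growth_per_minute(samples: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope of queue size against elapsed time in minutes.

    Each sample is (minutes_elapsed, queue_size). A positive result means
    the queue is growing; a negative result means it is shrinking.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_q = sum(q for _, q in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (q - mean_q) for t, q in samples)
    return cov / var_t


# Hypothetical samples: (minutes elapsed, jobManagerHeavyweightJobs)
growing = [(0, 10), (1, 12), (2, 14), (3, 16)]
shrinking = [(0, 16), (1, 13), (2, 10), (3, 7)]

print(queue_growth_per_minute(growing))    # 2.0, queue is growing
print(queue_growth_per_minute(shrinking))  # -3.0, queue is shrinking
```

A value that stays near zero over the chosen window suggests the queue is stable.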
@@ -1928,16 +1931,20 @@ FROM SyntheticsPrivateLocationStatus SELECT derivative(jobManagerHeavyweightJobs
 This query finds the unique count of heavyweight monitors.
 
 ```nrql
-
-FROM SyntheticCheck SELECT uniqueCount(monitorId) AS 'monitor count' WHERE location = 'YOUR_PRIVATE_LOCATION' AND type != 'SIMPLE' SINCE 1 day ago
-
+FROM SyntheticCheck
+SELECT uniqueCount(monitorId) AS 'monitor count'
+WHERE location = 'YOUR_PRIVATE_LOCATION' AND type != 'SIMPLE'
+SINCE 1 day ago
 ```
 
 **4. Find average job duration in minutes ($D_{avg,m}$):**
 This query finds the average execution duration of completed non-ping jobs and converts the result from milliseconds to minutes. `executionDuration` represents the time the job took to execute on the host.
 
 ```nrql
-FROM SyntheticCheck SELECT average(executionDuration)/60e3 AS 'avg job duration (m)' WHERE location = 'YOUR_PRIVATE_LOCATION' AND type != 'SIMPLE' SINCE 1 day ago
+FROM SyntheticCheck
+SELECT average(executionDuration)/60e3 AS 'avg job duration (m)'
+WHERE location = 'YOUR_PRIVATE_LOCATION' AND type != 'SIMPLE'
+SINCE 1 day ago
 ```
 
 **5. Find average heavyweight monitor period ($P_{avg,m}$):**
@@ -1972,15 +1979,15 @@ Your goal is to configure enough parallelism to handle your job load without exc
 
 First, determine your private location's average job execution duration and job rate. Use `executionDuration` as it most accurately reflects the pod's active runtime.
 
-```
+```nrql
 -- Get average job execution duration (in seconds)
 FROM SyntheticCheck
 SELECT average(executionDuration / 1e3) AS 'D_avg_s'
 WHERE type != 'SIMPLE' AND location = 'YOUR_PRIVATE_LOCATION'
 FACET typeLabel SINCE 1 hour ago
 ```
 
-```
+```nrql
 -- Get jobs per 5 minutes
 FROM SyntheticCheck
 SELECT rate(uniqueCount(id), 5 minutes) AS 'N_m'
@@ -2018,7 +2025,7 @@ A key consideration is that a **single SJM instance has a maximum throughput of
 
 First, get your average job execution duration in **minutes**:
 
-```
+```nrql
 -- Get average job execution duration (in minutes)
 FROM SyntheticCheck
 SELECT average(executionDuration / 60e3) AS 'D_avg_m'
@@ -2059,7 +2066,7 @@ Compare your **required** parallelism ($P_{req}$) from Step 1 to the **maximum**
 * **Action:**
 
 1. You must **scale out by deploying multiple, separate SJM Helm releases**.
-2. See the **"Scaling Out with Multiple SJM Deployments"** section below for the correct procedure.
+2. See the **[Scaling Out with Multiple SJM Deployments](#scaling-out-with-multiple-sjm-deployments)** section below for the correct procedure.
 3. **Do not** increase the `replicaCount` in your Helm chart.
 
 ##### Step 4: Monitor Your Queue
@@ -2068,7 +2075,7 @@ After applying your changes, you must verify that your job queue is stable and n
 
 Run this query to check the queue's growth rate:
 
-```
+```nrql
 -- Check for queue growth (a positive value means the queue is growing)
 SELECT derivative(checksPending, 1 minute) AS 'Queue Growth Rate (per min)'
 FROM SyntheticsPrivateLocationStatus
@@ -2083,45 +2090,45 @@ If the "Queue Growth Rate" is consistently positive, you need to install more SJ
 The `parallelism` setting directly affects how many synthetics jobs per minute can be run. Too small a value and the queue may grow. Too large a value and nodes may become resource constrained.
 
 <table>
-<thead>
-<tr>
-<th style={{ width: "300px" }}>
-Example
-</th>
-<th>
-Description
-</th>
-</tr>
-</thead>
-<tbody>
-<tr>
-<td>
-`parallelism=1`
-`completions=1`
-</td>
-<td>
-The runtime will execute 1 synthetics job per minute. After 1 job completes, the `CronJob` configuration will start a new job at the next minute. <DNT>**Throughput will be extremely limited with this configuration.**</DNT>
-</td>
-</tr>
-<tr>
-<td>
-`parallelism=1`
-`completions=6`
-</td>
-<td>
-The runtime will execute 1 synthetics job at a time. After the job completes, a new job will start immediately. After 6 jobs complete, the `CronJob` configuration will start a new Kubernetes Job. <DNT>**Throughput will be limited.**</DNT> A single long-running synthetics job will block the processing of any other synthetics jobs of this type.
-</td>
-</tr>
-<tr>
-<td>
-`parallelism=3`
-`completions=24`
-</td>
-<td>
-The runtime will execute 3 synthetics jobs at once. After any of these jobs complete, a new job will start immediately. After 24 jobs complete, the `CronJob` configuration will start a new Kubernetes Job. <DNT>**Throughput is much better with this or similar configurations.**</DNT>
-</td>
-</tr>
-</tbody>
+<thead>
+<tr>
+<th style={{ width: "300px" }}>
+Example
+</th>
+<th>
+Description
+</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>
+`parallelism=1`
+`completions=1`
+</td>
+<td>
+The runtime will execute 1 synthetics job per minute. After 1 job completes, the `CronJob` configuration will start a new job at the next minute. <DNT>**Throughput will be extremely limited with this configuration.**</DNT>
+</td>
+</tr>
+<tr>
+<td>
+`parallelism=1`
+`completions=6`
+</td>
+<td>
+The runtime will execute 1 synthetics job at a time. After the job completes, a new job will start immediately. After 6 jobs complete, the `CronJob` configuration will start a new Kubernetes Job. <DNT>**Throughput will be limited.**</DNT> A single long-running synthetics job will block the processing of any other synthetics jobs of this type.
+</td>
+</tr>
+<tr>
+<td>
+`parallelism=3`
+`completions=24`
+</td>
+<td>
+The runtime will execute 3 synthetics jobs at once. After any of these jobs complete, a new job will start immediately. After 24 jobs complete, the `CronJob` configuration will start a new Kubernetes Job. <DNT>**Throughput is much better with this or similar configurations.**</DNT>
+</td>
+</tr>
+</tbody>
 </table>
 
 If your `parallelism` setting is working well (keeping the queue at zero), setting a higher `completions` value (e.g., 6-10x `parallelism`) can improve efficiency by:
@@ -2131,7 +2138,7 @@ If your `parallelism` setting is working well (keeping the queue at zero), setti
 
 It's important to note that the `completions` value should not be too large or the CronJob will experience warning events like the following:
 
-```
+```sh
 8m40s Warning TooManyMissedTimes cronjob/synthetics-node-browser-runtime too many missed start times: 101. Set or decrease .spec.startingDeadlineSeconds or check clock skew
 ```
 
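To get a feel for when this warning might appear, the sketch below estimates how long one Kubernetes Job runs for a given `completions`/`parallelism` pair and how many one-minute start times would pass in that time. It rests on several assumptions not stated in this hunk: the CronJob fires every minute, a running Job keeps the next one from starting, synthetics jobs pack evenly across the parallel slots, and the warning appears once more than 100 start times have been missed (the event above reports 101). All function and parameter names are hypothetical.

```python
import math

# Assumption: the warning is emitted after more than 100 missed start times.
MISSED_START_LIMIT = 100


def estimated_job_runtime_minutes(completions: int, parallelism: int,
                                  avg_duration_min: float) -> float:
    """Rough runtime of one Kubernetes Job, assuming even packing of slots."""
    if completions < 1 or parallelism < 1:
        raise ValueError("completions and parallelism must be >= 1")
    waves = math.ceil(completions / parallelism)
    return waves * avg_duration_min


def missed_start_times(runtime_min: float,
                       schedule_interval_min: float = 1.0) -> int:
    """Scheduled start times that elapse while the Job is still running."""
    return math.floor(runtime_min / schedule_interval_min)


def risks_missed_start_warning(completions: int, parallelism: int,
                               avg_duration_min: float) -> bool:
    """True if the estimate exceeds the assumed missed-start limit."""
    runtime = estimated_job_runtime_minutes(
        completions, parallelism, avg_duration_min)
    return missed_start_times(runtime) > MISSED_START_LIMIT


# Hypothetical inputs: 30-second average job duration.
print(risks_missed_start_warning(24, 3, 0.5))   # False: about 4 minutes per Job
print(risks_missed_start_warning(900, 3, 0.5))  # True: about 150 minutes per Job
```

Under these assumptions, keeping `completions` within a small multiple of `parallelism` keeps the estimated Job runtime far below the point where missed start times accumulate.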
@@ -2166,7 +2173,7 @@ Setting the `fullnameOverride` is highly recommended to create shorter, more man
 
 For example, to install two SJMs named `sjm-alpha` and `sjm-beta` into the `newrelic` namespace (both using the same `values.yaml` with your fixed parallelism):