
Commit 1f507e8

Adham Faris authored and kuba-moo committed
net/mlx5: Expose NIC temperature via hardware monitoring kernel API
Expose NIC temperature by implementing the hwmon kernel API, which makes the current thermal zone kernel API redundant.

For each of the supported and exposed thermal diode sensors, expose the following attributes:
1) Input temperature.
2) Highest temperature.
3) Temperature label: depends on the firmware capability. If the firmware doesn't support sensor naming, the fallback naming convention is "sensorX", where X is the HW spec (MTMP register) sensor index.
4) Temperature critical max value: refers to the high threshold of the Warning Event. It is exposed as the `tempY_crit` hwmon attribute (RO attribute). For example, for ConnectX5 HCAs this temperature value will be 105 Celsius, 10 degrees lower than the HW shutdown temperature.
5) Temperature reset history: resets the highest temperature.

For example, a dual-port ConnectX5 NIC with a single IC thermal diode sensor will have 2 hwmon directories (one for each PCI function) under "/sys/class/hwmon/hwmon[X,Y]".

Listing one of the directories above (hwmonX/Y) generates the corresponding output below:

$ grep -H -d skip . /sys/class/hwmon/hwmon0/*

Output
=======================================================================
/sys/class/hwmon/hwmon0/name:mlx5
/sys/class/hwmon/hwmon0/temp1_crit:105000
/sys/class/hwmon/hwmon0/temp1_highest:48000
/sys/class/hwmon/hwmon0/temp1_input:46000
/sys/class/hwmon/hwmon0/temp1_label:asic
grep: /sys/class/hwmon/hwmon0/temp1_reset_history: Permission denied

In addition, displaying the sensors data via lm_sensors generates the corresponding output below:

$ sensors

Output
=======================================================================
mlx5-pci-0800
Adapter: PCI adapter
asic: +46.0°C (crit = +105.0°C, highest = +48.0°C)

mlx5-pci-0801
Adapter: PCI adapter
asic: +46.0°C (crit = +105.0°C, highest = +48.0°C)

CC: Jean Delvare <[email protected]>
Signed-off-by: Adham Faris <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Reviewed-by: Gal Pressman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Acked-by: Guenter Roeck <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
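The attributes listed in the commit message can also be read programmatically from user space. The following is a minimal sketch, not part of the commit: the helper names `read_hwmon_temps` and `find_mlx5_hwmon` are hypothetical, and it assumes the standard hwmon sysfs convention that temperatures are reported in millidegrees Celsius (consistent with `temp1_input:46000` shown as +46.0°C by `sensors`).

```python
from pathlib import Path


def read_hwmon_temps(hwmon_dir):
    """Return {"temp1": {"input": ..., "highest": ..., "crit": ..., "label": ...}, ...}.

    Values are in degrees Celsius; an attribute that is missing or
    unreadable is reported as None.
    """
    hwmon_dir = Path(hwmon_dir)
    sensors = {}
    for input_file in sorted(hwmon_dir.glob("temp*_input")):
        prefix = input_file.name[: -len("_input")]  # e.g. "temp1"
        entry = {}
        for attr in ("input", "highest", "crit"):
            try:
                raw = (hwmon_dir / f"{prefix}_{attr}").read_text().strip()
                entry[attr] = int(raw) / 1000.0  # millidegrees -> degrees
            except (OSError, ValueError):
                entry[attr] = None
        try:
            entry["label"] = (hwmon_dir / f"{prefix}_label").read_text().strip()
        except OSError:
            entry["label"] = prefix  # assumption: fall back to the attribute prefix
        sensors[prefix] = entry
    return sensors


def find_mlx5_hwmon(root="/sys/class/hwmon"):
    """Yield hwmon directories whose 'name' attribute is 'mlx5'."""
    for candidate in sorted(Path(root).glob("hwmon*")):
        try:
            if (candidate / "name").read_text().strip() == "mlx5":
                yield candidate
        except OSError:
            continue


if __name__ == "__main__":
    for directory in find_mlx5_hwmon():
        for prefix, values in read_hwmon_temps(directory).items():
            print(directory.name, prefix, values)
```

The sketch never touches `temp1_reset_history`: as the `grep` output above shows, reading that attribute fails with "Permission denied".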
1 parent 383a4de commit 1f507e8

9 files changed, 463 additions and 141 deletions

drivers/net/ethernet/mellanox/mlx5/core/Kconfig

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@ config MLX5_CORE
 	depends on MLXFW || !MLXFW
 	depends on PTP_1588_CLOCK_OPTIONAL
 	depends on PCI_HYPERV_INTERFACE || !PCI_HYPERV_INTERFACE
+	depends on HWMON || !HWMON
 	help
 	  Core driver for low level functionality of the ConnectX-4 and
 	  Connect-IB cards by Mellanox Technologies.
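The added `depends on HWMON || !HWMON` line is easier to follow with the tristate arithmetic written out. The sketch below is illustrative only and not part of the commit. It assumes the evaluation rules from the Kconfig language documentation (n = 0, m = 1, y = 2; `!A` is `2 - A`; `||` is max), and the function names are hypothetical.

```python
# Kconfig tristate values
N, M, Y = 0, 1, 2


def k_not(a):
    """Kconfig '!': 2 - a."""
    return 2 - a


def k_or(a, b):
    """Kconfig '||': max(a, b)."""
    return max(a, b)


def mlx5_core_upper_limit(hwmon):
    """Upper limit imposed on MLX5_CORE by 'depends on HWMON || !HWMON'."""
    return k_or(hwmon, k_not(hwmon))


if __name__ == "__main__":
    names = {N: "n", M: "m", Y: "y"}
    for hwmon in (N, M, Y):
        limit = mlx5_core_upper_limit(hwmon)
        print(f"HWMON={names[hwmon]} -> MLX5_CORE at most {names[limit]}")
```

Under these rules the expression only restricts MLX5_CORE when HWMON is built as a module: MLX5_CORE is then limited to `m`, so a built-in driver never references hwmon symbols that live in a module.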

drivers/net/ethernet/mellanox/mlx5/core/Makefile

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@ endif
 mlx5_core-$(CONFIG_MLX5_BRIDGE) += esw/bridge.o esw/bridge_mcast.o esw/bridge_debugfs.o \
 				   en/rep/bridge.o
 
-mlx5_core-$(CONFIG_THERMAL) += thermal.o
+mlx5_core-$(CONFIG_HWMON) += hwmon.o
 mlx5_core-$(CONFIG_MLX5_MPFS) += lib/mpfs.o
 mlx5_core-$(CONFIG_VXLAN) += lib/vxlan.o
 mlx5_core-$(CONFIG_PTP_1588_CLOCK) += lib/clock.o
