diff --git a/README.md b/README.md
index c5242b2..22deb92 100644
--- a/README.md
+++ b/README.md
@@ -1,39 +1,39 @@
# tuned-amdgpu
-Hacky solution to integrate AMDGPU power control and overclocking in `tuned` with Ansible
+Hacky solution to integrate AMDGPU power profile control in `tuned` with Ansible
Takes a list of existing `tuned` profiles and creates new ones based on them. These new profiles include AMDGPU power/clock parameters
An attempt is made to discover the active GPU using the 'connected' state in the `DRM` subsystem, example:
-
-```bash
-~ $ grep -ls ^connected /sys/class/drm/*/status | grep -o card[0-9] | sort | uniq | sort -h | tail -1
+```
+$ grep -ls ^connected /sys/class/drm/*/status | grep -o card[0-9] | sort | uniq | sort -h | tail -1
card1
```
-_Warning_: This is only tested with `RX6000` series GPUs, it is probable that other generations will *not* work properly. Use at your own risk!
+
+_Warning_: This is only tested with `RX6000` series GPUs; older AMD GPUs will probably not work properly. Use at your own risk!
## Profiles
-Two _'profiles'_ are in each name:
-
-- before `amdgpu` is the source profile provided with `tuned`
-- after `amdgpu` tells the GPU clock profile offered, outlined below
+Examples of the output/provided profiles follow:
| Output profile | Description |
|:---|---|
| `balanced-amdgpu-default` | Includes the (assumed) existing `balanced` tuned profile.
Only adjusts the GPU power limit (typically lower). Clocks/voltage curve remain the default. |
-| `desktop-amdgpu-overclock` | Includes the (assumed) existing `desktop` tuned profile.
Adjusts the GPU power limit, clocks, _and_ the voltage curve. |
-| `desktop-amdgpu-peak` | Includes the (assumed) existing `desktop` tuned profile.
Same as the `overclock` profile, but locks clocks to their highest configured values |
+| `desktop-amdgpu-VR` | Includes the (assumed) existing `desktop` tuned profile.
Adjusts the GPU power limit, clocks, _and_ the voltage curve.
Uses the predefined `VR` profile in the driver. See `/sys/class/drm/card*/device/pp_power_profile_mode` |
+| `latency-performance-amdgpu-custom` | Includes the existing `latency-performance` tuned profile.
Like the existing GPU profiles (eg: `VR`), this also adjusts the GPU power limit, clocks, _and_ the voltage curve.
This differs by using the `custom` profile in the driver. This opens up further tweaking of the power/clock heuristics through the driver (currently manual). see: [pp-dpm](https://docs.kernel.org/gpu/amdgpu/thermal.html#pp-dpm) |
+
+**Note**: This is non-exhaustive, see the variables `base_profiles` and `amdgpu_profiles` below for the (default) sources of the merged profile mapping
## Notable variables
These are the variables you're likely to want to change. They are defined in [playbook.yml](playbook.yml)
-| Variable | Description |
-|------------------------|---------------------------------------------------------------------------------------|
-| gpu_clock_min | Sets the minimum (dynamic) GPU clock (in `Mhz`) for the non-default `amdgpu` profiles |
-| gpu_clock_max | Sets the maximum (dynamic) GPU clock (in `MHz`) for the non-default `amdgpu` profiles |
-| gpumem_clock_static | Sets the _static_ memory clock for the GPU (in `MHz`). This is *not* the _effective_ data rate. That is a multiple of this depending on the type of VRAM.
To avoid flickering this does *not* change dynamically with load. |
-| gpu_mv_offset | GPU core voltage offset. Takes +/- some integer in millivolts. Can be used to both over _and_ under volt. eg: `-50` _(undervolt `50mV` or `0.05V`)_ |
-| base_profiles | List of base tuned profiles to clone in the new AMDGPU profiles. Defaults based on `Fedora` |
-| gpu_power_multi | Dictionary with two keys, `default` and `overclock`. Expects two floats to set a power limit relative to the board _capability_. Example: `1.0` is full board capability, `0.5` is 50%. |
+| Variable | Description | In-playbook |
+|------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| gpu_clock_min | Sets the minimum (dynamic) GPU clock (in `MHz`) for the non-default `amdgpu` profiles | `700` |
+| gpu_clock_max | Sets the maximum (dynamic) GPU clock (in `MHz`) for the non-default `amdgpu` profiles | `2600`, results in `2.6GHz` (rounded); mild overclock |
+| gpumem_clock_static | Sets the _static_ memory clock for the GPU (in `MHz`). This is *not* the _effective_ data rate. That is a multiple of this depending on the type of VRAM.
To avoid flickering this does *not* change dynamically with load. | `1050`, results in just over `1GHz`; mild overclock
Actual effective clock depends on this being multiplied against the data/pump rate of the `GDDR?` GPU memory |
+| gpu_mv_offset | GPU core voltage offset. Takes +/- some integer in millivolts. Can be used to both over _and_ under volt. | `-50` (undervolt `50mV` or `0.05V`) |
+| base_profiles | List of base tuned profiles to clone in the new AMDGPU profiles. Defaults based on `Fedora` |
default:| +
diff --git a/playbook.yml b/playbook.yml
index eec99dd..76d2f95 100644
--- a/playbook.yml
+++ b/playbook.yml
@@ -7,23 +7,12 @@
     - role: tuned_amdgpu
       # note: 'gpu_*' vars only apply with the 'custom' suffixed profiles created by this tooling
       # profiles based on the 'default' amdgpu power profile mode use default clocks
-      #
-      # the connected AMD GPU is automatically discovered - assumes one
-      # on swap to other AMD cards to avoid instability:
-      # 'rm -rfv /etc/tuned/*amdgpu*'
-      gpu_clock_min: "750" # default 500, for best performance: near maximum. applies with 'overclock' tuned profile
-      gpu_clock_max: "2675" # default somewhere around 2529 to 2660
-      gpumem_clock_static: "1075"
-      gpu_power_multi:
-        default: 0.869969040247678 # 281W - real default
-        overclock: 0.928792569659443 # 300W - slight boost
-# overclock: 1.0 # 323W - full board capability
+      gpu_clock_min: "750" # default 500
+      gpu_clock_max: "2600" # default 2529
+      gpumem_clock_static: "1050"
       # optional, applies offset (+/-) to GPU voltage by provided mV
-      # gpu_mv_offset: "-25"
-      # gpu_mv_offset: "+50" # add 50mV or 0.05V
-      gpu_mv_offset: "+25" # add 25mV or 0.025V
+      gpu_mv_offset: "-50" # '-50' undervolts GPU core voltage 50mV or 0.05V
-      # mostly untested, there be dragons/instability
       #
       # list of source tuned profiles available on Fedora (TODO: should dynamically discover)
       base_profiles:
@@ -34,3 +23,27 @@
         - network-throughput
         - powersave
         - virtual-host
+      #
+      # mapping of typical Navi generation power profiles from:
+      # /sys/class/drm/card*/device/pp_power_profile_mode
+      # ref: https://www.kernel.org/doc/html/v4.20/gpu/amdgpu.html#pp-power-profile-mode
+      # 'pwr_cap_multi' is multiplied against board *limit* to determine profile wattage; 0.5 = 50%
+      # values below reflect my 6900XT
+      amdgpu_profiles:
+        default:
+          pwrmode: 0
+          pwr_cap_multi: 0.789473684210526 # 255W - default
+        3D:
+          pwrmode: 1
+          pwr_cap_multi: 0.789473684210526 # 255W - default
+        VR:
+          pwrmode: 4
+          pwr_cap_multi: 0.789473684210526 # 255W - default
+        compute:
+          pwrmode: 5
+          pwr_cap_multi: 0.789473684210526 # 255W - default
+        custom:
+          pwrmode: 6
+          pwr_cap_multi: 0.869969040247678 # 281W - slight boost
+      # both dictionaries are merged to create new 'tuned' profiles. eg:
+      # 'balanced-amdgpu-default', 'balanced-amdgpu-3D', 'balanced-amdgpu-video'
diff --git a/power_max multi tab calculator.ods b/power_max multi tab calculator.ods
index ea8c9b0..de93923 100644
Binary files a/power_max multi tab calculator.ods and b/power_max multi tab calculator.ods differ
diff --git a/roles/tuned_amdgpu/defaults/main.yml b/roles/tuned_amdgpu/defaults/main.yml
index 6792dcb..de80a28 100644
--- a/roles/tuned_amdgpu/defaults/main.yml
+++ b/roles/tuned_amdgpu/defaults/main.yml
@@ -1,12 +1,15 @@
 ---
 # defaults file for tuned_amdgpu
 #
+# vars handling unit conversion RE: power capabilities/limits
+# the discovered board limit for power capability; in microWatts, then converted
+power_max: "{{ power_max_b64['content'] | b64decode }}"
+board_watts: "{{ power_max | int / 1000000 }}"
 # internals for profile power calculations
 # item in the context of the with_nested loops in the play
-profile_name: "{{ item.0 }}"
-
-amdgpu_profiles:
-  - default
-  - overclock
-  - peak
+profile_name: "{{ item.0.key }}"
+profile_percentage: "{{ (item.0.value.pwr_cap_multi * 100.0) | round(2) }}"
+profile_multi: "{{ item.0.value.pwr_cap_multi }}"
+profile_microwatts: "{{ power_max | float * profile_multi | float }}"
+profile_watts: "{{ profile_microwatts | int / 1000000 }}"
diff --git a/roles/tuned_amdgpu/files/profile-common.sh b/roles/tuned_amdgpu/files/profile-common.sh
deleted file mode 100644
index 5970513..0000000
--- a/roles/tuned_amdgpu/files/profile-common.sh
+++ /dev/null
@@ -1,35 +0,0 @@
-#!/bin/bash
-#
-# 'common' file sourced by other scripts under tuned profile
-#
-# dynamically determine the connected GPU using the DRM subsystem
-CARD=$(/usr/bin/grep -ls ^connected /sys/class/drm/*/status | /usr/bin/grep -o 'card[0-9]' | /usr/bin/sort | /usr/bin/uniq | /usr/bin/sort -h | /usr/bin/tail -1)
-
-function get_hwmon_dir() {
-  CARD_DIR="/sys/class/drm/${1}/device/"
-  for CANDIDATE in "${CARD_DIR}"/hwmon/hwmon*; do
-    if [[ -f "${CANDIDATE}"/power1_cap ]]; then
-      # found a valid hwmon dir
-      echo "${CANDIDATE}"
-    fi
-  done
-}
-
-# determine the hwmon directory
-HWMON_DIR=$(get_hwmon_dir "${CARD}")
-
-# read all of the power profiles, used to get the IDs for assignment later
-PROFILE_MODES=$(< /sys/class/drm/"${CARD}"/device/pp_power_profile_mode)
-
-# get power capability; later used determine limits
-read -r -d '' POWER_CAP < "$HWMON_DIR"/power1_cap_max
-
-# enable THP; profile enables the 'vm.compaction_proactiveness' sysctl
-# improves allocation latency
-echo 'always' | tee /sys/kernel/mm/transparent_hugepage/enabled
-
-# export determinations
-export CARD
-export HWMON_DIR
-export PROFILE_MODES
-export POWER_CAP
diff --git a/roles/tuned_amdgpu/handlers/main.yml b/roles/tuned_amdgpu/handlers/main.yml
index c9a9ad5..60384eb 100644
--- a/roles/tuned_amdgpu/handlers/main.yml
+++ b/roles/tuned_amdgpu/handlers/main.yml
@@ -4,4 +4,3 @@
   ansible.builtin.service:
     name: tuned
     state: restarted
-  become: true
diff --git a/roles/tuned_amdgpu/tasks/main.yml b/roles/tuned_amdgpu/tasks/main.yml
index 4cc4e42..5f93274 100644
--- a/roles/tuned_amdgpu/tasks/main.yml
+++ b/roles/tuned_amdgpu/tasks/main.yml
@@ -28,57 +28,70 @@
   when: (fed_ppdtuned_swap is not defined) or ('tuned' not in ansible_facts.packages)
   become: true
 
-- name: Ensure dynamic tuning is disabled
-  ansible.builtin.lineinfile:
-    path: /etc/tuned/tuned-main.conf
-    regexp: '^dynamic_tuning.*='
-    line: 'dynamic_tuning = 0'
-  notify: Restart tuned
-  become: true
+- name: Determine GPU device in drm subsystem
+  ansible.builtin.shell:
+    cmd: grep -ls ^connected /sys/class/drm/*/status | grep -o card[0-9] | sort | uniq | sort -h | tail -1
+    executable: /bin/bash
+  changed_when: false
+  register: card
+
+- name: Find hwmon/max power capability file for {{ card.stdout }}
+  ansible.builtin.find:
+    paths: /sys/class/drm/{{ card.stdout }}/device/hwmon
+    file_type: file
+    recurse: true
+    use_regex: true
+    patterns:
+      - '^power1_cap_max$'
+  register: hwmon
+
+- name: Find hwmon/current power limit file for {{ card.stdout }}
+  ansible.builtin.find:
+    paths: /sys/class/drm/{{ card.stdout }}/device/hwmon
+    file_type: file
+    recurse: true
+    use_regex: true
+    patterns:
+      - '^power1_cap$'
+  register: powercap_set
+
+- name: Get max power capability for {{ card.stdout }}
+  ansible.builtin.slurp:
+    src: "{{ hwmon.files.0.path }}"
+  register: power_max_b64
 
 - name: Create custom profile directories
   ansible.builtin.file:
     state: directory
-    path: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0 }}
+    path: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0.key }}
     mode: "0755"
   with_nested:
-    - "{{ amdgpu_profiles }}"
+    - "{{ lookup('dict', amdgpu_profiles) }}"
     - "{{ base_profiles }}"
   become: true
 
-- name: Copy 'common' AMDGPU script for all profiles
-  ansible.builtin.copy:
-    src: profile-common.sh
-    dest: "/etc/tuned/{{ item.1 }}-amdgpu-{{ item.0 }}/amdgpu-common.sh"
-    mode: "0644" # sourced, doesn't require executable bit
-    owner: root
-    group: root
-  notify: Restart tuned
-  with_nested:
-    - "{{ amdgpu_profiles }}"
-    - "{{ base_profiles }}"
-  become: true
-
-- name: Template custom AMDGPU profile scripts
+- name: Template AMDGPU control/reset scripts
   ansible.builtin.template:
-    src: amdgpu-profile-{{ item.0 }}.sh.j2
-    dest: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0 }}/amdgpu-clock.sh
+    src: templates/amdgpu-clock.sh.j2
+    dest: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0.key }}/amdgpu-clock.sh
     owner: root
     group: root
     mode: "0755"
-  loop: "{{ amdgpu_profiles | product(base_profiles) | list }}"
+  with_nested:
+    - "{{ lookup('dict', amdgpu_profiles) }}"
+    - "{{ base_profiles }}"
   notify: Restart tuned
   become: true
 
-- name: Template tuned.conf for custom profiles
+- name: Template custom tuned profiles
   ansible.builtin.template:
     src: templates/tuned.conf.j2
-    dest: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0 }}/tuned.conf
+    dest: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0.key }}/tuned.conf
     owner: root
     group: root
     mode: "0644"
   with_nested:
-    - "{{ amdgpu_profiles }}"
+    - "{{ lookup('dict', amdgpu_profiles) }}"
     - "{{ base_profiles }}"
   notify: Restart tuned
   become: true
diff --git a/roles/tuned_amdgpu/templates/amdgpu-clock.sh.j2 b/roles/tuned_amdgpu/templates/amdgpu-clock.sh.j2
new file mode 100644
index 0000000..90c5f0b
--- /dev/null
+++ b/roles/tuned_amdgpu/templates/amdgpu-clock.sh.j2
@@ -0,0 +1,72 @@
+#!/bin/bash
+# script for tuned AMDGPU clock control
+# configures GPU power/clock characteristics
+# clocks/power in 3D are dynamic based on need/usage
+#
+# for 'amdgpu-default' tuned profiles, this will reset the characteristics to default
+# for others this will apply overclocking settings -- leaving clock choices to the associated power profile (eg: VR)
+#
+# rendered by Ansible with environment-appropriate values:
+# card #, eg: card0
+# path to discovered sysfs device files (power/clock/voltage control)
+#
+# AMDGPU driver/sysfs references:
+# https://01.org/linuxgraphics/gfx-docs/drm/gpu/amdgpu.html
+# https://docs.kernel.org/gpu/amdgpu/thermal.html
+
+{# done this way to avoid issues with the card number possibly shifting after playbook run #}
+# dynamically determine the connected GPU using the DRM subsystem
+CARD=$(/usr/bin/grep -ls ^connected /sys/class/drm/*/status | /usr/bin/grep -o 'card[0-9]' | /usr/bin/sort | /usr/bin/uniq | /usr/bin/sort -h | /usr/bin/tail -1)
+
+{# begin the templated script for 'default' profiles to reset state #}
+{% if 'default' in profile_name %}
+# set power state transition heuristics to default
+echo '{{ item.0.value.pwrmode }}' | tee /sys/class/drm/"${CARD}"/device/pp_power_profile_mode
+
+# set control mode back to auto
+# attempts to dynamically set optimal power profile for (load) conditions
+echo 'auto' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
+
+# reset any existing profile clock changes
+echo 'r' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
+
+# give '{{ profile_name }}' profile ~{{ profile_percentage }}% (rounded) of the max power capability
+# {{ profile_watts }} Watts of {{ board_watts }} total
+echo '{{ profile_microwatts | int }}' | tee '{{ powercap_set.files.0.path }}'
+{% else %}
+{# begin the templated script for non-default AMD GPU profiles, eg: 'VR' or '3D_FULL_SCREEN' #}
+# set manual control mode
+# allows control via 'pp_dpm_mclk', 'pp_dpm_sclk', 'pp_dpm_pcie', 'pp_dpm_fclk', and 'pp_power_profile_mode' files
+# only interested in 'pp_power_profile_mode' for power and 'pp_dpm_mclk' for memory clock (flickering).
+# GPU clocks are dynamic based on (load) condition
+echo 'manual' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
+
+# set power state transition heuristics to '{{ profile_name }}' profile
+echo '{{ item.0.value.pwrmode }}' | tee /sys/class/drm/"${CARD}"/device/pp_power_profile_mode
+
+# give '{{ profile_name }}' profile ~{{ profile_percentage }}% (rounded) of the max power capability
+# {{ profile_watts }} Watts of {{ board_watts }} total
+echo '{{ profile_microwatts | int }}' | tee '{{ powercap_set.files.0.path }}'
+
+# set the minimum GPU clock
+echo 's 0 {{ gpu_clock_min }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
+
+# set the maximum GPU clock
+echo 's 1 {{ gpu_clock_max }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
+
+# set the maximum GPU *memory* clock
+echo 'm 1 {{ gpumem_clock_static }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
+{% if gpu_mv_offset is defined %}
+
+# offset GPU voltage {{ gpu_mv_offset }}mV
+echo 'vo {{ gpu_mv_offset }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
+{% endif %}
+
+# commit the changes
+echo 'c' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
+
+# force GPU memory into highest clock (fix flickering)
+# pp_dpm_*clk settings are unintuitive, giving profiles that may be used
+# opt not to set the others (eg: sclk/fclk) - those should remain for benefits from the curve
+echo '3' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_mclk
+{% endif %}
diff --git a/roles/tuned_amdgpu/templates/amdgpu-profile-default.sh.j2 b/roles/tuned_amdgpu/templates/amdgpu-profile-default.sh.j2
deleted file mode 100644
index 4bd1282..0000000
--- a/roles/tuned_amdgpu/templates/amdgpu-profile-default.sh.j2
+++ /dev/null
@@ -1,36 +0,0 @@
-#!/bin/bash
-# script for tuned AMDGPU clock control
-# configures GPU power/clock characteristics
-# clocks/power in 3D are dynamic based on need/usage
-#
-# for 'amdgpu-default' tuned profiles, this will reset the characteristics to default
-# for others this will apply overclocking settings -- leaving clock choices to the associated power profile (eg: VR)
-#
-# rendered by Ansible with environment-appropriate values:
-# card #, eg: card0
-# path to discovered sysfs device files (power/clock/voltage control)
-#
-# AMDGPU driver/sysfs references:
-# https://01.org/linuxgraphics/gfx-docs/drm/gpu/amdgpu.html
-# https://docs.kernel.org/gpu/amdgpu/thermal.html
-#
-# start by including the 'common' script; determines card/hwmon dir/power profiles/power capability
-. $(dirname "${BASH_SOURCE[0]}")/amdgpu-common.sh
-
-{# begin the templated script for 'default' profiles to reset state #}
-# set control mode back to auto
-# attempts to dynamically set optimal power profile for (load) conditions
-echo 'auto' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
-
-# reset any existing profile clock changes
-echo 'r' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
-
-# adjust power limit using multiplier against board capability
-POWER_LIM_DEFAULT=$(/usr/bin/awk -v m="$POWER_CAP" -v n={{ gpu_power_multi.default }} 'BEGIN {printf "%.0f", (m*n)}')
-echo "$POWER_LIM_DEFAULT" | tee "${HWMON_DIR}/power1_cap"
-
-# extract the power-saving profile ID number
-PROF_DEFAULT_NUM=$(/usr/bin/awk '$0 ~ /BOOTUP_DEFAULT.*:/ {print $1}' <<< "$PROFILE_MODES")
-
-# reset power/clock heuristics to power-saving
-echo "${PROF_DEFAULT_NUM}" | tee /sys/class/drm/"${CARD}"/device/pp_power_profile_mode
diff --git a/roles/tuned_amdgpu/templates/amdgpu-profile-overclock.sh.j2 b/roles/tuned_amdgpu/templates/amdgpu-profile-overclock.sh.j2
deleted file mode 100644
index 1f0aa3a..0000000
--- a/roles/tuned_amdgpu/templates/amdgpu-profile-overclock.sh.j2
+++ /dev/null
@@ -1,58 +0,0 @@
-#!/bin/bash
-# script for tuned AMDGPU clock control
-# configures GPU power/clock characteristics
-# clocks/power in 3D are dynamic based on need/usage
-#
-# for 'amdgpu-default' tuned profiles, this will reset the characteristics to default
-# for others this will apply overclocking settings -- leaving clock choices to the associated power profile (eg: VR)
-#
-# rendered by Ansible with environment-appropriate values:
-# card #, eg: card0
-# path to discovered sysfs device files (power/clock/voltage control)
-#
-# AMDGPU driver/sysfs references:
-# https://01.org/linuxgraphics/gfx-docs/drm/gpu/amdgpu.html
-# https://docs.kernel.org/gpu/amdgpu/thermal.html
-#
-# start by including the 'common' script; determines card/hwmon dir/power profiles/power capability
-. $(dirname "${BASH_SOURCE[0]}")/amdgpu-common.sh
-
-{# begin the templated script for 'overclocked' AMD GPU profiles based on the existing tuned profiles #}
-# set the minimum GPU clock - for best performance, this should be near the maximum
-# RX6000 series power management *sucks*
-echo 's 0 {{ gpu_clock_min }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
-
-# set the maximum GPU clock
-echo 's 1 {{ gpu_clock_max }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
-
-# set the GPU *memory* clock
-# normally this would appear disregarded, memory clocked at the minimum allowed by the overdrive (OD) range
-# it follows the core clock; if both 0/1 profiles for _it_ are high enough, the memory will follow
-echo 'm 1 {{ gpumem_clock_static }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
-{% if gpu_mv_offset is defined %}
-
-# offset GPU voltage {{ gpu_mv_offset }}mV
-echo 'vo {{ gpu_mv_offset }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
-{% endif %}
-
-# commit the changes
-echo 'c' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
-
-# force GPU core and memory into highest clocks (fix flickering and poor power management)
-# set manual control mode
-# allows control via 'pp_dpm_mclk', 'pp_dpm_sclk', 'pp_dpm_pcie', 'pp_dpm_fclk', and 'pp_power_profile_mode' files
-echo 'manual' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
-
-# adjust power limit using multiplier against board capability
-POWER_LIM_OC=$(/usr/bin/awk -v m="$POWER_CAP" -v n={{ gpu_power_multi.overclock }} 'BEGIN {printf "%.0f", (m*n)}')
-echo "$POWER_LIM_OC" | tee "${HWMON_DIR}/power1_cap"
-
-# avoid display flickering, force OC'd memory to highest clock
-echo '3' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_mclk
-
-# extract the VR power profile ID number
-PROF_VR_NUM=$(/usr/bin/awk '$0 ~ /VR.*:/ {print $1}' <<< "$PROFILE_MODES")
-
-# force 'overclocked' profile to 'VR' power/clock heuristics
-# latency/frame timing seemed favorable with relatively-close minimum clocks
-echo "${PROF_VR_NUM}" | tee /sys/class/drm/"${CARD}"/device/pp_power_profile_mode
diff --git a/roles/tuned_amdgpu/templates/amdgpu-profile-peak.sh.j2 b/roles/tuned_amdgpu/templates/amdgpu-profile-peak.sh.j2
deleted file mode 100644
index 14105a8..0000000
--- a/roles/tuned_amdgpu/templates/amdgpu-profile-peak.sh.j2
+++ /dev/null
@@ -1,66 +0,0 @@
-#!/bin/bash
-# script for tuned AMDGPU clock control
-# configures GPU power/clock characteristics
-# clocks/power in 3D are dynamic based on need/usage
-#
-# for 'amdgpu-default' tuned profiles, this will reset the characteristics to default
-# for others this will apply overclocking settings -- leaving clock choices to the associated power profile (eg: VR)
-#
-# rendered by Ansible with environment-appropriate values:
-# card #, eg: card0
-# path to discovered sysfs device files (power/clock/voltage control)
-#
-# AMDGPU driver/sysfs references:
-# https://01.org/linuxgraphics/gfx-docs/drm/gpu/amdgpu.html
-# https://docs.kernel.org/gpu/amdgpu/thermal.html
-#
-# start by including the 'common' script; determines card/hwmon dir/power profiles/power capability
-. $(dirname "${BASH_SOURCE[0]}")/amdgpu-common.sh
-
-{# begin the templated script for 'overclocked' AMD GPU profiles based on the existing tuned profiles #}
-# set the minimum GPU clock - for best performance, this should be near the maximum
-# RX6000 series power management *sucks*
-echo 's 0 {{ gpu_clock_min }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
-
-# set the maximum GPU clock
-echo 's 1 {{ gpu_clock_max }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
-
-# set the GPU *memory* clock
-# normally this would appear disregarded, memory clocked at the minimum allowed by the overdrive (OD) range
-# it follows the core clock; if both 0/1 profiles for _it_ are high enough, the memory will follow
-echo 'm 1 {{ gpumem_clock_static }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
-{% if gpu_mv_offset is defined %}
-
-# offset GPU voltage {{ gpu_mv_offset }}mV
-echo 'vo {{ gpu_mv_offset }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
-{% endif %}
-
-# commit the changes
-echo 'c' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
-
-# force GPU core and memory into highest clocks (fix flickering and poor power management)
-# set manual control mode
-# allows control via 'pp_dpm_mclk', 'pp_dpm_sclk', 'pp_dpm_pcie', 'pp_dpm_fclk', and 'pp_power_profile_mode' files
-echo 'manual' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
-
-# adjust power limit using multiplier against board capability
-POWER_LIM_OC=$(/usr/bin/awk -v m="$POWER_CAP" -v n={{ gpu_power_multi.overclock }} 'BEGIN {printf "%.0f", (m*n)}')
-echo "$POWER_LIM_OC" | tee "${HWMON_DIR}/power1_cap"
-
-# pp_dpm_*clk settings are unintuitive, giving profiles that may be used
-echo '1' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_sclk
-echo '3' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_mclk
-echo '2' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_fclk
-echo '2' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_socclk
-
-# extract the VR power profile ID number
-PROF_VR_NUM=$(/usr/bin/awk '$0 ~ /VR.*:/ {print $1}' <<< "$PROFILE_MODES")
-
-# force 'overclocked' profile to 'VR' power/clock heuristics
-# latency/frame timing seemed favorable with relatively-close minimum clocks
-echo "${PROF_VR_NUM}" | tee /sys/class/drm/"${CARD}"/device/pp_power_profile_mode
-
-# note 4/8/2023: instead of 'manual'... try dealing with broken power management, force clocks to high
-# ref: https://gitlab.freedesktop.org/drm/amd/-/issues/1500
-# followup: doesn't work that well in practice, still flaky on clocks/frame times
-#echo 'high' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
diff --git a/roles/tuned_amdgpu/templates/tuned.conf.j2 b/roles/tuned_amdgpu/templates/tuned.conf.j2
index 729e025..636abad 100644
--- a/roles/tuned_amdgpu/templates/tuned.conf.j2
+++ b/roles/tuned_amdgpu/templates/tuned.conf.j2
@@ -1,22 +1,16 @@
 [main]
 include={{ item.1 }}
-summary={{ item.1 }} + TCP/RAID tweaks + AMDGPU {{ item.0 }}
+summary={{ item.1 }} + TCP/RAID tweaks + AMDGPU pp_power_profile_mode = {{ item.0.value.pwrmode }} ({{ item.0.key }})
 
 [sysctl]
-# allow regular users to see the kernel ring buffer
-kernel.dmesg_restrict=0
 net.core.default_qdisc=fq
 # 'bbr2' requires a [modified] supporting kernel - stock Fedora kernels do *not* support it (currently)
 # eg: 'kernel-xanmode-edge' from COPR 'rmnscnce/kernel-xanmod'
 net.ipv4.tcp_congestion_control=bbr2
 net.core.rmem_max=33554432
 net.core.wmem_max=33554432
-dev.raid.speed_limit_min=1000000
-dev.raid.speed_limit_max=6000000
-# improve THP allocation latency, compact in background
-vm.compaction_proactiveness=30
-# make page lock theft slightly more fair
-vm.page_lock_unfairness=1
+dev.raid.speed_limit_min=600000
+dev.raid.speed_limit_max=9000000
 # allow some games to run (eg: DayZ)
 vm.max_map_count=1048576
 
@@ -26,11 +20,3 @@ vm.max_map_count=1048576
 [gpuclockscript]
 type=script
 script=${i:PROFILE_DIR}/amdgpu-clock.sh
-
-# for SSDs with no RPM, set no IO scheduler
-[ssdnosched]
-type=disk
-devices_udev_regex=(ID_ATA_ROTATION_RATE_RPM=0)
-# elevator=none
-elevator=kyber
-# elevator=mq-deadline
pwrmode: 0
pwr_cap_multi: 0.75
# 75% relatively safe default
VR:
pwrmode: 4
pwr_cap_multi: 0.8
# 80%, likely slight boost
custom:
pwrmode: 6
pwr_cap_multi: 1.0
# 100%, full GPU board capability
# warning: significantly increased heat
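
The `pwr_cap_multi` values are multiplied against the board's maximum power capability (read from `power1_cap_max`, in microwatts) to produce the limit written to `power1_cap`. A minimal sketch of that arithmetic, reusing the `awk` rounding from the removed `amdgpu-profile-*.sh.j2` templates; the function name `power_limit_uw` and the 323W board maximum (taken from the old `playbook.yml` comment) are illustrative assumptions, not part of the role:

```shell
#!/bin/bash
# hypothetical helper, not part of the role: scale a board power
# capability (in microwatts) by a profile multiplier, rounded to an integer
power_limit_uw() {
  local cap_uw="$1" multi="$2"
  awk -v m="$cap_uw" -v n="$multi" 'BEGIN { printf "%.0f", (m * n) }'
}

# assumed board maximum of 323W, as it would be read from hwmon 'power1_cap_max'
BOARD_MAX_UW=323000000

# multiplier 1.0: full board capability
power_limit_uw "$BOARD_MAX_UW" 1.0
echo

# multiplier used for the 'default' profile in playbook.yml (commented there as 255W)
power_limit_uw "$BOARD_MAX_UW" 0.789473684210526
echo
```

Dividing either result by `1000000` gives the wattage, which is how the playbook comments such as `# 255W - default` relate to their multipliers.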