Compare commits
10 commits
f295287699
...
81943ed387
Author | SHA1 | Date | |
---|---|---|---|
81943ed387 | |||
7e709007fd | |||
e45aa26eaa | |||
7c50e771e2 | |||
5fdc4fe6a2 | |||
c833a9a36e | |||
bb03fc2cc2 | |||
f2e4923658 | |||
4f6314f270 | |||
e15ff1af45 |
12 changed files with 282 additions and 173 deletions
38
README.md
|
@@ -1,39 +1,39 @@
|
||||||
# tuned-amdgpu
|
# tuned-amdgpu
|
||||||
|
|
||||||
Hacky solution to integrate AMDGPU power profile control in `tuned` with Ansible
|
Hacky solution to integrate AMDGPU power control and overclocking in `tuned` with Ansible
|
||||||
|
|
||||||
Takes a list of existing `tuned` profiles and creates new ones based on them. These new profiles include AMDGPU power/clock parameters
|
Takes a list of existing `tuned` profiles and creates new ones based on them. These new profiles include AMDGPU power/clock parameters
|
||||||
|
|
||||||
An attempt is made to discover the active GPU using the 'connected' state in the `DRM` subsystem. Example:
|
An attempt is made to discover the active GPU using the 'connected' state in the `DRM` subsystem. Example:
|
||||||
```
|
|
||||||
$ grep -ls ^connected /sys/class/drm/*/status | grep -o card[0-9] | sort | uniq | sort -h | tail -1
|
```bash
|
||||||
|
~ $ grep -ls ^connected /sys/class/drm/*/status | grep -o card[0-9] | sort | uniq | sort -h | tail -1
|
||||||
card1
|
card1
|
||||||
```
|
```
|
||||||
|
_Warning_: This is only tested with `RX6000` series GPUs; other generations will probably *not* work properly. Use at your own risk!
|
||||||
_Warning_: This is only tested with `RX6000` series GPUs; older AMD GPUs will probably not work properly. Use at your own risk!
|
|
||||||
|
|
||||||
## Profiles
|
## Profiles
|
||||||
|
|
||||||
An example of the output/provided profiles follows
|
Two _'profiles'_ are in each name:
|
||||||
|
|
||||||
|
- before `amdgpu` is the source profile provided with `tuned`
|
||||||
|
- after `amdgpu` tells the GPU clock profile offered, outlined below
|
||||||
|
|
||||||
| Output profile | Description |
|
| Output profile | Description |
|
||||||
|:---|---|
|
|:---|---|
|
||||||
| `balanced-amdgpu-default` | Includes the (assumed) existing `balanced` tuned profile.<br/><br/>Only adjusts the GPU power limit (typically lower). Clocks/voltage curve remain the default. |
|
| `balanced-amdgpu-default` | Includes the (assumed) existing `balanced` tuned profile.<br/><br/>Only adjusts the GPU power limit (typically lower). Clocks/voltage curve remain the default. |
|
||||||
| `desktop-amdgpu-VR` | Includes the (assumed) existing `desktop` tuned profile.<br/><br/>Adjusts the GPU power limit, clocks, _and_ the voltage curve.<br/><br/>Uses the predefined `VR` profile in the driver. See `/sys/class/drm/card*/device/pp_power_profile_mode` |
|
| `desktop-amdgpu-overclock` | Includes the (assumed) existing `desktop` tuned profile.<br/><br/>Adjusts the GPU power limit, clocks, _and_ the voltage curve. |
|
||||||
| `latency-performance-amdgpu-custom` | Includes the existing `latency-performance` tuned profile.<br/><br/>Like the existing GPU profiles (eg: _VR_), this also adjusts the GPU power limit, clocks, _and_ the voltage curve.<br/><br/>This differs by using the `custom` profile in the driver. This opens up further tweaking of the power/clock heuristics through the driver (currently manual). See: [pp-dpm](https://docs.kernel.org/gpu/amdgpu/thermal.html#pp-dpm) |
|
| `desktop-amdgpu-peak` | Includes the (assumed) existing `desktop` tuned profile.<br/><br/>Same as the `overclock` profile, but locks clocks to their highest configured values |
|
||||||
|
|
||||||
**Note**: This is non-exhaustive, see the variables `base_profiles` and `amdgpu_profiles` below for the (default) sources of the merged profile mapping
|
|
||||||
|
|
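The naming scheme described in the Profiles section (source `tuned` profile before `amdgpu`, GPU clock profile after) can be sketched as a nested loop. This is only an illustration: the `list_profile_names` function is hypothetical and not part of the role, the GPU profile names are taken from the role defaults in this change, and the base profile list is shortened.

```bash
#!/bin/bash
# GPU clock profiles from the role defaults; base profiles shortened for the example
amdgpu_profiles="default overclock peak"
base_profiles="balanced desktop latency-performance"

# hypothetical helper: print every '<base>-amdgpu-<gpu profile>' combination
list_profile_names() {
    # the variables are left unquoted on purpose so each word becomes one loop item
    for gpu in $amdgpu_profiles; do
        for base in $base_profiles; do
            echo "${base}-amdgpu-${gpu}"
        done
    done
}

list_profile_names
```

With three GPU profiles and three base profiles this prints nine names, for example `desktop-amdgpu-peak`. Each name corresponds to one directory the role would create under `/etc/tuned/`.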
||||||
## Notable variables
|
## Notable variables
|
||||||
|
|
||||||
These are the variables you're likely to want to change. They are defined in [playbook.yml](playbook.yml)
|
These are the variables you're likely to want to change. They are defined in [playbook.yml](playbook.yml)
|
||||||
|
|
||||||
| Variable | Description | In-playbook |
|
| Variable | Description |
|
||||||
|------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
|------------------------|---------------------------------------------------------------------------------------|
|
||||||
| gpu_clock_min | Sets the minimum (dynamic) GPU clock (in `MHz`) for the non-default `amdgpu` profiles | `700` |
|
| gpu_clock_min | Sets the minimum (dynamic) GPU clock (in `MHz`) for the non-default `amdgpu` profiles |
|
||||||
| gpu_clock_max | Sets the maximum (dynamic) GPU clock (in `MHz`) for the non-default `amdgpu` profiles | `2600`, results in `2.6GHz` (rounded); mild overclock |
|
| gpu_clock_max | Sets the maximum (dynamic) GPU clock (in `MHz`) for the non-default `amdgpu` profiles |
|
||||||
| gpumem_clock_static | Sets the _static_ memory clock for the GPU (in `MHz`). This is *not* the _effective_ data rate. That is a multiple of this depending on the type of VRAM.<br/><br/>To avoid flickering this does *not* change dynamically with load. | `1050`, results in just over `1GHz`; mild overclock<br/><br/>Actual effective clock depends on this being multiplied against the data/pump rate of the `GDDR?` GPU memory |
|
| gpumem_clock_static | Sets the _static_ memory clock for the GPU (in `MHz`). This is *not* the _effective_ data rate. That is a multiple of this depending on the type of VRAM.<br/><br/>To avoid flickering this does *not* change dynamically with load. |
|
||||||
| gpu_mv_offset | GPU core voltage offset. Takes +/- some integer in millivolts. Can be used to both over _and_ under volt. | `-50` (undervolt `50mV` or `0.05V`) |
|
| gpu_mv_offset | GPU core voltage offset. Takes +/- some integer in millivolts. Can be used to both overvolt _and_ undervolt, eg: `-50` _(undervolt by `50mV` or `0.05V`)_ |
|
||||||
| base_profiles | List of base tuned profiles to clone in the new AMDGPU profiles. Defaults based on `Fedora` | <ul><li>`balanced`</li><li>`desktop`</li><li>`latency-performance`</li><li>`network-latency`</li><li>`network-throughput`</li><li>`powersave`</li><li>`virtual-host`</li>|
|
| base_profiles | List of base tuned profiles to clone in the new AMDGPU profiles. Defaults based on `Fedora` |
|
||||||
| amdgpu_profiles | Dictionary mapping the AMDGPU power profiles found in `/sys/class/drm/card*/device/pp_power_profile_mode` and custom power limits.<br/><br>For each item, two keys: `pwrmode` and `pwr_cap_multi`.<br/><br/>`pwrmode` maps to the number assigned in `/sys` above.<br/>`pwr_cap_multi` is a multiplier against board power capability. Must be a float, eg: `0.5` for *50%* | <pre>default:<br/> pwrmode: 0<br/> pwr_cap_multi: 0.75<br/> # 75% relatively safe default<br/>VR:<br/> pwrmode: 4<br/> pwr_cap_multi: 0.8<br/> # 80%, likely slight boost<br/>custom:<br/> pwrmode: 6<br/> pwr_cap_multi: 1.0<br/> # 100%, full GPU board capability<br/> # warning: significantly increased heat</pre>|
|
| gpu_power_multi | Dictionary with two keys, `default` and `overclock`. Expects two floats to set a power limit relative to the board _capability_. Example: `1.0` is full board capability, `0.5` is 50%. |
|
||||||
|
|
||||||
|
|
43
playbook.yml
|
@@ -7,12 +7,23 @@
|
||||||
- role: tuned_amdgpu
|
- role: tuned_amdgpu
|
||||||
# note: 'gpu_*' vars only apply with the 'custom' suffixed profiles created by this tooling
|
# note: 'gpu_*' vars only apply with the 'custom' suffixed profiles created by this tooling
|
||||||
# profiles based on the 'default' amdgpu power profile mode use default clocks
|
# profiles based on the 'default' amdgpu power profile mode use default clocks
|
||||||
gpu_clock_min: "750" # default 500
|
#
|
||||||
gpu_clock_max: "2600" # default 2529
|
# the connected AMD GPU is automatically discovered - assumes one
|
||||||
gpumem_clock_static: "1050"
|
# on swap to other AMD cards to avoid instability:
|
||||||
|
# 'rm -rfv /etc/tuned/*amdgpu*'
|
||||||
|
gpu_clock_min: "750" # default 500, for best performance: near maximum. applies with 'overclock' tuned profile
|
||||||
|
gpu_clock_max: "2675" # default somewhere around 2529 to 2660
|
||||||
|
gpumem_clock_static: "1075"
|
||||||
|
gpu_power_multi:
|
||||||
|
default: 0.869969040247678 # 281W - real default
|
||||||
|
overclock: 0.928792569659443 # 300W - slight boost
|
||||||
|
# overclock: 1.0 # 323W - full board capability
|
||||||
# optional, applies offset (+/-) to GPU voltage by provided mV
|
# optional, applies offset (+/-) to GPU voltage by provided mV
|
||||||
gpu_mv_offset: "-50"
|
# gpu_mv_offset: "-25"
|
||||||
|
# gpu_mv_offset: "+50" # add 50mV or 0.05V
|
||||||
|
gpu_mv_offset: "+25" # add 25mV or 0.025V
|
||||||
# '-50' undervolts GPU core voltage 50mV or 0.05V
|
# '-50' undervolts GPU core voltage 50mV or 0.05V
|
||||||
|
# mostly untested, there be dragons/instability
|
||||||
#
|
#
|
||||||
# list of source tuned profiles available on Fedora (TODO: should dynamically discover)
|
# list of source tuned profiles available on Fedora (TODO: should dynamically discover)
|
||||||
base_profiles:
|
base_profiles:
|
||||||
|
@@ -23,27 +34,3 @@
|
||||||
- network-throughput
|
- network-throughput
|
||||||
- powersave
|
- powersave
|
||||||
- virtual-host
|
- virtual-host
|
||||||
#
|
|
||||||
# mapping of typical Navi generation power profiles from:
|
|
||||||
# /sys/class/drm/card*/device/pp_power_profile_mode
|
|
||||||
# ref: https://www.kernel.org/doc/html/v4.20/gpu/amdgpu.html#pp-power-profile-mode
|
|
||||||
# 'pwr_cap_multi' is multiplied against board *limit* to determine profile wattage; 0.5 = 50%
|
|
||||||
# values below reflect my 6900XT
|
|
||||||
amdgpu_profiles:
|
|
||||||
default:
|
|
||||||
pwrmode: 0
|
|
||||||
pwr_cap_multi: 0.789473684210526 # 255W - default
|
|
||||||
3D:
|
|
||||||
pwrmode: 1
|
|
||||||
pwr_cap_multi: 0.789473684210526 # 255W - default
|
|
||||||
VR:
|
|
||||||
pwrmode: 4
|
|
||||||
pwr_cap_multi: 0.789473684210526 # 255W - default
|
|
||||||
compute:
|
|
||||||
pwrmode: 5
|
|
||||||
pwr_cap_multi: 0.789473684210526 # 255W - default
|
|
||||||
custom:
|
|
||||||
pwrmode: 6
|
|
||||||
pwr_cap_multi: 0.869969040247678 # 281W - slight boost
|
|
||||||
# both dictionaries are merged to create new 'tuned' profiles. eg:
|
|
||||||
# 'balanced-amdgpu-default', 'balanced-amdgpu-3D', 'balanced-amdgpu-video'
|
|
||||||
|
|
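The long decimals used for `gpu_power_multi` appear to be the target wattage divided by the board capability (323W, going by the playbook comments). A minimal sketch of deriving such a multiplier for another card follows; `derive_multiplier` is a hypothetical helper and not part of the role.

```bash
#!/bin/bash
# hypothetical helper: multiplier = desired watts / board capability watts
# $1: desired power limit in watts, $2: board power capability in watts
derive_multiplier() {
    awk -v want="$1" -v cap="$2" 'BEGIN { printf "%.15f\n", want / cap }'
}

# wattages taken from the playbook comments (values for a 6900XT)
derive_multiplier 281 323
derive_multiplier 300 323
```

Printed to 15 decimal places, these should reproduce the `default` and `overclock` values in the playbook. The board capability for a different card can be read from `power1_cap_max` (in microwatts) under the card's `hwmon` directory.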
Binary file not shown.
|
@@ -1,15 +1,12 @@
|
||||||
---
|
---
|
||||||
# defaults file for tuned_amdgpu
|
# defaults file for tuned_amdgpu
|
||||||
#
|
#
|
||||||
# vars handling unit conversion RE: power capabilities/limits
|
|
||||||
# the discovered board limit for power capability; in microWatts, then converted
|
|
||||||
power_max: "{{ power_max_b64['content'] | b64decode }}"
|
|
||||||
board_watts: "{{ power_max | int / 1000000 }}"
|
|
||||||
|
|
||||||
# internals for profile power calculations
|
# internals for profile power calculations
|
||||||
# item in the context of the with_nested loops in the play
|
# item in the context of the with_nested loops in the play
|
||||||
profile_name: "{{ item.0.key }}"
|
profile_name: "{{ item.0 }}"
|
||||||
profile_percentage: "{{ (item.0.value.pwr_cap_multi * 100.0) | round(2) }}"
|
|
||||||
profile_multi: "{{ item.0.value.pwr_cap_multi }}"
|
amdgpu_profiles:
|
||||||
profile_microwatts: "{{ power_max | float * profile_multi | float }}"
|
- default
|
||||||
profile_watts: "{{ profile_microwatts | int / 1000000 }}"
|
- overclock
|
||||||
|
- peak
|
||||||
|
|
35
roles/tuned_amdgpu/files/profile-common.sh
Normal file
|
@@ -0,0 +1,35 @@
|
||||||
|
#!/bin/bash
|
||||||
|
#
|
||||||
|
# 'common' file sourced by other scripts under tuned profile
|
||||||
|
#
|
||||||
|
# dynamically determine the connected GPU using the DRM subsystem
|
||||||
|
CARD=$(/usr/bin/grep -ls ^connected /sys/class/drm/*/status | /usr/bin/grep -o 'card[0-9]' | /usr/bin/sort | /usr/bin/uniq | /usr/bin/sort -h | /usr/bin/tail -1)
|
||||||
|
|
||||||
|
function get_hwmon_dir() {
|
||||||
|
CARD_DIR="/sys/class/drm/${1}/device/"
|
||||||
|
for CANDIDATE in "${CARD_DIR}"/hwmon/hwmon*; do
|
||||||
|
if [[ -f "${CANDIDATE}"/power1_cap ]]; then
|
||||||
|
# found a valid hwmon dir
|
||||||
|
echo "${CANDIDATE}"
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
}
|
||||||
|
|
||||||
|
# determine the hwmon directory
|
||||||
|
HWMON_DIR=$(get_hwmon_dir "${CARD}")
|
||||||
|
|
||||||
|
# read all of the power profiles, used to get the IDs for assignment later
|
||||||
|
PROFILE_MODES=$(< /sys/class/drm/"${CARD}"/device/pp_power_profile_mode)
|
||||||
|
|
||||||
|
# get power capability; later used to determine limits
|
||||||
|
read -r -d '' POWER_CAP < "$HWMON_DIR"/power1_cap_max
|
||||||
|
|
||||||
|
# enable THP; profile enables the 'vm.compaction_proactiveness' sysctl
|
||||||
|
# improves allocation latency
|
||||||
|
echo 'always' | tee /sys/kernel/mm/transparent_hugepage/enabled
|
||||||
|
|
||||||
|
# export determinations
|
||||||
|
export CARD
|
||||||
|
export HWMON_DIR
|
||||||
|
export PROFILE_MODES
|
||||||
|
export POWER_CAP
|
|
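The card discovery pipeline in the common script can be tried without touching real hardware by pointing it at a throwaway directory that imitates `/sys/class/drm`. This is a sketch under assumptions: the connector directory names are made up, `find_connected_card` is a hypothetical wrapper, and the `sort | uniq | sort -h` steps are condensed to `sort -u`, which gives the same result for single-digit card names.

```bash
#!/bin/bash
# build a fake DRM tree; the connector names here are illustrative only
fake_drm=$(mktemp -d)
mkdir -p "$fake_drm/card0-DP-1" "$fake_drm/card1-DP-1" "$fake_drm/card1-HDMI-A-1"
echo 'disconnected' > "$fake_drm/card0-DP-1/status"
echo 'connected'    > "$fake_drm/card1-DP-1/status"
echo 'connected'    > "$fake_drm/card1-HDMI-A-1/status"

# hypothetical wrapper around the discovery pipeline
# $1: directory laid out like /sys/class/drm
find_connected_card() {
    # run in a subshell so the 'cd' does not leak into the caller
    (
        cd "$1" || exit 1
        grep -ls '^connected' ./*/status | grep -o 'card[0-9]' | sort -u | tail -1
    )
}

find_connected_card "$fake_drm"
rm -rf "$fake_drm"
```

When several cards have connected outputs, `tail -1` keeps only the highest-numbered one, which matches the playbook's note that a single connected AMD GPU is assumed.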
@@ -4,3 +4,4 @@
|
||||||
ansible.builtin.service:
|
ansible.builtin.service:
|
||||||
name: tuned
|
name: tuned
|
||||||
state: restarted
|
state: restarted
|
||||||
|
become: true
|
||||||
|
|
|
@@ -28,70 +28,57 @@
|
||||||
when: (fed_ppdtuned_swap is not defined) or ('tuned' not in ansible_facts.packages)
|
when: (fed_ppdtuned_swap is not defined) or ('tuned' not in ansible_facts.packages)
|
||||||
become: true
|
become: true
|
||||||
|
|
||||||
- name: Determine GPU device in drm subsystem
|
- name: Ensure dynamic tuning is disabled
|
||||||
ansible.builtin.shell:
|
ansible.builtin.lineinfile:
|
||||||
cmd: grep -ls ^connected /sys/class/drm/*/status | grep -o card[0-9] | sort | uniq | sort -h | tail -1
|
path: /etc/tuned/tuned-main.conf
|
||||||
executable: /bin/bash
|
regexp: '^dynamic_tuning.*='
|
||||||
changed_when: false
|
line: 'dynamic_tuning = 0'
|
||||||
register: card
|
notify: Restart tuned
|
||||||
|
become: true
|
||||||
- name: Find hwmon/max power capability file for {{ card.stdout }}
|
|
||||||
ansible.builtin.find:
|
|
||||||
paths: /sys/class/drm/{{ card.stdout }}/device/hwmon
|
|
||||||
file_type: file
|
|
||||||
recurse: true
|
|
||||||
use_regex: true
|
|
||||||
patterns:
|
|
||||||
- '^power1_cap_max$'
|
|
||||||
register: hwmon
|
|
||||||
|
|
||||||
- name: Find hwmon/current power limit file for {{ card.stdout }}
|
|
||||||
ansible.builtin.find:
|
|
||||||
paths: /sys/class/drm/{{ card.stdout }}/device/hwmon
|
|
||||||
file_type: file
|
|
||||||
recurse: true
|
|
||||||
use_regex: true
|
|
||||||
patterns:
|
|
||||||
- '^power1_cap$'
|
|
||||||
register: powercap_set
|
|
||||||
|
|
||||||
- name: Get max power capability for {{ card.stdout }}
|
|
||||||
ansible.builtin.slurp:
|
|
||||||
src: "{{ hwmon.files.0.path }}"
|
|
||||||
register: power_max_b64
|
|
||||||
|
|
||||||
- name: Create custom profile directories
|
- name: Create custom profile directories
|
||||||
ansible.builtin.file:
|
ansible.builtin.file:
|
||||||
state: directory
|
state: directory
|
||||||
path: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0.key }}
|
path: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0 }}
|
||||||
mode: "0755"
|
mode: "0755"
|
||||||
with_nested:
|
with_nested:
|
||||||
- "{{ lookup('dict', amdgpu_profiles) }}"
|
- "{{ amdgpu_profiles }}"
|
||||||
- "{{ base_profiles }}"
|
- "{{ base_profiles }}"
|
||||||
become: true
|
become: true
|
||||||
|
|
||||||
- name: Template AMDGPU control/reset scripts
|
- name: Copy 'common' AMDGPU script for all profiles
|
||||||
|
ansible.builtin.copy:
|
||||||
|
src: profile-common.sh
|
||||||
|
dest: "/etc/tuned/{{ item.1 }}-amdgpu-{{ item.0 }}/amdgpu-common.sh"
|
||||||
|
mode: "0644" # sourced, doesn't require executable bit
|
||||||
|
owner: root
|
||||||
|
group: root
|
||||||
|
notify: Restart tuned
|
||||||
|
with_nested:
|
||||||
|
- "{{ amdgpu_profiles }}"
|
||||||
|
- "{{ base_profiles }}"
|
||||||
|
become: true
|
||||||
|
|
||||||
|
- name: Template custom AMDGPU profile scripts
|
||||||
ansible.builtin.template:
|
ansible.builtin.template:
|
||||||
src: templates/amdgpu-clock.sh.j2
|
src: amdgpu-profile-{{ item.0 }}.sh.j2
|
||||||
dest: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0.key }}/amdgpu-clock.sh
|
dest: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0 }}/amdgpu-clock.sh
|
||||||
owner: root
|
owner: root
|
||||||
group: root
|
group: root
|
||||||
mode: "0755"
|
mode: "0755"
|
||||||
with_nested:
|
loop: "{{ amdgpu_profiles | product(base_profiles) | list }}"
|
||||||
- "{{ lookup('dict', amdgpu_profiles) }}"
|
|
||||||
- "{{ base_profiles }}"
|
|
||||||
notify: Restart tuned
|
notify: Restart tuned
|
||||||
become: true
|
become: true
|
||||||
|
|
||||||
- name: Template custom tuned profiles
|
- name: Template tuned.conf for custom profiles
|
||||||
ansible.builtin.template:
|
ansible.builtin.template:
|
||||||
src: templates/tuned.conf.j2
|
src: templates/tuned.conf.j2
|
||||||
dest: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0.key }}/tuned.conf
|
dest: /etc/tuned/{{ item.1 }}-amdgpu-{{ item.0 }}/tuned.conf
|
||||||
owner: root
|
owner: root
|
||||||
group: root
|
group: root
|
||||||
mode: "0644"
|
mode: "0644"
|
||||||
with_nested:
|
with_nested:
|
||||||
- "{{ lookup('dict', amdgpu_profiles) }}"
|
- "{{ amdgpu_profiles }}"
|
||||||
- "{{ base_profiles }}"
|
- "{{ base_profiles }}"
|
||||||
notify: Restart tuned
|
notify: Restart tuned
|
||||||
become: true
|
become: true
|
||||||
|
|
|
@@ -1,72 +0,0 @@
|
||||||
#!/bin/bash
|
|
||||||
# script for tuned AMDGPU clock control
|
|
||||||
# configures GPU power/clock characteristics
|
|
||||||
# clocks/power in 3D are dynamic based on need/usage
|
|
||||||
#
|
|
||||||
# for 'amdgpu-default' tuned profiles, this will reset the characteristics to default
|
|
||||||
# for others this will apply overclocking settings -- leaving clock choices to the associated power profile (eg: VR)
|
|
||||||
#
|
|
||||||
# rendered by Ansible with environment-appropriate values:
|
|
||||||
# card #, eg: card0
|
|
||||||
# path to discovered sysfs device files (power/clock/voltage control)
|
|
||||||
#
|
|
||||||
# AMDGPU driver/sysfs references:
|
|
||||||
# https://01.org/linuxgraphics/gfx-docs/drm/gpu/amdgpu.html
|
|
||||||
# https://docs.kernel.org/gpu/amdgpu/thermal.html
|
|
||||||
|
|
||||||
{# done this way to avoid issues with the card number possibly shifting after playbook run #}
|
|
||||||
# dynamically determine the connected GPU using the DRM subsystem
|
|
||||||
CARD=$(/usr/bin/grep -ls ^connected /sys/class/drm/*/status | /usr/bin/grep -o 'card[0-9]' | /usr/bin/sort | /usr/bin/uniq | /usr/bin/sort -h | /usr/bin/tail -1)
|
|
||||||
|
|
||||||
{# begin the templated script for 'default' profiles to reset state #}
|
|
||||||
{% if 'default' in profile_name %}
|
|
||||||
# set power state transition heuristics to default
|
|
||||||
echo '{{ item.0.value.pwrmode }}' | tee /sys/class/drm/"${CARD}"/device/pp_power_profile_mode
|
|
||||||
|
|
||||||
# set control mode back to auto
|
|
||||||
# attempts to dynamically set optimal power profile for (load) conditions
|
|
||||||
echo 'auto' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
|
|
||||||
|
|
||||||
# reset any existing profile clock changes
|
|
||||||
echo 'r' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
|
||||||
|
|
||||||
# give '{{ profile_name }}' profile ~{{ profile_percentage }}% (rounded) of the max power capability
|
|
||||||
# {{ profile_watts }} Watts of {{ board_watts }} total
|
|
||||||
echo '{{ profile_microwatts | int }}' | tee '{{ powercap_set.files.0.path }}'
|
|
||||||
{% else %}
|
|
||||||
{# begin the templated script for non-default AMD GPU profiles, eg: 'VR' or '3D_FULL_SCREEN' #}
|
|
||||||
# set manual control mode
|
|
||||||
# allows control via 'pp_dpm_mclk', 'pp_dpm_sclk', 'pp_dpm_pcie', 'pp_dpm_fclk', and 'pp_power_profile_mode' files
|
|
||||||
# only interested in 'pp_power_profile_mode' for power and 'pp_dpm_mclk' for memory clock (flickering).
|
|
||||||
# GPU clocks are dynamic based on (load) condition
|
|
||||||
echo 'manual' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
|
|
||||||
|
|
||||||
# set power state transition heuristics to '{{ profile_name }}' profile
|
|
||||||
echo '{{ item.0.value.pwrmode }}' | tee /sys/class/drm/"${CARD}"/device/pp_power_profile_mode
|
|
||||||
|
|
||||||
# give '{{ profile_name }}' profile ~{{ profile_percentage }}% (rounded) of the max power capability
|
|
||||||
# {{ profile_watts }} Watts of {{ board_watts }} total
|
|
||||||
echo '{{ profile_microwatts | int }}' | tee '{{ powercap_set.files.0.path }}'
|
|
||||||
|
|
||||||
# set the minimum GPU clock
|
|
||||||
echo 's 0 {{ gpu_clock_min }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
|
||||||
|
|
||||||
# set the maximum GPU clock
|
|
||||||
echo 's 1 {{ gpu_clock_max }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
|
||||||
|
|
||||||
# set the maximum GPU *memory* clock
|
|
||||||
echo 'm 1 {{ gpumem_clock_static }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
|
||||||
{% if gpu_mv_offset is defined %}
|
|
||||||
|
|
||||||
# offset GPU voltage {{ gpu_mv_offset }}mV
|
|
||||||
echo 'vo {{ gpu_mv_offset }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
|
||||||
{% endif %}
|
|
||||||
|
|
||||||
# commit the changes
|
|
||||||
echo 'c' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
|
||||||
|
|
||||||
# force GPU memory into highest clock (fix flickering)
|
|
||||||
# pp_dpm_*clk settings are unintuitive, giving profiles that may be used
|
|
||||||
# opt not to set the others (eg: sclk/fclk) - those should remain for benefits from the curve
|
|
||||||
echo '3' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_mclk
|
|
||||||
{% endif %}
|
|
36
roles/tuned_amdgpu/templates/amdgpu-profile-default.sh.j2
Normal file
|
@@ -0,0 +1,36 @@
|
||||||
|
#!/bin/bash
|
||||||
|
# script for tuned AMDGPU clock control
|
||||||
|
# configures GPU power/clock characteristics
|
||||||
|
# clocks/power in 3D are dynamic based on need/usage
|
||||||
|
#
|
||||||
|
# for 'amdgpu-default' tuned profiles, this will reset the characteristics to default
|
||||||
|
# for others this will apply overclocking settings -- leaving clock choices to the associated power profile (eg: VR)
|
||||||
|
#
|
||||||
|
# rendered by Ansible with environment-appropriate values:
|
||||||
|
# card #, eg: card0
|
||||||
|
# path to discovered sysfs device files (power/clock/voltage control)
|
||||||
|
#
|
||||||
|
# AMDGPU driver/sysfs references:
|
||||||
|
# https://01.org/linuxgraphics/gfx-docs/drm/gpu/amdgpu.html
|
||||||
|
# https://docs.kernel.org/gpu/amdgpu/thermal.html
|
||||||
|
#
|
||||||
|
# start by including the 'common' script; determines card/hwmon dir/power profiles/power capability
|
||||||
|
. "$(dirname "${BASH_SOURCE[0]}")/amdgpu-common.sh"
|
||||||
|
|
||||||
|
{# begin the templated script for 'default' profiles to reset state #}
|
||||||
|
# set control mode back to auto
|
||||||
|
# attempts to dynamically set optimal power profile for (load) conditions
|
||||||
|
echo 'auto' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
|
||||||
|
|
||||||
|
# reset any existing profile clock changes
|
||||||
|
echo 'r' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
||||||
|
|
||||||
|
# adjust power limit using multiplier against board capability
|
||||||
|
POWER_LIM_DEFAULT=$(/usr/bin/awk -v m="$POWER_CAP" -v n={{ gpu_power_multi.default }} 'BEGIN {printf "%.0f", (m*n)}')
|
||||||
|
echo "$POWER_LIM_DEFAULT" | tee "${HWMON_DIR}/power1_cap"
|
||||||
|
|
||||||
|
# extract the default (BOOTUP_DEFAULT) power profile ID number
|
||||||
|
PROF_DEFAULT_NUM=$(/usr/bin/awk '$0 ~ /BOOTUP_DEFAULT.*:/ {print $1}' <<< "$PROFILE_MODES")
|
||||||
|
|
||||||
|
# reset power/clock heuristics to the default (BOOTUP_DEFAULT) profile
|
||||||
|
echo "${PROF_DEFAULT_NUM}" | tee /sys/class/drm/"${CARD}"/device/pp_power_profile_mode
|
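The power limit step in this template multiplies the board capability (read in microwatts from `power1_cap_max`) by the configured multiplier and rounds to a whole number. A standalone sketch of that arithmetic is below; `power_limit` is a hypothetical helper, and the 323W capability is an assumption taken from the playbook comments rather than read from hardware.

```bash
#!/bin/bash
# hypothetical helper: scale a power capability by a multiplier
# $1: board capability in microwatts, $2: multiplier (float)
power_limit() {
    awk -v m="$1" -v n="$2" 'BEGIN { printf "%.0f\n", (m * n) }'
}

# 323000000 microwatts = 323W, the board capability implied by the playbook comments
power_limit 323000000 0.869969040247678   # 'default' multiplier
power_limit 323000000 0.928792569659443   # 'overclock' multiplier
```

The `%.0f` format matters here: the hwmon `power1_cap` file expects an integer number of microwatts, so the fractional part of the product is rounded away before writing.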
58
roles/tuned_amdgpu/templates/amdgpu-profile-overclock.sh.j2
Normal file
|
@@ -0,0 +1,58 @@
|
||||||
|
#!/bin/bash
|
||||||
|
# script for tuned AMDGPU clock control
|
||||||
|
# configures GPU power/clock characteristics
|
||||||
|
# clocks/power in 3D are dynamic based on need/usage
|
||||||
|
#
|
||||||
|
# for 'amdgpu-default' tuned profiles, this will reset the characteristics to default
|
||||||
|
# for others this will apply overclocking settings -- leaving clock choices to the associated power profile (eg: VR)
|
||||||
|
#
|
||||||
|
# rendered by Ansible with environment-appropriate values:
|
||||||
|
# card #, eg: card0
|
||||||
|
# path to discovered sysfs device files (power/clock/voltage control)
|
||||||
|
#
|
||||||
|
# AMDGPU driver/sysfs references:
|
||||||
|
# https://01.org/linuxgraphics/gfx-docs/drm/gpu/amdgpu.html
|
||||||
|
# https://docs.kernel.org/gpu/amdgpu/thermal.html
|
||||||
|
#
|
||||||
|
# start by including the 'common' script; determines card/hwmon dir/power profiles/power capability
|
||||||
|
. "$(dirname "${BASH_SOURCE[0]}")/amdgpu-common.sh"
|
||||||
|
|
||||||
|
{# begin the templated script for 'overclocked' AMD GPU profiles based on the existing tuned profiles #}
|
||||||
|
# set the minimum GPU clock - for best performance, this should be near the maximum
|
||||||
|
# RX6000 series power management *sucks*
|
||||||
|
echo 's 0 {{ gpu_clock_min }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
||||||
|
|
||||||
|
# set the maximum GPU clock
|
||||||
|
echo 's 1 {{ gpu_clock_max }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
||||||
|
|
||||||
|
# set the GPU *memory* clock
|
||||||
|
# normally this would appear disregarded, memory clocked at the minimum allowed by the overdrive (OD) range
|
||||||
|
# it follows the core clock; if both 0/1 profiles for _it_ are high enough, the memory will follow
|
||||||
|
echo 'm 1 {{ gpumem_clock_static }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
||||||
|
{% if gpu_mv_offset is defined %}
|
||||||
|
|
||||||
|
# offset GPU voltage {{ gpu_mv_offset }}mV
|
||||||
|
echo 'vo {{ gpu_mv_offset }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
||||||
|
{% endif %}
|
||||||
|
|
||||||
|
# commit the changes
|
||||||
|
echo 'c' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
||||||
|
|
||||||
|
# force GPU core and memory into highest clocks (fix flickering and poor power management)
|
||||||
|
# set manual control mode
|
||||||
|
# allows control via 'pp_dpm_mclk', 'pp_dpm_sclk', 'pp_dpm_pcie', 'pp_dpm_fclk', and 'pp_power_profile_mode' files
|
||||||
|
echo 'manual' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
|
||||||
|
|
||||||
|
# adjust power limit using multiplier against board capability
|
||||||
|
POWER_LIM_OC=$(/usr/bin/awk -v m="$POWER_CAP" -v n={{ gpu_power_multi.overclock }} 'BEGIN {printf "%.0f", (m*n)}')
|
||||||
|
echo "$POWER_LIM_OC" | tee "${HWMON_DIR}/power1_cap"
|
||||||
|
|
||||||
|
# avoid display flickering, force OC'd memory to highest clock
|
||||||
|
echo '3' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_mclk
|
||||||
|
|
||||||
|
# extract the VR power profile ID number
|
||||||
|
PROF_VR_NUM=$(/usr/bin/awk '$0 ~ /VR.*:/ {print $1}' <<< "$PROFILE_MODES")
|
||||||
|
|
||||||
|
# force 'overclocked' profile to 'VR' power/clock heuristics
|
||||||
|
# latency/frame timing seemed favorable with relatively-close minimum clocks
|
||||||
|
echo "${PROF_VR_NUM}" | tee /sys/class/drm/"${CARD}"/device/pp_power_profile_mode
|
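The profile ID lookup in these templates relies on `awk` printing the first field of the line that names the wanted mode. The sketch below runs the same kind of match against a simplified, made-up stand-in for `pp_power_profile_mode`; the real file has a header and extra columns that vary by GPU generation, and `profile_id` is a hypothetical helper.

```bash
#!/bin/bash
# simplified, illustrative stand-in for pp_power_profile_mode (not real driver output)
sample_modes=' 0 BOOTUP_DEFAULT*:
 1 3D_FULL_SCREEN :
 2   POWER_SAVING :
 3          VIDEO :
 4             VR :
 5        COMPUTE :
 6         CUSTOM :'

# hypothetical helper: print the numeric ID of the line naming the given mode
# $1: mode name, $2: text in the format of pp_power_profile_mode
profile_id() {
    printf '%s\n' "$2" | awk -v name="$1" '$0 ~ (name ".*:") { print $1 }'
}

profile_id VR "$sample_modes"
profile_id BOOTUP_DEFAULT "$sample_modes"
```

Looking the ID up by name at run time, instead of hard-coding a number, keeps the scripts working if the driver numbers the modes differently on another card. The match is a plain substring test, so a mode name that is contained in another mode's name would return more than one ID.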
66
roles/tuned_amdgpu/templates/amdgpu-profile-peak.sh.j2
Normal file
|
@@ -0,0 +1,66 @@
|
||||||
|
#!/bin/bash
|
||||||
|
# script for tuned AMDGPU clock control
|
||||||
|
# configures GPU power/clock characteristics
|
||||||
|
# clocks/power in 3D are dynamic based on need/usage
|
||||||
|
#
|
||||||
|
# for 'amdgpu-default' tuned profiles, this will reset the characteristics to default
|
||||||
|
# for others this will apply overclocking settings -- leaving clock choices to the associated power profile (eg: VR)
|
||||||
|
#
|
||||||
|
# rendered by Ansible with environment-appropriate values:
|
||||||
|
# card #, eg: card0
|
||||||
|
# path to discovered sysfs device files (power/clock/voltage control)
|
||||||
|
#
|
||||||
|
# AMDGPU driver/sysfs references:
|
||||||
|
# https://01.org/linuxgraphics/gfx-docs/drm/gpu/amdgpu.html
|
||||||
|
# https://docs.kernel.org/gpu/amdgpu/thermal.html
|
||||||
|
#
|
||||||
|
# start by including the 'common' script; determines card/hwmon dir/power profiles/power capability
|
||||||
|
. "$(dirname "${BASH_SOURCE[0]}")/amdgpu-common.sh"
|
||||||
|
|
||||||
|
{# begin the templated script for 'overclocked' AMD GPU profiles based on the existing tuned profiles #}
|
||||||
|
# set the minimum GPU clock - for best performance, this should be near the maximum
|
||||||
|
# RX6000 series power management *sucks*
|
||||||
|
echo 's 0 {{ gpu_clock_min }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
||||||
|
|
||||||
|
# set the maximum GPU clock
|
||||||
|
echo 's 1 {{ gpu_clock_max }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
||||||
|
|
||||||
|
# set the GPU *memory* clock
|
||||||
|
# normally this would appear disregarded, memory clocked at the minimum allowed by the overdrive (OD) range
|
||||||
|
# it follows the core clock; if both 0/1 profiles for _it_ are high enough, the memory will follow
|
||||||
|
echo 'm 1 {{ gpumem_clock_static }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
||||||
|
{% if gpu_mv_offset is defined %}
|
||||||
|
|
||||||
|
# offset GPU voltage {{ gpu_mv_offset }}mV
|
||||||
|
echo 'vo {{ gpu_mv_offset }}' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
||||||
|
{% endif %}
|
||||||
|
|
||||||
|
# commit the changes
|
||||||
|
echo 'c' | tee /sys/class/drm/"${CARD}"/device/pp_od_clk_voltage
|
||||||
|
|
||||||
|
# force GPU core and memory into highest clocks (fix flickering and poor power management)
|
||||||
|
# set manual control mode
|
||||||
|
# allows control via 'pp_dpm_mclk', 'pp_dpm_sclk', 'pp_dpm_pcie', 'pp_dpm_fclk', and 'pp_power_profile_mode' files
|
||||||
|
echo 'manual' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
|
||||||
|
|
||||||
|
# adjust power limit using multiplier against board capability
|
||||||
|
POWER_LIM_OC=$(/usr/bin/awk -v m="$POWER_CAP" -v n={{ gpu_power_multi.overclock }} 'BEGIN {printf "%.0f", (m*n)}')
|
||||||
|
echo "$POWER_LIM_OC" | tee "${HWMON_DIR}/power1_cap"
|
||||||
|
|
||||||
|
# pp_dpm_*clk settings are unintuitive, giving profiles that may be used
|
||||||
|
echo '1' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_sclk
|
||||||
|
echo '3' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_mclk
|
||||||
|
echo '2' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_fclk
|
||||||
|
echo '2' | tee /sys/class/drm/"${CARD}"/device/pp_dpm_socclk
|
||||||
|
|
||||||
|
# extract the VR power profile ID number
|
||||||
|
PROF_VR_NUM=$(/usr/bin/awk '$0 ~ /VR.*:/ {print $1}' <<< "$PROFILE_MODES")
|
||||||
|
|
||||||
|
# force 'overclocked' profile to 'VR' power/clock heuristics
|
||||||
|
# latency/frame timing seemed favorable with relatively-close minimum clocks
|
||||||
|
echo "${PROF_VR_NUM}" | tee /sys/class/drm/"${CARD}"/device/pp_power_profile_mode
|
||||||
|
|
||||||
|
# note 4/8/2023: instead of 'manual'... try dealing with broken power management, force clocks to high
|
||||||
|
# ref: https://gitlab.freedesktop.org/drm/amd/-/issues/1500
|
||||||
|
# followup: doesn't work that well in practice, still flaky on clocks/frame times
|
||||||
|
#echo 'high' | tee /sys/class/drm/"${CARD}"/device/power_dpm_force_performance_level
|
|
@@ -1,16 +1,22 @@
|
||||||
[main]
|
[main]
|
||||||
include={{ item.1 }}
|
include={{ item.1 }}
|
||||||
summary={{ item.1 }} + TCP/RAID tweaks + AMDGPU pp_power_profile_mode = {{ item.0.value.pwrmode }} ({{ item.0.key }})
|
summary={{ item.1 }} + TCP/RAID tweaks + AMDGPU {{ item.0 }}
|
||||||
|
|
||||||
[sysctl]
|
[sysctl]
|
||||||
|
# allow regular users to see the kernel ring buffer
|
||||||
|
kernel.dmesg_restrict=0
|
||||||
net.core.default_qdisc=fq
|
net.core.default_qdisc=fq
|
||||||
# 'bbr2' requires a [modified] supporting kernel - stock Fedora kernels do *not* support it (currently)
|
# 'bbr2' requires a [modified] supporting kernel - stock Fedora kernels do *not* support it (currently)
|
||||||
# eg: 'kernel-xanmod-edge' from COPR 'rmnscnce/kernel-xanmod'
|
# eg: 'kernel-xanmod-edge' from COPR 'rmnscnce/kernel-xanmod'
|
||||||
net.ipv4.tcp_congestion_control=bbr2
|
net.ipv4.tcp_congestion_control=bbr2
|
||||||
net.core.rmem_max=33554432
|
net.core.rmem_max=33554432
|
||||||
net.core.wmem_max=33554432
|
net.core.wmem_max=33554432
|
||||||
dev.raid.speed_limit_min=600000
|
dev.raid.speed_limit_min=1000000
|
||||||
dev.raid.speed_limit_max=9000000
|
dev.raid.speed_limit_max=6000000
|
||||||
|
# improve THP allocation latency, compact in background
|
||||||
|
vm.compaction_proactiveness=30
|
||||||
|
# make page lock theft slightly more fair
|
||||||
|
vm.page_lock_unfairness=1
|
||||||
# allow some games to run (eg: DayZ)
|
# allow some games to run (eg: DayZ)
|
||||||
vm.max_map_count=1048576
|
vm.max_map_count=1048576
|
||||||
|
|
||||||
|
@@ -20,3 +26,11 @@ vm.max_map_count=1048576
|
||||||
[gpuclockscript]
|
[gpuclockscript]
|
||||||
type=script
|
type=script
|
||||||
script=${i:PROFILE_DIR}/amdgpu-clock.sh
|
script=${i:PROFILE_DIR}/amdgpu-clock.sh
|
||||||
|
|
||||||
|
# for SSDs with no RPM, set no IO scheduler
|
||||||
|
[ssdnosched]
|
||||||
|
type=disk
|
||||||
|
devices_udev_regex=(ID_ATA_ROTATION_RATE_RPM=0)
|
||||||
|
# elevator=none
|
||||||
|
elevator=kyber
|
||||||
|
# elevator=mq-deadline
|
||||||
|
|