drm/graphics pull request for v4.16-rc1

-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJacnVwAAoJEAx081l5xIa+HhIP/0yDg5tuco0QN3YskE/bIa3o
 4VDWsLi+WCoSZoV4uWLKYK8OHiNzKdnGfNoUNWqRqaYilWDtpgBX86Wjg5hxnGwA
 /6jGfU1nhb0teG9clGBbzgxHXW6iKvT+p/Pp1pC8HXU+zEUaungJcWY120hITwMD
 NqUGK6kYRsJVYj+4b+5Ho7Fvv912bbjK0YAptD6RdzX4rDPN0D+XrtXlYsg1PJYx
 jv/NNWEP5mCesYKsS8JzHYcfOF/vdQpPwAV4C3LKaQy5k3pVVIDOEuOycIZTKMf3
 K/fSsbvhHMH3Ck+lPcK+etcoQbkLCcmKbw+3uvM/7njkn7Dp24Ryk9FXB3dXXOgb
 3kLs7f0gY9j/NAi3uKAMvACPvXNA7eptIvAmN/VKzmEiqgx+l0sveSuU73DVoe/x
 Jko8ijyiKchcN+/CTgZ7FNyEd0UWO06+9B0RMrlEezE8f14EhR51wIQQTNFJRJn/
 kqRM1hC2Cvb00vAwq7jjZcDa7hRCI0OoVU9N37smtPuTJY94tR/CUbq10g4pSlu8
 h8FiHnLuhlyh1DQNNS19HQfOSh0yYgEGRQcIKy3vqshsO3/hbe8bQD5UerqMZPZB
 ZpMEWe5VHSWIVjAxgzHNXFd9F/jSeWDVkCztKfx0CLmzHZNLNjw+/zgbIdF3vj9T
 S1cwFZLWr/ngf5mbyR88
 =pLN1
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.16' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
 "This seems to have been a comparatively quieter merge window, I assume
  due to holidays etc. The "biggest" change is the AMD header cleanups,
  which merge or remove a bunch of headers. The AMD GPU scheduler is now
  being made generic so that the etnaviv driver can reuse the code;
  hopefully other drivers can go in the same direction.

  Otherwise it's the usual: lots of stuff in i915/amdgpu, not so much
  elsewhere.

  Core:
   - Add .last_close and .output_poll_changed helpers to reduce driver footprints
   - Fix plane clipping
   - Improved debug printing support
   - Add panel orientation property
   - Update EDID-derived properties when the EDID is set
   - Reduction in fbdev driver footprint
   - Move amdgpu scheduler into core for other drivers to use (see the driver-side sketch after the commit metadata below).

  i915:
   - Selftest and IGT improvements
   - Fast boot prep work on IPS, pipe config
   - HW workarounds for Cannonlake, Geminilake
   - Cannonlake clock and HDMI2.0 fixes
   - GPU cache invalidation and context switch improvements
   - Display planes cleanup
   - New PMU interface for perf queries
   - New firmware support for KBL/SKL
   - Geminilake HW workaround for performance
   - Coffeelake stolen memory improvements
   - GPU reset robustness work
   - Cannonlake horizontal plane flipping
   - GVT work

  amdgpu/radeon:
   - RV and Vega header file cleanups (lots of lines gone!)
   - TTM operation context support
   - 48-bit GPUVM support for Vega/RV
   - ECC support for Vega
   - Resizable BAR support
   - Multi-display sync support
   - Enable swapout for reserved BOs during allocation
   - S3 fixes on Raven
   - GPU reset cleanup and fixes
   - 2+1 level GPU page table

  amdkfd:
   - GFX7/8 SDMA user queues support
   - Hardware scheduling for multiple processes
   - dGPU prep work

  rcar:
   - Added R8A7743/5 support
   - System suspend/resume support

  sun4i:
   - Multi-plane support for YUV formats
   - A83T and LVDS support

  msm:
   - Devfreq support for GPU

  tegra:
   - Prep work for adding Tegra186 support
   - Tegra186 HDMI support
   - HDMI2.0 and zpos support by using generic helpers

  tilcdc:
   - Misc fixes

  omapdrm:
   - Support memory bandwidth limits
   - DSI command mode panel cleanups
   - DMM error handling

  exynos:
   - drop the old IPP subdriver.

  etnaviv:
   - Occlusion query fixes
   - Job handling fixes
   - Prep work for hooking in gpu scheduler

  armada:
   - Move closer to atomic modesetting
   - Allow disabling primary plane if overlay is full screen

  imx:
   - Format modifier support
   - Add tile prefetch to PRE
   - Runtime PM support for PRG

  ast:
   - fix LUT loading"

* tag 'drm-for-v4.16' of git://people.freedesktop.org/~airlied/linux: (1471 commits)
  drm/ast: Load lut in crtc_commit
  drm: Check for lessee in DROP_MASTER ioctl
  drm: fix gpu scheduler link order
  drm/amd/display: Demote error print to debug print when ATOM impl missing
  dma-buf: fix reservation_object_wait_timeout_rcu once more v2
  drm/amdgpu: Avoid leaking PM domain on driver unbind (v2)
  drm/amd/amdgpu: Add Polaris version check
  drm/amdgpu: Reenable manual GPU reset from sysfs
  drm/amdgpu: disable MMHUB power gating on raven
  drm/ttm: Don't unreserve swapped BOs that were previously reserved
  drm/ttm: Don't add swapped BOs to swap-LRU list
  drm/amdgpu: only check for ECC on Vega10
  drm/amd/powerplay: Fix smu_table_entry.handle type
  drm/ttm: add VADDR_FLAG_UPDATED_COUNT to correctly update dma_page global count
  drm: Fix PANEL_ORIENTATION_QUIRKS breaking the Kconfig DRM menuconfig
  drm/radeon: fill in rb backend map on evergreen/ni.
  drm/amdgpu/gfx9: fix ngg enablement to clear gds reserved memory (v2)
  drm/ttm: only free pages rather than update global memory count together
  drm/amdgpu: fix CPU based VM updates
  drm/amdgpu: fix typo in amdgpu_vce_validate_bo
  ...
Linus Torvalds 2018-02-01 17:48:47 -08:00
commit 4bf772b146
1020 changed files with 57631 additions and 97127 deletions
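
To make the scheduler move concrete, here is a minimal, hypothetical driver-side
sketch built only from the drm_sched_* identifiers visible in the diff below
(struct drm_sched_job, struct drm_sched_backend_ops, <drm/gpu_scheduler.h>);
the foo_* names are invented and this is not code from the pull request:

#include <linux/kernel.h>
#include <linux/dma-fence.h>
#include <drm/gpu_scheduler.h>

/* A driver-side job embeds the shared drm_sched_job (formerly amd_sched_job). */
struct foo_job {
        struct drm_sched_job base;
        /* driver-specific payload: IBs, target ring, sync objects, ... */
};

/* Invoked by the shared scheduler once the job's dependencies are met. */
static struct dma_fence *foo_run_job(struct drm_sched_job *sched_job)
{
        struct foo_job *job = container_of(sched_job, struct foo_job, base);

        /*
         * A real driver submits job->... to its hardware ring here and
         * returns the resulting hardware fence; NULL is just a placeholder.
         */
        (void)job;
        return NULL;
}

static const struct drm_sched_backend_ops foo_sched_ops = {
        .run_job = foo_run_job,         /* formerly struct amd_sched_backend_ops */
};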


@ -48,6 +48,10 @@ Required properties:
Documentation/devicetree/bindings/reset/reset.txt,
the reset-names should be "hdmitx_apb", "hdmitx", "hdmitx_phy"
Optional properties:
- hdmi-supply: Optional phandle to an external 5V regulator to power the HDMI
logic, as described in the file ../regulator/regulator.txt
Required nodes:
The connections to the HDMI ports are modeled using the OF graph


@ -64,6 +64,10 @@ Required properties:
- reg-names: should contain the names of the previous memory regions
- interrupts: should contain the VENC Vsync interrupt number
Optional properties:
- power-domains: Optional phandle to associated power domain as described in
the file ../power/power_domain.txt
Required nodes:
The connections to the VPU output video ports are modeled using the OF graph


@ -0,0 +1,25 @@
Ilitek ILI9225 display panels
This binding is for display panels using an Ilitek ILI9225 controller in SPI
mode.
Required properties:
- compatible: "vot,v220hf01a-t", "ilitek,ili9225"
- rs-gpios: Register select signal
- reset-gpios: Reset pin
The node for this driver must be a child node of a SPI controller, hence
all mandatory properties described in ../spi/spi-bus.txt must be specified.
Optional properties:
- rotation: panel rotation in degrees counter clockwise (0,90,180,270)
Example:
display@0{
compatible = "vot,v220hf01a-t", "ilitek,ili9225";
reg = <0>;
spi-max-frequency = <12000000>;
rs-gpios = <&gpio0 9 GPIO_ACTIVE_HIGH>;
reset-gpios = <&gpio0 8 GPIO_ACTIVE_HIGH>;
rotation = <270>;
};


@ -0,0 +1,49 @@
Ilitek ILI9322 TFT panel driver with SPI control bus
This is a driver for 320x240 TFT panels, accepting a variety of input
streams that get adapted and scaled to the panel. The panel output has
960 TFT source driver pins and 240 TFT gate driver pins, VCOM, VCOML and
VCOMH outputs.
Required properties:
- compatible: "dlink,dir-685-panel", "ilitek,ili9322"
(full system-specific compatible is always required to look up configuration)
- reg: address of the panel on the SPI bus
Optional properties:
- vcc-supply: core voltage supply, see regulator/regulator.txt
- iovcc-supply: voltage supply for the interface input/output signals,
see regulator/regulator.txt
- vci-supply: voltage supply for analog parts, see regulator/regulator.txt
- reset-gpios: a GPIO spec for the reset pin, see gpio/gpio.txt
The following optional properties only apply to RGB and YUV input modes and
can be omitted for BT.656 input modes:
- pixelclk-active: see display/panel/display-timing.txt
- de-active: see display/panel/display-timing.txt
- hsync-active: see display/panel/display-timing.txt
- vsync-active: see display/panel/display-timing.txt
The panel must obey the rules for a SPI slave device as specified in
spi/spi-bus.txt
The device node can contain one 'port' child node with one child
'endpoint' node, according to the bindings defined in
media/video-interfaces.txt. This node should describe panel's video bus.
Example:
panel: display@0 {
compatible = "dlink,dir-685-panel", "ilitek,ili9322";
reg = <0>;
vcc-supply = <&vdisp>;
iovcc-supply = <&vdisp>;
vci-supply = <&vdisp>;
port {
panel_in: endpoint {
remote-endpoint = <&display_out>;
};
};
};


@ -0,0 +1,7 @@
Mitsubishi "AA070MC01 7.0" WVGA TFT LCD panel
Required properties:
- compatible: should be "mitsubishi,aa070mc01-ca1"
This binding is compatible with the simple-panel binding, which is specified
in simple-panel.txt in this directory.


@ -78,6 +78,16 @@ used for panels that implement compatible control signals.
while active. Active high reset signals can be supported by inverting the
GPIO specifier polarity flag.
Power
-----
- power-supply: display panels require power to be supplied. While several
panels need more than one power supply with panel-specific constraints
governing the order and timings of the power supplies, in many cases a single
power supply is sufficient, either because the panel has a single power rail,
or because all its power rails can be driven by the same supply. In that case
the power-supply property specifies the supply powering the panel as a phandle
to a regulator.
Backlight
---------


@ -32,6 +32,7 @@ Optional properties:
- label: See panel-common.txt.
- gpios: See panel-common.txt.
- backlight: See panel-common.txt.
- power-supply: See panel-common.txt.
- data-mirror: If set, reverse the bit order described in the data mappings
below on all data lanes, transmitting bits for slots 6 to 0 instead of
0 to 6.


@ -1,7 +1,7 @@
Simple display panel
Required properties:
- power-supply: regulator to provide the supply voltage
- power-supply: See panel-common.txt
Optional properties:
- ddc-i2c-bus: phandle of an I2C controller used for DDC EDID probing


@ -0,0 +1,29 @@
Tianma Micro-electronics TM070RVHG71 7.0" WXGA TFT LCD panel
Required properties:
- compatible: should be "tianma,tm070rvhg71"
- power-supply: single regulator to provide the supply voltage
- backlight: phandle of the backlight device attached to the panel
Required nodes:
- port: LVDS port mapping to connect this display
This panel needs a single power supply voltage. Its backlight is controlled
via a PWM signal.
Example:
--------
Example device-tree definition when connected to iMX6Q based board
panel: panel-lvds0 {
compatible = "tianma,tm070rvhg71";
backlight = <&backlight_lvds>;
power-supply = <&reg_lvds>;
port {
panel_in_lvds0: endpoint {
remote-endpoint = <&lvds0_out>;
};
};
};


@ -2,7 +2,7 @@ Toppoly TD028TTEC1 Panel
========================
Required properties:
- compatible: "toppoly,td028ttec1"
- compatible: "tpo,td028ttec1"
Optional properties:
- label: a symbolic name for the panel
@ -14,7 +14,7 @@ Example
-------
lcd-panel: td028ttec1@0 {
compatible = "toppoly,td028ttec1";
compatible = "tpo,td028ttec1";
reg = <0>;
spi-max-frequency = <100000>;
spi-cpol;


@ -3,6 +3,8 @@
Required Properties:
- compatible: must be one of the following.
- "renesas,du-r8a7743" for R8A7743 (RZ/G1M) compatible DU
- "renesas,du-r8a7745" for R8A7745 (RZ/G1E) compatible DU
- "renesas,du-r8a7779" for R8A7779 (R-Car H1) compatible DU
- "renesas,du-r8a7790" for R8A7790 (R-Car H2) compatible DU
- "renesas,du-r8a7791" for R8A7791 (R-Car M2-W) compatible DU
@ -27,10 +29,10 @@ Required Properties:
- clock-names: Name of the clocks. This property is model-dependent.
- R8A7779 uses a single functional clock. The clock doesn't need to be
named.
- R8A779[0123456] use one functional clock per channel and one clock per
LVDS encoder (if available). The functional clocks must be named "du.x"
with "x" being the channel numerical index. The LVDS clocks must be
named "lvds.x" with "x" being the LVDS encoder numerical index.
- All other DU instances use one functional clock per channel and one
clock per LVDS encoder (if available). The functional clocks must be
named "du.x" with "x" being the channel numerical index. The LVDS clocks
must be named "lvds.x" with "x" being the LVDS encoder numerical index.
- In addition to the functional and encoder clocks, all DU versions also
support externally supplied pixel clocks. Those clocks are optional.
When supplied they must be named "dclkin.x" with "x" being the input
@ -49,16 +51,18 @@ bindings specified in Documentation/devicetree/bindings/graph.txt.
The following table lists for each supported model the port number
corresponding to each DU output.
Port 0 Port1 Port2 Port3
Port0 Port1 Port2 Port3
-----------------------------------------------------------------------------
R8A7779 (H1) DPAD 0 DPAD 1 - -
R8A7790 (H2) DPAD LVDS 0 LVDS 1 -
R8A7791 (M2-W) DPAD LVDS 0 - -
R8A7792 (V2H) DPAD 0 DPAD 1 - -
R8A7793 (M2-N) DPAD LVDS 0 - -
R8A7794 (E2) DPAD 0 DPAD 1 - -
R8A7795 (H3) DPAD HDMI 0 HDMI 1 LVDS
R8A7796 (M3-W) DPAD HDMI LVDS -
R8A7743 (RZ/G1M) DPAD 0 LVDS 0 - -
R8A7745 (RZ/G1E) DPAD 0 DPAD 1 - -
R8A7779 (R-Car H1) DPAD 0 DPAD 1 - -
R8A7790 (R-Car H2) DPAD 0 LVDS 0 LVDS 1 -
R8A7791 (R-Car M2-W) DPAD 0 LVDS 0 - -
R8A7792 (R-Car V2H) DPAD 0 DPAD 1 - -
R8A7793 (R-Car M2-N) DPAD 0 LVDS 0 - -
R8A7794 (R-Car E2) DPAD 0 DPAD 1 - -
R8A7795 (R-Car H3) DPAD 0 HDMI 0 HDMI 1 LVDS 0
R8A7796 (R-Car M3-W) DPAD 0 HDMI 0 LVDS 0 -
Example: R8A7795 (R-Car H3) ES2.0 DU


@ -7,6 +7,7 @@ buffer to an external LCD interface.
Required properties:
- compatible: value should be one of the following
"rockchip,rk3036-vop";
"rockchip,rk3126-vop";
"rockchip,rk3288-vop";
"rockchip,rk3368-vop";
"rockchip,rk3366-vop";


@ -0,0 +1,35 @@
Sitronix ST7735R display panels
This binding is for display panels using a Sitronix ST7735R controller in SPI
mode.
Required properties:
- compatible: "jianda,jd-t18003-t01", "sitronix,st7735r"
- dc-gpios: Display data/command selection (D/CX)
- reset-gpios: Reset signal (RSTX)
The node for this driver must be a child node of a SPI controller, hence
all mandatory properties described in ../spi/spi-bus.txt must be specified.
Optional properties:
- rotation: panel rotation in degrees counter clockwise (0,90,180,270)
- backlight: phandle of the backlight device attached to the panel
Example:
backlight: backlight {
compatible = "gpio-backlight";
gpios = <&gpio 44 GPIO_ACTIVE_HIGH>;
}
...
display@0{
compatible = "jianda,jd-t18003-t01", "sitronix,st7735r";
reg = <0>;
spi-max-frequency = <32000000>;
dc-gpios = <&gpio 43 GPIO_ACTIVE_HIGH>;
reset-gpios = <&gpio 80 GPIO_ACTIVE_HIGH>;
rotation = <270>;
backlight = &backlight;
};


@ -10,7 +10,11 @@
- "lcd" for the clock feeding the output pixel clock & IP clock.
- resets: reset to be used by the device (defined by use of RCC macro).
Required nodes:
- Video port for RGB output.
- Video port for DPI RGB output: ltdc has one video port with up to 2
endpoints:
- for external dpi rgb panel or bridge, using gpios.
- for internal dpi input of the MIPI DSI host controller.
Note: These 2 endpoints cannot be activated simultaneously.
* STMicroelectronics STM32 DSI controller specific extensions to Synopsys
DesignWare MIPI DSI host controller


@ -93,6 +93,7 @@ Required properties:
* allwinner,sun6i-a31s-tcon
* allwinner,sun7i-a20-tcon
* allwinner,sun8i-a33-tcon
* allwinner,sun8i-a83t-tcon-lcd
* allwinner,sun8i-v3s-tcon
- reg: base address and size of memory-mapped region
- interrupts: interrupt associated to this IP
@ -121,6 +122,14 @@ Required properties:
On SoCs other than the A33 and V3s, there is one more clock required:
- 'tcon-ch1': The clock driving the TCON channel 1
On SoCs that support LVDS (all SoCs but the A13, H3, H5 and V3s), you
need one more reset line:
- 'lvds': The reset line driving the LVDS logic
And on the A23, A31, A31s and A33, you need one more clock line:
- 'lvds-alt': An alternative clock source, separate from the TCON channel 0
clock, that can be used to drive the LVDS clock
DRC
---
@ -216,6 +225,7 @@ supported.
Required properties:
- compatible: value must be one of:
* allwinner,sun8i-a83t-de2-mixer-0
* allwinner,sun8i-v3s-de2-mixer
- reg: base address and size of the memory-mapped region.
- clocks: phandles to the clocks feeding the mixer
@ -245,6 +255,7 @@ Required properties:
* allwinner,sun6i-a31s-display-engine
* allwinner,sun7i-a20-display-engine
* allwinner,sun8i-a33-display-engine
* allwinner,sun8i-a83t-display-engine
* allwinner,sun8i-v3s-display-engine
- allwinner,pipelines: list of phandle to the display engine


@ -206,21 +206,33 @@ of the following host1x client modules:
- "nvidia,tegra132-sor": for Tegra132
- "nvidia,tegra210-sor": for Tegra210
- "nvidia,tegra210-sor1": for Tegra210
- "nvidia,tegra186-sor": for Tegra186
- "nvidia,tegra186-sor1": for Tegra186
- reg: Physical base address and length of the controller's registers.
- interrupts: The interrupt outputs from the controller.
- clocks: Must contain an entry for each entry in clock-names.
See ../clocks/clock-bindings.txt for details.
- clock-names: Must include the following entries:
- sor: clock input for the SOR hardware
- source: source clock for the SOR clock
- out: SOR output clock
- parent: input for the pixel clock
- dp: reference clock for the SOR clock
- safe: safe reference for the SOR clock during power up
For Tegra186 and later:
- pad: SOR pad output clock (on Tegra186 and later)
Obsolete:
- source: source clock for the SOR clock (obsolete, use "out" instead)
- resets: Must contain an entry for each entry in reset-names.
See ../reset/reset.txt for details.
- reset-names: Must include the following entries:
- sor
Required properties on Tegra186 and later:
- nvidia,interface: index of the SOR interface
Optional properties:
- nvidia,ddc-i2c-bus: phandle of an I2C controller used for DDC EDID probing
- nvidia,hpd-gpio: specifies a GPIO used for hotplug detection


@ -47,6 +47,11 @@ Required properties:
- clocks: handle to fclk
- clock-names: "fck"
Optional properties:
- max-memory-bandwidth: Input memory (from main memory to dispc) bandwidth limit
in bytes per second
HDMI
----


@ -28,6 +28,10 @@ Required properties:
- ti,hwmods: "dss_dispc"
- interrupts: the DISPC interrupt
Optional properties:
- max-memory-bandwidth: Input memory (from main memory to dispc) bandwidth limit
in bytes per second
RFBI
----


@ -37,6 +37,10 @@ Required properties:
- clocks: handle to fclk
- clock-names: "fck"
Optional properties:
- max-memory-bandwidth: Input memory (from main memory to dispc) bandwidth limit
in bytes per second
RFBI
----


@ -36,6 +36,10 @@ Required properties:
- clocks: handle to fclk
- clock-names: "fck"
Optional properties:
- max-memory-bandwidth: Input memory (from main memory to dispc) bandwidth limit
in bytes per second
RFBI
----


@ -36,6 +36,10 @@ Required properties:
- clocks: handle to fclk
- clock-names: "fck"
Optional properties:
- max-memory-bandwidth: Input memory (from main memory to dispc) bandwidth limit
in bytes per second
RFBI
----


@ -156,6 +156,7 @@ i2se I2SE GmbH
ibm International Business Machines (IBM)
idt Integrated Device Technologies, Inc.
ifi Ingenieurburo Fur Ic-Technologie (I/F/I)
ilitek ILI Technology Corporation (ILITEK)
img Imagination Technologies Ltd.
infineon Infineon Technologies
inforce Inforce Computing
@ -174,6 +175,7 @@ itead ITEAD Intelligent Systems Co.Ltd
iwave iWave Systems Technologies Pvt. Ltd.
jdi Japan Display Inc.
jedec JEDEC Solid State Technology Association
jianda Jiandangjing Technology Co., Ltd.
karo Ka-Ro electronics GmbH
keithkoep Keith & Koep GmbH
keymile Keymile GmbH
@ -383,6 +385,7 @@ virtio Virtual I/O Device Specification, developed by the OASIS consortium
vivante Vivante Corporation
vocore VoCore Studio
voipac Voipac Technologies s.r.o.
vot Vision Optical Technology Co., Ltd.
wd Western Digital Corp.
wetek WeTek Electronics, limited.
wexler Wexler


@ -74,15 +74,6 @@ Helper Functions Reference
.. kernel-doc:: drivers/gpu/drm/drm_atomic_helper.c
:export:
Legacy CRTC/Modeset Helper Functions Reference
==============================================
.. kernel-doc:: drivers/gpu/drm/drm_crtc_helper.c
:doc: overview
.. kernel-doc:: drivers/gpu/drm/drm_crtc_helper.c
:export:
Simple KMS Helper Reference
===========================
@ -163,6 +154,9 @@ Panel Helper Reference
.. kernel-doc:: drivers/gpu/drm/drm_panel.c
:export:
.. kernel-doc:: drivers/gpu/drm/drm_panel_orientation_quirks.c
:export:
Display Port Helper Functions Reference
=======================================
@ -279,15 +273,6 @@ Flip-work Helper Reference
.. kernel-doc:: drivers/gpu/drm/drm_flip_work.c
:export:
Plane Helper Reference
======================
.. kernel-doc:: drivers/gpu/drm/drm_plane_helper.c
:doc: overview
.. kernel-doc:: drivers/gpu/drm/drm_plane_helper.c
:export:
Auxiliary Modeset Helpers
=========================
@ -305,3 +290,21 @@ Framebuffer GEM Helper Reference
.. kernel-doc:: drivers/gpu/drm/drm_gem_framebuffer_helper.c
:export:
Legacy Plane Helper Reference
=============================
.. kernel-doc:: drivers/gpu/drm/drm_plane_helper.c
:doc: overview
.. kernel-doc:: drivers/gpu/drm/drm_plane_helper.c
:export:
Legacy CRTC/Modeset Helper Functions Reference
==============================================
.. kernel-doc:: drivers/gpu/drm/drm_crtc_helper.c
:doc: overview
.. kernel-doc:: drivers/gpu/drm/drm_crtc_helper.c
:export:


@ -263,14 +263,20 @@ Taken all together there's two consequences for the atomic design:
- An atomic update is assembled and validated as an entirely free-standing pile
of structures within the :c:type:`drm_atomic_state <drm_atomic_state>`
container. Again drivers can subclass that container for their own state
structure tracking needs. Only when a state is committed is it applied to the
driver and modeset objects. This way rolling back an update boils down to
releasing memory and unreferencing objects like framebuffers.
container. Driver private state structures are also tracked in the same
structure; see the next chapter. Only when a state is committed is it applied
to the driver and modeset objects. This way rolling back an update boils down
to releasing memory and unreferencing objects like framebuffers.
Read on in this chapter, and also in :ref:`drm_atomic_helper` for more detailed
coverage of specific topics.
Handling Driver Private State
-----------------------------
.. kernel-doc:: drivers/gpu/drm/drm_atomic.c
:doc: handling driver private state
Atomic Mode Setting Function Reference
--------------------------------------


@ -347,10 +347,10 @@ GuC-specific firmware loader
GuC-based command submission
----------------------------
.. kernel-doc:: drivers/gpu/drm/i915/i915_guc_submission.c
.. kernel-doc:: drivers/gpu/drm/i915/intel_guc_submission.c
:doc: GuC-based command submission
.. kernel-doc:: drivers/gpu/drm/i915/i915_guc_submission.c
.. kernel-doc:: drivers/gpu/drm/i915/intel_guc_submission.c
:internal:
GuC Firmware Layout


@ -179,8 +179,39 @@ don't do this, drivers used dev_info/warn/err to make this differentiation. We
now have DRM_DEV_* variants of the drm print macros, so we can start to convert
those drivers back to using drm-formatted specific log messages.
Before you start this conversion please contact the relevant maintainers to make
sure your work will be merged - not everyone agrees that the DRM dmesg macros
are better.
Contact: Sean Paul, Maintainer of the driver you plan to convert
Convert drivers to use simple modeset suspend/resume
----------------------------------------------------
Most drivers (except i915 and nouveau) that use
drm_atomic_helper_suspend/resume() can probably be converted to use
drm_mode_config_helper_suspend/resume().
Contact: Maintainer of the driver you plan to convert
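
As an illustration of the conversion suggested above (not part of this diff),
a driver that stores its drm_device in drvdata would typically end up with
system-sleep callbacks along these lines; the foo_* names are hypothetical:

#include <linux/device.h>
#include <linux/pm.h>
#include <drm/drmP.h>
#include <drm/drm_modeset_helper.h>

static int foo_pm_suspend(struct device *dev)
{
        struct drm_device *drm = dev_get_drvdata(dev);

        /* Replaces the hand-rolled drm_atomic_helper_suspend() bookkeeping. */
        return drm_mode_config_helper_suspend(drm);
}

static int foo_pm_resume(struct device *dev)
{
        struct drm_device *drm = dev_get_drvdata(dev);

        return drm_mode_config_helper_resume(drm);
}

static SIMPLE_DEV_PM_OPS(foo_pm_ops, foo_pm_suspend, foo_pm_resume);
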
Convert drivers to use drm_fb_helper_fbdev_setup/teardown()
-----------------------------------------------------------
Most drivers can use drm_fb_helper_fbdev_setup() except maybe:
- amdgpu which has special logic to decide whether to call
drm_helper_disable_unused_functions()
- armada which isn't atomic and doesn't call
drm_helper_disable_unused_functions()
- i915 which calls drm_fb_helper_initial_config() in a worker
Drivers that use drm_framebuffer_remove() to clean up the fbdev framebuffer can
probably use drm_fb_helper_fbdev_teardown().
Contact: Maintainer of the driver you plan to convert
Core refactorings
=================
@ -382,11 +413,6 @@ those drivers as simple as possible, so lots of room for refactoring:
one of the ideas for having a shared dsi/dbi helper, abstracting away the
transport details more.
- tinydrm_lastclose could be drm_fb_helper_lastclose. Only thing we need
for that is to store the drm_fb_helper pointer somewhere in
drm_device->mode_config. And then we could roll that out to all the
drivers.
- tinydrm_gem_cma_prime_import_sg_table should probably go into the cma
helpers, as a _vmapped variant (since not every driver needs the vmap).
And tinydrm_gem_cma_free_object could the be merged into
@ -400,11 +426,6 @@ those drivers as simple as possible, so lots of room for refactoring:
a drm_device wrong. Doesn't matter, since everyone else gets it wrong
too :-)
- With the fbdev pointer in dev->mode_config we could also make
suspend/resume helpers entirely generic, at least if we add a
dev->mode_config.suspend_state. We could even provide a generic pm_ops
structure with those.
- also rework the drm_framebuffer_funcs->dirty hook wire-up, see above.
Contact: Noralf Trønnes, Daniel Vetter


@ -4486,6 +4486,12 @@ T: git git://anongit.freedesktop.org/drm/drm-misc
S: Maintained
F: drivers/gpu/drm/tve200/
DRM DRIVER FOR ILITEK ILI9225 PANELS
M: David Lechner <david@lechnology.com>
S: Maintained
F: drivers/gpu/drm/tinydrm/ili9225.c
F: Documentation/devicetree/bindings/display/ili9225.txt
DRM DRIVER FOR INTEL I810 VIDEO CARDS
S: Orphan / Obsolete
F: drivers/gpu/drm/i810/
@ -4572,6 +4578,12 @@ S: Maintained
F: drivers/gpu/drm/tinydrm/st7586.c
F: Documentation/devicetree/bindings/display/st7586.txt
DRM DRIVER FOR SITRONIX ST7735R PANELS
M: David Lechner <david@lechnology.com>
S: Maintained
F: drivers/gpu/drm/tinydrm/st7735r.c
F: Documentation/devicetree/bindings/display/st7735r.txt
DRM DRIVER FOR TDFX VIDEO CARDS
S: Orphan / Obsolete
F: drivers/gpu/drm/tdfx/
@ -4611,7 +4623,7 @@ F: include/linux/vga*
DRM DRIVERS AND MISC GPU PATCHES
M: Daniel Vetter <daniel.vetter@intel.com>
M: Jani Nikula <jani.nikula@linux.intel.com>
M: Gustavo Padovan <gustavo@padovan.org>
M: Sean Paul <seanpaul@chromium.org>
W: https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-misc.html
S: Maintained
@ -4740,7 +4752,8 @@ F: Documentation/devicetree/bindings/display/bridge/renesas,dw-hdmi.txt
F: Documentation/devicetree/bindings/display/renesas,du.txt
DRM DRIVERS FOR ROCKCHIP
M: Mark Yao <mark.yao@rock-chips.com>
M: Sandy Huang <hjc@rock-chips.com>
M: Heiko Stübner <heiko@sntech.de>
L: dri-devel@lists.freedesktop.org
S: Maintained
F: drivers/gpu/drm/rockchip/
@ -4828,6 +4841,15 @@ S: Maintained
F: drivers/gpu/drm/tinydrm/
F: include/drm/tinydrm/
DRM TTM SUBSYSTEM
M: Christian Koenig <christian.koenig@amd.com>
M: Roger He <Hongbo.He@amd.com>
T: git git://people.freedesktop.org/~agd5f/linux
S: Maintained
L: dri-devel@lists.freedesktop.org
F: include/drm/ttm/
F: drivers/gpu/drm/ttm/
DSBR100 USB FM RADIO DRIVER
M: Alexey Klimov <klimov.linux@gmail.com>
L: linux-media@vger.kernel.org
@ -7056,7 +7078,7 @@ M: Zhi Wang <zhi.a.wang@intel.com>
L: intel-gvt-dev@lists.freedesktop.org
L: intel-gfx@lists.freedesktop.org
W: https://01.org/igvt-g
T: git https://github.com/01org/gvt-linux.git
T: git https://github.com/intel/gvt-linux.git
S: Supported
F: drivers/gpu/drm/i915/gvt/
@ -11426,6 +11448,7 @@ F: drivers/net/wireless/quantenna
RADEON and AMDGPU DRM DRIVERS
M: Alex Deucher <alexander.deucher@amd.com>
M: Christian König <christian.koenig@amd.com>
M: David (ChunMing) Zhou <David1.Zhou@amd.com>
L: amd-gfx@lists.freedesktop.org
T: git git://people.freedesktop.org/~agd5f/linux
S: Supported


@ -146,6 +146,18 @@ int iosf_mbi_register_pmic_bus_access_notifier(struct notifier_block *nb);
*/
int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb);
/**
* iosf_mbi_unregister_pmic_bus_access_notifier_unlocked - Unregister PMIC bus
* notifier, unlocked
*
* Like iosf_mbi_unregister_pmic_bus_access_notifier(), but for use when the
* caller has already called iosf_mbi_punit_acquire() itself.
*
* @nb: notifier_block to unregister
*/
int iosf_mbi_unregister_pmic_bus_access_notifier_unlocked(
struct notifier_block *nb);
/**
* iosf_mbi_call_pmic_bus_access_notifier_chain - Call PMIC bus notifier chain
*
@ -154,6 +166,11 @@ int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb);
*/
int iosf_mbi_call_pmic_bus_access_notifier_chain(unsigned long val, void *v);
/**
* iosf_mbi_assert_punit_acquired - Assert that the P-Unit has been acquired.
*/
void iosf_mbi_assert_punit_acquired(void);
#else /* CONFIG_IOSF_MBI is not enabled */
static inline
bool iosf_mbi_available(void)
@ -197,12 +214,20 @@ int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb)
return 0;
}
static inline int
iosf_mbi_unregister_pmic_bus_access_notifier_unlocked(struct notifier_block *nb)
{
return 0;
}
static inline
int iosf_mbi_call_pmic_bus_access_notifier_chain(unsigned long val, void *v)
{
return 0;
}
static inline void iosf_mbi_assert_punit_acquired(void) {}
#endif /* CONFIG_IOSF_MBI */
#endif /* IOSF_MBI_SYMS_H */
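
A hedged sketch of how the new unlocked unregister variant above might be used
by a caller that already holds the P-Unit via iosf_mbi_punit_acquire(); this is
an assumed usage pattern, not code from this series, and foo_* is hypothetical:

#include <linux/notifier.h>
#include <asm/iosf_mbi.h>

static int foo_punit_notify(struct notifier_block *nb, unsigned long action,
                            void *data)
{
        /* React to PMIC bus access events (MBI_PMIC_BUS_ACCESS_BEGIN/END). */
        return NOTIFY_OK;
}

static struct notifier_block foo_nb = { .notifier_call = foo_punit_notify };

static void foo_teardown(void)
{
        iosf_mbi_punit_acquire();
        /* ... work that must be done while the P-Unit is held ... */
        iosf_mbi_unregister_pmic_bus_access_notifier_unlocked(&foo_nb);
        iosf_mbi_punit_release();
}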


@ -243,7 +243,7 @@ static void __init intel_remapping_check(int num, int slot, int func)
#define KB(x) ((x) * 1024UL)
#define MB(x) (KB (KB (x)))
static size_t __init i830_tseg_size(void)
static resource_size_t __init i830_tseg_size(void)
{
u8 esmramc = read_pci_config_byte(0, 0, 0, I830_ESMRAMC);
@ -256,7 +256,7 @@ static size_t __init i830_tseg_size(void)
return KB(512);
}
static size_t __init i845_tseg_size(void)
static resource_size_t __init i845_tseg_size(void)
{
u8 esmramc = read_pci_config_byte(0, 0, 0, I845_ESMRAMC);
u8 tseg_size = esmramc & I845_TSEG_SIZE_MASK;
@ -273,7 +273,7 @@ static size_t __init i845_tseg_size(void)
return 0;
}
static size_t __init i85x_tseg_size(void)
static resource_size_t __init i85x_tseg_size(void)
{
u8 esmramc = read_pci_config_byte(0, 0, 0, I85X_ESMRAMC);
@ -283,12 +283,12 @@ static size_t __init i85x_tseg_size(void)
return MB(1);
}
static size_t __init i830_mem_size(void)
static resource_size_t __init i830_mem_size(void)
{
return read_pci_config_byte(0, 0, 0, I830_DRB3) * MB(32);
}
static size_t __init i85x_mem_size(void)
static resource_size_t __init i85x_mem_size(void)
{
return read_pci_config_byte(0, 0, 1, I85X_DRB3) * MB(32);
}
@ -297,36 +297,36 @@ static size_t __init i85x_mem_size(void)
* On 830/845/85x the stolen memory base isn't available in any
* register. We need to calculate it as TOM-TSEG_SIZE-stolen_size.
*/
static phys_addr_t __init i830_stolen_base(int num, int slot, int func,
size_t stolen_size)
static resource_size_t __init i830_stolen_base(int num, int slot, int func,
resource_size_t stolen_size)
{
return (phys_addr_t)i830_mem_size() - i830_tseg_size() - stolen_size;
return i830_mem_size() - i830_tseg_size() - stolen_size;
}
static phys_addr_t __init i845_stolen_base(int num, int slot, int func,
size_t stolen_size)
static resource_size_t __init i845_stolen_base(int num, int slot, int func,
resource_size_t stolen_size)
{
return (phys_addr_t)i830_mem_size() - i845_tseg_size() - stolen_size;
return i830_mem_size() - i845_tseg_size() - stolen_size;
}
static phys_addr_t __init i85x_stolen_base(int num, int slot, int func,
size_t stolen_size)
static resource_size_t __init i85x_stolen_base(int num, int slot, int func,
resource_size_t stolen_size)
{
return (phys_addr_t)i85x_mem_size() - i85x_tseg_size() - stolen_size;
return i85x_mem_size() - i85x_tseg_size() - stolen_size;
}
static phys_addr_t __init i865_stolen_base(int num, int slot, int func,
size_t stolen_size)
static resource_size_t __init i865_stolen_base(int num, int slot, int func,
resource_size_t stolen_size)
{
u16 toud = 0;
toud = read_pci_config_16(0, 0, 0, I865_TOUD);
return (phys_addr_t)(toud << 16) + i845_tseg_size();
return toud * KB(64) + i845_tseg_size();
}
static phys_addr_t __init gen3_stolen_base(int num, int slot, int func,
size_t stolen_size)
static resource_size_t __init gen3_stolen_base(int num, int slot, int func,
resource_size_t stolen_size)
{
u32 bsm;
@ -337,10 +337,10 @@ static phys_addr_t __init gen3_stolen_base(int num, int slot, int func,
*/
bsm = read_pci_config(num, slot, func, INTEL_BSM);
return (phys_addr_t)bsm & INTEL_BSM_MASK;
return bsm & INTEL_BSM_MASK;
}
static size_t __init i830_stolen_size(int num, int slot, int func)
static resource_size_t __init i830_stolen_size(int num, int slot, int func)
{
u16 gmch_ctrl;
u16 gms;
@ -361,7 +361,7 @@ static size_t __init i830_stolen_size(int num, int slot, int func)
return 0;
}
static size_t __init gen3_stolen_size(int num, int slot, int func)
static resource_size_t __init gen3_stolen_size(int num, int slot, int func)
{
u16 gmch_ctrl;
u16 gms;
@ -390,7 +390,7 @@ static size_t __init gen3_stolen_size(int num, int slot, int func)
return 0;
}
static size_t __init gen6_stolen_size(int num, int slot, int func)
static resource_size_t __init gen6_stolen_size(int num, int slot, int func)
{
u16 gmch_ctrl;
u16 gms;
@ -398,10 +398,10 @@ static size_t __init gen6_stolen_size(int num, int slot, int func)
gmch_ctrl = read_pci_config_16(num, slot, func, SNB_GMCH_CTRL);
gms = (gmch_ctrl >> SNB_GMCH_GMS_SHIFT) & SNB_GMCH_GMS_MASK;
return (size_t)gms * MB(32);
return gms * MB(32);
}
static size_t __init gen8_stolen_size(int num, int slot, int func)
static resource_size_t __init gen8_stolen_size(int num, int slot, int func)
{
u16 gmch_ctrl;
u16 gms;
@ -409,10 +409,10 @@ static size_t __init gen8_stolen_size(int num, int slot, int func)
gmch_ctrl = read_pci_config_16(num, slot, func, SNB_GMCH_CTRL);
gms = (gmch_ctrl >> BDW_GMCH_GMS_SHIFT) & BDW_GMCH_GMS_MASK;
return (size_t)gms * MB(32);
return gms * MB(32);
}
static size_t __init chv_stolen_size(int num, int slot, int func)
static resource_size_t __init chv_stolen_size(int num, int slot, int func)
{
u16 gmch_ctrl;
u16 gms;
@ -426,14 +426,14 @@ static size_t __init chv_stolen_size(int num, int slot, int func)
* 0x17 to 0x1d: 4MB increments start at 36MB
*/
if (gms < 0x11)
return (size_t)gms * MB(32);
return gms * MB(32);
else if (gms < 0x17)
return (size_t)(gms - 0x11 + 2) * MB(4);
return (gms - 0x11) * MB(4) + MB(8);
else
return (size_t)(gms - 0x17 + 9) * MB(4);
return (gms - 0x17) * MB(4) + MB(36);
}
static size_t __init gen9_stolen_size(int num, int slot, int func)
static resource_size_t __init gen9_stolen_size(int num, int slot, int func)
{
u16 gmch_ctrl;
u16 gms;
@ -444,14 +444,15 @@ static size_t __init gen9_stolen_size(int num, int slot, int func)
/* 0x0 to 0xef: 32MB increments starting at 0MB */
/* 0xf0 to 0xfe: 4MB increments starting at 4MB */
if (gms < 0xf0)
return (size_t)gms * MB(32);
return gms * MB(32);
else
return (size_t)(gms - 0xf0 + 1) * MB(4);
return (gms - 0xf0) * MB(4) + MB(4);
}
struct intel_early_ops {
size_t (*stolen_size)(int num, int slot, int func);
phys_addr_t (*stolen_base)(int num, int slot, int func, size_t size);
resource_size_t (*stolen_size)(int num, int slot, int func);
resource_size_t (*stolen_base)(int num, int slot, int func,
resource_size_t size);
};
static const struct intel_early_ops i830_early_ops __initconst = {
@ -527,16 +528,20 @@ static const struct pci_device_id intel_early_ids[] __initconst = {
INTEL_SKL_IDS(&gen9_early_ops),
INTEL_BXT_IDS(&gen9_early_ops),
INTEL_KBL_IDS(&gen9_early_ops),
INTEL_CFL_IDS(&gen9_early_ops),
INTEL_GLK_IDS(&gen9_early_ops),
INTEL_CNL_IDS(&gen9_early_ops),
};
struct resource intel_graphics_stolen_res __ro_after_init = DEFINE_RES_MEM(0, 0);
EXPORT_SYMBOL(intel_graphics_stolen_res);
static void __init
intel_graphics_stolen(int num, int slot, int func,
const struct intel_early_ops *early_ops)
{
phys_addr_t base, end;
size_t size;
resource_size_t base, size;
resource_size_t end;
size = early_ops->stolen_size(num, slot, func);
base = early_ops->stolen_base(num, slot, func, size);
@ -545,8 +550,12 @@ intel_graphics_stolen(int num, int slot, int func,
return;
end = base + size - 1;
printk(KERN_INFO "Reserving Intel graphics memory at %pa-%pa\n",
&base, &end);
intel_graphics_stolen_res.start = base;
intel_graphics_stolen_res.end = end;
printk(KERN_INFO "Reserving Intel graphics memory at %pR\n",
&intel_graphics_stolen_res);
/* Mark this space as reserved */
e820__range_add(base, size, E820_TYPE_RESERVED);


@ -218,14 +218,23 @@ int iosf_mbi_register_pmic_bus_access_notifier(struct notifier_block *nb)
}
EXPORT_SYMBOL(iosf_mbi_register_pmic_bus_access_notifier);
int iosf_mbi_unregister_pmic_bus_access_notifier_unlocked(
struct notifier_block *nb)
{
iosf_mbi_assert_punit_acquired();
return blocking_notifier_chain_unregister(
&iosf_mbi_pmic_bus_access_notifier, nb);
}
EXPORT_SYMBOL(iosf_mbi_unregister_pmic_bus_access_notifier_unlocked);
int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb)
{
int ret;
/* Wait for the bus to go inactive before unregistering */
mutex_lock(&iosf_mbi_punit_mutex);
ret = blocking_notifier_chain_unregister(
&iosf_mbi_pmic_bus_access_notifier, nb);
ret = iosf_mbi_unregister_pmic_bus_access_notifier_unlocked(nb);
mutex_unlock(&iosf_mbi_punit_mutex);
return ret;
@ -239,6 +248,12 @@ int iosf_mbi_call_pmic_bus_access_notifier_chain(unsigned long val, void *v)
}
EXPORT_SYMBOL(iosf_mbi_call_pmic_bus_access_notifier_chain);
void iosf_mbi_assert_punit_acquired(void)
{
WARN_ON(!mutex_is_locked(&iosf_mbi_punit_mutex));
}
EXPORT_SYMBOL(iosf_mbi_assert_punit_acquired);
#ifdef CONFIG_IOSF_MBI_DEBUG
static u32 dbg_mdr;
static u32 dbg_mcr;


@ -231,6 +231,7 @@ config DMA_SHARED_BUFFER
bool
default n
select ANON_INODES
select IRQ_WORK
help
This option enables the framework for buffer-sharing between
multiple drivers. A buffer is associated with a file using driver


@ -80,7 +80,7 @@ static struct _intel_private {
unsigned int needs_dmar : 1;
phys_addr_t gma_bus_addr;
/* Size of memory reserved for graphics by the BIOS */
unsigned int stolen_size;
resource_size_t stolen_size;
/* Total number of gtt entries. */
unsigned int gtt_total_entries;
/* Part of the gtt that is mappable by the cpu, for those chips where
@ -333,13 +333,13 @@ static void i810_write_entry(dma_addr_t addr, unsigned int entry,
writel_relaxed(addr | pte_flags, intel_private.gtt + entry);
}
static unsigned int intel_gtt_stolen_size(void)
static resource_size_t intel_gtt_stolen_size(void)
{
u16 gmch_ctrl;
u8 rdct;
int local = 0;
static const int ddt[4] = { 0, 16, 32, 64 };
unsigned int stolen_size = 0;
resource_size_t stolen_size = 0;
if (INTEL_GTT_GEN == 1)
return 0; /* no stolen mem on i81x */
@ -417,8 +417,8 @@ static unsigned int intel_gtt_stolen_size(void)
}
if (stolen_size > 0) {
dev_info(&intel_private.bridge_dev->dev, "detected %dK %s memory\n",
stolen_size / KB(1), local ? "local" : "stolen");
dev_info(&intel_private.bridge_dev->dev, "detected %lluK %s memory\n",
(u64)stolen_size / KB(1), local ? "local" : "stolen");
} else {
dev_info(&intel_private.bridge_dev->dev,
"no pre-allocated video memory detected\n");
@ -872,6 +872,8 @@ void intel_gtt_insert_sg_entries(struct sg_table *st,
}
}
wmb();
if (intel_private.driver->chipset_flush)
intel_private.driver->chipset_flush();
}
EXPORT_SYMBOL(intel_gtt_insert_sg_entries);
@ -1422,12 +1424,10 @@ int intel_gmch_probe(struct pci_dev *bridge_pdev, struct pci_dev *gpu_pdev,
EXPORT_SYMBOL(intel_gmch_probe);
void intel_gtt_get(u64 *gtt_total,
u32 *stolen_size,
phys_addr_t *mappable_base,
u64 *mappable_end)
resource_size_t *mappable_end)
{
*gtt_total = intel_private.gtt_total_entries << PAGE_SHIFT;
*stolen_size = intel_private.stolen_size;
*mappable_base = intel_private.gma_bus_addr;
*mappable_end = intel_private.gtt_mappable_entries << PAGE_SHIFT;
}


@ -351,13 +351,13 @@ static inline int is_dma_buf_file(struct file *file)
*
* 2. Userspace passes this file-descriptors to all drivers it wants this buffer
* to share with: First the filedescriptor is converted to a &dma_buf using
* dma_buf_get(). The the buffer is attached to the device using
* dma_buf_get(). Then the buffer is attached to the device using
* dma_buf_attach().
*
* Up to this stage the exporter is still free to migrate or reallocate the
* backing storage.
*
* 3. Once the buffer is attached to all devices userspace can inniate DMA
* 3. Once the buffer is attached to all devices userspace can initiate DMA
* access to the shared buffer. In the kernel this is done by calling
* dma_buf_map_attachment() and dma_buf_unmap_attachment().
*
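
A compact sketch of the importer-side flow described in steps 2 and 3 above,
with minimal error handling; the foo_import() wrapper and the DMA_BIDIRECTIONAL
direction are illustrative assumptions, not part of this diff:

#include <linux/device.h>
#include <linux/dma-buf.h>
#include <linux/dma-direction.h>
#include <linux/err.h>

static struct sg_table *foo_import(struct device *dev, int fd,
                                   struct dma_buf_attachment **attach_out)
{
        struct dma_buf *buf = dma_buf_get(fd);          /* fd -> dma_buf */
        struct dma_buf_attachment *attach;
        struct sg_table *sgt;

        if (IS_ERR(buf))
                return ERR_CAST(buf);

        attach = dma_buf_attach(buf, dev);              /* attach buffer to device */
        if (IS_ERR(attach)) {
                dma_buf_put(buf);
                return ERR_CAST(attach);
        }

        /* Only now is the backing storage pinned for DMA by this importer. */
        sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
        if (IS_ERR(sgt)) {
                dma_buf_detach(buf, attach);
                dma_buf_put(buf);
                return ERR_CAST(sgt);
        }

        *attach_out = attach;
        return sgt;     /* unmap later with dma_buf_unmap_attachment() */
}
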
@ -617,7 +617,7 @@ EXPORT_SYMBOL_GPL(dma_buf_detach);
* Returns sg_table containing the scatterlist to be returned; returns ERR_PTR
* on error. May return -EINTR if it is interrupted by a signal.
*
* A mapping must be unmapped again using dma_buf_map_attachment(). Note that
* A mapping must be unmapped by using dma_buf_unmap_attachment(). Note that
* the underlying backing storage is pinned for as long as a mapping exists,
* therefore users/importers should not hold onto a mapping for undue amounts of
* time.
@ -1179,8 +1179,7 @@ static int dma_buf_init_debugfs(void)
static void dma_buf_uninit_debugfs(void)
{
if (dma_buf_debugfs_dir)
debugfs_remove_recursive(dma_buf_debugfs_dir);
debugfs_remove_recursive(dma_buf_debugfs_dir);
}
#else
static inline int dma_buf_init_debugfs(void)


@ -31,6 +31,14 @@ static const char *dma_fence_array_get_timeline_name(struct dma_fence *fence)
return "unbound";
}
static void irq_dma_fence_array_work(struct irq_work *wrk)
{
struct dma_fence_array *array = container_of(wrk, typeof(*array), work);
dma_fence_signal(&array->base);
dma_fence_put(&array->base);
}
static void dma_fence_array_cb_func(struct dma_fence *f,
struct dma_fence_cb *cb)
{
@ -39,8 +47,9 @@ static void dma_fence_array_cb_func(struct dma_fence *f,
struct dma_fence_array *array = array_cb->array;
if (atomic_dec_and_test(&array->num_pending))
dma_fence_signal(&array->base);
dma_fence_put(&array->base);
irq_work_queue(&array->work);
else
dma_fence_put(&array->base);
}
static bool dma_fence_array_enable_signaling(struct dma_fence *fence)
@ -136,6 +145,7 @@ struct dma_fence_array *dma_fence_array_create(int num_fences,
spin_lock_init(&array->lock);
dma_fence_init(&array->base, &dma_fence_array_ops, &array->lock,
context, seqno);
init_irq_work(&array->work, irq_dma_fence_array_work);
array->num_fences = num_fences;
atomic_set(&array->num_pending, signal_on_any ? 1 : num_fences);


@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
struct reservation_object_list *fobj,
struct dma_fence *fence)
{
u32 i;
struct dma_fence *signaled = NULL;
u32 i, signaled_idx;
dma_fence_get(fence);
@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
dma_fence_put(old_fence);
return;
}
if (!signaled && dma_fence_is_signaled(old_fence)) {
signaled = old_fence;
signaled_idx = i;
}
}
/*
* memory barrier is added by write_seqcount_begin,
* fobj->shared_count is protected by this lock too
*/
RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
fobj->shared_count++;
if (signaled) {
RCU_INIT_POINTER(fobj->shared[signaled_idx], fence);
} else {
RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
fobj->shared_count++;
}
write_seqcount_end(&obj->seq);
preempt_enable();
dma_fence_put(signaled);
}
static void
@ -145,8 +157,7 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
struct reservation_object_list *fobj,
struct dma_fence *fence)
{
unsigned i;
struct dma_fence *old_fence = NULL;
unsigned i, j, k;
dma_fence_get(fence);
@ -162,24 +173,21 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
* references from the old struct are carried over to
* the new.
*/
fobj->shared_count = old->shared_count;
for (i = 0; i < old->shared_count; ++i) {
for (i = 0, j = 0, k = fobj->shared_max; i < old->shared_count; ++i) {
struct dma_fence *check;
check = rcu_dereference_protected(old->shared[i],
reservation_object_held(obj));
if (!old_fence && check->context == fence->context) {
old_fence = check;
RCU_INIT_POINTER(fobj->shared[i], fence);
} else
RCU_INIT_POINTER(fobj->shared[i], check);
}
if (!old_fence) {
RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
fobj->shared_count++;
if (check->context == fence->context ||
dma_fence_is_signaled(check))
RCU_INIT_POINTER(fobj->shared[--k], check);
else
RCU_INIT_POINTER(fobj->shared[j++], check);
}
fobj->shared_count = j;
RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
fobj->shared_count++;
done:
preempt_disable();
@ -192,10 +200,18 @@ done:
write_seqcount_end(&obj->seq);
preempt_enable();
if (old)
kfree_rcu(old, rcu);
if (!old)
return;
dma_fence_put(old_fence);
/* Drop the references to the signaled fences */
for (i = k; i < fobj->shared_max; ++i) {
struct dma_fence *f;
f = rcu_dereference_protected(fobj->shared[i],
reservation_object_held(obj));
dma_fence_put(f);
}
kfree_rcu(old, rcu);
}
/**
@ -318,7 +334,7 @@ retry:
continue;
}
dst_list->shared[dst_list->shared_count++] = fence;
rcu_assign_pointer(dst_list->shared[dst_list->shared_count++], fence);
}
} else {
dst_list = NULL;
@ -455,13 +471,15 @@ long reservation_object_wait_timeout_rcu(struct reservation_object *obj,
unsigned long timeout)
{
struct dma_fence *fence;
unsigned seq, shared_count, i = 0;
unsigned seq, shared_count;
long ret = timeout ? timeout : 1;
int i;
retry:
shared_count = 0;
seq = read_seqcount_begin(&obj->seq);
rcu_read_lock();
i = -1;
fence = rcu_dereference(obj->fence_excl);
if (fence && !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
@ -477,14 +495,14 @@ retry:
fence = NULL;
}
if (!fence && wait_all) {
if (wait_all) {
struct reservation_object_list *fobj =
rcu_dereference(obj->fence);
if (fobj)
shared_count = fobj->shared_count;
for (i = 0; i < shared_count; ++i) {
for (i = 0; !fence && i < shared_count; ++i) {
struct dma_fence *lfence = rcu_dereference(fobj->shared[i]);
if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,


@ -7,6 +7,7 @@
menuconfig DRM
tristate "Direct Rendering Manager (XFree86 4.1.0 and higher DRI support)"
depends on (AGP || AGP=n) && !EMULATED_CMPXCHG && HAS_DMA
select DRM_PANEL_ORIENTATION_QUIRKS
select HDMI
select FB_CMDLINE
select I2C
@ -149,6 +150,10 @@ config DRM_VM
bool
depends on DRM && MMU
config DRM_SCHED
tristate
depends on DRM
source "drivers/gpu/drm/i2c/Kconfig"
source "drivers/gpu/drm/arm/Kconfig"
@ -178,6 +183,7 @@ config DRM_AMDGPU
depends on DRM && PCI && MMU
select FW_LOADER
select DRM_KMS_HELPER
select DRM_SCHED
select DRM_TTM
select POWER_SUPPLY
select HWMON
@ -362,6 +368,10 @@ config DRM_SAVAGE
endif # DRM_LEGACY
# Separate option because drm_panel_orientation_quirks.c is shared with fbdev
config DRM_PANEL_ORIENTATION_QUIRKS
tristate
config DRM_LIB_RANDOM
bool
default n


@ -47,8 +47,10 @@ obj-$(CONFIG_DRM_DEBUG_MM_SELFTEST) += selftests/
obj-$(CONFIG_DRM) += drm.o
obj-$(CONFIG_DRM_MIPI_DSI) += drm_mipi_dsi.o
obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += drm_panel_orientation_quirks.o
obj-$(CONFIG_DRM_ARM) += arm/
obj-$(CONFIG_DRM_TTM) += ttm/
obj-$(CONFIG_DRM_SCHED) += scheduler/
obj-$(CONFIG_DRM_TDFX) += tdfx/
obj-$(CONFIG_DRM_R128) += r128/
obj-y += amd/lib/


@ -52,7 +52,8 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o \
amdgpu_queue_mgr.o amdgpu_vf_error.o amdgpu_sched.o
amdgpu_queue_mgr.o amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o \
amdgpu_ids.o
# add asic specific block
amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
@ -62,7 +63,7 @@ amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
amdgpu-$(CONFIG_DRM_AMDGPU_SI)+= si.o gmc_v6_0.o gfx_v6_0.o si_ih.o si_dma.o dce_v6_0.o si_dpm.o si_smc.o
amdgpu-y += \
vi.o mxgpu_vi.o nbio_v6_1.o soc15.o mxgpu_ai.o nbio_v7_0.o
vi.o mxgpu_vi.o nbio_v6_1.o soc15.o mxgpu_ai.o nbio_v7_0.o vega10_reg_init.o
# add GMC block
amdgpu-y += \
@ -135,10 +136,7 @@ amdgpu-y += \
amdgpu-y += amdgpu_cgs.o
# GPU scheduler
amdgpu-y += \
../scheduler/gpu_scheduler.o \
../scheduler/sched_fence.o \
amdgpu_job.o
amdgpu-y += amdgpu_job.o
# ACP componet
ifneq ($(CONFIG_DRM_AMD_ACP),)


@ -45,8 +45,11 @@
#include <drm/drmP.h>
#include <drm/drm_gem.h>
#include <drm/amdgpu_drm.h>
#include <drm/gpu_scheduler.h>
#include <kgd_kfd_interface.h>
#include "dm_pp_interface.h"
#include "kgd_pp_interface.h"
#include "amd_shared.h"
#include "amdgpu_mode.h"
@ -59,7 +62,6 @@
#include "amdgpu_sync.h"
#include "amdgpu_ring.h"
#include "amdgpu_vm.h"
#include "amd_powerplay.h"
#include "amdgpu_dpm.h"
#include "amdgpu_acp.h"
#include "amdgpu_uvd.h"
@ -67,10 +69,9 @@
#include "amdgpu_vcn.h"
#include "amdgpu_mn.h"
#include "amdgpu_dm.h"
#include "gpu_scheduler.h"
#include "amdgpu_virt.h"
#include "amdgpu_gart.h"
#include "amdgpu_debugfs.h"
/*
* Modules parameters.
@ -125,6 +126,7 @@ extern int amdgpu_param_buf_per_se;
extern int amdgpu_job_hang_limit;
extern int amdgpu_lbpw;
extern int amdgpu_compute_multipipe;
extern int amdgpu_gpu_recovery;
#ifdef CONFIG_DRM_AMDGPU_SI
extern int amdgpu_si_support;
@ -177,6 +179,10 @@ extern int amdgpu_cik_support;
#define CIK_CURSOR_WIDTH 128
#define CIK_CURSOR_HEIGHT 128
/* GPU RESET flags */
#define AMDGPU_RESET_INFO_VRAM_LOST (1 << 0)
#define AMDGPU_RESET_INFO_FULLRESET (1 << 1)
struct amdgpu_device;
struct amdgpu_ib;
struct amdgpu_cs_parser;
@ -218,17 +224,18 @@ enum amdgpu_kiq_irq {
AMDGPU_CP_KIQ_IRQ_LAST
};
int amdgpu_set_clockgating_state(struct amdgpu_device *adev,
enum amd_ip_block_type block_type,
enum amd_clockgating_state state);
int amdgpu_set_powergating_state(struct amdgpu_device *adev,
enum amd_ip_block_type block_type,
enum amd_powergating_state state);
void amdgpu_get_clockgating_state(struct amdgpu_device *adev, u32 *flags);
int amdgpu_wait_for_idle(struct amdgpu_device *adev,
enum amd_ip_block_type block_type);
bool amdgpu_is_idle(struct amdgpu_device *adev,
enum amd_ip_block_type block_type);
int amdgpu_device_ip_set_clockgating_state(struct amdgpu_device *adev,
enum amd_ip_block_type block_type,
enum amd_clockgating_state state);
int amdgpu_device_ip_set_powergating_state(struct amdgpu_device *adev,
enum amd_ip_block_type block_type,
enum amd_powergating_state state);
void amdgpu_device_ip_get_clockgating_state(struct amdgpu_device *adev,
u32 *flags);
int amdgpu_device_ip_wait_for_idle(struct amdgpu_device *adev,
enum amd_ip_block_type block_type);
bool amdgpu_device_ip_is_idle(struct amdgpu_device *adev,
enum amd_ip_block_type block_type);
#define AMDGPU_MAX_IP_NUM 16
@ -253,15 +260,16 @@ struct amdgpu_ip_block {
const struct amdgpu_ip_block_version *version;
};
int amdgpu_ip_block_version_cmp(struct amdgpu_device *adev,
enum amd_ip_block_type type,
u32 major, u32 minor);
int amdgpu_device_ip_block_version_cmp(struct amdgpu_device *adev,
enum amd_ip_block_type type,
u32 major, u32 minor);
struct amdgpu_ip_block * amdgpu_get_ip_block(struct amdgpu_device *adev,
enum amd_ip_block_type type);
struct amdgpu_ip_block *
amdgpu_device_ip_get_ip_block(struct amdgpu_device *adev,
enum amd_ip_block_type type);
int amdgpu_ip_block_add(struct amdgpu_device *adev,
const struct amdgpu_ip_block_version *ip_block_version);
int amdgpu_device_ip_block_add(struct amdgpu_device *adev,
const struct amdgpu_ip_block_version *ip_block_version);
/* provided by hw blocks that can move/clear data. e.g., gfx or sdma */
struct amdgpu_buffer_funcs {
@ -341,8 +349,9 @@ struct amdgpu_gart_funcs {
uint64_t (*get_vm_pte_flags)(struct amdgpu_device *adev,
uint32_t flags);
/* get the pde for a given mc addr */
u64 (*get_vm_pde)(struct amdgpu_device *adev, u64 addr);
uint32_t (*get_invalidate_req)(unsigned int vm_id);
void (*get_vm_pde)(struct amdgpu_device *adev, int level,
u64 *dst, u64 *flags);
uint32_t (*get_invalidate_req)(unsigned int vmid);
};
/* provided by the ih block */
@ -368,9 +377,6 @@ struct amdgpu_dummy_page {
struct page *page;
dma_addr_t addr;
};
int amdgpu_dummy_page_init(struct amdgpu_device *adev);
void amdgpu_dummy_page_fini(struct amdgpu_device *adev);
/*
* Clocks
@ -418,7 +424,6 @@ struct reservation_object *amdgpu_gem_prime_res_obj(struct drm_gem_object *);
void *amdgpu_gem_prime_vmap(struct drm_gem_object *obj);
void amdgpu_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma);
int amdgpu_gem_debugfs_init(struct amdgpu_device *adev);
/* sub-allocation manager, it has to be protected by another lock.
* By conception this is an helper for other part of the driver
@ -535,6 +540,7 @@ struct amdgpu_mc {
u64 private_aperture_end;
/* protects concurrent invalidation */
spinlock_t invalidate_lock;
bool translate_further;
};
/*
@ -645,12 +651,6 @@ typedef enum _AMDGPU_DOORBELL64_ASSIGNMENT
AMDGPU_DOORBELL64_INVALID = 0xFFFF
} AMDGPU_DOORBELL64_ASSIGNMENT;
void amdgpu_doorbell_get_kfd_info(struct amdgpu_device *adev,
phys_addr_t *aperture_base,
size_t *aperture_size,
size_t *start_offset);
/*
* IRQS.
*/
@ -684,7 +684,7 @@ struct amdgpu_ib {
uint32_t flags;
};
extern const struct amd_sched_backend_ops amdgpu_sched_ops;
extern const struct drm_sched_backend_ops amdgpu_sched_ops;
int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
struct amdgpu_job **job, struct amdgpu_vm *vm);
@ -694,7 +694,7 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, unsigned size,
void amdgpu_job_free_resources(struct amdgpu_job *job);
void amdgpu_job_free(struct amdgpu_job *job);
int amdgpu_job_submit(struct amdgpu_job *job, struct amdgpu_ring *ring,
struct amd_sched_entity *entity, void *owner,
struct drm_sched_entity *entity, void *owner,
struct dma_fence **f);
/*
@ -727,7 +727,7 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
struct amdgpu_ctx_ring {
uint64_t sequence;
struct dma_fence **fences;
struct amd_sched_entity entity;
struct drm_sched_entity entity;
};
struct amdgpu_ctx {
@ -735,14 +735,16 @@ struct amdgpu_ctx {
struct amdgpu_device *adev;
struct amdgpu_queue_mgr queue_mgr;
unsigned reset_counter;
unsigned reset_counter_query;
uint32_t vram_lost_counter;
spinlock_t ring_lock;
struct dma_fence **fences;
struct amdgpu_ctx_ring rings[AMDGPU_MAX_RINGS];
bool preamble_presented;
enum amd_sched_priority init_priority;
enum amd_sched_priority override_priority;
enum drm_sched_priority init_priority;
enum drm_sched_priority override_priority;
struct mutex lock;
atomic_t guilty;
};
struct amdgpu_ctx_mgr {
@ -760,7 +762,7 @@ int amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx, struct amdgpu_ring *ring,
struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx *ctx,
struct amdgpu_ring *ring, uint64_t seq);
void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
enum amd_sched_priority priority);
enum drm_sched_priority priority);
int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
struct drm_file *filp);
@ -957,6 +959,7 @@ struct amdgpu_gfx_config {
};
struct amdgpu_cu_info {
uint32_t simd_per_cu;
uint32_t max_waves_per_simd;
uint32_t wave_front_size;
uint32_t max_scratch_slots_per_cu;
@ -1109,12 +1112,11 @@ struct amdgpu_cs_parser {
#define AMDGPU_HAVE_CTX_SWITCH (1 << 2) /* bit set means context switch occured */
struct amdgpu_job {
struct amd_sched_job base;
struct drm_sched_job base;
struct amdgpu_device *adev;
struct amdgpu_vm *vm;
struct amdgpu_ring *ring;
struct amdgpu_sync sync;
struct amdgpu_sync dep_sync;
struct amdgpu_sync sched_sync;
struct amdgpu_ib *ibs;
struct dma_fence *fence; /* the hw fence */
@ -1123,7 +1125,7 @@ struct amdgpu_job {
void *owner;
uint64_t fence_ctx; /* the fence_context this job uses */
bool vm_needs_flush;
unsigned vm_id;
unsigned vmid;
uint64_t vm_pd_addr;
uint32_t gds_base, gds_size;
uint32_t gws_base, gws_size;
@ -1164,10 +1166,10 @@ struct amdgpu_wb {
unsigned long used[DIV_ROUND_UP(AMDGPU_MAX_WB, BITS_PER_LONG)];
};
int amdgpu_wb_get(struct amdgpu_device *adev, u32 *wb);
void amdgpu_wb_free(struct amdgpu_device *adev, u32 wb);
int amdgpu_device_wb_get(struct amdgpu_device *adev, u32 *wb);
void amdgpu_device_wb_free(struct amdgpu_device *adev, u32 wb);
void amdgpu_get_pcie_info(struct amdgpu_device *adev);
void amdgpu_device_get_pcie_info(struct amdgpu_device *adev);
/*
* SDMA
@ -1232,24 +1234,6 @@ void amdgpu_benchmark(struct amdgpu_device *adev, int test_number);
*/
void amdgpu_test_moves(struct amdgpu_device *adev);
/*
* Debugfs
*/
struct amdgpu_debugfs {
const struct drm_info_list *files;
unsigned num_files;
};
int amdgpu_debugfs_add_files(struct amdgpu_device *adev,
const struct drm_info_list *files,
unsigned nfiles);
int amdgpu_debugfs_fence_init(struct amdgpu_device *adev);
#if defined(CONFIG_DEBUG_FS)
int amdgpu_debugfs_init(struct drm_minor *minor);
#endif
int amdgpu_debugfs_firmware_init(struct amdgpu_device *adev);
/*
* amdgpu smumgr functions
@ -1404,8 +1388,6 @@ struct amdgpu_fw_vram_usage {
void *va;
};
int amdgpu_fw_reserve_vram_init(struct amdgpu_device *adev);
/*
* CGS
*/
@ -1421,6 +1403,87 @@ typedef void (*amdgpu_wreg_t)(struct amdgpu_device*, uint32_t, uint32_t);
typedef uint32_t (*amdgpu_block_rreg_t)(struct amdgpu_device*, uint32_t, uint32_t);
typedef void (*amdgpu_block_wreg_t)(struct amdgpu_device*, uint32_t, uint32_t, uint32_t);
/*
* amdgpu nbio functions
*
*/
struct nbio_hdp_flush_reg {
u32 ref_and_mask_cp0;
u32 ref_and_mask_cp1;
u32 ref_and_mask_cp2;
u32 ref_and_mask_cp3;
u32 ref_and_mask_cp4;
u32 ref_and_mask_cp5;
u32 ref_and_mask_cp6;
u32 ref_and_mask_cp7;
u32 ref_and_mask_cp8;
u32 ref_and_mask_cp9;
u32 ref_and_mask_sdma0;
u32 ref_and_mask_sdma1;
};
struct amdgpu_nbio_funcs {
const struct nbio_hdp_flush_reg *hdp_flush_reg;
u32 (*get_hdp_flush_req_offset)(struct amdgpu_device *adev);
u32 (*get_hdp_flush_done_offset)(struct amdgpu_device *adev);
u32 (*get_pcie_index_offset)(struct amdgpu_device *adev);
u32 (*get_pcie_data_offset)(struct amdgpu_device *adev);
u32 (*get_rev_id)(struct amdgpu_device *adev);
void (*mc_access_enable)(struct amdgpu_device *adev, bool enable);
void (*hdp_flush)(struct amdgpu_device *adev);
u32 (*get_memsize)(struct amdgpu_device *adev);
void (*sdma_doorbell_range)(struct amdgpu_device *adev, int instance,
bool use_doorbell, int doorbell_index);
void (*enable_doorbell_aperture)(struct amdgpu_device *adev,
bool enable);
void (*enable_doorbell_selfring_aperture)(struct amdgpu_device *adev,
bool enable);
void (*ih_doorbell_range)(struct amdgpu_device *adev,
bool use_doorbell, int doorbell_index);
void (*update_medium_grain_clock_gating)(struct amdgpu_device *adev,
bool enable);
void (*update_medium_grain_light_sleep)(struct amdgpu_device *adev,
bool enable);
void (*get_clockgating_state)(struct amdgpu_device *adev,
u32 *flags);
void (*ih_control)(struct amdgpu_device *adev);
void (*init_registers)(struct amdgpu_device *adev);
void (*detect_hw_virt)(struct amdgpu_device *adev);
};
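Everything else in the driver is expected to reach these callbacks through the nbio_funcs pointer that this series adds to struct amdgpu_device further down. A minimal sketch of such a caller, assuming the helper name is made up purely for illustration:

/* illustrative only: IP-independent code flushing HDP and reading the
 * memory size through the per-ASIC nbio callback table */
static u32 example_nbio_caller(struct amdgpu_device *adev)
{
	/* adev->nbio_funcs is installed by the SoC-specific init code */
	adev->nbio_funcs->hdp_flush(adev);

	return adev->nbio_funcs->get_memsize(adev);
}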
/* Define the HW IP blocks that will be used in the driver; add more if necessary */
enum amd_hw_ip_block_type {
GC_HWIP = 1,
HDP_HWIP,
SDMA0_HWIP,
SDMA1_HWIP,
MMHUB_HWIP,
ATHUB_HWIP,
NBIO_HWIP,
MP0_HWIP,
UVD_HWIP,
VCN_HWIP = UVD_HWIP,
VCE_HWIP,
DF_HWIP,
DCE_HWIP,
OSSSYS_HWIP,
SMUIO_HWIP,
PWR_HWIP,
NBIF_HWIP,
MAX_HWIP
};
#define HWIP_MAX_INSTANCE 6
struct amd_powerplay {
struct cgs_device *cgs_device;
void *pp_handle;
const struct amd_ip_funcs *ip_funcs;
const struct amd_pm_funcs *pp_funcs;
};
#define AMDGPU_RESET_MAGIC_NUM 64
struct amdgpu_device {
struct device *dev;
@ -1606,6 +1669,11 @@ struct amdgpu_device {
/* amdkfd interface */
struct kfd_dev *kfd;
/* soc15 register offset based on ip, instance and segment */
uint32_t *reg_offset[MAX_HWIP][HWIP_MAX_INSTANCE];
const struct amdgpu_nbio_funcs *nbio_funcs;
/* delayed work_func for deferring clockgating during resume */
struct delayed_work late_init_work;
@ -1616,9 +1684,6 @@ struct amdgpu_device {
/* link all shadow bo */
struct list_head shadow_list;
struct mutex shadow_list_lock;
/* link all gtt */
spinlock_t gtt_list_lock;
struct list_head gtt_list;
/* keep an lru list of rings by HW IP */
struct list_head ring_lru_list;
spinlock_t ring_lru_list_lock;
@ -1629,7 +1694,8 @@ struct amdgpu_device {
/* record last mm index being written through WREG32*/
unsigned long last_mm_index;
bool in_sriov_reset;
bool in_gpu_reset;
struct mutex lock_reset;
};
static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device *bdev)
@ -1773,7 +1839,7 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
#define amdgpu_asic_get_config_memsize(adev) (adev)->asic_funcs->get_config_memsize((adev))
#define amdgpu_gart_flush_gpu_tlb(adev, vmid) (adev)->gart.gart_funcs->flush_gpu_tlb((adev), (vmid))
#define amdgpu_gart_set_pte_pde(adev, pt, idx, addr, flags) (adev)->gart.gart_funcs->set_pte_pde((adev), (pt), (idx), (addr), (flags))
#define amdgpu_gart_get_vm_pde(adev, addr) (adev)->gart.gart_funcs->get_vm_pde((adev), (addr))
#define amdgpu_gart_get_vm_pde(adev, level, dst, flags) (adev)->gart.gart_funcs->get_vm_pde((adev), (level), (dst), (flags))
#define amdgpu_vm_copy_pte(adev, ib, pe, src, count) ((adev)->vm_manager.vm_pte_funcs->copy_pte((ib), (pe), (src), (count)))
#define amdgpu_vm_write_pte(adev, ib, pe, value, count, incr) ((adev)->vm_manager.vm_pte_funcs->write_pte((ib), (pe), (value), (count), (incr)))
#define amdgpu_vm_set_pte_pde(adev, ib, pe, addr, count, incr, flags) ((adev)->vm_manager.vm_pte_funcs->set_pte_pde((ib), (pe), (addr), (count), (incr), (flags)))
@ -1784,7 +1850,7 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
#define amdgpu_ring_get_rptr(r) (r)->funcs->get_rptr((r))
#define amdgpu_ring_get_wptr(r) (r)->funcs->get_wptr((r))
#define amdgpu_ring_set_wptr(r) (r)->funcs->set_wptr((r))
#define amdgpu_ring_emit_ib(r, ib, vm_id, c) (r)->funcs->emit_ib((r), (ib), (vm_id), (c))
#define amdgpu_ring_emit_ib(r, ib, vmid, c) (r)->funcs->emit_ib((r), (ib), (vmid), (c))
#define amdgpu_ring_emit_pipeline_sync(r) (r)->funcs->emit_pipeline_sync((r))
#define amdgpu_ring_emit_vm_flush(r, vmid, addr) (r)->funcs->emit_vm_flush((r), (vmid), (addr))
#define amdgpu_ring_emit_fence(r, addr, seq, flags) (r)->funcs->emit_fence((r), (addr), (seq), (flags))
@ -1823,22 +1889,25 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
#define amdgpu_psp_check_fw_loading_status(adev, i) (adev)->firmware.funcs->check_fw_loading_status((adev), (i))
/* Common functions */
int amdgpu_gpu_reset(struct amdgpu_device *adev);
bool amdgpu_need_backup(struct amdgpu_device *adev);
void amdgpu_pci_config_reset(struct amdgpu_device *adev);
bool amdgpu_need_post(struct amdgpu_device *adev);
int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
struct amdgpu_job* job, bool force);
void amdgpu_device_pci_config_reset(struct amdgpu_device *adev);
bool amdgpu_device_need_post(struct amdgpu_device *adev);
void amdgpu_update_display_priority(struct amdgpu_device *adev);
void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes,
u64 num_vis_bytes);
void amdgpu_ttm_placement_from_domain(struct amdgpu_bo *abo, u32 domain);
bool amdgpu_ttm_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
void amdgpu_vram_location(struct amdgpu_device *adev, struct amdgpu_mc *mc, u64 base);
void amdgpu_gart_location(struct amdgpu_device *adev, struct amdgpu_mc *mc);
void amdgpu_device_vram_location(struct amdgpu_device *adev,
struct amdgpu_mc *mc, u64 base);
void amdgpu_device_gart_location(struct amdgpu_device *adev,
struct amdgpu_mc *mc);
int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev);
void amdgpu_ttm_set_active_vram_size(struct amdgpu_device *adev, u64 size);
int amdgpu_ttm_init(struct amdgpu_device *adev);
void amdgpu_ttm_fini(struct amdgpu_device *adev);
void amdgpu_program_register_sequence(struct amdgpu_device *adev,
void amdgpu_device_program_register_sequence(struct amdgpu_device *adev,
const u32 *registers,
const u32 array_size);
@ -1872,7 +1941,7 @@ void amdgpu_driver_lastclose_kms(struct drm_device *dev);
int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv);
void amdgpu_driver_postclose_kms(struct drm_device *dev,
struct drm_file *file_priv);
int amdgpu_suspend(struct amdgpu_device *adev);
int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon);
int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon);
u32 amdgpu_get_vblank_counter_kms(struct drm_device *dev, unsigned int pipe);
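Much of the amdgpu.h churn above is the amd_sched_* scheduler types being renamed to their drm_sched_* counterparts (entities, jobs, run queues and the DRM_SCHED_PRIORITY_* values). The submission flow itself is unchanged; the following is a condensed, non-authoritative sketch pieced together from the amdgpu_ctx.c and amdgpu_cs.c hunks later in this series, with the surrounding variables treated as stand-ins:

/* condensed sketch of the renamed scheduler flow; error handling and
 * locking from the real code paths are omitted */
static int example_sched_usage(struct amdgpu_ring *ring,
			       struct amdgpu_ctx *ctx,
			       struct amdgpu_job *job,
			       struct drm_file *filp,
			       enum drm_sched_priority priority,
			       unsigned ring_idx)
{
	struct drm_sched_rq *rq = &ring->sched.sched_rq[priority];
	struct drm_sched_entity *entity = &ctx->rings[ring_idx].entity;
	int r;

	/* bind the context entity to the run queue for this priority */
	r = drm_sched_entity_init(&ring->sched, entity, rq,
				  amdgpu_sched_jobs, &ctx->guilty);
	if (r)
		return r;

	/* per submission: attach the job to that entity and queue it */
	r = drm_sched_job_init(&job->base, &ring->sched, entity, filp);
	if (r)
		return r;

	drm_sched_entity_push_job(&job->base, entity);
	return 0;
}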


@ -277,7 +277,7 @@ static int acp_hw_init(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
const struct amdgpu_ip_block *ip_block =
amdgpu_get_ip_block(adev, AMD_IP_BLOCK_TYPE_ACP);
amdgpu_device_ip_get_ip_block(adev, AMD_IP_BLOCK_TYPE_ACP);
if (!ip_block)
return -EINVAL;


@ -85,7 +85,7 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
kfd2kgd = amdgpu_amdkfd_gfx_8_0_get_functions();
break;
default:
dev_info(adev->dev, "kfd not supported on this ASIC\n");
dev_dbg(adev->dev, "kfd not supported on this ASIC\n");
return;
}
@ -93,6 +93,39 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
adev->pdev, kfd2kgd);
}
/**
* amdgpu_doorbell_get_kfd_info - Report doorbell configuration required to
* setup amdkfd
*
* @adev: amdgpu_device pointer
* @aperture_base: output returning doorbell aperture base physical address
* @aperture_size: output returning doorbell aperture size in bytes
* @start_offset: output returning # of doorbell bytes reserved for amdgpu.
*
* amdgpu and amdkfd share the doorbell aperture. amdgpu sets it up,
* takes doorbells required for its own rings and reports the setup to amdkfd.
* amdgpu reserved doorbells are at the start of the doorbell aperture.
*/
static void amdgpu_doorbell_get_kfd_info(struct amdgpu_device *adev,
phys_addr_t *aperture_base,
size_t *aperture_size,
size_t *start_offset)
{
/*
* The first num_doorbells are used by amdgpu.
* amdkfd takes whatever's left in the aperture.
*/
if (adev->doorbell.size > adev->doorbell.num_doorbells * sizeof(u32)) {
*aperture_base = adev->doorbell.base;
*aperture_size = adev->doorbell.size;
*start_offset = adev->doorbell.num_doorbells * sizeof(u32);
} else {
*aperture_base = 0;
*aperture_size = 0;
*start_offset = 0;
}
}
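The split is purely positional: amdgpu keeps the first num_doorbells 32-bit doorbells, and amdkfd gets everything from start_offset to the end of the aperture. A worked example with invented numbers, not taken from any real ASIC:

/* hypothetical values, for illustration of the reported split only */
size_t aperture_size = 2 * 1024 * 1024;              /* 2 MiB doorbell aperture        */
size_t start_offset  = 1024 * sizeof(u32);           /* 1024 amdgpu doorbells = 4096 B */
size_t kfd_bytes     = aperture_size - start_offset; /* 2093056 bytes left             */
size_t kfd_doorbells = kfd_bytes / sizeof(u32);      /* 523264 doorbells for amdkfd    */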
void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
{
int i;
@ -242,14 +275,34 @@ void free_gtt_mem(struct kgd_dev *kgd, void *mem_obj)
kfree(mem);
}
uint64_t get_vmem_size(struct kgd_dev *kgd)
void get_local_mem_info(struct kgd_dev *kgd,
struct kfd_local_mem_info *mem_info)
{
struct amdgpu_device *adev =
(struct amdgpu_device *)kgd;
struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
uint64_t address_mask = adev->dev->dma_mask ? ~*adev->dev->dma_mask :
~((1ULL << 32) - 1);
resource_size_t aper_limit = adev->mc.aper_base + adev->mc.aper_size;
BUG_ON(kgd == NULL);
memset(mem_info, 0, sizeof(*mem_info));
if (!(adev->mc.aper_base & address_mask || aper_limit & address_mask)) {
mem_info->local_mem_size_public = adev->mc.visible_vram_size;
mem_info->local_mem_size_private = adev->mc.real_vram_size -
adev->mc.visible_vram_size;
} else {
mem_info->local_mem_size_public = 0;
mem_info->local_mem_size_private = adev->mc.real_vram_size;
}
mem_info->vram_width = adev->mc.vram_width;
return adev->mc.real_vram_size;
pr_debug("Address base: %pap limit %pap public 0x%llx private 0x%llx\n",
&adev->mc.aper_base, &aper_limit,
mem_info->local_mem_size_public,
mem_info->local_mem_size_private);
if (amdgpu_sriov_vf(adev))
mem_info->mem_clk_max = adev->clock.default_mclk / 100;
else
mem_info->mem_clk_max = amdgpu_dpm_get_mclk(adev, false) / 100;
}
uint64_t get_gpu_clock_counter(struct kgd_dev *kgd)
@ -265,6 +318,39 @@ uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
{
struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
/* The sclk is in quantas of 10kHz */
return adev->pm.dpm.dyn_state.max_clock_voltage_on_ac.sclk / 100;
/* the sclk is in quantas of 10kHz */
if (amdgpu_sriov_vf(adev))
return adev->clock.default_sclk / 100;
return amdgpu_dpm_get_sclk(adev, false) / 100;
}
void get_cu_info(struct kgd_dev *kgd, struct kfd_cu_info *cu_info)
{
struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
struct amdgpu_cu_info acu_info = adev->gfx.cu_info;
memset(cu_info, 0, sizeof(*cu_info));
if (sizeof(cu_info->cu_bitmap) != sizeof(acu_info.bitmap))
return;
cu_info->cu_active_number = acu_info.number;
cu_info->cu_ao_mask = acu_info.ao_cu_mask;
memcpy(&cu_info->cu_bitmap[0], &acu_info.bitmap[0],
sizeof(acu_info.bitmap));
cu_info->num_shader_engines = adev->gfx.config.max_shader_engines;
cu_info->num_shader_arrays_per_engine = adev->gfx.config.max_sh_per_se;
cu_info->num_cu_per_sh = adev->gfx.config.max_cu_per_sh;
cu_info->simd_per_cu = acu_info.simd_per_cu;
cu_info->max_waves_per_simd = acu_info.max_waves_per_simd;
cu_info->wave_front_size = acu_info.wave_front_size;
cu_info->max_scratch_slots_per_cu = acu_info.max_scratch_slots_per_cu;
cu_info->lds_size = acu_info.lds_size;
}
uint64_t amdgpu_amdkfd_get_vram_usage(struct kgd_dev *kgd)
{
struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
return amdgpu_vram_mgr_usage(&adev->mman.bdev.man[TTM_PL_VRAM]);
}


@ -56,10 +56,13 @@ int alloc_gtt_mem(struct kgd_dev *kgd, size_t size,
void **mem_obj, uint64_t *gpu_addr,
void **cpu_ptr);
void free_gtt_mem(struct kgd_dev *kgd, void *mem_obj);
uint64_t get_vmem_size(struct kgd_dev *kgd);
void get_local_mem_info(struct kgd_dev *kgd,
struct kfd_local_mem_info *mem_info);
uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
void get_cu_info(struct kgd_dev *kgd, struct kfd_cu_info *cu_info);
uint64_t amdgpu_amdkfd_get_vram_usage(struct kgd_dev *kgd);
#define read_user_wptr(mmptr, wptr, dst) \
({ \


@ -105,7 +105,14 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
uint32_t queue_id, uint32_t __user *wptr,
uint32_t wptr_shift, uint32_t wptr_mask,
struct mm_struct *mm);
static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
static int kgd_hqd_dump(struct kgd_dev *kgd,
uint32_t pipe_id, uint32_t queue_id,
uint32_t (**dump)[2], uint32_t *n_regs);
static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
uint32_t __user *wptr, struct mm_struct *mm);
static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
uint32_t engine_id, uint32_t queue_id,
uint32_t (**dump)[2], uint32_t *n_regs);
static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
uint32_t pipe_id, uint32_t queue_id);
@ -166,17 +173,19 @@ static int get_tile_config(struct kgd_dev *kgd,
static const struct kfd2kgd_calls kfd2kgd = {
.init_gtt_mem_allocation = alloc_gtt_mem,
.free_gtt_mem = free_gtt_mem,
.get_vmem_size = get_vmem_size,
.get_local_mem_info = get_local_mem_info,
.get_gpu_clock_counter = get_gpu_clock_counter,
.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
.alloc_pasid = amdgpu_vm_alloc_pasid,
.free_pasid = amdgpu_vm_free_pasid,
.alloc_pasid = amdgpu_pasid_alloc,
.free_pasid = amdgpu_pasid_free,
.program_sh_mem_settings = kgd_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
.init_pipeline = kgd_init_pipeline,
.init_interrupts = kgd_init_interrupts,
.hqd_load = kgd_hqd_load,
.hqd_sdma_load = kgd_hqd_sdma_load,
.hqd_dump = kgd_hqd_dump,
.hqd_sdma_dump = kgd_hqd_sdma_dump,
.hqd_is_occupied = kgd_hqd_is_occupied,
.hqd_sdma_is_occupied = kgd_hqd_sdma_is_occupied,
.hqd_destroy = kgd_hqd_destroy,
@ -191,6 +200,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
.get_fw_version = get_fw_version,
.set_scratch_backing_va = set_scratch_backing_va,
.get_tile_config = get_tile_config,
.get_cu_info = get_cu_info,
.get_vram_usage = amdgpu_amdkfd_get_vram_usage
};
struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void)
@ -375,7 +386,44 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
return 0;
}
static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd)
static int kgd_hqd_dump(struct kgd_dev *kgd,
uint32_t pipe_id, uint32_t queue_id,
uint32_t (**dump)[2], uint32_t *n_regs)
{
struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t i = 0, reg;
#define HQD_N_REGS (35+4)
#define DUMP_REG(addr) do { \
if (WARN_ON_ONCE(i >= HQD_N_REGS)) \
break; \
(*dump)[i][0] = (addr) << 2; \
(*dump)[i++][1] = RREG32(addr); \
} while (0)
*dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
if (*dump == NULL)
return -ENOMEM;
acquire_queue(kgd, pipe_id, queue_id);
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE0);
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE1);
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE2);
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE3);
for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_MQD_CONTROL; reg++)
DUMP_REG(reg);
release_queue(kgd);
WARN_ON_ONCE(i != HQD_N_REGS);
*n_regs = i;
return 0;
}
static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
uint32_t __user *wptr, struct mm_struct *mm)
{
struct amdgpu_device *adev = get_amdgpu_device(kgd);
struct cik_sdma_rlc_registers *m;
@ -410,10 +458,17 @@ static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd)
WREG32(mmSDMA0_GFX_CONTEXT_CNTL, data);
}
WREG32(sdma_base_addr + mmSDMA0_RLC0_DOORBELL,
m->sdma_rlc_doorbell);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_RPTR, 0);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_WPTR, 0);
data = REG_SET_FIELD(m->sdma_rlc_doorbell, SDMA0_RLC0_DOORBELL,
ENABLE, 1);
WREG32(sdma_base_addr + mmSDMA0_RLC0_DOORBELL, data);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_RPTR, m->sdma_rlc_rb_rptr);
if (read_user_wptr(mm, wptr, data))
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_WPTR, data);
else
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_WPTR,
m->sdma_rlc_rb_rptr);
WREG32(sdma_base_addr + mmSDMA0_RLC0_VIRTUAL_ADDR,
m->sdma_rlc_virtual_addr);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_BASE, m->sdma_rlc_rb_base);
@ -423,8 +478,37 @@ static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd)
m->sdma_rlc_rb_rptr_addr_lo);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_RPTR_ADDR_HI,
m->sdma_rlc_rb_rptr_addr_hi);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_CNTL,
m->sdma_rlc_rb_cntl);
data = REG_SET_FIELD(m->sdma_rlc_rb_cntl, SDMA0_RLC0_RB_CNTL,
RB_ENABLE, 1);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_CNTL, data);
return 0;
}
static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
uint32_t engine_id, uint32_t queue_id,
uint32_t (**dump)[2], uint32_t *n_regs)
{
struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t sdma_offset = engine_id * SDMA1_REGISTER_OFFSET +
queue_id * KFD_CIK_SDMA_QUEUE_OFFSET;
uint32_t i = 0, reg;
#undef HQD_N_REGS
#define HQD_N_REGS (19+4)
*dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
if (*dump == NULL)
return -ENOMEM;
for (reg = mmSDMA0_RLC0_RB_CNTL; reg <= mmSDMA0_RLC0_DOORBELL; reg++)
DUMP_REG(sdma_offset + reg);
for (reg = mmSDMA0_RLC0_VIRTUAL_ADDR; reg <= mmSDMA0_RLC0_WATERMARK;
reg++)
DUMP_REG(sdma_offset + reg);
WARN_ON_ONCE(i != HQD_N_REGS);
*n_regs = i;
return 0;
}
@ -575,7 +659,7 @@ static int kgd_hqd_sdma_destroy(struct kgd_dev *kgd, void *mqd,
struct cik_sdma_rlc_registers *m;
uint32_t sdma_base_addr;
uint32_t temp;
int timeout = utimeout;
unsigned long end_jiffies = (utimeout * HZ / 1000) + jiffies;
m = get_sdma_mqd(mqd);
sdma_base_addr = get_sdma_base_addr(m);
@ -588,10 +672,9 @@ static int kgd_hqd_sdma_destroy(struct kgd_dev *kgd, void *mqd,
temp = RREG32(sdma_base_addr + mmSDMA0_RLC0_CONTEXT_STATUS);
if (temp & SDMA0_STATUS_REG__RB_CMD_IDLE__SHIFT)
break;
if (timeout <= 0)
if (time_after(jiffies, end_jiffies))
return -ETIME;
msleep(20);
timeout -= 20;
usleep_range(500, 1000);
}
WREG32(sdma_base_addr + mmSDMA0_RLC0_DOORBELL, 0);
@ -599,6 +682,8 @@ static int kgd_hqd_sdma_destroy(struct kgd_dev *kgd, void *mqd,
RREG32(sdma_base_addr + mmSDMA0_RLC0_RB_CNTL) |
SDMA0_RLC0_RB_CNTL__RB_ENABLE_MASK);
m->sdma_rlc_rb_rptr = RREG32(sdma_base_addr + mmSDMA0_RLC0_RB_RPTR);
return 0;
}
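The timeout handling in these destroy paths changes from counting down a millisecond budget around msleep(20) to polling against a jiffies deadline, which is more accurate and allows the much shorter usleep_range() sleeps. The pattern, shown as a standalone sketch with a stand-in status register and idle bit (msecs_to_jiffies(utimeout) is the same conversion as the open-coded utimeout * HZ / 1000 above):

/* sketch of the deadline-based poll; utimeout is in milliseconds and
 * STANDIN_STATUS_REG / STANDIN_IDLE_MASK are placeholders */
unsigned long end_jiffies = msecs_to_jiffies(utimeout) + jiffies;

while (true) {
	u32 status = RREG32(STANDIN_STATUS_REG);

	if (status & STANDIN_IDLE_MASK)
		break;
	if (time_after(jiffies, end_jiffies))
		return -ETIME;
	usleep_range(500, 1000);	/* 0.5-1 ms between polls */
}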


@ -45,7 +45,7 @@ enum hqd_dequeue_request_type {
RESET_WAVES
};
struct cik_sdma_rlc_registers;
struct vi_sdma_mqd;
/*
* Register access functions
@ -64,7 +64,14 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
uint32_t queue_id, uint32_t __user *wptr,
uint32_t wptr_shift, uint32_t wptr_mask,
struct mm_struct *mm);
static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
static int kgd_hqd_dump(struct kgd_dev *kgd,
uint32_t pipe_id, uint32_t queue_id,
uint32_t (**dump)[2], uint32_t *n_regs);
static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
uint32_t __user *wptr, struct mm_struct *mm);
static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
uint32_t engine_id, uint32_t queue_id,
uint32_t (**dump)[2], uint32_t *n_regs);
static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
uint32_t pipe_id, uint32_t queue_id);
static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
@ -125,17 +132,19 @@ static int get_tile_config(struct kgd_dev *kgd,
static const struct kfd2kgd_calls kfd2kgd = {
.init_gtt_mem_allocation = alloc_gtt_mem,
.free_gtt_mem = free_gtt_mem,
.get_vmem_size = get_vmem_size,
.get_local_mem_info = get_local_mem_info,
.get_gpu_clock_counter = get_gpu_clock_counter,
.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
.alloc_pasid = amdgpu_vm_alloc_pasid,
.free_pasid = amdgpu_vm_free_pasid,
.alloc_pasid = amdgpu_pasid_alloc,
.free_pasid = amdgpu_pasid_free,
.program_sh_mem_settings = kgd_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
.init_pipeline = kgd_init_pipeline,
.init_interrupts = kgd_init_interrupts,
.hqd_load = kgd_hqd_load,
.hqd_sdma_load = kgd_hqd_sdma_load,
.hqd_dump = kgd_hqd_dump,
.hqd_sdma_dump = kgd_hqd_sdma_dump,
.hqd_is_occupied = kgd_hqd_is_occupied,
.hqd_sdma_is_occupied = kgd_hqd_sdma_is_occupied,
.hqd_destroy = kgd_hqd_destroy,
@ -152,6 +161,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
.get_fw_version = get_fw_version,
.set_scratch_backing_va = set_scratch_backing_va,
.get_tile_config = get_tile_config,
.get_cu_info = get_cu_info,
.get_vram_usage = amdgpu_amdkfd_get_vram_usage
};
struct kfd2kgd_calls *amdgpu_amdkfd_gfx_8_0_get_functions(void)
@ -268,9 +279,15 @@ static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id)
return 0;
}
static inline uint32_t get_sdma_base_addr(struct cik_sdma_rlc_registers *m)
static inline uint32_t get_sdma_base_addr(struct vi_sdma_mqd *m)
{
return 0;
uint32_t retval;
retval = m->sdma_engine_id * SDMA1_REGISTER_OFFSET +
m->sdma_queue_id * KFD_VI_SDMA_QUEUE_OFFSET;
pr_debug("kfd: sdma base address: 0x%x\n", retval);
return retval;
}
static inline struct vi_mqd *get_mqd(void *mqd)
@ -278,9 +295,9 @@ static inline struct vi_mqd *get_mqd(void *mqd)
return (struct vi_mqd *)mqd;
}
static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
static inline struct vi_sdma_mqd *get_sdma_mqd(void *mqd)
{
return (struct cik_sdma_rlc_registers *)mqd;
return (struct vi_sdma_mqd *)mqd;
}
static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
@ -358,8 +375,138 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
return 0;
}
static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd)
static int kgd_hqd_dump(struct kgd_dev *kgd,
uint32_t pipe_id, uint32_t queue_id,
uint32_t (**dump)[2], uint32_t *n_regs)
{
struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t i = 0, reg;
#define HQD_N_REGS (54+4)
#define DUMP_REG(addr) do { \
if (WARN_ON_ONCE(i >= HQD_N_REGS)) \
break; \
(*dump)[i][0] = (addr) << 2; \
(*dump)[i++][1] = RREG32(addr); \
} while (0)
*dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
if (*dump == NULL)
return -ENOMEM;
acquire_queue(kgd, pipe_id, queue_id);
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE0);
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE1);
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE2);
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE3);
for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_HQD_EOP_DONES; reg++)
DUMP_REG(reg);
release_queue(kgd);
WARN_ON_ONCE(i != HQD_N_REGS);
*n_regs = i;
return 0;
}
static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
uint32_t __user *wptr, struct mm_struct *mm)
{
struct amdgpu_device *adev = get_amdgpu_device(kgd);
struct vi_sdma_mqd *m;
unsigned long end_jiffies;
uint32_t sdma_base_addr;
uint32_t data;
m = get_sdma_mqd(mqd);
sdma_base_addr = get_sdma_base_addr(m);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_CNTL,
m->sdmax_rlcx_rb_cntl & (~SDMA0_RLC0_RB_CNTL__RB_ENABLE_MASK));
end_jiffies = msecs_to_jiffies(2000) + jiffies;
while (true) {
data = RREG32(sdma_base_addr + mmSDMA0_RLC0_CONTEXT_STATUS);
if (data & SDMA0_RLC0_CONTEXT_STATUS__IDLE_MASK)
break;
if (time_after(jiffies, end_jiffies))
return -ETIME;
usleep_range(500, 1000);
}
if (m->sdma_engine_id) {
data = RREG32(mmSDMA1_GFX_CONTEXT_CNTL);
data = REG_SET_FIELD(data, SDMA1_GFX_CONTEXT_CNTL,
RESUME_CTX, 0);
WREG32(mmSDMA1_GFX_CONTEXT_CNTL, data);
} else {
data = RREG32(mmSDMA0_GFX_CONTEXT_CNTL);
data = REG_SET_FIELD(data, SDMA0_GFX_CONTEXT_CNTL,
RESUME_CTX, 0);
WREG32(mmSDMA0_GFX_CONTEXT_CNTL, data);
}
data = REG_SET_FIELD(m->sdmax_rlcx_doorbell, SDMA0_RLC0_DOORBELL,
ENABLE, 1);
WREG32(sdma_base_addr + mmSDMA0_RLC0_DOORBELL, data);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_RPTR, m->sdmax_rlcx_rb_rptr);
if (read_user_wptr(mm, wptr, data))
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_WPTR, data);
else
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_WPTR,
m->sdmax_rlcx_rb_rptr);
WREG32(sdma_base_addr + mmSDMA0_RLC0_VIRTUAL_ADDR,
m->sdmax_rlcx_virtual_addr);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_BASE, m->sdmax_rlcx_rb_base);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_BASE_HI,
m->sdmax_rlcx_rb_base_hi);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_RPTR_ADDR_LO,
m->sdmax_rlcx_rb_rptr_addr_lo);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_RPTR_ADDR_HI,
m->sdmax_rlcx_rb_rptr_addr_hi);
data = REG_SET_FIELD(m->sdmax_rlcx_rb_cntl, SDMA0_RLC0_RB_CNTL,
RB_ENABLE, 1);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_CNTL, data);
return 0;
}
static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
uint32_t engine_id, uint32_t queue_id,
uint32_t (**dump)[2], uint32_t *n_regs)
{
struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t sdma_offset = engine_id * SDMA1_REGISTER_OFFSET +
queue_id * KFD_VI_SDMA_QUEUE_OFFSET;
uint32_t i = 0, reg;
#undef HQD_N_REGS
#define HQD_N_REGS (19+4+2+3+7)
*dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
if (*dump == NULL)
return -ENOMEM;
for (reg = mmSDMA0_RLC0_RB_CNTL; reg <= mmSDMA0_RLC0_DOORBELL; reg++)
DUMP_REG(sdma_offset + reg);
for (reg = mmSDMA0_RLC0_VIRTUAL_ADDR; reg <= mmSDMA0_RLC0_WATERMARK;
reg++)
DUMP_REG(sdma_offset + reg);
for (reg = mmSDMA0_RLC0_CSA_ADDR_LO; reg <= mmSDMA0_RLC0_CSA_ADDR_HI;
reg++)
DUMP_REG(sdma_offset + reg);
for (reg = mmSDMA0_RLC0_IB_SUB_REMAIN; reg <= mmSDMA0_RLC0_DUMMY_REG;
reg++)
DUMP_REG(sdma_offset + reg);
for (reg = mmSDMA0_RLC0_MIDCMD_DATA0; reg <= mmSDMA0_RLC0_MIDCMD_CNTL;
reg++)
DUMP_REG(sdma_offset + reg);
WARN_ON_ONCE(i != HQD_N_REGS);
*n_regs = i;
return 0;
}
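Both dump callbacks hand back a kmalloc'd array of (byte offset, value) pairs - DUMP_REG stores the dword register index shifted left by 2 - together with the element count in *n_regs, and the caller owns the buffer. A caller on the amdkfd side would be expected to consume it roughly like this (sketch only; the kfd2kgd handle and the surrounding variables are placeholders):

/* illustrative consumer of the dump format produced above */
uint32_t (*dump)[2];
uint32_t n_regs, i;
int r;

r = kfd2kgd->hqd_sdma_dump(kgd, engine_id, queue_id, &dump, &n_regs);
if (r)
	return r;

for (i = 0; i < n_regs; i++)
	pr_debug("reg 0x%08x = 0x%08x\n", dump[i][0], dump[i][1]);

kfree(dump);	/* caller owns the kmalloc'd buffer */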
@ -388,7 +535,7 @@ static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd)
{
struct amdgpu_device *adev = get_amdgpu_device(kgd);
struct cik_sdma_rlc_registers *m;
struct vi_sdma_mqd *m;
uint32_t sdma_base_addr;
uint32_t sdma_rlc_rb_cntl;
@ -509,10 +656,10 @@ static int kgd_hqd_sdma_destroy(struct kgd_dev *kgd, void *mqd,
unsigned int utimeout)
{
struct amdgpu_device *adev = get_amdgpu_device(kgd);
struct cik_sdma_rlc_registers *m;
struct vi_sdma_mqd *m;
uint32_t sdma_base_addr;
uint32_t temp;
int timeout = utimeout;
unsigned long end_jiffies = (utimeout * HZ / 1000) + jiffies;
m = get_sdma_mqd(mqd);
sdma_base_addr = get_sdma_base_addr(m);
@ -523,18 +670,19 @@ static int kgd_hqd_sdma_destroy(struct kgd_dev *kgd, void *mqd,
while (true) {
temp = RREG32(sdma_base_addr + mmSDMA0_RLC0_CONTEXT_STATUS);
if (temp & SDMA0_STATUS_REG__RB_CMD_IDLE__SHIFT)
if (temp & SDMA0_RLC0_CONTEXT_STATUS__IDLE_MASK)
break;
if (timeout <= 0)
if (time_after(jiffies, end_jiffies))
return -ETIME;
msleep(20);
timeout -= 20;
usleep_range(500, 1000);
}
WREG32(sdma_base_addr + mmSDMA0_RLC0_DOORBELL, 0);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_RPTR, 0);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_WPTR, 0);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_BASE, 0);
WREG32(sdma_base_addr + mmSDMA0_RLC0_RB_CNTL,
RREG32(sdma_base_addr + mmSDMA0_RLC0_RB_CNTL) |
SDMA0_RLC0_RB_CNTL__RB_ENABLE_MASK);
m->sdmax_rlcx_rb_rptr = RREG32(sdma_base_addr + mmSDMA0_RLC0_RB_RPTR);
return 0;
}


@ -27,6 +27,7 @@
#include <drm/amdgpu_drm.h>
#include "amdgpu.h"
#include "amdgpu_atombios.h"
#include "amdgpu_atomfirmware.h"
#include "amdgpu_i2c.h"
#include "atom.h"
@ -690,12 +691,12 @@ int amdgpu_atombios_get_clock_info(struct amdgpu_device *adev)
le32_to_cpu(firmware_info->info_21.ulDefaultDispEngineClkFreq);
/* set a reasonable default for DP */
if (adev->clock.default_dispclk < 53900) {
DRM_INFO("Changing default dispclk from %dMhz to 600Mhz\n",
adev->clock.default_dispclk / 100);
DRM_DEBUG("Changing default dispclk from %dMhz to 600Mhz\n",
adev->clock.default_dispclk / 100);
adev->clock.default_dispclk = 60000;
} else if (adev->clock.default_dispclk <= 60000) {
DRM_INFO("Changing default dispclk from %dMhz to 625Mhz\n",
adev->clock.default_dispclk / 100);
DRM_DEBUG("Changing default dispclk from %dMhz to 625Mhz\n",
adev->clock.default_dispclk / 100);
adev->clock.default_dispclk = 62500;
}
adev->clock.dp_extclk =
@ -1699,7 +1700,7 @@ void amdgpu_atombios_scratch_regs_lock(struct amdgpu_device *adev, bool lock)
WREG32(adev->bios_scratch_reg_offset + 6, bios_6_scratch);
}
void amdgpu_atombios_scratch_regs_init(struct amdgpu_device *adev)
static void amdgpu_atombios_scratch_regs_init(struct amdgpu_device *adev)
{
uint32_t bios_2_scratch, bios_6_scratch;
@ -1721,28 +1722,6 @@ void amdgpu_atombios_scratch_regs_init(struct amdgpu_device *adev)
WREG32(adev->bios_scratch_reg_offset + 6, bios_6_scratch);
}
void amdgpu_atombios_scratch_regs_save(struct amdgpu_device *adev)
{
int i;
for (i = 0; i < AMDGPU_BIOS_NUM_SCRATCH; i++)
adev->bios_scratch[i] = RREG32(adev->bios_scratch_reg_offset + i);
}
void amdgpu_atombios_scratch_regs_restore(struct amdgpu_device *adev)
{
int i;
/*
* VBIOS will check ASIC_INIT_COMPLETE bit to decide if
* execute ASIC_Init posting via driver
*/
adev->bios_scratch[7] &= ~ATOM_S7_ASIC_INIT_COMPLETE_MASK;
for (i = 0; i < AMDGPU_BIOS_NUM_SCRATCH; i++)
WREG32(adev->bios_scratch_reg_offset + i, adev->bios_scratch[i]);
}
void amdgpu_atombios_scratch_regs_engine_hung(struct amdgpu_device *adev,
bool hung)
{
@ -1798,7 +1777,7 @@ void amdgpu_atombios_copy_swap(u8 *dst, u8 *src, u8 num_bytes, bool to_le)
#endif
}
int amdgpu_atombios_allocate_fb_scratch(struct amdgpu_device *adev)
static int amdgpu_atombios_allocate_fb_scratch(struct amdgpu_device *adev)
{
struct atom_context *ctx = adev->mode_info.atom_context;
int index = GetIndexIntoMasterTable(DATA, VRAM_UsageByFirmware);
@ -1841,3 +1820,234 @@ int amdgpu_atombios_allocate_fb_scratch(struct amdgpu_device *adev)
ctx->scratch_size_bytes = usage_bytes;
return 0;
}
/* ATOM accessor methods */
/*
* ATOM is an interpreted byte code stored in tables in the vbios. The
* driver registers callbacks to access registers and the interpreter
* in the driver parses the tables and executes then to program specific
* actions (set display modes, asic init, etc.). See amdgpu_atombios.c,
* atombios.h, and atom.c
*/
/**
* cail_pll_read - read PLL register
*
* @info: atom card_info pointer
* @reg: PLL register offset
*
* Provides a PLL register accessor for the atom interpreter (r4xx+).
* Returns the value of the PLL register.
*/
static uint32_t cail_pll_read(struct card_info *info, uint32_t reg)
{
return 0;
}
/**
* cail_pll_write - write PLL register
*
* @info: atom card_info pointer
* @reg: PLL register offset
* @val: value to write to the pll register
*
* Provides a PLL register accessor for the atom interpreter (r4xx+).
*/
static void cail_pll_write(struct card_info *info, uint32_t reg, uint32_t val)
{
}
/**
* cail_mc_read - read MC (Memory Controller) register
*
* @info: atom card_info pointer
* @reg: MC register offset
*
* Provides an MC register accessor for the atom interpreter (r4xx+).
* Returns the value of the MC register.
*/
static uint32_t cail_mc_read(struct card_info *info, uint32_t reg)
{
return 0;
}
/**
* cail_mc_write - write MC (Memory Controller) register
*
* @info: atom card_info pointer
* @reg: MC register offset
* @val: value to write to the pll register
*
* Provides a MC register accessor for the atom interpreter (r4xx+).
*/
static void cail_mc_write(struct card_info *info, uint32_t reg, uint32_t val)
{
}
/**
* cail_reg_write - write MMIO register
*
* @info: atom card_info pointer
* @reg: MMIO register offset
* @val: value to write to the pll register
*
* Provides a MMIO register accessor for the atom interpreter (r4xx+).
*/
static void cail_reg_write(struct card_info *info, uint32_t reg, uint32_t val)
{
struct amdgpu_device *adev = info->dev->dev_private;
WREG32(reg, val);
}
/**
* cail_reg_read - read MMIO register
*
* @info: atom card_info pointer
* @reg: MMIO register offset
*
* Provides an MMIO register accessor for the atom interpreter (r4xx+).
* Returns the value of the MMIO register.
*/
static uint32_t cail_reg_read(struct card_info *info, uint32_t reg)
{
struct amdgpu_device *adev = info->dev->dev_private;
uint32_t r;
r = RREG32(reg);
return r;
}
/**
* cail_ioreg_write - write IO register
*
* @info: atom card_info pointer
* @reg: IO register offset
* @val: value to write to the pll register
*
* Provides a IO register accessor for the atom interpreter (r4xx+).
*/
static void cail_ioreg_write(struct card_info *info, uint32_t reg, uint32_t val)
{
struct amdgpu_device *adev = info->dev->dev_private;
WREG32_IO(reg, val);
}
/**
* cail_ioreg_read - read IO register
*
* @info: atom card_info pointer
* @reg: IO register offset
*
* Provides an IO register accessor for the atom interpreter (r4xx+).
* Returns the value of the IO register.
*/
static uint32_t cail_ioreg_read(struct card_info *info, uint32_t reg)
{
struct amdgpu_device *adev = info->dev->dev_private;
uint32_t r;
r = RREG32_IO(reg);
return r;
}
static ssize_t amdgpu_atombios_get_vbios_version(struct device *dev,
struct device_attribute *attr,
char *buf)
{
struct drm_device *ddev = dev_get_drvdata(dev);
struct amdgpu_device *adev = ddev->dev_private;
struct atom_context *ctx = adev->mode_info.atom_context;
return snprintf(buf, PAGE_SIZE, "%s\n", ctx->vbios_version);
}
static DEVICE_ATTR(vbios_version, 0444, amdgpu_atombios_get_vbios_version,
NULL);
/**
* amdgpu_atombios_fini - free the driver info and callbacks for atombios
*
* @adev: amdgpu_device pointer
*
* Frees the driver info and register access callbacks for the ATOM
* interpreter (r4xx+).
* Called at driver shutdown.
*/
void amdgpu_atombios_fini(struct amdgpu_device *adev)
{
if (adev->mode_info.atom_context) {
kfree(adev->mode_info.atom_context->scratch);
kfree(adev->mode_info.atom_context->iio);
}
kfree(adev->mode_info.atom_context);
adev->mode_info.atom_context = NULL;
kfree(adev->mode_info.atom_card_info);
adev->mode_info.atom_card_info = NULL;
device_remove_file(adev->dev, &dev_attr_vbios_version);
}
/**
* amdgpu_atombios_init - init the driver info and callbacks for atombios
*
* @adev: amdgpu_device pointer
*
* Initializes the driver info and register access callbacks for the
* ATOM interpreter (r4xx+).
* Returns 0 on success, -ENOMEM on failure.
* Called at driver startup.
*/
int amdgpu_atombios_init(struct amdgpu_device *adev)
{
struct card_info *atom_card_info =
kzalloc(sizeof(struct card_info), GFP_KERNEL);
int ret;
if (!atom_card_info)
return -ENOMEM;
adev->mode_info.atom_card_info = atom_card_info;
atom_card_info->dev = adev->ddev;
atom_card_info->reg_read = cail_reg_read;
atom_card_info->reg_write = cail_reg_write;
/* needed for iio ops */
if (adev->rio_mem) {
atom_card_info->ioreg_read = cail_ioreg_read;
atom_card_info->ioreg_write = cail_ioreg_write;
} else {
DRM_DEBUG("PCI I/O BAR is not found. Using MMIO to access ATOM BIOS\n");
atom_card_info->ioreg_read = cail_reg_read;
atom_card_info->ioreg_write = cail_reg_write;
}
atom_card_info->mc_read = cail_mc_read;
atom_card_info->mc_write = cail_mc_write;
atom_card_info->pll_read = cail_pll_read;
atom_card_info->pll_write = cail_pll_write;
adev->mode_info.atom_context = amdgpu_atom_parse(atom_card_info, adev->bios);
if (!adev->mode_info.atom_context) {
amdgpu_atombios_fini(adev);
return -ENOMEM;
}
mutex_init(&adev->mode_info.atom_context->mutex);
if (adev->is_atom_fw) {
amdgpu_atomfirmware_scratch_regs_init(adev);
amdgpu_atomfirmware_allocate_fb_scratch(adev);
} else {
amdgpu_atombios_scratch_regs_init(adev);
amdgpu_atombios_allocate_fb_scratch(adev);
}
ret = device_create_file(adev->dev, &dev_attr_vbios_version);
if (ret) {
DRM_ERROR("Failed to create device file for VBIOS version\n");
return ret;
}
return 0;
}


@ -195,9 +195,6 @@ int amdgpu_atombios_init_mc_reg_table(struct amdgpu_device *adev,
bool amdgpu_atombios_has_gpu_virtualization_table(struct amdgpu_device *adev);
void amdgpu_atombios_scratch_regs_lock(struct amdgpu_device *adev, bool lock);
void amdgpu_atombios_scratch_regs_init(struct amdgpu_device *adev);
void amdgpu_atombios_scratch_regs_save(struct amdgpu_device *adev);
void amdgpu_atombios_scratch_regs_restore(struct amdgpu_device *adev);
void amdgpu_atombios_scratch_regs_engine_hung(struct amdgpu_device *adev,
bool hung);
bool amdgpu_atombios_scratch_need_asic_init(struct amdgpu_device *adev);
@ -219,6 +216,7 @@ int amdgpu_atombios_get_svi2_info(struct amdgpu_device *adev,
u8 voltage_type,
u8 *svd_gpio_id, u8 *svc_gpio_id);
int amdgpu_atombios_allocate_fb_scratch(struct amdgpu_device *adev);
void amdgpu_atombios_fini(struct amdgpu_device *adev);
int amdgpu_atombios_init(struct amdgpu_device *adev);
#endif


@ -14,6 +14,16 @@
#include "amd_acpi.h"
#define AMDGPU_PX_QUIRK_FORCE_ATPX (1 << 0)
struct amdgpu_px_quirk {
u32 chip_vendor;
u32 chip_device;
u32 subsys_vendor;
u32 subsys_device;
u32 px_quirk_flags;
};
struct amdgpu_atpx_functions {
bool px_params;
bool power_cntl;
@ -35,6 +45,7 @@ struct amdgpu_atpx {
static struct amdgpu_atpx_priv {
bool atpx_detected;
bool bridge_pm_usable;
unsigned int quirks;
/* handle for device - and atpx */
acpi_handle dhandle;
acpi_handle other_handle;
@ -205,13 +216,19 @@ static int amdgpu_atpx_validate(struct amdgpu_atpx *atpx)
atpx->is_hybrid = false;
if (valid_bits & ATPX_MS_HYBRID_GFX_SUPPORTED) {
printk("ATPX Hybrid Graphics\n");
/*
* Disable legacy PM methods only when pcie port PM is usable,
* otherwise the device might fail to power off or power on.
*/
atpx->functions.power_cntl = !amdgpu_atpx_priv.bridge_pm_usable;
atpx->is_hybrid = true;
if (amdgpu_atpx_priv.quirks & AMDGPU_PX_QUIRK_FORCE_ATPX) {
printk("ATPX Hybrid Graphics, forcing to ATPX\n");
atpx->functions.power_cntl = true;
atpx->is_hybrid = false;
} else {
printk("ATPX Hybrid Graphics\n");
/*
* Disable legacy PM methods only when pcie port PM is usable,
* otherwise the device might fail to power off or power on.
*/
atpx->functions.power_cntl = !amdgpu_atpx_priv.bridge_pm_usable;
atpx->is_hybrid = true;
}
}
atpx->dgpu_req_power_for_displays = false;
@ -547,6 +564,30 @@ static const struct vga_switcheroo_handler amdgpu_atpx_handler = {
.get_client_id = amdgpu_atpx_get_client_id,
};
static const struct amdgpu_px_quirk amdgpu_px_quirk_list[] = {
/* HG _PR3 doesn't seem to work on this A+A weston board */
{ 0x1002, 0x6900, 0x1002, 0x0124, AMDGPU_PX_QUIRK_FORCE_ATPX },
{ 0x1002, 0x6900, 0x1028, 0x0812, AMDGPU_PX_QUIRK_FORCE_ATPX },
{ 0, 0, 0, 0, 0 },
};
static void amdgpu_atpx_get_quirks(struct pci_dev *pdev)
{
const struct amdgpu_px_quirk *p = amdgpu_px_quirk_list;
/* Apply PX quirks */
while (p && p->chip_device != 0) {
if (pdev->vendor == p->chip_vendor &&
pdev->device == p->chip_device &&
pdev->subsystem_vendor == p->subsys_vendor &&
pdev->subsystem_device == p->subsys_device) {
amdgpu_atpx_priv.quirks |= p->px_quirk_flags;
break;
}
++p;
}
}
/**
* amdgpu_atpx_detect - detect whether we have PX
*
@ -570,6 +611,7 @@ static bool amdgpu_atpx_detect(void)
parent_pdev = pci_upstream_bridge(pdev);
d3_supported |= parent_pdev && parent_pdev->bridge_d3;
amdgpu_atpx_get_quirks(pdev);
}
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_OTHER << 8, pdev)) != NULL) {
@ -579,6 +621,7 @@ static bool amdgpu_atpx_detect(void)
parent_pdev = pci_upstream_bridge(pdev);
d3_supported |= parent_pdev && parent_pdev->bridge_d3;
amdgpu_atpx_get_quirks(pdev);
}
if (has_atpx && vga_count == 2) {


@ -93,7 +93,7 @@ static bool igp_read_bios_from_vram(struct amdgpu_device *adev)
resource_size_t size = 256 * 1024; /* ??? */
if (!(adev->flags & AMD_IS_APU))
if (amdgpu_need_post(adev))
if (amdgpu_device_need_post(adev))
return false;
adev->bios = NULL;


@ -801,6 +801,11 @@ static int amdgpu_cgs_get_firmware_info(struct cgs_device *cgs_device,
else
strcpy(fw_name, "amdgpu/vega10_smc.bin");
break;
case CHIP_CARRIZO:
case CHIP_STONEY:
case CHIP_RAVEN:
adev->pm.fw_version = info->version;
return 0;
default:
DRM_ERROR("SMC firmware not supported\n");
return -EINVAL;
@ -948,7 +953,6 @@ static int amdgpu_cgs_get_active_displays_info(struct cgs_device *cgs_device,
(amdgpu_crtc->v_border * 2);
mode_info->vblank_time_us = vblank_lines * line_time_us;
mode_info->refresh_rate = drm_mode_vrefresh(&amdgpu_crtc->hw_mode);
mode_info->ref_clock = adev->clock.spll.reference_freq;
mode_info = NULL;
}
}
@ -958,7 +962,6 @@ static int amdgpu_cgs_get_active_displays_info(struct cgs_device *cgs_device,
if (mode_info != NULL) {
mode_info->vblank_time_us = adev->pm.pm_display_cfg.min_vblank_time;
mode_info->refresh_rate = adev->pm.pm_display_cfg.vrefresh;
mode_info->ref_clock = adev->clock.spll.reference_freq;
}
}
return 0;


@ -358,7 +358,6 @@ static int amdgpu_connector_ddc_get_modes(struct drm_connector *connector)
if (amdgpu_connector->edid) {
drm_mode_connector_update_edid_property(connector, amdgpu_connector->edid);
ret = drm_add_edid_modes(connector, amdgpu_connector->edid);
drm_edid_to_eld(connector, amdgpu_connector->edid);
return ret;
}
drm_mode_connector_update_edid_property(connector, NULL);


@ -90,6 +90,12 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, void *data)
goto free_chunk;
}
/* skip guilty context job */
if (atomic_read(&p->ctx->guilty) == 1) {
ret = -ECANCELED;
goto free_chunk;
}
mutex_lock(&p->ctx->lock);
/* get chunks */
@ -337,7 +343,12 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
struct amdgpu_bo *bo)
{
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
u64 initial_bytes_moved, bytes_moved;
struct ttm_operation_ctx ctx = {
.interruptible = true,
.no_wait_gpu = false,
.allow_reserved_eviction = false,
.resv = bo->tbo.resv
};
uint32_t domain;
int r;
@ -367,15 +378,13 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
retry:
amdgpu_ttm_placement_from_domain(bo, domain);
initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
bytes_moved = atomic64_read(&adev->num_bytes_moved) -
initial_bytes_moved;
p->bytes_moved += bytes_moved;
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
p->bytes_moved += ctx.bytes_moved;
if (adev->mc.visible_vram_size < adev->mc.real_vram_size &&
bo->tbo.mem.mem_type == TTM_PL_VRAM &&
bo->tbo.mem.start < adev->mc.visible_vram_size >> PAGE_SHIFT)
p->bytes_moved_vis += bytes_moved;
p->bytes_moved_vis += ctx.bytes_moved;
if (unlikely(r == -ENOMEM) && domain != bo->allowed_domains) {
domain = bo->allowed_domains;
@ -390,6 +399,7 @@ static bool amdgpu_cs_try_evict(struct amdgpu_cs_parser *p,
struct amdgpu_bo *validated)
{
uint32_t domain = validated->allowed_domains;
struct ttm_operation_ctx ctx = { true, false };
int r;
if (!p->evictable)
@ -431,7 +441,7 @@ static bool amdgpu_cs_try_evict(struct amdgpu_cs_parser *p,
bo->tbo.mem.mem_type == TTM_PL_VRAM &&
bo->tbo.mem.start < adev->mc.visible_vram_size >> PAGE_SHIFT;
initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
bytes_moved = atomic64_read(&adev->num_bytes_moved) -
initial_bytes_moved;
p->bytes_moved += bytes_moved;
@ -470,6 +480,7 @@ static int amdgpu_cs_validate(void *param, struct amdgpu_bo *bo)
static int amdgpu_cs_list_validate(struct amdgpu_cs_parser *p,
struct list_head *validated)
{
struct ttm_operation_ctx ctx = { true, false };
struct amdgpu_bo_list_entry *lobj;
int r;
@ -487,8 +498,7 @@ static int amdgpu_cs_list_validate(struct amdgpu_cs_parser *p,
lobj->user_pages) {
amdgpu_ttm_placement_from_domain(bo,
AMDGPU_GEM_DOMAIN_CPU);
r = ttm_bo_validate(&bo->tbo, &bo->placement, true,
false);
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
if (r)
return r;
amdgpu_ttm_tt_set_user_pages(bo->tbo.ttm,
@ -678,7 +688,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
if (!r && p->uf_entry.robj) {
struct amdgpu_bo *uf = p->uf_entry.robj;
r = amdgpu_ttm_bind(&uf->tbo, &uf->tbo.mem);
r = amdgpu_ttm_alloc_gart(&uf->tbo);
p->job->uf_addr += amdgpu_bo_gpu_offset(uf);
}
@ -768,10 +778,6 @@ static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)
struct amdgpu_bo *bo;
int i, r;
r = amdgpu_vm_update_directories(adev, vm);
if (r)
return r;
r = amdgpu_vm_clear_freed(adev, vm, NULL);
if (r)
return r;
@ -781,7 +787,7 @@ static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)
return r;
r = amdgpu_sync_fence(adev, &p->job->sync,
fpriv->prt_va->last_pt_update);
fpriv->prt_va->last_pt_update, false);
if (r)
return r;
@ -795,7 +801,7 @@ static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)
return r;
f = bo_va->last_pt_update;
r = amdgpu_sync_fence(adev, &p->job->sync, f);
r = amdgpu_sync_fence(adev, &p->job->sync, f, false);
if (r)
return r;
}
@ -818,7 +824,7 @@ static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)
return r;
f = bo_va->last_pt_update;
r = amdgpu_sync_fence(adev, &p->job->sync, f);
r = amdgpu_sync_fence(adev, &p->job->sync, f, false);
if (r)
return r;
}
@ -829,7 +835,11 @@ static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)
if (r)
return r;
r = amdgpu_sync_fence(adev, &p->job->sync, vm->last_update);
r = amdgpu_vm_update_directories(adev, vm);
if (r)
return r;
r = amdgpu_sync_fence(adev, &p->job->sync, vm->last_update, false);
if (r)
return r;
@ -865,8 +875,8 @@ static int amdgpu_cs_ib_vm_chunk(struct amdgpu_device *adev,
struct amdgpu_bo_va_mapping *m;
struct amdgpu_bo *aobj = NULL;
struct amdgpu_cs_chunk *chunk;
uint64_t offset, va_start;
struct amdgpu_ib *ib;
uint64_t offset;
uint8_t *kptr;
chunk = &p->chunks[i];
@ -876,14 +886,14 @@ static int amdgpu_cs_ib_vm_chunk(struct amdgpu_device *adev,
if (chunk->chunk_id != AMDGPU_CHUNK_ID_IB)
continue;
r = amdgpu_cs_find_mapping(p, chunk_ib->va_start,
&aobj, &m);
va_start = chunk_ib->va_start & AMDGPU_VA_HOLE_MASK;
r = amdgpu_cs_find_mapping(p, va_start, &aobj, &m);
if (r) {
DRM_ERROR("IB va_start is invalid\n");
return r;
}
if ((chunk_ib->va_start + chunk_ib->ib_bytes) >
if ((va_start + chunk_ib->ib_bytes) >
(m->last + 1) * AMDGPU_GPU_PAGE_SIZE) {
DRM_ERROR("IB va_start+ib_bytes is invalid\n");
return -EINVAL;
@ -896,7 +906,7 @@ static int amdgpu_cs_ib_vm_chunk(struct amdgpu_device *adev,
}
offset = m->start * AMDGPU_GPU_PAGE_SIZE;
kptr += chunk_ib->va_start - offset;
kptr += va_start - offset;
memcpy(ib->ptr, kptr, chunk_ib->ib_bytes);
amdgpu_bo_kunmap(aobj);
@ -1033,8 +1043,8 @@ static int amdgpu_cs_process_fence_dep(struct amdgpu_cs_parser *p,
amdgpu_ctx_put(ctx);
return r;
} else if (fence) {
r = amdgpu_sync_fence(p->adev, &p->job->sync,
fence);
r = amdgpu_sync_fence(p->adev, &p->job->sync, fence,
true);
dma_fence_put(fence);
amdgpu_ctx_put(ctx);
if (r)
@ -1053,7 +1063,7 @@ static int amdgpu_syncobj_lookup_and_add_to_sync(struct amdgpu_cs_parser *p,
if (r)
return r;
r = amdgpu_sync_fence(p->adev, &p->job->sync, fence);
r = amdgpu_sync_fence(p->adev, &p->job->sync, fence, true);
dma_fence_put(fence);
return r;
@ -1145,7 +1155,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
union drm_amdgpu_cs *cs)
{
struct amdgpu_ring *ring = p->job->ring;
struct amd_sched_entity *entity = &p->ctx->rings[ring->idx].entity;
struct drm_sched_entity *entity = &p->ctx->rings[ring->idx].entity;
struct amdgpu_job *job;
unsigned i;
uint64_t seq;
@ -1168,7 +1178,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
job = p->job;
p->job = NULL;
r = amd_sched_job_init(&job->base, &ring->sched, entity, p->filp);
r = drm_sched_job_init(&job->base, &ring->sched, entity, p->filp);
if (r) {
amdgpu_job_free(job);
amdgpu_mn_unlock(p->mn);
@ -1194,11 +1204,10 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
job->uf_sequence = seq;
amdgpu_job_free_resources(job);
amdgpu_ring_priority_get(job->ring,
amd_sched_get_job_priority(&job->base));
amdgpu_ring_priority_get(job->ring, job->base.s_priority);
trace_amdgpu_cs_ioctl(job);
amd_sched_entity_push_job(&job->base);
drm_sched_entity_push_job(&job->base, entity);
ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
amdgpu_mn_unlock(p->mn);
@ -1570,6 +1579,7 @@ int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
struct amdgpu_bo_va_mapping **map)
{
struct amdgpu_fpriv *fpriv = parser->filp->driver_priv;
struct ttm_operation_ctx ctx = { false, false };
struct amdgpu_vm *vm = &fpriv->vm;
struct amdgpu_bo_va_mapping *mapping;
int r;
@ -1590,11 +1600,10 @@ int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
if (!((*bo)->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)) {
(*bo)->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
amdgpu_ttm_placement_from_domain(*bo, (*bo)->allowed_domains);
r = ttm_bo_validate(&(*bo)->tbo, &(*bo)->placement, false,
false);
r = ttm_bo_validate(&(*bo)->tbo, &(*bo)->placement, &ctx);
if (r)
return r;
}
return amdgpu_ttm_bind(&(*bo)->tbo, &(*bo)->tbo.mem);
return amdgpu_ttm_alloc_gart(&(*bo)->tbo);
}


@ -28,10 +28,10 @@
#include "amdgpu_sched.h"
static int amdgpu_ctx_priority_permit(struct drm_file *filp,
enum amd_sched_priority priority)
enum drm_sched_priority priority)
{
/* NORMAL and below are accessible by everyone */
if (priority <= AMD_SCHED_PRIORITY_NORMAL)
if (priority <= DRM_SCHED_PRIORITY_NORMAL)
return 0;
if (capable(CAP_SYS_NICE))
@ -44,14 +44,14 @@ static int amdgpu_ctx_priority_permit(struct drm_file *filp,
}
static int amdgpu_ctx_init(struct amdgpu_device *adev,
enum amd_sched_priority priority,
enum drm_sched_priority priority,
struct drm_file *filp,
struct amdgpu_ctx *ctx)
{
unsigned i, j;
int r;
if (priority < 0 || priority >= AMD_SCHED_PRIORITY_MAX)
if (priority < 0 || priority >= DRM_SCHED_PRIORITY_MAX)
return -EINVAL;
r = amdgpu_ctx_priority_permit(filp, priority);
@ -75,22 +75,23 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
}
ctx->reset_counter = atomic_read(&adev->gpu_reset_counter);
ctx->reset_counter_query = ctx->reset_counter;
ctx->vram_lost_counter = atomic_read(&adev->vram_lost_counter);
ctx->init_priority = priority;
ctx->override_priority = AMD_SCHED_PRIORITY_UNSET;
ctx->override_priority = DRM_SCHED_PRIORITY_UNSET;
/* create context entity for each ring */
for (i = 0; i < adev->num_rings; i++) {
struct amdgpu_ring *ring = adev->rings[i];
struct amd_sched_rq *rq;
struct drm_sched_rq *rq;
rq = &ring->sched.sched_rq[priority];
if (ring == &adev->gfx.kiq.ring)
continue;
r = amd_sched_entity_init(&ring->sched, &ctx->rings[i].entity,
rq, amdgpu_sched_jobs);
r = drm_sched_entity_init(&ring->sched, &ctx->rings[i].entity,
rq, amdgpu_sched_jobs, &ctx->guilty);
if (r)
goto failed;
}
@ -103,7 +104,7 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
failed:
for (j = 0; j < i; j++)
amd_sched_entity_fini(&adev->rings[j]->sched,
drm_sched_entity_fini(&adev->rings[j]->sched,
&ctx->rings[j].entity);
kfree(ctx->fences);
ctx->fences = NULL;
@ -125,7 +126,7 @@ static void amdgpu_ctx_fini(struct amdgpu_ctx *ctx)
ctx->fences = NULL;
for (i = 0; i < adev->num_rings; i++)
amd_sched_entity_fini(&adev->rings[i]->sched,
drm_sched_entity_fini(&adev->rings[i]->sched,
&ctx->rings[i].entity);
amdgpu_queue_mgr_fini(adev, &ctx->queue_mgr);
@ -136,7 +137,7 @@ static void amdgpu_ctx_fini(struct amdgpu_ctx *ctx)
static int amdgpu_ctx_alloc(struct amdgpu_device *adev,
struct amdgpu_fpriv *fpriv,
struct drm_file *filp,
enum amd_sched_priority priority,
enum drm_sched_priority priority,
uint32_t *id)
{
struct amdgpu_ctx_mgr *mgr = &fpriv->ctx_mgr;
@ -216,11 +217,45 @@ static int amdgpu_ctx_query(struct amdgpu_device *adev,
/* determine if a GPU reset has occured since the last call */
reset_counter = atomic_read(&adev->gpu_reset_counter);
/* TODO: this should ideally return NO, GUILTY, or INNOCENT. */
if (ctx->reset_counter == reset_counter)
if (ctx->reset_counter_query == reset_counter)
out->state.reset_status = AMDGPU_CTX_NO_RESET;
else
out->state.reset_status = AMDGPU_CTX_UNKNOWN_RESET;
ctx->reset_counter = reset_counter;
ctx->reset_counter_query = reset_counter;
mutex_unlock(&mgr->lock);
return 0;
}
static int amdgpu_ctx_query2(struct amdgpu_device *adev,
struct amdgpu_fpriv *fpriv, uint32_t id,
union drm_amdgpu_ctx_out *out)
{
struct amdgpu_ctx *ctx;
struct amdgpu_ctx_mgr *mgr;
if (!fpriv)
return -EINVAL;
mgr = &fpriv->ctx_mgr;
mutex_lock(&mgr->lock);
ctx = idr_find(&mgr->ctx_handles, id);
if (!ctx) {
mutex_unlock(&mgr->lock);
return -EINVAL;
}
out->state.flags = 0x0;
out->state.hangs = 0x0;
if (ctx->reset_counter != atomic_read(&adev->gpu_reset_counter))
out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RESET;
if (ctx->vram_lost_counter != atomic_read(&adev->vram_lost_counter))
out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_VRAMLOST;
if (atomic_read(&ctx->guilty))
out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_GUILTY;
mutex_unlock(&mgr->lock);
return 0;
@ -231,7 +266,7 @@ int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
{
int r;
uint32_t id;
enum amd_sched_priority priority;
enum drm_sched_priority priority;
union drm_amdgpu_ctx *args = data;
struct amdgpu_device *adev = dev->dev_private;
@ -243,8 +278,8 @@ int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
/* For backwards compatibility reasons, we need to accept
* ioctls with garbage in the priority field */
if (priority == AMD_SCHED_PRIORITY_INVALID)
priority = AMD_SCHED_PRIORITY_NORMAL;
if (priority == DRM_SCHED_PRIORITY_INVALID)
priority = DRM_SCHED_PRIORITY_NORMAL;
switch (args->in.op) {
case AMDGPU_CTX_OP_ALLOC_CTX:
@ -257,6 +292,9 @@ int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
case AMDGPU_CTX_OP_QUERY_STATE:
r = amdgpu_ctx_query(adev, fpriv, id, &args->out);
break;
case AMDGPU_CTX_OP_QUERY_STATE2:
r = amdgpu_ctx_query2(adev, fpriv, id, &args->out);
break;
default:
return -EINVAL;
}
@ -347,18 +385,18 @@ struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx *ctx,
}
void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
enum amd_sched_priority priority)
enum drm_sched_priority priority)
{
int i;
struct amdgpu_device *adev = ctx->adev;
struct amd_sched_rq *rq;
struct amd_sched_entity *entity;
struct drm_sched_rq *rq;
struct drm_sched_entity *entity;
struct amdgpu_ring *ring;
enum amd_sched_priority ctx_prio;
enum drm_sched_priority ctx_prio;
ctx->override_priority = priority;
ctx_prio = (ctx->override_priority == AMD_SCHED_PRIORITY_UNSET) ?
ctx_prio = (ctx->override_priority == DRM_SCHED_PRIORITY_UNSET) ?
ctx->init_priority : ctx->override_priority;
for (i = 0; i < adev->num_rings; i++) {
@ -369,7 +407,7 @@ void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
continue;
amd_sched_entity_set_rq(entity, rq);
drm_sched_entity_set_rq(entity, rq);
}
}


@ -0,0 +1,792 @@
/*
* Copyright 2008 Advanced Micro Devices, Inc.
* Copyright 2008 Red Hat Inc.
* Copyright 2009 Jerome Glisse.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include <linux/kthread.h>
#include <drm/drmP.h>
#include <linux/debugfs.h>
#include "amdgpu.h"
/*
* Debugfs
*/
int amdgpu_debugfs_add_files(struct amdgpu_device *adev,
const struct drm_info_list *files,
unsigned nfiles)
{
unsigned i;
for (i = 0; i < adev->debugfs_count; i++) {
if (adev->debugfs[i].files == files) {
/* Already registered */
return 0;
}
}
i = adev->debugfs_count + 1;
if (i > AMDGPU_DEBUGFS_MAX_COMPONENTS) {
DRM_ERROR("Reached maximum number of debugfs components.\n");
DRM_ERROR("Report so we increase "
"AMDGPU_DEBUGFS_MAX_COMPONENTS.\n");
return -EINVAL;
}
adev->debugfs[adev->debugfs_count].files = files;
adev->debugfs[adev->debugfs_count].num_files = nfiles;
adev->debugfs_count = i;
#if defined(CONFIG_DEBUG_FS)
drm_debugfs_create_files(files, nfiles,
adev->ddev->primary->debugfs_root,
adev->ddev->primary);
#endif
return 0;
}
#if defined(CONFIG_DEBUG_FS)
static ssize_t amdgpu_debugfs_regs_read(struct file *f, char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = file_inode(f)->i_private;
ssize_t result = 0;
int r;
bool pm_pg_lock, use_bank;
unsigned instance_bank, sh_bank, se_bank;
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
/* are we reading registers for which a PG lock is necessary? */
pm_pg_lock = (*pos >> 23) & 1;
if (*pos & (1ULL << 62)) {
se_bank = (*pos & GENMASK_ULL(33, 24)) >> 24;
sh_bank = (*pos & GENMASK_ULL(43, 34)) >> 34;
instance_bank = (*pos & GENMASK_ULL(53, 44)) >> 44;
if (se_bank == 0x3FF)
se_bank = 0xFFFFFFFF;
if (sh_bank == 0x3FF)
sh_bank = 0xFFFFFFFF;
if (instance_bank == 0x3FF)
instance_bank = 0xFFFFFFFF;
use_bank = 1;
} else {
use_bank = 0;
}
*pos &= (1UL << 22) - 1;
if (use_bank) {
if ((sh_bank != 0xFFFFFFFF && sh_bank >= adev->gfx.config.max_sh_per_se) ||
(se_bank != 0xFFFFFFFF && se_bank >= adev->gfx.config.max_shader_engines))
return -EINVAL;
mutex_lock(&adev->grbm_idx_mutex);
amdgpu_gfx_select_se_sh(adev, se_bank,
sh_bank, instance_bank);
}
if (pm_pg_lock)
mutex_lock(&adev->pm.mutex);
while (size) {
uint32_t value;
if (*pos > adev->rmmio_size)
goto end;
value = RREG32(*pos >> 2);
r = put_user(value, (uint32_t *)buf);
if (r) {
result = r;
goto end;
}
result += 4;
buf += 4;
*pos += 4;
size -= 4;
}
end:
if (use_bank) {
amdgpu_gfx_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
mutex_unlock(&adev->grbm_idx_mutex);
}
if (pm_pg_lock)
mutex_unlock(&adev->pm.mutex);
return result;
}
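
To make the offset encoding above concrete, here is a hypothetical user-space sketch that composes such a file position and reads one banked register; the debugfs path and the register offset 0x2440 are assumptions, not something defined by this patch.

/*
 * Hypothetical user-space sketch: read one banked GFX register through the
 * amdgpu_regs debugfs file.  The bit layout of the file offset mirrors the
 * decoding in amdgpu_debugfs_regs_read() above:
 *   bits  0-21  register byte offset (dword offset << 2)
 *   bit  23     take the PM mutex (power-gating lock) for the access
 *   bits 24-33  SE bank      (0x3FF selects broadcast)
 *   bits 34-43  SH bank      (0x3FF selects broadcast)
 *   bits 44-53  instance     (0x3FF selects broadcast)
 *   bit  62     enable explicit bank selection
 */
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

static uint64_t amdgpu_regs_pos(uint32_t dw_offset, uint64_t se, uint64_t sh,
                                uint64_t instance)
{
        return ((uint64_t)dw_offset << 2) | (se << 24) | (sh << 34) |
               (instance << 44) | (1ULL << 62);
}

int main(void)
{
        int fd = open("/sys/kernel/debug/dri/0/amdgpu_regs", O_RDONLY);
        uint32_t value;

        if (fd < 0)
                return 1;
        if (pread(fd, &value, sizeof(value),
                  amdgpu_regs_pos(0x2440, 0, 0, 0)) == sizeof(value))
                printf("reg 0x2440 = 0x%08x\n", value);
        close(fd);
        return 0;
}
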
static ssize_t amdgpu_debugfs_regs_write(struct file *f, const char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = file_inode(f)->i_private;
ssize_t result = 0;
int r;
bool pm_pg_lock, use_bank;
unsigned instance_bank, sh_bank, se_bank;
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
/* are we writing registers for which a PG lock is necessary? */
pm_pg_lock = (*pos >> 23) & 1;
if (*pos & (1ULL << 62)) {
se_bank = (*pos & GENMASK_ULL(33, 24)) >> 24;
sh_bank = (*pos & GENMASK_ULL(43, 34)) >> 34;
instance_bank = (*pos & GENMASK_ULL(53, 44)) >> 44;
if (se_bank == 0x3FF)
se_bank = 0xFFFFFFFF;
if (sh_bank == 0x3FF)
sh_bank = 0xFFFFFFFF;
if (instance_bank == 0x3FF)
instance_bank = 0xFFFFFFFF;
use_bank = 1;
} else {
use_bank = 0;
}
*pos &= (1UL << 22) - 1;
if (use_bank) {
if ((sh_bank != 0xFFFFFFFF && sh_bank >= adev->gfx.config.max_sh_per_se) ||
(se_bank != 0xFFFFFFFF && se_bank >= adev->gfx.config.max_shader_engines))
return -EINVAL;
mutex_lock(&adev->grbm_idx_mutex);
amdgpu_gfx_select_se_sh(adev, se_bank,
sh_bank, instance_bank);
}
if (pm_pg_lock)
mutex_lock(&adev->pm.mutex);
while (size) {
uint32_t value;
if (*pos > adev->rmmio_size)
goto end;
r = get_user(value, (uint32_t *)buf);
if (r) {
result = r;
goto end;
}
WREG32(*pos >> 2, value);
result += 4;
buf += 4;
*pos += 4;
size -= 4;
}
end:
if (use_bank) {
amdgpu_gfx_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
mutex_unlock(&adev->grbm_idx_mutex);
}
if (pm_pg_lock)
mutex_unlock(&adev->pm.mutex);
return result;
}
static ssize_t amdgpu_debugfs_regs_pcie_read(struct file *f, char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = file_inode(f)->i_private;
ssize_t result = 0;
int r;
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
while (size) {
uint32_t value;
value = RREG32_PCIE(*pos >> 2);
r = put_user(value, (uint32_t *)buf);
if (r)
return r;
result += 4;
buf += 4;
*pos += 4;
size -= 4;
}
return result;
}
static ssize_t amdgpu_debugfs_regs_pcie_write(struct file *f, const char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = file_inode(f)->i_private;
ssize_t result = 0;
int r;
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
while (size) {
uint32_t value;
r = get_user(value, (uint32_t *)buf);
if (r)
return r;
WREG32_PCIE(*pos >> 2, value);
result += 4;
buf += 4;
*pos += 4;
size -= 4;
}
return result;
}
static ssize_t amdgpu_debugfs_regs_didt_read(struct file *f, char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = file_inode(f)->i_private;
ssize_t result = 0;
int r;
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
while (size) {
uint32_t value;
value = RREG32_DIDT(*pos >> 2);
r = put_user(value, (uint32_t *)buf);
if (r)
return r;
result += 4;
buf += 4;
*pos += 4;
size -= 4;
}
return result;
}
static ssize_t amdgpu_debugfs_regs_didt_write(struct file *f, const char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = file_inode(f)->i_private;
ssize_t result = 0;
int r;
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
while (size) {
uint32_t value;
r = get_user(value, (uint32_t *)buf);
if (r)
return r;
WREG32_DIDT(*pos >> 2, value);
result += 4;
buf += 4;
*pos += 4;
size -= 4;
}
return result;
}
static ssize_t amdgpu_debugfs_regs_smc_read(struct file *f, char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = file_inode(f)->i_private;
ssize_t result = 0;
int r;
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
while (size) {
uint32_t value;
value = RREG32_SMC(*pos);
r = put_user(value, (uint32_t *)buf);
if (r)
return r;
result += 4;
buf += 4;
*pos += 4;
size -= 4;
}
return result;
}
static ssize_t amdgpu_debugfs_regs_smc_write(struct file *f, const char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = file_inode(f)->i_private;
ssize_t result = 0;
int r;
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
while (size) {
uint32_t value;
r = get_user(value, (uint32_t *)buf);
if (r)
return r;
WREG32_SMC(*pos, value);
result += 4;
buf += 4;
*pos += 4;
size -= 4;
}
return result;
}
static ssize_t amdgpu_debugfs_gca_config_read(struct file *f, char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = file_inode(f)->i_private;
ssize_t result = 0;
int r;
uint32_t *config, no_regs = 0;
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
config = kmalloc_array(256, sizeof(*config), GFP_KERNEL);
if (!config)
return -ENOMEM;
/* version, increment each time something is added */
config[no_regs++] = 3;
config[no_regs++] = adev->gfx.config.max_shader_engines;
config[no_regs++] = adev->gfx.config.max_tile_pipes;
config[no_regs++] = adev->gfx.config.max_cu_per_sh;
config[no_regs++] = adev->gfx.config.max_sh_per_se;
config[no_regs++] = adev->gfx.config.max_backends_per_se;
config[no_regs++] = adev->gfx.config.max_texture_channel_caches;
config[no_regs++] = adev->gfx.config.max_gprs;
config[no_regs++] = adev->gfx.config.max_gs_threads;
config[no_regs++] = adev->gfx.config.max_hw_contexts;
config[no_regs++] = adev->gfx.config.sc_prim_fifo_size_frontend;
config[no_regs++] = adev->gfx.config.sc_prim_fifo_size_backend;
config[no_regs++] = adev->gfx.config.sc_hiz_tile_fifo_size;
config[no_regs++] = adev->gfx.config.sc_earlyz_tile_fifo_size;
config[no_regs++] = adev->gfx.config.num_tile_pipes;
config[no_regs++] = adev->gfx.config.backend_enable_mask;
config[no_regs++] = adev->gfx.config.mem_max_burst_length_bytes;
config[no_regs++] = adev->gfx.config.mem_row_size_in_kb;
config[no_regs++] = adev->gfx.config.shader_engine_tile_size;
config[no_regs++] = adev->gfx.config.num_gpus;
config[no_regs++] = adev->gfx.config.multi_gpu_tile_size;
config[no_regs++] = adev->gfx.config.mc_arb_ramcfg;
config[no_regs++] = adev->gfx.config.gb_addr_config;
config[no_regs++] = adev->gfx.config.num_rbs;
/* rev==1 */
config[no_regs++] = adev->rev_id;
config[no_regs++] = adev->pg_flags;
config[no_regs++] = adev->cg_flags;
/* rev==2 */
config[no_regs++] = adev->family;
config[no_regs++] = adev->external_rev_id;
/* rev==3 */
config[no_regs++] = adev->pdev->device;
config[no_regs++] = adev->pdev->revision;
config[no_regs++] = adev->pdev->subsystem_device;
config[no_regs++] = adev->pdev->subsystem_vendor;
while (size && (*pos < no_regs * 4)) {
uint32_t value;
value = config[*pos >> 2];
r = put_user(value, (uint32_t *)buf);
if (r) {
kfree(config);
return r;
}
result += 4;
buf += 4;
*pos += 4;
size -= 4;
}
kfree(config);
return result;
}
static ssize_t amdgpu_debugfs_sensor_read(struct file *f, char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = file_inode(f)->i_private;
int idx, x, outsize, r, valuesize;
uint32_t values[16];
if (size & 3 || *pos & 0x3)
return -EINVAL;
if (amdgpu_dpm == 0)
return -EINVAL;
/* convert offset to sensor number */
idx = *pos >> 2;
valuesize = sizeof(values);
if (adev->powerplay.pp_funcs && adev->powerplay.pp_funcs->read_sensor)
r = amdgpu_dpm_read_sensor(adev, idx, &values[0], &valuesize);
else
return -EINVAL;
if (size > valuesize)
return -EINVAL;
outsize = 0;
x = 0;
if (!r) {
while (size) {
r = put_user(values[x++], (int32_t *)buf);
buf += 4;
size -= 4;
outsize += 4;
}
}
return !r ? outsize : r;
}
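
A small user-space sketch of the resulting interface: the file position is simply the sensor index times four, and which index maps to which sensor follows the powerplay sensor enum rather than anything defined by this patch.

/* Illustrative user-space helper for the amdgpu_sensors file: the file
 * position selects the sensor (index * 4); the index-to-sensor mapping is
 * assumed to follow the powerplay sensor enum.
 */
#include <stdint.h>
#include <unistd.h>

static int example_read_sensor(int fd, unsigned int sensor_idx,
                               uint32_t *value)
{
        if (pread(fd, value, sizeof(*value), (off_t)sensor_idx * 4) !=
            (ssize_t)sizeof(*value))
                return -1;
        return 0;
}
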
static ssize_t amdgpu_debugfs_wave_read(struct file *f, char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = f->f_inode->i_private;
int r, x;
ssize_t result = 0;
uint32_t offset, se, sh, cu, wave, simd, data[32];
if (size & 3 || *pos & 3)
return -EINVAL;
/* decode offset */
offset = (*pos & GENMASK_ULL(6, 0));
se = (*pos & GENMASK_ULL(14, 7)) >> 7;
sh = (*pos & GENMASK_ULL(22, 15)) >> 15;
cu = (*pos & GENMASK_ULL(30, 23)) >> 23;
wave = (*pos & GENMASK_ULL(36, 31)) >> 31;
simd = (*pos & GENMASK_ULL(44, 37)) >> 37;
/* switch to the specific se/sh/cu */
mutex_lock(&adev->grbm_idx_mutex);
amdgpu_gfx_select_se_sh(adev, se, sh, cu);
x = 0;
if (adev->gfx.funcs->read_wave_data)
adev->gfx.funcs->read_wave_data(adev, simd, wave, data, &x);
amdgpu_gfx_select_se_sh(adev, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF);
mutex_unlock(&adev->grbm_idx_mutex);
if (!x)
return -EINVAL;
while (size && (offset < x * 4)) {
uint32_t value;
value = data[offset >> 2];
r = put_user(value, (uint32_t *)buf);
if (r)
return r;
result += 4;
buf += 4;
offset += 4;
size -= 4;
}
return result;
}
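
The wave file packs its selection into the file position in the same spirit, just with different field widths; a hedged helper mirroring the GENMASK_ULL() decoding above (illustrative only, not part of this patch):

/* Illustrative only: pack a dword offset plus the se/sh/cu/wave/simd
 * selection into the file position used by the amdgpu_wave file,
 * mirroring the GENMASK_ULL() decoding in amdgpu_debugfs_wave_read().
 */
static inline unsigned long long
amdgpu_wave_pos(unsigned int dw_offset, unsigned int se, unsigned int sh,
                unsigned int cu, unsigned int wave, unsigned int simd)
{
        return ((unsigned long long)dw_offset << 2) |    /* bits  0-6  */
               ((unsigned long long)se << 7)    |        /* bits  7-14 */
               ((unsigned long long)sh << 15)   |        /* bits 15-22 */
               ((unsigned long long)cu << 23)   |        /* bits 23-30 */
               ((unsigned long long)wave << 31) |        /* bits 31-36 */
               ((unsigned long long)simd << 37);         /* bits 37-44 */
}
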
static ssize_t amdgpu_debugfs_gpr_read(struct file *f, char __user *buf,
size_t size, loff_t *pos)
{
struct amdgpu_device *adev = f->f_inode->i_private;
int r;
ssize_t result = 0;
uint32_t offset, se, sh, cu, wave, simd, thread, bank, *data;
if (size & 3 || *pos & 3)
return -EINVAL;
/* decode offset */
offset = *pos & GENMASK_ULL(11, 0);
se = (*pos & GENMASK_ULL(19, 12)) >> 12;
sh = (*pos & GENMASK_ULL(27, 20)) >> 20;
cu = (*pos & GENMASK_ULL(35, 28)) >> 28;
wave = (*pos & GENMASK_ULL(43, 36)) >> 36;
simd = (*pos & GENMASK_ULL(51, 44)) >> 44;
thread = (*pos & GENMASK_ULL(59, 52)) >> 52;
bank = (*pos & GENMASK_ULL(61, 60)) >> 60;
data = kmalloc_array(1024, sizeof(*data), GFP_KERNEL);
if (!data)
return -ENOMEM;
/* switch to the specific se/sh/cu */
mutex_lock(&adev->grbm_idx_mutex);
amdgpu_gfx_select_se_sh(adev, se, sh, cu);
if (bank == 0) {
if (adev->gfx.funcs->read_wave_vgprs)
adev->gfx.funcs->read_wave_vgprs(adev, simd, wave, thread, offset, size>>2, data);
} else {
if (adev->gfx.funcs->read_wave_sgprs)
adev->gfx.funcs->read_wave_sgprs(adev, simd, wave, offset, size>>2, data);
}
amdgpu_gfx_select_se_sh(adev, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF);
mutex_unlock(&adev->grbm_idx_mutex);
while (size) {
uint32_t value;
value = data[offset++];
r = put_user(value, (uint32_t *)buf);
if (r) {
result = r;
goto err;
}
result += 4;
buf += 4;
size -= 4;
}
err:
kfree(data);
return result;
}
static const struct file_operations amdgpu_debugfs_regs_fops = {
.owner = THIS_MODULE,
.read = amdgpu_debugfs_regs_read,
.write = amdgpu_debugfs_regs_write,
.llseek = default_llseek
};
static const struct file_operations amdgpu_debugfs_regs_didt_fops = {
.owner = THIS_MODULE,
.read = amdgpu_debugfs_regs_didt_read,
.write = amdgpu_debugfs_regs_didt_write,
.llseek = default_llseek
};
static const struct file_operations amdgpu_debugfs_regs_pcie_fops = {
.owner = THIS_MODULE,
.read = amdgpu_debugfs_regs_pcie_read,
.write = amdgpu_debugfs_regs_pcie_write,
.llseek = default_llseek
};
static const struct file_operations amdgpu_debugfs_regs_smc_fops = {
.owner = THIS_MODULE,
.read = amdgpu_debugfs_regs_smc_read,
.write = amdgpu_debugfs_regs_smc_write,
.llseek = default_llseek
};
static const struct file_operations amdgpu_debugfs_gca_config_fops = {
.owner = THIS_MODULE,
.read = amdgpu_debugfs_gca_config_read,
.llseek = default_llseek
};
static const struct file_operations amdgpu_debugfs_sensors_fops = {
.owner = THIS_MODULE,
.read = amdgpu_debugfs_sensor_read,
.llseek = default_llseek
};
static const struct file_operations amdgpu_debugfs_wave_fops = {
.owner = THIS_MODULE,
.read = amdgpu_debugfs_wave_read,
.llseek = default_llseek
};
static const struct file_operations amdgpu_debugfs_gpr_fops = {
.owner = THIS_MODULE,
.read = amdgpu_debugfs_gpr_read,
.llseek = default_llseek
};
static const struct file_operations *debugfs_regs[] = {
&amdgpu_debugfs_regs_fops,
&amdgpu_debugfs_regs_didt_fops,
&amdgpu_debugfs_regs_pcie_fops,
&amdgpu_debugfs_regs_smc_fops,
&amdgpu_debugfs_gca_config_fops,
&amdgpu_debugfs_sensors_fops,
&amdgpu_debugfs_wave_fops,
&amdgpu_debugfs_gpr_fops,
};
static const char *debugfs_regs_names[] = {
"amdgpu_regs",
"amdgpu_regs_didt",
"amdgpu_regs_pcie",
"amdgpu_regs_smc",
"amdgpu_gca_config",
"amdgpu_sensors",
"amdgpu_wave",
"amdgpu_gpr",
};
int amdgpu_debugfs_regs_init(struct amdgpu_device *adev)
{
struct drm_minor *minor = adev->ddev->primary;
struct dentry *ent, *root = minor->debugfs_root;
unsigned i, j;
for (i = 0; i < ARRAY_SIZE(debugfs_regs); i++) {
ent = debugfs_create_file(debugfs_regs_names[i],
S_IFREG | S_IRUGO, root,
adev, debugfs_regs[i]);
if (IS_ERR(ent)) {
for (j = 0; j < i; j++) {
debugfs_remove(adev->debugfs_regs[j]);
adev->debugfs_regs[j] = NULL;
}
return PTR_ERR(ent);
}
if (!i)
i_size_write(ent->d_inode, adev->rmmio_size);
adev->debugfs_regs[i] = ent;
}
return 0;
}
void amdgpu_debugfs_regs_cleanup(struct amdgpu_device *adev)
{
unsigned i;
for (i = 0; i < ARRAY_SIZE(debugfs_regs); i++) {
if (adev->debugfs_regs[i]) {
debugfs_remove(adev->debugfs_regs[i]);
adev->debugfs_regs[i] = NULL;
}
}
}
static int amdgpu_debugfs_test_ib(struct seq_file *m, void *data)
{
struct drm_info_node *node = (struct drm_info_node *) m->private;
struct drm_device *dev = node->minor->dev;
struct amdgpu_device *adev = dev->dev_private;
int r = 0, i;
/* hold on the scheduler */
for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
struct amdgpu_ring *ring = adev->rings[i];
if (!ring || !ring->sched.thread)
continue;
kthread_park(ring->sched.thread);
}
seq_printf(m, "run ib test:\n");
r = amdgpu_ib_ring_tests(adev);
if (r)
seq_printf(m, "ib ring tests failed (%d).\n", r);
else
seq_printf(m, "ib ring tests passed.\n");
/* go on the scheduler */
for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
struct amdgpu_ring *ring = adev->rings[i];
if (!ring || !ring->sched.thread)
continue;
kthread_unpark(ring->sched.thread);
}
return 0;
}
static int amdgpu_debugfs_get_vbios_dump(struct seq_file *m, void *data)
{
struct drm_info_node *node = (struct drm_info_node *) m->private;
struct drm_device *dev = node->minor->dev;
struct amdgpu_device *adev = dev->dev_private;
seq_write(m, adev->bios, adev->bios_size);
return 0;
}
static int amdgpu_debugfs_evict_vram(struct seq_file *m, void *data)
{
struct drm_info_node *node = (struct drm_info_node *)m->private;
struct drm_device *dev = node->minor->dev;
struct amdgpu_device *adev = dev->dev_private;
seq_printf(m, "(%d)\n", amdgpu_bo_evict_vram(adev));
return 0;
}
static const struct drm_info_list amdgpu_debugfs_list[] = {
{"amdgpu_vbios", amdgpu_debugfs_get_vbios_dump},
{"amdgpu_test_ib", &amdgpu_debugfs_test_ib},
{"amdgpu_evict_vram", &amdgpu_debugfs_evict_vram}
};
int amdgpu_debugfs_init(struct amdgpu_device *adev)
{
return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_list,
ARRAY_SIZE(amdgpu_debugfs_list));
}
#else
int amdgpu_debugfs_init(struct amdgpu_device *adev)
{
return 0;
}
int amdgpu_debugfs_regs_init(struct amdgpu_device *adev)
{
return 0;
}
void amdgpu_debugfs_regs_cleanup(struct amdgpu_device *adev) { }
#endif


@ -1,5 +1,7 @@
/*
* Copyright 2012-15 Advanced Micro Devices, Inc.
* Copyright 2008 Advanced Micro Devices, Inc.
* Copyright 2008 Red Hat Inc.
* Copyright 2009 Jerome Glisse.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
@ -19,57 +21,22 @@
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*
* Authors: AMD
*
*/
#include "dm_services.h"
#include "include/grph_object_id.h"
static bool dal_graphics_object_id_is_valid(struct graphics_object_id id)
{
bool rc = true;
switch (id.type) {
case OBJECT_TYPE_UNKNOWN:
rc = false;
break;
case OBJECT_TYPE_GPU:
case OBJECT_TYPE_ENGINE:
/* do NOT check for id.id == 0 */
if (id.enum_id == ENUM_ID_UNKNOWN)
rc = false;
break;
default:
if (id.id == 0 || id.enum_id == ENUM_ID_UNKNOWN)
rc = false;
break;
}
return rc;
}
bool dal_graphics_object_id_is_equal(
struct graphics_object_id id1,
struct graphics_object_id id2)
{
if (false == dal_graphics_object_id_is_valid(id1)) {
dm_output_to_console(
"%s: Warning: comparing invalid object 'id1'!\n", __func__);
return false;
}
if (false == dal_graphics_object_id_is_valid(id2)) {
dm_output_to_console(
"%s: Warning: comparing invalid object 'id2'!\n", __func__);
return false;
}
if (id1.id == id2.id && id1.enum_id == id2.enum_id
&& id1.type == id2.type)
return true;
return false;
}
/*
* Debugfs
*/
struct amdgpu_debugfs {
const struct drm_info_list *files;
unsigned num_files;
};
int amdgpu_debugfs_regs_init(struct amdgpu_device *adev);
void amdgpu_debugfs_regs_cleanup(struct amdgpu_device *adev);
int amdgpu_debugfs_init(struct amdgpu_device *adev);
int amdgpu_debugfs_add_files(struct amdgpu_device *adev,
const struct drm_info_list *files,
unsigned nfiles);
int amdgpu_debugfs_fence_init(struct amdgpu_device *adev);
int amdgpu_debugfs_firmware_init(struct amdgpu_device *adev);
int amdgpu_debugfs_gem_init(struct amdgpu_device *adev);

File diff suppressed because it is too large.


@ -34,6 +34,7 @@
#include <linux/pm_runtime.h>
#include <drm/drm_crtc_helper.h>
#include <drm/drm_edid.h>
#include <drm/drm_fb_helper.h>
static void amdgpu_flip_callback(struct dma_fence *f, struct dma_fence_cb *cb)
{
@ -556,15 +557,9 @@ amdgpu_user_framebuffer_create(struct drm_device *dev,
return &amdgpu_fb->base;
}
void amdgpu_output_poll_changed(struct drm_device *dev)
{
struct amdgpu_device *adev = dev->dev_private;
amdgpu_fb_output_poll_changed(adev);
}
const struct drm_mode_config_funcs amdgpu_mode_funcs = {
.fb_create = amdgpu_user_framebuffer_create,
.output_poll_changed = amdgpu_output_poll_changed
.output_poll_changed = drm_fb_helper_output_poll_changed,
};
static const struct drm_prop_enum_list amdgpu_underscan_enum_list[] =


@ -25,9 +25,7 @@
struct drm_framebuffer *
amdgpu_user_framebuffer_create(struct drm_device *dev,
struct drm_file *file_priv,
const struct drm_mode_fb_cmd2 *mode_cmd);
void amdgpu_output_poll_changed(struct drm_device *dev);
struct drm_file *file_priv,
const struct drm_mode_fb_cmd2 *mode_cmd);
#endif


@ -360,6 +360,12 @@ enum amdgpu_pcie_gen {
((adev)->powerplay.pp_funcs->set_clockgating_by_smu(\
(adev)->powerplay.pp_handle, msg_id))
#define amdgpu_dpm_notify_smu_memory_info(adev, virtual_addr_low, \
virtual_addr_hi, mc_addr_low, mc_addr_hi, size) \
((adev)->powerplay.pp_funcs->notify_smu_memory_info)( \
(adev)->powerplay.pp_handle, virtual_addr_low, \
virtual_addr_hi, mc_addr_low, mc_addr_hi, size)
struct amdgpu_dpm {
struct amdgpu_ps *ps;
/* number of valid power states */


@ -90,7 +90,7 @@ int amdgpu_disp_priority = 0;
int amdgpu_hw_i2c = 0;
int amdgpu_pcie_gen2 = -1;
int amdgpu_msi = -1;
int amdgpu_lockup_timeout = 0;
int amdgpu_lockup_timeout = 10000;
int amdgpu_dpm = -1;
int amdgpu_fw_load_type = -1;
int amdgpu_aspm = -1;
@ -128,6 +128,7 @@ int amdgpu_param_buf_per_se = 0;
int amdgpu_job_hang_limit = 0;
int amdgpu_lbpw = -1;
int amdgpu_compute_multipipe = -1;
int amdgpu_gpu_recovery = -1; /* auto */
MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");
module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
@ -165,7 +166,7 @@ module_param_named(pcie_gen2, amdgpu_pcie_gen2, int, 0444);
MODULE_PARM_DESC(msi, "MSI support (1 = enable, 0 = disable, -1 = auto)");
module_param_named(msi, amdgpu_msi, int, 0444);
MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default 0 = disable)");
MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms > 0 (default 10000)");
module_param_named(lockup_timeout, amdgpu_lockup_timeout, int, 0444);
MODULE_PARM_DESC(dpm, "DPM support (1 = enable, 0 = disable, -1 = auto)");
@ -216,7 +217,7 @@ module_param_named(exp_hw_support, amdgpu_exp_hw_support, int, 0444);
MODULE_PARM_DESC(dc, "Display Core driver (1 = enable, 0 = disable, -1 = auto (default))");
module_param_named(dc, amdgpu_dc, int, 0444);
MODULE_PARM_DESC(dc, "Display Core Log Level (0 = minimal (default), 1 = chatty)");
MODULE_PARM_DESC(dc_log, "Display Core Log Level (0 = minimal (default), 1 = chatty)");
module_param_named(dc_log, amdgpu_dc_log, int, 0444);
MODULE_PARM_DESC(sched_jobs, "the max number of jobs supported in the sw queue (default 32)");
@ -280,6 +281,9 @@ module_param_named(lbpw, amdgpu_lbpw, int, 0444);
MODULE_PARM_DESC(compute_multipipe, "Force compute queues to be spread across pipes (1 = enable, 0 = disable, -1 = auto)");
module_param_named(compute_multipipe, amdgpu_compute_multipipe, int, 0444);
MODULE_PARM_DESC(gpu_recovery, "Enable GPU recovery mechanism (1 = enable, 0 = disable, -1 = auto)");
module_param_named(gpu_recovery, amdgpu_gpu_recovery, int, 0444);
#ifdef CONFIG_DRM_AMDGPU_SI
#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
@ -306,7 +310,6 @@ MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled (default), 0 = disabled)
module_param_named(cik_support, amdgpu_cik_support, int, 0444);
#endif
static const struct pci_device_id pciidlist[] = {
#ifdef CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
@ -566,12 +569,13 @@ static int amdgpu_kick_out_firmware_fb(struct pci_dev *pdev)
return 0;
}
static int amdgpu_pci_probe(struct pci_dev *pdev,
const struct pci_device_id *ent)
{
struct drm_device *dev;
unsigned long flags = ent->driver_data;
int ret;
int ret, retry = 0;
if ((flags & AMD_EXP_HW_SUPPORT) && !amdgpu_exp_hw_support) {
DRM_INFO("This hardware requires experimental hardware support.\n"
@ -604,8 +608,14 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
pci_set_drvdata(pdev, dev);
retry_init:
ret = drm_dev_register(dev, ent->driver_data);
if (ret)
if (ret == -EAGAIN && ++retry <= 3) {
DRM_INFO("retry init %d\n", retry);
/* Don't request EX mode too frequently; the host may treat that as an attack */
msleep(5000);
goto retry_init;
} else if (ret)
goto err_pci;
return 0;
@ -639,7 +649,7 @@ amdgpu_pci_shutdown(struct pci_dev *pdev)
* unfortunately we can't detect certain
* hypervisors so just do this all the time.
*/
amdgpu_suspend(adev);
amdgpu_device_ip_suspend(adev);
}
static int amdgpu_pmops_suspend(struct device *dev)
@ -844,9 +854,6 @@ static struct drm_driver kms_driver = {
.disable_vblank = amdgpu_disable_vblank_kms,
.get_vblank_timestamp = drm_calc_vbltimestamp_from_scanoutpos,
.get_scanout_position = amdgpu_get_crtc_scanout_position,
#if defined(CONFIG_DEBUG_FS)
.debugfs_init = amdgpu_debugfs_init,
#endif
.irq_preinstall = amdgpu_irq_preinstall,
.irq_postinstall = amdgpu_irq_postinstall,
.irq_uninstall = amdgpu_irq_uninstall,
@ -906,10 +913,6 @@ static int __init amdgpu_init(void)
if (r)
goto error_fence;
r = amd_sched_fence_slab_init();
if (r)
goto error_sched;
if (vgacon_text_force()) {
DRM_ERROR("VGACON disables amdgpu kernel modesetting.\n");
return -EINVAL;
@ -922,9 +925,6 @@ static int __init amdgpu_init(void)
/* let modprobe override vga console setting */
return pci_register_driver(pdriver);
error_sched:
amdgpu_fence_slab_fini();
error_fence:
amdgpu_sync_fini();
@ -938,7 +938,6 @@ static void __exit amdgpu_exit(void)
pci_unregister_driver(pdriver);
amdgpu_unregister_atpx_handler();
amdgpu_sync_fini();
amd_sched_fence_slab_fini();
amdgpu_fence_slab_fini();
}


@ -283,12 +283,6 @@ out:
return ret;
}
void amdgpu_fb_output_poll_changed(struct amdgpu_device *adev)
{
if (adev->mode_info.rfbdev)
drm_fb_helper_hotplug_event(&adev->mode_info.rfbdev->helper);
}
static int amdgpu_fbdev_destroy(struct drm_device *dev, struct amdgpu_fbdev *rfbdev)
{
struct amdgpu_framebuffer *rfb = &rfbdev->rfb;
@ -393,24 +387,3 @@ bool amdgpu_fbdev_robj_is_fb(struct amdgpu_device *adev, struct amdgpu_bo *robj)
return true;
return false;
}
void amdgpu_fbdev_restore_mode(struct amdgpu_device *adev)
{
struct amdgpu_fbdev *afbdev;
struct drm_fb_helper *fb_helper;
int ret;
if (!adev)
return;
afbdev = adev->mode_info.rfbdev;
if (!afbdev)
return;
fb_helper = &afbdev->helper;
ret = drm_fb_helper_restore_fbdev_mode_unlocked(fb_helper);
if (ret)
DRM_DEBUG("failed to restore crtc mode\n");
}


@ -187,7 +187,7 @@ int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s)
seq = ++ring->fence_drv.sync_seq;
amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
seq, AMDGPU_FENCE_FLAG_INT);
seq, 0);
*s = seq;
@ -391,9 +391,9 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
ring->fence_drv.irq_type = irq_type;
ring->fence_drv.initialized = true;
dev_info(adev->dev, "fence driver on ring %d use gpu addr 0x%016llx, "
"cpu addr 0x%p\n", ring->idx,
ring->fence_drv.gpu_addr, ring->fence_drv.cpu_addr);
dev_dbg(adev->dev, "fence driver on ring %d use gpu addr 0x%016llx, "
"cpu addr 0x%p\n", ring->idx,
ring->fence_drv.gpu_addr, ring->fence_drv.cpu_addr);
return 0;
}
@ -410,7 +410,6 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
unsigned num_hw_submission)
{
long timeout;
int r;
/* Check that num_hw_submission is a power of two */
@ -434,20 +433,9 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
/* No need to setup the GPU scheduler for KIQ ring */
if (ring->funcs->type != AMDGPU_RING_TYPE_KIQ) {
timeout = msecs_to_jiffies(amdgpu_lockup_timeout);
if (timeout == 0) {
/*
* FIXME:
* Delayed workqueue cannot use it directly,
* so the scheduler will not use delayed workqueue if
* MAX_SCHEDULE_TIMEOUT is set.
* Currently keep it simple and silly.
*/
timeout = MAX_SCHEDULE_TIMEOUT;
}
r = amd_sched_init(&ring->sched, &amdgpu_sched_ops,
num_hw_submission,
timeout, ring->name);
r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
num_hw_submission, amdgpu_job_hang_limit,
msecs_to_jiffies(amdgpu_lockup_timeout), ring->name);
if (r) {
DRM_ERROR("Failed to create scheduler on ring %s.\n",
ring->name);
@ -499,11 +487,11 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
r = amdgpu_fence_wait_empty(ring);
if (r) {
/* no need to trigger GPU reset as we are unloading */
amdgpu_fence_driver_force_completion(adev);
amdgpu_fence_driver_force_completion(ring);
}
amdgpu_irq_put(adev, ring->fence_drv.irq_src,
ring->fence_drv.irq_type);
amd_sched_fini(&ring->sched);
drm_sched_fini(&ring->sched);
del_timer_sync(&ring->fence_drv.fallback_timer);
for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
dma_fence_put(ring->fence_drv.fences[j]);
@ -534,7 +522,7 @@ void amdgpu_fence_driver_suspend(struct amdgpu_device *adev)
r = amdgpu_fence_wait_empty(ring);
if (r) {
/* delay GPU reset to resume */
amdgpu_fence_driver_force_completion(adev);
amdgpu_fence_driver_force_completion(ring);
}
/* disable the interrupt */
@ -571,30 +559,15 @@ void amdgpu_fence_driver_resume(struct amdgpu_device *adev)
}
/**
* amdgpu_fence_driver_force_completion - force all fence waiter to complete
* amdgpu_fence_driver_force_completion - force signal latest fence of ring
*
* @adev: amdgpu device pointer
* @ring: ring whose latest fence should be signaled
*
* In case of GPU reset failure, make sure no process keeps waiting on a
* fence that will never complete.
*/
void amdgpu_fence_driver_force_completion(struct amdgpu_device *adev)
void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring)
{
int i;
for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
struct amdgpu_ring *ring = adev->rings[i];
if (!ring || !ring->fence_drv.initialized)
continue;
amdgpu_fence_write(ring, ring->fence_drv.sync_seq);
}
}
void amdgpu_fence_driver_force_completion_ring(struct amdgpu_ring *ring)
{
if (ring)
amdgpu_fence_write(ring, ring->fence_drv.sync_seq);
amdgpu_fence_write(ring, ring->fence_drv.sync_seq);
amdgpu_fence_process(ring);
}
/*
@ -709,25 +682,25 @@ static int amdgpu_debugfs_fence_info(struct seq_file *m, void *data)
}
/**
* amdgpu_debugfs_gpu_reset - manually trigger a gpu reset
* amdgpu_debugfs_gpu_recover - manually trigger a gpu reset & recover
*
* Manually trigger a gpu reset at the next fence wait.
*/
static int amdgpu_debugfs_gpu_reset(struct seq_file *m, void *data)
static int amdgpu_debugfs_gpu_recover(struct seq_file *m, void *data)
{
struct drm_info_node *node = (struct drm_info_node *) m->private;
struct drm_device *dev = node->minor->dev;
struct amdgpu_device *adev = dev->dev_private;
seq_printf(m, "gpu reset\n");
amdgpu_gpu_reset(adev);
seq_printf(m, "gpu recover\n");
amdgpu_device_gpu_recover(adev, NULL, true);
return 0;
}
static const struct drm_info_list amdgpu_debugfs_fence_list[] = {
{"amdgpu_fence_info", &amdgpu_debugfs_fence_info, 0, NULL},
{"amdgpu_gpu_reset", &amdgpu_debugfs_gpu_reset, 0, NULL}
{"amdgpu_gpu_recover", &amdgpu_debugfs_gpu_recover, 0, NULL}
};
static const struct drm_info_list amdgpu_debugfs_fence_list_sriov[] = {


@ -57,60 +57,48 @@
*/
/**
* amdgpu_gart_table_ram_alloc - allocate system ram for gart page table
* amdgpu_dummy_page_init - init dummy page used by the driver
*
* @adev: amdgpu_device pointer
*
* Allocate system memory for GART page table
* (r1xx-r3xx, non-pcie r4xx, rs400). These asics require the
* gart table to be in system memory.
* Returns 0 for success, -ENOMEM for failure.
* Allocate the dummy page used by the driver (all asics).
* This dummy page is used by the driver as a filler for gart entries
* when pages are taken out of the GART
* Returns 0 on success, -ENOMEM on failure.
*/
int amdgpu_gart_table_ram_alloc(struct amdgpu_device *adev)
static int amdgpu_gart_dummy_page_init(struct amdgpu_device *adev)
{
void *ptr;
ptr = pci_alloc_consistent(adev->pdev, adev->gart.table_size,
&adev->gart.table_addr);
if (ptr == NULL) {
if (adev->dummy_page.page)
return 0;
adev->dummy_page.page = alloc_page(GFP_DMA32 | GFP_KERNEL | __GFP_ZERO);
if (adev->dummy_page.page == NULL)
return -ENOMEM;
adev->dummy_page.addr = pci_map_page(adev->pdev, adev->dummy_page.page,
0, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
if (pci_dma_mapping_error(adev->pdev, adev->dummy_page.addr)) {
dev_err(&adev->pdev->dev, "Failed to DMA MAP the dummy page\n");
__free_page(adev->dummy_page.page);
adev->dummy_page.page = NULL;
return -ENOMEM;
}
#ifdef CONFIG_X86
if (0) {
set_memory_uc((unsigned long)ptr,
adev->gart.table_size >> PAGE_SHIFT);
}
#endif
adev->gart.ptr = ptr;
memset((void *)adev->gart.ptr, 0, adev->gart.table_size);
return 0;
}
/**
* amdgpu_gart_table_ram_free - free system ram for gart page table
* amdgpu_dummy_page_fini - free dummy page used by the driver
*
* @adev: amdgpu_device pointer
*
* Free system memory for GART page table
* (r1xx-r3xx, non-pcie r4xx, rs400). These asics require the
* gart table to be in system memory.
* Frees the dummy page used by the driver (all asics).
*/
void amdgpu_gart_table_ram_free(struct amdgpu_device *adev)
static void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
{
if (adev->gart.ptr == NULL) {
if (adev->dummy_page.page == NULL)
return;
}
#ifdef CONFIG_X86
if (0) {
set_memory_wb((unsigned long)adev->gart.ptr,
adev->gart.table_size >> PAGE_SHIFT);
}
#endif
pci_free_consistent(adev->pdev, adev->gart.table_size,
(void *)adev->gart.ptr,
adev->gart.table_addr);
adev->gart.ptr = NULL;
adev->gart.table_addr = 0;
pci_unmap_page(adev->pdev, adev->dummy_page.addr,
PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
__free_page(adev->dummy_page.page);
adev->dummy_page.page = NULL;
}
/**
@ -365,7 +353,7 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
DRM_ERROR("Page size is smaller than GPU page size!\n");
return -EINVAL;
}
r = amdgpu_dummy_page_init(adev);
r = amdgpu_gart_dummy_page_init(adev);
if (r)
return r;
/* Compute table size */
@ -377,10 +365,8 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
/* Allocate pages table */
adev->gart.pages = vzalloc(sizeof(void *) * adev->gart.num_cpu_pages);
if (adev->gart.pages == NULL) {
amdgpu_gart_fini(adev);
if (adev->gart.pages == NULL)
return -ENOMEM;
}
#endif
return 0;
@ -395,14 +381,9 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
*/
void amdgpu_gart_fini(struct amdgpu_device *adev)
{
if (adev->gart.ready) {
/* unbind pages */
amdgpu_gart_unbind(adev, 0, adev->gart.num_cpu_pages);
}
adev->gart.ready = false;
#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
vfree(adev->gart.pages);
adev->gart.pages = NULL;
#endif
amdgpu_dummy_page_fini(adev);
amdgpu_gart_dummy_page_fini(adev);
}


@ -39,7 +39,7 @@ struct amdgpu_gart_funcs;
#define AMDGPU_GPU_PAGE_ALIGN(a) (((a) + AMDGPU_GPU_PAGE_MASK) & ~AMDGPU_GPU_PAGE_MASK)
struct amdgpu_gart {
dma_addr_t table_addr;
u64 table_addr;
struct amdgpu_bo *robj;
void *ptr;
unsigned num_gpu_pages;
@ -56,8 +56,6 @@ struct amdgpu_gart {
const struct amdgpu_gart_funcs *gart_funcs;
};
int amdgpu_gart_table_ram_alloc(struct amdgpu_device *adev);
void amdgpu_gart_table_ram_free(struct amdgpu_device *adev);
int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev);
void amdgpu_gart_table_vram_free(struct amdgpu_device *adev);
int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);


@ -72,7 +72,7 @@ retry:
initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
goto retry;
}
DRM_ERROR("Failed to allocate GEM object (%ld, %d, %u, %d)\n",
DRM_DEBUG("Failed to allocate GEM object (%ld, %d, %u, %d)\n",
size, initial_domain, alignment, r);
}
return r;
@ -282,6 +282,7 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
struct drm_file *filp)
{
struct ttm_operation_ctx ctx = { true, false };
struct amdgpu_device *adev = dev->dev_private;
struct drm_amdgpu_gem_userptr *args = data;
struct drm_gem_object *gobj;
@ -335,7 +336,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
goto free_pages;
amdgpu_ttm_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_GTT);
r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
amdgpu_bo_unreserve(bo);
if (r)
goto free_pages;
@ -517,10 +518,6 @@ static void amdgpu_gem_va_update_vm(struct amdgpu_device *adev,
if (!amdgpu_vm_ready(vm))
return;
r = amdgpu_vm_update_directories(adev, vm);
if (r)
goto error;
r = amdgpu_vm_clear_freed(adev, vm, NULL);
if (r)
goto error;
@ -529,6 +526,10 @@ static void amdgpu_gem_va_update_vm(struct amdgpu_device *adev,
operation == AMDGPU_VA_OP_REPLACE)
r = amdgpu_vm_bo_update(adev, bo_va, false);
r = amdgpu_vm_update_directories(adev, vm);
if (r)
goto error;
error:
if (r && r != -ERESTARTSYS)
DRM_ERROR("Couldn't update BO_VA (%d)\n", r);
@ -557,14 +558,25 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
int r = 0;
if (args->va_address < AMDGPU_VA_RESERVED_SIZE) {
dev_err(&dev->pdev->dev,
dev_dbg(&dev->pdev->dev,
"va_address 0x%LX is in reserved area 0x%LX\n",
args->va_address, AMDGPU_VA_RESERVED_SIZE);
return -EINVAL;
}
if (args->va_address >= AMDGPU_VA_HOLE_START &&
args->va_address < AMDGPU_VA_HOLE_END) {
dev_dbg(&dev->pdev->dev,
"va_address 0x%LX is in VA hole 0x%LX-0x%LX\n",
args->va_address, AMDGPU_VA_HOLE_START,
AMDGPU_VA_HOLE_END);
return -EINVAL;
}
args->va_address &= AMDGPU_VA_HOLE_MASK;
if ((args->flags & ~valid_flags) && (args->flags & ~prt_flags)) {
dev_err(&dev->pdev->dev, "invalid flags combination 0x%08X\n",
dev_dbg(&dev->pdev->dev, "invalid flags combination 0x%08X\n",
args->flags);
return -EINVAL;
}
@ -576,7 +588,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
case AMDGPU_VA_OP_REPLACE:
break;
default:
dev_err(&dev->pdev->dev, "unsupported operation %d\n",
dev_dbg(&dev->pdev->dev, "unsupported operation %d\n",
args->operation);
return -EINVAL;
}
@ -839,7 +851,7 @@ static const struct drm_info_list amdgpu_debugfs_gem_list[] = {
};
#endif
int amdgpu_gem_debugfs_init(struct amdgpu_device *adev)
int amdgpu_debugfs_gem_init(struct amdgpu_device *adev)
{
#if defined(CONFIG_DEBUG_FS)
return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_gem_list, 1);


@ -203,7 +203,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
spin_lock_init(&kiq->ring_lock);
r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
r = amdgpu_device_wb_get(adev, &adev->virt.reg_val_offs);
if (r)
return r;
@ -229,7 +229,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
void amdgpu_gfx_kiq_free_ring(struct amdgpu_ring *ring,
struct amdgpu_irq_src *irq)
{
amdgpu_wb_free(ring->adev, ring->adev->virt.reg_val_offs);
amdgpu_device_wb_free(ring->adev, ring->adev->virt.reg_val_offs);
amdgpu_ring_fini(ring);
}


@ -31,6 +31,11 @@ struct amdgpu_gtt_mgr {
atomic64_t available;
};
struct amdgpu_gtt_node {
struct drm_mm_node node;
struct ttm_buffer_object *tbo;
};
/**
* amdgpu_gtt_mgr_init - init GTT manager and DRM MM
*
@ -79,17 +84,17 @@ static int amdgpu_gtt_mgr_fini(struct ttm_mem_type_manager *man)
}
/**
* amdgpu_gtt_mgr_is_allocated - Check if mem has address space
* amdgpu_gtt_mgr_has_gart_addr - Check if mem has address space
*
* @mem: the mem object to check
*
* Check if a mem object has already address space allocated.
*/
bool amdgpu_gtt_mgr_is_allocated(struct ttm_mem_reg *mem)
bool amdgpu_gtt_mgr_has_gart_addr(struct ttm_mem_reg *mem)
{
struct drm_mm_node *node = mem->mm_node;
struct amdgpu_gtt_node *node = mem->mm_node;
return (node->start != AMDGPU_BO_INVALID_OFFSET);
return (node->node.start != AMDGPU_BO_INVALID_OFFSET);
}
/**
@ -109,12 +114,12 @@ static int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
{
struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev);
struct amdgpu_gtt_mgr *mgr = man->priv;
struct drm_mm_node *node = mem->mm_node;
struct amdgpu_gtt_node *node = mem->mm_node;
enum drm_mm_insert_mode mode;
unsigned long fpfn, lpfn;
int r;
if (amdgpu_gtt_mgr_is_allocated(mem))
if (amdgpu_gtt_mgr_has_gart_addr(mem))
return 0;
if (place)
@ -132,13 +137,13 @@ static int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
mode = DRM_MM_INSERT_HIGH;
spin_lock(&mgr->lock);
r = drm_mm_insert_node_in_range(&mgr->mm, node,
mem->num_pages, mem->page_alignment, 0,
fpfn, lpfn, mode);
r = drm_mm_insert_node_in_range(&mgr->mm, &node->node, mem->num_pages,
mem->page_alignment, 0, fpfn, lpfn,
mode);
spin_unlock(&mgr->lock);
if (!r)
mem->start = node->start;
mem->start = node->node.start;
return r;
}
@ -159,7 +164,7 @@ static int amdgpu_gtt_mgr_new(struct ttm_mem_type_manager *man,
struct ttm_mem_reg *mem)
{
struct amdgpu_gtt_mgr *mgr = man->priv;
struct drm_mm_node *node;
struct amdgpu_gtt_node *node;
int r;
spin_lock(&mgr->lock);
@ -177,8 +182,9 @@ static int amdgpu_gtt_mgr_new(struct ttm_mem_type_manager *man,
goto err_out;
}
node->start = AMDGPU_BO_INVALID_OFFSET;
node->size = mem->num_pages;
node->node.start = AMDGPU_BO_INVALID_OFFSET;
node->node.size = mem->num_pages;
node->tbo = tbo;
mem->mm_node = node;
if (place->fpfn || place->lpfn || place->flags & TTM_PL_FLAG_TOPDOWN) {
@ -190,7 +196,7 @@ static int amdgpu_gtt_mgr_new(struct ttm_mem_type_manager *man,
goto err_out;
}
} else {
mem->start = node->start;
mem->start = node->node.start;
}
return 0;
@ -214,14 +220,14 @@ static void amdgpu_gtt_mgr_del(struct ttm_mem_type_manager *man,
struct ttm_mem_reg *mem)
{
struct amdgpu_gtt_mgr *mgr = man->priv;
struct drm_mm_node *node = mem->mm_node;
struct amdgpu_gtt_node *node = mem->mm_node;
if (!node)
return;
spin_lock(&mgr->lock);
if (node->start != AMDGPU_BO_INVALID_OFFSET)
drm_mm_remove_node(node);
if (node->node.start != AMDGPU_BO_INVALID_OFFSET)
drm_mm_remove_node(&node->node);
spin_unlock(&mgr->lock);
atomic64_add(mem->num_pages, &mgr->available);
@ -244,6 +250,25 @@ uint64_t amdgpu_gtt_mgr_usage(struct ttm_mem_type_manager *man)
return (result > 0 ? result : 0) * PAGE_SIZE;
}
int amdgpu_gtt_mgr_recover(struct ttm_mem_type_manager *man)
{
struct amdgpu_gtt_mgr *mgr = man->priv;
struct amdgpu_gtt_node *node;
struct drm_mm_node *mm_node;
int r = 0;
spin_lock(&mgr->lock);
drm_mm_for_each_node(mm_node, &mgr->mm) {
node = container_of(mm_node, struct amdgpu_gtt_node, node);
r = amdgpu_ttm_recover_gart(node->tbo);
if (r)
break;
}
spin_unlock(&mgr->lock);
return r;
}
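
For context, this recovery walk is meant to be driven from the GPU-reset path; a minimal sketch of such a call site follows, where the wrapper name is invented and the TTM manager lookup is assumed to match the rest of the driver.

/* Illustrative call site only (the surrounding reset path is assumed):
 * rebuild the GART mappings of every buffer that currently holds a GTT
 * address after the GPU has been reset.
 */
static int example_recover_gtt(struct amdgpu_device *adev)
{
        struct ttm_mem_type_manager *man = &adev->mman.bdev.man[TTM_PL_TT];

        return amdgpu_gtt_mgr_recover(man);
}
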
/**
* amdgpu_gtt_mgr_debug - dump VRAM table
*


@ -149,7 +149,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
return -EINVAL;
}
if (vm && !job->vm_id) {
if (vm && !job->vmid) {
dev_err(adev->dev, "VM IB without ID\n");
return -EINVAL;
}
@ -164,7 +164,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
}
if (ring->funcs->emit_pipeline_sync && job &&
((tmp = amdgpu_sync_get_fence(&job->sched_sync)) ||
((tmp = amdgpu_sync_get_fence(&job->sched_sync, NULL)) ||
amdgpu_vm_need_pipeline_sync(ring, job))) {
need_pipe_sync = true;
dma_fence_put(tmp);
@ -211,7 +211,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
!amdgpu_sriov_vf(adev)) /* for SRIOV preemption, Preamble CE ib must be inserted anyway */
continue;
amdgpu_ring_emit_ib(ring, ib, job ? job->vm_id : 0,
amdgpu_ring_emit_ib(ring, ib, job ? job->vmid : 0,
need_ctx_switch);
need_ctx_switch = false;
}
@ -229,9 +229,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
r = amdgpu_fence_emit(ring, f);
if (r) {
dev_err(adev->dev, "failed to emit fence (%d)\n", r);
if (job && job->vm_id)
amdgpu_vm_reset_id(adev, ring->funcs->vmhub,
job->vm_id);
if (job && job->vmid)
amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vmid);
amdgpu_ring_undo(ring);
return r;
}


@ -0,0 +1,459 @@
/*
* Copyright 2017 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*
*/
#include "amdgpu_ids.h"
#include <linux/idr.h>
#include <linux/dma-fence-array.h>
#include <drm/drmP.h>
#include "amdgpu.h"
#include "amdgpu_trace.h"
/*
* PASID manager
*
* PASIDs are global address space identifiers that can be shared
* between the GPU, an IOMMU and the driver. VMs on different devices
* may use the same PASID if they share the same address
* space. Therefore PASIDs are allocated using a global IDA. VMs are
* looked up from the PASID per amdgpu_device.
*/
static DEFINE_IDA(amdgpu_pasid_ida);
/**
* amdgpu_pasid_alloc - Allocate a PASID
* @bits: Maximum width of the PASID in bits, must be at least 1
*
* Allocates a PASID of the given width while keeping smaller PASIDs
* available if possible.
*
* Returns a positive integer on success. Returns %-EINVAL if bits==0.
* Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on
* memory allocation failure.
*/
int amdgpu_pasid_alloc(unsigned int bits)
{
int pasid = -EINVAL;
for (bits = min(bits, 31U); bits > 0; bits--) {
pasid = ida_simple_get(&amdgpu_pasid_ida,
1U << (bits - 1), 1U << bits,
GFP_KERNEL);
if (pasid != -ENOSPC)
break;
}
return pasid;
}
/**
* amdgpu_pasid_free - Free a PASID
* @pasid: PASID to free
*/
void amdgpu_pasid_free(unsigned int pasid)
{
ida_simple_remove(&amdgpu_pasid_ida, pasid);
}
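
The allocator walks the requested width downwards so that narrow PASIDs stay available for hardware that can only address a few bits; a minimal usage sketch follows, where everything except amdgpu_pasid_alloc()/amdgpu_pasid_free() is invented for illustration.

/* Minimal usage sketch, not part of this patch: grab a PASID no wider
 * than the width an (assumed) IOMMU reports and release it again when
 * the address space goes away.
 */
static int example_pasid;

static int example_pasid_get(unsigned int iommu_pasid_bits)
{
        int pasid = amdgpu_pasid_alloc(iommu_pasid_bits);

        if (pasid < 0)          /* -EINVAL, -ENOSPC or -ENOMEM */
                return pasid;

        example_pasid = pasid;
        return 0;
}

static void example_pasid_put(void)
{
        if (example_pasid > 0) {
                amdgpu_pasid_free(example_pasid);
                example_pasid = 0;
        }
}
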
/*
* VMID manager
*
* VMIDs are a per VMHUB identifier for page tables handling.
*/
/**
* amdgpu_vmid_had_gpu_reset - check if reset occurred since last use
*
* @adev: amdgpu_device pointer
* @id: VMID structure
*
* Check if GPU reset occurred since last use of the VMID.
*/
bool amdgpu_vmid_had_gpu_reset(struct amdgpu_device *adev,
struct amdgpu_vmid *id)
{
return id->current_gpu_reset_count !=
atomic_read(&adev->gpu_reset_counter);
}
/* idr_mgr->lock must be held */
static int amdgpu_vmid_grab_reserved_locked(struct amdgpu_vm *vm,
struct amdgpu_ring *ring,
struct amdgpu_sync *sync,
struct dma_fence *fence,
struct amdgpu_job *job)
{
struct amdgpu_device *adev = ring->adev;
unsigned vmhub = ring->funcs->vmhub;
uint64_t fence_context = adev->fence_context + ring->idx;
struct amdgpu_vmid *id = vm->reserved_vmid[vmhub];
struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[vmhub];
struct dma_fence *updates = sync->last_vm_update;
int r = 0;
struct dma_fence *flushed, *tmp;
bool needs_flush = vm->use_cpu_for_update;
flushed = id->flushed_updates;
if ((amdgpu_vmid_had_gpu_reset(adev, id)) ||
(atomic64_read(&id->owner) != vm->entity.fence_context) ||
(job->vm_pd_addr != id->pd_gpu_addr) ||
(updates && (!flushed || updates->context != flushed->context ||
dma_fence_is_later(updates, flushed))) ||
(!id->last_flush || (id->last_flush->context != fence_context &&
!dma_fence_is_signaled(id->last_flush)))) {
needs_flush = true;
/* to prevent one context starved by another context */
id->pd_gpu_addr = 0;
tmp = amdgpu_sync_peek_fence(&id->active, ring);
if (tmp) {
r = amdgpu_sync_fence(adev, sync, tmp, false);
return r;
}
}
/* Good we can use this VMID. Remember this submission as
* user of the VMID.
*/
r = amdgpu_sync_fence(ring->adev, &id->active, fence, false);
if (r)
goto out;
if (updates && (!flushed || updates->context != flushed->context ||
dma_fence_is_later(updates, flushed))) {
dma_fence_put(id->flushed_updates);
id->flushed_updates = dma_fence_get(updates);
}
id->pd_gpu_addr = job->vm_pd_addr;
atomic64_set(&id->owner, vm->entity.fence_context);
job->vm_needs_flush = needs_flush;
if (needs_flush) {
dma_fence_put(id->last_flush);
id->last_flush = NULL;
}
job->vmid = id - id_mgr->ids;
trace_amdgpu_vm_grab_id(vm, ring, job);
out:
return r;
}
/**
* amdgpu_vmid_grab - allocate the next free VMID
*
* @vm: vm to allocate id for
* @ring: ring we want to submit job to
* @sync: sync object where we add dependencies
* @fence: fence protecting ID from reuse
*
* Allocate an id for the vm, adding fences to the sync obj as necessary.
*/
int amdgpu_vmid_grab(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
struct amdgpu_sync *sync, struct dma_fence *fence,
struct amdgpu_job *job)
{
struct amdgpu_device *adev = ring->adev;
unsigned vmhub = ring->funcs->vmhub;
struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[vmhub];
uint64_t fence_context = adev->fence_context + ring->idx;
struct dma_fence *updates = sync->last_vm_update;
struct amdgpu_vmid *id, *idle;
struct dma_fence **fences;
unsigned i;
int r = 0;
mutex_lock(&id_mgr->lock);
if (vm->reserved_vmid[vmhub]) {
r = amdgpu_vmid_grab_reserved_locked(vm, ring, sync, fence, job);
mutex_unlock(&id_mgr->lock);
return r;
}
fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_KERNEL);
if (!fences) {
mutex_unlock(&id_mgr->lock);
return -ENOMEM;
}
/* Check if we have an idle VMID */
i = 0;
list_for_each_entry(idle, &id_mgr->ids_lru, list) {
fences[i] = amdgpu_sync_peek_fence(&idle->active, ring);
if (!fences[i])
break;
++i;
}
/* If we can't find an idle VMID to use, wait till one becomes available */
if (&idle->list == &id_mgr->ids_lru) {
u64 fence_context = adev->vm_manager.fence_context + ring->idx;
unsigned seqno = ++adev->vm_manager.seqno[ring->idx];
struct dma_fence_array *array;
unsigned j;
for (j = 0; j < i; ++j)
dma_fence_get(fences[j]);
array = dma_fence_array_create(i, fences, fence_context,
seqno, true);
if (!array) {
for (j = 0; j < i; ++j)
dma_fence_put(fences[j]);
kfree(fences);
r = -ENOMEM;
goto error;
}
r = amdgpu_sync_fence(ring->adev, sync, &array->base, false);
dma_fence_put(&array->base);
if (r)
goto error;
mutex_unlock(&id_mgr->lock);
return 0;
}
kfree(fences);
job->vm_needs_flush = vm->use_cpu_for_update;
/* Check if we can use a VMID already assigned to this VM */
list_for_each_entry_reverse(id, &id_mgr->ids_lru, list) {
struct dma_fence *flushed;
bool needs_flush = vm->use_cpu_for_update;
/* Check all the prerequisites to using this VMID */
if (amdgpu_vmid_had_gpu_reset(adev, id))
continue;
if (atomic64_read(&id->owner) != vm->entity.fence_context)
continue;
if (job->vm_pd_addr != id->pd_gpu_addr)
continue;
if (!id->last_flush ||
(id->last_flush->context != fence_context &&
!dma_fence_is_signaled(id->last_flush)))
needs_flush = true;
flushed = id->flushed_updates;
if (updates && (!flushed || dma_fence_is_later(updates, flushed)))
needs_flush = true;
/* Concurrent flushes are only possible starting with Vega10 */
if (adev->asic_type < CHIP_VEGA10 && needs_flush)
continue;
/* Good we can use this VMID. Remember this submission as
* user of the VMID.
*/
r = amdgpu_sync_fence(ring->adev, &id->active, fence, false);
if (r)
goto error;
if (updates && (!flushed || dma_fence_is_later(updates, flushed))) {
dma_fence_put(id->flushed_updates);
id->flushed_updates = dma_fence_get(updates);
}
if (needs_flush)
goto needs_flush;
else
goto no_flush_needed;
}
/* Still no ID to use? Then use the idle one found earlier */
id = idle;
/* Remember this submission as user of the VMID */
r = amdgpu_sync_fence(ring->adev, &id->active, fence, false);
if (r)
goto error;
id->pd_gpu_addr = job->vm_pd_addr;
dma_fence_put(id->flushed_updates);
id->flushed_updates = dma_fence_get(updates);
atomic64_set(&id->owner, vm->entity.fence_context);
needs_flush:
job->vm_needs_flush = true;
dma_fence_put(id->last_flush);
id->last_flush = NULL;
no_flush_needed:
list_move_tail(&id->list, &id_mgr->ids_lru);
job->vmid = id - id_mgr->ids;
trace_amdgpu_vm_grab_id(vm, ring, job);
error:
mutex_unlock(&id_mgr->lock);
return r;
}
int amdgpu_vmid_alloc_reserved(struct amdgpu_device *adev,
struct amdgpu_vm *vm,
unsigned vmhub)
{
struct amdgpu_vmid_mgr *id_mgr;
struct amdgpu_vmid *idle;
int r = 0;
id_mgr = &adev->vm_manager.id_mgr[vmhub];
mutex_lock(&id_mgr->lock);
if (vm->reserved_vmid[vmhub])
goto unlock;
if (atomic_inc_return(&id_mgr->reserved_vmid_num) >
AMDGPU_VM_MAX_RESERVED_VMID) {
DRM_ERROR("Over limitation of reserved vmid\n");
atomic_dec(&id_mgr->reserved_vmid_num);
r = -EINVAL;
goto unlock;
}
/* Select the first entry VMID */
idle = list_first_entry(&id_mgr->ids_lru, struct amdgpu_vmid, list);
list_del_init(&idle->list);
vm->reserved_vmid[vmhub] = idle;
mutex_unlock(&id_mgr->lock);
return 0;
unlock:
mutex_unlock(&id_mgr->lock);
return r;
}
void amdgpu_vmid_free_reserved(struct amdgpu_device *adev,
struct amdgpu_vm *vm,
unsigned vmhub)
{
struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[vmhub];
mutex_lock(&id_mgr->lock);
if (vm->reserved_vmid[vmhub]) {
list_add(&vm->reserved_vmid[vmhub]->list,
&id_mgr->ids_lru);
vm->reserved_vmid[vmhub] = NULL;
atomic_dec(&id_mgr->reserved_vmid_num);
}
mutex_unlock(&id_mgr->lock);
}
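
A hedged sketch of how the reserve/unreserve pair above might be used for a VM that wants a dedicated VMID on the GFX hub; the wrapper names are invented and AMDGPU_GFXHUB is assumed to be the hub index defined elsewhere in the driver.

/* Hypothetical wrappers around the two helpers above for a VM that wants
 * a dedicated VMID on the GFX hub.
 */
static int example_reserve_gfx_vmid(struct amdgpu_device *adev,
                                    struct amdgpu_vm *vm)
{
        return amdgpu_vmid_alloc_reserved(adev, vm, AMDGPU_GFXHUB);
}

static void example_unreserve_gfx_vmid(struct amdgpu_device *adev,
                                       struct amdgpu_vm *vm)
{
        amdgpu_vmid_free_reserved(adev, vm, AMDGPU_GFXHUB);
}
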
/**
* amdgpu_vmid_reset - reset VMID to zero
*
* @adev: amdgpu device structure
* @vmid: vmid number to use
*
* Reset saved GDS, GWS and OA to force switch on next flush.
*/
void amdgpu_vmid_reset(struct amdgpu_device *adev, unsigned vmhub,
unsigned vmid)
{
struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[vmhub];
struct amdgpu_vmid *id = &id_mgr->ids[vmid];
atomic64_set(&id->owner, 0);
id->gds_base = 0;
id->gds_size = 0;
id->gws_base = 0;
id->gws_size = 0;
id->oa_base = 0;
id->oa_size = 0;
}
/**
* amdgpu_vmid_reset_all - reset VMID to zero
*
* @adev: amdgpu device structure
*
* Reset VMID to force flush on next use
*/
void amdgpu_vmid_reset_all(struct amdgpu_device *adev)
{
unsigned i, j;
for (i = 0; i < AMDGPU_MAX_VMHUBS; ++i) {
struct amdgpu_vmid_mgr *id_mgr =
&adev->vm_manager.id_mgr[i];
for (j = 1; j < id_mgr->num_ids; ++j)
amdgpu_vmid_reset(adev, i, j);
}
}
/**
* amdgpu_vmid_mgr_init - init the VMID manager
*
* @adev: amdgpu_device pointer
*
* Initialize the VM manager structures
*/
void amdgpu_vmid_mgr_init(struct amdgpu_device *adev)
{
unsigned i, j;
for (i = 0; i < AMDGPU_MAX_VMHUBS; ++i) {
struct amdgpu_vmid_mgr *id_mgr =
&adev->vm_manager.id_mgr[i];
mutex_init(&id_mgr->lock);
INIT_LIST_HEAD(&id_mgr->ids_lru);
atomic_set(&id_mgr->reserved_vmid_num, 0);
/* skip over VMID 0, since it is the system VM */
for (j = 1; j < id_mgr->num_ids; ++j) {
amdgpu_vmid_reset(adev, i, j);
amdgpu_sync_create(&id_mgr->ids[j].active);
list_add_tail(&id_mgr->ids[j].list, &id_mgr->ids_lru);
}
}
adev->vm_manager.fence_context =
dma_fence_context_alloc(AMDGPU_MAX_RINGS);
for (i = 0; i < AMDGPU_MAX_RINGS; ++i)
adev->vm_manager.seqno[i] = 0;
}
/**
* amdgpu_vmid_mgr_fini - cleanup VM manager
*
* @adev: amdgpu_device pointer
*
* Cleanup the VM manager and free resources.
*/
void amdgpu_vmid_mgr_fini(struct amdgpu_device *adev)
{
unsigned i, j;
for (i = 0; i < AMDGPU_MAX_VMHUBS; ++i) {
struct amdgpu_vmid_mgr *id_mgr =
&adev->vm_manager.id_mgr[i];
mutex_destroy(&id_mgr->lock);
for (j = 0; j < AMDGPU_NUM_VMID; ++j) {
struct amdgpu_vmid *id = &id_mgr->ids[j];
amdgpu_sync_free(&id->active);
dma_fence_put(id->flushed_updates);
dma_fence_put(id->last_flush);
}
}
}


@ -0,0 +1,91 @@
/*
* Copyright 2017 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*
*/
#ifndef __AMDGPU_IDS_H__
#define __AMDGPU_IDS_H__
#include <linux/types.h>
#include <linux/mutex.h>
#include <linux/list.h>
#include <linux/dma-fence.h>
#include "amdgpu_sync.h"
/* maximum number of VMIDs */
#define AMDGPU_NUM_VMID 16
struct amdgpu_device;
struct amdgpu_vm;
struct amdgpu_ring;
struct amdgpu_sync;
struct amdgpu_job;
struct amdgpu_vmid {
struct list_head list;
struct amdgpu_sync active;
struct dma_fence *last_flush;
atomic64_t owner;
uint64_t pd_gpu_addr;
/* last flushed PD/PT update */
struct dma_fence *flushed_updates;
uint32_t current_gpu_reset_count;
uint32_t gds_base;
uint32_t gds_size;
uint32_t gws_base;
uint32_t gws_size;
uint32_t oa_base;
uint32_t oa_size;
};
struct amdgpu_vmid_mgr {
struct mutex lock;
unsigned num_ids;
struct list_head ids_lru;
struct amdgpu_vmid ids[AMDGPU_NUM_VMID];
atomic_t reserved_vmid_num;
};
int amdgpu_pasid_alloc(unsigned int bits);
void amdgpu_pasid_free(unsigned int pasid);
bool amdgpu_vmid_had_gpu_reset(struct amdgpu_device *adev,
struct amdgpu_vmid *id);
int amdgpu_vmid_alloc_reserved(struct amdgpu_device *adev,
struct amdgpu_vm *vm,
unsigned vmhub);
void amdgpu_vmid_free_reserved(struct amdgpu_device *adev,
struct amdgpu_vm *vm,
unsigned vmhub);
int amdgpu_vmid_grab(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
struct amdgpu_sync *sync, struct dma_fence *fence,
struct amdgpu_job *job);
void amdgpu_vmid_reset(struct amdgpu_device *adev, unsigned vmhub,
unsigned vmid);
void amdgpu_vmid_reset_all(struct amdgpu_device *adev);
void amdgpu_vmid_mgr_init(struct amdgpu_device *adev);
void amdgpu_vmid_mgr_fini(struct amdgpu_device *adev);
#endif


@ -92,15 +92,15 @@ int amdgpu_ih_ring_init(struct amdgpu_device *adev, unsigned ring_size,
}
return 0;
} else {
r = amdgpu_wb_get(adev, &adev->irq.ih.wptr_offs);
r = amdgpu_device_wb_get(adev, &adev->irq.ih.wptr_offs);
if (r) {
dev_err(adev->dev, "(%d) ih wptr_offs wb alloc failed\n", r);
return r;
}
r = amdgpu_wb_get(adev, &adev->irq.ih.rptr_offs);
r = amdgpu_device_wb_get(adev, &adev->irq.ih.rptr_offs);
if (r) {
amdgpu_wb_free(adev, adev->irq.ih.wptr_offs);
amdgpu_device_wb_free(adev, adev->irq.ih.wptr_offs);
dev_err(adev->dev, "(%d) ih rptr_offs wb alloc failed\n", r);
return r;
}
@ -133,8 +133,8 @@ void amdgpu_ih_ring_fini(struct amdgpu_device *adev)
amdgpu_bo_free_kernel(&adev->irq.ih.ring_obj,
&adev->irq.ih.gpu_addr,
(void **)&adev->irq.ih.ring);
amdgpu_wb_free(adev, adev->irq.ih.wptr_offs);
amdgpu_wb_free(adev, adev->irq.ih.rptr_offs);
amdgpu_device_wb_free(adev, adev->irq.ih.wptr_offs);
amdgpu_device_wb_free(adev, adev->irq.ih.rptr_offs);
}
}


@ -105,8 +105,8 @@ struct amdgpu_iv_entry {
unsigned client_id;
unsigned src_id;
unsigned ring_id;
unsigned vm_id;
unsigned vm_id_src;
unsigned vmid;
unsigned vmid_src;
uint64_t timestamp;
unsigned timestamp_src;
unsigned pas_id;


@ -88,7 +88,7 @@ static void amdgpu_irq_reset_work_func(struct work_struct *work)
reset_work);
if (!amdgpu_sriov_vf(adev))
amdgpu_gpu_reset(adev);
amdgpu_device_gpu_recover(adev, NULL, false);
}
/* Disable *all* interrupts */
@ -232,7 +232,7 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
int ret = pci_enable_msi(adev->pdev);
if (!ret) {
adev->irq.msi_enabled = true;
dev_info(adev->dev, "amdgpu: using MSI.\n");
dev_dbg(adev->dev, "amdgpu: using MSI.\n");
}
}
@ -262,7 +262,7 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
return r;
}
DRM_INFO("amdgpu: irq initialized.\n");
DRM_DEBUG("amdgpu: irq initialized.\n");
return 0;
}


@ -28,7 +28,7 @@
#include "amdgpu.h"
#include "amdgpu_trace.h"
static void amdgpu_job_timedout(struct amd_sched_job *s_job)
static void amdgpu_job_timedout(struct drm_sched_job *s_job)
{
struct amdgpu_job *job = container_of(s_job, struct amdgpu_job, base);
@ -37,10 +37,7 @@ static void amdgpu_job_timedout(struct amd_sched_job *s_job)
atomic_read(&job->ring->fence_drv.last_seq),
job->ring->fence_drv.sync_seq);
if (amdgpu_sriov_vf(job->adev))
amdgpu_sriov_gpu_reset(job->adev, job);
else
amdgpu_gpu_reset(job->adev);
amdgpu_device_gpu_recover(job->adev, job, false);
}
int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
@ -63,7 +60,6 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
(*job)->num_ibs = num_ibs;
amdgpu_sync_create(&(*job)->sync);
amdgpu_sync_create(&(*job)->dep_sync);
amdgpu_sync_create(&(*job)->sched_sync);
(*job)->vram_lost_counter = atomic_read(&adev->vram_lost_counter);
@ -100,14 +96,13 @@ void amdgpu_job_free_resources(struct amdgpu_job *job)
amdgpu_ib_free(job->adev, &job->ibs[i], f);
}
static void amdgpu_job_free_cb(struct amd_sched_job *s_job)
static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
{
struct amdgpu_job *job = container_of(s_job, struct amdgpu_job, base);
amdgpu_ring_priority_put(job->ring, amd_sched_get_job_priority(s_job));
amdgpu_ring_priority_put(job->ring, s_job->s_priority);
dma_fence_put(job->fence);
amdgpu_sync_free(&job->sync);
amdgpu_sync_free(&job->dep_sync);
amdgpu_sync_free(&job->sched_sync);
kfree(job);
}
@ -118,13 +113,12 @@ void amdgpu_job_free(struct amdgpu_job *job)
dma_fence_put(job->fence);
amdgpu_sync_free(&job->sync);
amdgpu_sync_free(&job->dep_sync);
amdgpu_sync_free(&job->sched_sync);
kfree(job);
}
int amdgpu_job_submit(struct amdgpu_job *job, struct amdgpu_ring *ring,
struct amd_sched_entity *entity, void *owner,
struct drm_sched_entity *entity, void *owner,
struct dma_fence **f)
{
int r;
@ -133,7 +127,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct amdgpu_ring *ring,
if (!f)
return -EINVAL;
r = amd_sched_job_init(&job->base, &ring->sched, entity, owner);
r = drm_sched_job_init(&job->base, &ring->sched, entity, owner);
if (r)
return r;
@ -141,46 +135,47 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct amdgpu_ring *ring,
job->fence_ctx = entity->fence_context;
*f = dma_fence_get(&job->base.s_fence->finished);
amdgpu_job_free_resources(job);
amdgpu_ring_priority_get(job->ring,
amd_sched_get_job_priority(&job->base));
amd_sched_entity_push_job(&job->base);
amdgpu_ring_priority_get(job->ring, job->base.s_priority);
drm_sched_entity_push_job(&job->base, entity);
return 0;
}
static struct dma_fence *amdgpu_job_dependency(struct amd_sched_job *sched_job)
static struct dma_fence *amdgpu_job_dependency(struct drm_sched_job *sched_job,
struct drm_sched_entity *s_entity)
{
struct amdgpu_job *job = to_amdgpu_job(sched_job);
struct amdgpu_vm *vm = job->vm;
struct dma_fence *fence = amdgpu_sync_get_fence(&job->dep_sync);
bool explicit = false;
int r;
struct dma_fence *fence = amdgpu_sync_get_fence(&job->sync, &explicit);
if (amd_sched_dependency_optimized(fence, sched_job->s_entity)) {
r = amdgpu_sync_fence(job->adev, &job->sched_sync, fence);
if (r)
DRM_ERROR("Error adding fence to sync (%d)\n", r);
if (fence && explicit) {
if (drm_sched_dependency_optimized(fence, s_entity)) {
r = amdgpu_sync_fence(job->adev, &job->sched_sync, fence, false);
if (r)
DRM_ERROR("Error adding fence to sync (%d)\n", r);
}
}
if (!fence)
fence = amdgpu_sync_get_fence(&job->sync);
while (fence == NULL && vm && !job->vm_id) {
while (fence == NULL && vm && !job->vmid) {
struct amdgpu_ring *ring = job->ring;
r = amdgpu_vm_grab_id(vm, ring, &job->sync,
&job->base.s_fence->finished,
job);
r = amdgpu_vmid_grab(vm, ring, &job->sync,
&job->base.s_fence->finished,
job);
if (r)
DRM_ERROR("Error getting VM ID (%d)\n", r);
fence = amdgpu_sync_get_fence(&job->sync);
fence = amdgpu_sync_get_fence(&job->sync, NULL);
}
return fence;
}
static struct dma_fence *amdgpu_job_run(struct amd_sched_job *sched_job)
static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job)
{
struct dma_fence *fence = NULL;
struct dma_fence *fence = NULL, *finished;
struct amdgpu_device *adev;
struct amdgpu_job *job;
int r;
@ -190,15 +185,18 @@ static struct dma_fence *amdgpu_job_run(struct amd_sched_job *sched_job)
return NULL;
}
job = to_amdgpu_job(sched_job);
finished = &job->base.s_fence->finished;
adev = job->adev;
BUG_ON(amdgpu_sync_peek_fence(&job->sync, NULL));
trace_amdgpu_sched_run_job(job);
/* skip ib schedule when vram is lost */
if (job->vram_lost_counter != atomic_read(&adev->vram_lost_counter)) {
dma_fence_set_error(&job->base.s_fence->finished, -ECANCELED);
DRM_ERROR("Skip scheduling IBs!\n");
if (job->vram_lost_counter != atomic_read(&adev->vram_lost_counter))
dma_fence_set_error(finished, -ECANCELED);/* skip IB as well if VRAM lost */
if (finished->error < 0) {
DRM_INFO("Skip scheduling IBs!\n");
} else {
r = amdgpu_ib_schedule(job->ring, job->num_ibs, job->ibs, job,
&fence);
@ -213,7 +211,7 @@ static struct dma_fence *amdgpu_job_run(struct amd_sched_job *sched_job)
return fence;
}
const struct amd_sched_backend_ops amdgpu_sched_ops = {
const struct drm_sched_backend_ops amdgpu_sched_ops = {
.dependency = amdgpu_job_dependency,
.run_job = amdgpu_job_run,
.timedout_job = amdgpu_job_timedout,
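
With the scheduler now the shared drm_sched code, the driver side boils down to the three callbacks above plus a handful of drm_sched_* calls. A condensed sketch of the submit flow, stitched together from calls that appear in this file (the wrapper is hypothetical; it assumes the entity was set up earlier with drm_sched_entity_init(), as amdgpu_ttm.c below does):

    static int example_submit(struct amdgpu_ring *ring,
                              struct drm_sched_entity *entity,
                              struct amdgpu_job *job, void *owner,
                              struct dma_fence **f)
    {
            int r;

            /* attach the job to the entity and its scheduler */
            r = drm_sched_job_init(&job->base, &ring->sched, entity, owner);
            if (r)
                    return r;

            /* the "finished" fence is what callers wait on */
            *f = dma_fence_get(&job->base.s_fence->finished);

            /* queue it; the scheduler will later call .dependency, .run_job
             * and, on a hang, .timedout_job from the ops above */
            drm_sched_entity_push_job(&job->base, entity);
            return 0;
    }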


@ -63,8 +63,6 @@ void amdgpu_driver_unload_kms(struct drm_device *dev)
pm_runtime_forbid(dev->dev);
}
amdgpu_amdkfd_device_fini(adev);
amdgpu_acpi_fini(adev);
amdgpu_device_fini(adev);
@ -159,9 +157,6 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags)
"Error during ACPI methods call\n");
}
amdgpu_amdkfd_device_probe(adev);
amdgpu_amdkfd_device_init(adev);
if (amdgpu_device_is_px(dev)) {
pm_runtime_use_autosuspend(dev->dev);
pm_runtime_set_autosuspend_delay(dev->dev, 5000);
@ -171,9 +166,6 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags)
pm_runtime_put_autosuspend(dev->dev);
}
if (amdgpu_sriov_vf(adev))
amdgpu_virt_release_full_gpu(adev, true);
out:
if (r) {
/* balance pm_runtime_get_sync in amdgpu_driver_unload_kms */
@ -558,6 +550,7 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
}
case AMDGPU_INFO_DEV_INFO: {
struct drm_amdgpu_info_device dev_info = {};
uint64_t vm_size;
dev_info.device_id = dev->pdev->device;
dev_info.chip_rev = adev->rev_id;
@ -585,8 +578,17 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
dev_info.ids_flags |= AMDGPU_IDS_FLAGS_FUSION;
if (amdgpu_sriov_vf(adev))
dev_info.ids_flags |= AMDGPU_IDS_FLAGS_PREEMPTION;
vm_size = adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE;
dev_info.virtual_address_offset = AMDGPU_VA_RESERVED_SIZE;
dev_info.virtual_address_max = (uint64_t)adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE;
dev_info.virtual_address_max =
min(vm_size, AMDGPU_VA_HOLE_START);
vm_size -= AMDGPU_VA_RESERVED_SIZE;
if (vm_size > AMDGPU_VA_HOLE_START) {
dev_info.high_va_offset = AMDGPU_VA_HOLE_END;
dev_info.high_va_max = AMDGPU_VA_HOLE_END | vm_size;
}
dev_info.virtual_address_alignment = max((int)PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE);
dev_info.pte_fragment_size = (1 << adev->vm_manager.fragment_size) * AMDGPU_GPU_PAGE_SIZE;
dev_info.gart_page_size = AMDGPU_GPU_PAGE_SIZE;
@ -786,9 +788,7 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
*/
void amdgpu_driver_lastclose_kms(struct drm_device *dev)
{
struct amdgpu_device *adev = dev->dev_private;
amdgpu_fbdev_restore_mode(adev);
drm_fb_helper_lastclose(dev);
vga_switcheroo_process_delayed_switch();
}
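
The AMDGPU_INFO_DEV_INFO change above splits a large GPUVM into two usable ranges around the canonical-address hole and reports the upper one through the new high_va_offset/high_va_max fields. A rough illustration with assumed numbers (the EX_* constants stand in for AMDGPU_VA_HOLE_START/END, whose real values live in amdgpu.h, and the reserved size is likewise a placeholder):

    #define EX_VA_HOLE_START 0x0000800000000000ULL   /* assumed: end of low half    */
    #define EX_VA_HOLE_END   0xffff800000000000ULL   /* assumed: start of high half */

    u64 reserved = 1ULL << 20;                     /* stand-in for AMDGPU_VA_RESERVED_SIZE */
    u64 vm_size  = 1ULL << 48;                     /* 48-bit GPUVM: 256 TiB          */
    u64 low_max  = min(vm_size, EX_VA_HOLE_START); /* low range tops out at 128 TiB  */
    u64 high_off = EX_VA_HOLE_END;                 /* high range starts here         */
    u64 high_max = EX_VA_HOLE_END | (vm_size - reserved);  /* just below 2^64        */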


@ -89,7 +89,6 @@ enum amdgpu_hpd_id {
AMDGPU_HPD_4,
AMDGPU_HPD_5,
AMDGPU_HPD_6,
AMDGPU_HPD_LAST,
AMDGPU_HPD_NONE = 0xff,
};
@ -106,7 +105,6 @@ enum amdgpu_crtc_irq {
AMDGPU_CRTC_IRQ_VLINE4,
AMDGPU_CRTC_IRQ_VLINE5,
AMDGPU_CRTC_IRQ_VLINE6,
AMDGPU_CRTC_IRQ_LAST,
AMDGPU_CRTC_IRQ_NONE = 0xff
};
@ -117,7 +115,6 @@ enum amdgpu_pageflip_irq {
AMDGPU_PAGEFLIP_IRQ_D4,
AMDGPU_PAGEFLIP_IRQ_D5,
AMDGPU_PAGEFLIP_IRQ_D6,
AMDGPU_PAGEFLIP_IRQ_LAST,
AMDGPU_PAGEFLIP_IRQ_NONE = 0xff
};
@ -661,10 +658,6 @@ void amdgpu_fbdev_fini(struct amdgpu_device *adev);
void amdgpu_fbdev_set_suspend(struct amdgpu_device *adev, int state);
int amdgpu_fbdev_total_size(struct amdgpu_device *adev);
bool amdgpu_fbdev_robj_is_fb(struct amdgpu_device *adev, struct amdgpu_bo *robj);
void amdgpu_fbdev_restore_mode(struct amdgpu_device *adev);
void amdgpu_fb_output_poll_changed(struct amdgpu_device *adev);
int amdgpu_align_pitch(struct amdgpu_device *adev, int width, int bpp, bool tiled);


@ -37,6 +37,18 @@
#include "amdgpu.h"
#include "amdgpu_trace.h"
static bool amdgpu_need_backup(struct amdgpu_device *adev)
{
if (adev->flags & AMD_IS_APU)
return false;
if (amdgpu_gpu_recovery == 0 ||
(amdgpu_gpu_recovery == -1 && !amdgpu_sriov_vf(adev)))
return false;
return true;
}
static void amdgpu_ttm_bo_destroy(struct ttm_buffer_object *tbo)
{
struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev);
@ -281,6 +293,44 @@ void amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 *gpu_addr,
*cpu_addr = NULL;
}
/* Validate that the requested domain is big enough to hold the BO */
static bool amdgpu_bo_validate_size(struct amdgpu_device *adev,
unsigned long size, u32 domain)
{
struct ttm_mem_type_manager *man = NULL;
/*
* If GTT is part of the requested domains, the check must succeed
* to allow falling back to GTT
*/
if (domain & AMDGPU_GEM_DOMAIN_GTT) {
man = &adev->mman.bdev.man[TTM_PL_TT];
if (size < (man->size << PAGE_SHIFT))
return true;
else
goto fail;
}
if (domain & AMDGPU_GEM_DOMAIN_VRAM) {
man = &adev->mman.bdev.man[TTM_PL_VRAM];
if (size < (man->size << PAGE_SHIFT))
return true;
else
goto fail;
}
/* TODO add more domains checks, such as AMDGPU_GEM_DOMAIN_CPU */
return true;
fail:
DRM_DEBUG("BO size %lu > total memory in domain: %llu\n", size,
man->size << PAGE_SHIFT);
return false;
}
static int amdgpu_bo_do_create(struct amdgpu_device *adev,
unsigned long size, int byte_align,
bool kernel, u32 domain, u64 flags,
@ -289,16 +339,24 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
uint64_t init_value,
struct amdgpu_bo **bo_ptr)
{
struct ttm_operation_ctx ctx = {
.interruptible = !kernel,
.no_wait_gpu = false,
.allow_reserved_eviction = true,
.resv = resv
};
struct amdgpu_bo *bo;
enum ttm_bo_type type;
unsigned long page_align;
u64 initial_bytes_moved, bytes_moved;
size_t acc_size;
int r;
page_align = roundup(byte_align, PAGE_SIZE) >> PAGE_SHIFT;
size = ALIGN(size, PAGE_SIZE);
if (!amdgpu_bo_validate_size(adev, size, domain))
return -ENOMEM;
if (kernel) {
type = ttm_bo_type_kernel;
} else if (sg) {
@ -364,22 +422,19 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
bo->tbo.bdev = &adev->mman.bdev;
amdgpu_ttm_placement_from_domain(bo, domain);
initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
/* Kernel allocation are uninterruptible */
r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, type,
&bo->placement, page_align, !kernel, NULL,
&bo->placement, page_align, &ctx, NULL,
acc_size, sg, resv, &amdgpu_ttm_bo_destroy);
if (unlikely(r != 0))
return r;
bytes_moved = atomic64_read(&adev->num_bytes_moved) -
initial_bytes_moved;
if (adev->mc.visible_vram_size < adev->mc.real_vram_size &&
bo->tbo.mem.mem_type == TTM_PL_VRAM &&
bo->tbo.mem.start < adev->mc.visible_vram_size >> PAGE_SHIFT)
amdgpu_cs_report_moved_bytes(adev, bytes_moved, bytes_moved);
amdgpu_cs_report_moved_bytes(adev, ctx.bytes_moved,
ctx.bytes_moved);
else
amdgpu_cs_report_moved_bytes(adev, bytes_moved, 0);
amdgpu_cs_report_moved_bytes(adev, ctx.bytes_moved, 0);
if (kernel)
bo->tbo.priority = 1;
@ -511,6 +566,7 @@ err:
int amdgpu_bo_validate(struct amdgpu_bo *bo)
{
struct ttm_operation_ctx ctx = { false, false };
uint32_t domain;
int r;
@ -521,7 +577,7 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
retry:
amdgpu_ttm_placement_from_domain(bo, domain);
r = ttm_bo_validate(&bo->tbo, &bo->placement, false, false);
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
if (unlikely(r == -ENOMEM) && domain != bo->allowed_domains) {
domain = bo->allowed_domains;
goto retry;
@ -632,6 +688,7 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
u64 *gpu_addr)
{
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
struct ttm_operation_ctx ctx = { false, false };
int r, i;
if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm))
@ -647,7 +704,7 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
if (bo->pin_count) {
uint32_t mem_type = bo->tbo.mem.mem_type;
if (domain != amdgpu_mem_type_to_domain(mem_type))
if (!(domain & amdgpu_mem_type_to_domain(mem_type)))
return -EINVAL;
bo->pin_count++;
@ -682,21 +739,23 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
}
r = ttm_bo_validate(&bo->tbo, &bo->placement, false, false);
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
if (unlikely(r)) {
dev_err(adev->dev, "%p pin failed\n", bo);
goto error;
}
bo->pin_count = 1;
if (gpu_addr != NULL) {
r = amdgpu_ttm_bind(&bo->tbo, &bo->tbo.mem);
if (unlikely(r)) {
dev_err(adev->dev, "%p bind failed\n", bo);
goto error;
}
*gpu_addr = amdgpu_bo_gpu_offset(bo);
r = amdgpu_ttm_alloc_gart(&bo->tbo);
if (unlikely(r)) {
dev_err(adev->dev, "%p bind failed\n", bo);
goto error;
}
bo->pin_count = 1;
if (gpu_addr != NULL)
*gpu_addr = amdgpu_bo_gpu_offset(bo);
domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
adev->vram_pin_size += amdgpu_bo_size(bo);
if (bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS)
@ -717,6 +776,7 @@ int amdgpu_bo_pin(struct amdgpu_bo *bo, u32 domain, u64 *gpu_addr)
int amdgpu_bo_unpin(struct amdgpu_bo *bo)
{
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
struct ttm_operation_ctx ctx = { false, false };
int r, i;
if (!bo->pin_count) {
@ -730,7 +790,7 @@ int amdgpu_bo_unpin(struct amdgpu_bo *bo)
bo->placements[i].lpfn = 0;
bo->placements[i].flags &= ~TTM_PL_FLAG_NO_EVICT;
}
r = ttm_bo_validate(&bo->tbo, &bo->placement, false, false);
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
if (unlikely(r)) {
dev_err(adev->dev, "%p validate failed for unpin\n", bo);
goto error;
@ -779,8 +839,8 @@ int amdgpu_bo_init(struct amdgpu_device *adev)
adev->mc.vram_mtrr = arch_phys_wc_add(adev->mc.aper_base,
adev->mc.aper_size);
DRM_INFO("Detected VRAM RAM=%lluM, BAR=%lluM\n",
adev->mc.mc_vram_size >> 20,
(unsigned long long)adev->mc.aper_size >> 20);
adev->mc.mc_vram_size >> 20,
(unsigned long long)adev->mc.aper_size >> 20);
DRM_INFO("RAM width %dbits %s\n",
adev->mc.vram_width, amdgpu_vram_names[adev->mc.vram_type]);
return amdgpu_ttm_init(adev);
@ -902,6 +962,7 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
{
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
struct ttm_operation_ctx ctx = { false, false };
struct amdgpu_bo *abo;
unsigned long offset, size;
int r;
@ -935,7 +996,7 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
abo->placement.num_busy_placement = 1;
abo->placement.busy_placement = &abo->placements[1];
r = ttm_bo_validate(bo, &abo->placement, false, false);
r = ttm_bo_validate(bo, &abo->placement, &ctx);
if (unlikely(r != 0))
return r;
@ -980,7 +1041,7 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
{
WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_TT &&
!amdgpu_ttm_is_bound(bo->tbo.ttm));
!amdgpu_gtt_mgr_has_gart_addr(&bo->tbo.mem));
WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
!bo->pin_count);
WARN_ON_ONCE(bo->tbo.mem.start == AMDGPU_BO_INVALID_OFFSET);
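
The recurring mechanical change in this file is the new TTM operation context: the old (interruptible, no_wait_gpu) boolean pair becomes a struct ttm_operation_ctx, which the validate path also uses to hand back per-operation statistics such as bytes_moved. The pattern, condensed into a hypothetical helper (not new code in this pull):

    static int example_validate(struct amdgpu_device *adev, struct amdgpu_bo *bo)
    {
            struct ttm_operation_ctx ctx = {
                    .interruptible = false,
                    .no_wait_gpu   = false,
            };
            int r;

            /* was: ttm_bo_validate(&bo->tbo, &bo->placement, false, false) */
            r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
            if (!r)
                    amdgpu_cs_report_moved_bytes(adev, ctx.bytes_moved, 0);
            return r;
    }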


@ -187,7 +187,7 @@ static inline u64 amdgpu_bo_mmap_offset(struct amdgpu_bo *bo)
static inline bool amdgpu_bo_gpu_accessible(struct amdgpu_bo *bo)
{
switch (bo->tbo.mem.mem_type) {
case TTM_PL_TT: return amdgpu_ttm_is_bound(bo->tbo.ttm);
case TTM_PL_TT: return amdgpu_gtt_mgr_has_gart_addr(&bo->tbo.mem);
case TTM_PL_VRAM: return true;
default: return false;
}


@ -32,7 +32,6 @@
#include <linux/hwmon.h>
#include <linux/hwmon-sysfs.h>
#include "amd_powerplay.h"
static int amdgpu_debugfs_pm_init(struct amdgpu_device *adev);
@ -1279,16 +1278,16 @@ void amdgpu_dpm_enable_vce(struct amdgpu_device *adev, bool enable)
/* XXX select vce level based on ring/task */
adev->pm.dpm.vce_level = AMD_VCE_LEVEL_AC_ALL;
mutex_unlock(&adev->pm.mutex);
amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_CG_STATE_UNGATE);
amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_PG_STATE_UNGATE);
amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_CG_STATE_UNGATE);
amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_PG_STATE_UNGATE);
amdgpu_pm_compute_clocks(adev);
} else {
amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_PG_STATE_GATE);
amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_CG_STATE_GATE);
amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_PG_STATE_GATE);
amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_CG_STATE_GATE);
mutex_lock(&adev->pm.mutex);
adev->pm.dpm.vce_active = false;
mutex_unlock(&adev->pm.mutex);
@ -1585,7 +1584,7 @@ static int amdgpu_debugfs_pm_info(struct seq_file *m, void *data)
struct drm_device *ddev = adev->ddev;
u32 flags = 0;
amdgpu_get_clockgating_state(adev, &flags);
amdgpu_device_ip_get_clockgating_state(adev, &flags);
seq_printf(m, "Clock Gating Flags Mask: 0x%x\n", flags);
amdgpu_parse_cg_state(m, flags);
seq_printf(m, "\n");


@ -264,7 +264,7 @@ static int psp_hw_start(struct psp_context *psp)
struct amdgpu_device *adev = psp->adev;
int ret;
if (!amdgpu_sriov_vf(adev) || !adev->in_sriov_reset) {
if (!amdgpu_sriov_vf(adev) || !adev->in_gpu_reset) {
ret = psp_bootloader_load_sysdrv(psp);
if (ret)
return ret;
@ -334,23 +334,26 @@ static int psp_load_fw(struct amdgpu_device *adev)
int ret;
struct psp_context *psp = &adev->psp;
if (amdgpu_sriov_vf(adev) && adev->in_gpu_reset != 0)
goto skip_memalloc;
psp->cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL);
if (!psp->cmd)
return -ENOMEM;
ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
AMDGPU_GEM_DOMAIN_GTT,
&psp->fw_pri_bo,
&psp->fw_pri_mc_addr,
&psp->fw_pri_buf);
AMDGPU_GEM_DOMAIN_GTT,
&psp->fw_pri_bo,
&psp->fw_pri_mc_addr,
&psp->fw_pri_buf);
if (ret)
goto failed;
ret = amdgpu_bo_create_kernel(adev, PSP_FENCE_BUFFER_SIZE, PAGE_SIZE,
AMDGPU_GEM_DOMAIN_VRAM,
&psp->fence_buf_bo,
&psp->fence_buf_mc_addr,
&psp->fence_buf);
AMDGPU_GEM_DOMAIN_VRAM,
&psp->fence_buf_bo,
&psp->fence_buf_mc_addr,
&psp->fence_buf);
if (ret)
goto failed_mem2;
@ -375,6 +378,7 @@ static int psp_load_fw(struct amdgpu_device *adev)
if (ret)
goto failed_mem;
skip_memalloc:
ret = psp_hw_start(psp);
if (ret)
goto failed_mem;


@ -225,7 +225,7 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
/* Right now all IPs have only one instance - multiple rings. */
if (instance != 0) {
DRM_ERROR("invalid ip instance: %d\n", instance);
DRM_DEBUG("invalid ip instance: %d\n", instance);
return -EINVAL;
}
@ -255,13 +255,13 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
ip_num_rings = adev->vcn.num_enc_rings;
break;
default:
DRM_ERROR("unknown ip type: %d\n", hw_ip);
DRM_DEBUG("unknown ip type: %d\n", hw_ip);
return -EINVAL;
}
if (ring >= ip_num_rings) {
DRM_ERROR("Ring index:%d exceeds maximum:%d for ip:%d\n",
ring, ip_num_rings, hw_ip);
DRM_DEBUG("Ring index:%d exceeds maximum:%d for ip:%d\n",
ring, ip_num_rings, hw_ip);
return -EINVAL;
}
@ -292,7 +292,7 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
default:
*out_ring = NULL;
r = -EINVAL;
DRM_ERROR("unknown HW IP type: %d\n", mapper->hw_ip);
DRM_DEBUG("unknown HW IP type: %d\n", mapper->hw_ip);
}
out_unlock:


@ -164,7 +164,7 @@ void amdgpu_ring_undo(struct amdgpu_ring *ring)
* Release a request for executing at @priority
*/
void amdgpu_ring_priority_put(struct amdgpu_ring *ring,
enum amd_sched_priority priority)
enum drm_sched_priority priority)
{
int i;
@ -175,7 +175,7 @@ void amdgpu_ring_priority_put(struct amdgpu_ring *ring,
return;
/* no need to restore if the job is already at the lowest priority */
if (priority == AMD_SCHED_PRIORITY_NORMAL)
if (priority == DRM_SCHED_PRIORITY_NORMAL)
return;
mutex_lock(&ring->priority_mutex);
@ -184,8 +184,8 @@ void amdgpu_ring_priority_put(struct amdgpu_ring *ring,
goto out_unlock;
/* decay priority to the next level with a job available */
for (i = priority; i >= AMD_SCHED_PRIORITY_MIN; i--) {
if (i == AMD_SCHED_PRIORITY_NORMAL
for (i = priority; i >= DRM_SCHED_PRIORITY_MIN; i--) {
if (i == DRM_SCHED_PRIORITY_NORMAL
|| atomic_read(&ring->num_jobs[i])) {
ring->priority = i;
ring->funcs->set_priority(ring, i);
@ -206,7 +206,7 @@ out_unlock:
* Request a ring's priority to be raised to @priority (refcounted).
*/
void amdgpu_ring_priority_get(struct amdgpu_ring *ring,
enum amd_sched_priority priority)
enum drm_sched_priority priority)
{
if (!ring->funcs->set_priority)
return;
@ -263,25 +263,25 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
return r;
}
r = amdgpu_wb_get(adev, &ring->rptr_offs);
r = amdgpu_device_wb_get(adev, &ring->rptr_offs);
if (r) {
dev_err(adev->dev, "(%d) ring rptr_offs wb alloc failed\n", r);
return r;
}
r = amdgpu_wb_get(adev, &ring->wptr_offs);
r = amdgpu_device_wb_get(adev, &ring->wptr_offs);
if (r) {
dev_err(adev->dev, "(%d) ring wptr_offs wb alloc failed\n", r);
return r;
}
r = amdgpu_wb_get(adev, &ring->fence_offs);
r = amdgpu_device_wb_get(adev, &ring->fence_offs);
if (r) {
dev_err(adev->dev, "(%d) ring fence_offs wb alloc failed\n", r);
return r;
}
r = amdgpu_wb_get(adev, &ring->cond_exe_offs);
r = amdgpu_device_wb_get(adev, &ring->cond_exe_offs);
if (r) {
dev_err(adev->dev, "(%d) ring cond_exec_polling wb alloc failed\n", r);
return r;
@ -317,12 +317,12 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
}
ring->max_dw = max_dw;
ring->priority = AMD_SCHED_PRIORITY_NORMAL;
ring->priority = DRM_SCHED_PRIORITY_NORMAL;
mutex_init(&ring->priority_mutex);
INIT_LIST_HEAD(&ring->lru_list);
amdgpu_ring_lru_touch(adev, ring);
for (i = 0; i < AMD_SCHED_PRIORITY_MAX; ++i)
for (i = 0; i < DRM_SCHED_PRIORITY_MAX; ++i)
atomic_set(&ring->num_jobs[i], 0);
if (amdgpu_debugfs_ring_init(adev, ring)) {
@ -348,11 +348,11 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring)
if (!(ring->adev) || !(ring->adev->rings[ring->idx]))
return;
amdgpu_wb_free(ring->adev, ring->rptr_offs);
amdgpu_wb_free(ring->adev, ring->wptr_offs);
amdgpu_device_wb_free(ring->adev, ring->rptr_offs);
amdgpu_device_wb_free(ring->adev, ring->wptr_offs);
amdgpu_wb_free(ring->adev, ring->cond_exe_offs);
amdgpu_wb_free(ring->adev, ring->fence_offs);
amdgpu_device_wb_free(ring->adev, ring->cond_exe_offs);
amdgpu_device_wb_free(ring->adev, ring->fence_offs);
amdgpu_bo_free_kernel(&ring->ring_obj,
&ring->gpu_addr,


@ -25,7 +25,7 @@
#define __AMDGPU_RING_H__
#include <drm/amdgpu_drm.h>
#include "gpu_scheduler.h"
#include <drm/gpu_scheduler.h>
/* max number of rings */
#define AMDGPU_MAX_RINGS 18
@ -79,8 +79,7 @@ struct amdgpu_fence_driver {
int amdgpu_fence_driver_init(struct amdgpu_device *adev);
void amdgpu_fence_driver_fini(struct amdgpu_device *adev);
void amdgpu_fence_driver_force_completion(struct amdgpu_device *adev);
void amdgpu_fence_driver_force_completion_ring(struct amdgpu_ring *ring);
void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring);
int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
unsigned num_hw_submission);
@ -122,11 +121,11 @@ struct amdgpu_ring_funcs {
/* command emit functions */
void (*emit_ib)(struct amdgpu_ring *ring,
struct amdgpu_ib *ib,
unsigned vm_id, bool ctx_switch);
unsigned vmid, bool ctx_switch);
void (*emit_fence)(struct amdgpu_ring *ring, uint64_t addr,
uint64_t seq, unsigned flags);
void (*emit_pipeline_sync)(struct amdgpu_ring *ring);
void (*emit_vm_flush)(struct amdgpu_ring *ring, unsigned vm_id,
void (*emit_vm_flush)(struct amdgpu_ring *ring, unsigned vmid,
uint64_t pd_addr);
void (*emit_hdp_flush)(struct amdgpu_ring *ring);
void (*emit_hdp_invalidate)(struct amdgpu_ring *ring);
@ -155,14 +154,14 @@ struct amdgpu_ring_funcs {
void (*emit_tmz)(struct amdgpu_ring *ring, bool start);
/* priority functions */
void (*set_priority) (struct amdgpu_ring *ring,
enum amd_sched_priority priority);
enum drm_sched_priority priority);
};
struct amdgpu_ring {
struct amdgpu_device *adev;
const struct amdgpu_ring_funcs *funcs;
struct amdgpu_fence_driver fence_drv;
struct amd_gpu_scheduler sched;
struct drm_gpu_scheduler sched;
struct list_head lru_list;
struct amdgpu_bo *ring_obj;
@ -187,6 +186,7 @@ struct amdgpu_ring {
uint64_t eop_gpu_addr;
u32 doorbell_index;
bool use_doorbell;
bool use_pollmem;
unsigned wptr_offs;
unsigned fence_offs;
uint64_t current_ctx;
@ -197,7 +197,7 @@ struct amdgpu_ring {
unsigned vm_inv_eng;
bool has_compute_vm_bug;
atomic_t num_jobs[AMD_SCHED_PRIORITY_MAX];
atomic_t num_jobs[DRM_SCHED_PRIORITY_MAX];
struct mutex priority_mutex;
/* protected by priority_mutex */
int priority;
@ -213,9 +213,9 @@ void amdgpu_ring_generic_pad_ib(struct amdgpu_ring *ring, struct amdgpu_ib *ib);
void amdgpu_ring_commit(struct amdgpu_ring *ring);
void amdgpu_ring_undo(struct amdgpu_ring *ring);
void amdgpu_ring_priority_get(struct amdgpu_ring *ring,
enum amd_sched_priority priority);
enum drm_sched_priority priority);
void amdgpu_ring_priority_put(struct amdgpu_ring *ring,
enum amd_sched_priority priority);
enum drm_sched_priority priority);
int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
unsigned ring_size, struct amdgpu_irq_src *irq_src,
unsigned irq_type);
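
The ring priority interface keeps its get/put pairing, only the enum moves to drm_sched_priority. The expected pairing, mirroring what amdgpu_job.c above already does (shown here only as a reminder, not as additional code in this pull):

    /* on submit: raise the ring to at least the job's priority */
    amdgpu_ring_priority_get(job->ring, job->base.s_priority);

    /* once the job is freed: drop the request so the ring can decay back */
    amdgpu_ring_priority_put(job->ring, job->base.s_priority);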


@ -29,29 +29,29 @@
#include "amdgpu_vm.h"
enum amd_sched_priority amdgpu_to_sched_priority(int amdgpu_priority)
enum drm_sched_priority amdgpu_to_sched_priority(int amdgpu_priority)
{
switch (amdgpu_priority) {
case AMDGPU_CTX_PRIORITY_VERY_HIGH:
return AMD_SCHED_PRIORITY_HIGH_HW;
return DRM_SCHED_PRIORITY_HIGH_HW;
case AMDGPU_CTX_PRIORITY_HIGH:
return AMD_SCHED_PRIORITY_HIGH_SW;
return DRM_SCHED_PRIORITY_HIGH_SW;
case AMDGPU_CTX_PRIORITY_NORMAL:
return AMD_SCHED_PRIORITY_NORMAL;
return DRM_SCHED_PRIORITY_NORMAL;
case AMDGPU_CTX_PRIORITY_LOW:
case AMDGPU_CTX_PRIORITY_VERY_LOW:
return AMD_SCHED_PRIORITY_LOW;
return DRM_SCHED_PRIORITY_LOW;
case AMDGPU_CTX_PRIORITY_UNSET:
return AMD_SCHED_PRIORITY_UNSET;
return DRM_SCHED_PRIORITY_UNSET;
default:
WARN(1, "Invalid context priority %d\n", amdgpu_priority);
return AMD_SCHED_PRIORITY_INVALID;
return DRM_SCHED_PRIORITY_INVALID;
}
}
static int amdgpu_sched_process_priority_override(struct amdgpu_device *adev,
int fd,
enum amd_sched_priority priority)
enum drm_sched_priority priority)
{
struct file *filp = fcheck(fd);
struct drm_file *file;
@ -86,11 +86,11 @@ int amdgpu_sched_ioctl(struct drm_device *dev, void *data,
{
union drm_amdgpu_sched *args = data;
struct amdgpu_device *adev = dev->dev_private;
enum amd_sched_priority priority;
enum drm_sched_priority priority;
int r;
priority = amdgpu_to_sched_priority(args->in.priority);
if (args->in.flags || priority == AMD_SCHED_PRIORITY_INVALID)
if (args->in.flags || priority == DRM_SCHED_PRIORITY_INVALID)
return -EINVAL;
switch (args->in.op) {


@ -27,7 +27,7 @@
#include <drm/drmP.h>
enum amd_sched_priority amdgpu_to_sched_priority(int amdgpu_priority);
enum drm_sched_priority amdgpu_to_sched_priority(int amdgpu_priority);
int amdgpu_sched_ioctl(struct drm_device *dev, void *data,
struct drm_file *filp);


@ -35,6 +35,7 @@
struct amdgpu_sync_entry {
struct hlist_node node;
struct dma_fence *fence;
bool explicit;
};
static struct kmem_cache *amdgpu_sync_slab;
@ -63,7 +64,7 @@ void amdgpu_sync_create(struct amdgpu_sync *sync)
static bool amdgpu_sync_same_dev(struct amdgpu_device *adev,
struct dma_fence *f)
{
struct amd_sched_fence *s_fence = to_amd_sched_fence(f);
struct drm_sched_fence *s_fence = to_drm_sched_fence(f);
if (s_fence) {
struct amdgpu_ring *ring;
@ -84,7 +85,7 @@ static bool amdgpu_sync_same_dev(struct amdgpu_device *adev,
*/
static void *amdgpu_sync_get_owner(struct dma_fence *f)
{
struct amd_sched_fence *s_fence = to_amd_sched_fence(f);
struct drm_sched_fence *s_fence = to_drm_sched_fence(f);
if (s_fence)
return s_fence->owner;
@ -119,7 +120,7 @@ static void amdgpu_sync_keep_later(struct dma_fence **keep,
* Tries to add the fence to an existing hash entry. Returns true when an entry
* was found, false otherwise.
*/
static bool amdgpu_sync_add_later(struct amdgpu_sync *sync, struct dma_fence *f)
static bool amdgpu_sync_add_later(struct amdgpu_sync *sync, struct dma_fence *f, bool explicit)
{
struct amdgpu_sync_entry *e;
@ -128,6 +129,10 @@ static bool amdgpu_sync_add_later(struct amdgpu_sync *sync, struct dma_fence *f)
continue;
amdgpu_sync_keep_later(&e->fence, f);
/* Preserve the explicit flag so we don't lose the pipeline sync */
e->explicit |= explicit;
return true;
}
return false;
@ -141,24 +146,25 @@ static bool amdgpu_sync_add_later(struct amdgpu_sync *sync, struct dma_fence *f)
*
*/
int amdgpu_sync_fence(struct amdgpu_device *adev, struct amdgpu_sync *sync,
struct dma_fence *f)
struct dma_fence *f, bool explicit)
{
struct amdgpu_sync_entry *e;
if (!f)
return 0;
if (amdgpu_sync_same_dev(adev, f) &&
amdgpu_sync_get_owner(f) == AMDGPU_FENCE_OWNER_VM)
amdgpu_sync_keep_later(&sync->last_vm_update, f);
if (amdgpu_sync_add_later(sync, f))
if (amdgpu_sync_add_later(sync, f, explicit))
return 0;
e = kmem_cache_alloc(amdgpu_sync_slab, GFP_KERNEL);
if (!e)
return -ENOMEM;
e->explicit = explicit;
hash_add(sync->fences, &e->node, f->context);
e->fence = dma_fence_get(f);
return 0;
@ -189,10 +195,7 @@ int amdgpu_sync_resv(struct amdgpu_device *adev,
/* always sync to the exclusive fence */
f = reservation_object_get_excl(resv);
r = amdgpu_sync_fence(adev, sync, f);
if (explicit_sync)
return r;
r = amdgpu_sync_fence(adev, sync, f, false);
flist = reservation_object_get_list(resv);
if (!flist || r)
@ -212,15 +215,15 @@ int amdgpu_sync_resv(struct amdgpu_device *adev,
(fence_owner == AMDGPU_FENCE_OWNER_VM)))
continue;
/* Ignore fence from the same owner as
/* Ignore fences from the same owner and explicit ones, as
* long as the owner isn't undefined.
*/
if (owner != AMDGPU_FENCE_OWNER_UNDEFINED &&
fence_owner == owner)
(fence_owner == owner || explicit_sync))
continue;
}
r = amdgpu_sync_fence(adev, sync, f);
r = amdgpu_sync_fence(adev, sync, f, false);
if (r)
break;
}
@ -245,7 +248,7 @@ struct dma_fence *amdgpu_sync_peek_fence(struct amdgpu_sync *sync,
hash_for_each_safe(sync->fences, i, tmp, e, node) {
struct dma_fence *f = e->fence;
struct amd_sched_fence *s_fence = to_amd_sched_fence(f);
struct drm_sched_fence *s_fence = to_drm_sched_fence(f);
if (dma_fence_is_signaled(f)) {
hash_del(&e->node);
@ -275,19 +278,21 @@ struct dma_fence *amdgpu_sync_peek_fence(struct amdgpu_sync *sync,
* amdgpu_sync_get_fence - get the next fence from the sync object
*
* @sync: sync object to use
* @explicit: true if the next fence is explicit
*
* Get and remove the next fence from the sync object that is not yet signaled.
*/
struct dma_fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync)
struct dma_fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync, bool *explicit)
{
struct amdgpu_sync_entry *e;
struct hlist_node *tmp;
struct dma_fence *f;
int i;
hash_for_each_safe(sync->fences, i, tmp, e, node) {
f = e->fence;
if (explicit)
*explicit = e->explicit;
hash_del(&e->node);
kmem_cache_free(amdgpu_sync_slab, e);
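
amdgpu_sync now remembers whether a fence was added as an explicit dependency and hands that flag back together with the fence. A small sketch of both ends of the interface, using the signatures above and the job fields from amdgpu_job.c (fragment only, surrounding variables assumed to exist in the caller):

    bool explicit = false;
    struct dma_fence *f;
    int r;

    /* producer: record a dependency that userspace asked for explicitly */
    r = amdgpu_sync_fence(adev, &job->sync, fence, true);

    /* consumer: the flag comes back with the fence, so the scheduler can
     * insert a pipeline sync only for explicit dependencies */
    f = amdgpu_sync_get_fence(&job->sync, &explicit);
    if (f && explicit)
            r = amdgpu_sync_fence(adev, &job->sched_sync, f, false);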


@ -41,7 +41,7 @@ struct amdgpu_sync {
void amdgpu_sync_create(struct amdgpu_sync *sync);
int amdgpu_sync_fence(struct amdgpu_device *adev, struct amdgpu_sync *sync,
struct dma_fence *f);
struct dma_fence *f, bool explicit);
int amdgpu_sync_resv(struct amdgpu_device *adev,
struct amdgpu_sync *sync,
struct reservation_object *resv,
@ -49,7 +49,7 @@ int amdgpu_sync_resv(struct amdgpu_device *adev,
bool explicit_sync);
struct dma_fence *amdgpu_sync_peek_fence(struct amdgpu_sync *sync,
struct amdgpu_ring *ring);
struct dma_fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync);
struct dma_fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync, bool *explicit);
int amdgpu_sync_wait(struct amdgpu_sync *sync, bool intr);
void amdgpu_sync_free(struct amdgpu_sync *sync);
int amdgpu_sync_init(void);


@ -82,8 +82,8 @@ TRACE_EVENT(amdgpu_iv,
__field(unsigned, client_id)
__field(unsigned, src_id)
__field(unsigned, ring_id)
__field(unsigned, vm_id)
__field(unsigned, vm_id_src)
__field(unsigned, vmid)
__field(unsigned, vmid_src)
__field(uint64_t, timestamp)
__field(unsigned, timestamp_src)
__field(unsigned, pas_id)
@ -93,8 +93,8 @@ TRACE_EVENT(amdgpu_iv,
__entry->client_id = iv->client_id;
__entry->src_id = iv->src_id;
__entry->ring_id = iv->ring_id;
__entry->vm_id = iv->vm_id;
__entry->vm_id_src = iv->vm_id_src;
__entry->vmid = iv->vmid;
__entry->vmid_src = iv->vmid_src;
__entry->timestamp = iv->timestamp;
__entry->timestamp_src = iv->timestamp_src;
__entry->pas_id = iv->pas_id;
@ -103,9 +103,9 @@ TRACE_EVENT(amdgpu_iv,
__entry->src_data[2] = iv->src_data[2];
__entry->src_data[3] = iv->src_data[3];
),
TP_printk("client_id:%u src_id:%u ring:%u vm_id:%u timestamp: %llu pas_id:%u src_data: %08x %08x %08x %08x\n",
TP_printk("client_id:%u src_id:%u ring:%u vmid:%u timestamp: %llu pas_id:%u src_data: %08x %08x %08x %08x\n",
__entry->client_id, __entry->src_id,
__entry->ring_id, __entry->vm_id,
__entry->ring_id, __entry->vmid,
__entry->timestamp, __entry->pas_id,
__entry->src_data[0], __entry->src_data[1],
__entry->src_data[2], __entry->src_data[3])
@ -219,7 +219,7 @@ TRACE_EVENT(amdgpu_vm_grab_id,
TP_STRUCT__entry(
__field(struct amdgpu_vm *, vm)
__field(u32, ring)
__field(u32, vm_id)
__field(u32, vmid)
__field(u32, vm_hub)
__field(u64, pd_addr)
__field(u32, needs_flush)
@ -228,13 +228,13 @@ TRACE_EVENT(amdgpu_vm_grab_id,
TP_fast_assign(
__entry->vm = vm;
__entry->ring = ring->idx;
__entry->vm_id = job->vm_id;
__entry->vmid = job->vmid;
__entry->vm_hub = ring->funcs->vmhub,
__entry->pd_addr = job->vm_pd_addr;
__entry->needs_flush = job->vm_needs_flush;
),
TP_printk("vm=%p, ring=%u, id=%u, hub=%u, pd_addr=%010Lx needs_flush=%u",
__entry->vm, __entry->ring, __entry->vm_id,
__entry->vm, __entry->ring, __entry->vmid,
__entry->vm_hub, __entry->pd_addr, __entry->needs_flush)
);
@ -357,24 +357,24 @@ TRACE_EVENT(amdgpu_vm_copy_ptes,
);
TRACE_EVENT(amdgpu_vm_flush,
TP_PROTO(struct amdgpu_ring *ring, unsigned vm_id,
TP_PROTO(struct amdgpu_ring *ring, unsigned vmid,
uint64_t pd_addr),
TP_ARGS(ring, vm_id, pd_addr),
TP_ARGS(ring, vmid, pd_addr),
TP_STRUCT__entry(
__field(u32, ring)
__field(u32, vm_id)
__field(u32, vmid)
__field(u32, vm_hub)
__field(u64, pd_addr)
),
TP_fast_assign(
__entry->ring = ring->idx;
__entry->vm_id = vm_id;
__entry->vmid = vmid;
__entry->vm_hub = ring->funcs->vmhub;
__entry->pd_addr = pd_addr;
),
TP_printk("ring=%u, id=%u, hub=%u, pd_addr=%010Lx",
__entry->ring, __entry->vm_id,
__entry->ring, __entry->vmid,
__entry->vm_hub,__entry->pd_addr)
);


@ -76,7 +76,7 @@ static int amdgpu_ttm_global_init(struct amdgpu_device *adev)
{
struct drm_global_reference *global_ref;
struct amdgpu_ring *ring;
struct amd_sched_rq *rq;
struct drm_sched_rq *rq;
int r;
adev->mman.mem_global_referenced = false;
@ -108,9 +108,9 @@ static int amdgpu_ttm_global_init(struct amdgpu_device *adev)
mutex_init(&adev->mman.gtt_window_lock);
ring = adev->mman.buffer_funcs_ring;
rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_KERNEL];
r = amd_sched_entity_init(&ring->sched, &adev->mman.entity,
rq, amdgpu_sched_jobs);
rq = &ring->sched.sched_rq[DRM_SCHED_PRIORITY_KERNEL];
r = drm_sched_entity_init(&ring->sched, &adev->mman.entity,
rq, amdgpu_sched_jobs, NULL);
if (r) {
DRM_ERROR("Failed setting up TTM BO move run queue.\n");
goto error_entity;
@ -131,7 +131,7 @@ error_mem:
static void amdgpu_ttm_global_fini(struct amdgpu_device *adev)
{
if (adev->mman.mem_global_referenced) {
amd_sched_entity_fini(adev->mman.entity.sched,
drm_sched_entity_fini(adev->mman.entity.sched,
&adev->mman.entity);
mutex_destroy(&adev->mman.gtt_window_lock);
drm_global_item_unref(&adev->mman.bo_global_ref.ref);
@ -282,8 +282,7 @@ static uint64_t amdgpu_mm_node_addr(struct ttm_buffer_object *bo,
{
uint64_t addr = 0;
if (mem->mem_type != TTM_PL_TT ||
amdgpu_gtt_mgr_is_allocated(mem)) {
if (mem->mem_type != TTM_PL_TT || amdgpu_gtt_mgr_has_gart_addr(mem)) {
addr = mm_node->start << PAGE_SHIFT;
addr += bo->bdev->man[mem->mem_type].gpu_offset;
}
@ -369,7 +368,7 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
* dst to window 1
*/
if (src->mem->mem_type == TTM_PL_TT &&
!amdgpu_gtt_mgr_is_allocated(src->mem)) {
!amdgpu_gtt_mgr_has_gart_addr(src->mem)) {
r = amdgpu_map_buffer(src->bo, src->mem,
PFN_UP(cur_size + src_page_offset),
src_node_start, 0, ring,
@ -383,7 +382,7 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
}
if (dst->mem->mem_type == TTM_PL_TT &&
!amdgpu_gtt_mgr_is_allocated(dst->mem)) {
!amdgpu_gtt_mgr_has_gart_addr(dst->mem)) {
r = amdgpu_map_buffer(dst->bo, dst->mem,
PFN_UP(cur_size + dst_page_offset),
dst_node_start, 1, ring,
@ -467,9 +466,8 @@ error:
return r;
}
static int amdgpu_move_vram_ram(struct ttm_buffer_object *bo,
bool evict, bool interruptible,
bool no_wait_gpu,
static int amdgpu_move_vram_ram(struct ttm_buffer_object *bo, bool evict,
struct ttm_operation_ctx *ctx,
struct ttm_mem_reg *new_mem)
{
struct amdgpu_device *adev;
@ -489,8 +487,7 @@ static int amdgpu_move_vram_ram(struct ttm_buffer_object *bo,
placements.fpfn = 0;
placements.lpfn = 0;
placements.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_TT;
r = ttm_bo_mem_space(bo, &placement, &tmp_mem,
interruptible, no_wait_gpu);
r = ttm_bo_mem_space(bo, &placement, &tmp_mem, ctx);
if (unlikely(r)) {
return r;
}
@ -500,23 +497,22 @@ static int amdgpu_move_vram_ram(struct ttm_buffer_object *bo,
goto out_cleanup;
}
r = ttm_tt_bind(bo->ttm, &tmp_mem);
r = ttm_tt_bind(bo->ttm, &tmp_mem, ctx);
if (unlikely(r)) {
goto out_cleanup;
}
r = amdgpu_move_blit(bo, true, no_wait_gpu, &tmp_mem, old_mem);
r = amdgpu_move_blit(bo, true, ctx->no_wait_gpu, &tmp_mem, old_mem);
if (unlikely(r)) {
goto out_cleanup;
}
r = ttm_bo_move_ttm(bo, interruptible, no_wait_gpu, new_mem);
r = ttm_bo_move_ttm(bo, ctx, new_mem);
out_cleanup:
ttm_bo_mem_put(bo, &tmp_mem);
return r;
}
static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo,
bool evict, bool interruptible,
bool no_wait_gpu,
static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, bool evict,
struct ttm_operation_ctx *ctx,
struct ttm_mem_reg *new_mem)
{
struct amdgpu_device *adev;
@ -536,16 +532,15 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo,
placements.fpfn = 0;
placements.lpfn = 0;
placements.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_TT;
r = ttm_bo_mem_space(bo, &placement, &tmp_mem,
interruptible, no_wait_gpu);
r = ttm_bo_mem_space(bo, &placement, &tmp_mem, ctx);
if (unlikely(r)) {
return r;
}
r = ttm_bo_move_ttm(bo, interruptible, no_wait_gpu, &tmp_mem);
r = ttm_bo_move_ttm(bo, ctx, &tmp_mem);
if (unlikely(r)) {
goto out_cleanup;
}
r = amdgpu_move_blit(bo, true, no_wait_gpu, new_mem, old_mem);
r = amdgpu_move_blit(bo, true, ctx->no_wait_gpu, new_mem, old_mem);
if (unlikely(r)) {
goto out_cleanup;
}
@ -554,10 +549,9 @@ out_cleanup:
return r;
}
static int amdgpu_bo_move(struct ttm_buffer_object *bo,
bool evict, bool interruptible,
bool no_wait_gpu,
struct ttm_mem_reg *new_mem)
static int amdgpu_bo_move(struct ttm_buffer_object *bo, bool evict,
struct ttm_operation_ctx *ctx,
struct ttm_mem_reg *new_mem)
{
struct amdgpu_device *adev;
struct amdgpu_bo *abo;
@ -592,19 +586,18 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo,
if (old_mem->mem_type == TTM_PL_VRAM &&
new_mem->mem_type == TTM_PL_SYSTEM) {
r = amdgpu_move_vram_ram(bo, evict, interruptible,
no_wait_gpu, new_mem);
r = amdgpu_move_vram_ram(bo, evict, ctx, new_mem);
} else if (old_mem->mem_type == TTM_PL_SYSTEM &&
new_mem->mem_type == TTM_PL_VRAM) {
r = amdgpu_move_ram_vram(bo, evict, interruptible,
no_wait_gpu, new_mem);
r = amdgpu_move_ram_vram(bo, evict, ctx, new_mem);
} else {
r = amdgpu_move_blit(bo, evict, no_wait_gpu, new_mem, old_mem);
r = amdgpu_move_blit(bo, evict, ctx->no_wait_gpu,
new_mem, old_mem);
}
if (r) {
memcpy:
r = ttm_bo_move_memcpy(bo, interruptible, no_wait_gpu, new_mem);
r = ttm_bo_move_memcpy(bo, ctx, new_mem);
if (r) {
return r;
}
@ -690,7 +683,6 @@ struct amdgpu_ttm_tt {
struct list_head guptasks;
atomic_t mmu_invalidations;
uint32_t last_set_pages;
struct list_head list;
};
int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages)
@ -861,44 +853,35 @@ static int amdgpu_ttm_backend_bind(struct ttm_tt *ttm,
bo_mem->mem_type == AMDGPU_PL_OA)
return -EINVAL;
if (!amdgpu_gtt_mgr_is_allocated(bo_mem))
if (!amdgpu_gtt_mgr_has_gart_addr(bo_mem)) {
gtt->offset = AMDGPU_BO_INVALID_OFFSET;
return 0;
}
spin_lock(&gtt->adev->gtt_list_lock);
flags = amdgpu_ttm_tt_pte_flags(gtt->adev, ttm, bo_mem);
gtt->offset = (u64)bo_mem->start << PAGE_SHIFT;
r = amdgpu_gart_bind(gtt->adev, gtt->offset, ttm->num_pages,
ttm->pages, gtt->ttm.dma_address, flags);
if (r) {
if (r)
DRM_ERROR("failed to bind %lu pages at 0x%08llX\n",
ttm->num_pages, gtt->offset);
goto error_gart_bind;
}
list_add_tail(&gtt->list, &gtt->adev->gtt_list);
error_gart_bind:
spin_unlock(&gtt->adev->gtt_list_lock);
return r;
}
bool amdgpu_ttm_is_bound(struct ttm_tt *ttm)
{
struct amdgpu_ttm_tt *gtt = (void *)ttm;
return gtt && !list_empty(&gtt->list);
}
int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct ttm_mem_reg *bo_mem)
int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo)
{
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
struct ttm_tt *ttm = bo->ttm;
struct ttm_operation_ctx ctx = { false, false };
struct amdgpu_ttm_tt *gtt = (void*)bo->ttm;
struct ttm_mem_reg tmp;
struct ttm_placement placement;
struct ttm_place placements;
uint64_t flags;
int r;
if (!ttm || amdgpu_ttm_is_bound(ttm))
if (bo->mem.mem_type != TTM_PL_TT ||
amdgpu_gtt_mgr_has_gart_addr(&bo->mem))
return 0;
tmp = bo->mem;
@ -912,43 +895,44 @@ int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct ttm_mem_reg *bo_mem)
placements.flags = (bo->mem.placement & ~TTM_PL_MASK_MEM) |
TTM_PL_FLAG_TT;
r = ttm_bo_mem_space(bo, &placement, &tmp, true, false);
r = ttm_bo_mem_space(bo, &placement, &tmp, &ctx);
if (unlikely(r))
return r;
r = ttm_bo_move_ttm(bo, true, false, &tmp);
if (unlikely(r))
flags = amdgpu_ttm_tt_pte_flags(adev, bo->ttm, &tmp);
gtt->offset = (u64)tmp.start << PAGE_SHIFT;
r = amdgpu_gart_bind(adev, gtt->offset, bo->ttm->num_pages,
bo->ttm->pages, gtt->ttm.dma_address, flags);
if (unlikely(r)) {
ttm_bo_mem_put(bo, &tmp);
else
bo->offset = (bo->mem.start << PAGE_SHIFT) +
bo->bdev->man[bo->mem.mem_type].gpu_offset;
return r;
}
return r;
ttm_bo_mem_put(bo, &bo->mem);
bo->mem = tmp;
bo->offset = (bo->mem.start << PAGE_SHIFT) +
bo->bdev->man[bo->mem.mem_type].gpu_offset;
return 0;
}
int amdgpu_ttm_recover_gart(struct amdgpu_device *adev)
int amdgpu_ttm_recover_gart(struct ttm_buffer_object *tbo)
{
struct amdgpu_ttm_tt *gtt, *tmp;
struct ttm_mem_reg bo_mem;
struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev);
struct amdgpu_ttm_tt *gtt = (void *)tbo->ttm;
uint64_t flags;
int r;
bo_mem.mem_type = TTM_PL_TT;
spin_lock(&adev->gtt_list_lock);
list_for_each_entry_safe(gtt, tmp, &adev->gtt_list, list) {
flags = amdgpu_ttm_tt_pte_flags(gtt->adev, &gtt->ttm.ttm, &bo_mem);
r = amdgpu_gart_bind(adev, gtt->offset, gtt->ttm.ttm.num_pages,
gtt->ttm.ttm.pages, gtt->ttm.dma_address,
flags);
if (r) {
spin_unlock(&adev->gtt_list_lock);
DRM_ERROR("failed to bind %lu pages at 0x%08llX\n",
gtt->ttm.ttm.num_pages, gtt->offset);
return r;
}
}
spin_unlock(&adev->gtt_list_lock);
return 0;
if (!gtt)
return 0;
flags = amdgpu_ttm_tt_pte_flags(adev, &gtt->ttm.ttm, &tbo->mem);
r = amdgpu_gart_bind(adev, gtt->offset, gtt->ttm.ttm.num_pages,
gtt->ttm.ttm.pages, gtt->ttm.dma_address, flags);
if (r)
DRM_ERROR("failed to bind %lu pages at 0x%08llX\n",
gtt->ttm.ttm.num_pages, gtt->offset);
return r;
}
static int amdgpu_ttm_backend_unbind(struct ttm_tt *ttm)
@ -959,20 +943,14 @@ static int amdgpu_ttm_backend_unbind(struct ttm_tt *ttm)
if (gtt->userptr)
amdgpu_ttm_tt_unpin_userptr(ttm);
if (!amdgpu_ttm_is_bound(ttm))
if (gtt->offset == AMDGPU_BO_INVALID_OFFSET)
return 0;
/* unbind shouldn't be done for GDS/GWS/OA in ttm_bo_clean_mm */
spin_lock(&gtt->adev->gtt_list_lock);
r = amdgpu_gart_unbind(gtt->adev, gtt->offset, ttm->num_pages);
if (r) {
if (r)
DRM_ERROR("failed to unbind %lu pages at 0x%08llX\n",
gtt->ttm.ttm.num_pages, gtt->offset);
goto error_unbind;
}
list_del_init(&gtt->list);
error_unbind:
spin_unlock(&gtt->adev->gtt_list_lock);
return r;
}
@ -1009,11 +987,11 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_bo_device *bdev,
kfree(gtt);
return NULL;
}
INIT_LIST_HEAD(&gtt->list);
return &gtt->ttm.ttm;
}
static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm,
struct ttm_operation_ctx *ctx)
{
struct amdgpu_device *adev = amdgpu_ttm_adev(ttm->bdev);
struct amdgpu_ttm_tt *gtt = (void *)ttm;
@ -1041,11 +1019,11 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
#ifdef CONFIG_SWIOTLB
if (swiotlb_nr_tbl()) {
return ttm_dma_populate(&gtt->ttm, adev->dev);
return ttm_dma_populate(&gtt->ttm, adev->dev, ctx);
}
#endif
return ttm_populate_and_map_pages(adev->dev, &gtt->ttm);
return ttm_populate_and_map_pages(adev->dev, &gtt->ttm, ctx);
}
static void amdgpu_ttm_tt_unpopulate(struct ttm_tt *ttm)
@ -1292,6 +1270,101 @@ static struct ttm_bo_driver amdgpu_bo_driver = {
.access_memory = &amdgpu_ttm_access_memory
};
/*
* Firmware Reservation functions
*/
/**
* amdgpu_ttm_fw_reserve_vram_fini - free fw reserved vram
*
* @adev: amdgpu_device pointer
*
* free fw reserved vram if it has been reserved.
*/
static void amdgpu_ttm_fw_reserve_vram_fini(struct amdgpu_device *adev)
{
amdgpu_bo_free_kernel(&adev->fw_vram_usage.reserved_bo,
NULL, &adev->fw_vram_usage.va);
}
/**
* amdgpu_ttm_fw_reserve_vram_init - create bo vram reservation from fw
*
* @adev: amdgpu_device pointer
*
* create bo vram reservation from fw.
*/
static int amdgpu_ttm_fw_reserve_vram_init(struct amdgpu_device *adev)
{
struct ttm_operation_ctx ctx = { false, false };
int r = 0;
int i;
u64 vram_size = adev->mc.visible_vram_size;
u64 offset = adev->fw_vram_usage.start_offset;
u64 size = adev->fw_vram_usage.size;
struct amdgpu_bo *bo;
adev->fw_vram_usage.va = NULL;
adev->fw_vram_usage.reserved_bo = NULL;
if (adev->fw_vram_usage.size > 0 &&
adev->fw_vram_usage.size <= vram_size) {
r = amdgpu_bo_create(adev, adev->fw_vram_usage.size,
PAGE_SIZE, true, AMDGPU_GEM_DOMAIN_VRAM,
AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS, NULL, NULL, 0,
&adev->fw_vram_usage.reserved_bo);
if (r)
goto error_create;
r = amdgpu_bo_reserve(adev->fw_vram_usage.reserved_bo, false);
if (r)
goto error_reserve;
/* remove the original mem node and create a new one at the
* requested position
*/
bo = adev->fw_vram_usage.reserved_bo;
offset = ALIGN(offset, PAGE_SIZE);
for (i = 0; i < bo->placement.num_placement; ++i) {
bo->placements[i].fpfn = offset >> PAGE_SHIFT;
bo->placements[i].lpfn = (offset + size) >> PAGE_SHIFT;
}
ttm_bo_mem_put(&bo->tbo, &bo->tbo.mem);
r = ttm_bo_mem_space(&bo->tbo, &bo->placement,
&bo->tbo.mem, &ctx);
if (r)
goto error_pin;
r = amdgpu_bo_pin_restricted(adev->fw_vram_usage.reserved_bo,
AMDGPU_GEM_DOMAIN_VRAM,
adev->fw_vram_usage.start_offset,
(adev->fw_vram_usage.start_offset +
adev->fw_vram_usage.size), NULL);
if (r)
goto error_pin;
r = amdgpu_bo_kmap(adev->fw_vram_usage.reserved_bo,
&adev->fw_vram_usage.va);
if (r)
goto error_kmap;
amdgpu_bo_unreserve(adev->fw_vram_usage.reserved_bo);
}
return r;
error_kmap:
amdgpu_bo_unpin(adev->fw_vram_usage.reserved_bo);
error_pin:
amdgpu_bo_unreserve(adev->fw_vram_usage.reserved_bo);
error_reserve:
amdgpu_bo_unref(&adev->fw_vram_usage.reserved_bo);
error_create:
adev->fw_vram_usage.va = NULL;
adev->fw_vram_usage.reserved_bo = NULL;
return r;
}
int amdgpu_ttm_init(struct amdgpu_device *adev)
{
uint64_t gtt_size;
@ -1334,7 +1407,7 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
*The reserved vram for firmware must be pinned to the specified
*place on the VRAM, so reserve it early.
*/
r = amdgpu_fw_reserve_vram_init(adev);
r = amdgpu_ttm_fw_reserve_vram_init(adev);
if (r) {
return r;
}
@ -1348,9 +1421,14 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
(unsigned) (adev->mc.real_vram_size / (1024 * 1024)));
if (amdgpu_gtt_size == -1)
gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
adev->mc.mc_vram_size);
if (amdgpu_gtt_size == -1) {
struct sysinfo si;
si_meminfo(&si);
gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
adev->mc.mc_vram_size),
((uint64_t)si.totalram * si.mem_unit * 3/4));
}
else
gtt_size = (uint64_t)amdgpu_gtt_size << 20;
r = ttm_bo_init_mm(&adev->mman.bdev, TTM_PL_TT, gtt_size >> PAGE_SHIFT);
@ -1410,19 +1488,13 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
void amdgpu_ttm_fini(struct amdgpu_device *adev)
{
int r;
if (!adev->mman.initialized)
return;
amdgpu_ttm_debugfs_fini(adev);
if (adev->stolen_vga_memory) {
r = amdgpu_bo_reserve(adev->stolen_vga_memory, true);
if (r == 0) {
amdgpu_bo_unpin(adev->stolen_vga_memory);
amdgpu_bo_unreserve(adev->stolen_vga_memory);
}
amdgpu_bo_unref(&adev->stolen_vga_memory);
}
amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
amdgpu_ttm_fw_reserve_vram_fini(adev);
ttm_bo_clean_mm(&adev->mman.bdev, TTM_PL_VRAM);
ttm_bo_clean_mm(&adev->mman.bdev, TTM_PL_TT);
if (adev->gds.mem.total_size)
@ -1432,7 +1504,6 @@ void amdgpu_ttm_fini(struct amdgpu_device *adev)
if (adev->gds.oa.total_size)
ttm_bo_clean_mm(&adev->mman.bdev, AMDGPU_PL_OA);
ttm_bo_device_release(&adev->mman.bdev);
amdgpu_gart_fini(adev);
amdgpu_ttm_global_fini(adev);
adev->mman.initialized = false;
DRM_INFO("amdgpu: ttm finalized\n");
@ -1628,7 +1699,7 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
}
if (bo->tbo.mem.mem_type == TTM_PL_TT) {
r = amdgpu_ttm_bind(&bo->tbo, &bo->tbo.mem);
r = amdgpu_ttm_alloc_gart(&bo->tbo);
if (r)
return r;
}
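
The default GTT size is no longer just max(default, VRAM size); it is now additionally capped at three quarters of system memory. A worked example with assumed numbers (AMDGPU_DEFAULT_GTT_SIZE_MB is taken as 3072 here; check the header for the real value):

    /* 8 GiB VRAM, 16 GiB system RAM:
     *   max(3 GiB, 8 GiB)        = 8 GiB
     *   min(8 GiB, 16 GiB * 3/4) = 8 GiB  -> gtt_size = 8 GiB
     * the same card with only 4 GiB of RAM:
     *   min(8 GiB, 4 GiB * 3/4)  = 3 GiB  -> gtt_size = 3 GiB
     */
    gtt_size = min(max((u64)AMDGPU_DEFAULT_GTT_SIZE_MB << 20, adev->mc.mc_vram_size),
                   (u64)si.totalram * si.mem_unit * 3 / 4);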


@ -25,7 +25,7 @@
#define __AMDGPU_TTM_H__
#include "amdgpu.h"
#include "gpu_scheduler.h"
#include <drm/gpu_scheduler.h>
#define AMDGPU_PL_GDS (TTM_PL_PRIV + 0)
#define AMDGPU_PL_GWS (TTM_PL_PRIV + 1)
@ -55,7 +55,7 @@ struct amdgpu_mman {
struct mutex gtt_window_lock;
/* Scheduler entity for buffer moves */
struct amd_sched_entity entity;
struct drm_sched_entity entity;
};
struct amdgpu_copy_mem {
@ -67,8 +67,9 @@ struct amdgpu_copy_mem {
extern const struct ttm_mem_type_manager_func amdgpu_gtt_mgr_func;
extern const struct ttm_mem_type_manager_func amdgpu_vram_mgr_func;
bool amdgpu_gtt_mgr_is_allocated(struct ttm_mem_reg *mem);
bool amdgpu_gtt_mgr_has_gart_addr(struct ttm_mem_reg *mem);
uint64_t amdgpu_gtt_mgr_usage(struct ttm_mem_type_manager *man);
int amdgpu_gtt_mgr_recover(struct ttm_mem_type_manager *man);
uint64_t amdgpu_vram_mgr_usage(struct ttm_mem_type_manager *man);
uint64_t amdgpu_vram_mgr_vis_usage(struct ttm_mem_type_manager *man);
@ -90,9 +91,8 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
struct dma_fence **fence);
int amdgpu_mmap(struct file *filp, struct vm_area_struct *vma);
bool amdgpu_ttm_is_bound(struct ttm_tt *ttm);
int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct ttm_mem_reg *bo_mem);
int amdgpu_ttm_recover_gart(struct amdgpu_device *adev);
int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo);
int amdgpu_ttm_recover_gart(struct ttm_buffer_object *tbo);
int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages);
void amdgpu_ttm_tt_set_user_pages(struct ttm_tt *ttm, struct page **pages);
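
With the per-device gtt_list gone, GART binding becomes lazy: a buffer placed in GTT can sit at AMDGPU_BO_INVALID_OFFSET until something actually needs a GART address. Callers follow the pattern already visible in amdgpu_object.c and amdgpu_ttm.c above (sketch of the idiom, not additional code):

    /* bind into GART only if the BO is in GTT and has no GART address yet */
    if (bo->tbo.mem.mem_type == TTM_PL_TT &&
        !amdgpu_gtt_mgr_has_gart_addr(&bo->tbo.mem)) {
            r = amdgpu_ttm_alloc_gart(&bo->tbo);
            if (r)
                    return r;
    }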


@ -359,7 +359,6 @@ static int amdgpu_ucode_patch_jt(struct amdgpu_firmware_info *ucode,
int amdgpu_ucode_init_bo(struct amdgpu_device *adev)
{
struct amdgpu_bo **bo = &adev->firmware.fw_buf;
uint64_t fw_offset = 0;
int i, err;
struct amdgpu_firmware_info *ucode = NULL;
@ -370,36 +369,16 @@ int amdgpu_ucode_init_bo(struct amdgpu_device *adev)
return 0;
}
if (!amdgpu_sriov_vf(adev) || !adev->in_sriov_reset) {
err = amdgpu_bo_create(adev, adev->firmware.fw_size, PAGE_SIZE, true,
if (!adev->in_gpu_reset) {
err = amdgpu_bo_create_kernel(adev, adev->firmware.fw_size, PAGE_SIZE,
amdgpu_sriov_vf(adev) ? AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT,
AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
NULL, NULL, 0, bo);
&adev->firmware.fw_buf,
&adev->firmware.fw_buf_mc,
&adev->firmware.fw_buf_ptr);
if (err) {
dev_err(adev->dev, "(%d) Firmware buffer allocate failed\n", err);
dev_err(adev->dev, "failed to create kernel buffer for firmware.fw_buf\n");
goto failed;
}
err = amdgpu_bo_reserve(*bo, false);
if (err) {
dev_err(adev->dev, "(%d) Firmware buffer reserve failed\n", err);
goto failed_reserve;
}
err = amdgpu_bo_pin(*bo, amdgpu_sriov_vf(adev) ? AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT,
&adev->firmware.fw_buf_mc);
if (err) {
dev_err(adev->dev, "(%d) Firmware buffer pin failed\n", err);
goto failed_pin;
}
err = amdgpu_bo_kmap(*bo, &adev->firmware.fw_buf_ptr);
if (err) {
dev_err(adev->dev, "(%d) Firmware buffer kmap failed\n", err);
goto failed_kmap;
}
amdgpu_bo_unreserve(*bo);
}
memset(adev->firmware.fw_buf_ptr, 0, adev->firmware.fw_size);
@ -436,12 +415,6 @@ int amdgpu_ucode_init_bo(struct amdgpu_device *adev)
}
return 0;
failed_kmap:
amdgpu_bo_unpin(*bo);
failed_pin:
amdgpu_bo_unreserve(*bo);
failed_reserve:
amdgpu_bo_unref(bo);
failed:
if (err)
adev->firmware.load_type = AMDGPU_FW_LOAD_DIRECT;
@ -464,8 +437,10 @@ int amdgpu_ucode_fini_bo(struct amdgpu_device *adev)
ucode->kaddr = NULL;
}
}
amdgpu_bo_unref(&adev->firmware.fw_buf);
adev->firmware.fw_buf = NULL;
amdgpu_bo_free_kernel(&adev->firmware.fw_buf,
&adev->firmware.fw_buf_mc,
&adev->firmware.fw_buf_ptr);
return 0;
}


@ -116,7 +116,7 @@ static void amdgpu_uvd_idle_work_handler(struct work_struct *work);
int amdgpu_uvd_sw_init(struct amdgpu_device *adev)
{
struct amdgpu_ring *ring;
struct amd_sched_rq *rq;
struct drm_sched_rq *rq;
unsigned long bo_size;
const char *fw_name;
const struct common_firmware_header *hdr;
@ -230,9 +230,9 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev)
}
ring = &adev->uvd.ring;
rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_NORMAL];
r = amd_sched_entity_init(&ring->sched, &adev->uvd.entity,
rq, amdgpu_sched_jobs);
rq = &ring->sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL];
r = drm_sched_entity_init(&ring->sched, &adev->uvd.entity,
rq, amdgpu_sched_jobs, NULL);
if (r != 0) {
DRM_ERROR("Failed setting up UVD run queue.\n");
return r;
@ -244,7 +244,7 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev)
}
/* from uvd v5.0 HW addressing capacity increased to 64 bits */
if (!amdgpu_ip_block_version_cmp(adev, AMD_IP_BLOCK_TYPE_UVD, 5, 0))
if (!amdgpu_device_ip_block_version_cmp(adev, AMD_IP_BLOCK_TYPE_UVD, 5, 0))
adev->uvd.address_64_bit = true;
switch (adev->asic_type) {
@ -272,7 +272,7 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev)
int i;
kfree(adev->uvd.saved_bo);
amd_sched_entity_fini(&adev->uvd.ring.sched, &adev->uvd.entity);
drm_sched_entity_fini(&adev->uvd.ring.sched, &adev->uvd.entity);
amdgpu_bo_free_kernel(&adev->uvd.vcpu_bo,
&adev->uvd.gpu_addr,
@ -297,6 +297,8 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
if (adev->uvd.vcpu_bo == NULL)
return 0;
cancel_delayed_work_sync(&adev->uvd.idle_work);
for (i = 0; i < adev->uvd.max_handles; ++i)
if (atomic_read(&adev->uvd.handles[i]))
break;
@ -304,8 +306,6 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
if (i == AMDGPU_MAX_UVD_HANDLES)
return 0;
cancel_delayed_work_sync(&adev->uvd.idle_work);
size = amdgpu_bo_size(adev->uvd.vcpu_bo);
ptr = adev->uvd.cpu_addr;
@ -346,6 +346,8 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
ptr += le32_to_cpu(hdr->ucode_size_bytes);
}
memset_io(ptr, 0, size);
/* to restore uvd fence seq */
amdgpu_fence_driver_force_completion(&adev->uvd.ring);
}
return 0;
@ -408,6 +410,7 @@ static u64 amdgpu_uvd_get_addr_from_ctx(struct amdgpu_uvd_cs_ctx *ctx)
*/
static int amdgpu_uvd_cs_pass1(struct amdgpu_uvd_cs_ctx *ctx)
{
struct ttm_operation_ctx tctx = { false, false };
struct amdgpu_bo_va_mapping *mapping;
struct amdgpu_bo *bo;
uint32_t cmd;
@ -430,7 +433,7 @@ static int amdgpu_uvd_cs_pass1(struct amdgpu_uvd_cs_ctx *ctx)
}
amdgpu_uvd_force_into_uvd_segment(bo);
r = ttm_bo_validate(&bo->tbo, &bo->placement, false, false);
r = ttm_bo_validate(&bo->tbo, &bo->placement, &tctx);
}
return r;
@ -949,6 +952,7 @@ int amdgpu_uvd_ring_parse_cs(struct amdgpu_cs_parser *parser, uint32_t ib_idx)
static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
bool direct, struct dma_fence **fence)
{
struct ttm_operation_ctx ctx = { true, false };
struct ttm_validate_buffer tv;
struct ww_acquire_ctx ticket;
struct list_head head;
@ -975,7 +979,7 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
amdgpu_uvd_force_into_uvd_segment(bo);
}
r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
if (r)
goto err;
@ -1151,10 +1155,10 @@ static void amdgpu_uvd_idle_work_handler(struct work_struct *work)
} else {
amdgpu_asic_set_uvd_clocks(adev, 0, 0);
/* shutdown the UVD block */
amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
AMD_PG_STATE_GATE);
amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
AMD_CG_STATE_GATE);
amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
AMD_PG_STATE_GATE);
amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
AMD_CG_STATE_GATE);
}
} else {
schedule_delayed_work(&adev->uvd.idle_work, UVD_IDLE_TIMEOUT);
@ -1174,10 +1178,10 @@ void amdgpu_uvd_ring_begin_use(struct amdgpu_ring *ring)
amdgpu_dpm_enable_uvd(adev, true);
} else {
amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
AMD_CG_STATE_UNGATE);
amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
AMD_PG_STATE_UNGATE);
amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
AMD_CG_STATE_UNGATE);
amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
AMD_PG_STATE_UNGATE);
}
}
}
@ -1218,7 +1222,7 @@ int amdgpu_uvd_ring_test_ib(struct amdgpu_ring *ring, long timeout)
} else if (r < 0) {
DRM_ERROR("amdgpu: fence wait failed (%ld).\n", r);
} else {
DRM_INFO("ib test on ring %d succeeded\n", ring->idx);
DRM_DEBUG("ib test on ring %d succeeded\n", ring->idx);
r = 0;
}


@ -31,6 +31,10 @@
#define AMDGPU_UVD_SESSION_SIZE (50*1024)
#define AMDGPU_UVD_FIRMWARE_OFFSET 256
#define AMDGPU_UVD_FIRMWARE_SIZE(adev) \
(AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(((const struct common_firmware_header *)(adev)->uvd.fw->data)->ucode_size_bytes) + \
8) - AMDGPU_UVD_FIRMWARE_OFFSET)
struct amdgpu_uvd {
struct amdgpu_bo *vcpu_bo;
void *cpu_addr;
@ -47,8 +51,8 @@ struct amdgpu_uvd {
struct amdgpu_irq_src irq;
bool address_64_bit;
bool use_ctx_buf;
struct amd_sched_entity entity;
struct amd_sched_entity entity_enc;
struct drm_sched_entity entity;
struct drm_sched_entity entity_enc;
uint32_t srbm_soft_reset;
unsigned num_enc_rings;
};


@ -85,7 +85,7 @@ static void amdgpu_vce_idle_work_handler(struct work_struct *work);
int amdgpu_vce_sw_init(struct amdgpu_device *adev, unsigned long size)
{
struct amdgpu_ring *ring;
struct amd_sched_rq *rq;
struct drm_sched_rq *rq;
const char *fw_name;
const struct common_firmware_header *hdr;
unsigned ucode_version, version_major, version_minor, binary_id;
@ -174,9 +174,9 @@ int amdgpu_vce_sw_init(struct amdgpu_device *adev, unsigned long size)
}
ring = &adev->vce.ring[0];
rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_NORMAL];
r = amd_sched_entity_init(&ring->sched, &adev->vce.entity,
rq, amdgpu_sched_jobs);
rq = &ring->sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL];
r = drm_sched_entity_init(&ring->sched, &adev->vce.entity,
rq, amdgpu_sched_jobs, NULL);
if (r != 0) {
DRM_ERROR("Failed setting up VCE run queue.\n");
return r;
@ -207,7 +207,7 @@ int amdgpu_vce_sw_fini(struct amdgpu_device *adev)
if (adev->vce.vcpu_bo == NULL)
return 0;
amd_sched_entity_fini(&adev->vce.ring[0].sched, &adev->vce.entity);
drm_sched_entity_fini(&adev->vce.ring[0].sched, &adev->vce.entity);
amdgpu_bo_free_kernel(&adev->vce.vcpu_bo, &adev->vce.gpu_addr,
(void **)&adev->vce.cpu_addr);
@ -311,10 +311,10 @@ static void amdgpu_vce_idle_work_handler(struct work_struct *work)
amdgpu_dpm_enable_vce(adev, false);
} else {
amdgpu_asic_set_vce_clocks(adev, 0, 0);
amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_PG_STATE_GATE);
amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_CG_STATE_GATE);
amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_PG_STATE_GATE);
amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_CG_STATE_GATE);
}
} else {
schedule_delayed_work(&adev->vce.idle_work, VCE_IDLE_TIMEOUT);
@ -343,10 +343,10 @@ void amdgpu_vce_ring_begin_use(struct amdgpu_ring *ring)
amdgpu_dpm_enable_vce(adev, true);
} else {
amdgpu_asic_set_vce_clocks(adev, 53300, 40000);
amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_CG_STATE_UNGATE);
amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_PG_STATE_UNGATE);
amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_CG_STATE_UNGATE);
amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
AMD_PG_STATE_UNGATE);
}
}
@ -543,6 +543,55 @@ err:
return r;
}
/**
* amdgpu_vce_validate_bo - make sure not to cross 4GB boundary
*
* @p: parser context
* @ib_idx: indirect buffer index
* @lo: address of lower dword
* @hi: address of higher dword
* @size: minimum size
* @index: bs/fb index
*
* Make sure that no BO crosses a 4GB boundary.
*/
static int amdgpu_vce_validate_bo(struct amdgpu_cs_parser *p, uint32_t ib_idx,
int lo, int hi, unsigned size, int32_t index)
{
int64_t offset = ((uint64_t)size) * ((int64_t)index);
struct ttm_operation_ctx ctx = { false, false };
struct amdgpu_bo_va_mapping *mapping;
unsigned i, fpfn, lpfn;
struct amdgpu_bo *bo;
uint64_t addr;
int r;
addr = ((uint64_t)amdgpu_get_ib_value(p, ib_idx, lo)) |
((uint64_t)amdgpu_get_ib_value(p, ib_idx, hi)) << 32;
if (index >= 0) {
addr += offset;
fpfn = PAGE_ALIGN(offset) >> PAGE_SHIFT;
lpfn = 0x100000000ULL >> PAGE_SHIFT;
} else {
fpfn = 0;
lpfn = (0x100000000ULL - PAGE_ALIGN(offset)) >> PAGE_SHIFT;
}
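/*
 * Worked example with assumed values (4K pages): size = 4096 and index = 2
 * give offset = 8192, so fpfn = PAGE_ALIGN(8192) >> PAGE_SHIFT = 2 and
 * lpfn = 0x100000000ULL >> PAGE_SHIFT = 0x100000; these bounds are then
 * folded into the BO's placements below via max()/min().
 */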
r = amdgpu_cs_find_mapping(p, addr, &bo, &mapping);
if (r) {
DRM_ERROR("Can't find BO for addr 0x%010Lx %d %d %d %d\n",
addr, lo, hi, size, index);
return r;
}
for (i = 0; i < bo->placement.num_placement; ++i) {
bo->placements[i].fpfn = max(bo->placements[i].fpfn, fpfn);
bo->placements[i].lpfn = bo->placements[i].lpfn ?
min(bo->placements[i].lpfn, lpfn) : lpfn;
}
return ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
}
/**
* amdgpu_vce_cs_reloc - command submission relocation
*
@ -648,12 +697,13 @@ int amdgpu_vce_ring_parse_cs(struct amdgpu_cs_parser *p, uint32_t ib_idx)
uint32_t allocated = 0;
uint32_t tmp, handle = 0;
uint32_t *size = &tmp;
int i, r = 0, idx = 0;
unsigned idx;
int i, r = 0;
p->job->vm = NULL;
ib->gpu_addr = amdgpu_sa_bo_gpu_addr(ib->sa_bo);
while (idx < ib->length_dw) {
for (idx = 0; idx < ib->length_dw;) {
uint32_t len = amdgpu_get_ib_value(p, ib_idx, idx);
uint32_t cmd = amdgpu_get_ib_value(p, ib_idx, idx + 1);
@ -663,6 +713,54 @@ int amdgpu_vce_ring_parse_cs(struct amdgpu_cs_parser *p, uint32_t ib_idx)
goto out;
}
switch (cmd) {
case 0x00000002: /* task info */
fb_idx = amdgpu_get_ib_value(p, ib_idx, idx + 6);
bs_idx = amdgpu_get_ib_value(p, ib_idx, idx + 7);
break;
case 0x03000001: /* encode */
r = amdgpu_vce_validate_bo(p, ib_idx, idx + 10,
idx + 9, 0, 0);
if (r)
goto out;
r = amdgpu_vce_validate_bo(p, ib_idx, idx + 12,
idx + 11, 0, 0);
if (r)
goto out;
break;
case 0x05000001: /* context buffer */
r = amdgpu_vce_validate_bo(p, ib_idx, idx + 3,
idx + 2, 0, 0);
if (r)
goto out;
break;
case 0x05000004: /* video bitstream buffer */
tmp = amdgpu_get_ib_value(p, ib_idx, idx + 4);
r = amdgpu_vce_validate_bo(p, ib_idx, idx + 3, idx + 2,
tmp, bs_idx);
if (r)
goto out;
break;
case 0x05000005: /* feedback buffer */
r = amdgpu_vce_validate_bo(p, ib_idx, idx + 3, idx + 2,
4096, fb_idx);
if (r)
goto out;
break;
}
idx += len / 4;
}
for (idx = 0; idx < ib->length_dw;) {
uint32_t len = amdgpu_get_ib_value(p, ib_idx, idx);
uint32_t cmd = amdgpu_get_ib_value(p, ib_idx, idx + 1);
switch (cmd) {
case 0x00000001: /* session */
handle = amdgpu_get_ib_value(p, ib_idx, idx + 2);
@ -893,7 +991,7 @@ out:
*
*/
void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring, struct amdgpu_ib *ib,
unsigned vm_id, bool ctx_switch)
unsigned vmid, bool ctx_switch)
{
amdgpu_ring_write(ring, VCE_CMD_IB);
amdgpu_ring_write(ring, lower_32_bits(ib->gpu_addr));
@ -954,7 +1052,7 @@ int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
}
if (i < timeout) {
DRM_INFO("ring test on %d succeeded in %d usecs\n",
DRM_DEBUG("ring test on %d succeeded in %d usecs\n",
ring->idx, i);
} else {
DRM_ERROR("amdgpu: ring %d test failed\n",
@ -999,7 +1097,7 @@ int amdgpu_vce_ring_test_ib(struct amdgpu_ring *ring, long timeout)
} else if (r < 0) {
DRM_ERROR("amdgpu: fence wait failed (%ld).\n", r);
} else {
DRM_INFO("ib test on ring %d succeeded\n", ring->idx);
DRM_DEBUG("ib test on ring %d succeeded\n", ring->idx);
r = 0;
}
error:


@ -46,7 +46,7 @@ struct amdgpu_vce {
struct amdgpu_ring ring[AMDGPU_MAX_VCE_RINGS];
struct amdgpu_irq_src irq;
unsigned harvest_config;
struct amd_sched_entity entity;
struct drm_sched_entity entity;
uint32_t srbm_soft_reset;
unsigned num_rings;
};
@ -63,7 +63,7 @@ void amdgpu_vce_free_handles(struct amdgpu_device *adev, struct drm_file *filp);
int amdgpu_vce_ring_parse_cs(struct amdgpu_cs_parser *p, uint32_t ib_idx);
int amdgpu_vce_ring_parse_cs_vm(struct amdgpu_cs_parser *p, uint32_t ib_idx);
void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring, struct amdgpu_ib *ib,
unsigned vm_id, bool ctx_switch);
unsigned vmid, bool ctx_switch);
void amdgpu_vce_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 seq,
unsigned flags);
int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring);


@ -35,8 +35,7 @@
#include "soc15d.h"
#include "soc15_common.h"
#include "vega10/soc15ip.h"
#include "raven1/VCN/vcn_1_0_offset.h"
#include "vcn/vcn_1_0_offset.h"
/* 1 second timeout */
#define VCN_IDLE_TIMEOUT msecs_to_jiffies(1000)
@ -51,7 +50,7 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work);
int amdgpu_vcn_sw_init(struct amdgpu_device *adev)
{
struct amdgpu_ring *ring;
struct amd_sched_rq *rq;
struct drm_sched_rq *rq;
unsigned long bo_size;
const char *fw_name;
const struct common_firmware_header *hdr;
@ -104,18 +103,18 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)
}
ring = &adev->vcn.ring_dec;
rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_NORMAL];
r = amd_sched_entity_init(&ring->sched, &adev->vcn.entity_dec,
rq, amdgpu_sched_jobs);
rq = &ring->sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL];
r = drm_sched_entity_init(&ring->sched, &adev->vcn.entity_dec,
rq, amdgpu_sched_jobs, NULL);
if (r != 0) {
DRM_ERROR("Failed setting up VCN dec run queue.\n");
return r;
}
ring = &adev->vcn.ring_enc[0];
rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_NORMAL];
r = amd_sched_entity_init(&ring->sched, &adev->vcn.entity_enc,
rq, amdgpu_sched_jobs);
rq = &ring->sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL];
r = drm_sched_entity_init(&ring->sched, &adev->vcn.entity_enc,
rq, amdgpu_sched_jobs, NULL);
if (r != 0) {
DRM_ERROR("Failed setting up VCN enc run queue.\n");
return r;
@ -130,9 +129,9 @@ int amdgpu_vcn_sw_fini(struct amdgpu_device *adev)
kfree(adev->vcn.saved_bo);
amd_sched_entity_fini(&adev->vcn.ring_dec.sched, &adev->vcn.entity_dec);
drm_sched_entity_fini(&adev->vcn.ring_dec.sched, &adev->vcn.entity_dec);
amd_sched_entity_fini(&adev->vcn.ring_enc[0].sched, &adev->vcn.entity_enc);
drm_sched_entity_fini(&adev->vcn.ring_enc[0].sched, &adev->vcn.entity_enc);
amdgpu_bo_free_kernel(&adev->vcn.vcpu_bo,
&adev->vcn.gpu_addr,
@ -261,7 +260,7 @@ int amdgpu_vcn_dec_ring_test_ring(struct amdgpu_ring *ring)
}
if (i < adev->usec_timeout) {
DRM_INFO("ring test on %d succeeded in %d usecs\n",
DRM_DEBUG("ring test on %d succeeded in %d usecs\n",
ring->idx, i);
} else {
DRM_ERROR("amdgpu: ring %d test failed (0x%08X)\n",
@ -274,6 +273,7 @@ int amdgpu_vcn_dec_ring_test_ring(struct amdgpu_ring *ring)
static int amdgpu_vcn_dec_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
bool direct, struct dma_fence **fence)
{
struct ttm_operation_ctx ctx = { true, false };
struct ttm_validate_buffer tv;
struct ww_acquire_ctx ticket;
struct list_head head;
@ -294,7 +294,7 @@ static int amdgpu_vcn_dec_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *b
if (r)
return r;
r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
if (r)
goto err;
@ -467,7 +467,7 @@ int amdgpu_vcn_dec_ring_test_ib(struct amdgpu_ring *ring, long timeout)
} else if (r < 0) {
DRM_ERROR("amdgpu: fence wait failed (%ld).\n", r);
} else {
DRM_INFO("ib test on ring %d succeeded\n", ring->idx);
DRM_DEBUG("ib test on ring %d succeeded\n", ring->idx);
r = 0;
}
@ -500,7 +500,7 @@ int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring)
}
if (i < adev->usec_timeout) {
DRM_INFO("ring test on %d succeeded in %d usecs\n",
DRM_DEBUG("ring test on %d succeeded in %d usecs\n",
ring->idx, i);
} else {
DRM_ERROR("amdgpu: ring %d test failed\n",
@ -643,7 +643,7 @@ int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout)
} else if (r < 0) {
DRM_ERROR("amdgpu: fence wait failed (%ld).\n", r);
} else {
DRM_INFO("ib test on ring %d succeeded\n", ring->idx);
DRM_DEBUG("ib test on ring %d succeeded\n", ring->idx);
r = 0;
}
error:


@ -56,8 +56,8 @@ struct amdgpu_vcn {
struct amdgpu_ring ring_dec;
struct amdgpu_ring ring_enc[AMDGPU_VCN_MAX_ENC_RINGS];
struct amdgpu_irq_src irq;
struct amd_sched_entity entity_dec;
struct amd_sched_entity entity_enc;
struct drm_sched_entity entity_dec;
struct drm_sched_entity entity_enc;
unsigned num_enc_rings;
};


@ -24,6 +24,14 @@
#include "amdgpu.h"
#define MAX_KIQ_REG_WAIT 100000000 /* in usecs */
bool amdgpu_virt_mmio_blocked(struct amdgpu_device *adev)
{
/* By now all MMIO pages except the mailbox are blocked if blocking
* is enabled in the hypervisor. Choose SCRATCH_REG0 to test.
*/
return RREG32_NO_KIQ(0xc040) == 0xffffffff;
}
int amdgpu_allocate_static_csa(struct amdgpu_device *adev)
{
int r;
@ -39,6 +47,12 @@ int amdgpu_allocate_static_csa(struct amdgpu_device *adev)
return 0;
}
void amdgpu_free_static_csa(struct amdgpu_device *adev) {
amdgpu_bo_free_kernel(&adev->virt.csa_obj,
&adev->virt.csa_vmid0_addr,
NULL);
}
/*
* amdgpu_map_static_csa should be called during amdgpu_vm_init
* it maps virtual address "AMDGPU_VA_RESERVED_SIZE - AMDGPU_CSA_SIZE"
@ -107,8 +121,6 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
adev->enable_virtual_display = true;
adev->cg_flags = 0;
adev->pg_flags = 0;
mutex_init(&adev->virt.lock_reset);
}
uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
@ -227,6 +239,22 @@ int amdgpu_virt_reset_gpu(struct amdgpu_device *adev)
return 0;
}
/**
* amdgpu_virt_wait_reset() - wait for GPU reset to complete
* @adev: amdgpu device.
* Wait for a GPU reset to complete.
* Return: Zero on success, otherwise an error code.
*/
int amdgpu_virt_wait_reset(struct amdgpu_device *adev)
{
struct amdgpu_virt *virt = &adev->virt;
if (!virt->ops || !virt->ops->wait_reset)
return -EINVAL;
return virt->ops->wait_reset(adev);
}
/**
* amdgpu_virt_alloc_mm_table() - alloc memory for mm table
* @adev: amdgpu device.
@ -296,7 +324,6 @@ int amdgpu_virt_fw_reserve_get_checksum(void *obj,
void amdgpu_virt_init_data_exchange(struct amdgpu_device *adev)
{
uint32_t pf2vf_ver = 0;
uint32_t pf2vf_size = 0;
uint32_t checksum = 0;
uint32_t checkval;
@ -309,9 +336,9 @@ void amdgpu_virt_init_data_exchange(struct amdgpu_device *adev)
adev->virt.fw_reserve.p_pf2vf =
(struct amdgim_pf2vf_info_header *)(
adev->fw_vram_usage.va + AMDGIM_DATAEXCHANGE_OFFSET);
pf2vf_ver = adev->virt.fw_reserve.p_pf2vf->version;
AMDGPU_FW_VRAM_PF2VF_READ(adev, header.size, &pf2vf_size);
AMDGPU_FW_VRAM_PF2VF_READ(adev, checksum, &checksum);
AMDGPU_FW_VRAM_PF2VF_READ(adev, feature_flags, &adev->virt.gim_feature);
/* pf2vf message must be in 4K */
if (pf2vf_size > 0 && pf2vf_size < 4096) {


@ -55,6 +55,7 @@ struct amdgpu_virt_ops {
int (*req_full_gpu)(struct amdgpu_device *adev, bool init);
int (*rel_full_gpu)(struct amdgpu_device *adev, bool init);
int (*reset_gpu)(struct amdgpu_device *adev);
int (*wait_reset)(struct amdgpu_device *adev);
void (*trans_msg)(struct amdgpu_device *adev, u32 req, u32 data1, u32 data2, u32 data3);
};
@ -80,6 +81,8 @@ enum AMDGIM_FEATURE_FLAG {
AMDGIM_FEATURE_ERROR_LOG_COLLECT = 0x1,
/* GIM supports feature of loading uCodes */
AMDGIM_FEATURE_GIM_LOAD_UCODES = 0x2,
/* VRAM LOST by GIM */
AMDGIM_FEATURE_GIM_FLR_VRAMLOST = 0x4,
};
struct amdgim_pf2vf_info_header {
@ -238,7 +241,6 @@ struct amdgpu_virt {
uint64_t csa_vmid0_addr;
bool chained_ib_support;
uint32_t reg_val_offs;
struct mutex lock_reset;
struct amdgpu_irq_src ack_irq;
struct amdgpu_irq_src rcv_irq;
struct work_struct flr_work;
@ -246,6 +248,7 @@ struct amdgpu_virt {
const struct amdgpu_virt_ops *ops;
struct amdgpu_vf_error_buffer vf_errors;
struct amdgpu_virt_fw_reserve fw_reserve;
uint32_t gim_feature;
};
#define AMDGPU_CSA_SIZE (8 * 1024)
@ -276,16 +279,18 @@ static inline bool is_virtual_machine(void)
}
struct amdgpu_vm;
bool amdgpu_virt_mmio_blocked(struct amdgpu_device *adev);
int amdgpu_allocate_static_csa(struct amdgpu_device *adev);
int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
struct amdgpu_bo_va **bo_va);
void amdgpu_free_static_csa(struct amdgpu_device *adev);
void amdgpu_virt_init_setting(struct amdgpu_device *adev);
uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg);
void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v);
int amdgpu_virt_request_full_gpu(struct amdgpu_device *adev, bool init);
int amdgpu_virt_release_full_gpu(struct amdgpu_device *adev, bool init);
int amdgpu_virt_reset_gpu(struct amdgpu_device *adev);
int amdgpu_sriov_gpu_reset(struct amdgpu_device *adev, struct amdgpu_job *job);
int amdgpu_virt_wait_reset(struct amdgpu_device *adev);
int amdgpu_virt_alloc_mm_table(struct amdgpu_device *adev);
void amdgpu_virt_free_mm_table(struct amdgpu_device *adev);
int amdgpu_virt_fw_reserve_get_checksum(void *obj, unsigned long obj_size,

File diff suppressed because it is too large.


@ -24,12 +24,14 @@
#ifndef __AMDGPU_VM_H__
#define __AMDGPU_VM_H__
#include <linux/rbtree.h>
#include <linux/idr.h>
#include <linux/kfifo.h>
#include <linux/rbtree.h>
#include <drm/gpu_scheduler.h>
#include "gpu_scheduler.h"
#include "amdgpu_sync.h"
#include "amdgpu_ring.h"
#include "amdgpu_ids.h"
struct amdgpu_bo_va;
struct amdgpu_job;
@ -39,9 +41,6 @@ struct amdgpu_bo_list_entry;
* GPUVM handling
*/
/* maximum number of VMIDs */
#define AMDGPU_NUM_VM 16
/* Maximum number of PTEs the hardware can write with one command */
#define AMDGPU_VM_MAX_UPDATE_SIZE 0x3FFFF
@ -69,6 +68,12 @@ struct amdgpu_bo_list_entry;
/* PDE is handled as PTE for VEGA10 */
#define AMDGPU_PDE_PTE (1ULL << 54)
/* PTE is handled as PDE for VEGA10 (Translate Further) */
#define AMDGPU_PTE_TF (1ULL << 56)
/* PDE Block Fragment Size for VEGA10 */
#define AMDGPU_PDE_BFS(a) ((uint64_t)a << 59)
/* VEGA10 only */
#define AMDGPU_PTE_MTYPE(a) ((uint64_t)a << 57)
#define AMDGPU_PTE_MTYPE_MASK AMDGPU_PTE_MTYPE(3ULL)
@ -96,6 +101,19 @@ struct amdgpu_bo_list_entry;
/* hardcode that limit for now */
#define AMDGPU_VA_RESERVED_SIZE (8ULL << 20)
/* VA hole for 48bit addresses on Vega10 */
#define AMDGPU_VA_HOLE_START 0x0000800000000000ULL
#define AMDGPU_VA_HOLE_END 0xffff800000000000ULL
/*
* Hardware is programmed as if the hole doesn't exist with start and end
* address values.
*
* This mask is used to remove the upper 16 bits of the VA and so come up
* with the linear address value.
*/
#define AMDGPU_VA_HOLE_MASK 0x0000ffffffffffffULL
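/*
 * Illustrative example (assumed address): a VA in the upper range, e.g.
 * 0xffff800000100000ULL (at or above AMDGPU_VA_HOLE_END), becomes
 * 0x0000800000100000ULL after masking with AMDGPU_VA_HOLE_MASK, which is
 * the linear value the hardware is programmed with.
 */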
/* max vmids dedicated for process */
#define AMDGPU_VM_MAX_RESERVED_VMID 1
@ -106,6 +124,16 @@ struct amdgpu_bo_list_entry;
#define AMDGPU_VM_USE_CPU_FOR_GFX (1 << 0)
#define AMDGPU_VM_USE_CPU_FOR_COMPUTE (1 << 1)
/* VMPT level enumeration; the hierarchy is:
* PDB2->PDB1->PDB0->PTB
*/
enum amdgpu_vm_level {
AMDGPU_VM_PDB2,
AMDGPU_VM_PDB1,
AMDGPU_VM_PDB0,
AMDGPU_VM_PTB
};
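/*
 * A minimal reading of the enum above, assuming the root_level field added
 * to amdgpu_vm_manager below: the page table walk starts at whichever level
 * root_level selects and always ends at the page table block (AMDGPU_VM_PTB),
 * with PDB2 as the root only in the deepest, four-level layout.
 */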
/* base structure for tracking BO usage in a VM */
struct amdgpu_vm_bo_base {
/* constant after initialization */
@ -124,11 +152,10 @@ struct amdgpu_vm_bo_base {
struct amdgpu_vm_pt {
struct amdgpu_vm_bo_base base;
uint64_t addr;
bool huge;
/* array of page tables, one for each directory entry */
struct amdgpu_vm_pt *entries;
unsigned last_entry_used;
};
#define AMDGPU_VM_FAULT(pasid, addr) (((u64)(pasid) << 48) | (addr))
@ -162,13 +189,11 @@ struct amdgpu_vm {
spinlock_t freed_lock;
/* Scheduler entity for page table updates */
struct amd_sched_entity entity;
struct drm_sched_entity entity;
/* client id and PASID (TODO: replace client_id with PASID) */
u64 client_id;
unsigned int pasid;
/* dedicated to vm */
struct amdgpu_vm_id *reserved_vmid[AMDGPU_MAX_VMHUBS];
struct amdgpu_vmid *reserved_vmid[AMDGPU_MAX_VMHUBS];
/* Flag to indicate if VM tables are updated by CPU or GPU (SDMA) */
bool use_cpu_for_update;
@ -183,37 +208,9 @@ struct amdgpu_vm {
unsigned int fault_credit;
};
struct amdgpu_vm_id {
struct list_head list;
struct amdgpu_sync active;
struct dma_fence *last_flush;
atomic64_t owner;
uint64_t pd_gpu_addr;
/* last flushed PD/PT update */
struct dma_fence *flushed_updates;
uint32_t current_gpu_reset_count;
uint32_t gds_base;
uint32_t gds_size;
uint32_t gws_base;
uint32_t gws_size;
uint32_t oa_base;
uint32_t oa_size;
};
struct amdgpu_vm_id_manager {
struct mutex lock;
unsigned num_ids;
struct list_head ids_lru;
struct amdgpu_vm_id ids[AMDGPU_NUM_VM];
atomic_t reserved_vmid_num;
};
struct amdgpu_vm_manager {
/* Handling of VMIDs */
struct amdgpu_vm_id_manager id_mgr[AMDGPU_MAX_VMHUBS];
struct amdgpu_vmid_mgr id_mgr[AMDGPU_MAX_VMHUBS];
/* Handling of VM fences */
u64 fence_context;
@ -221,9 +218,9 @@ struct amdgpu_vm_manager {
uint64_t max_pfn;
uint32_t num_level;
uint64_t vm_size;
uint32_t block_size;
uint32_t fragment_size;
enum amdgpu_vm_level root_level;
/* vram base address for page table entry */
u64 vram_base_offset;
/* vm pte handling */
@ -231,8 +228,6 @@ struct amdgpu_vm_manager {
struct amdgpu_ring *vm_pte_rings[AMDGPU_MAX_RINGS];
unsigned vm_pte_num_rings;
atomic_t vm_pte_next_ring;
/* client id counter */
atomic64_t client_counter;
/* partial resident texture handling */
spinlock_t prt_lock;
@ -251,8 +246,6 @@ struct amdgpu_vm_manager {
spinlock_t pasid_lock;
};
int amdgpu_vm_alloc_pasid(unsigned int bits);
void amdgpu_vm_free_pasid(unsigned int pasid);
void amdgpu_vm_manager_init(struct amdgpu_device *adev);
void amdgpu_vm_manager_fini(struct amdgpu_device *adev);
int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
@ -270,13 +263,7 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm,
int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
struct amdgpu_vm *vm,
uint64_t saddr, uint64_t size);
int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
struct amdgpu_sync *sync, struct dma_fence *fence,
struct amdgpu_job *job);
int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job, bool need_pipe_sync);
void amdgpu_vm_reset_id(struct amdgpu_device *adev, unsigned vmhub,
unsigned vmid);
void amdgpu_vm_reset_all_ids(struct amdgpu_device *adev);
int amdgpu_vm_update_directories(struct amdgpu_device *adev,
struct amdgpu_vm *vm);
int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
@ -312,10 +299,9 @@ struct amdgpu_bo_va_mapping *amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm,
uint64_t addr);
void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
struct amdgpu_bo_va *bo_va);
void amdgpu_vm_set_fragment_size(struct amdgpu_device *adev,
uint32_t fragment_size_default);
void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size,
uint32_t fragment_size_default);
void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size,
uint32_t fragment_size_default, unsigned max_level,
unsigned max_bits);
int amdgpu_vm_ioctl(struct drm_device *dev, void *data, struct drm_file *filp);
bool amdgpu_vm_need_pipeline_sync(struct amdgpu_ring *ring,
struct amdgpu_job *job);

Some files were not shown because too many files have changed in this diff Show more