Commit 0417715

perex authored and tiwai committed
ALSA: compress_offload: introduce accel operation mode
There is a requirement to expose the audio hardware that accelerates various tasks for user space such as sample rate converters, compressed stream decoders, etc.

This is a description of the API extension for the compress ALSA API which is able to handle "tasks" that are not bound to real-time operations and allows for the serialization of operations.

For details, refer to the "compress-accel.rst" document.

Cc: Mark Brown <[email protected]>
Cc: Shengjiu Wang <[email protected]>
Cc: Nicolas Dufresne <[email protected]>
Cc: Amadeusz Sławiński <[email protected]>
Cc: Pierre-Louis Bossart <[email protected]>
Cc: Vinod Koul <[email protected]>
Signed-off-by: Jaroslav Kysela <[email protected]>
Reviewed-by: Amadeusz Sławiński <[email protected]>
Tested-by: Shengjiu Wang <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Link: https://patch.msgid.link/[email protected]
1 parent 42f7652 commit 0417715

5 files changed: +587 -8 lines changed
Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
==================================
ALSA Co-processor Acceleration API
==================================

Jaroslav Kysela <[email protected]>


Overview
========

There is a requirement to expose the audio hardware that accelerates various
tasks for user space such as sample rate converters, compressed
stream decoders, etc.

This is a description of the API extension for the compress ALSA API which
is able to handle "tasks" that are not bound to real-time operations
and allows for the serialization of operations.

Requirements
============

The main requirements are:

- serialization of multiple tasks for user space to allow multiple
  operations without user space intervention

- separate buffers (input + output) for each operation

- expose buffers using mmap to user space

- signal user space when the task is finished (standard poll mechanism)

Design
======

A new direction SND_COMPRESS_ACCEL is introduced to identify
the passthrough API.

The API extension shares device enumeration and parameters handling from
the main compressed API. All other realtime streaming ioctls are deactivated
and a new set of task related ioctls is introduced. The standard
read/write/mmap I/O operations are not supported in the passthrough device.

Device ("stream") state handling is reduced to OPEN/SETUP. All other
states are not available for the passthrough mode.

The data I/O mechanism uses the standard dma-buf interface with all its
advantages like mmap, standard I/O, buffer sharing etc. One buffer is used for
the input data and a second (separate) buffer is used for the output data. Each
task has separate I/O buffers.

For the buffering parameters, fragments is the limit of allocated tasks
for the given device. The fragment_size limits the input buffer size for the
given device. The output buffer size is determined by the driver (may be
different from the input buffer size).
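
The two limits described above can be pictured with a small user-space model. This is only a sketch: `struct accel_limits` and both helper functions are hypothetical names invented for illustration, not kernel or UAPI definitions, and the real checks in the ALSA core may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the buffering parameters (not a kernel type). */
struct accel_limits {
	uint32_t fragment_size;	/* upper bound for one task's input data, in bytes */
	uint32_t fragments;	/* upper bound for allocated tasks on the device */
};

/* May one more task be created while total_tasks tasks already exist? */
static bool accel_can_create_task(const struct accel_limits *lim,
				  uint32_t total_tasks)
{
	assert(lim != NULL);
	return total_tasks < lim->fragments;
}

/* Do input_size bytes of source data fit into a task's input buffer? */
static bool accel_input_fits(const struct accel_limits *lim,
			     uint64_t input_size)
{
	assert(lim != NULL);
	return input_size <= lim->fragment_size;
}
```

With `fragments = 2` and `fragment_size = 4096`, a third task would be refused and an input of 4097 bytes would not fit, while the output buffer size stays a driver decision that this model does not cover.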

State Machine
=============

The passthrough audio stream state machine is described below ::

                                       +----------+
                                       |          |
                                       |   OPEN   |
                                       |          |
                                       +----------+
                                             |
                                             |
                                             | compr_set_params()
                                             |
                                             v
         all passthrough task ops      +----------+
  +------------------------------------|          |
  |                                    |   SETUP  |
  |                                    |          |
  |                                    +----------+
  |                                          |
  +------------------------------------------+
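
The reduced state machine can be modelled in a few lines of C. The sketch below is illustrative only: the `accel_*` names are hypothetical, and it assumes nothing beyond what the diagram shows (setting parameters moves the stream from OPEN to SETUP, and task operations are only valid in SETUP).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two stream states used by the passthrough mode. */
enum accel_state {
	ACCEL_STATE_OPEN,
	ACCEL_STATE_SETUP,
};

struct accel_stream {
	enum accel_state state;
};

/* Models the compr_set_params() edge of the diagram: OPEN -> SETUP. */
static void accel_set_params(struct accel_stream *s)
{
	assert(s != NULL);
	s->state = ACCEL_STATE_SETUP;
}

/* Task operations loop on SETUP and are not available in OPEN. */
static bool accel_task_op_allowed(const struct accel_stream *s)
{
	assert(s != NULL);
	return s->state == ACCEL_STATE_SETUP;
}
```

A freshly opened stream rejects task operations until the parameters have been set, after which every task operation leaves the stream in SETUP.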


Passthrough operations (ioctls)
===============================

All operations are protected using stream->device->lock (mutex).

CREATE
------
Creates a set of input/output buffers. The input buffer size is
fragment_size. Allocates a unique seqno.

The hardware drivers allocate internal 'struct dma_buf' for both input and
output buffers (using 'dma_buf_export()' function). The anonymous
file descriptors for those buffers are passed to user space.

FREE
----
Frees a set of input/output buffers. If a task is active, the stop
operation is executed first. If seqno is zero, the operation is executed for
all tasks.

START
-----
Starts (queues) a task. There are two cases of the task start. The first
case is right after the task is created. In this case, origin_seqno must be
zero. The second case is for reusing an already finished task. The
origin_seqno must identify the task to be reused. In both cases, a new seqno
value is allocated and returned to user space.

The prerequisite is that the application has filled the input dma buffer with
new source data and set input_size to pass the real data size to the driver.

The order of data processing is preserved (the first started job must be
finished first).

If multiple tasks require state handling (e.g. resampling operation),
the user space may set the SND_COMPRESS_TFLG_NEW_STREAM flag to mark the
start of the new stream data. It is useful to keep the allocated buffers
for the new operation rather than using the open/close mechanism.
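
The sequence-number rules for START can be sketched as a small model. Everything here is an assumption made for illustration: the `accel_*` names are hypothetical, the model follows only the text above (a start names the task either by its current seqno with origin_seqno zero, or by origin_seqno for reuse, and receives a fresh seqno), and the kernel implementation may differ in detail. Zero is never handed out, since zero is reserved to mean "all tasks" in FREE and STOP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-stream counter: the last sequence number handed out. */
struct accel_seq {
	uint64_t last;
};

/* Hypothetical task: identified by its current sequence number. */
struct accel_task {
	uint64_t seqno;
};

/* Allocate the next sequence number, skipping the reserved value zero. */
static uint64_t accel_seqno_next(struct accel_seq *seq)
{
	assert(seq != NULL);
	if (++seq->last == 0)
		++seq->last;	/* unsigned wrap-around: skip zero */
	return seq->last;
}

/*
 * Model of START. With origin_seqno == 0 the task is named by seqno
 * (fresh after CREATE); otherwise origin_seqno names the task to reuse.
 * Returns the newly allocated seqno, or 0 if the request names another task.
 */
static uint64_t accel_task_start(struct accel_seq *seq, struct accel_task *task,
				 uint64_t seqno, uint64_t origin_seqno)
{
	uint64_t key = origin_seqno ? origin_seqno : seqno;

	assert(task != NULL);
	if (key != task->seqno)
		return 0;
	task->seqno = accel_seqno_next(seq);
	return task->seqno;
}
```

In this model a stale identifier (one that was replaced by an earlier START) no longer matches the task, which is why user space has to keep the seqno returned by the most recent call.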

STOP
----
Stops (dequeues) a task. If seqno is zero, the operation is executed for all
tasks.

STATUS
------
Obtains the task status (active, finished). Also, the driver will set
the real output data size (valid area in the output buffer).

Credits
=======
- Shengjiu Wang <[email protected]>
- Takashi Iwai <[email protected]>
- Vinod Koul <[email protected]>

include/sound/compress_driver.h

Lines changed: 46 additions & 0 deletions
@@ -19,6 +19,30 @@
 
 struct snd_compr_ops;
 
+/**
+ * struct snd_compr_task_runtime: task runtime description
+ * @list: list of all managed tasks
+ * @input: input DMA buffer
+ * @output: output DMA buffer
+ * @seqno: sequence number
+ * @input_size: really used data in the input buffer
+ * @output_size: really used data in the output buffer
+ * @flags: see SND_COMPRESS_TFLG_*
+ * @state: actual task state
+ * @private_value: used by the lowlevel driver (opaque)
+ */
+struct snd_compr_task_runtime {
+	struct list_head list;
+	struct dma_buf *input;
+	struct dma_buf *output;
+	u64 seqno;
+	u64 input_size;
+	u64 output_size;
+	u32 flags;
+	u8 state;
+	void *private_value;
+};
+
 /**
  * struct snd_compr_runtime: runtime stream description
  * @state: stream state
@@ -37,6 +61,10 @@ struct snd_compr_ops;
  * @dma_addr: physical buffer address (not accessible from main CPU)
  * @dma_bytes: size of DMA area
  * @dma_buffer_p: runtime dma buffer pointer
+ * @active_tasks: count of active tasks
+ * @total_tasks: count of all tasks
+ * @task_seqno: last task sequence number (!= 0)
+ * @tasks: list of all tasks
  */
 struct snd_compr_runtime {
 	snd_pcm_state_t state;
@@ -54,6 +82,13 @@ struct snd_compr_runtime {
 	dma_addr_t dma_addr;
 	size_t dma_bytes;
 	struct snd_dma_buffer *dma_buffer_p;
+
+#if IS_ENABLED(CONFIG_SND_COMPRESS_ACCEL)
+	u32 active_tasks;
+	u32 total_tasks;
+	u64 task_seqno;
+	struct list_head tasks;
+#endif
 };
 
 /**
@@ -132,6 +167,12 @@ struct snd_compr_ops {
 			struct snd_compr_caps *caps);
 	int (*get_codec_caps) (struct snd_compr_stream *stream,
 			struct snd_compr_codec_caps *codec);
+#if IS_ENABLED(CONFIG_SND_COMPRESS_ACCEL)
+	int (*task_create) (struct snd_compr_stream *stream, struct snd_compr_task_runtime *task);
+	int (*task_start) (struct snd_compr_stream *stream, struct snd_compr_task_runtime *task);
+	int (*task_stop) (struct snd_compr_stream *stream, struct snd_compr_task_runtime *task);
+	int (*task_free) (struct snd_compr_stream *stream, struct snd_compr_task_runtime *task);
+#endif
 };
 
 /**
@@ -242,4 +283,9 @@ int snd_compr_free_pages(struct snd_compr_stream *stream);
 int snd_compr_stop_error(struct snd_compr_stream *stream,
 			 snd_pcm_state_t state);
 
+#if IS_ENABLED(CONFIG_SND_COMPRESS_ACCEL)
+void snd_compr_task_finished(struct snd_compr_stream *stream,
+			     struct snd_compr_task_runtime *task);
+#endif
+
 #endif

include/uapi/sound/compress_offload.h

Lines changed: 62 additions & 2 deletions
@@ -14,7 +14,7 @@
 #include <sound/compress_params.h>
 
 
-#define SNDRV_COMPRESS_VERSION SNDRV_PROTOCOL_VERSION(0, 2, 0)
+#define SNDRV_COMPRESS_VERSION SNDRV_PROTOCOL_VERSION(0, 3, 0)
 /**
  * struct snd_compressed_buffer - compressed buffer
  * @fragment_size: size of buffer fragment in bytes
@@ -68,7 +68,8 @@ struct snd_compr_avail {
 
 enum snd_compr_direction {
 	SND_COMPRESS_PLAYBACK = 0,
-	SND_COMPRESS_CAPTURE
+	SND_COMPRESS_CAPTURE,
+	SND_COMPRESS_ACCEL
 };
 
 /**
@@ -127,6 +128,57 @@ struct snd_compr_metadata {
 	__u32 value[8];
 } __attribute__((packed, aligned(4)));
 
+/* flags for struct snd_compr_task */
+#define SND_COMPRESS_TFLG_NEW_STREAM (1<<0) /* mark for the new stream data */
+
+/**
+ * struct snd_compr_task - task primitive for non-realtime operation
+ * @seqno: sequence number (task identifier)
+ * @origin_seqno: previous sequence number (task identifier) - for reuse
+ * @input_fd: data input file descriptor (dma-buf)
+ * @output_fd: data output file descriptor (dma-buf)
+ * @input_size: filled data in bytes (from caller, must not exceed fragment size)
+ * @flags: see SND_COMPRESS_TFLG_* defines
+ */
+struct snd_compr_task {
+	__u64 seqno;
+	__u64 origin_seqno;
+	int input_fd;
+	int output_fd;
+	__u64 input_size;
+	__u32 flags;
+	__u8 reserved[16];
+} __attribute__((packed, aligned(4)));
+
+/**
+ * enum snd_compr_state - task state
+ * @SND_COMPRESS_TASK_STATE_IDLE: task is not queued
+ * @SND_COMPRESS_TASK_STATE_ACTIVE: task is in the queue
+ * @SND_COMPRESS_TASK_STATE_FINISHED: task was processed, output is available
+ */
+enum snd_compr_state {
+	SND_COMPRESS_TASK_STATE_IDLE = 0,
+	SND_COMPRESS_TASK_STATE_ACTIVE,
+	SND_COMPRESS_TASK_STATE_FINISHED
+};
+
+/**
+ * struct snd_compr_task_status - task status
+ * @seqno: sequence number (task identifier)
+ * @input_size: filled data in bytes (from user space)
+ * @output_size: filled data in bytes (from driver)
+ * @output_flags: reserved for future (all zeros - from driver)
+ * @state: actual task state (SND_COMPRESS_TASK_STATE_*)
+ */
+struct snd_compr_task_status {
+	__u64 seqno;
+	__u64 input_size;
+	__u64 output_size;
+	__u32 output_flags;
+	__u8 state;
+	__u8 reserved[15];
+} __attribute__((packed, aligned(4)));
+
 /*
  * compress path ioctl definitions
  * SNDRV_COMPRESS_GET_CAPS: Query capability of DSP
@@ -164,6 +216,14 @@ struct snd_compr_metadata {
 #define SNDRV_COMPRESS_DRAIN _IO('C', 0x34)
 #define SNDRV_COMPRESS_NEXT_TRACK _IO('C', 0x35)
 #define SNDRV_COMPRESS_PARTIAL_DRAIN _IO('C', 0x36)
+
+
+#define SNDRV_COMPRESS_TASK_CREATE _IOWR('C', 0x60, struct snd_compr_task)
+#define SNDRV_COMPRESS_TASK_FREE _IOW('C', 0x61, __u64)
+#define SNDRV_COMPRESS_TASK_START _IOWR('C', 0x62, struct snd_compr_task)
+#define SNDRV_COMPRESS_TASK_STOP _IOW('C', 0x63, __u64)
+#define SNDRV_COMPRESS_TASK_STATUS _IOWR('C', 0x68, struct snd_compr_task_status)
+
 /*
  * TODO
  * 1. add mmap support
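
Because both new UAPI structures are declared `packed, aligned(4)`, their layout is fixed and can be checked at compile time. The sketch below mirrors the two structures with `<stdint.h>` types so that it builds without the updated kernel headers. The `_model` names are hypothetical stand-ins, the sketch assumes a 4-byte `int` and a GCC or Clang compatible compiler, and the sizes (52 and 44 bytes) are derived here by summing the member sizes, not quoted from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct snd_compr_task, using fixed-width types. */
struct snd_compr_task_model {
	uint64_t seqno;
	uint64_t origin_seqno;
	int input_fd;
	int output_fd;
	uint64_t input_size;
	uint32_t flags;
	uint8_t reserved[16];
} __attribute__((packed, aligned(4)));

/* Stand-in for struct snd_compr_task_status, using fixed-width types. */
struct snd_compr_task_status_model {
	uint64_t seqno;
	uint64_t input_size;
	uint64_t output_size;
	uint32_t output_flags;
	uint8_t state;
	uint8_t reserved[15];
} __attribute__((packed, aligned(4)));

/* Assumption: int is 4 bytes, as on the common Linux ABIs. */
static_assert(sizeof(int) == 4, "sketch assumes a 4-byte int");

/* 8 + 8 + 4 + 4 + 8 + 4 + 16 = 52 bytes, no padding. */
static_assert(sizeof(struct snd_compr_task_model) == 52,
	      "unexpected task size");
static_assert(offsetof(struct snd_compr_task_model, input_size) == 24,
	      "unexpected input_size offset");

/* 8 + 8 + 8 + 4 + 1 + 15 = 44 bytes, no padding. */
static_assert(sizeof(struct snd_compr_task_status_model) == 44,
	      "unexpected status size");
static_assert(offsetof(struct snd_compr_task_status_model, state) == 28,
	      "unexpected state offset");
```

The reserved byte arrays pad each structure to a multiple of four bytes, which leaves room for future fields without changing the size encoded in the ioctl numbers.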

sound/core/Kconfig

Lines changed: 3 additions & 0 deletions
@@ -59,6 +59,9 @@ config SND_CORE_TEST
 config SND_COMPRESS_OFFLOAD
 	tristate
 
+config SND_COMPRESS_ACCEL
+	bool
+
 config SND_JACK
 	bool
 