AIR streaming protocol
#2
Multi-byte quantities appear to be big-endian (i.e. network byte order). This surprises me somewhat, since both supported hosts (PC and Mac) are little-endian.

The 22-byte payload header for outgoing (host->Devialet) packets seems to be:

byte 0 (1 byte): 0x44 ('D' for Devialet?)
byte 1 (1 byte): 0x6d ('m' for music?)
byte 2 (1 byte): 0x02
bytes 3- 6 (4 bytes): stream ID? (pseudo-random value, fixed for a given AIR stream)
bytes 7-11 (5 bytes): 0x00 0x00 0x00 0x02 0x01
bytes 12-15 (4 bytes): possibly high-order 32 bits of sequence number?
bytes 16-19 (4 bytes): (low order 32 bits of) sequence number
bytes 20-21 (2 bytes): number of samples carried by this payload (e.g. 0x00c8 = 200)
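
For anyone who wants to poke at their own captures, the layout above can be unpacked along these lines. This is only a sketch of my current guess at the header: the field names (`magic_d`, `seq_high`, etc.) and the `parse_header` helper are labels I've made up, and the example bytes are invented rather than taken from a real capture.

```python
import struct
from typing import NamedTuple

# Big-endian, no padding: 1+1+1+4+5+4+4+2 = 22 bytes.
HEADER_FORMAT = ">BBBI5sIIH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


class AirHeader(NamedTuple):
    # All field names are hypothetical labels, not official ones.
    magic_d: int       # byte 0, observed 0x44
    magic_m: int       # byte 1, observed 0x6d
    byte2: int         # byte 2, observed 0x02
    stream_id: int     # bytes 3-6
    fixed: bytes       # bytes 7-11, observed 00 00 00 02 01
    seq_high: int      # bytes 12-15, possibly high-order sequence bits
    seq_low: int       # bytes 16-19
    sample_count: int  # bytes 20-21


def parse_header(payload: bytes) -> AirHeader:
    """Unpack the first 22 bytes of an outgoing payload."""
    if len(payload) < HEADER_SIZE:
        raise ValueError("payload shorter than the 22-byte header")
    return AirHeader(*struct.unpack_from(HEADER_FORMAT, payload))


# Invented example bytes (the stream ID here is arbitrary).
example = bytes.fromhex(
    "446d02" "12345678" "0000000201" "00000000" "00000190" "00c8"
)
header = parse_header(example)
print(header.seq_low, header.sample_count)  # prints: 400 200
```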

The stream ID is presumably assigned when AIR opens the streaming session with the Devialet. It could be assigned by the Devialet or by AIR - maybe this is part of the discovery (rather than streaming) protocol. I can't see any pattern in how the stream ID is generated for successive streams, so I suspect it's a pseudo-random number.

The sequence number seems to be the number of the first audio sample carried within the payload. For example if each payload carries 200 samples the sequence number will be 0 for the first payload, 200 for the second, 400 for the third, etc.

At 44.1 kHz a 32-bit unsigned sequence number would wrap round after about 27 hours (2^32 / 44,100 is roughly 97,391 seconds). It is possible (likely?) that the protocol represents the sequence number as a 64-bit quantity to avoid wrap-round, with bytes 12-15 holding the high-order half. I'm running an experiment at the moment to see what happens when I keep an AIR session open for longer than the expected wrap-round time, so hopefully I can resolve that question soon.
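
The sequence numbering and the wrap-round estimate are simple arithmetic, sketched below. The function names are my own, and `combined_sequence` assumes (unconfirmed) that bytes 12-15 really are the high-order half of a 64-bit counter.

```python
SAMPLE_RATE_HZ = 44_100
SAMPLES_PER_PAYLOAD = 200


def first_sample_index(payload_number: int) -> int:
    """Sequence number expected in the Nth payload (counting from 0)."""
    return payload_number * SAMPLES_PER_PAYLOAD


def hours_until_wrap(counter_bits: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Hours of audio before an unsigned sample counter overflows."""
    return (2 ** counter_bits) / sample_rate_hz / 3600


def combined_sequence(seq_high: int, seq_low: int) -> int:
    # Assumption: bytes 12-15 are the high-order half of a 64-bit counter.
    return (seq_high << 32) | seq_low


print([first_sample_index(n) for n in range(3)])  # [0, 200, 400]
print(round(hours_until_wrap(32), 2))             # 27.05
```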

The per-channel data in outgoing (host->Devialet) packets is harder to understand.

I'm assuming for the moment that the 962-byte payload carries exactly 200 samples per channel. This tallies with the sample count in the payload header, and is roughly consistent with the size of the per-channel data if AIR represents each sample in 16 bits (200 x 2 = 400 bytes).

I've used a tone-generator program on my Mac to play a 441 Hz sinusoid through AIR, which resulted in the network capture posted above. At a sample rate of 44.1 kHz (= 100 x 441 Hz), with 200 samples per payload, you'd expect to see exactly two full cycles of the sine wave for each channel.
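
As a sanity check on the "two full cycles" expectation, here is a small sketch that generates the reference tone and counts the cycles per payload. The helper names and the unit amplitude are arbitrary choices of mine, not anything taken from AIR.

```python
import math


def tone(freq_hz: float, sample_rate_hz: float, count: int, amplitude: float = 1.0):
    """Return `count` samples of a sine wave starting at phase zero."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(count)
    ]


def cycles_in_payload(freq_hz: float, sample_rate_hz: float, count: int) -> float:
    """Number of full cycles covered by `count` consecutive samples."""
    return freq_hz * count / sample_rate_hz


samples = tone(441, 44_100, 200)
print(cycles_in_payload(441, 44_100, 200))  # 2.0
```

With 100 samples per cycle, sample 100 lands back on a zero crossing, so each 200-sample payload should start and end at the same phase.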

Based on a bit of guesswork, I can see a set of 199 16-bit values for each channel which are obviously derived from the original sinusoid. These start at byte offset 7 within the per-channel data. The plot shows these values against the corresponding byte offset within the overall payload (there's an offset of 1 I haven't got rid of):

   

This raises some obvious questions about how AIR is representing the audio data:
  • where's the first (or 200th) value?
  • why does it look as though the values plot a "half-wave rectified" sine wave?
  • why are the values covering the range (roughly) 1,000-33,000 given that I was playing a very low-volume tone?

I don't yet know the answers to those questions.
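
In case it helps anyone reproduce the plot, this is roughly how the 199 values can be pulled out of one channel's data. It assumes the values are big-endian unsigned 16-bit quantities starting at offset 7, which is my guess rather than anything confirmed; `extract_values` is just a name I've picked, and the test data below is synthetic.

```python
import struct

CHANNEL_HEADER_SIZE = 7  # bytes preceding the values in per-channel data
VALUE_COUNT = 199        # number of 16-bit values I can identify


def extract_values(channel_data: bytes, signed: bool = False) -> list:
    """Read 199 big-endian 16-bit values starting at byte offset 7."""
    code = "h" if signed else "H"
    fmt = ">" + str(VALUE_COUNT) + code
    return list(struct.unpack_from(fmt, channel_data, CHANNEL_HEADER_SIZE))


# Synthetic per-channel data: 7 header bytes followed by 1000, 1001, ...
fake_channel = bytes(7) + b"".join(
    v.to_bytes(2, "big") for v in range(1000, 1000 + VALUE_COUNT)
)
values = extract_values(fake_channel)
print(len(values), values[0], values[-1])  # 199 1000 1198
```

Passing `signed=True` reinterprets the same bytes as signed values, which might be worth trying given that the observed range runs slightly above 32,767.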

The "rectification" effect is repeatable using different tone generator applications.

It seems clear that AIR isn't just sending linear PCM data to the Devialet. (This is consistent with the fact that "silence" seems to be sent in a compressed form.)

I wonder whether AIR is encoding the 200 samples as a baseline value and a set of differences, or something similar? If so, it looks as though the differences may be normalised so that they occupy a full 16 bits. The information to re-build the original samples must then be encoded either in the payload header or the 7 bytes preceding the 398 bytes of sample data. This must include the baseline value and scaling factor, for example.
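
To make the idea concrete, here is a minimal sketch of plain baseline-plus-differences (delta) coding. This is purely an illustration of the hypothesis, not a claim about what AIR actually does, and it leaves out the normalisation/scaling step entirely. One thing in its favour: 200 samples would give one baseline and 199 differences, which matches the number of values I can see.

```python
def delta_encode(samples):
    """Split samples into a baseline (first sample) and successive differences."""
    baseline = samples[0]
    diffs = [b - a for a, b in zip(samples, samples[1:])]
    return baseline, diffs


def delta_decode(baseline, diffs):
    """Rebuild the original samples from the baseline and differences."""
    out = [baseline]
    for d in diffs:
        out.append(out[-1] + d)
    return out


samples = [0, 3, 7, 6, 2]
baseline, diffs = delta_encode(samples)
print(baseline, diffs)                         # 0 [3, 4, -1, -4]
print(delta_decode(baseline, diffs) == samples)  # True
```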

For my payload, these 7 bytes are:

0x04 0x03 0x00 0x03 0x03 0x10 0x10

These are the same for both channels.

I tried streaming a sinusoid from a different application, with a different volume setting, and found that the per-channel header changed to:

0xF3 0xE4 0x00 0x02 0x02 0x10 0x10

The two 0x10s might represent the fact that the values have 16 significant bits and/or are transferred in 16-bit units.

When the payload is representing silence, these 7 bytes become:

0x00 0x00 0x00 0x00 0x00 0x00 0x00

(which seems a reasonable way to represent silence in compressed form).
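
A quick way to see which of the 7 bytes vary between the three cases is to diff them position by position. The dictionary keys and the `differing_offsets` helper are just my own labels; the byte values are the ones quoted above.

```python
HEADERS = {
    "tone_app1": bytes([0x04, 0x03, 0x00, 0x03, 0x03, 0x10, 0x10]),
    "tone_app2": bytes([0xF3, 0xE4, 0x00, 0x02, 0x02, 0x10, 0x10]),
    "silence": bytes(7),
}


def differing_offsets(a: bytes, b: bytes) -> list:
    """Return the byte offsets at which two equal-length headers differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


print(differing_offsets(HEADERS["tone_app1"], HEADERS["tone_app2"]))  # [0, 1, 3, 4]
print(differing_offsets(HEADERS["tone_app1"], HEADERS["silence"]))    # [0, 1, 3, 4, 5, 6]
```

So between the two tones, bytes 0, 1, 3 and 4 change while bytes 2, 5 and 6 stay put, and bytes 3 and 4 are equal to each other in every case seen so far.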

Incidentally, the Excel spreadsheet I used to plot the sample values is here, in case anyone would like to play around with it:


Attachment: 16-44-441Hz-tone-payload.xls (113 KB)


Messages In This Thread
AIR streaming protocol - by thumb5 - 07-Sep-2014, 13:28
RE: AIR streaming protocol - by thumb5 - 07-Sep-2014, 14:43
RE: AIR streaming protocol - by thumb5 - 07-Sep-2014, 15:48
RE: AIR streaming protocol - by Rufus McDufus - 07-Sep-2014, 16:15
RE: AIR streaming protocol - by thumb5 - 07-Sep-2014, 16:27
RE: AIR streaming protocol - by Rufus McDufus - 07-Sep-2014, 16:47
RE: AIR streaming protocol - by thumb5 - 08-Sep-2014, 18:09
RE: AIR streaming protocol - by rik - 08-Sep-2014, 20:13
RE: AIR streaming protocol - by Mka - 09-Sep-2014, 15:35
