A deeper look into Siri and its network communication

Posted on 30 January 2012 in Apple, iOS
Hi folks,

It’s been a long time since I posted something. This is mostly due to my work on a new SiriProxy version, this time written in C#. There will be a nice GUI for Windows users and a lot more features =)

This blog post is for technically interested people. It is not a tutorial, and nothing in it is something you HAVE to know.

But let’s get to the topic: Siri and its network communication. Through my work on the latest SiriProxy version I keep getting a deeper look into everything. It’s pretty similar to what Applidium wrote some time ago, but a bit more specific.

The Header

The iPhone connects to the Guzzoni server and sends a special HTTP header:

ACE /ace HTTP/1.0
Host: guzzoni.apple.com
User-Agent: Assistant(iPhone/iPhone3,1; iPhone OS/5.0.1/9A405) Ace/1.0
Content-Length: 2000000000
X-ACE-HOST: fxxxx2c5-xxa5-41cd-965a-xxxxxxxxxxxx

Three things are a bit special:
– the HTTP method “ACE”
– a Content-Length of ~2 GB
– the added X-ACE-HOST header

The Guzzoni server checks whether the HTTP method “ACE” is used. If it is not, the server answers with a special error: HTTP Status 406 – Unacceptable
You can check this yourself by opening “https://guzzoni.apple.com/ace” in your browser.
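
To make the header concrete, here is a minimal Python sketch that assembles the request as raw bytes. The function name build_ace_header and its parameters are my own and not part of any Apple or SiriProxy API. The Content-Length is passed in by the caller, and the CRLF line endings are simply what HTTP normally expects, so treat both as assumptions.

```python
def build_ace_header(ace_host: str, content_length: int,
                     host: str = "guzzoni.apple.com") -> bytes:
    """Assemble the opening ACE request as raw bytes (illustrative helper)."""
    lines = [
        "ACE /ace HTTP/1.0",  # custom method instead of GET or POST
        f"Host: {host}",
        "User-Agent: Assistant(iPhone/iPhone3,1; iPhone OS/5.0.1/9A405) Ace/1.0",
        f"Content-Length: {content_length}",  # deliberately oversized value
        f"X-ACE-HOST: {ace_host}",
    ]
    # Header lines end with CRLF; an empty line closes the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

The body described in the next section would be written to the same connection right after these bytes.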

The body

After the HTTP header, the body starts with 4 marker bytes: 0xAA 0xCC 0xEE 0x02

The next 2 bytes are the zlib header, which encodes the compression level; here it is “best”.
From there on everything is packed into the zlib stream. Apple uses the zlib “full flush” mode.
In practice this means that “00 00 FF FF” is appended after each object sent.
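
The effect of the full flush can be reproduced with Python’s zlib module. This is only a sketch of the behaviour described above, not Apple’s code; pack_object is a made-up helper name and the two sample payloads are placeholders.

```python
import zlib

# One compressor for the whole connection, using the "best" compression level.
compressor = zlib.compressobj(zlib.Z_BEST_COMPRESSION)

def pack_object(data: bytes) -> bytes:
    """Compress one object and flush so the receiver can decode it right away."""
    return compressor.compress(data) + compressor.flush(zlib.Z_FULL_FLUSH)

first = pack_object(b"first object")
second = pack_object(b"second object")

# The very first chunk starts with the 2-byte zlib header. 0x78 0xDA is what
# zlib writes for level 9 with the default window size.
print(first[:2].hex())    # 78da
# Every flushed chunk ends with the empty stored block 00 00 FF FF.
print(first[-4:].hex())   # 0000ffff
print(second[-4:].hex())  # 0000ffff
```

Because of the flush, each chunk can be inflated as soon as it arrives, without waiting for the end of the stream.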

So the whole stream looks like this:

[header][0xAA 0xCC 0xEE 0x02][zlib stream]
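
Reading that layout back could look roughly like this in Python. It is a sketch under the assumption that the HTTP header has already been cut off, so the input starts with the 4 marker bytes; ACE_MAGIC and inflate_ace_body are names I picked for illustration.

```python
import zlib

ACE_MAGIC = bytes([0xAA, 0xCC, 0xEE, 0x02])

def inflate_ace_body(body: bytes) -> bytes:
    """Strip the 4 marker bytes and inflate the zlib data that follows."""
    if body[:4] != ACE_MAGIC:
        raise ValueError("missing 0xAA 0xCC 0xEE 0x02 marker")
    # decompressobj() accepts a stream that was flushed but never finished,
    # which is the normal case for a connection that stays open.
    return zlib.decompressobj().decompress(body[4:])
```

For a long-running connection you would keep one decompressobj per direction and feed it every new chunk, instead of creating a fresh one per call as this sketch does.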

If you unpack the zlib stream, you find that the first byte contains the “packet type”. This can be one of the following:

2 – binary plist object
3 – ping
4 – pong

A ping or pong consists of only 5 bytes:

[PACKET TYPE BYTE][PING COUNT, 4 BYTES]

So the first ping looks like: [3][0 0 0 1]
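
Packing and unpacking such a keep-alive packet is a one-liner with Python’s struct module. A hedged sketch: the helper names are invented, and the 4-byte big-endian counter is an assumption chosen so that ping and pong share the same 5-byte header as the plist packets. If your captures show a different width, change the format string.

```python
import struct

PING, PONG = 3, 4

# ">BI" = 1 unsigned type byte followed by a 4-byte big-endian counter.
# The counter width is an assumption, see the note above.
KEEPALIVE = struct.Struct(">BI")

def pack_keepalive(packet_type: int, count: int) -> bytes:
    """Build a ping (type 3) or pong (type 4) packet."""
    return KEEPALIVE.pack(packet_type, count)

def unpack_keepalive(packet: bytes) -> tuple:
    """Return (packet_type, count) for a ping or pong packet."""
    return KEEPALIVE.unpack(packet)

print(pack_keepalive(PING, 1).hex())  # 0300000001
```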

A binary plist packet starts with 2, and the next 4 bytes contain the length of the plist object. If the size is, for example, 500 bytes, it looks like this:

[2][0x00 0x00 0x01 0xF4][binary plist object]
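
Putting the packet types together, a small Python sketch can frame a plist object and walk through already inflated stream data. This is not the SiriProxy implementation; pack_plist and iter_packets are illustrative names, and I assume that every packet uses the same header of 1 type byte plus a 4-byte big-endian field (the length for plist packets, the counter for ping and pong).

```python
import plistlib
import struct

PLIST, PING, PONG = 2, 3, 4
HEADER = struct.Struct(">BI")  # type byte + 4-byte big-endian field (assumed)

def pack_plist(obj: dict) -> bytes:
    """Serialise obj as a binary plist and prepend type 2 plus its length."""
    payload = plistlib.dumps(obj, fmt=plistlib.FMT_BINARY)
    return HEADER.pack(PLIST, len(payload)) + payload

def iter_packets(data: bytes):
    """Yield (packet_type, value) pairs from inflated stream data."""
    offset = 0
    while offset + HEADER.size <= len(data):
        packet_type, field = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        if packet_type == PLIST:
            if offset + field > len(data):
                break  # plist body not fully received yet
            yield packet_type, plistlib.loads(data[offset:offset + field])
            offset += field
        elif packet_type in (PING, PONG):
            yield packet_type, field  # field is the ping counter here
        else:
            raise ValueError(f"unknown packet type {packet_type}")

# A 500-byte plist would get exactly the header from the example above.
print(HEADER.pack(PLIST, 500).hex())  # 02000001f4
```

plistlib.loads detects the binary format on its own, so the reader side needs no format flag.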

Every plist object contains a class. I know of the following (this is not the complete list, just an excerpt):

//iPhone Stuff
CreateAssistant,
CreateSessionInfoRequest,
LoadAssistant,
StartSpeechRequest,
SetRequestOrigin,
SpeechPacket,
FinishSpeech,
EventSearchCompleted,
SetRestrictions,
ClearContext,
//From here on it’s Guzzoni Stuff
SetApplicationContext,
AssistantLoaded,
SetConnectionHeader,
AddViews,
EventSearch,
RequestCompleted,
SpeechRecognized,
SessionValidationFailed,
CommandFailed,
GetSessionCertificateResponse,
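
With this excerpt you can already tell which side sent a decoded object. A rough Python sketch: the two sets simply mirror the list above, sender_of is a made-up helper, and the dictionary key "class" is my assumption about how the class name is stored inside the plist.

```python
IPHONE_CLASSES = {
    "CreateAssistant", "CreateSessionInfoRequest", "LoadAssistant",
    "StartSpeechRequest", "SetRequestOrigin", "SpeechPacket",
    "FinishSpeech", "EventSearchCompleted", "SetRestrictions", "ClearContext",
}
GUZZONI_CLASSES = {
    "SetApplicationContext", "AssistantLoaded", "SetConnectionHeader",
    "AddViews", "EventSearch", "RequestCompleted", "SpeechRecognized",
    "SessionValidationFailed", "CommandFailed", "GetSessionCertificateResponse",
}

def sender_of(plist_obj: dict) -> str:
    """Guess the sender of a decoded plist object from its class name."""
    name = plist_obj.get("class")  # key name "class" is an assumption
    if name in IPHONE_CLASSES:
        return "iPhone"
    if name in GUZZONI_CLASSES:
        return "Guzzoni"
    return "unknown"  # the list is only an excerpt, so this will happen
```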

Typically, a full request from the iPhone looks like this:

[LoadAssistant][CreateSessionInfoRequest][SetRequestOrigin][StartSpeechRequest][SpeechPacket][SpeechPacket][FinishSpeech]

Guzzoni normally answers:

[AssistantLoaded][SetConnectionHeader][SpeechRecognized][AddViews][RequestCompleted]
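
If you log the class names of a session, a few lines of Python are enough to check whether it followed this usual pattern. Again only a sketch with invented helper names; the number of SpeechPacket objects varies with the length of the recording, so runs of the same class are collapsed before comparing.

```python
from itertools import groupby

TYPICAL_REQUEST = [
    "LoadAssistant", "CreateSessionInfoRequest", "SetRequestOrigin",
    "StartSpeechRequest", "SpeechPacket", "FinishSpeech",
]
TYPICAL_ANSWER = [
    "AssistantLoaded", "SetConnectionHeader", "SpeechRecognized",
    "AddViews", "RequestCompleted",
]

def collapse_repeats(classes: list) -> list:
    """Merge runs of the same class, e.g. many SpeechPacket objects in a row."""
    return [name for name, _ in groupby(classes)]

def looks_typical(classes: list, expected: list) -> bool:
    """True if the observed class names match the expected order."""
    return collapse_repeats(classes) == expected
```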

If authentication fails, there will be some other communication.

Sidenotes

By the way, LoadAssistant and CreateAssistant carry another ID which is unique for each iPhone. This might be used to add a simple “authentication” to the server. I have heard of people who say they have authentication on their proxy. If they did it, I would guess it works mostly this way. It’s an easy way to add authentication, but also easy to “hack”. If somebody knows such a SiriProxy server with “authentication”, let me know and I will check whether I can hack it 😉

With knowledge of the packets sent around during a Siri connection, you can do a lot of fun things. But more about hacking and fun things in one of my next posts…

So long, folks =)

Matthias
