Issue
Setting up Server-Sent Events is relatively simple, especially with FastAPI. You can do something like this:
import time

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

def fake_data_streamer():
    for i in range(10):
        yield "some streamed data"
        time.sleep(0.5)

@app.get('/')
async def main():
    return StreamingResponse(fake_data_streamer())
Upon an HTTP GET, the client will then receive "some streamed data" every 0.5 seconds.
What if I wanted to stream structured data, though? For example, I want each event to be JSON, so something like:
import json
import time

def fake_data_streamer():
    for i in range(10):
        yield json.dumps({'result': 'a lot of streamed data', 'seriously': ['so', 'much', 'data']}, indent=4)
        time.sleep(0.5)
Is this a proper server-side implementation? Does it pose a risk of the client receiving partially formed payloads, which is fine in the plaintext case but makes the JSON unparseable?
If this is okay, how would you read this from the client side? Something like this:
import asyncio

import aiohttp

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:  # url of the streaming endpoint
            while True:
                chunk = await resp.content.readuntil(b"\n")
                if not chunk:
                    break
                print(chunk)
                await asyncio.sleep(1)
Although I am not sure what separator / reading mode is appropriate to ensure that the client always receives fully formed JSON events.
Or is this a completely improper way to accomplish streaming of fully formed JSON events?
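For what it's worth, one thing I noticed while experimenting: json.dumps with indent=4 embeds newlines inside the payload itself, so a client splitting on "\n" would see partial JSON, whereas a compact dump with a single trailing newline keeps each line a complete document. A quick sketch (the payload here is just my example data):

```python
import json

payload = {"result": "a lot of streamed data", "seriously": ["so", "much", "data"]}

pretty = json.dumps(payload, indent=4)   # contains embedded "\n" characters
compact = json.dumps(payload) + "\n"     # exactly one newline, at the very end

# Splitting the pretty form on "\n" yields fragments of JSON;
# splitting the compact form yields one complete JSON document per line.
assert "\n" in pretty
assert "\n" not in compact.rstrip("\n")
```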
For reference, OpenAI achieves streaming of complete JSON objects in their API: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_stream_completions.ipynb
Solution
To clarify and provide some context: OpenAI's client API post-processes Server-Sent Events to turn them into nice JSON, but the "raw" events are sent as data:-only messages per the Server-Sent Events spec, which is here: https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events. There are links in the ipython notebook you mentioned that point to this spec and other resources in the notebook's introduction.
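To make the wire format concrete, here is a minimal sketch of framing a JSON payload as a single SSE event (the helper name sse_format is mine, not from the spec): each event is a data: line terminated by a blank line.

```python
import json

def sse_format(payload: dict) -> str:
    """Frame a dict as one SSE 'data:' event: 'data: <json>\n\n'.

    The JSON is dumped compactly (no indent) so the payload itself
    contains no newlines; the trailing blank line is the SSE event
    delimiter.
    """
    return f"data: {json.dumps(payload)}\n\n"

event = sse_format({"result": "a lot of streamed data"})
# event == 'data: {"result": "a lot of streamed data"}\n\n'
```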
You can query the OpenAI endpoint to see the raw Server-Sent Events yourself:
https://asciinema.org/a/TKDYxqh6pgCNN0hX6tdcSgfzU
I am using the curl command:
curl https://api.openai.com/v1/chat/completions -u :$OPENAI_API_KEY -H 'Content-Type: application/json' -d '{"stream": true, "model": "gpt-3.5-turbo-0613", "messages": [{"role": "user", "content": "Summarize Othello by Shakespeare in one line"}]}'
Now to answer your question: there is nothing preventing you from doing what you intend. The sample code in the spec shows you can do this very simply by placing a "\n\n" at the end of each data: event and having a client parser parse it (after it sees the relevant header, Content-Type: text/event-stream). The Requests library in Python actually does this for you automatically via the response_obj.iter_lines() function. Python also has an SSE client library, easy to read, that can parse these events and give you the full data: lines.
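As a client-side sketch of what such a parser does, here is a small generator that turns an iterable of lines into parsed JSON objects, assuming each event is framed as a single data: line followed by a blank line (with Requests, resp.iter_lines(decode_unicode=True) would yield exactly such lines; a plain list stands in for it here):

```python
import json

def parse_sse(lines):
    """Yield a parsed JSON object for each SSE 'data:' line.

    Blank lines (the event delimiters) and any non-data lines are
    skipped; multi-line 'data:' events are not handled in this sketch.
    """
    for line in lines:
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

raw = ['data: {"seq": 1}', '', 'data: {"seq": 2}', '']
events = list(parse_sse(raw))
# events == [{"seq": 1}, {"seq": 2}]
```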
Answered By - Sid