In my continuing attempt to set up a conversational ChatGPT-powered chatbot, I’ve got the following flow set up:
Capture user reply to {query}
Send {query} to API
Capture response to {answer}
Speak {answer}
This works great! Unfortunately, the API treats every call as a fresh query, forgetting any previous context. According to OpenAI, you must include a transcript of the previous conversation with each new call in order to avoid this amnesia effect. I don’t want to burn through too many tokens by including the entire conversation each time, so I’m trying to just save the immediate previous question and answer to prepend to the latest question.
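For reference, OpenAI's chat format carries that transcript as a messages array, with the prior turns included before the new question (a sketch of the request body; the exact shape depends on your setup):

```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "user", "content": "previous question"},
    {"role": "assistant", "content": "previous answer"},
    {"role": "user", "content": "latest question"}
  ]
}
```

Keeping only the last question/answer pair keeps the token count roughly constant per call, at the cost of losing older context.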
This is the expanded flow:
Capture user reply to {query}
Send {query} to API
Capture response to {answer}
Set {oldquery} and {oldanswer} to the same values as {query} and {answer}
Speak {answer}
Capture user reply to update {query}
Send the updated {query} to API, prepended with {oldquery} and {oldanswer} in the API's preferred format
Loop back to #4, which updates the stored {oldquery} and {oldanswer} variables, speaks the updated answer, and waits for further user replies ad infinitum
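Concretely, the second call's body looks something like this, with Voiceflow substituting the {curly} variables before the request is sent (my reconstruction for illustration, not the literal body from my project):

```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "user", "content": "{oldquery}"},
    {"role": "assistant", "content": "{oldanswer}"},
    {"role": "user", "content": "{query}"}
  ]
}
```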
I tried this but the second API call at step 7 (with the added context) failed. I thought it was an issue with the self-referring loopback, so I tried a more linear approach where step 9 instead captures the response to {answer2}, goes to a new “set variable” block to update {oldquery}/{oldanswer}, and then speaks the new {answer2}. Same problem. So the issue is with the second API call with the extra variables for context.
Thing is, this second API call appears to be fine – if I use the “Send Request” button and type in the values for oldquery/oldanswer/query, it responds in context as expected. It only fails when used within the overall flow.
I tested a few scenarios:
call using specific text instead of variables (“user: what, assistant: moons, user: exist”)? API describes the moons of the solar system
call using a repeated {query} variable (“user: {query}, user: {query}, user: {query}”)? API reacts as if only one {query} was asked
call using variables again in case I made a typo before (“user: {oldquery}, assistant: {oldanswer}, user: {query}”)? API fails, though again it works when tested directly
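For clarity, the literal-text test in the first scenario corresponds to a body along these lines (my reconstruction of the request):

```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "user", "content": "what"},
    {"role": "assistant", "content": "moons"},
    {"role": "user", "content": "exist"}
  ]
}
```

So the same three-message structure works with hard-coded text, but not when the contents come from three different variables.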
Is VoiceFlow not able to include multiple variables in an API call? Is there a way to debug the failure? What might be going wrong here?
Hmm, just remembered that the article I read to get started on this actually uses multiple variables in an API call successfully:
```json
{
  "model": "gpt-3.5-turbo",
  "messages": [{"role": "user", "content": "Write a dramatised, story of 800 words, for a {Age} year old, about a person called {Name} on the theme of {Story} – make it appropriate for a person {Age} years old- and add a moral to the story. "}]
}
```
Granted, they’re all in a single “user:” request, but it does demonstrate that including more than one variable shouldn’t break things. Now I’m really confused about what the problem is…
Also tried completely avoiding updating variables – just captured {query}, {answer}, and {query2} all separately and included all three in the second API call.
Hey! What channel is this for? It might be a time-out, as the sequential API calls are taking too long and some channels, like Alexa, have timeout periods.
Thanks, Braden – I don’t think this is it since the second API call does work when using a single variable, even if it’s repeated a few times. It’s only when using multiple different variables that the step fails.
edit: specifically, multiple variables separated out by “user:” and “assistant:” tags as I showed upthread. If they’re all used in a single message, like the “write me a story” request quoted from the article, it seems to work. Also, I don’t think it’s a formatting problem, because the call does work when tested in isolation.
Hi guys, we did this for Alexa and can confirm the problem you’re having is due to a timeout. Alexa/Lambda natively allows 3 seconds for the response, and most requests to OpenAI take 9 seconds on average (we also have a WhatsApp chatbot where we track this stuff).
You’re left with prompt engineering / caching to try and make it work. Good luck!