I created a simple Thunk to pass an image and have the AI detect the objects in it, like meat, vegetables, etc. The AI also suggests what sort of meal we could cook out of those ingredients.
I asked the same question of ChatGPT 4o directly, but the Thunk AI answer was less professional and less reliable compared with the answer that came back from ChatGPT 4o.
In theory, we are using the same underlying model (gpt-4o).
In practice, a few things could explain the difference in results:
Maybe OpenAI is using a slightly different model for ChatGPT.
Maybe OpenAI has a custom prompt that changes its behavior.
Maybe the Thunk prompt is changing the output. Your input in the chat window is only part of the input that is sent to gpt-4o. We also include some additional context (e.g. the data you’re looking at) and some generic instructions about how to communicate.
You can give additional instructions by clicking on this (?) in the upper-right of the screen.
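To make the point about prompt composition concrete, here is a minimal, hypothetical sketch (using the OpenAI Python SDK) of how a platform can wrap your chat input with its own system instructions and context before calling gpt-4o. The instruction text, context string, and file name below are invented for illustration only; they are not Thunk's actual prompt.

```python
# Illustrative sketch only — not Thunk's actual prompt or architecture.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical platform-side instructions and context wrapped around the user's input.
platform_instructions = (
    "You are an AI agent working inside a business workflow. "
    "Be brief and to the point, and record results as structured data."
)
app_context = "Context from the thunk: the user attached an image named 'fridge_photo.jpg'."
user_message = "What ingredients do you see, and what meal could we cook with them?"

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": platform_instructions},
        {"role": "system", "content": app_context},
        {"role": "user", "content": user_message},
    ],
)
print(response.choices[0].message.content)
```

Typing the same text into ChatGPT sends only your message plus OpenAI's own (unknown) system prompt, so the effective input to gpt-4o can differ even when your words are identical.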
@Koichi_Tsuji could you also please clarify two things to help us understand better?
Why do you say it is “less reliable”? Is it because it had less detail than the ChatGPT response? One of the options in your AI settings is conversational style, which by default is set to be brief and to the point. If you change that, you will get different behaviors.
Why do you say it is “less professional”? Is it because it isn’t conversational and chatty? Again, one of the options in your AI settings tells the AI agent how verbose to be.
I am not disagreeing with your feedback, just trying to understand it better, since we are not actually aiming to provide a chatty consumer conversational AI app like ChatGPT. Instead, we’re trying to provide an AI agent for work that does the interaction with GPT in the background, following the user’s plan and style guidelines, and then records the results (not the conversation) into the “data” of the thunk.
More generally, the purpose of the platform is to automate AI agent work. So, when there is repetitive business work, the platform can automatically apply AI work to it, involving the user where appropriate or necessary in a workflow.
Also, for one-off consumer-style tasks, we aren’t trying to provide something identical to ChatGPT’s chatty back-and-forth interface because there didn’t seem to be any new value in doing that.
My post here is not intended to embarrass or disappoint anyone, but just to share my experience with a quick test.
My assumption was that behind Thunk AI, the latest OpenAI (ChatGPT) model or an equivalent could be running, so to test this, I simply passed the same sort of query to both, but got quite different responses in terms of accuracy and comprehensiveness.
See the screenshot I shared earlier. First, the object detected by the AI was different. There is meat in the image: Thunk AI thought it was chicken, but ChatGPT rightly said beef or pork. There are some other gaps in the responses as well.
Overall, I assumed a different AI engine could be running, which led to the different level of quality in the responses.
Oh, thanks for the clarification. I did not notice the chicken vs beef/pork (I’m a vegetarian :]).
It is the very same engine. Every call to OpenAI’s model has some variability in its response. There is a parameter to control this, which is set to minimize variability, but the output still isn’t exactly consistent. So you should see the same sort of variability you would get if you used ChatGPT multiple times.
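As a small illustration, and assuming the variability parameter referred to is the standard temperature setting on OpenAI's chat completions API, the sketch below repeats the same request three times. Even with temperature at 0, the wording of the answers can still differ slightly between runs.

```python
# Illustrative sketch: repeat the same request to see residual variability
# even when temperature is set to its minimum.
from openai import OpenAI

client = OpenAI()

for i in range(3):
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0,  # reduces, but does not fully eliminate, variability
        messages=[
            {"role": "user", "content": "Name the main protein in a dish of grilled cutlets with rice."}
        ],
    )
    print(f"Run {i + 1}: {response.choices[0].message.content}")
```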
As for the other elements of the response, specifically the level of detail, this depends very much on the way the prompts are constructed. I will write an article describing the elements of control you have over the prompting.