A question from a layperson: One difference between your estimate and the study is that you assume the study's token count to be highly pessimistic. However, if I understand correctly, if you do a follow-up query in the same conversation, the whole input _and_ output of the previous query is part of the new context, right? If I understand correctly - again, I am a layperson - computing requirements scale exponentially with the size of the input. Wouldn't that mean that even a small follow-up question would increase the power needed for the conversation as a whole dramatically?
I do not use ChatGPT alot but in the cases that I have used it, I usually needed at least one or two follow-up queries to arrive at a satisfactory output. Maybe that is me being inexperienced, but given that I am a reasonably technically-minded person (and even if I'm not a computer scientist, I'm still the resident nerd in most of my circles), I don't think most end users fare much better. Would this circumstance (that many use cases involve at least one or two follow-up queries) change the calculations?
A question from a layperson: One difference between your estimate and the study is that you assume the study's token count to be highly pessimistic. However, if I understand correctly, if you do a follow-up query in the same conversation, the whole input _and_ output of the previous query is part of the new context, right? If I understand correctly - again, I am a layperson - computing requirements scale exponentially with the size of the input. Wouldn't that mean that even a small follow-up question would increase the power needed for the conversation as a whole dramatically?
I do not use ChatGPT alot but in the cases that I have used it, I usually needed at least one or two follow-up queries to arrive at a satisfactory output. Maybe that is me being inexperienced, but given that I am a reasonably technically-minded person (and even if I'm not a computer scientist, I'm still the resident nerd in most of my circles), I don't think most end users fare much better. Would this circumstance (that many use cases involve at least one or two follow-up queries) change the calculations?