Closed
Description
It would be really helpful if there was a documented example on how to catch common error-cases.
Might I request a documentation example, along the following lines:
for event in replicate.stream(LLM_MODEL, ...):
    print(str(event), end="")
elif (# please document what goes here):
    print("Error: timeout occurred after 34 seconds")
elif (# please document what goes here):
    print("Error: too many input tokens: you supplied 5034 when 4096 is the maximum")
elif (# please document what goes here):
    print("Error: output was truncated because we exceeded the context length before completion")
At the moment, all I can get is a crash with a runtime error (if tokens are exceeded), and complete silence if the timeout occurs.
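For reference, here is a minimal sketch of the kind of wrapper being asked about. It is not the library's documented approach. It assumes only what is reported above: that the stream raises a `RuntimeError` when it fails. The helper name `consume_stream` and its status strings are hypothetical. The deadline is checked between events, so it cannot interrupt a stream that blocks forever without yielding anything.

```python
import time
from typing import Iterable, Tuple


def consume_stream(events: Iterable[object], timeout_s: float) -> Tuple[str, str, str]:
    """Drain an event stream and return (status, text, detail).

    status is "ok", "error", or "timeout" (hypothetical labels).
    `events` can be any iterable, e.g. the result of replicate.stream(...).
    """
    deadline = time.monotonic() + timeout_s
    chunks = []
    try:
        for event in events:
            chunks.append(str(event))
            # Checked only between events: a stream that never yields
            # will not be interrupted by this test.
            if time.monotonic() > deadline:
                return "timeout", "".join(chunks), f"exceeded {timeout_s} seconds"
    except RuntimeError as exc:
        # Assumption from the report above: failures surface as RuntimeError.
        return "error", "".join(chunks), str(exc)
    return "ok", "".join(chunks), ""
```

Usage would look like `status, text, detail = consume_stream(replicate.stream(LLM_MODEL, ...), timeout_s=34)`, followed by a branch on `status`. This still does not distinguish "too many input tokens" from "output truncated", which is the part that needs documenting.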
Relatedly, is there any way to get the measured input and output token count?
I'm currently working with the approximation:
tokens = round(len(text.split()) * 4 / 3), where text is the prompt string.
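Written out as a function, the approximation looks like the sketch below. The names `estimate_tokens` and `exceeds_limit` are hypothetical, the 4/3 tokens-per-word ratio is only the rough heuristic quoted above, and the 4096 default is the limit from the example error message, not a value confirmed for any particular model.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4/3 tokens per whitespace-separated word."""
    return round(len(text.split()) * 4 / 3)


def exceeds_limit(text: str, limit: int = 4096) -> bool:
    """True if the estimated token count is above the (assumed) limit."""
    return estimate_tokens(text) > limit
```

Because this is only an estimate, a prompt that passes `exceeds_limit` can still be rejected by the model, which is why a measured count would be preferable.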
Thanks for your time.