Create a new Python file and add the following code, replacing the model identifier and input with your own:
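
For example, a minimal sketch that runs Stable Diffusion (the version identifier below is a placeholder; copy the current one from https://replicate.com/stability-ai/stable-diffusion):

```python
import replicate

# replace <version> with the current version identifier from the model page
output = replicate.run(
    "stability-ai/stable-diffusion:<version>",
    input={"prompt": "an astronaut riding a horse"},
)
print(output)
```

Running it prints a list of output file URLs, for example:
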
```
['https://replicate.com/api/models/stability-ai/stable-diffusion/files/50fcac81-865d-499e-81ac-49de0cb79264/out-0.png']
```

Some models, particularly language models, may not require the version string. Refer to the model's API documentation for specifics:

```python
replicate.run(
    "meta/llama-2-70b-chat",
    input={
        "prompt": "Can you write a poem about open source machine learning?",
        "system_prompt": "You are a helpful, respectful and honest assistant.",
    },
)
```

Some models, like [andreasjansson/blip-2](https://replicate.com/andreasjansson/blip-2), have files as inputs.
To run a model that takes a file input,
pass a URL to a publicly accessible file.
Or, for smaller files (<10MB), you can pass a file handle directly.
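
A sketch of both approaches, assuming blip-2's `image` and `question` inputs (copy the model's current version identifier from its page on Replicate):

```python
import replicate

# replace <version> with the current version identifier from
# https://replicate.com/andreasjansson/blip-2
model = "andreasjansson/blip-2:<version>"

# pass a URL to a publicly accessible file...
output = replicate.run(
    model,
    input={"image": "https://example.com/photo.jpg", "question": "What is in this picture?"},
)
print(output)

# ...or pass an open file handle directly for smaller files
with open("photo.jpg", "rb") as image:
    output = replicate.run(
        model,
        input={"image": image, "question": "What is in this picture?"},
    )
    print(output)
```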

> [!NOTE]
> You can also use the Replicate client asynchronously by prepending `async_` to the method name.
>
> Here's an example of how to run several predictions concurrently and wait for them all to complete:
>
> ```python
> import asyncio
> import replicate
>
> # https://replicate.com/stability-ai/sdxl
> model_version = "stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b"
> prompts = [
>     "an astronaut riding a rainbow unicorn",
>     "a watercolor painting of an underwater submarine",
> ]
>
> async def main():
>     # start every prediction concurrently and wait for them all to complete
>     return await asyncio.gather(
>         *(replicate.async_run(model_version, input={"prompt": p}) for p in prompts)
>     )
>
> print(asyncio.run(main()))
> ```

## Run a model and stream its output

Replicate’s API supports server-sent event streams (SSEs) for language models.
Use the `stream` method to consume tokens as they're produced.

```python
import replicate

# https://replicate.com/meta/meta-llama-3-70b-instruct
for event in replicate.stream(
    "meta/meta-llama-3-70b-instruct",
    input={
        "prompt": "Please write a haiku about llamas.",
    },
):
    print(str(event), end="")
```

> [!TIP]
> Some models, like [meta/meta-llama-3-70b-instruct](https://replicate.com/meta/meta-llama-3-70b-instruct),
> don't require a version string.
> You can always refer to the API documentation on the model page for specifics.

You can also stream the output of a prediction you create.
This is helpful when you want the prediction's ID separately from its output.

```python
prediction = replicate.predictions.create(
    model="meta/meta-llama-3-70b-instruct",
    input={"prompt": "Please write a haiku about llamas."},
    stream=True,
)

for event in prediction.stream():
    print(str(event), end="")
```
For more information, see
["Streaming output"](https://replicate.com/docs/streaming) in Replicate's docs.


## Run a model in the background

You can start a model and run it in the background:
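
For example, a sketch that reuses the `meta/meta-llama-3-70b-instruct` example from above and returns immediately instead of waiting for output (`id` and `status` are fields on the returned prediction):

```python
import replicate

# create the prediction without blocking until it finishes
prediction = replicate.predictions.create(
    model="meta/meta-llama-3-70b-instruct",
    input={"prompt": "Please write a haiku about llamas."},
)

print(prediction.id)      # save this to look the prediction up later
print(prediction.status)  # "starting", "processing", "succeeded", "failed", or "canceled"
```
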
Here's how to list all of the available hardware for running models on Replicate:
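
A sketch, assuming the client's `hardware.list()` method and its `name`/`sku` fields; the options returned depend on what Replicate currently offers:

```python
import replicate

# list the hardware options you can run models on
for hardware in replicate.hardware.list():
    print(f"{hardware.name} ({hardware.sku})")
```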

## Fine-tune a model

Use the [training API](https://replicate.com/docs/fine-tuning)
to fine-tune models to make them better at a particular task.
To see what **language models** currently support fine-tuning,
check out Replicate's [collection of trainable language models](https://replicate.com/collections/trainable-language-models).
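
You can also look that collection up with the client itself; a sketch, assuming the client's `collections.get()` method and the `trainable-language-models` slug from the link above:

```python
import replicate

# fetch the collection of language models that support fine-tuning
collection = replicate.collections.get("trainable-language-models")
for model in collection.models:
    print(f"{model.owner}/{model.name}")
```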

If you're looking to fine-tune **image models**,
check out Replicate's [guide to fine-tuning image models](https://replicate.com/docs/guides/fine-tune-an-image-model).

Here's how to fine-tune a model on Replicate:
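
A sketch, assuming the SDXL version shown earlier, an `input_images` input (check the trainer's documentation for the exact input names it expects), and a destination model that already exists under your account:

```python
import replicate

# kick off a fine-tune of SDXL on your own images;
# the destination model must already exist on Replicate
training = replicate.trainings.create(
    version="stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b",
    input={
        "input_images": "https://example.com/my-training-images.zip",
    },
    destination="my-username/my-sdxl-fine-tune",
)

print(training.status)
```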