@@ -359,7 +359,7 @@ $.validator.addMethod( "creditcard", function( value, element ) {
}, "Please enter a valid credit card number." );

/* NOTICE: Modified version of Castle.Components.Validator.CreditCardValidator
* Redistributed under the the Apache License 2.0 at http://www.apache.org/licenses/LICENSE-2.0
* Redistributed under the Apache License 2.0 at http://www.apache.org/licenses/LICENSE-2.0
* Valid Types: mastercard, visa, amex, dinersclub, enroute, discover, jcb, unknown, all (overrides all other settings)
*/
$.validator.addMethod( "creditcardtypes", function( value, element, param ) {
2 changes: 1 addition & 1 deletion LLama/ChatSession.cs
@@ -637,7 +637,7 @@ public record SessionState
public IHistoryTransform HistoryTransform { get; set; } = new LLamaTransforms.DefaultHistoryTransform();

/// <summary>
/// The the chat history messages for this session.
/// The chat history messages for this session.
/// </summary>
public ChatHistory.Message[] History { get; set; } = [ ];

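For context on the `History` property touched above: it stores `ChatHistory.Message` entries for the session. Below is a minimal sketch of how such messages are typically produced, assuming the `ChatHistory.AddMessage(AuthorRole, string)` and `Messages` members of the public LLamaSharp API:

```csharp
using System;
using LLama.Common;

// Build a chat history; a ChatSession captures these messages into
// SessionState.History when the session state is saved.
var history = new ChatHistory();
history.AddMessage(AuthorRole.System, "You are a helpful assistant named Bob.");
history.AddMessage(AuthorRole.User, "Hello, Bob.");

foreach (var message in history.Messages)
{
    Console.WriteLine($"{message.AuthorRole}: {message.Content}");
}
```
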
2 changes: 1 addition & 1 deletion LLama/Native/SafeLlamaModelHandle.cs
@@ -702,7 +702,7 @@ public int Count
}

/// <summary>
/// Get the the type of this vocabulary
/// Get the type of this vocabulary
/// </summary>
public LLamaVocabType Type
{
2 changes: 1 addition & 1 deletion docs/Architecture.md
@@ -6,7 +6,7 @@ The figure below shows the core framework structure of LLamaSharp.

- **Native APIs**: LLamaSharp calls the exported C APIs to load and run the model. The APIs defined in LLamaSharp specially for calling C APIs are named `Native APIs`. We have made all the native APIs public under namespace `LLama.Native`. However, it's strongly recommended not to use them unless you know what you are doing.
- **LLamaWeights**: The holder of the model weight.
- **LLamaContext**: A context which directly interact with the native library and provide some basic APIs such as tokenization and embedding. It takes use of `LLamaWeights`.
- **LLamaContext**: A context which directly interacts with the native library and provides some basic APIs such as tokenization and embedding. It takes use of `LLamaWeights`.
- **LLamaExecutors**: Executors which define the way to run the LLama model. It provides text-to-text and image-to-text APIs to make it easy to use. Currently we provide four kinds of executors: `InteractiveExecutor`, `InstructExecutor`, `StatelessExecutor` and `BatchedExecutor`.
- **ChatSession**: A wrapping for `InteractiveExecutor` and `LLamaContext`, which supports interactive tasks and saving/re-loading sessions. It also provides a flexible way to customize the text process by `IHistoryTransform`, `ITextTransform` and `ITextStreamTransform`.
- **Integrations**: Integrations with other libraries to expand the application of LLamaSharp. For example, if you want to do RAG ([Retrieval Augmented Generation](https://en.wikipedia.org/wiki/Prompt_engineering#Retrieval-augmented_generation)), kernel-memory integration is a good option for you.
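
To make the relationship between the layers listed above concrete, here is a rough end-to-end sketch. It assumes the `LLamaWeights.LoadFromFile`, `CreateContext`, `InteractiveExecutor` and `ChatSession.ChatAsync` shapes of the public LLamaSharp API, and the model path is hypothetical:

```csharp
using System;
using System.Collections.Generic;
using LLama;
using LLama.Common;

// Hypothetical GGUF model path.
var parameters = new ModelParams("models/llama-2-7b-chat.Q4_K_M.gguf") { ContextSize = 2048 };

// LLamaWeights holds the model weights; LLamaContext wraps the native context.
using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);

// The executor defines how the model is run; ChatSession wraps it for interactive chat.
var executor = new InteractiveExecutor(context);
var session = new ChatSession(executor);

await foreach (var token in session.ChatAsync(
                   new ChatHistory.Message(AuthorRole.User, "Hello, Bob."),
                   new InferenceParams { AntiPrompts = new List<string> { "User:" } }))
{
    Console.Write(token);
}
```
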
6 changes: 3 additions & 3 deletions docs/FAQ.md
@@ -29,7 +29,7 @@ Generally, there are two possible cases for this problem:

Please set anti-prompt or max-length when executing the inference.

Anti-prompt can also be called as "Stop-keyword", which decides when to stop the response generation. Under interactive mode, the maximum tokens count is always not set, which makes the LLM generates responses infinitively. Therefore, setting anti-prompt correctly helps a lot to avoid the strange behaviours. For example, the prompt file `chat-with-bob.txt` has the following content:
Anti-prompt can also be called as "Stop-keyword", which decides when to stop the response generation. Under interactive mode, the maximum tokens count is always not set, which makes the LLM generate responses infinitively. Therefore, setting anti-prompt correctly helps a lot to avoid the strange behaviours. For example, the prompt file `chat-with-bob.txt` has the following content:

```
Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.
@@ -43,7 +43,7 @@ User:

Therefore, the anti-prompt should be set as "User:". If the last line of the prompt is removed, LLM will automatically generate a question (user) and a response (bob) for one time when running the chat session. Therefore, the antiprompt is suggested to be appended to the prompt when starting a chat session.

What if an extra line is appended? The string "User:" in the prompt will be followed with a char "\n". Thus when running the model, the automatic generation of a pair of question and response may appear because the anti-prompt is "User:" but the last token is "User:\n". As for whether it will appear, it's an undefined behaviour, which depends on the implementation inside the `LLamaExecutor`. Anyway, since it may leads to unexpected behaviors, it's recommended to trim your prompt or carefully keep consistent with your anti-prompt.
What if an extra line is appended? The string "User:" in the prompt will be followed with a char "\n". Thus when running the model, the automatic generation of a pair of question and response may appear because the anti-prompt is "User:" but the last token is "User:\n". As for whether it will appear, it's an undefined behaviour, which depends on the implementation inside the `LLamaExecutor`. Anyway, since it may lead to unexpected behaviors, it's recommended to trim your prompt or carefully keep consistent with your anti-prompt.
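
As a concrete illustration of the advice above, a small sketch (assuming the `InferenceParams` type with `AntiPrompts` and `MaxTokens` from the public LLamaSharp API):

```csharp
using System.Collections.Generic;
using LLama.Common;

// Stop generating as soon as the model emits "User:", and cap the response
// length so interactive mode cannot keep generating indefinitely.
var inferenceParams = new InferenceParams
{
    AntiPrompts = new List<string> { "User:" },
    MaxTokens = 256
};
```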

## How to run LLM with non-English languages

@@ -59,6 +59,6 @@ $$ len(prompt) + len(response) < len(context) $$

In this inequality, `len(response)` refers to the expected tokens for LLM to generate.
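
A tiny sketch of that check, assuming the `Tokenize` and `ContextSize` members of `LLamaContext` from the public API (the expected-response budget is a number you pick yourself):

```csharp
using LLama;

// len(prompt) + len(response) < len(context)
static bool FitsInContext(LLamaContext context, string prompt, int expectedResponseTokens)
{
    var promptTokens = context.Tokenize(prompt).Length;
    return promptTokens + expectedResponseTokens < context.ContextSize;
}
```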

## Choose models weight depending on you task
## Choose models weight depending on your task

The differences between modes may lead to much different behaviours under the same task. For example, if you're building a chat bot with non-English, a fine-tuned model specially for the language you want to use will have huge effect on the performance.
2 changes: 1 addition & 1 deletion docs/QuickStart.md
@@ -24,7 +24,7 @@ PM> Install-Package LLamaSharp

## Model preparation

There are two popular format of model file of LLM now, which are PyTorch format (.pth) and Huggingface format (.bin). LLamaSharp uses `GGUF` format file, which could be converted from these two formats. To get `GGUF` file, there are two options:
There are two popular formats of model file of LLM now, which are PyTorch format (.pth) and Huggingface format (.bin). LLamaSharp uses `GGUF` format file, which could be converted from these two formats. To get `GGUF` file, there are two options:

1. Search model name + 'gguf' in [Huggingface](https://huggingface.co), you will find lots of model files that have already been converted to GGUF format. Please take care of the publishing time of them because some old ones could only work with old version of LLamaSharp.

4 changes: 2 additions & 2 deletions docs/index.md
@@ -17,7 +17,7 @@ If you are new to LLM, here're some tips for you to help you to get start with `

## Integrations

There are integarions for the following libraries, which help to expand the application of LLamaSharp. Integrations for semantic-kernel and kernel-memory are developed in LLamaSharp repository, while others are developed in their own repositories.
There are integrations for the following libraries, which help to expand the application of LLamaSharp. Integrations for semantic-kernel and kernel-memory are developed in LLamaSharp repository, while others are developed in their own repositories.

- [semantic-kernel](https://github.com/microsoft/semantic-kernel): an SDK that integrates LLM like OpenAI, Azure OpenAI, and Hugging Face.
- [kernel-memory](https://github.com/microsoft/kernel-memory): a multi-modal AI Service specialized in the efficient indexing of datasets through custom continuous data hybrid pipelines, with support for RAG ([Retrieval Augmented Generation](https://en.wikipedia.org/wiki/Prompt_engineering#Retrieval-augmented_generation)), synthetic memory, prompt engineering, and custom semantic memory processing.
@@ -32,7 +32,7 @@ There are integarions for the following libraries, which help to expand the appl
Community effort is always one of the most important things in open-source projects. Any contribution in any way is welcomed here. For example, the following things mean a lot for LLamaSharp:

1. Open an issue when you find something wrong.
2. Open an PR if you've fixed something. Even if just correcting a typo, it also makes great sense.
2. Open a PR if you've fixed something. Even if just correcting a typo, it also makes great sense.
3. Help to optimize the documentation.
4. Write an example or blog about how to integrate LLamaSharp with your APPs.
5. Ask for a missing feature and discuss with us.
2 changes: 1 addition & 1 deletion docs/xmldocs/llama.abstractions.metadataoverride.md
@@ -15,7 +15,7 @@ Implements [IEquatable&lt;MetadataOverride&gt;](https://docs.microsoft.com/en-us

### **Key**

Get the key being overriden by this override
Get the key being overridden by this override

```csharp
public string Key { get; }
4 changes: 2 additions & 2 deletions docs/xmldocs/llama.native.nativeapi.md
@@ -340,7 +340,7 @@ Number of threads
Binary image in jpeg format

`image_bytes_length` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Bytes lenght of the image
Bytes length of the image

#### Returns

@@ -671,7 +671,7 @@ public static Span<float> llama_get_embeddings(SafeLLamaContextHandle ctx)

Apply chat template. Inspired by hf apply_chat_template() on python.
Both "model" and "custom_template" are optional, but at least one is required. "custom_template" has higher precedence than "model"
NOTE: This function does not use a jinja parser. It only support a pre-defined list of template. See more: https://github.com/ggerganov/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template
NOTE: This function does not use a jinja parser. It only supports a pre-defined list of template. See more: https://github.com/ggerganov/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template

```csharp
public static int llama_chat_apply_template(SafeLlamaModelHandle model, Char* tmpl, LLamaChatMessage* chat, IntPtr n_msg, bool add_ass, Char* buf, int length)
2 changes: 1 addition & 1 deletion docs/xmldocs/llama.sessionstate.md
@@ -75,7 +75,7 @@ public IHistoryTransform HistoryTransform { get; set; }

### **History**

The the chat history messages for this session.
The chat history messages for this session.

```csharp
public Message[] History { get; set; }