1. Introduction
Today, artificial-intelligence models can be accompanied by Go templates. In the near future, such models could be accompanied by JavaScript scripts or WebAssembly assemblies. These possibilities would present new benefits and opportunities with respect to the design of the Prompt API.
The following diagram illustrates how Web developers, via the Prompt API, could utilize artificial-intelligence models accompanied by scripts or assemblies.
                           +--------------------------------------------------+
                           |                    Web Models                    |
                           |                                                  |
                           |  +------------+       +-----------------------+  |
                           |  |            |       |                       |  |
                           |  | JavaScript |       |                       |  |
   Web         Prompt      |  |            |       |                       |  |
Developers <===> API <===> |  |     or     | <===> |       AI Models       |  |
                           |  |            |       |                       |  |
                           |  |    WASM    |       |                       |  |
                           |  |            |       |                       |  |
                           |  +------------+       +-----------------------+  |
                           |                                                  |
                           +--------------------------------------------------+
2. Case Studies
2.1. Ollama
On the Ollama platform, models can today be accompanied by Go templates.
For example, a model named Cogito is accompanied by the following template:
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}
Available Tools:
{{ range $.Tools }}{{- . }}
{{ end }}
{{ end }}<|eot_id|>
{{- end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}
{{ range .ToolCalls }}
<tool_call>
{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
</tool_call>{{ end }}
{{- else }}
{{ .Content }}
{{- end }}{{ if not $last }}<|eot_id|>{{ end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>
{"content": "{{ .Content }}"}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- end }}
{{- end }}
3. Benefits of AI Models Having Accompanying Scripts or Assemblies
Benefits of artificial-intelligence models being able to have accompanying scripts or assemblies would include: (1) modular design and separation of concerns, (2) portability and compatibility, (3) model-independent development, (4) extensibility, (5) multimodal prompt transformation, (6) adapters and codecs, (7) tools, (8) capabilities, (9) hyperparameters and settings, and (10) events.
3.1. Modular Design and Separation of Concerns
Models' accompanying scripts or assemblies would be part of a modular design, enabling a separation of concerns.
Each model's script or assembly would be responsible for exporting the functions and implementing the interfaces or APIs required for compatibility with the Prompt API, encapsulating any model-specific differences.
3.2. Portability and Compatibility
Models with accompanying scripts or assemblies would be portable and compatible across Web browsers.
3.3. Model-independent Development
If models were accompanied by scripts or assemblies, Web developers could more readily create model-independent software.
3.4. Extensibility
Models having their own scripts or assemblies would support the steadily advancing state of the art, as new models, or new versions of models, could be designed, trained, tested, released, downloaded, and used independently of Web-browser versioning.
3.5. Multimodal Prompt Transformation
Models can require transformation or transpilation of their multimodal prompts and of those prompts' components. Prompts' components can include sections and paragraphs of text, source code, mathematics, lists, tables, images, audio, video, and various kinds of embedded files and data.
A model's script or assembly might, perhaps by calling an imported module or assembly, transform a mathematics expression provided in MathML into LaTeX or a table provided in HTML into markdown styled for that specific model.
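As an illustrative sketch only, a model's script might export a component-level transformation hook along the following lines; the transformPart() name and the helper modules are assumptions rather than any existing API.

// Hypothetical module accompanying a model; every name here is illustrative.
import { mathmlToLatex } from "./mathml-to-latex.js";        // assumed helper module
import { htmlTableToMarkdown } from "./html-table-to-md.js"; // assumed helper module

export function transformPart(part) {
  // Normalize one multimodal prompt component into this model's preferred textual form.
  switch (part.type) {
    case "mathml":
      return { type: "text", value: mathmlToLatex(part.value) };
    case "html-table":
      return { type: "text", value: htmlTableToMarkdown(part.value) };
    default:
      return part; // pass other component types through unchanged
  }
}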
3.6. Adapters and Codecs
For capable models, files and data resources could be embedded within multimodal prompts. A Web developer might want to embed CSV data, an XLSX spreadsheet, or a PDF document in a prompt, for example.
Models' scripts or assemblies could make use of other reusable external script modules or assemblies (e.g., xlsx.wasm).
Because multiple models, from one vendor or from multiple vendors, might import the same script modules or assemblies, a dependency-management system could be useful to reduce redundancy in the local storage of those modules and assemblies.
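A minimal sketch of such reuse, assuming a JavaScript wrapper around a shared assembly such as the xlsx.wasm example above; the module path, the xlsxToCsv() export, and the component shape are all assumptions.

// Hypothetical model script importing a shared, reusable codec module.
import { xlsxToCsv } from "./shared/xlsx.js"; // assumed wrapper around xlsx.wasm

export async function transformEmbeddedFile(file) {
  // Represent an embedded spreadsheet as CSV text for models without native XLSX support.
  if (file.mediaType === "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet") {
    const csv = await xlsxToCsv(await file.arrayBuffer()); // file is assumed to be Blob-like
    return { type: "text", value: csv };
  }
  return file; // leave other embedded files for other adapters
}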
3.7. Tools
Models' accompanying scripts or assemblies could be responsible for processing or transpiling the JavaScript functions, or tools, provided and described by Web developers into model-specific content, e.g., for models' system prompts.
Models' accompanying scripts or assemblies could also be responsible for receiving model-specific tool invocations, invoking the JavaScript functions provided and described by Web developers, and transforming those functions' returned values into model-specific content.
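A sketch of both responsibilities follows, with assumed export names and an assumed tool-description shape (name, description, parameters, and an execute() callback), styled after the tool_call and ipython messages in the Cogito template above.

// Hypothetical exports of a model's accompanying script; names and shapes are assumptions.
export function describeTools(tools) {
  // Render developer-provided tool descriptions into model-specific system-prompt text.
  const lines = tools.map((t) =>
    JSON.stringify({ name: t.name, description: t.description, parameters: t.parameters })
  );
  return "Available Tools:\n" + lines.join("\n");
}

export async function handleToolCall(call, tools) {
  // Route a model-specific tool invocation back to the developer's JavaScript function.
  const tool = tools.find((t) => t.name === call.name);
  const result = await tool.execute(call.arguments); // execute() is an assumed callback
  return JSON.stringify({ content: result });        // model-specific tool-result content
}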
3.8. Capabilities
Models’ accompanying scripts or assemblies could enable Web developers to inspect models’ extensible capabilities.
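One possibility, sketched with assumed names and example values only, would be a capabilities() export that the Prompt API could surface to developers.

// Hypothetical capabilities() export; the fields and values are illustrative assumptions.
export function capabilities() {
  return {
    modalities: ["text", "image"], // input types this model accepts
    toolUse: true,                 // whether the model supports tool calling
    maxContextTokens: 8192,        // example value only
  };
}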
3.9. Hyperparameters and Settings
Models’ accompanying scripts or assemblies could enable Web developers to inspect models’ extensible hyperparameters and settings.
In theory, some settings could result in modifications to models' system prompts. A recent model, Cogito, for example, requires that the text "Enable deep thinking subroutine." be present in its system prompt to activate its reasoning mode.
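A sketch of how a setting could be mapped onto such a system-prompt modification, assuming a hypothetical applySettings() export and a deepThinking setting name:

// Hypothetical applySettings() export; the setting name is an assumption.
export function applySettings(settings, systemPrompt) {
  if (settings.deepThinking) {
    // The Cogito model activates its reasoning mode via this system-prompt text.
    return "Enable deep thinking subroutine.\n" + systemPrompt;
  }
  return systemPrompt;
}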
3.10. Events
In the Prompt API, LanguageModel extends EventTarget. Models with accompanying scripts or assemblies could support reflection over available, potentially custom, events and could raise those events.
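For example, a model's script might enumerate the events it can raise, which the browser could then dispatch on a LanguageModel session; the export name and the event names below are assumptions.

// Hypothetical listEvents() export; the event names are illustrative only.
export function listEvents() {
  return ["reasoningstart", "reasoningend", "toolcall"];
}

// A Web developer could then listen for such events on a session (sketch):
// const session = await LanguageModel.create();
// session.addEventListener("reasoningstart", () => console.log("model is reasoning"));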
4. Technical Discussion
4.1. Exported Functions, Interfaces, or APIs
Which exported functions, interfaces, or APIs would need to be implemented by models’ scripts or assemblies for interoperation with the Prompt API?
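One possible answer, sketched below purely as a starting point, is a small ES-module surface; every export name here is an assumption, not a settled design.

// Hypothetical export surface for a model's accompanying script.
export function buildPrompt(messages) {
  // Turn model-independent messages into this model's prompt format (illustrative syntax).
  return messages.map((m) => `<|${m.role}|>\n${m.content}<|eot|>`).join("\n");
}

export function parseResponse(rawText) {
  // Turn raw model output back into a model-independent response.
  return { role: "assistant", content: rawText.trim() };
}

export function capabilities() {
  // Report extensible capabilities, hyperparameters, settings, and events (see Section 3).
  return { modalities: ["text"], events: [] };
}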
4.2. Global Functions and Objects
Which global functions and objects would be available to models’ accompanying scripts or assemblies?
Instead of having global objects like window and document, models' scripts or assemblies could access global objects such as model, with which to access their models, or, perhaps, kernel, with which to access a kernel component that enqueues and schedules interactions with loaded models.
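A sketch of how a model's script might use such a global, assuming a kernel object with an enqueue() method injected by the browser (neither exists today):

// Hypothetical use of an assumed global `kernel` object inside a model's script.
export async function run(modelSpecificPrompt) {
  // Ask the kernel to schedule one interaction with the loaded model.
  const rawOutput = await globalThis.kernel.enqueue({ input: modelSpecificPrompt });
  return rawOutput;
}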
4.3. A Walkthrough Sequence
The following diagram depicts an envisioned sequence of events between a Web developer's JavaScript and an artificial-intelligence model as the prompt() method is invoked on the LanguageModel interface.
        +----------------------------+
        |                            |
        |   Developer's JavaScript   |
    +---|                            |<--+
  1 |   |----------------------------|   | 10
    +-->|                            |---+
        |         Prompt API         |
    +---|                            |<--+
  2 |   |----------------------------|   | 9
    +-->|                            |---+
        |    Web Browser (Native)    |
    +---|                            |<--+
  3 |   |----------------------------|   | 8
    +-->|                            |---+
        | Model's JavaScript or WASM |
    +---|                            |<--+
  4 |   |----------------------------|   | 7
    +-->|                            |---+
        |    Web Browser (Native)    |
    +---|                            |<--+
  5 |   |----------------------------|   | 6
    +-->|                            |---+
        |           Model            |
        |                            |
        +----------------------------+
1. First, the developer invokes prompt() on a LanguageModel interface, passing it a text-string argument.
2. The LanguageModel is envisioned as being a native object wrapped as a JavaScript object.
3. The browser's native functionality can invoke a function in another sandbox: a function in the script or assembly accompanying the corresponding model.
4. This functionality produces a lower-level, model-specific prompt and, at some point, accesses the model using a global object such as model or kernel. These global objects are envisioned as being wrapped browser-native objects.
5. The browser then interacts with the model.
6. The model returns model-specific content.
7. Control returns to the model's accompanying script or assembly, which transforms the model-specific response into a model-independent response.
8. This model-independent response is returned to the browser-native object wrapped as a JavaScript object for the LanguageModel interface.
9. The JavaScript interface for LanguageModel returns the model-independent response.
10. The developer's JavaScript receives the model-independent response.
This envisioned sequence of events is a simplification, however, as the Prompt API's LanguageModel interface provides a return type of Promise<DOMString> for its prompt() method.
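For concreteness, the two ends of this sequence might look as follows, where the developer-side calls follow the Prompt API explainer and everything on the model-script side reuses the assumed names from the sketches in Sections 4.1 and 4.2.

// --- Developer's JavaScript (steps 1 and 10) ---
const session = await LanguageModel.create();
const reply = await session.prompt("Why is the sky blue?");
console.log(reply);

// --- Model's JavaScript or WASM (steps 4 and 7), a separate module with assumed names ---
import { buildPrompt, parseResponse } from "./model-exports.js"; // the Section 4.1 sketch

export async function handlePrompt(messages) {
  const modelSpecificPrompt = buildPrompt(messages);          // step 4: lower-level prompt
  const rawOutput = await globalThis.kernel.enqueue({         // steps 4 and 5: access the model
    input: modelSpecificPrompt,
  });
  return parseResponse(rawOutput);                            // step 7: model-independent response
}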
5. Conclusion
Thank you for any comments, feedback, or discussion with which to improve this feature request.