ElenaChes/discordjs-image-reader

Discord.js Image-Reader Bot

An AI-powered accessibility Discord bot written in Node.js, using Discord.js, MongoDB and Google Gemini API.
The image-reader bot is designed to describe and transcribe images for visually impaired users. It offers a manual image-reading command and an auto-reading system that automatically describes every image sent in a designated channel.


Project Structure

The relevant files in the src folder:

  • commands/ - Slash and context command declarations:
    • context/prompt.js - The "Set prompt" message context menu command.
    • regular/ - Commands accessible to all users:
      • bot.js - Global commands: Bot's description, ping and command list. (from template)
      • checkChat.js - The /check-chat command.
      • readImage.js - The /read-image command.
    • serverLocked/ - Commands that register only in certain servers:
      • homeGuild/manage.js - The /manage bot owner commands, registered only in the testing server. (from template)
    • staff/ - Admin and mod commands:
      • chat.js - The /chat commands: Add/Remove auto-reading. (admin only)
      • chatsList.js - The /chats-list command.
      • prompt.js - The /prompt command: Edit auto-reading prompt.
  • config/info.json - User-facing messages and default display values. Also contains the system instructions and default prompt.
  • events/client/messageCreate.js - Handles plain message commands and images sent in auto-reading channels.
  • functions/
    • commands/ - The logic of the commands declared in src/commands.
    • runtime/ - Functions used during runtime:
      • services/chatService.js - Functions managing the auto-reading channel database operations.
  • schemas/ - MongoDB schemas:
    • access.js - Defines the IDs of the bot's owner and testing server. (from template)
    • chat.js - Structure for saving auto-reading channels and their custom prompts.
    • profile.js - Additional data about the bot: testing/main, presence and log-channel. Created automatically. (from template)
  • app.js - The app's bootstrapper. (from template)
  • index.js - The app's entry point. (from template)

Note

Other folders and files come from my Modular Discord.js Bot Template.
See the repo for in-depth documentation.

Dependencies

  1. Node.js 22.14.0
  2. npm 10.1.0

The app may work with other versions, but these are the versions that were used during development.

Installation

Note

For a more in-depth explanation covering Discord permissions, environment setup and app flags, see the template's installation process.

  1. Create a new application in the Discord Developer Portal if you don't already have one.
  2. Get an API key from Google AI Studio (Gemini Developer Portal).
  3. Open a MongoDB project if you don't already have one.
  4. Create a .env file in the root directory of the project and fill it with the following:
TOKEN=<Discord bot token>
API_KEY=<Gemini API key>
DBURL=mongodb+srv://<USERNAME>:<PASSWORD>@<CLUSTER>.mongodb.net/<DATABASE>?retryWrites=true&w=majority
  5. Create a document in your MongoDB database according to the schema in access.js, for example:
{
  "IDs": [
    {
      "label": "OWNER_ID",
      "value": "<Bot's owner Discord user ID>",
      "roles": [ "owner" ],
      "idType": "user"
    },
    {
      "label": "HOME_GUILD",
      "value": "<Bot's testing Discord server ID>",
      "roles": [ "homeGuild" ],
      "idType": "guild"
    }
  ],
  "Aliases": [
    { "label": "<Discord server ID>", "value": "<alias for the server name>" }
  ]
}

Note

Aliases are optional but useful for servers with long or unusual names that might clutter the logs.

  6. Run npm i to install all dependencies.
  7. Start the bot (src/index.js) using node . or npm start.
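
A missing entry in .env is an easy mistake to make during setup. The sketch below shows one way to check for the three variables listed above before starting the bot. The helper name findMissingEnvVars is hypothetical and not part of the repository, and the check assumes the variables have already been loaded into process.env (for example by the template's own loader, or by Node's --env-file flag).

```javascript
// Hypothetical helper: the variable names come from the .env example above,
// but this function is not part of the repository.
const REQUIRED_ENV_VARS = ["TOKEN", "API_KEY", "DBURL"];

function findMissingEnvVars(env = process.env) {
  // A variable counts as missing when it is unset, empty, or only whitespace.
  return REQUIRED_ENV_VARS.filter(
    (name) => !env[name] || env[name].trim() === ""
  );
}

const missing = findMissingEnvVars();
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```

Passing a plain object instead of process.env, as in findMissingEnvVars({ TOKEN: "abc" }), makes the check easy to try without touching the real environment.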

Image Reader Commands

Regular User

/check-chat

Check if auto-reading is enabled in a channel and what prompt will be used.

Arguments:

  1. channel: The channel to check. Defaults to the current channel if none is specified.
    • type: Text channel

Note

Plain message version: check-chat(#channel)

/read-image

Read and describe the provided image(s).

Note

Plain message version: read-image(prompt), with at least one image attached.

Arguments:

  1. image1: An image to read.
    • type: Attachment - jpeg / png / webp | required
  2. image2-10: Additional images to read (up to 9 more).
    • type: Attachment - jpeg / png / webp
  3. prompt: Specific questions about the image(s) or the response's format.
    • type: Text
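
The limits above (up to ten attachments, jpeg / png / webp only) can be illustrated with a small filter. This is only a sketch under stated assumptions: selectReadableImages is a hypothetical name, and the input is assumed to be a list of plain objects carrying a contentType string, similar to what discord.js exposes on its attachments. The repository's own validation may be structured differently.

```javascript
// Assumption: each attachment is an object with a `contentType` string
// (or null), mirroring the property discord.js exposes on attachments.
const SUPPORTED_TYPES = new Set(["image/jpeg", "image/png", "image/webp"]);
const MAX_IMAGES = 10; // image1 plus image2-10

function selectReadableImages(attachments) {
  return attachments
    .filter((attachment) => {
      // Drop parameters such as "; charset=..." and normalize case.
      const type = (attachment.contentType ?? "")
        .split(";")[0]
        .trim()
        .toLowerCase();
      return SUPPORTED_TYPES.has(type);
    })
    .slice(0, MAX_IMAGES); // keep at most ten images, in the order received
}
```

Unsupported files are dropped silently here. A real command would more likely tell the user which attachments were skipped.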

Server Staff

/chat add (admin only)

Add auto-reading in a channel.

Arguments:

  1. channel: Where to add auto-reading. Defaults to the current channel if none is specified.
    • type: Text channel
  2. prompt: A custom prompt for the bot to use in this channel.
    • type: Text

Note

Plain message version: chat.add(#channel, prompt)

/chat remove (admin only)

Remove auto-reading from a channel.

Arguments:

  1. channel: Where to remove auto-reading. Defaults to the current channel if none is specified.
    • type: Text channel

Note

Plain message version: chat.remove(#channel)

/chat remove-all (admin only)

Stop auto-reading in all channels in the server.

Note

Plain message version: chat.remove-all()

/chats-list

Provides a list of all channels where the bot is currently set up for auto-reading.

Note

Plain message version: chats-list()

/prompt

Edit a channel's custom prompt.

Arguments:

  1. channel: The channel to edit. Defaults to the current channel if none is specified.
    • type: Text channel
  2. prompt: Custom prompt for the channel. Leave empty to remove the prompt.
    • type: Text

Note

Plain message version: chat.prompt(#channel, prompt)
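
The plain message versions listed in this section all follow a name(arguments) shape. As a rough illustration of how such a call could be split into its parts, here is a minimal sketch. parsePlainCommand is a hypothetical function, and the actual parsing lives in the template and may behave differently. The sketch relies on one fact about Discord: a #channel mention appears in raw message content as <#ID>.

```javascript
// Hypothetical parser; the template's real implementation may differ.
function parsePlainCommand(content) {
  // Name: word characters, dots and hyphens.
  // Arguments: everything between the outermost parentheses.
  const match = /^([\w.-]+)\(([\s\S]*)\)$/.exec(content.trim());
  if (!match) return null;

  const [, name, rawArgs] = match;
  let channelId = null;
  let rest = rawArgs.trim();

  // A leading channel mention (<#ID>) is taken as the channel argument.
  const mention = /^<#(\d+)>\s*,?\s*/.exec(rest);
  if (mention) {
    channelId = mention[1];
    rest = rest.slice(mention[0].length);
  }

  // Whatever remains is treated as the prompt, commas included.
  return { name, channelId, prompt: rest.length > 0 ? rest : null };
}
```

Because only the leading channel mention is split off, a prompt such as "Describe briefly, in one sentence" keeps its commas intact.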

Message Context Command - Set prompt

Edit a channel's custom prompt using an existing message's content.
To use, right-click or long-tap a message with the desired prompt -> Apps -> Set prompt.

Template Commands

Note

For an in-depth explanation of the template's built-in commands and how to use them via plain message command calls, see the template's Built-in Commands and Components.

Global Commands

  • /bot about - Provides the bot's "about" description.
  • /bot commands - Provides descriptions of the bot's commands. Optionally, accepts the name of a category.
  • /bot ping - Provides the bot's ping and performs a simple permissions check (response changes based on whether the owner ran the command).

Owner Commands

  • /manage commands loaded - View the currently loaded commands.
  • /manage commands reload - Reload bot's commands. (limited to once every 10 minutes)
  • /manage log-channel check - Check the current log channel.
  • /manage log-channel update - Update bot's log channel.
  • /manage presence - Update bot's presence, leave all empty to reload from database.
  • /manage reload-database - Reload information from database.
  • /manage state - Update bot's state (lets you disable bot responses by putting it in sleep mode).

Usage

  • Use the /read-image command to manually send images and receive a description or transcription.
  • Server admins can set up the auto-reading feature in specific channels using /chat add and adjust reading instructions using /prompt.
  • Use the /check-chat command to verify if a channel has auto-reading enabled and view its current prompt.
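
To make the relationship between /chat add, /chat remove, /chat remove-all, /prompt and /check-chat concrete, the sketch below models the auto-reading state with an in-memory Map. It is a stand-in only: the bot itself stores this data in MongoDB through chatService.js, and the class name, method names and DEFAULT_PROMPT value here are all hypothetical.

```javascript
// In-memory stand-in for the MongoDB-backed chat service.
// All names are hypothetical; the real default prompt lives in config/info.json.
const DEFAULT_PROMPT = "Describe this image.";

class AutoReadStore {
  constructor() {
    this.channels = new Map(); // channelId -> { guildId, prompt }
  }

  // /chat add: enable auto-reading, optionally with a custom prompt.
  add(guildId, channelId, prompt = null) {
    this.channels.set(channelId, { guildId, prompt });
  }

  // /chat remove: returns true if the channel was registered.
  remove(channelId) {
    return this.channels.delete(channelId);
  }

  // /chat remove-all: returns how many channels were removed in the server.
  removeAll(guildId) {
    let removed = 0;
    for (const [channelId, entry] of this.channels) {
      if (entry.guildId === guildId) {
        this.channels.delete(channelId);
        removed++;
      }
    }
    return removed;
  }

  // /prompt: an empty prompt clears the custom prompt.
  setPrompt(channelId, prompt) {
    const entry = this.channels.get(channelId);
    if (!entry) return false;
    entry.prompt = prompt || null;
    return true;
  }

  // /check-chat: report whether auto-reading is on and which prompt applies.
  check(channelId) {
    const entry = this.channels.get(channelId);
    if (!entry) return { enabled: false, prompt: null };
    return { enabled: true, prompt: entry.prompt ?? DEFAULT_PROMPT };
  }
}
```

The fallback in check reflects the behavior described above: a channel without a custom prompt still has auto-reading enabled and uses the default prompt.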

Important

The bot is configured to process only one image-reading request at a time.
If you are deploying this bot for a large user base and prioritize performance over potentially higher API costs, you can remove the busy boolean check in src/functions/commands/regular/readImage.js.
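
For readers unfamiliar with this kind of guard, the following is a minimal sketch of a single-request "busy" flag. It is not the code from readImage.js, which may be structured differently. handleReadRequest and describeImages are hypothetical names, with describeImages standing in for the actual image-reading call.

```javascript
// Hypothetical guard; `describeImages` stands in for the real image-reading call.
let busy = false;

async function handleReadRequest(describeImages) {
  if (busy) {
    // Another request is in flight, so reject instead of queueing.
    return { ok: false, reason: "busy" };
  }
  busy = true;
  try {
    return { ok: true, result: await describeImages() };
  } finally {
    busy = false; // always release the flag, even if the call throws
  }
}
```

Removing the check allows requests to run concurrently, which is the trade-off mentioned above: better responsiveness in exchange for more simultaneous API calls.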

Caution

In case of misuse or a significant spike in traffic, you can quickly disable all bot activity:
Use the command /manage state and set the state to "sleep".
(To resume activity use the command again and set it to "active")
