Run your own secure version of ChatGPT Enterprise in just a few minutes ⚡️
Multimodal GPT Slackbot is a Docker image that lets businesses give their teams secure access to OpenAI's latest AI models. It uses OpenAI's APIs on the backend and Slack as the user interface, so users can work with multimodal AI capabilities directly within Slack.
- ✅ Conversation History
- ✅ GPT Vision Support
- ✅ Image Generation
- ✅ Text to Speech Conversion
- ✅ Speech to Text Conversion
To run Multimodal GPT Slackbot, you can choose from several options:
Option 1: Use the pre-built Docker image

- Pull the Docker image:

      docker pull skydockai/multimodal_gpt_slackbot:latest

- Configure environment variables: Download the `config.env` file and update the first three variables (`SLACK_SOCKET_TOKEN`, `SLACK_BOT_USER_TOKEN`, and `OPENAI_KEY`) with your Slack app tokens and OpenAI API key.
  (Note: If you use Azure OpenAI instead of OpenAI, please see the Azure OpenAI instructions.)
- Run the Docker image:

      docker run --env-file ./config.env skydockai/multimodal_gpt_slackbot:latest

Option 2: Build the Docker image from source

- Clone the source code:

      git clone https://github.com/skydockAI/multimodal_gpt_slackbot.git

- Build the Docker image:

      docker build -t multimodal_gpt_slackbot:latest .

- Configure and run: Configure your `config.env` as in the pre-built Docker image setup, then run the image you just built (`multimodal_gpt_slackbot:latest`) with the same `docker run --env-file ./config.env` command.

Option 3: Use Docker Compose

- Clone the source code:

      git clone https://github.com/skydockAI/multimodal_gpt_slackbot.git

- Configure environment variables: Update `config.env` as described in Option 1.
- Run with Docker Compose:

      docker compose up
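All of the setup options rely on `config.env` containing your Slack tokens and OpenAI key. As a rough pre-flight check, the Python sketch below reports which of the three required variables are missing or empty before you start the container. It is not part of the project: the helper names `parse_env_file` and `missing_keys` are hypothetical, and the parser assumes the simple `KEY=VALUE` lines with `#` comments that Docker env files use.

```python
# Illustrative pre-flight check for config.env (not part of the project).
# Variable names come from the README; the helper names are hypothetical.
REQUIRED_KEYS = ("SLACK_SOCKET_TOKEN", "SLACK_BOT_USER_TOKEN", "OPENAI_KEY")


def parse_env_file(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or have an empty value."""
    values = parse_env_file(text)
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(open("config.env").read())` returns an empty list when all three variables have a value. It cannot tell whether a token is actually valid, only that something has been filled in.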
Feature details:

- Conversation History: Maintains the context of each conversation within a Slack thread, so follow-up messages build on earlier ones.
- GPT Vision Support: Uses the gpt-4-turbo model for vision capabilities.
- Image Generation: Uses the DALL-E models to generate images.
- Text to Speech Conversion: Converts text messages into spoken audio, improving accessibility.
- Speech to Text Conversion: Uses the Whisper model to transcribe speech into text.
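To illustrate what per-thread conversation history can look like, here is a minimal Python sketch. It is an assumption about the general technique, not the project's actual implementation: the `ThreadHistory` class and its `max_messages` limit are hypothetical. It relies only on the `channel`, `ts`, and `thread_ts` fields that Slack includes in message events, and stores messages in the role/content shape used by chat-style APIs.

```python
# Hypothetical sketch of thread-scoped history; not the project's own code.
class ThreadHistory:
    """Keeps a separate, bounded message list for each Slack thread."""

    def __init__(self, max_messages: int = 20):
        self.max_messages = max_messages  # assumed cap; should be at least 1
        self._threads = {}

    @staticmethod
    def _key(event: dict) -> tuple:
        # Replies carry thread_ts; a top-level message starts a new thread
        # identified by its own ts.
        return (event["channel"], event.get("thread_ts") or event["ts"])

    def add(self, event: dict, role: str, text: str) -> None:
        messages = self._threads.setdefault(self._key(event), [])
        messages.append({"role": role, "content": text})
        # Drop the oldest entries once the thread exceeds the cap.
        del messages[:-self.max_messages]

    def get(self, event: dict) -> list:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._threads.get(self._key(event), []))
```

With this shape, the bot would call `add` for each incoming user message and each reply, and pass `get(event)` as context when answering the next message in the same thread. Messages in other threads never mix in, because the key includes both the channel and the thread timestamp.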

Multimodal GPT Slackbot is open-source and licensed under the GPL-3.0 license.