Skip to content

Conversation

@CHLK
Copy link
Collaborator

@CHLK CHLK commented Jan 9, 2026

Summary

Key Features

  • Deployment Script: New deploy.sh script for easy setup and deployment from source code
  • Multi-Instance Support: Support for running multiple ragflow_server instances in Docker and deployment scenarios
  • Docker Configuration: Enhanced Docker Compose configurations and entrypoint scripts
  • Environment Configuration: Improved environment variable handling and service configuration templates
  • Nginx Configuration: Updated Nginx configuration for better load balancing and routing

Changes

  • Added scripts/deploy.sh - Comprehensive deployment script (1490+ lines)
  • Enhanced docker/entrypoint.sh with multi-instance support
  • Updated docker/docker-compose.yml and docker-compose-self-hosted-ob.yml
  • Improved docker/nginx/ragflow.conf for load balancing
  • Added environment variable examples in docker/.env.example
  • Updated service configuration templates
  • Fixed several bugs in deployment scripts and entrypoint handling

Solution Description

@CHLK CHLK self-assigned this Jan 9, 2026
@CHLK CHLK added the enhancement New feature or request label Jan 9, 2026
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


keyang.lk seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces comprehensive deployment and multi-instance support for RAGFlow by adding a new deployment script (deploy.sh) and enhancing Docker configurations to support running multiple ragflow_server instances for better scalability and load balancing.

Key Changes:

  • New 1490-line deployment script with multi-instance support, component management, and port conflict detection
  • Enhanced secret key management to ensure consistent JWT signing across multiple server instances
  • Updated Docker entrypoint and nginx configuration for load balancing across multiple server instances

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
scripts/deploy.sh Comprehensive deployment script (1490 lines) with multi-instance ragflow_server support, component toggles, port management, and service orchestration
scripts/README.md Documentation for deployment script usage, configuration options, and multi-instance setup examples
docker/entrypoint.sh Enhanced with multi-instance support, secret key management, and dynamic config generation
docker/nginx/ragflow.conf Updated to use upstream load balancing with dynamically generated server lists
docker/service_conf.yaml.template Added template variable support for ports and secret key
docker/docker-compose.yml Added PowerRAG port mapping and notes for extra instance ports
docker/docker-compose-self-hosted-ob.yml Made Gotenberg service optional via profiles
docker/.env.example Added multi-instance configuration variables
mcp/server/server.py Migrated from print statements to proper logging with file output
common/log_utils.py Modified to support multiple log file basenames for multi-instance scenarios
common/constants.py Made SERVICE_CONF configurable via RAGFLOW_SERVICE_CONF environment variable
api/ragflow_server.py Added support for per-instance log basename configuration
api/utils/api_utils.py Improved logging exception messages
.gitignore Added generated files from deployment script

Comment on lines +21 to 31
import os
import logging

from common.log_utils import init_root_logger
from plugin import GlobalPluginManager

import logging
import os
# Initialize logging as early as possible so show_configs() (INFO logs) won't be dropped.
# deploy.sh sets RAGFLOW_LOG_BASENAME=ragflow_server_<port> for multi-instance runs.
init_root_logger(os.environ.get("RAGFLOW_LOG_BASENAME", "ragflow_server"))
import signal
import sys
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import statement moved after init_root_logger() call but before the imports it depends on. Lines 30-31 import signal and sys, which come after the init_root_logger() call on line 29. While this works, it creates an unusual import order. If init_root_logger() internally depends on these modules or if there are circular dependencies, this could cause issues. Consider keeping all imports together at the top of the file, or add a comment explaining why logging must be initialized before other imports.

Copilot uses AI. Check for mistakes.
Comment on lines 342 to 350
if [[ "${ENABLE_WEBSERVER}" -eq 1 ]]; then
echo "Starting ragflow_server..."
start_ragflow_servers

# nginx upstream include files are generated by start_ragflow_servers;
# start nginx after generation so it picks them up (no reload needed).
echo "Starting nginx..."
/usr/sbin/nginx

echo "Starting ragflow_server..."
while true; do
"$PY" api/ragflow_server.py ${INIT_SUPERUSER_ARGS} &
wait;
sleep 1;
done &
fi
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing start logic for PowerRAG server. ENABLE_POWERRAG_SERVER is defined and set to 1 by default on line 37, and the start_powerrag_server() function is defined on lines 264-270, but the visible component startup section (lines 342-350) doesn't include a check to start the PowerRAG server. Based on the pattern for MCP server and admin server, there should be a corresponding if block checking ENABLE_POWERRAG_SERVER and calling start_powerrag_server.

Copilot uses AI. Check for mistakes.

```bash
./scripts/deploy.sh start \
--enable-webserver
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing line continuation backslash. Line 106 ends with "--enable-webserver" but there's no backslash before the newline, causing the command to be incomplete. This will result in a syntax error when users try to copy-paste this command.

Suggested change
--enable-webserver
--enable-webserver \

Copilot uses AI. Check for mistakes.
# - set per-instance logfile basename so logs are split by port
RAGFLOW_SERVICE_CONF="${conf_name}" \
RAGFLOW_LOG_BASENAME="ragflow_server_${port}" \
"$PY" api/ragflow_server.py ${INIT_SUPERUSER_ARGS} &
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable INIT_SUPERUSER_ARGS is used on line 303 but is not defined anywhere in the visible changes. If this variable was previously defined and is still needed, it should be initialized with a default value (e.g., INIT_SUPERUSER_ARGS="${INIT_SUPERUSER_ARGS:-}") to prevent potential undefined variable errors when running with "set -u".

Copilot uses AI. Check for mistakes.

def get_data_error_result(code=RetCode.DATA_ERROR, message="Sorry! Data missing!"):
logging.exception(Exception(message))
logging.exception(f"Data error: {message}")
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logging.exception() call here will log an exception that wasn't caught - it's not inside an exception handler. Use logging.error() instead of logging.exception() when you're not in an exception context. logging.exception() should only be used within an except block to include the traceback.

Suggested change
logging.exception(f"Data error: {message}")
logging.error(f"Data error: {message}")

Copilot uses AI. Check for mistakes.
#### 启动 task executors(range 模式)

```bash
./scripts/deploy.sh start --enable-taskexecutor\
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing line continuation backslash. Line 121 ends with "--enable-taskexecutor" but there's no backslash before the newline, causing the command to be incomplete. This will result in a syntax error when users try to copy-paste this command.

Suggested change
./scripts/deploy.sh start --enable-taskexecutor\
./scripts/deploy.sh start --enable-taskexecutor \

Copilot uses AI. Check for mistakes.
ENABLE_DATASYNC=${ENABLE_DATASYNC:-1}
ENABLE_MCP_SERVER=${ENABLE_MCP_SERVER:-0}
ENABLE_ADMIN_SERVER=${ENABLE_ADMIN_SERVER:-0} # Default close admin server
ENABLE_POWERRAG_SERVER=${ENABLE_POWERRAG_SERVER:-1} # Default close PowerRAG server
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default value for ENABLE_POWERRAG_SERVER is inconsistent. Line 37 sets it to 1 (enabled by default), but this contradicts the comment on line 36 which says "Default close PowerRAG server" and the pattern established by other optional components (MCP and ADMIN_SERVER) which default to 0. The default should likely be 0 to match the comment and the "optional component" pattern.

Suggested change
ENABLE_POWERRAG_SERVER=${ENABLE_POWERRAG_SERVER:-1} # Default close PowerRAG server
ENABLE_POWERRAG_SERVER=${ENABLE_POWERRAG_SERVER:-0} # Default close PowerRAG server

Copilot uses AI. Check for mistakes.

if [[ "${idx}" -eq 0 ]]; then
port="${SVR_HTTP_PORT}"
conf_name="local.service_conf.yaml"
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent configuration file naming between deploy.sh and entrypoint.sh. In deploy.sh (line 647), the main instance uses GLOBAL_SERVICE_CONF (defaults to "service_conf.yaml"), while in entrypoint.sh (lines 282, 319), it uses "local.service_conf.yaml". This inconsistency could cause configuration issues when switching between deployment modes. Consider using a consistent naming scheme or documenting this intentional difference.

Suggested change
conf_name="local.service_conf.yaml"
conf_name="service_conf.yaml"

Copilot uses AI. Check for mistakes.
)

from peewee import OperationalError
from werkzeug.exceptions import NotFound
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'NotFound' is not used.

Suggested change
from werkzeug.exceptions import NotFound

Copilot uses AI. Check for mistakes.
Comment on lines +223 to +228
rm -f "${out_file}"
while IFS= read -r line || [[ -n "$line" ]]; do
# shellcheck disable=SC2034
SVR_HTTP_PORT="${ragflow_port}" ADMIN_SVR_HTTP_PORT="${admin_port}" \
eval "echo \"$line\"" >> "${out_file}"
done < "${TEMPLATE_FILE}"
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The render_service_conf function uses eval "echo \"$line\"" to expand environment variables from service_conf.yaml.template, which introduces a shell code injection risk. If an attacker can influence any of the environment variables used in the template (e.g., via orchestrator/UI that lets them set REDIS_PASSWORD, RAGFLOW_HOST, etc.), they can embed command substitutions like $(...) so that when eval re-parses the expanded line, arbitrary commands execute inside the container with the entrypoint's privileges. Replace the eval-based templating with a non-evaluating mechanism (e.g., envsubst or a simple variable substitution script that does not invoke the shell parser on untrusted data) so that environment values are treated as data, not executable shell code.

Suggested change
rm -f "${out_file}"
while IFS= read -r line || [[ -n "$line" ]]; do
# shellcheck disable=SC2034
SVR_HTTP_PORT="${ragflow_port}" ADMIN_SVR_HTTP_PORT="${admin_port}" \
eval "echo \"$line\"" >> "${out_file}"
done < "${TEMPLATE_FILE}"
if ! command -v envsubst >/dev/null 2>&1; then
echo "ERROR: envsubst is required to render service_conf from template." >&2
return 1
fi
rm -f "${out_file}"
SVR_HTTP_PORT="${ragflow_port}" ADMIN_SVR_HTTP_PORT="${admin_port}" \
envsubst < "${TEMPLATE_FILE}" > "${out_file}"

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants