Code Execution Service
The code execution service runs user-submitted Python code through a queue-based architecture: it accepts execution requests, processes them asynchronously, and returns the execution results.
Overview
The code execution service is a Flask-based microservice that accepts Python code execution requests from the frontend, executes code in isolated Docker containers, and returns execution results (stdout, stderr, exit code, execution time).
Architecture
Service Components
The service consists of three main components:
- API Service (code-runner): Flask application that handles HTTP requests
  - Validates incoming requests
  - Enqueues jobs to Redis queue
  - Waits for execution results
  - Returns results to clients
- Worker Service (worker): Async worker that processes execution jobs
  - Continuously polls the Redis queue for jobs
  - Executes code in an isolated subprocess
  - Stores results back in Redis
  - Handles errors gracefully
- Redis Service: Message queue and result storage
  - Job queue (FIFO) for pending executions
  - Result storage with TTL (5 minutes)
  - Metrics collection (1 hour retention)
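For orientation, here is a minimal sketch of the worker loop described above, assuming a Redis list named execution:queue and a result:<job_id> key convention (both names are illustrative, not the service's actual schema):

```python
import json
import os
import subprocess
import tempfile

import redis  # requires the redis-py package

r = redis.Redis(host="redis", port=6379, decode_responses=True)

QUEUE_KEY = "execution:queue"   # assumed queue name
RESULT_TTL = 300                # results kept in Redis for 5 minutes


def worker_loop() -> None:
    """Pop jobs off the Redis queue, run them, and store results for the API."""
    while True:
        # BLPOP blocks until a job arrives; with RPUSH on the API side this is FIFO.
        _, raw_job = r.blpop(QUEUE_KEY)
        job = json.loads(raw_job)

        # Write the submitted code to a temporary file inside /tmp/code-execution.
        with tempfile.NamedTemporaryFile(
            "w", suffix=".py", dir="/tmp/code-execution", delete=False
        ) as f:
            f.write(job["code"])
            path = f.name

        try:
            proc = subprocess.run(
                ["python3", path],
                input=job.get("stdin", ""),
                capture_output=True,
                text=True,
                timeout=5,  # execution time limit from the Restrictions section
            )
            result = {
                "stdout": proc.stdout[: 10 * 1024],   # truncate output to 10KB
                "stderr": proc.stderr[: 10 * 1024],
                "exitCode": proc.returncode,
            }
        except subprocess.TimeoutExpired:
            result = {"stdout": "", "stderr": "Execution timed out", "exitCode": -1}
        finally:
            os.unlink(path)

        # Store the result under the job ID with a TTL so the API can retrieve it.
        r.setex(f"result:{job['id']}", RESULT_TTL, json.dumps(result))
```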
Network Architecture
The service uses a two-network Docker architecture:
- External Network: Allows frontend access to the API service at http://localhost:5001
- Internal Network: Worker runs on an isolated network for code execution
How It Works
Execution Flow
1. Request Received: Frontend sends a POST request to /execute with:
   - code: Python code to execute
   - userId: User identifier (required for rate limiting)
   - stdin: Optional standard input
2. Validation: API validates:
   - Request format (JSON)
   - Required fields (code, userId)
   - Size limits (see Restrictions section)
3. Job Enqueueing:
   - Unique job ID generated
   - Job data pushed to Redis queue
   - Queue length recorded for metrics
4. Worker Processing:
   - Worker dequeues job from Redis
   - Code written to a temporary file in /tmp/code-execution
   - Subprocess execution with restrictions (see Restrictions section)
   - Output captured (stdout, stderr)
   - Result stored in Redis with TTL
5. Result Retrieval:
   - API waits up to 10 seconds for the result
   - Returns execution results or a timeout error
   - Metrics logged (execution time, queue length, userId)
6. Response: JSON response with stdout, stderr, exitCode, and executionTime
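A correspondingly minimal sketch of the API side of this flow, reusing the same assumed queue and result-key names as the worker sketch above (validation is abbreviated to the required fields and the code size limit):

```python
import json
import time
import uuid

import redis
from flask import Flask, jsonify, request

app = Flask(__name__)
r = redis.Redis(host="redis", port=6379, decode_responses=True)

QUEUE_KEY = "execution:queue"   # assumed queue name (same as the worker sketch)
RESULT_WAIT_SECONDS = 10        # API waits up to 10 s for a result
MAX_CODE_SIZE = 50 * 1024       # 50KB code limit


@app.route("/execute", methods=["POST"])
def execute():
    body = request.get_json(silent=True)
    if not body or "code" not in body or "userId" not in body:
        return jsonify({"error": "code and userId are required"}), 400
    if len(body["code"]) > MAX_CODE_SIZE:
        return jsonify({"error": "code exceeds size limit"}), 400

    # Enqueue the job; RPUSH returns the new queue length, which the service
    # records for metrics.
    job_id = str(uuid.uuid4())
    queue_length = r.rpush(QUEUE_KEY, json.dumps({"id": job_id, **body}))

    # Poll for the result the worker stores under result:<job_id>.
    deadline = time.monotonic() + RESULT_WAIT_SECONDS
    while time.monotonic() < deadline:
        raw = r.get(f"result:{job_id}")
        if raw is not None:
            return jsonify(json.loads(raw)), 200
        time.sleep(0.1)

    return jsonify({"error": "execution timed out in queue"}), 408
```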
Restrictions
All code execution is subject to the following restrictions:
Input Limits:
- Code size: 50KB maximum
- Stdin size: 10KB maximum
Execution Limits:
- CPU time: 5 seconds maximum
- Memory: 512MB maximum
- File size: 1MB maximum
- Open files: 10 maximum
- Child processes: 0 (no subprocess creation)
Output Limits:
- stdout/stderr: Truncated to 10KB
Container Restrictions:
- Filesystem: Read-only except /tmp/code-execution
- Network: No external access (worker on internal network)
- User: Non-root execution
- Capabilities: Minimal Linux capabilities
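On the process level, limits like these are typically enforced with the standard-library resource module before the user's code starts. A minimal sketch, assuming the worker applies them through a subprocess preexec_fn (the job file path is hypothetical):

```python
import resource
import subprocess


def apply_limits() -> None:
    """Apply per-process limits; runs in the child after fork, before exec."""
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))                    # 5 s CPU time
    resource.setrlimit(resource.RLIMIT_AS, (512 * 1024 * 1024,) * 2)   # 512MB memory
    resource.setrlimit(resource.RLIMIT_FSIZE, (1024 * 1024,) * 2)      # 1MB file size
    resource.setrlimit(resource.RLIMIT_NOFILE, (10, 10))               # 10 open files
    resource.setrlimit(resource.RLIMIT_NPROC, (0, 0))                  # no child processes


proc = subprocess.run(
    ["python3", "/tmp/code-execution/job.py"],  # hypothetical job file
    preexec_fn=apply_limits,
    capture_output=True,
    text=True,
    timeout=5,
)
```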
Assumptions
The service makes the following assumptions:
- User Authentication: userId is provided in requests. Currently uses the placeholder 'dev-user' until the authentication system is implemented.
- Python Environment: Code executes in Python 3.12 with the standard library. Common packages (numpy, pandas) are available but subject to resource limits.
- Single Execution: Each request executes one code snippet. No persistent state between executions.
- Queue Processing: Worker processes jobs sequentially (FIFO). Multiple workers can be added for horizontal scaling.
- Result Retrieval: Results are stored in Redis for 5 minutes. Clients must retrieve results within this window.
- Code Sandboxing: No additional code sandboxing (RestrictedPython) is currently implemented.
API Endpoints
POST /execute
Execute Python code.
Request Body:
{
"code": "print('Hello, World!')",
"userId": "user123",
"stdin": "optional input"
}
Response (200 OK):
{
"stdout": "Hello, World!\n",
"stderr": "",
"exitCode": 0,
"executionTime": 0.123
}
Error Responses:
- 400 Bad Request: Validation errors (missing fields, size limits exceeded)
- 408 Request Timeout: Job timed out waiting in queue
- 429 Too Many Requests: Rate limit exceeded (10/min per user)
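As a usage example, a client call against a local instance might look like the following (the requests library and the dev-user placeholder are illustrative):

```python
import requests  # assumes the requests package is installed

resp = requests.post(
    "http://localhost:5001/execute",
    json={
        "code": "print('Hello, World!')",
        "userId": "dev-user",   # placeholder until authentication is integrated
    },
    timeout=15,  # allow for the 10 s queue wait plus network overhead
)

print(resp.status_code)       # 200 on success, 400/408/429 on errors
print(resp.json()["stdout"])  # "Hello, World!\n"
```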
GET /metrics
Retrieve execution metrics.
Response (200 OK):
{
"totalExecutions": 150,
"averageExecutionTime": 0.456,
"averageQueueLength": 2.3,
"executionsPerUser": {
"user123": 45,
"user456": 30
}
}
Rate Limited: 5 requests/minute per IP
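One plausible way to keep these rolling metrics in Redis with the 1-hour retention mentioned earlier; the key name and list-based layout are assumptions, not the service's actual schema:

```python
import json
import time

import redis

r = redis.Redis(host="redis", port=6379, decode_responses=True)

METRICS_KEY = "metrics:executions"  # assumed key
METRICS_TTL = 3600                  # 1 hour retention


def record_execution(user_id: str, execution_time: float, queue_length: int) -> None:
    """Append one execution record and refresh the retention window."""
    entry = {"userId": user_id, "executionTime": execution_time,
             "queueLength": queue_length, "ts": time.time()}
    r.rpush(METRICS_KEY, json.dumps(entry))
    r.expire(METRICS_KEY, METRICS_TTL)


def aggregate_metrics() -> dict:
    """Aggregate records into the shape returned by GET /metrics."""
    entries = [json.loads(raw) for raw in r.lrange(METRICS_KEY, 0, -1)]
    if not entries:
        return {"totalExecutions": 0, "averageExecutionTime": 0,
                "averageQueueLength": 0, "executionsPerUser": {}}
    per_user: dict[str, int] = {}
    for e in entries:
        per_user[e["userId"]] = per_user.get(e["userId"], 0) + 1
    return {
        "totalExecutions": len(entries),
        "averageExecutionTime": sum(e["executionTime"] for e in entries) / len(entries),
        "averageQueueLength": sum(e["queueLength"] for e in entries) / len(entries),
        "executionsPerUser": per_user,
    }
```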
GET /health
Health check endpoint.
Response (200 OK):
{
"status": "healthy"
}
Rate Limiting
- Execution Endpoint: 10 requests/minute per user (based on userId), falls back to IP-based limiting if userId is not provided
- Metrics Endpoint: 5 requests/minute per IP address
- Storage: Rate limit counters stored in Redis (distributed)
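A distributed per-user limit like this can be implemented as a fixed-window counter in Redis. A minimal sketch, assuming a ratelimit:<userId>:<minute> key layout (the real service may rely on a rate-limiting library instead):

```python
import time

import redis

r = redis.Redis(host="redis", port=6379, decode_responses=True)

RATE_LIMIT_PER_MINUTE = 10


def is_rate_limited(user_id: str) -> bool:
    """Fixed-window counter: one Redis key per user per minute."""
    window = int(time.time() // 60)
    key = f"ratelimit:{user_id}:{window}"   # assumed key layout
    count = r.incr(key)
    if count == 1:
        r.expire(key, 60)  # let the window expire on its own
    return count > RATE_LIMIT_PER_MINUTE
```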
Testing
The service includes a comprehensive test suite (80+ tests) covering:
- API Tests: Endpoint validation, error handling, response formats
- Execution Tests: Python features, libraries, output capture
- Restrictions: All restrictions are tested (timeout, memory, CPU, file size, network isolation, filesystem)
- Queue System: Job processing, result storage, timeouts
- Rate Limiting: Enforcement, per-user limits, reset behavior
- Metrics: Collection, aggregation, retrieval
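As an illustration of the API-test style, a happy-path execution test might look roughly like this (the base URL and user ID are assumptions about the actual suite):

```python
import requests

BASE_URL = "http://localhost:5001"  # assumes the service is running locally


def test_execute_returns_stdout():
    """Submitting a trivial print should return exit code 0 and its output."""
    resp = requests.post(
        f"{BASE_URL}/execute",
        json={"code": "print(2 + 2)", "userId": "test-user"},
        timeout=15,
    )
    assert resp.status_code == 200
    body = resp.json()
    assert body["exitCode"] == 0
    assert body["stdout"].strip() == "4"
```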
Running Tests
Tests run in Docker containers for consistency:
# Run all tests
cd code-runner
./run_tests.sh
# Or manually
docker-compose --profile test up -d --build
docker-compose exec test-runner pytest tests/ -v
docker-compose --profile test down
The test suite ensures the service behaves correctly under various conditions.
Configuration
Key configuration values (in code-runner/src/config.py):
- MAX_CODE_SIZE: 50KB (see Restrictions section)
- MAX_STDIN_SIZE: 10KB (see Restrictions section)
- MAX_EXECUTION_TIME: 5 seconds (see Restrictions section)
- MAX_OUTPUT_SIZE: 10KB (see Restrictions section)
- RATE_LIMIT_PER_MINUTE: 10 requests (see Rate Limiting section)
- RESULT_TTL: 300 seconds (5 minutes)
- METRICS_TTL: 3600 seconds (1 hour)
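Expressed as module-level constants, the file would look roughly like this (the variable layout is an assumption; the values are the ones listed above):

```python
# code-runner/src/config.py (sketch)

MAX_CODE_SIZE = 50 * 1024          # 50KB of submitted code
MAX_STDIN_SIZE = 10 * 1024         # 10KB of standard input
MAX_EXECUTION_TIME = 5             # seconds per job
MAX_OUTPUT_SIZE = 10 * 1024        # stdout/stderr truncated to 10KB
RATE_LIMIT_PER_MINUTE = 10         # execution requests per user per minute
RESULT_TTL = 300                   # results kept in Redis for 5 minutes
METRICS_TTL = 3600                 # metrics kept in Redis for 1 hour
```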
Deployment
Starting the Service
# Start code-runner service
npm run code-runner:start
# View logs
npm run code-runner:logs
# Stop service
npm run code-runner:stop
Service Dependencies
- Redis: Required for queue and result storage
- Docker: Required for container execution
Scaling
The service can be scaled horizontally by:
- Adding more worker containers (process more jobs concurrently)
- Using Redis cluster for high availability
- Load balancing API requests across multiple API instances
Future Improvements
- Code Sandboxing: Implement RestrictedPython for additional code restrictions
- Authentication Integration: Replace placeholder userId with actual auth context
- WebSocket Support: Real-time execution updates
- Multiple Languages: Support for languages beyond Python
- Persistent Sessions: Allow stateful code execution across requests