Is Paperless-ngx Down? Real-Time Status & Outage Checker

ezmon • March 27, 2026 • 7 min read

Paperless-ngx is an open-source document management system with over 23,000 GitHub stars. A community-maintained fork of Paperless-ng (itself a fork of the original Paperless project), it OCRs, tags, and archives scanned and digital documents into a searchable, organized library. Key features include OCR via Tesseract (supporting 100+ languages), automatic tagging and correspondent detection, full-text search, a REST API, and a clean web interface. It is self-hostable via Docker and uses PostgreSQL or SQLite for document metadata, Redis as the broker for the Celery task queue, and a consumption directory that it watches for new files. Individuals and small businesses use it to digitize receipts, invoices, contracts, medical records, and tax documents, going fully paperless without handing their data to cloud services.

Paperless-ngx relies on a multi-process architecture: the Django web server, the Celery background workers, the Redis task broker, and, optionally, a separate document consumption process must all be healthy at the same time. A failure in any one layer can stop document processing while the rest of the system still appears healthy from the outside.

Quick Status Check

#!/bin/bash
# Paperless-ngx health check
# Usage: bash check-paperless.sh [host] [port]

HOST="${1:-localhost}"
PORT="${2:-8000}"
BASE_URL="http://${HOST}:${PORT}"

echo "=== Paperless-ngx Health Check ==="
echo "Target: ${BASE_URL}"
echo ""

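# 1. Check API root endpoint
echo "[1/5] Checking API endpoint..."
# Any HTTP response proves the web server is up. Without credentials the API
# usually answers 401/403, so those codes are treated as "reachable".
HTTP_CODE=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "${BASE_URL}/api/" 2>/dev/null)
case "${HTTP_CODE}" in
  2??|3??) echo "  OK  /api/ responded (HTTP ${HTTP_CODE})" ;;
  401|403) echo "  OK  /api/ responded (HTTP ${HTTP_CODE}, authentication required)" ;;
  000|"")  echo "  FAIL  /api/ unreachable: Paperless-ngx web server may be down" ;;
  *)       echo "  FAIL  /api/ returned HTTP ${HTTP_CODE}" ;;
esac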

# 2. Check Celery worker processes
echo "[2/5] Checking Celery worker processes..."
if docker ps --format '{{.Names}}' 2>/dev/null | grep -qi "paperless.*worker\|celery"; then
  echo "  OK  Celery worker container detected"
elif pgrep -f "celery" > /dev/null 2>&1; then
  echo "  OK  Celery worker process is running"
else
  echo "  WARN  No Celery worker detected — document processing may be stalled"
fi

# 3. Check Redis
echo "[3/5] Checking Redis task queue..."
REDIS_HOST="${REDIS_HOST:-localhost}"
if nc -z -w3 "${REDIS_HOST}" 6379 2>/dev/null; then
  echo "  OK  Redis reachable at ${REDIS_HOST}:6379"
else
  echo "  FAIL  Redis not reachable — task queue broken, document consumption will stop"
fi

# 4. Check PostgreSQL or SQLite
echo "[4/5] Checking database..."
DB_HOST="${DB_HOST:-localhost}"
if nc -z -w3 "${DB_HOST}" 5432 2>/dev/null; then
  echo "  OK  PostgreSQL port 5432 open at ${DB_HOST}"
else
  echo "  INFO  PostgreSQL not detected on ${DB_HOST}:5432 (may be SQLite)"
fi

# 5. Check consumption directory is accessible
echo "[5/5] Checking consumption directory..."
CONSUME_DIR="${CONSUME_DIR:-/consume}"
if [ -d "${CONSUME_DIR}" ] && [ -r "${CONSUME_DIR}" ]; then
  PENDING=$(ls "${CONSUME_DIR}" 2>/dev/null | wc -l | tr -d ' ')
  echo "  OK  Consumption directory accessible: ${PENDING} file(s) pending"
else
  echo "  WARN  Consumption directory '${CONSUME_DIR}' not accessible — new scans may be ignored"
fi

echo ""
echo "=== Check complete ==="

Python Health Check

#!/usr/bin/env python3
"""
Paperless-ngx health check
Verifies web server, document count, tag count, task queue health,
storage statistics, and detection of stuck Celery tasks.
"""

import json
import os
import sys
import urllib.request
import urllib.error
from datetime import datetime, timezone

BASE_URL = "http://localhost:8000"
# Token authentication: fill in here or set the PAPERLESS_TOKEN env var
API_TOKEN = ""
TIMEOUT = 10
TASK_STUCK_MINUTES = 60  # warn if a task has been PENDING longer than this


def fetch(url, token=""):
    """GET a JSON endpoint; on failure return a dict containing an "_error" key."""
    headers = {"Accept": "application/json"}
    if token:
        headers["Authorization"] = f"Token {token}"
    try:
        req = urllib.request.Request(url, headers=headers)
        with urllib.request.urlopen(req, timeout=TIMEOUT) as resp:
            return json.loads(resp.read().decode())
    except urllib.error.HTTPError as e:
        body = e.read().decode(errors="ignore") if e.fp else ""
        return {"_error": f"HTTP {e.code}", "_code": e.code, "_body": body[:200]}
    except Exception as e:
        return {"_error": str(e)}


token = API_TOKEN or os.environ.get("PAPERLESS_TOKEN", "")

results = []
print("=== Paperless-ngx Health Check ===")
print(f"Target: {BASE_URL}\n")

# 1. API root: basic availability and, if present, version
print("[1/5] API availability & version...")
r = fetch(f"{BASE_URL}/api/", token)
if "_error" not in r:
    # The JSON body may not include a version field (it may only be exposed
    # in response headers), in which case "unknown" is printed.
    version = r.get("version", "unknown")
    print(f"  [OK  ] API available | version: {version}")
    results.append(True)
elif not token and r.get("_code") in (401, 403):
    # Without a token the request is rejected, but a 401/403 still proves
    # that the web server is up and answering.
    print(f"  [OK  ] API reachable ({r['_error']}, authentication required)")
    results.append(True)
else:
    print(f"  [FAIL] /api/: {r['_error']}")
    results.append(False)
# 2. Document count (requires auth)
print("[2/5] Document library...")
if token:
    r = fetch(f"{BASE_URL}/api/documents/?page_size=1", token)
    if "_error" in r:
        print(f"  [FAIL] /api/documents/: {r['_error']}")
        results.append(False)
    else:
        doc_count = r.get("count", 0)
        print(f"  [OK  ] Document count: {doc_count:,}")
        results.append(True)

    # 3. Tag count
    print("[3/5] Tags & correspondents...")
    r_tags = fetch(f"{BASE_URL}/api/tags/?page_size=1", token)
    r_corr = fetch(f"{BASE_URL}/api/correspondents/?page_size=1", token)
    tag_count = r_tags.get("count", 0) if "_error" not in r_tags else "?"
    corr_count = r_corr.get("count", 0) if "_error" not in r_corr else "?"
    ok = "_error" not in r_tags and "_error" not in r_corr
    level = "OK  " if ok else "FAIL"
    print(f"  [{level}] Tags: {tag_count} | Correspondents: {corr_count}")
    results.append(ok)

    # 4. Task queue — check for stuck PENDING tasks
    print("[4/5] Celery task queue...")
    r = fetch(f"{BASE_URL}/api/tasks/?page_size=50", token)
    if "_error" in r:
        print(f"  [WARN] /api/tasks/ not available: {r['_error']}")
        results.append(True)  # non-fatal if endpoint not present
    else:
        tasks = r.get("results", r) if isinstance(r, dict) else r
        if not isinstance(tasks, list):
            tasks = []
        now = datetime.now(timezone.utc)
        stuck = []
        for t in tasks:
            status = t.get("status", "")
            created = t.get("date_created", "")
            if status == "PENDING" and created:
                try:
                    created_dt = datetime.fromisoformat(created.replace("Z", "+00:00"))
                    age_min = (now - created_dt).total_seconds() / 60
                    if age_min > TASK_STUCK_MINUTES:
                        stuck.append((t.get("task_file_name", "unknown"), int(age_min)))
                except Exception:
                    pass
        if stuck:
            print(f"  [WARN] {len(stuck)} task(s) stuck PENDING > {TASK_STUCK_MINUTES} min:")
            for name, age in stuck[:3]:
                print(f"       '{name}' stuck for {age} min")
            results.append(False)
        else:
            total_tasks = len(tasks)
            print(f"  [OK  ] {total_tasks} recent task(s) — no stuck tasks detected")
            results.append(True)

    # 5. Library statistics
    print("[5/5] Storage statistics...")
    r = fetch(f"{BASE_URL}/api/statistics/", token)
    if "_error" in r:
        print(f"  [WARN] /api/statistics/ unavailable: {r['_error']}")
        results.append(True)
    else:
        doc_count_stat = r.get("documents_total", 0)
        # Field names can differ between releases, so missing keys are
        # reported as "n/a" instead of a misleading zero.
        total_size = r.get("total_file_size")
        size_txt = f"{total_size / (1024 ** 3):.2f} GB" if total_size else "n/a"
        inbox = r.get("documents_inbox", r.get("inbox_count", "n/a"))
        print(f"  [OK  ] Total documents: {doc_count_stat:,} | "
              f"Storage: {size_txt} | Inbox: {inbox}")
        results.append(True)
else:
    # Unauthenticated: just check that API and UI respond
    print("[2/5] Document library (no token — skipping)...")
    print("  [INFO] Set PAPERLESS_TOKEN to enable authenticated checks")
    print("[3/5] Tags (skipped — no token)...")
    print("[4/5] Task queue (skipped — no token)...")
    print("[5/5] Statistics (skipped — no token)...")
    results.extend([True, True, True, True])

# Summary
passed = sum(results)
total = len(results)
print(f"\n=== Summary: {passed}/{total} checks passed ===")
if passed < total:
    print("Action required: review FAIL/WARN items above.")
    sys.exit(1)
else:
    print("Paperless-ngx appears healthy.")
    sys.exit(0)
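
The script above only flags tasks that stay PENDING. A task that ended in failure is an equally useful signal, so the following is a minimal sketch of a helper that counts tasks per status and lists failed files. It assumes each task is a dict with the `status` and `task_file_name` keys used above, and that failed tasks carry the Celery-style status `FAILURE`; `summarize_tasks` is a hypothetical name, not part of Paperless-ngx.

```python
from collections import Counter


def summarize_tasks(tasks):
    """Count tasks per status and collect the file names of failed ones.

    `tasks` is assumed to be a list of dicts shaped like the entries the
    health check reads from /api/tasks/ ("status", "task_file_name").
    """
    counts = Counter(t.get("status", "UNKNOWN") for t in tasks)
    failed = [
        t.get("task_file_name", "unknown")
        for t in tasks
        if t.get("status") == "FAILURE"  # assumed status label for failures
    ]
    return counts, failed


if __name__ == "__main__":
    # Illustrative sample data, not real API output.
    sample = [
        {"status": "SUCCESS", "task_file_name": "invoice.pdf"},
        {"status": "FAILURE", "task_file_name": "scan_0042.pdf"},
        {"status": "PENDING", "task_file_name": "receipt.jpg"},
    ]
    counts, failed = summarize_tasks(sample)
    print(dict(counts))
    print("Failed files:", failed)
```

Feeding it the `tasks` list from step 4 would let the check fail on repeated failures as well as on stuck tasks.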

Common Paperless-ngx Outage Causes

Symptom	Likely Cause	Resolution
New files in consumption directory never appear in the library	Celery worker not running, so document processing tasks are never consumed from the queue	Run `docker compose ps` and check the container that runs the Celery worker (in the official compose files this is typically the `webserver` container); restart it; verify Redis connectivity from that container
Task queue fills up; documents stuck in PENDING indefinitely	Redis unavailable: broker down, OOM-killed, or network issue between containers	Restart the Redis container; verify that the `PAPERLESS_REDIS` env var points to the correct Redis host
Documents imported successfully but have no searchable text; OCR field blank	Tesseract not installed in the container, or the language pack for `PAPERLESS_OCR_LANGUAGE` is missing	Verify Tesseract is installed (`tesseract --version` in container); list installed language packs with `tesseract --list-langs` and install any that are missing; re-run OCR on affected documents
Web UI loads but shows empty library; "0 documents" despite prior imports	PostgreSQL connection lost — database container stopped or credentials changed	Verify PostgreSQL container is running; check `PAPERLESS_DBHOST` and credentials; review Django logs for DB connection errors
Scanner uploads or watched folder drops new files but they are never processed	Consumption directory permissions wrong — Paperless process cannot read new files	Verify directory ownership matches the `USERMAP_UID`/`USERMAP_GID` env vars; fix with `chown -R`
Search returns no results or incorrect results for known document content	Full-text search index needs rebuild — index corrupt or out of sync after migration	Run `document_index reindex` management command in the web container; allow time for reindexing to complete
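
Several rows above share one observable symptom: files sit in the consumption directory and are never picked up. A minimal sketch of a check for that symptom follows. It assumes that successfully consumed files are removed from the directory, so anything older than a threshold is suspicious; `stale_files`, the 30-minute default, and the `/consume` fallback path are illustrative choices, and subdirectories are not scanned.

```python
import os
import time


def stale_files(consume_dir, max_age_minutes=30, now=None):
    """Return (name, age_in_minutes) for regular files older than the threshold."""
    now = time.time() if now is None else now
    stale = []
    with os.scandir(consume_dir) as entries:
        for entry in entries:
            if not entry.is_file():
                continue  # skip subdirectories and other non-regular entries
            age_min = (now - entry.stat().st_mtime) / 60
            if age_min > max_age_minutes:
                stale.append((entry.name, int(age_min)))
    # Oldest files first, since they are the strongest sign of a stall.
    return sorted(stale, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    path = os.environ.get("CONSUME_DIR", "/consume")
    try:
        found = stale_files(path)
    except OSError as exc:
        # Covers a missing directory as well as permission problems.
        print(f"Cannot read {path}: {exc}")
    else:
        for name, age in found:
            print(f"{name}: waiting for {age} min")
        print(f"{len(found)} stale file(s) in {path}")
```

An `OSError` here points at the permissions row of the table, while a growing list of stale files points at the worker or Redis rows.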

Architecture Overview

Component	Function	Failure Impact
Django web server (gunicorn)	REST API, web UI, document search and retrieval	Complete loss of UI and API access; documents safe but inaccessible
Celery workers	Background document processing: OCR, tagging, thumbnail generation, consumption	New documents not processed; existing library intact but no new ingestion
Redis	Task queue broker; Celery workers pull jobs from Redis	All background tasks stop; documents queue in filesystem but never process
PostgreSQL / SQLite	Document metadata, tags, correspondents, user accounts (the full-text search index is stored separately on disk)	UI shows empty library; all metadata and search unavailable
Tesseract OCR	Extracts text from scanned PDFs and images for full-text indexing	Imported documents have no text; search cannot find document contents
Consumption directory	Watched filesystem path; new files dropped here trigger Celery processing jobs	Scanner or watched folder uploads silently ignored; no import errors shown
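
Because Redis sits between the web server and the workers, an open TCP port alone says little about whether the broker is answering commands. As a rough sketch, the check below sends an inline `PING` over a plain socket and expects the `+PONG` reply defined by the Redis protocol. It assumes an unauthenticated Redis on the default port; an instance that requires a password replies with an error, which this function reports as unhealthy. `redis_ping` is a hypothetical helper name.

```python
import os
import socket


def redis_ping(host="localhost", port=6379, timeout=3.0):
    """Return True only if the server answers an inline PING with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"PING\r\n")  # inline command form accepted by Redis
            reply = sock.recv(64)
    except OSError:
        # Connection refused, timeout, DNS failure, and similar errors.
        return False
    return reply.startswith(b"+PONG")


if __name__ == "__main__":
    host = os.environ.get("REDIS_HOST", "localhost")
    status = "OK" if redis_ping(host) else "FAIL"
    print(f"[{status}] Redis PING at {host}:6379")
```

This avoids a dependency on a Redis client library, at the cost of not supporting authentication or TLS.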

Uptime History

Date	Incident Type	Duration	Impact
Jan 2026	Redis OOM killed by host system; Celery queue broken	2–6 hrs (until Redis restarted and queue drained)	No new documents processed; consumption directory backlog accumulated
Oct 2025	Docker volume permissions changed after host OS update; consumption dir unreadable	1–5 hrs (until permissions fixed)	Scanner uploads silently dropped; no error shown to user; documents lost from intake
Aug 2025	PostgreSQL container ran out of disk space; DB writes failed	30 min–3 hrs	New document metadata not saved; UI showed errors on document open; required DB recovery
Jul 2025	Tesseract language pack missing after container image update	Variable — until discovered and fixed	All newly imported documents had no OCR text; full-text search returned no results for new imports
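
The disk-space incident in the table suggests checking free space before writes start failing. A small sketch using only the standard library is shown below. The 10% threshold and the paths listed in the demo are assumptions for illustration and should be replaced with the volumes your deployment actually mounts.

```python
import shutil


def disk_headroom(path, min_free_fraction=0.10):
    """Return (ok, free_fraction) for the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    free_fraction = usage.free / usage.total if usage.total else 0.0
    return free_fraction >= min_free_fraction, free_fraction


if __name__ == "__main__":
    # Hypothetical mount points; adjust to your own volume layout.
    for path in ("/", "/var/lib/docker"):
        try:
            ok, frac = disk_headroom(path)
        except OSError as exc:
            print(f"[WARN] {path}: {exc}")
            continue
        level = "OK  " if ok else "FAIL"
        print(f"[{level}] {path}: {frac:.0%} free")
```

Running it from cron alongside the health check gives early warning before the database or media volume fills up.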

Monitor Paperless-ngx Automatically

Paperless-ngx failures are easy to miss. The web UI may appear healthy while the Celery workers have silently stopped processing documents, and a permissions problem in the consumption directory can cause new scans to be dropped with no error message. ezmon.com monitors your Paperless-ngx endpoints from multiple external probes and alerts your team via Slack, PagerDuty, or SMS the moment the API stops responding or your task queue shows signs of stalling.

Set up Paperless-ngx monitoring free at ezmon.com →