Developer Guide

This guide covers the project architecture, local development setup, and contribution guidelines.

Project Structure

paperless-vault/
├── deploy/                    # Deployment configurations
│   ├── 01_dev/               # Development environment
│   │   ├── docker-compose.yml
│   │   ├── .env
│   │   ├── consume/          # Document intake folder
│   │   ├── data/             # Application data
│   │   └── media/            # Stored documents
│   ├── 02_stage/             # Staging environment
│   └── 03_prod/              # Production environment
│
├── src/                       # Custom Docker images
│   ├── webserver/            # Paperless webserver + enhancements
│   │   ├── Dockerfile
│   │   └── pre-consume.sh    # Pre-processing script
│   ├── worker/               # Celery worker
│   ├── db/                   # PostgreSQL
│   ├── redis/                # Redis cache
│   └── scheduler/            # Celery beat
│
├── scripts/                   # Utility scripts
│   ├── import-all-docs.sh    # Bulk import script
│   ├── move-unsupported.sh   # Clean up unsupported files
│   └── reimport-office-docs.sh
│
├── backups/                   # Database backups
│
└── docs/                      # This documentation (Hugo)

Technology Stack

Component            Technology
Document Management  Paperless-ngx
Container Runtime    Docker + Docker Compose
Database             PostgreSQL 15
Cache/Queue          Redis 7
Task Queue           Celery
OCR Engine           OCRmyPDF + Tesseract
Office Conversion    Apache Tika + Gotenberg
Tunnel               Cloudflare Tunnel
Documentation        Hugo

Local Development Setup

Prerequisites

Clone Repository

git clone https://bitbucket.org/wilsonify/paperless-vault.git
cd paperless-vault

Configure Environment

cd deploy/01_dev

# Copy and edit environment file
cp .env.example .env
nano .env

Required environment variables:

# Database
POSTGRES_PASSWORD=your-secure-password

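# Paperless
# Generate the key with `openssl rand -hex 32` and paste the output here;
# .env files are not run through a shell, so $(...) is not expanded.
PAPERLESS_SECRET_KEY=your-generated-secret-key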
PAPERLESS_ADMIN_USER=admin
PAPERLESS_ADMIN_PASSWORD=your-admin-password

# Cloudflare (optional for local dev)
CLOUDFLARE_TUNNEL_TOKEN=your-tunnel-token
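
Before starting the stack, it can help to confirm that the required variables above are actually set. The following is a minimal sketch, not part of the repository: the function name check_env is hypothetical, and it only checks that each variable has a non-empty value in the given file.

```shell
# check_env FILE: report required variables that are missing or empty in FILE.
# Returns 0 when every required variable has a value, 1 otherwise.
# (Hypothetical helper; the variable list is taken from this guide.)
check_env() {
    local env_file="$1" missing=0 name
    for name in POSTGRES_PASSWORD PAPERLESS_SECRET_KEY \
                PAPERLESS_ADMIN_USER PAPERLESS_ADMIN_PASSWORD; do
        # Match "NAME=<at least one character>" at the start of a line.
        if ! grep -Eq "^${name}=.+" "$env_file"; then
            echo "missing or empty: ${name}"
            missing=1
        fi
    done
    return "$missing"
}
```

From deploy/01_dev, this would be run as `check_env .env`. CLOUDFLARE_TUNNEL_TOKEN is left out because it is optional for local development.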

Start Development Environment

# Build and start all services
docker compose up -d

# View logs
docker compose logs -f webserver

# Access at http://localhost:18000
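
The webserver may take a while to answer after `docker compose up -d` returns. A generic retry helper is one way to wait for it. This is a sketch under assumptions: the name retry and the attempt and delay values are illustrative, not something the project ships.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between tries. Returns 1 if every attempt fails.
# (Hypothetical helper, not part of the repository.)
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while ! "$@"; do
        [ "$i" -ge "$attempts" ] && return 1
        i=$((i + 1))
        sleep "$delay"
    done
}
```

For example, `retry 30 2 curl -fsS -o /dev/null http://localhost:18000` would poll the development port for up to about a minute, assuming curl is installed on the host.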

Development Ports

Environment  Port
Development  18000
Staging      28000
Production   38000
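
Scripts that work across environments can look the port up instead of hard-coding it. The sketch below encodes the table above; the function name port_for_env and the short names dev, stage, and prod are assumptions made for illustration.

```shell
# port_for_env ENV: print the host port for an environment, per the table above.
# Accepts either a short name or the deploy/ directory name.
# (Hypothetical helper, not part of the repository.)
port_for_env() {
    case "$1" in
        dev|01_dev)     echo 18000 ;;
        stage|02_stage) echo 28000 ;;
        prod|03_prod)   echo 38000 ;;
        *) echo "unknown environment: $1" >&2; return 1 ;;
    esac
}
```

A port-conflict check could then be written as `lsof -i :"$(port_for_env dev)"`.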

Building Custom Images

Webserver Image

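The webserver image extends Paperless-ngx with the extra packages listed under Dockerfile Structure: LibreOffice Writer and Calc, and ImageMagick with PDF processing enabled. To build and test it: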

cd src/webserver

# Build image
docker build -t paperless-webserver .

# Test locally
docker run -it --rm paperless-webserver bash

Dockerfile Structure

FROM ghcr.io/paperless-ngx/paperless-ngx:latest

USER root

# Install additional packages
RUN apt-get update && apt-get install -y \
    libreoffice-writer \
    libreoffice-calc \
    imagemagick \
    && rm -rf /var/lib/apt/lists/*

# Configure ImageMagick for PDF
RUN sed -i 's/rights="none" pattern="PDF"/rights="read|write" pattern="PDF"/g' \
    /etc/ImageMagick-6/policy.xml

USER paperless
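
To see what the ImageMagick policy edit does without building the image, the same sed expression can be applied to a sample line. This is a sketch: the wrapper name enable_pdf_policy is hypothetical, and the sample line in the usage note is an assumption about how the PDF rule appears in policy.xml.

```shell
# enable_pdf_policy: read policy XML on stdin and rewrite the PDF rule from
# rights="none" to rights="read|write", using the sed expression from the Dockerfile.
# (Hypothetical wrapper for experimentation only.)
enable_pdf_policy() {
    sed 's/rights="none" pattern="PDF"/rights="read|write" pattern="PDF"/g'
}
```

Piping `<policy domain="coder" rights="none" pattern="PDF" />` through `enable_pdf_policy` prints `<policy domain="coder" rights="read|write" pattern="PDF" />`; lines that do not mention the PDF pattern pass through unchanged.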

Working with Services

Service Management

# Start specific service
docker compose up -d webserver

# Rebuild after changes
docker compose build webserver
docker compose up -d webserver

# View service logs
docker compose logs -f webserver

# Execute command in container
docker exec -it paperless-dev-webserver bash

# Restart service
docker compose restart webserver

Database Operations

# Connect to database
docker exec -it paperless-dev-db psql -U paperless

# Backup database
docker exec paperless-dev-db pg_dump -U paperless paperless > backup.sql

# Restore database
docker exec -i paperless-dev-db psql -U paperless < backup.sql
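
For repeated backups, a timestamped file name avoids overwriting earlier dumps. The following is a minimal sketch around the pg_dump command above; the function names and the paperless-YYYYMMDD-HHMMSS.sql naming scheme are assumptions, not a project convention.

```shell
# backup_path DIR: print a timestamped backup file path under DIR.
backup_path() {
    printf '%s/paperless-%s.sql\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

# backup_db DIR: dump the dev database to a timestamped file under DIR
# and print the path on success. (Hypothetical helper.)
backup_db() {
    local out
    out="$(backup_path "$1")"
    mkdir -p "$1"
    # Remove the partial file if pg_dump fails, so a broken dump
    # is never mistaken for a usable backup.
    if docker exec paperless-dev-db pg_dump -U paperless paperless > "$out"; then
        echo "$out"
    else
        rm -f "$out"
        return 1
    fi
}
```

From the repository root, `backup_db backups` would write into the backups/ directory shown in the project structure.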

Redis Operations

# Connect to Redis CLI
docker exec -it paperless-dev-redis redis-cli

# Check queue status (enter these at the redis-cli prompt)
KEYS *
LLEN celery

Testing

Manual Testing

  1. Upload a test document via web UI
  2. Check processing in logs:
    docker compose logs -f webserver | grep -i "consuming\|success\|error"
    
  3. Verify document appears in dashboard
  4. Test search functionality

Testing Office Documents

# Copy test file to consume folder
cp test.docx deploy/01_dev/consume/

# Monitor processing
docker compose logs -f webserver | grep -i "tika\|gotenberg"

Code Style & Contributing

Commit Messages

Follow the Conventional Commits format:

feat: add office document support
fix: resolve Redis connection issue
docs: update admin guide
chore: update dependencies
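
A commit subject can be checked against this format with a short pattern match. This is a sketch only: the function name is_conventional is hypothetical, and the accepted types are limited to the four shown above, with an optional scope in parentheses as the Conventional Commits format allows.

```shell
# is_conventional MESSAGE: succeed if the first line of MESSAGE looks like
# "type: description" or "type(scope): description".
# (Hypothetical helper; extend the type list to match project practice.)
is_conventional() {
    printf '%s\n' "$1" | head -n 1 |
        grep -Eq '^(feat|fix|docs|chore)(\([a-z0-9-]+\))?: .+'
}
```

For example, `is_conventional "fix: resolve Redis connection issue"` succeeds, while `is_conventional "Fixed stuff"` fails.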

Pull Request Process

  1. Fork the repository
  2. Create a feature branch
  3. Make changes with tests
  4. Submit PR with description
  5. Wait for review

Documentation Changes

cd docs

# Start Hugo dev server
hugo server -D

# Build for production
hugo --minify

# Deploy (see Admin Guide)
npx wrangler pages deploy public

Troubleshooting Development

Container Won’t Start

# Check logs
docker compose logs webserver

# Verify environment
docker compose config

# Check port conflicts
lsof -i :18000

Database Connection Issues

# Check database is running
docker compose ps db

# Test connection
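docker exec paperless-dev-webserver python -c "
import os
import django
# django.setup() needs a settings module; paperless.settings is the module set by manage.py
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'paperless.settings')
django.setup()
from django.db import connection
connection.ensure_connection()
print('Connected!')
"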

Celery Worker Issues

# Check worker logs
docker compose logs worker

# Verify Redis connection
docker exec paperless-dev-webserver python -c "
import redis
r = redis.from_url('redis://redis:6379')
print(r.ping())
"

Useful Commands

# Full rebuild (warning: -v also deletes the volumes declared in the Compose file)
docker compose down -v
docker compose build --no-cache
docker compose up -d

# Clean up Docker (removes stopped containers, unused networks, and ALL unused images on this host)
docker system prune -a

# Export document data
docker exec paperless-dev-webserver document_exporter ../export

# Reindex search
docker exec paperless-dev-webserver document_index reindex