Linux Process Management: From Basics to Advanced Techniques

Introduction

You know that moment when you’re running a long compilation job, and your SSH connection drops? Or when you need to find and kill that one rogue process eating up all your CPU? Yeah, I’ve been there too. Process management is one of those skills that separates casual Linux users from confident system administrators.

In this guide, we’ll dive deep into the world of Linux process management. Whether you’re just learning how to run commands in the background or you’re looking to master advanced techniques like process priority tuning and session management, I’ve got you covered. We’ll explore everything from the trusty ps command to sophisticated tools like tmux and cron scheduling.

Let’s get started!

Understanding Processes: The Building Blocks

Before we jump into commands, let’s quickly talk about what a process actually is. Every time you run a command in Linux, the system creates a process—basically, a running instance of a program. Each process has its own Process ID (PID), parent process ID (PPID), and state.

Process States Explained

Processes aren’t just “running” or “not running.” They exist in several states:

State                  Symbol  What It Means
Running                R       The process is executing on a CPU or ready to run
Interruptible Sleep    S       Waiting for something (like I/O), can be woken up
Uninterruptible Sleep  D       Waiting and can’t be interrupted (usually disk I/O)
Stopped                T       Process execution has been suspended
Zombie                 Z       Process finished but parent hasn’t cleaned it up yet

Note: Zombie processes aren’t necessarily bad. They’re cleaned up once the parent process calls wait(). However, too many zombies might indicate a parent process isn’t handling child processes correctly.
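If you want to check a state yourself, here is a minimal sketch, assuming a Linux system with /proc mounted. The helper name proc_state is made up for this example (it is not a standard command); it reads the State field that the kernel exposes in /proc/<pid>/status.

```bash
# proc_state is a hypothetical helper, not a standard command.
# It prints the one-letter state (R, S, D, T, Z, ...) of the given PID.
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

sleep 30 &                      # a throwaway background process to inspect
demo_pid=$!
sleep 0.2                       # give it a moment to start sleeping
echo "PID $demo_pid is in state: $(proc_state "$demo_pid")"

kill "$demo_pid"                # clean up the demo process
wait "$demo_pid" 2>/dev/null
```

A sleeping process like this one will normally report S, but the exact letter depends on what the process is doing at the instant you look.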

Listing Processes: Your First Tool

The ps command is your Swiss Army knife for viewing processes. It has three different syntax styles (UNIX, BSD, and GNU), which can be confusing at first. Here’s what you really need to know:

Essential ps Commands

View processes in your current terminal:

ps

This shows just the processes running in your terminal session. Pretty basic.

View all your processes:

ps x

The x option shows all processes belonging to you, even those not attached to a terminal. This is where things get interesting.

View ALL system processes with details:

ps aux

This is probably the most commonly used variant. Let me break down what you’re seeing:

sysadmin@localhost:~$ ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.3  37820  5912 ?        Ss   10:23   0:01 /sbin/init
sysadmin  2901  2.5  1.2 845620 25484 ?        Sl   11:45   1:23 /usr/bin/firefox
  • USER: Who owns the process
  • PID: Process ID (unique identifier)
  • %CPU: CPU usage percentage
  • %MEM: Memory usage percentage
  • STAT: Process state (we covered this above!)
  • COMMAND: The actual command that’s running

Full format with parent process info:

ps -ef

This shows the PPID (parent process ID), which is super useful when you’re tracking down process relationships.

Real-World Example: Finding Memory Hogs

Let’s say your system is sluggish. Here’s how I’d investigate:

# Sort processes by memory usage
ps aux --sort=-%mem | head -n 10

# Or by CPU usage
ps aux --sort=-%cpu | head -n 10

This shows the top 10 resource consumers. I’ve caught runaway processes this way more times than I can count.

Searching for Processes: Meet pgrep

While you can pipe ps output through grep, there’s a better way. The pgrep command is specifically designed for finding processes.

pgrep in Action

Find processes by name (case-insensitive):

pgrep -i firefox

This returns just the PIDs. Simple and clean.

Show process names along with PIDs:

pgrep -li sshd
# Output: 1234 sshd
#         5678 sshd

Find all processes for a specific user:

pgrep -u sysadmin -l

Count processes matching a pattern:

pgrep -c httpd
# Output: 8

Tip: Use pgrep -a to see the full command line for each process. It’s like ps and grep had a super-efficient baby.

Advanced pgrep Techniques

Find the newest process:

pgrep -n chrome

Find the oldest process:

pgrep -o chrome

Combine with other criteria:

# Find all Python scripts run by the deploy user
pgrep -u deploy -f "python.*\.py"

The -f flag matches against the full command line, not just the process name.

Background and Foreground: Taking Control

By default, when you run a command, it occupies your terminal until it finishes. That’s fine for quick commands, but what about long-running processes?

Running Commands in the Background

Start a command in the background:

# This will take 5 minutes
sleep 300 &

The & symbol tells the shell to run the command in the background. You’ll see something like:

[1] 3456

That’s the job number (1) and the PID (3456).
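In a script you usually want that PID in a variable rather than reading it off the screen. A small sketch, assuming a POSIX-style shell: the special parameter $! holds the PID of the most recent background job, and wait collects its exit status. The variable names are my own.

```bash
sleep 1 &                  # start a short job in the background
job_pid=$!                 # $! holds the PID of the most recent background job
echo "started background job with PID $job_pid"

wait "$job_pid"            # block until that PID exits
job_status=$?              # wait returns the job's exit status
echo "job $job_pid exited with status $job_status"
```

This pattern is handy when a script launches several jobs and needs to know which ones succeeded.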

View your background jobs:

jobs
# Output:
# [1]+  Running                 sleep 300 &

Suspending and Resuming Jobs

Here’s a scenario: You start a long-running compilation, then realize you need your terminal back.

Suspend the foreground process:

Press Ctrl+Z. Your job stops (doesn’t terminate, just pauses).

sysadmin@localhost:~$ make
^Z
[1]+  Stopped                 make

Resume it in the background:

bg %1
# Output: [1]+ make &

Bring it back to the foreground:

fg %1

Practical Job Control Example

Let’s say you’re editing a config file, need to test something quickly, then go back to editing:

# Start editing
vim /etc/nginx/nginx.conf

# Oh wait, need to check something
# Press Ctrl+Z

# Run your test
nginx -t

# Back to editing
fg

This workflow becomes second nature once you get used to it.

Signals: Talking to Processes

Signals are how the operating system and users communicate with processes. Think of them as messages that say “please terminate,” “pause,” or “continue.”

Common Signals You Should Know

Signal     Number  Purpose                             Can Be Caught?
SIGTERM    15      Terminate gracefully (default)      Yes
SIGKILL    9       Force kill immediately              No
SIGINT     2       Interrupt (Ctrl+C)                  Yes
SIGHUP     1       Hangup (often means reload config)  Yes
SIGSTOP    19      Stop/pause process                  No
SIGCONT    18      Continue paused process             Yes
SIGUSR1/2  10/12   User-defined signals                Yes

Signal numbers shown are for Linux on x86 and ARM. Run kill -l to list the numbers on your system.

Using the kill Command

Despite its name, kill doesn’t just kill processes—it sends signals.

Terminate a process gracefully:

kill 2901

This sends SIGTERM (15), giving the process a chance to clean up.
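Here is roughly what "a chance to clean up" looks like from the process's side. This is a sketch, not how any particular daemon does it: a background subshell installs a trap for SIGTERM, and the temp file stands in for real cleanup work. All names are made up for the example.

```bash
cleanup_log=$(mktemp)      # temp file standing in for real cleanup work

(
    # Handler runs when SIGTERM arrives; SIGKILL could never reach it
    trap 'echo "cleaned up" > "$cleanup_log"; exit 0' TERM
    while true; do sleep 0.1; done
) &
worker_pid=$!

sleep 0.5                  # let the worker install its trap
kill "$worker_pid"         # default signal is SIGTERM
wait "$worker_pid"
worker_status=$?
echo "worker exited with status $worker_status: $(cat "$cleanup_log")"
```

Swap the kill for kill -9 and the log file stays empty, because the handler never gets to run.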

Force kill when necessary:

kill -9 2901

SIGKILL can’t be caught or ignored. The process dies immediately. Use this as a last resort.

Warning: Using kill -9 should be your last option. It doesn’t give the process time to clean up resources, close files, or save state. Always try SIGTERM first.

Kill by job number:

kill %1

Send specific signals:

# Send SIGHUP to reload configuration
kill -HUP 2901

# Or use the number
kill -1 2901

killall and pkill: Bulk Operations

Kill all processes with a specific name:

killall sleep

Or use pkill with patterns:

# Kill all Firefox processes
pkill firefox

# Kill all Python scripts run by user bob
pkill -u bob python

Danger: Be extremely careful with killall on Solaris systems—it kills ALL processes, not just ones matching a name. Always double-check which OS you’re on!

Real-World Signal Example

Here’s how I typically handle a misbehaving web server:

# First, find the process
pgrep -li nginx
# Output: 5432 nginx

# Try graceful shutdown
sudo kill -TERM 5432

# Wait a few seconds
sleep 5

# Check if it's still running
pgrep nginx

# If still running, force it
if [ $? -eq 0 ]; then
    sudo kill -9 5432
fi

Process Priority: nice and renice

Not all processes are created equal. Some tasks (like system backups) should run with low priority so they don’t impact interactive work. That’s where niceness comes in.

Understanding Niceness

The niceness scale runs from -20 (highest priority) to 19 (lowest priority). Default is 0.

-20 ←────────────────── 0 ────────────────→ 19
Highest Priority    Default    Lowest Priority

Key points:

  • Lower numbers = higher priority = less “nice” to other processes
  • Higher numbers = lower priority = more “nice” to other processes
  • Regular users can only increase niceness (lower priority)
  • Root can set any niceness value
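You can watch niceness being inherited with nothing but nice itself. A quick sketch, assuming GNU coreutils, where nice with no arguments prints the current niceness; the variable names are mine.

```bash
base=$(nice)               # with no arguments, nice prints the current niceness
child=$(nice -n 5 nice)    # the inner nice reports what a child started at +5 sees
echo "shell niceness: $base, child niceness: $child"
```

The child's value is the shell's value plus the adjustment, capped at 19, which is why -n is best read as "add this much" rather than "set to this".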

Using nice and renice

Start a process with low priority:

# Run a backup script with lowest priority
nice -n 19 ./backup-database.sh

Start a process with high priority (requires root):

# Critical system monitoring
sudo nice -n -10 /usr/local/bin/monitor-critical-service

Change priority of running process:

# Find the PID first
ps aux | grep backup

# Lower the priority (any user can do this for their own processes)
renice -n 15 -p 8923

# Increase priority (requires root)
sudo renice -n -5 -p 8923

Practical Priority Management

Here’s a real scenario: You’re running a CPU-intensive video encoding job but need to keep your system responsive:

# Start the encoding with low priority
nice -n 19 ffmpeg -i input.mp4 -c:v libx264 output.mp4 &

# Check it's running with low priority
ps -o pid,ni,comm -p $(pgrep ffmpeg)
# Output:
#  PID  NI COMMAND
# 9876  19 ffmpeg

Now your video encoding happens in the background without making your system feel sluggish.

Real-Time Monitoring with top

The top command is like Task Manager for Linux—it shows live, updating information about system resources and processes.

Basic top Usage

top

You’ll see something like:

top - 14:23:01 up 3 days,  2:34,  2 users,  load average: 0.15, 0.23, 0.19
Tasks: 247 total,   2 running, 245 sleeping,   0 stopped,   0 zombie
%Cpu(s):  5.2 us,  2.1 sy,  0.0 ni, 92.5 id,  0.2 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  15847.2 total,   2134.5 free,   8456.3 used,   5256.4 buff/cache
MiB Swap:   2048.0 total,   2048.0 free,      0.0 used.   6234.7 avail Mem

Interactive top Commands

While top is running, you can press these keys:

Key  Action
q    Quit top
k    Kill a process (prompts for PID)
r    Renice a process
M    Sort by memory usage
P    Sort by CPU usage (default)
h    Show help
1    Toggle individual CPU stats
c    Show full command paths

Advanced top Techniques

Start top with custom sort:

# Sort by memory usage
top -o %MEM

# Sort by CPU usage
top -o %CPU

Monitor specific user:

top -u sysadmin

Batch mode for scripts:

# Take 3 snapshots, 5 seconds apart
top -b -n 3 -d 5 > top-output.txt

Better Alternatives: htop

If you can install additional tools, htop is top’s cooler cousin:

sudo apt install htop    # Debian/Ubuntu
sudo yum install htop    # RHEL/CentOS
htop

It’s colorful, mouse-friendly, and shows CPU cores individually. Much more user-friendly!

System Monitoring Commands

Beyond process-specific tools, you need commands that show overall system health.

uptime: Quick System Overview

uptime
# Output: 14:23:01 up 3 days,  2:34,  2 users,  load average: 0.15, 0.23, 0.19

Those load average numbers? They show system load over 1, 5, and 15 minutes. As a rule of thumb:

  • Less than your CPU count = good
  • Equal to your CPU count = busy but okay
  • More than your CPU count = system might be overloaded

Check your CPU count:

nproc
# Output: 4

So on a 4-core system, load averages under 4.0 are fine.
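To turn that rule of thumb into a number, you can divide the 1-minute load by the core count. A minimal sketch, assuming Linux (/proc/loadavg) and the nproc command; the variable names are my own.

```bash
read -r load1 _ < /proc/loadavg     # first field is the 1-minute load average
cores=$(nproc)
per_core=$(awk -v l="$load1" -v c="$cores" 'BEGIN { printf "%.2f", l / c }')
echo "1-minute load $load1 across $cores cores = $per_core per core"
```

A per-core value near or above 1.00 means the system is at or past the "busy but okay" line described above.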

free: Memory Status

free -h

The -h flag makes it human-readable:

              total        used        free      shared  buff/cache   available
Mem:           15Gi       8.3Gi       2.1Gi       156Mi       5.1Gi       6.1Gi
Swap:         2.0Gi          0B       2.0Gi

Note: Don’t panic if “free” memory is low. Linux uses otherwise idle memory for disk caching, which speeds things up, and gives it back when applications need it. Look at the “available” column instead.
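If you need the same figure in a script, the kernel publishes it in /proc/meminfo. A small sketch, assuming a Linux kernel recent enough to report MemAvailable (3.14 or later); the variable names are mine.

```bash
total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
avail_kb=$(awk '/^MemAvailable:/ { print $2 }' /proc/meminfo)
avail_pct=$(( avail_kb * 100 / total_kb ))   # integer percentage
echo "available memory: ${avail_pct}% of RAM"
```

Alerting on this percentage is usually more meaningful than alerting on the "free" column.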

lscpu: CPU Information

lscpu

Shows detailed CPU architecture, cores, threads, and more. Useful for understanding your system’s capabilities.

vmstat: Virtual Memory Statistics

# Update every 2 seconds
vmstat 2

Great for seeing memory, swap, I/O, and CPU activity over time.

Terminal Multiplexers: screen and tmux

This is where things get really cool. Terminal multiplexers let you run multiple terminal sessions, detach from them, and reattach later—even after your SSH connection drops.

Why You Need This

Imagine this scenario:

  1. You SSH into a remote server
  2. Start a long-running database migration
  3. Your WiFi hiccups
  4. Connection drops
  5. Your migration process dies

Frustrating, right? With tmux or screen, step 5 becomes: “Your migration continues running, and you reattach when you reconnect.” Game changer.

screen: The Classic Choice

Start a new screen session:

screen

Start a named session:

screen -S database-migration

Always name your sessions! Future you will thank present you.

Detach from a session:

Press Ctrl+A then d (that’s two separate keystrokes, not simultaneous).

You’ll see: [detached from 12345.database-migration]

List all sessions:

screen -list
# Output:
# There is a screen on:
#     12345.database-migration    (Detached)

Reattach to a session:

screen -r database-migration

Force reattach if it thinks it’s still attached:

screen -rd database-migration

Essential screen Key Bindings

All screen commands start with Ctrl+A:

Command        Action
Ctrl+A then d  Detach from session
Ctrl+A then c  Create new window
Ctrl+A then n  Next window
Ctrl+A then p  Previous window
Ctrl+A then "  List all windows
Ctrl+A then k  Kill current window
Ctrl+A then ?  Show help

tmux: The Modern Alternative

tmux is like screen but with more features and better defaults. Here’s what I use daily:

Start a new tmux session:

tmux

Start a named session:

tmux new-session -s myproject

Start session running a specific command:

tmux new-session 'htop'

Detach from session:

Press Ctrl+B then d

List all sessions:

tmux ls
# Output:
# myproject: 1 windows (created Tue Oct  2 14:23:01 2025)

Attach to a session:

tmux attach -t myproject

Essential tmux Key Bindings

All tmux commands start with Ctrl+B:

Command        Action
Ctrl+B then d  Detach from session
Ctrl+B then c  Create new window
Ctrl+B then n  Next window
Ctrl+B then p  Previous window
Ctrl+B then %  Split into left and right panes
Ctrl+B then "  Split into top and bottom panes
Ctrl+B then o  Switch between panes
Ctrl+B then x  Kill current pane
Ctrl+B then ?  Show help

Real-World tmux Workflow

Here’s my typical development session setup:

# Start a project session
tmux new-session -s webdev

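# Split into top and bottom panes for code editor and terminal
Ctrl+B then "

# Focus moves to the new bottom pane automatically
# (press Ctrl+B then o if you need to cycle between panes)

# Split the bottom pane into left and right for terminal and logs
Ctrl+B then %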

# Now I have:
# ┌─────────────────────────┐
# │   Code Editor (vim)     │
# ├──────────────┬──────────┤
# │  Terminal    │  Logs    │
# └──────────────┴──────────┘

I can detach with Ctrl+B d and everything stays running.

screen vs tmux: Which Should You Use?

Use screen if:

  • It’s already installed (it usually is)
  • You need something quick and simple
  • You’re on a minimal system

Use tmux if:

  • You can install it
  • You want pane splitting out of the box
  • You prefer modern defaults
  • You want easier configuration

Honestly? Learn both basics. You’ll encounter systems with only one or the other.

nohup: The Simple Alternative

If you don’t need the full power of tmux/screen, nohup (no hangup) is your friend.

Using nohup

nohup ./long-running-script.sh &

This does two things:

  1. Ignores the HUP signal (sent when your terminal closes)
  2. Redirects output to nohup.out when standard output is a terminal

Check the output:

tail -f nohup.out

Custom output file:

nohup ./script.sh > my-custom-log.txt 2>&1 &

Tip: The 2>&1 sends standard error to the same place as standard output, so both end up in the same file. Super useful for catching errors.
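The order of the two redirections matters, which trips people up. A minimal sketch using a temp file (the messages and variable name are made up):

```bash
log=$(mktemp)
# Order matters: "> file" points stdout at the file first,
# then "2>&1" points stderr at wherever stdout currently goes.
{ echo "normal output"; echo "error output" >&2; } > "$log" 2>&1
echo "lines captured: $(wc -l < "$log")"
```

Write it the other way round (2>&1 > file) and stderr still goes to the terminal, because it was duplicated before stdout was moved.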

nohup vs. tmux/screen

Use nohup when:

  • You just need one command to survive logout
  • You don’t need interactivity
  • You’re on a locked-down system

Use tmux/screen when:

  • You need multiple terminals
  • You want to interact with the session later
  • You’re doing active development or troubleshooting

Scheduling Jobs with cron

Now let’s talk about automation. The cron daemon runs scheduled tasks at specified times.

Understanding crontab Syntax

A crontab entry has five time fields plus the command:

* * * * * command
│ │ │ │ │
│ │ │ │ └─── Day of week (0-7, both 0 and 7 = Sunday)
│ │ │ └───── Month (1-12)
│ │ └─────── Day of month (1-31)
│ └───────── Hour (0-23)
└─────────── Minute (0-59)

crontab Examples

Run every minute:

* * * * * /path/to/command

Run at 2:30 AM daily:

30 2 * * * /usr/local/bin/backup.sh

Run every 5 minutes:

*/5 * * * * /usr/local/bin/check-service.sh

Run on weekdays at 9 AM:

0 9 * * 1-5 /usr/local/bin/morning-report.sh

Run on the 1st and 15th of each month:

0 0 1,15 * * /usr/local/bin/biweekly-task.sh

Run every Monday at 3:30 AM:

30 3 * * 1 /usr/local/bin/weekly-cleanup.sh

crontab Special Keywords

Instead of five asterisks, you can use these shortcuts:

@reboot /path/to/startup-script.sh
@yearly /path/to/annual-report.sh    # Same as: 0 0 1 1 *
@monthly /path/to/monthly-backup.sh  # Same as: 0 0 1 * *
@weekly /path/to/weekly-update.sh    # Same as: 0 0 * * 0
@daily /path/to/daily-cleanup.sh     # Same as: 0 0 * * *
@hourly /path/to/hourly-check.sh     # Same as: 0 * * * *

Managing Your crontab

Edit your crontab:

crontab -e

This opens your crontab in your default editor (usually vi).

View your current crontab:

crontab -l

Remove your crontab:

crontab -r

Warning: crontab -r deletes your entire crontab without confirmation. If you want to be safe, use crontab -l > my-crontab-backup.txt first.

Edit another user’s crontab (as root):

sudo crontab -u username -e

System-Wide crontab

The /etc/crontab file is the system-wide crontab with an additional “user” field:

# minute hour day month weekday user command
30 2 * * * root /usr/local/bin/system-backup.sh

cron Best Practices

Always use full paths:

# Bad - might not find the command
* * * * * backup.sh

# Good - explicit path
* * * * * /usr/local/bin/backup.sh

Redirect output for debugging:

# Send output to a log file
30 2 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1

Set environment variables at the top:

SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin
MAILTO=admin@example.com

30 2 * * * /usr/local/bin/backup.sh

Test your command first:

Before adding to cron, run the command manually with the same environment:

# Simulate cron environment
env -i /bin/bash --norc --noprofile
/usr/local/bin/your-script.sh

Real-World cron Example

Here’s a practical crontab I might set up for a web server:

# System backup at 2:30 AM daily
30 2 * * * /usr/local/bin/backup-database.sh >> /var/log/backup.log 2>&1

# Log rotation at 3 AM daily
0 3 * * * /usr/sbin/logrotate /etc/logrotate.conf

# Check disk space every hour
0 * * * * /usr/local/bin/check-disk-space.sh

# Send weekly report every Monday at 9 AM
0 9 * * 1 /usr/local/bin/weekly-report.sh

# Clear temp files every Sunday at midnight
0 0 * * 0 /usr/local/bin/clear-temp-files.sh

# Health check every 5 minutes
*/5 * * * * /usr/local/bin/health-check.sh

Advanced Topics

Let’s explore some more advanced process management techniques.

Process Monitoring with watch

The watch command repeatedly runs a command and displays the output:

# Update every 2 seconds (default)
watch ps aux

# Update every 5 seconds
watch -n 5 df -h

# Highlight differences between updates
watch -d free -h

This is perfect for monitoring changing values without refreshing manually.

Process Tree with pstree

Visualize parent-child process relationships:

pstree

Output shows a tree structure:

systemd─┬─accounts-daemon─┬─{gdbus}
        │                 └─{gmain}
        ├─apache2─┬─apache2
        │         ├─apache2
        │         └─apache2
        └─sshd───sshd───bash───pstree

Show PIDs:

pstree -p

Show processes for specific user:

pstree username

Using /proc Filesystem

Every process has a directory in /proc containing tons of information:

# View command line arguments
cat /proc/1234/cmdline

# View environment variables
cat /proc/1234/environ

# View memory maps
cat /proc/1234/maps

# View current working directory
ls -l /proc/1234/cwd

This is incredibly useful for debugging.
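One gotcha: cmdline and environ separate their entries with NUL bytes, so a plain cat runs everything together. Here is a sketch that inspects the current shell, assuming Linux with /proc mounted; the variable names are mine.

```bash
pid=$$                                   # inspect the current shell
# cmdline and environ are NUL-separated, so translate NULs before printing
cmdline=$(tr '\0' ' ' < "/proc/$pid/cmdline")
cwd=$(readlink "/proc/$pid/cwd")         # cwd is a symlink to the working directory
env_count=$(tr '\0' '\n' < "/proc/$pid/environ" | grep -c .)
echo "cmdline: $cmdline"
echo "cwd:     $cwd"
echo "environment entries (approximate): $env_count"
```

Replace $$ with any PID you own. Reading another user's environ generally requires root.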

Process Limits with ulimit

View and set resource limits:

# View all limits
ulimit -a

# Set max number of open files
ulimit -n 4096

# Set max number of processes
ulimit -u 2048

For permanent limits, edit /etc/security/limits.conf.
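Each limit actually has two values: a soft limit that is enforced, and a hard limit that caps how far the soft one can be raised. A short sketch showing both for open files; the value 256 is arbitrary, and the change is made in a subshell so your current shell is not affected.

```bash
echo "soft limit on open files: $(ulimit -Sn)"
echo "hard limit on open files: $(ulimit -Hn)"

# Command substitution runs in a subshell, so the parent shell keeps its limit
lowered=$( ulimit -Sn 256; ulimit -Sn )
echo "inside subshell: $lowered, back in parent: $(ulimit -Sn)"
```

This is also why ulimit changes made in one terminal never show up in another: limits are inherited by children, not shared.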

CPU Affinity with taskset

Pin processes to specific CPU cores:

# Run command on CPU 0 only
taskset -c 0 ./my-program

# Set affinity of running process
taskset -cp 0-3 1234  # Use CPUs 0 through 3

Useful for performance tuning and NUMA systems.

ionice: I/O Priority

Similar to nice, but for disk I/O:

# Best effort, priority 7 (low)
ionice -c2 -n7 ./disk-intensive-task.sh

# Idle priority - only runs when no other I/O
ionice -c3 ./backup-script.sh

Batch Processing with at

For one-time scheduled tasks (unlike recurring cron jobs):

# Run at 3:30 PM today
echo "/usr/local/bin/report.sh" | at 3:30 PM

# Run tomorrow at noon
echo "/usr/local/bin/cleanup.sh" | at noon tomorrow

# Run next week
echo "/usr/local/bin/weekly-task.sh" | at 2pm next Monday

List pending jobs:

atq

Remove a job:

atrm 5  # Remove job number 5

Troubleshooting Common Issues

Let me share some real problems I’ve encountered and how to fix them.

“Too Many Open Files” Error

Problem: Application crashes with “Too many open files”

Diagnosis:

# Check current limits
ulimit -n

# See which process is using many files
# (pgrep -d, joins multiple PIDs with commas, which is what lsof -p expects)
lsof -p "$(pgrep -d, process-name)" | wc -l

Solution:

# Temporary fix
ulimit -n 65535

# Permanent fix - add to /etc/security/limits.conf:
* soft nofile 65535
* hard nofile 65535

Zombie Process Buildup

Problem: Many zombie processes accumulating

Diagnosis:

ps aux | awk '$8 ~ /^Z/'

Solution:

Zombies can’t be killed directly. Find and fix the parent:

# Find parent of zombies
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'

# If parent is misbehaving, restart it
sudo systemctl restart parent-service

High Load Average

Problem: System load is very high

Diagnosis:

# Check what's consuming CPU
top -b -n1 | head -20

# Check for I/O wait
iostat -x 1 5

# Check disk activity
iotop

Solution depends on cause:

# If CPU-bound: renice or kill offending process
renice -n 19 -p $(pgrep cpu-hog)

# If I/O-bound: check disk health and optimize I/O
sudo smartctl -a /dev/sda

Process Won’t Die

Problem: Process ignores SIGTERM

Diagnosis:

# Try SIGTERM first
kill 1234

# Wait and check
sleep 2
ps -p 1234

Solution:

# If still running, escalate
kill -9 1234

# If in uninterruptible sleep (D state), wait or reboot
# Check state
ps -o stat,pid,comm -p 1234

Best Practices and Tips

Here’s what I’ve learned over the years:

Always Try Graceful Shutdown First

# Good practice
kill <pid>          # SIGTERM
sleep 3
kill -9 <pid>      # SIGKILL if needed

# Bad practice
kill -9 <pid>      # Immediately force kill

Use Named Sessions

# Bad - you'll forget what this is
tmux new-session
screen

# Good - descriptive names
tmux new-session -s web-dev-project
screen -S database-maintenance

Log Everything Important

# Capture both stdout and stderr
./important-script.sh > output.log 2>&1

# Or use tee to see output AND log it
./important-script.sh 2>&1 | tee -a script.log

Monitor Before Killing

# See what the process is doing
strace -p 1234

# Check open files
lsof -p 1234

# View system calls
strace -p 1234 -c  # Summary mode

Document Your Cron Jobs

# Bad - no context
30 2 * * * /usr/local/bin/script.sh

# Good - clear purpose and contact
# Database backup - runs daily at 2:30 AM
# Contact: ops-team@example.com if this fails
30 2 * * * /usr/local/bin/backup-database.sh >> /var/log/backup.log 2>&1

Use Process Substitution

Instead of temporary files:

# Old way
ps aux > /tmp/processes.txt
grep httpd /tmp/processes.txt
rm /tmp/processes.txt

# Better way
grep httpd <(ps aux)
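Process substitution really earns its keep when a command wants two files at once. A sketch, assuming bash (this is not plain POSIX sh): the service lists here are made up, and comm -23 prints the lines that appear only in the first input.

```bash
expected=$(mktemp); running=$(mktemp)
printf '%s\n' nginx sshd cron > "$expected"   # services we expect (made-up list)
printf '%s\n' cron nginx > "$running"         # services we found (made-up list)

# comm needs sorted input; each <(...) looks like a file holding sorted output
missing=$(comm -23 <(sort "$expected") <(sort "$running"))
echo "missing: $missing"
```

Without process substitution this would need two more temporary files just to hold the sorted copies.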

Set Up Monitoring Alerts

Create scripts that alert you to problems:

#!/bin/bash
# check-high-load.sh

LOAD=$(uptime | awk -F'load average:' '{print $2}' | cut -d',' -f1 | tr -d ' ')
THRESHOLD=4.0

if (( $(echo "$LOAD > $THRESHOLD" | bc -l) )); then
    echo "High load detected: $LOAD" | mail -s "Alert: High Load" admin@example.com
fi

Add to cron:

*/5 * * * * /usr/local/bin/check-high-load.sh

Creating Your Own Process Management Scripts

Let’s build some practical scripts you can use.

Script 1: Process Monitor

#!/bin/bash
# monitor-process.sh - Monitor a process and restart if it dies

PROCESS_NAME="$1"
CHECK_INTERVAL=60  # Check every 60 seconds

if [ -z "$PROCESS_NAME" ]; then
    echo "Usage: $0 <process-name>"
    exit 1
fi

while true; do
    if ! pgrep -x "$PROCESS_NAME" > /dev/null; then
        echo "$(date): $PROCESS_NAME is not running. Starting..."
        # Add your start command here
        systemctl start "$PROCESS_NAME"

        # Or for custom scripts:
        # /path/to/start-script.sh &
    fi

    sleep $CHECK_INTERVAL
done

Usage:

# Run in background with nohup
nohup ./monitor-process.sh nginx > /var/log/monitor-nginx.log 2>&1 &

Script 2: Resource Usage Report

#!/bin/bash
# resource-report.sh - Generate resource usage report

OUTPUT_FILE="/var/log/resource-report-$(date +%Y%m%d).txt"

{
    echo "=========================================="
    echo "System Resource Report - $(date)"
    echo "=========================================="
    echo

    echo "UPTIME AND LOAD:"
    uptime
    echo

    echo "TOP 10 CPU CONSUMERS:"
    ps aux --sort=-%cpu | head -n 11
    echo

    echo "TOP 10 MEMORY CONSUMERS:"
    ps aux --sort=-%mem | head -n 11
    echo

    echo "MEMORY USAGE:"
    free -h
    echo

    echo "DISK USAGE:"
    df -h
    echo

    echo "PROCESS COUNT BY STATE:"
    # NR > 1 skips the header row so "STAT" is not counted as a state
    ps aux | awk 'NR > 1 {print $8}' | sort | uniq -c | sort -rn

} > "$OUTPUT_FILE"

echo "Report saved to: $OUTPUT_FILE"

Add to cron for daily reports:

0 0 * * * /usr/local/bin/resource-report.sh

Script 3: Clean Up Zombie Processes

#!/bin/bash
# cleanup-zombies.sh - Report and attempt to clean zombie processes

ZOMBIES=$(ps aux | awk '$8 ~ /^Z/ {print $2}')

if [ -z "$ZOMBIES" ]; then
    echo "No zombie processes found."
    exit 0
fi

echo "Zombie processes detected:"
ps aux | awk '$8 ~ /^Z/'

echo
echo "Finding parent processes:"

for zombie_pid in $ZOMBIES; do
    # PPID is a read-only variable in bash, so use a different name
    PARENT_PID=$(ps -o ppid= -p "$zombie_pid" 2>/dev/null | tr -d ' ')

    if [ -n "$PARENT_PID" ]; then
        PARENT_CMD=$(ps -o comm= -p "$PARENT_PID")
        echo "Zombie PID: $zombie_pid, Parent PID: $PARENT_PID ($PARENT_CMD)"

        # Optionally send SIGCHLD to parent to trigger cleanup
        # Uncomment the next line if you want automatic cleanup
        # kill -s SIGCHLD "$PARENT_PID"
    fi
done

echo
echo "Note: Zombies will be cleaned when parent process calls wait()."
echo "If parent is misbehaving, consider restarting it."

Script 4: Smart Process Killer

#!/bin/bash
# smart-kill.sh - Try graceful kill, escalate if needed

PID="$1"
MAX_WAIT=10  # Wait up to 10 seconds

if [ -z "$PID" ]; then
    echo "Usage: $0 <pid>"
    exit 1
fi

# Check if process exists
if ! ps -p "$PID" > /dev/null 2>&1; then
    echo "Process $PID does not exist."
    exit 1
fi

PROCESS_NAME=$(ps -p "$PID" -o comm=)
echo "Attempting to terminate $PROCESS_NAME (PID: $PID)..."

# Try SIGTERM
kill "$PID" 2>/dev/null

# Wait and check
for i in $(seq 1 $MAX_WAIT); do
    if ! ps -p "$PID" > /dev/null 2>&1; then
        echo "Process terminated gracefully."
        exit 0
    fi
    sleep 1
done

# Still running, escalate
echo "Process did not respond to SIGTERM. Sending SIGKILL..."
kill -9 "$PID" 2>/dev/null

sleep 1

if ! ps -p "$PID" > /dev/null 2>&1; then
    echo "Process force killed."
    exit 0
else
    echo "Error: Unable to kill process. It may be in uninterruptible sleep."
    exit 1
fi

Security Considerations

Process management isn’t just about convenience—it’s also about security.

Check for Suspicious Processes

# Look for processes running from /tmp (suspicious)
ps aux | grep '/tmp'

# List processes without a controlling terminal
# (most are ordinary daemons, so look for names you don't recognize)
ps -eo pid,user,tty,comm | awk '$3 == "?"'

# Find processes running as root
ps aux | grep '^root' | less

Limit User Resources

Prevent resource exhaustion attacks by setting limits in /etc/security/limits.conf:

# Limit each user to 100 processes
* soft nproc 100
* hard nproc 150

# Limit file descriptors
* soft nofile 1024
* hard nofile 2048

# Limit CPU time (in minutes)
@users hard cpu 60

Audit Process Execution

Use auditd to track process execution:

# Install auditd
sudo apt install auditd

# Add rule to track executions
sudo auditctl -a always,exit -F arch=b64 -S execve

# View audit logs
sudo ausearch -sc execve

Secure Cron Jobs

# Restrict who can use cron (writing under /etc requires root)
echo "root" | sudo tee /etc/cron.allow
# This denies everyone except root

# Or use cron.deny to block specific users
echo "suspicious_user" | sudo tee -a /etc/cron.deny

Monitor for Privilege Escalation

# Find SUID binaries (run with owner's privileges)
find / -perm -4000 -type f 2>/dev/null

# Monitor for new SUID files
find / -perm -4000 -type f 2>/dev/null > /tmp/suid-files.txt
# Compare periodically with diff

Performance Tuning Tips

Optimize Process Scheduling

For interactive responsiveness:

# Enable autogrouping so the scheduler groups tasks by session (requires root)
echo 1 | sudo tee /proc/sys/kernel/sched_autogroup_enabled

For batch processing:

# Disable autogrouping when throughput matters more than desktop responsiveness (requires root)
echo 0 | sudo tee /proc/sys/kernel/sched_autogroup_enabled

Tune I/O Scheduling

# Check current I/O scheduler
cat /sys/block/sda/queue/scheduler

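# Set to deadline for databases (better for random I/O)
# On kernels 5.0 and later the name is mq-deadline; older kernels call it deadline
echo mq-deadline | sudo tee /sys/block/sda/queue/scheduler

# Set to none for SSDs (called noop on older kernels)
echo none | sudo tee /sys/block/sda/queue/scheduler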

Control Swappiness

# Check current swappiness (default is usually 60)
cat /proc/sys/vm/swappiness

# Lower value = less swap usage (good for servers with plenty of RAM)
sudo sysctl vm.swappiness=10

# Make it permanent in /etc/sysctl.conf:
echo "vm.swappiness=10" | sudo tee -a /etc/sysctl.conf

Use CPU Sets for Isolation

For critical processes, isolate them to specific CPUs:

# Create a CPU set (cgroup v1 layout; cgroup v2 systems use a different hierarchy)
sudo mkdir /sys/fs/cgroup/cpuset/critical
echo 0-1 | sudo tee /sys/fs/cgroup/cpuset/critical/cpuset.cpus
echo 0 | sudo tee /sys/fs/cgroup/cpuset/critical/cpuset.mems

# Move process to this CPU set
echo "$PID" | sudo tee /sys/fs/cgroup/cpuset/critical/tasks

Debugging Techniques

Using strace

See what system calls a process is making:

# Trace a running process
sudo strace -p 1234

# Trace process startup
strace ./my-program

# See timing information
strace -T ./my-program

# Count system calls
strace -c ./my-program

Using lsof

List open files for a process:

# All open files for a process
lsof -p 1234

# Network connections only
lsof -i -n -P -p 1234

# Files opened by a specific user
lsof -u username

# What process is using a specific file?
lsof /var/log/syslog

Using netstat

Check network activity:

# Show all listening ports and programs
sudo netstat -tulpn

# Show established connections
netstat -an | grep ESTABLISHED

Core Dumps for Post-Mortem Analysis

Enable core dumps:

# Set unlimited core size
ulimit -c unlimited

# Set core dump pattern
echo "/tmp/core.%e.%p" | sudo tee /proc/sys/kernel/core_pattern

# When process crashes, analyze with gdb
gdb /path/to/program /tmp/core.program.1234

Quick Reference Cheat Sheet

Process Listing

Command         Purpose
ps              Show current terminal processes
ps aux          Show all processes with details
ps -ef          Full format with parent PID
pgrep -li name  Find processes by name
top             Real-time process monitor
htop            Better interactive monitor

Process Control

Command    Purpose
command &  Run in background
Ctrl+Z     Suspend foreground process
bg %1      Resume job 1 in background
fg %1      Bring job 1 to foreground
jobs       List background jobs
disown %1  Detach job from terminal

Signals

Command        Purpose
kill PID       Send SIGTERM (graceful)
kill -9 PID    Send SIGKILL (force)
kill -HUP PID  Send SIGHUP (reload config)
killall name   Kill all processes by name
pkill pattern  Kill by pattern match

Priority

Command             Purpose
nice -n 19 cmd      Start with low priority
renice -n 5 -p PID  Change priority of running process
ionice -c3 cmd      Start with idle I/O priority

Session Management

Command              Purpose
screen               Start screen session
screen -S name       Start named session
Ctrl+A d             Detach from screen
screen -r            Reattach to session
tmux                 Start tmux session
tmux new -s name     Start named tmux session
Ctrl+B d             Detach from tmux
tmux attach -t name  Reattach to tmux session

Scheduling

Cron Pattern  Meaning
* * * * *     Every minute
0 * * * *     Every hour
0 0 * * *     Daily at midnight
0 0 * * 0     Weekly (Sunday midnight)
0 0 1 * *     Monthly (1st at midnight)
*/5 * * * *   Every 5 minutes
@reboot       At system boot

System Monitoring

Command      Purpose
uptime       System load averages
free -h      Memory usage
vmstat 2     Virtual memory stats every 2s
iostat -x 2  I/O statistics every 2s
lscpu        CPU information
watch cmd    Run command repeatedly

Conclusion

Process management is one of those fundamental Linux skills that you’ll use constantly. Whether you’re running background jobs, debugging performance issues, or setting up automated tasks, these tools and techniques will make your life so much easier.

The key takeaways:

  1. Master the basics: Get comfortable with ps, kill, and job control before moving to advanced topics
  2. Always try graceful termination before force-killing processes
  3. Use terminal multiplexers for any remote work—they’re lifesavers
  4. Document your cron jobs so future you (or your team) knows what’s running and why
  5. Monitor proactively rather than reacting to problems

Remember, the best way to learn is by doing. Start with simple tasks like running commands in the background, then gradually work your way up to more complex scenarios. Before you know it, this stuff will be second nature.

Got questions or want to share your own process management tips? Drop a comment below!

Additional Resources

This post is licensed under CC BY 4.0 by the author.