Text & Writing · May 20, 2026

5 Text Processing Tasks Developers Do Manually (And Shouldn't)

Sorting lines, removing duplicates, comparing files, finding patterns, counting words — developers do these tasks constantly and usually reach for a text editor or the terminal. This guide shows faster browser-based alternatives for each.

Olivia Bennett · Full-stack Developer

Full-stack developer focused on developer tooling and web performance. Writes about the formats, patterns, and shortcuts devs reach for every day.

Tags: text processing, developer tools, productivity, text manipulation, diff, utilities

Text is everywhere in software development — log files, API responses, configuration files, SQL dumps, CSV exports, dependency lists, error output. Most of this text needs to be processed in some way before it is useful: sorted, deduplicated, compared against another version, counted, or searched and transformed.

The instinct is usually to reach for the terminal (sort | uniq | wc -l) or to open a code editor with multi-cursor editing. Both work but have friction: shell pipelines require the right syntax on the right OS, and editors mean opening a scratch buffer or saving a file first. Browser-based tools handle these tasks on text you can paste directly from any source: the clipboard, terminal output, an API response.

This guide covers five text processing tasks that come up constantly and the fastest way to do each.

1. Sorting Lines: Configuration Files, Dependency Lists, Imports

When you need this:

An .env file whose keys have grown in random order over two years
A requirements.txt or package.json dependency list that you want alphabetised before a code review
A list of feature flags, database migration names, or Terraform resource names that need canonical ordering
A CSV where the rows need to be sorted by a specific column-like structure

What makes this annoying manually:

In a code editor without a sort command, you are sorting by eye and dragging lines. In the terminal, sort behaves differently across systems: the BSD-derived sort on macOS and GNU sort on Linux do not share an identical option set, and locale settings (LC_ALL, LC_COLLATE) change the order of mixed-case and non-ASCII text. For a quick sort of 20 lines copied from a CI log, spinning up a terminal session is more overhead than the task deserves.

How the Sort Lines tool works:

The Sort Lines tool handles:

Alphabetical (A→Z / Z→A) — standard lexicographic sort; apple comes before banana
By line length (short→long / long→short) — useful for code style: shortest imports first, longest last
Case-insensitive — Banana and banana sort together rather than capitals-first
Numeric sort — 2 before 10 (lexicographic sort would give 10 before 2)
Remove blank lines — clean up the output if your source has empty lines between entries
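The options above map onto a few lines of Python. This is a minimal sketch of the behaviour described, not the tool's actual implementation; the function name sort_lines and its parameters are hypothetical.

```python
def sort_lines(text, mode="alpha", reverse=False, ignore_case=False, drop_blank=False):
    """Hypothetical helper mirroring the sort options listed above."""
    lines = text.splitlines()
    if drop_blank:
        # Discard empty and whitespace-only lines before sorting
        lines = [ln for ln in lines if ln.strip()]
    if mode == "length":
        key = len
    elif mode == "numeric":
        # Assumes every line parses as a number; raises ValueError otherwise
        key = float
    elif ignore_case:
        key = str.casefold
    else:
        key = None  # plain code-point order: capitals sort before lowercase
    return "\n".join(sorted(lines, key=key, reverse=reverse))


print(sort_lines("10\n2\n1"))                  # lexicographic
print(sort_lines("10\n2\n1", mode="numeric"))  # numeric
print(sort_lines("banana\nApple\ncherry", ignore_case=True))
```

Python's sorted is stable, so lines that compare equal under the chosen key (for example two lines of the same length) keep their original relative order.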

Example: sorting Python imports before a PR:

# Before (random order from git history)
import os
import json
from datetime import datetime
import sys
from pathlib import Path
import re
from typing import Optional, List

# Paste into Sort Lines, alphabetical, case-insensitive:

# After (whole lines are compared, so every "from" line sorts before every "import" line)
from datetime import datetime
from pathlib import Path
from typing import Optional, List
import json
import os
import re
import sys

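This gives a deterministic, reviewable order for a quick manual fix without installing anything. It is not a substitute for isort, which also groups imports into sections and, by default, places plain import statements before from imports.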

Terminal equivalent for reference:

sort file.txt                    # alphabetical
sort -r file.txt                 # reverse
sort -n file.txt                 # numeric
sort -f file.txt                 # case-insensitive
sort -u file.txt                 # unique (removes duplicates)

The browser tool wins when you have text on the clipboard rather than a file on disk.

2. Removing Duplicate Lines: Log Analysis, Dependency Deduplication

When you need this:

You have collected error messages from multiple log files and want the unique set
A dependency list was concatenated from two sources and has duplicates
You are building a blocklist or allowlist and received duplicate entries
A data export has duplicate rows that need to be removed before import

Why duplicates accumulate:

Configuration files that are edited by multiple people or maintained by scripts often end up with repeated entries. Log aggregation pipelines often produce the same error message thousands of times — you want the distinct set. A list of emails or usernames gathered from multiple queries often has overlaps.

What the Duplicate Line Remover does:

The Duplicate Line Remover removes lines that appear more than once, keeping the first occurrence. Options include:

Case-sensitive (default) — Error and error are treated as different lines
Case-insensitive — ERROR, Error, and error are treated as the same
Trim whitespace — lines that differ only in leading/trailing spaces are treated as duplicates
Show removed lines — see what was removed rather than what remained

Example: deduplicating a list of npm package names:

# Raw list from two package.json files merged
react
lodash
axios
react                  ← duplicate
react-dom
lodash                 ← duplicate
moment
axios                  ← duplicate

# After duplicate removal:
react
lodash
axios
react-dom
moment
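The keep-first-occurrence behaviour described above can be sketched in Python as follows. The function name remove_duplicates and its flags are hypothetical, chosen to mirror the listed options; this is an illustration, not the tool's source.

```python
def remove_duplicates(text, ignore_case=False, trim=False):
    """Return (kept, removed) lists, keeping the first occurrence of each line."""
    seen = set()
    kept, removed = [], []
    for line in text.splitlines():
        # Build the comparison key according to the selected options
        key = line.strip() if trim else line
        if ignore_case:
            key = key.casefold()
        if key in seen:
            removed.append(line)
        else:
            seen.add(key)
            kept.append(line)
    return kept, removed


packages = "react\nlodash\naxios\nreact\nreact-dom\nlodash\nmoment\naxios"
kept, removed = remove_duplicates(packages)
print("\n".join(kept))
print("removed:", removed)
```

Tracking seen keys in a set keeps the original order and runs in a single pass, which is the same idea as the awk one-liner in the terminal equivalents.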

Terminal equivalent:

sort file.txt | uniq              # sort first, then deduplicate
awk '!seen[$0]++' file.txt        # deduplicate without sorting (preserves order)

The awk one-liner is the closest equivalent to the browser tool — it preserves the first occurrence and maintains the original order. Worth knowing for scripting; the browser tool is faster for interactive use.

3. Comparing Two Texts: Before/After, Code Review, Config Diffs

When you need this:

You have two versions of a configuration file and want to see what changed
A team member pasted a "fixed" version of a script in Slack and you want to see the differences from the original
You received an API response that looks wrong and want to compare it against the expected schema
You want to see what lines were added or removed between two SQL migrations

The problem with manual comparison:

Reading two blocks of text side by side and noting differences is both slow and error-prone. Human eyes miss changes in long lines, subtle whitespace differences, and reordered blocks. Even experienced developers miss differences that a diff algorithm catches instantly.

How the Text Diff Checker works:

The Text Diff Checker uses the Myers diff algorithm, which is also the default algorithm in git diff, to compare two texts and highlight additions and deletions. It shows:

Lines added (shown in green)
Lines removed (shown in red)
Lines unchanged (shown as context)
Inline character-level diff for modified lines

Example: comparing two versions of a Dockerfile:

# Version 1                          # Version 2
FROM node:18-alpine                  FROM node:20-alpine
WORKDIR /app                         WORKDIR /app
COPY package*.json ./                COPY package*.json ./
RUN npm ci                           RUN npm ci --omit=dev
COPY . .                             COPY . .
EXPOSE 3000                          EXPOSE 3000
CMD ["node", "index.js"]             CMD ["node", "dist/index.js"]

The diff shows:

node:18-alpine → node:20-alpine (Node version bump)
npm ci → npm ci --omit=dev (production-only install added)
index.js → dist/index.js (entry point changed to compiled output)

These are exactly the changes a reviewer needs to understand — the diff surfaces them without requiring line-by-line reading.
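For scripted use, the same Dockerfile comparison can be sketched with Python's standard difflib module. Note the assumption: difflib uses its own SequenceMatcher matching rather than the Myers algorithm, so hunks may be grouped differently on harder inputs, although the added and removed lines here are the same. The file labels are placeholders.

```python
import difflib

v1 = """FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
EXPOSE 3000
CMD ["node", "index.js"]""".splitlines()

v2 = """FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
CMD ["node", "dist/index.js"]""".splitlines()

# Unified diff, the format printed by diff -u
diff = list(difflib.unified_diff(v1, v2, "Dockerfile.v1", "Dockerfile.v2", lineterm=""))
print("\n".join(diff))

# Skip the two header lines (--- / +++), then split by prefix
body = diff[2:]
removed = [ln[1:] for ln in body if ln.startswith("-")]
added = [ln[1:] for ln in body if ln.startswith("+")]
```

The removed and added lists hold the three changed lines from each version, which is the same information the highlighted view conveys.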

Terminal equivalent:

diff file1.txt file2.txt            # basic diff
diff -u file1.txt file2.txt         # unified format (same as git diff)
diff -y file1.txt file2.txt         # side-by-side

The browser tool wins for text on the clipboard that is not saved to files.

4. Finding and Replacing: Batch Transformations, Refactoring Snippets

When you need this:

You need to rename an identifier across a snippet of code that is not in your IDE (a Gist, a Stack Overflow answer, a paste from a CI log)
A CSV file has inconsistent formatting in one column and you want to normalise it before importing
You need to extract all lines matching a pattern and replace them with a different format
You want to strip all HTML tags from pasted content, or convert all https:// links to [link](url) markdown format

Beyond basic find-replace:

Text editors do find-replace well within files. The friction points are: files that are not open in your IDE, text that exists only on the clipboard, or transformations that need regex power but not the full overhead of opening a script editor.

How the Find and Replace tool works:

The Find and Replace tool supports:

Plain string replacement — simple literal find-replace across all occurrences
Regex mode — full regular expression syntax with capture groups
Case-sensitive / case-insensitive toggle
Global (all occurrences) or first-only

Example: converting a list of URLs to markdown links:

# Input (list of URLs)
https://github.com/anthropics/anthropic-sdk-python
https://github.com/openai/openai-python
https://github.com/google/generative-ai-python

# Find (regex): ^(https://github\.com/([^/]+)/([^/\n]+))$
# Replace: [$3]($1)

# Output:
[anthropic-sdk-python](https://github.com/anthropics/anthropic-sdk-python)
[openai-python](https://github.com/openai/openai-python)
[generative-ai-python](https://github.com/google/generative-ai-python)

Capture group $1 is the full URL, $2 is the org name, $3 is the repo name.
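The same substitution in Python, as a sketch for when the transformation needs to live in a script. Python's re module writes backreferences as \1 and \3 rather than $1 and $3, and needs re.MULTILINE so that ^ and $ match at every line rather than only at the ends of the whole string.

```python
import re

urls = """https://github.com/anthropics/anthropic-sdk-python
https://github.com/openai/openai-python
https://github.com/google/generative-ai-python"""

# Group 1: full URL, group 2: org name, group 3: repo name
pattern = re.compile(r"^(https://github\.com/([^/]+)/([^/\n]+))$", re.MULTILINE)

# \3 is the repo name used as link text, \1 is the full URL used as the target
markdown = pattern.sub(r"[\3](\1)", urls)
print(markdown)
```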

Example: normalising date format in a CSV:

# Input
2026-05-01, Alice, $150.00
2026/05/02, Bob, $200.00       ← inconsistent separator
2026.05.03, Carol, $75.00      ← another inconsistent separator

# Find (regex): (\d{4})[/.](\d{2})[/.](\d{2})
# Replace: $1-$2-$3

# Output
2026-05-01, Alice, $150.00
2026-05-02, Bob, $200.00
2026-05-03, Carol, $75.00
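One caution with the pattern above: it matches any four-two-two digit run separated by dots or slashes, anywhere on the line. The sketch below anchors the match to the first column so that other fields are left alone. The fourth row and its "build 1234.56.78" value are invented here purely to illustrate the risk.

```python
import re

rows = """2026-05-01, Alice, $150.00
2026/05/02, Bob, $200.00
2026.05.03, Carol, $75.00
2026-05-04, Dan, build 1234.56.78"""

# ^ with re.MULTILINE ties the match to the start of each row;
# the lookahead (?=,) requires the date to be the entire first field
FIRST_COLUMN_DATE = re.compile(r"^(\d{4})[/.](\d{2})[/.](\d{2})(?=,)", re.MULTILINE)

normalised = FIRST_COLUMN_DATE.sub(r"\1-\2-\3", rows)
print(normalised)
```

With the unanchored pattern, the hypothetical build number in the last row would be rewritten to 1234-56-78 as well; the anchored version leaves it untouched.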

Terminal equivalent:

sed 's/old/new/g' file.txt                   # basic replace
sed -E 's#([0-9]{4})[/.]([0-9]{2})[/.]([0-9]{2})#\1-\2-\3#g' file.txt  # regex replace (date example above)
perl -pe 's/pattern/replacement/g' file.txt  # Perl regex (more powerful)

sed works well for file-based transformations in scripts, though in-place editing (-i) takes different arguments on GNU sed and on BSD/macOS sed. The browser tool wins for interactive one-off transformations of clipboard text.

5. Counting Words and Characters: Content Planning, Character Limits

When you need this:

You are writing a meta description and need to stay under 160 characters
A social media post needs to be under Twitter/X's 280-character limit
You are reviewing a technical document and need its approximate reading time for a meeting
You are verifying that a user-generated summary field does not exceed the database column limit
You want to know how many unique words a piece of documentation contains

The numbers that matter:

Metric	Common limits
Meta description	160 characters (Google truncates after ~155)
OG description	200 characters (Facebook truncates)
Tweet	280 characters
LinkedIn post	3000 characters
Email subject line	50–60 characters (optimal for open rate)
SMS	160 characters per segment (GSM-7 encoding; 70 when the message contains Unicode characters)
Average reading speed	200–250 words per minute

How the Word Counter works:

The Word Counter provides a live count of:

Words — split by whitespace and punctuation
Characters — with and without spaces
Sentences — split by terminal punctuation
Paragraphs — separated by blank lines
Reading time — based on a 200 words-per-minute average
Unique words — useful for vocabulary analysis

It updates as you type or paste, so you can edit the text in place until you hit the target count.
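The metrics listed above are simple to approximate. The sketch below is one possible set of definitions, not the tool's own: the function name text_stats is hypothetical, a word is taken to be a run of letters or digits with optional internal apostrophes, and reading time is rounded up to whole minutes.

```python
import math
import re


def text_stats(text, wpm=200):
    """Approximate word-counter metrics under the assumptions stated above."""
    words = re.findall(r"\w+(?:'\w+)*", text)
    # Sentences end at one or more of . ! ?; empty fragments are ignored
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Paragraphs are separated by at least one blank line
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return {
        "words": len(words),
        "chars": len(text),
        "chars_no_spaces": len(re.sub(r"\s", "", text)),
        "sentences": len(sentences),
        "paragraphs": len(paragraphs),
        "unique_words": len({w.casefold() for w in words}),
        "reading_minutes": math.ceil(len(words) / wpm),
    }


sample = "Text is everywhere. Text needs processing!\n\nSort it. Count it."
stats = text_stats(sample)
print(stats)

# Checking a draft against the 160-character meta description limit
print("fits meta description:", stats["chars"] <= 160)
```

Abbreviations such as "e.g." will be counted as extra sentences under this definition, which is a common limitation of punctuation-based splitting.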

Terminal equivalents:

wc -w file.txt                    # word count
wc -c file.txt                    # byte count
wc -l file.txt                    # line count
printf '%s' "your text" | wc -m   # character count (wc -c would count bytes; printf adds no trailing newline)

# Unique words:
tr -s ' ' '\n' < file.txt | sort -f | uniq -i | wc -l

The wc command is fast for large files in scripts. The browser tool wins for quick checks on clipboard content and for the additional metrics (sentences, reading time, paragraphs) that wc does not provide.

When to Use Each Approach

Task	Browser tool	Terminal command	IDE
Clipboard text, no file	Best	Needs pbpaste/xclip or a temp file	Requires paste
Large files (>10 MB)	Slow	Best	Depends
Scripting/automation	Not applicable	Best	Not applicable
Interactive editing	Good	Awkward	Best
Cross-platform	Best	Varies (macOS vs Linux)	Varies
No tools installed	Best	Requires shell	Requires IDE

The browser tools are not a replacement for shell pipelines in automation — they are the fastest path from clipboard to result for interactive use, especially when working with text from sources that are not files: API responses in Postman, log output from a CI dashboard, or a snippet pasted into Slack.

Combining Tools for Complex Tasks

These tools are most powerful in combination. A realistic workflow:

You receive a CSV export with 5000 rows from a database query.
Extract one column using Find and Replace with a regex capture group.
Sort Lines alphabetically to group related entries.
Remove Duplicate Lines to get the unique set.
Compare the result against last week's export using Text Diff to see what changed.
Paste the final list into a document and check its word / line count.

Each step takes seconds when you can paste directly between browser tabs rather than writing intermediate files.
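When the same workflow has to run every week, it is worth scripting. The sketch below chains the extract, deduplicate, sort, and compare steps using Python's standard library. It swaps in the csv module for the regex-based column extraction, and the sample exports and the helper name unique_sorted_column are invented for illustration.

```python
import csv
import difflib
import io


def unique_sorted_column(csv_text, column):
    """Extract one column, drop duplicates, and sort the result."""
    rows = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    # A set removes duplicates; rows too short to have the column are skipped
    values = {row[column].strip() for row in rows if len(row) > column}
    return sorted(values)


# Hypothetical exports: id, email
last_week = "1,alice@example.com\n2,bob@example.com\n3,alice@example.com\n"
this_week = "1,bob@example.com\n2,carol@example.com\n3,bob@example.com\n"

old = unique_sorted_column(last_week, 1)
new = unique_sorted_column(this_week, 1)

# Keep only the lines that were removed ("- ") or added ("+ ")
changes = [ln for ln in difflib.ndiff(old, new) if ln[:2] in ("- ", "+ ")]
print("\n".join(changes))
print("unique entries this week:", len(new))
```

Sorting both lists before diffing matters: it keeps the comparison focused on which entries appeared or disappeared rather than on row order in the export.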