5 Text Processing Tasks Developers Do Manually (And Shouldn't)
Sorting lines, removing duplicates, comparing files, finding patterns, counting words — developers do these tasks constantly and usually reach for a text editor or the terminal. This guide shows faster browser-based alternatives for each.
Text is everywhere in software development — log files, API responses, configuration files, SQL dumps, CSV exports, dependency lists, error output. Most of this text needs to be processed in some way before it is useful: sorted, deduplicated, compared against another version, counted, or searched and transformed.
The instinct is usually to reach for the terminal (sort | uniq | wc -l) or to open a code editor with multi-cursor editing. Both work but have friction: shell pipelines require the right syntax on the right OS, and editors are most comfortable when the text already lives in a file on disk. Browser-based tools handle these tasks on text you can paste directly from any source: the clipboard, terminal output, an API response.
This guide covers five text processing tasks that come up constantly and the fastest way to do each.
1. Sorting Lines: Configuration Files, Dependency Lists, Imports
When you need this:
- An .env file whose keys have grown in random order over two years
- A requirements.txt or package.json dependency list that you want alphabetised before a code review
- A list of feature flags, database migration names, or Terraform resource names that need canonical ordering
- A CSV where the rows need to be sorted by a specific column-like structure
What makes this annoying manually:
In a code editor, you are sorting by eye and dragging lines. In the terminal, sort can behave differently on macOS and Linux (the sort shipped with macOS is not GNU sort, so options such as --parallel may be missing depending on the version; locale settings also affect sort order for non-ASCII characters). For a quick sort of 20 lines copied from a CI log, spinning up a terminal session is more overhead than the task deserves.
How the Sort Lines tool works:
The Sort Lines tool handles:
- Alphabetical (A→Z / Z→A): standard lexicographic sort; apple comes before banana
- By line length (short→long / long→short): useful for code style, shortest imports first, longest last
- Case-insensitive: Banana and banana sort together rather than capitals-first
- Numeric sort: 2 before 10 (lexicographic sort would give 10 before 2)
- Remove blank lines: clean up the output if your source has empty lines between entries
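The options above can be approximated in a few lines of Python. This is a minimal sketch, not the tool's actual implementation; the function name sort_lines and its parameters are hypothetical, and the numeric mode assumes every line parses as a number.

```python
def sort_lines(text, *, mode="alpha", reverse=False, ignore_case=False, drop_blank=False):
    """Sort the lines of `text`; a hypothetical stand-in for the tool's options."""
    lines = text.splitlines()
    if drop_blank:
        # Discard empty or whitespace-only lines before sorting
        lines = [ln for ln in lines if ln.strip()]
    if mode == "length":
        key = len
    elif mode == "numeric":
        key = float  # assumption: every remaining line is a number
    elif ignore_case:
        key = str.casefold
    else:
        key = None  # plain lexicographic order
    return "\n".join(sorted(lines, key=key, reverse=reverse))


print(sort_lines("10\n2\n1"))                  # lexicographic: 1, 10, 2
print(sort_lines("10\n2\n1", mode="numeric"))  # numeric: 1, 2, 10
print(sort_lines("banana\nApple\nCherry", ignore_case=True))  # Apple, banana, Cherry
```

Without ignore_case, the last call would return Apple, Cherry, banana, because uppercase letters sort before lowercase ones in a plain lexicographic comparison.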
Example: sorting Python imports before a PR:
# Before (random order from git history)
import os
import json
from datetime import datetime
import sys
from pathlib import Path
import re
from typing import Optional, List
# Paste into Sort Lines, alphabetical, case-insensitive:
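# After
from datetime import datetime
from pathlib import Path
from typing import Optional, List
import json
import os
import re
import sys
This approximates what isort does for a quick manual fix, without needing to install anything. A plain line sort puts every from-import ahead of every plain import because "from" sorts before "import"; isort applies its own grouping rules on top of alphabetical order, so its output can differ.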
Terminal equivalent for reference:
sort file.txt # alphabetical
sort -r file.txt # reverse
sort -n file.txt # numeric
sort -f file.txt # case-insensitive
sort -u file.txt # unique (removes duplicates)
The browser tool wins when you have text on the clipboard rather than a file on disk.
2. Removing Duplicate Lines: Log Analysis, Dependency Deduplication
When you need this:
- You have collected error messages from multiple log files and want the unique set
- A dependency list was concatenated from two sources and has duplicates
- You are building a blocklist or allowlist and received duplicate entries
- A data export has duplicate rows that need to be removed before import
Why duplicates accumulate:
Configuration files that are edited by multiple people or maintained by scripts often end up with repeated entries. Log aggregation pipelines often produce the same error message thousands of times — you want the distinct set. A list of emails or usernames gathered from multiple queries often has overlaps.
What the Duplicate Line Remover does:
The Duplicate Line Remover removes lines that appear more than once, keeping the first occurrence. Options include:
- Case-sensitive (default): Error and error are treated as different lines
- Case-insensitive: ERROR, Error, and error are treated as the same
- Trim whitespace: lines that differ only in leading/trailing spaces are treated as duplicates
- Show removed lines: see what was removed rather than what remained
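The behaviour described above (keep the first occurrence, preserve order, optionally ignore case or surrounding whitespace) can be sketched in Python as follows. The function name dedupe_lines and its parameters are hypothetical, chosen only to mirror the listed options; this is not the tool's own code.

```python
def dedupe_lines(text, *, ignore_case=False, trim=False):
    """Return (kept, removed) lists, keeping the first occurrence of each line."""
    seen = set()
    kept, removed = [], []
    for line in text.splitlines():
        # Build the comparison key according to the chosen options
        key = line.strip() if trim else line
        if ignore_case:
            key = key.casefold()
        if key in seen:
            removed.append(line)
        else:
            seen.add(key)
            kept.append(line)
    return kept, removed


kept, removed = dedupe_lines(
    "react\nlodash\naxios\nreact\nreact-dom\nlodash\nmoment\naxios"
)
print(kept)     # ['react', 'lodash', 'axios', 'react-dom', 'moment']
print(removed)  # ['react', 'lodash', 'axios']
```

Returning the removed lines alongside the kept ones corresponds to the "show removed lines" option.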
Example: deduplicating a list of npm package names:
# Raw list from two package.json files merged
react
lodash
axios
react ← duplicate
react-dom
lodash ← duplicate
moment
axios ← duplicate
# After duplicate removal:
react
lodash
axios
react-dom
moment
Terminal equivalent:
sort file.txt | uniq # sort first, then deduplicate
awk '!seen[$0]++' file.txt # deduplicate without sorting (preserves order)
The awk one-liner is the closest equivalent to the browser tool — it preserves the first occurrence and maintains the original order. Worth knowing for scripting; the browser tool is faster for interactive use.
3. Comparing Two Texts: Before/After, Code Review, Config Diffs
When you need this:
- You have two versions of a configuration file and want to see what changed
- A team member pasted a "fixed" version of a script in Slack and you want to see the differences from the original
- You received an API response that looks wrong and want to compare it against the expected schema
- You want to see what lines were added or removed between two SQL migrations
The problem with manual comparison:
Reading two blocks of text side by side and noting differences is both slow and error-prone. Human eyes miss changes in long lines, subtle whitespace differences, and reordered blocks. Even experienced developers miss differences that a diff algorithm catches instantly.
How the Text Diff Checker works:
The Text Diff Checker uses the Myers diff algorithm — the same algorithm used by git diff — to compare two texts and highlight additions and deletions. It shows:
- Lines added (shown in green)
- Lines removed (shown in red)
- Lines unchanged (shown as context)
- Inline character-level diff for modified lines
Example: comparing two versions of a Dockerfile:
# Version 1                  # Version 2
FROM node:18-alpine          FROM node:20-alpine
WORKDIR /app                 WORKDIR /app
COPY package*.json ./        COPY package*.json ./
RUN npm ci                   RUN npm ci --omit=dev
COPY . .                     COPY . .
EXPOSE 3000                  EXPOSE 3000
CMD ["node", "index.js"]     CMD ["node", "dist/index.js"]
The diff shows:
- node:18-alpine → node:20-alpine (Node version bump)
- npm ci → npm ci --omit=dev (production-only install added)
- index.js → dist/index.js (entry point changed to compiled output)
These are exactly the changes a reviewer needs to understand — the diff surfaces them without requiring line-by-line reading.
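For scripted use, the same Dockerfile comparison can be reproduced with Python's standard difflib module. This is only a sketch: difflib uses its own sequence-matching approach rather than the Myers algorithm, so its output may differ from the browser tool or git diff on harder inputs. The labels version1 and version2 are placeholders.

```python
import difflib

v1 = """FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
EXPOSE 3000
CMD ["node", "index.js"]
""".splitlines()

v2 = """FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
CMD ["node", "dist/index.js"]
""".splitlines()

# Unified format: removed lines start with "-", added lines with "+",
# unchanged context lines with a single space.
diff = list(
    difflib.unified_diff(v1, v2, fromfile="version1", tofile="version2", lineterm="")
)
print("\n".join(diff))
```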
Terminal equivalent:
diff file1.txt file2.txt # basic diff
diff -u file1.txt file2.txt # unified format (same as git diff)
diff -y file1.txt file2.txt # side-by-side
The browser tool wins for text on the clipboard that is not saved to files.
4. Finding and Replacing: Batch Transformations, Refactoring Snippets
When you need this:
- You need to rename an identifier across a snippet of code that is not in your IDE (a Gist, a Stack Overflow answer, a paste from a CI log)
- A CSV file has inconsistent formatting in one column and you want to normalise it before importing
- You need to extract all lines matching a pattern and replace them with a different format
- You want to strip all HTML tags from pasted content, or convert all https:// links to [link](url) markdown format
Beyond basic find-replace:
Text editors do find-replace well within files. The friction points are: files that are not open in your IDE, text that exists only on the clipboard, or transformations that need regex power but not the full overhead of opening a script editor.
How the Find and Replace tool works:
The Find and Replace tool supports:
- Plain string replacement — simple literal find-replace across all occurrences
- Regex mode — full regular expression syntax with capture groups
- Case-sensitive / case-insensitive toggle
- Global (all occurrences) or first-only
Example: converting a list of URLs to markdown links:
# Input (list of URLs)
https://github.com/anthropics/anthropic-sdk-python
https://github.com/openai/openai-python
https://github.com/google/generative-ai-python
# Find (regex): ^(https://github\.com/([^/]+)/([^/\n]+))$
# Replace: [$3]($1)
# Output:
[anthropic-sdk-python](https://github.com/anthropics/anthropic-sdk-python)
[openai-python](https://github.com/openai/openai-python)
[generative-ai-python](https://github.com/google/generative-ai-python)
Capture group $1 is the full URL, $2 is the org name, $3 is the repo name.
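The same transformation can be sketched with Python's re module. Two differences from the tool's syntax are worth noting: Python writes group references in the replacement as \1 and \3 rather than $1 and $3, and the re.MULTILINE flag is needed so that ^ and $ match at every line boundary.

```python
import re

urls = """https://github.com/anthropics/anthropic-sdk-python
https://github.com/openai/openai-python
https://github.com/google/generative-ai-python"""

# Group 1: full URL, group 2: org name, group 3: repo name
pattern = re.compile(r"^(https://github\.com/([^/]+)/([^/\n]+))$", re.MULTILINE)

markdown = pattern.sub(r"[\3](\1)", urls)
print(markdown)
# [anthropic-sdk-python](https://github.com/anthropics/anthropic-sdk-python)
# [openai-python](https://github.com/openai/openai-python)
# [generative-ai-python](https://github.com/google/generative-ai-python)
```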
Example: normalising date format in a CSV:
# Input
2026-05-01, Alice, $150.00
2026/05/02, Bob, $200.00 ← inconsistent separator
2026.05.03, Carol, $75.00 ← another inconsistent separator
# Find (regex): (\d{4})[/.](\d{2})[/.](\d{2})
# Replace: $1-$2-$3
# Output
2026-05-01, Alice, $150.00
2026-05-02, Bob, $200.00
2026-05-03, Carol, $75.00
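As a scripted counterpart, here is a minimal Python sketch of the same date normalisation, using the regex from the example unchanged. The variable names are illustrative. Note that the pattern is not anchored, so on other data it could also match digit runs that are not dates; adding ^ with re.MULTILINE would restrict it to the start of each line.

```python
import re

rows = """2026-05-01, Alice, $150.00
2026/05/02, Bob, $200.00
2026.05.03, Carol, $75.00"""

# Match YYYY, MM, DD separated by "/" or "."; rows already using "-" are left alone
date_sep = re.compile(r"(\d{4})[/.](\d{2})[/.](\d{2})")

normalised = date_sep.sub(r"\1-\2-\3", rows)
print(normalised)
```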
Terminal equivalent:
sed 's/old/new/g' file.txt # basic replace
sed -E 's#([0-9]{4})[/.]([0-9]{2})[/.]([0-9]{2})#\1-\2-\3#g' file.txt # regex replace (date example above)
perl -pe 's/pattern/replacement/g' file.txt # Perl regex (more powerful)
sed works perfectly for file-based transformations in scripts. The browser tool wins for interactive one-off transformations of clipboard text.
5. Counting Words and Characters: Content Planning, Character Limits
When you need this:
- You are writing a meta description and need to stay under 160 characters
- A social media post needs to be under Twitter/X's 280-character limit
- You are reviewing a technical document and need its approximate reading time for a meeting
- You are verifying that a user-generated summary field does not exceed the database column limit
- You want to know how many unique words a piece of documentation contains
The numbers that matter:
| Metric | Common limits |
|---|---|
| Meta description | 160 characters (Google truncates after ~155) |
| OG description | 200 characters (Facebook truncates) |
| Tweet | 280 characters |
| LinkedIn post | 3000 characters |
| Email subject line | 50–60 characters (optimal for open rate) |
| SMS | 160 characters per segment |
| Average reading speed | 200–250 words per minute |
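
A length check against some of the limits in the table can be sketched like this. The dictionary keys and the function name check_limit are hypothetical, and len() counts Unicode code points, which is an assumption: individual platforms may count certain characters or URLs differently.

```python
# Limits taken from the table above; key names are illustrative only
LIMITS = {
    "meta_description": 160,
    "og_description": 200,
    "tweet": 280,
    "linkedin_post": 3000,
}


def check_limit(text, kind):
    """Return (length, limit, fits) for `text` against the limit for `kind`."""
    limit = LIMITS[kind]
    length = len(text)  # code points, not bytes
    return length, limit, length <= limit


length, limit, fits = check_limit(
    "Sort, dedupe, diff, and count text in the browser.", "meta_description"
)
print(f"{length}/{limit} characters, fits: {fits}")
```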
How the Word Counter works:
The Word Counter provides a live count of:
- Words — split by whitespace and punctuation
- Characters — with and without spaces
- Sentences — split by terminal punctuation
- Paragraphs — separated by blank lines
- Reading time — based on a 200 words-per-minute average
- Unique words — useful for vocabulary analysis
It updates as you type or paste, so you can edit the text in place until you hit the target count.
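The metrics above can be approximated in Python as follows. This is a rough sketch under stated assumptions, not the tool's implementation: the function name text_stats is hypothetical, words are taken to be runs of ASCII letters, digits, and apostrophes, sentences are split on ., !, and ?, and reading time is rounded up to whole minutes.

```python
import math
import re


def text_stats(text, wpm=200):
    """Compute rough word, character, sentence, and paragraph counts."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return {
        "words": len(words),
        "characters": len(text),
        "characters_no_whitespace": len(re.sub(r"\s", "", text)),
        "sentences": len(sentences),
        "paragraphs": len(paragraphs),
        "unique_words": len({w.casefold() for w in words}),
        "reading_minutes": math.ceil(len(words) / wpm),
    }


sample = "Sort the lines. Then sort them again!\n\nDone?"
stats = text_stats(sample)
print(stats["words"], stats["unique_words"], stats["sentences"], stats["paragraphs"])
# 8 7 3 2
```

Abbreviations such as "e.g." would be miscounted as sentence breaks by this simple split, which is one reason different counters report slightly different numbers.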
Terminal equivalents:
wc -w file.txt # word count
wc -c file.txt # byte count
wc -l file.txt # line count
printf '%s' "your text" | wc -m # character count (no trailing newline; wc -c would count bytes)
# Unique words:
tr -s ' ' '\n' < file.txt | sort -f | uniq -i | wc -l
The wc command is fast for large files in scripts. The browser tool wins for quick checks on clipboard content and for the additional metrics (sentences, reading time, paragraphs) that wc does not provide.
When to Use Each Approach
| Task | Browser tool | Terminal command | IDE |
|---|---|---|---|
| Clipboard text, no file | Best | Requires temp file or clipboard pipe | Requires paste |
| Large files (>10 MB) | Slow | Best | Depends |
| Scripting/automation | Not applicable | Best | Not applicable |
| Interactive editing | Good | Awkward | Best |
| Cross-platform | Best | Varies (macOS vs Linux) | Varies |
| No tools installed | Best | Requires shell | Requires IDE |
The browser tools are not a replacement for shell pipelines in automation — they are the fastest path from clipboard to result for interactive use, especially when working with text from sources that are not files: API responses in Postman, log output from a CI dashboard, or a snippet pasted into Slack.
Combining Tools for Complex Tasks
These tools are most powerful in combination. A realistic workflow:
- You receive a CSV export with 5000 rows from a database query.
- Extract one column using Find and Replace with a regex capture group.
- Sort Lines alphabetically to group related entries.
- Remove Duplicate Lines to get the unique set.
- Compare the result against last week's export using Text Diff to see what changed.
- Paste the final list into a document and check its word / line count.
Each step takes seconds when you can paste directly between browser tabs rather than writing intermediate files.
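When the same workflow needs to be repeated, it can be scripted. The sketch below chains the extract, sort, deduplicate, and compare steps in Python on made-up sample data. It uses the csv module to pull out the column rather than a regex capture group, and the function name, column name, and email addresses are all hypothetical.

```python
import csv
import difflib
import io


def unique_sorted_column(csv_text, column):
    """Extract one column, sort it case-insensitively, and drop duplicates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = sorted((row[column].strip() for row in reader), key=str.casefold)
    return list(dict.fromkeys(values))  # dict keeps first-seen order


# Invented sample exports standing in for last week's and this week's data
last_week = "id,email\n1,bob@example.com\n2,alice@example.com\n"
this_week = "id,email\n1,bob@example.com\n2,carol@example.com\n3,bob@example.com\n"

old = unique_sorted_column(last_week, "email")
new = unique_sorted_column(this_week, "email")

# Keep only the added ("+ ") and removed ("- ") entries from the line diff
changes = [ln for ln in difflib.ndiff(old, new) if ln.startswith(("+ ", "- "))]
print(changes)  # ['- alice@example.com', '+ carol@example.com']
```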