Linux Text Processing and Search Commands

Common usage and search patterns for regular expressions, grep, and egrep on Linux.

#type / howto #status / growing #tech / ops #resource / linux #platform / linux

[!info] related notes

Related MOC: Linux MOC

Command index: Linux Basic Commands Index


Goal

Quickly search text and filter output on Linux, and understand the most commonly used regex metacharacters.

Prerequisites

First decide which of these you need:
- Search for a pattern: grep
- Use extended regular expressions: grep -E / egrep

Steps

1. Memorize the most common regex metacharacters

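Metacharacter	Meaning
`^`	Start of line
`$`	End of line
`.`	Any single character
`*`	Zero or more of the preceding item
`+`	One or more of the preceding item (extended regex)
`?`	Zero or one of the preceding item (extended regex)
`[]`	Character set
`[^]`	Negated character set
`|`	Alternation: match either side (extended regex)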
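The metacharacters above can be tried out on a few lines of text. The following is a minimal sketch, assuming a POSIX shell with grep and mktemp available; the sample lines and variable names are made up for illustration. It uses grep -c to count matching lines so each metacharacter's effect is easy to compare.

```shell
# Throwaway sample file with hypothetical log lines.
sample=$(mktemp)
printf '%s\n' 'error: disk full' 'warning: disk at 90%' 'info: ok' 'errors found' > "$sample"

starts=$(grep -c '^error' "$sample")         # ^  : lines starting with "error"        -> 2
ends=$(grep -c 'ok$' "$sample")              # $  : lines ending with "ok"             -> 1
any_char=$(grep -c 'd.sk' "$sample")         # .  : "d", any one character, "sk"       -> 2
star=$(grep -c 'r*o' "$sample")              # *  : zero or more "r", then "o"         -> 3
digits=$(grep -c '[0-9]' "$sample")          # [] : lines containing a digit           -> 1
negated=$(grep -c '[^a-z: ]' "$sample")      # [^]: a char that is not a-z, ":" or " " -> 1

# The next three need extended regex, hence -E.
plus=$(grep -cE 'r+o' "$sample")             # +  : one or more "r", then "o"          -> 2
optional=$(grep -cE 'errors?:' "$sample")    # ?  : "error:" or "errors:"              -> 1
either=$(grep -cE 'warning|info' "$sample")  # |  : either alternative                 -> 2

echo "$starts $ends $any_char $star $digits $negated $plus $optional $either"
rm -f "$sample"
```

Note that `r*o` matches every line containing an "o", because `*` also allows zero repetitions; `r+o` is the stricter form.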

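2. Search text with grep

Command	Purpose
`grep pattern file`	Print matching lines
`grep -n pattern file`	Show line numbers
`grep -i pattern file`	Ignore case
`grep -w word file`	Match whole words only
`grep -R pattern dir`	Search a directory recursively
`grep -A 3 pattern file`	Also show 3 lines after each match
`grep -B 3 pattern file`	Also show 3 lines before each match
`grep -C 3 pattern file`	Also show 3 lines before and after each match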
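To see how these options change the output, here is a small sketch run against a temporary file. The file contents and variable names are hypothetical; the context options (-A, -B, -C) are assumed to be available, as they are in GNU and BSD grep.

```shell
# Hypothetical six-line log used only for this demonstration.
log=$(mktemp)
printf '%s\n' 'start' 'Error: cat not found' 'category list' 'retry 1' 'retry 2' 'done' > "$log"

# -n prefixes each matching line with its line number.
numbered=$(grep -n 'retry' "$log")

# -i ignores case, so "error" also matches "Error".
nocase=$(grep -i 'error' "$log")

# -w matches "cat" as a whole word, so "category" is skipped.
whole=$(grep -w 'cat' "$log")

# -A 1 / -B 1 / -C 1 add one line of context after / before / around the match.
after=$(grep -A 1 'category' "$log")
before=$(grep -B 1 'Error' "$log")
around=$(grep -C 1 'retry 1' "$log")

printf '%s\n---\n%s\n---\n%s\n' "$numbered" "$whole" "$around"
rm -f "$log"
```

Capturing the output in variables is only for demonstration; in day-to-day use the commands are typically run directly or piped into a pager.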

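3. Switch to grep -E when you need extended regex

egrep is equivalent to grep -E. GNU grep 3.8 and later print a warning that egrep is obsolescent, so prefer grep -E in new scripts.
When you need extended expressions such as a|b, a+, a?, or (ab)+, use grep -E directly.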
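The difference between basic and extended regex shows up clearly when the same pattern is run both ways. This is a minimal sketch with a made-up word list; it assumes only grep and mktemp.

```shell
# Hypothetical word list for comparing basic and extended regex.
words=$(mktemp)
printf '%s\n' 'ab' 'abab' 'a+b' 'cat' 'dog' > "$words"

# Basic regex: "+" is an ordinary character, so only the literal text "a+b" matches.
basic=$(grep 'a+b' "$words")

# Extended regex: "+" means "one or more", so "ab" and "abab" match, but "a+b" does not.
extended=$(grep -E 'a+b' "$words")

# Grouping with a quantifier: lines made entirely of repeated "ab".
grouped=$(grep -E '^(ab)+$' "$words")

# Alternation: lines containing either word.
either=$(grep -E 'cat|dog' "$words")

printf 'basic: %s\nextended: %s\n' "$basic" "$(echo $extended)"
rm -f "$words"
```

If a pattern silently matches nothing, a missing -E is a likely cause worth checking first.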

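Verification

Use grep -n or grep -c to confirm that the expected lines were matched.
Try complex patterns on a small sample first, then scan directories recursively.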
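The two-step check described here can be sketched as follows. The directory layout, file names, and pattern are all hypothetical; the point is to count hits on one small file before widening the search.

```shell
# Hypothetical config directory with two small files.
dir=$(mktemp -d)
printf '%s\n' 'timeout=30' 'retries=3' > "$dir/app.conf"
printf '%s\n' 'timeout=abc' 'name=demo' > "$dir/other.conf"

pattern='^timeout=[0-9]+$'

# Step 1: test the pattern on a single small file and count the matching lines.
hits=$(grep -cE "$pattern" "$dir/app.conf")

# Step 2: once the count looks right, scan the whole directory with line numbers.
matches=$(grep -RnE "$pattern" "$dir")

echo "hits in sample: $hits"
echo "$matches"
rm -rf "$dir"
```

Here the recursive run reports only app.conf, line 1, because "timeout=abc" does not satisfy the digits-only part of the pattern.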

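Common issues

grep -R can recurse into a large number of files; narrow the directory scope before running it to avoid excessive noise.
The | in a regex is not the same thing as the column separator in a Markdown table; when showing it inside a table, write it as \|.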
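As an illustration of narrowing a recursive search, the sketch below compares a broad scan with two narrower ones. The directory tree is made up, and the --include option is a GNU grep feature (also present in BSD grep) that is not part of POSIX, so treat its availability as an assumption.

```shell
# Hypothetical project tree with a match in every file.
root=$(mktemp -d)
mkdir -p "$root/src" "$root/logs"
printf '%s\n' 'TODO: refactor' > "$root/src/main.py"
printf '%s\n' 'TODO: write docs' > "$root/src/notes.txt"
printf '%s\n' 'TODO: rotate' > "$root/logs/app.log"

# Broad: every file under the root is searched (-l lists matching file names only).
broad=$(grep -Rl 'TODO' "$root" | wc -l | tr -d ' ')

# Narrower: point grep at the subdirectory you actually care about.
narrow_dir=$(grep -Rl 'TODO' "$root/src" | wc -l | tr -d ' ')

# Narrower still: restrict by file name pattern (GNU/BSD grep option).
narrow_glob=$(grep -Rl --include='*.py' 'TODO' "$root")

echo "broad=$broad src_only=$narrow_dir"
echo "$narrow_glob"
rm -rf "$root"
```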