Removing line numbers from text file

Say you have an OCR’d book or other text file that has line numbers embedded in it:

1 # cool program
2 import sys
3 def howneat():
4     sys.exit('Thanks for visiting')

Remove the line numbers by typing in Terminal:

perl -pe 's/^[ \t]+\d+//'  >

Regular expression

^[ \t]+\d+

^
beginning of the line
[ \t]+
match space or tab (one or more)
\d+
match one or more digits
-p
implicitly loop over (read each line of) the file and print each line after the substitution
-e
run the Perl code given on the command line

The match is replaced with nothing. Everything after the number, including the indentation of the code, is left alone (relevant for Python).
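
To check the pattern outside of Perl, the same substitution can be sketched in Python with the standard `re` module. This is only an illustration, not part of the original one-liner: the helper name `strip_line_number` and the sample lines are hypothetical, and the sample assumes each number is preceded by at least one space or tab, as the pattern requires.

```python
import re

# Same pattern as the Perl one-liner: leading spaces/tabs, then digits.
LINE_NUMBER = re.compile(r"^[ \t]+\d+")


def strip_line_number(line: str) -> str:
    """Remove a leading whitespace-plus-digits prefix (hypothetical helper)."""
    return LINE_NUMBER.sub("", line, count=1)


# Hypothetical sample: numbers are preceded by whitespace, as the pattern expects.
sample = [
    "  3 def howneat():",
    "  4     sys.exit('Thanks for visiting')",
]
for line in sample:
    print(repr(strip_line_number(line)))
```

Only the whitespace before the number and the digits themselves are removed, so the relative indentation between lines is preserved. A line whose number starts at column zero does not match and passes through unchanged.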


I could have used the -i flag to edit in place, but I instead redirected STDOUT to a new file in case I made a mistake (inputting the wrong file, for example).
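
The same "write to a new file, keep the original" precaution can be sketched in Python. This is a minimal sketch under stated assumptions: the function name `strip_file` and the file names `numbered.txt` and `clean.txt` are hypothetical, and UTF-8 input is assumed.

```python
import re
import tempfile
from pathlib import Path

# Same pattern as the Perl one-liner: leading spaces/tabs, then digits.
LINE_NUMBER = re.compile(r"^[ \t]+\d+")


def strip_file(src: Path, dst: Path) -> None:
    """Write a copy of src with leading line numbers removed; src is not modified."""
    if src.resolve() == dst.resolve():
        # Guard against clobbering the input by mistake.
        raise ValueError("refusing to overwrite the input file")
    with src.open(encoding="utf-8") as fin, dst.open("w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(LINE_NUMBER.sub("", line, count=1))


if __name__ == "__main__":
    # Demo in a temporary directory with hypothetical file names.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "numbered.txt"
        dst = Path(tmp) / "clean.txt"
        src.write_text("  1 # cool program\n  2 import sys\n", encoding="utf-8")
        strip_file(src, dst)
        print(dst.read_text(encoding="utf-8"), end="")
```

Because the input is only ever opened for reading, pointing the function at the wrong file costs nothing but a bad output file.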



