Counting the word occurrences in a text file on Linux tells you a good deal about the content it holds. Although the task sounds too heavy to deal with, the grep and tr commands make things simple.
Linux offers well over 100 commands, and the sub-commands and flags of each multiply the possible combinations. This article concentrates on the commands that help us work with text files; the main goal is learning how to count the word occurrences in a text file.
We will also look at which method fits best and takes the least time. These commands can count the total occurrences in a file, the occurrences in each line, and the number of lines that contain the word, regardless of how many times it appears in each line.
How to Count the Word Occurrences in a Text File on Linux
Now that you’ve got enough information, it is the perfect time to uncover the best ways to count the word occurrence in a text file.
Count the Word Occurrences in a Text File: The grep Command Method
The first method uses the grep command. The process is pretty simple, provided you follow the steps discussed in the next sections.
Creating a Text File
Before you start counting how often a word occurs in a text file, it helps to know how to create a text file and write contents to it. A simple way to create an empty text file is the touch command.
Launch the Terminal with the “Ctrl+Alt+T” key combination.
Run the following command:
$ touch name_of_file.txt

Editing Contents Inside
The next task is adding contents to the text file using the echo command with output redirection, for example: $ echo "These are the contents to be written." > name_of_file.txt
Note: You can do both steps at once with $ cat > name_of_file.txt, after which the cursor moves to the next line so you can type the text, much like in a word processor or Notepad. Press Ctrl+D on an empty line to save and finish. Used without the redirect, as in $ cat name_of_file.txt, the cat command simply displays the contents of the file.
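The creation and editing steps above can be sketched together as follows. This is only a minimal sketch: the temporary directory, the file name sample.txt, and the two sentences written to it are assumptions chosen for illustration.

```shell
# Work in a throwaway directory so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# touch creates an empty file (or only updates the timestamp if it exists).
touch sample.txt

# ">" replaces the file contents; ">>" appends to the end of the file.
echo "Search the logs, then search the archive." > sample.txt
echo "A research note mentions SEARCH once more." >> sample.txt

# cat with a file name as argument prints the contents.
cat sample.txt
```

Using ">" a second time instead of ">>" would have replaced the first sentence rather than adding a second line.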
View the Content and Count the Word Occurrences
After creating the text file and adding contents to it, you can start counting the occurrences of a word. All you need is the grep command, which treats your input as a pattern and searches the file for it. Suppose we are searching for the string 'search' in a text file named sample.txt.
Here is what the command should look like:
$ grep -o 'search' sample.txt | wc -l
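To see why the -o and wc -l combination is used rather than a plain line count, here is a small sketch. The three-line sample file is hypothetical, and grep -c is shown only for comparison.

```shell
# Hypothetical sample file, created in a temporary location.
sample=$(mktemp)
printf '%s\n' \
  'search and search again' \
  'nothing here' \
  'one more search' > "$sample"

# grep -c counts matching LINES: the first and third lines match.
lines=$(grep -c 'search' "$sample")

# grep -o prints every match on its own line, so wc -l counts OCCURRENCES.
occurrences=$(grep -o 'search' "$sample" | wc -l)

echo "matching lines: $lines"
echo "occurrences: $occurrences"
rm -f "$sample"
```

With this input the two numbers differ (2 lines, 3 occurrences) because the first line contains the word twice.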

Approaching with the grep Command
The -o option instructs grep to output only the matched text, with each match printed on a separate line. That output is piped to the wc command, whose -l flag counts lines, so the final number is one per occurrence.
Remember, the -l flag tells wc to count lines, which is why each match needs a line of its own. You can also add the -i flag to grep, which makes the search case-insensitive, so uppercase and lowercase matches are both counted.
$ grep -o -i 'search' sample.txt | wc -l
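The effect of -i can be checked with a short sketch like the one below. The sample text is made up, and the -w flag (match whole words only) is an addition not covered above; it is included because a pattern such as search also matches inside longer words like research.

```shell
# Hypothetical sample file with mixed case and a longer word containing the pattern.
sample=$(mktemp)
printf '%s\n' \
  'Search the index.' \
  'A research team ran a SEARCH, then a search.' > "$sample"

# Case-sensitive: matches "search" inside "research" and the final "search".
exact=$(grep -o 'search' "$sample" | wc -l)

# Case-insensitive: also matches "Search" and "SEARCH".
any_case=$(grep -o -i 'search' "$sample" | wc -l)

# Case-insensitive, whole words only: "research" no longer counts.
whole=$(grep -o -i -w 'search' "$sample" | wc -l)

echo "exact=$exact any_case=$any_case whole=$whole"
rm -f "$sample"
```

Whether substrings should count depends on what you mean by a word occurrence, so it is worth deciding up front whether -w belongs in the command.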

You can pass several input files to the same grep command. It will then search through all of them, and because the matches from every file flow into a single wc -l, the result is the combined occurrence count across the files. The command for this purpose will look something like this:
$ grep -o -i 'search' sample1.txt sample2.txt | wc -l

Here, we’ve created two files, sample1.txt and sample2.txt, and counted the occurrences across both of them. When you pass the two files as arguments to the grep command, it searches each in turn, and the pipeline reports the total count.
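If a separate count per file is wanted instead of one combined number, a loop is one way to get it. The following is a sketch under assumptions: the two files and their contents are invented for illustration.

```shell
# Hypothetical pair of sample files in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' 'search once' 'Search twice, search again' > "$dir/sample1.txt"
printf '%s\n' 'no match here' 'final SEARCH' > "$dir/sample2.txt"

# Combined total: matches from both files go through one wc -l.
total=$(grep -o -i 'search' "$dir/sample1.txt" "$dir/sample2.txt" | wc -l)

# Per-file counts: run the same pipeline once for each file.
per_file=$(
  for f in "$dir/sample1.txt" "$dir/sample2.txt"; do
    printf '%s: %s\n' "$(basename "$f")" "$(grep -o -i 'search' "$f" | wc -l)"
  done
)

echo "total: $total"
echo "$per_file"
rm -rf "$dir"
```

When grep -o receives more than one file, it prefixes each match with the file name; that does not affect the line count produced by wc -l.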
Count the Word Occurrences on Linux: The tr Command Method
Apart from the grep command, we have another handy command-line utility called tr. It can help with counting as well, with one caveat: tr works on individual characters rather than on whole words.
Here you’ll be using two flags, -c and -d. The former takes the complement of the set, that is, every character that is not listed in the given set, and the latter deletes those complemented characters.
The structure of the tr command used for counting:
$ tr -c -d 'search' < sample.txt | wc -c

-c: uses the complement of the set, that is, every character not listed in it
-d: deletes the characters selected by the set
The set, in this case, is search, which tr reads as the individual characters s, e, a, r, c, and h, not as a word. Since we’re combining the -c and -d options, tr deletes every character except the ones defined in the set. The resulting string then goes to the wc command, which stands for ‘word count.’ The number you get is therefore how many characters from the set appear in the file; for a clean result, use a single-character set.
The -c flag of the wc command returns the total byte count, which equals the character count for plain ASCII text.
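Because tr works on individual characters, it may help to see what the -c -d combination actually counts. The sketch below uses a hypothetical two-line file; the comparison with grep -o is added only to show the difference between a character set and a word.

```shell
# Hypothetical sample file.
sample=$(mktemp)
printf '%s\n' 'banana bread' 'an apple' > "$sample"

# Keep only the letter a (newlines are deleted too), then count the bytes left.
a_count=$(tr -c -d 'a' < "$sample" | wc -c)

# A longer set is still a set of characters: this counts every a and every n.
set_count=$(tr -c -d 'an' < "$sample" | wc -c)

# For comparison, grep -o counts the two-letter string "an" itself.
string_count=$(grep -o 'an' "$sample" | wc -l)

echo "a: $a_count, a or n: $set_count, 'an': $string_count"
rm -f "$sample"
```

Here the set 'an' yields 9 while the string an occurs 3 times, which is why tr on its own suits single-character counts best.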
Also, you can combine the tr command with the grep command in a different way to search for a text pattern in the desired file.
Sample Input:
$ tr '[:space:]' '[\n*]' < sample.txt | grep -i -c search

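The tr command places each whitespace-separated word of the sample.txt file on its own line, and grep -c then counts the lines that contain search, ignoring case. Since each line now holds a single word, the printed number is the count of words that contain the pattern.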
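A short sketch of this pipeline on made-up input follows. The sample sentences are assumptions, and the -w variant is an optional addition for readers who want to exclude longer words such as research.

```shell
# Hypothetical sample file.
sample=$(mktemp)
printf '%s\n' 'Search here, search there' 'research the search terms' > "$sample"

# One token per line, then count the lines that contain the pattern.
loose=$(tr '[:space:]' '[\n*]' < "$sample" | grep -i -c 'search')

# Same pipeline, but -w only accepts the pattern as a whole word.
whole=$(tr '[:space:]' '[\n*]' < "$sample" | grep -i -w -c 'search')

echo "tokens containing the pattern: $loose"
echo "whole-word tokens: $whole"
rm -f "$sample"
```

Note that grep -c counts lines, so a single token that contains the pattern twice would still count once with this approach.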
Using awk Command to Count the Word Occurrences
Another command, awk, proves useful when you want the frequency of a word. It is a utility that takes input data, processes it, and returns the desired output.
Compared to the methods I’ve already discussed, though, this one is harder to understand. For that reason, it is among the least used methods, and I’d recommend sticking to either the grep command or the tr command. However, having an idea won’t hurt.
Sample command format:
$ awk -F 'search' 'NF {s+=(NF-1)} END {print s+0}' sample.txt
Understanding awk Command
Over here, I’ve changed the field separator to 'search' using the -F flag (-F takes the phrase that needs to be searched; in our case, that is search). Each line is then split into fields at every occurrence of the word search.
Coming to the NF {s+=(NF-1)} END {print s+0} section, NF is the number of fields on the current line, and subtracting one gives the number of matches on that line, since a single match does nothing but split the line into two parts. The leading NF skips empty lines, where NF is 0 and the subtraction would otherwise take one off the total.
Finally, the per-line counts are added up in s, and the END block prints the total occurrence count for the entire text. Writing s+0 ensures a number is printed even if s was never set.
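One detail of the NF-1 approach is worth checking by hand: an empty line has NF equal to 0, so an unguarded {s+=(NF-1)} subtracts one for each empty line. The sketch below compares the unguarded and guarded forms on a hypothetical file that contains an empty line, using grep -o as a cross-check.

```shell
# Hypothetical sample file; the second line is intentionally empty.
sample=$(mktemp)
printf '%s\n' 'search and search again' '' 'one more search' > "$sample"

# Unguarded: the empty line contributes NF-1 = -1 to the sum.
unguarded=$(awk -F 'search' '{s+=(NF-1)} END {print s+0}' "$sample")

# Guarded: the NF pattern skips lines that have no fields at all.
guarded=$(awk -F 'search' 'NF {s+=(NF-1)} END {print s+0}' "$sample")

# Cross-check with the grep pipeline.
with_grep=$(grep -o 'search' "$sample" | wc -l)

echo "unguarded=$unguarded guarded=$guarded grep=$with_grep"
rm -f "$sample"
```

On this input the unguarded form reports 2, while the guarded form and grep both report 3.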
That’s basically how you can count the word occurrences in a text file on Linux. If you ask me to help you choose one, I’d say go for the tr command, as it seems the more efficient and time-optimized option, bearing in mind that it needs grep alongside it to match whole words. However, if you’re not after efficiency, all three methods can serve the purpose.