How to use awk to select data from commands and scripts

The awk command is incredibly useful, and you will be surprised at just how powerful and transformative it will make your scripts. Here's how to get started using it.

While I'm not a fan of programming, I do love working from the command line as much as possible and have great respect for programmers and what they can do. As I work on evolving my Linux skills, I find a lot of cross-pollination between Linux and macOS in the Terminal due to their shared UNIX base, and I recently started using some more advanced (for me) Linux commands in my scripts.

One command that simply blew my mind is awk. For those who are unsure, let me tell you that awk is awesome! Basically, it is a filter that interprets the data fed to it. Awk reads a file (or the output of another command) one line at a time and splits each line into fields, using whitespace as the separator by default. This allows the user to transform the data as they see fit, output reports formatted a certain way, scan for patterns, or perform programming operations based on said data. It also works extremely well with variables to segment specific data points for recall later.
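To make the field splitting concrete, here is a minimal sketch that pipes a single line of text into awk and stores the results in shell variables. The input line and the variable names (line, field_count, department) are made up for illustration.

```shell
# Hypothetical input line; awk splits it on whitespace into $1, $2, $3.
line="alice 42 engineering"

# NF is awk's built-in count of fields on the current line.
field_count=$(printf '%s\n' "$line" | awk '{print NF}')

# Capture the third field in a shell variable for later use in a script.
department=$(printf '%s\n' "$line" | awk '{print $3}')

echo "fields: $field_count, department: $department"
# prints: fields: 3, department: engineering
```

The same pattern works with the output of any command in place of printf.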

This article is far from any advanced take on awk, but if you're looking to add to your skillset by expanding your Linux/macOS Terminal knowledge, or simply wish to implement this great little utility into your scripting to make robust, feature-rich use of data, then this should help you get started down the road to awk enlightenment.

Print all data from a file

One of the easiest ways to use awk is to have it scan a file and print or display (Figure A) the contents on screen by using the following syntax:

awk '{print}' filename.ext

Figure A: awk printing the contents of the file on screen.
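If you want to try this without a data file at hand, the sketch below builds a small throwaway file first. The file contents are invented sample data, not the file shown in the figure.

```shell
# Create a small sample file (hypothetical data for illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
id name job_title
1 Ana Doctor
2 Ben Nurse
3 Cy Doctor
EOF

# With no pattern, the action runs for every line; print alone prints $0.
all_lines=$(awk '{print}' "$sample")
echo "$all_lines"

# Clean up the temporary file.
rm -f "$sample"
```

Single quotes around the awk program keep the shell from interpreting the braces and any $ signs before awk sees them.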

Find lines that match a specific pattern

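Say you wish to display only the lines that match a specified pattern. The following syntax would display only the lines that contain the text "Doctor", which in our example file is a job title (Figure B). Note that the match is case-sensitive and applies to the whole line, not to one particular column: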

awk '/Doctor/ {print}' filename.ext 

Figure B: awk displaying only the lines that match "Doctor".
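A pattern between slashes matches anywhere on the line, which can catch more than you intend. The sketch below, using invented sample data, compares that with a test against a single field. It assumes the job title sits in the third whitespace-separated column.

```shell
# Hypothetical sample data: id, name, job title.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1 Ana Doctor
2 Ben Nurse
3 Cy Doctor
4 Doctorow Clerk
EOF

# /Doctor/ matches the text anywhere on the line, so "Doctorow" also matches.
anywhere=$(awk '/Doctor/ {print}' "$sample")

# Comparing only the third field matches just the job title column.
title_only=$(awk '$3 == "Doctor" {print}' "$sample")

echo "$anywhere"
echo "---"
echo "$title_only"

rm -f "$sample"
```

Here the first command returns three lines and the second returns two, because the line for "Doctorow" only matches when the whole line is searched.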

Specify particular data based on field

The awk command splits each line into numbered fields that you reference as $1, $2, and so on, while $0 holds the entire current line (not the entire file). Each field number corresponds to a column, so printing it for every line gives you every value under that column's header. Fields are separated by whitespace by default; if your file is comma-separated, add a separator option such as -F','. In our example below (Figure C), $4 and $6 correspond to the "email_address" and "location_id" headers, so the syntax to display only those two columns would be:

awk '{print $4,$6}' filename.ext 

Figure C: awk displaying only the "email_address" and "location_id" columns.
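As a rough illustration of field selection on a comma-separated file, the sketch below assumes a made-up CSV whose fourth and sixth columns are email_address and location_id. The -F option and the NR variable are standard awk features; the data and variable names are hypothetical.

```shell
# Hypothetical CSV with a header row (sample data for illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
id,first_name,last_name,email_address,job_title,location_id
1,Ana,Ruiz,ana@example.com,Doctor,10
2,Ben,Okoro,ben@example.com,Nurse,20
EOF

# -F',' sets the field separator to a comma, so $4 and $6 are columns 4 and 6.
columns=$(awk -F',' '{print $4,$6}' "$sample")
echo "$columns"

# NR is the current line number; NR > 1 skips the header row.
data_only=$(awk -F',' 'NR > 1 {print $4,$6}' "$sample")
echo "$data_only"

rm -f "$sample"
```

The comma inside print separates the two values with a single space in the output. This simple separator does not handle quoted CSV values that contain commas.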

In my experience, I have found awk to be extremely versatile and useful beyond words, but not without its share of head-scratching moments. As with anything, I strongly recommend that you experiment with the command, read the manual pages, and test it against your scripts in a non-production environment until you get a stronger understanding of what the utility is capable of.
