JSON is one of the most widely used structured data formats today. Most programming languages have libraries to both parse and generate it, but what about us humans? What if you need to work with JSON blobs returned by curl, or a big config file, and you don’t want to mess around with writing a script? Jq is your tool.

Your First Filter

In a very basic example, let’s assume you want to grab the WHOIS information for an IP address, and since you’re feeling weird, you decide to use cURL to talk to ARIN’s WHOIS API:

❯ curl -H "Accept: application/json" http://whois.arin.net/rest/ip/8.8.8.8
{"net":{"@xmlns":{"ns3":"http:\/\/www.arin.net\/whoisrws\/netref\/v2","ns2":"http:\/\/www.arin.net\/whoisrws\/rdns\/v1","$":"http:\/\/www.arin.net\/whoisrws\/core\/v1"},"@copyrightNotice":"Copyright 1997-2023, American Registry for Internet Numbers, Ltd.","@inaccuracyReportUrl":"https:\/\/www.arin.net\/resources\/registry\/whois\/inaccuracy_reporting\/","@termsOfUse":"https:\/\/www.arin.net\/resources\/registry\/whois\/tou\/","registrationDate":{"$":"2014-03-14T16:52:05-04:00"},"rdapRef":{"$":"https:\/\/rdap.arin.net\/registry\/ip\/8.8.8.0"},"ref":{"$":"https:\/\/whois.arin.net\/rest\/net\/NET-8-8-8-0-1"},"endAddress":{"$":"8.8.8.255"},"handle":{"$":"NET-8-8-8-0-1"},"name":{"$":"LVLT-GOGL-8-8-8"},"netBlocks":{"netBlock":{"cidrLength":{"$":"24"},"endAddress":{"$":"8.8.8.255"},"description":{"$":"Reallocated"},"type":{"$":"A"},"startAddress":{"$":"8.8.8.0"}}},"resources":{"@copyrightNotice":"Copyright 1997-2023, American Registry for Internet Numbers, Ltd.","@inaccuracyReportUrl":"https:\/\/www.arin.net\/resources\/registry\/whois\/inaccuracy_reporting\/","@termsOfUse":"https:\/\/www.arin.net\/resources\/registry\/whois\/tou\/","limitExceeded":{"@limit":"256","$":"false"}},"orgRef":{"@handle":"GOGL","@name":"Google LLC","$":"https:\/\/whois.arin.net\/rest\/org\/GOGL"},"parentNetRef":{"@handle":"NET-8-0-0-0-1","@name":"LVLT-ORG-8-8","$":"https:\/\/whois.arin.net\/rest\/net\/NET-8-0-0-0-1"},"startAddress":{"$":"8.8.8.0"},"updateDate":{"$":"2014-03-14T16:52:05-04:00"},"version":{"$":"4"}}}

Well, that worked, but the output isn’t exactly easy to read. We’re going to write our first jq filter to clean the output up. The simplest filter is ., the identity filter, which stands for the entire input document, so we’re just going to query that.

❯ curl -H "Accept: application/json" http://whois.arin.net/rest/ip/8.8.8.8 | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1509    0  1509    0     0  13473      0 --:--:-- --:--:-- --:--:-- 13473
{
  "net": {
    "@xmlns": {
      "ns3": "http://www.arin.net/whoisrws/netref/v2",
      "ns2": "http://www.arin.net/whoisrws/rdns/v1",
      "$": "http://www.arin.net/whoisrws/core/v1"
    },
    "@copyrightNotice": "Copyright 1997-2023, American Registry for Internet Numbers, Ltd.",
    "@inaccuracyReportUrl": "https://www.arin.net/resources/registry/whois/inaccuracy_reporting/",
    "@termsOfUse": "https://www.arin.net/resources/registry/whois/tou/",
    "registrationDate": {
      "$": "2014-03-14T16:52:05-04:00"
    },
    "rdapRef": {
      "$": "https://rdap.arin.net/registry/ip/8.8.8.0"
    },
    "ref": {
      "$": "https://whois.arin.net/rest/net/NET-8-8-8-0-1"
    },
    "endAddress": {
      "$": "8.8.8.255"
    },
    "handle": {
      "$": "NET-8-8-8-0-1"
    },
    "name": {
      "$": "LVLT-GOGL-8-8-8"
    },

I’ve truncated the output in the name of good taste, but you can see that jq formats the output in a nicely indented, readable manner.
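
The progress meter in that output comes from curl rather than jq, and curl’s -s flag silences it. If you’d like to try the filter without hitting the network, here is a minimal sketch that pipes an inline document through jq. The doc variable is a made-up, trimmed-down stand-in for the ARIN response, not the real thing.

```shell
# Hypothetical, trimmed-down stand-in for the ARIN response shown above.
doc='{"net":{"handle":{"$":"NET-8-8-8-0-1"},"name":{"$":"LVLT-GOGL-8-8-8"}}}'

# The identity filter . passes the input through unchanged;
# jq pretty-prints it by default.
pretty=$(printf '%s' "$doc" | jq .)
printf '%s\n' "$pretty"

# -c asks for compact, single-line output instead.
compact=$(printf '%s' "$doc" | jq -c .)
printf '%s\n' "$compact"
```

The compact form is handy when you want one JSON document per line, for example when feeding the result to another tool.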

Filtering For Attributes

Next, we’ll experiment with drilling down into the orgRef attribute of the net attribute:

❯ curl -H "Accept: application/json" http://whois.arin.net/rest/ip/8.8.8.8 | jq .net.orgRef
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1509    0  1509    0     0   7984      0 --:--:-- --:--:-- --:--:--  7984
{
  "@handle": "GOGL",
  "@name": "Google LLC",
  "$": "https://whois.arin.net/rest/org/GOGL"
}
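
If you don’t know which attributes are available to drill into, the keys builtin lists them, and asking for an attribute that doesn’t exist yields null rather than an error. A small sketch, again using a hypothetical cut-down document in place of the live API (noSuchAttribute is an invented name):

```shell
# Hypothetical cut-down document; the real response has many more attributes.
doc='{"net":{"handle":{"$":"NET-8-8-8-0-1"},"orgRef":{"@handle":"GOGL","@name":"Google LLC"}}}'

# List the attribute names under .net (keys returns them sorted).
attrs=$(printf '%s' "$doc" | jq -c '.net | keys')
printf '%s\n' "$attrs"

# Drill down into one of them.
org=$(printf '%s' "$doc" | jq -c '.net.orgRef')
printf '%s\n' "$org"

# A path that does not exist produces null, not a failure.
missing=$(printf '%s' "$doc" | jq '.net.noSuchAttribute')
printf '%s\n' "$missing"
```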

Creating New JSON Documents

Jq isn’t limited to just filtering components of existing JSON documents. You can use it to create a new document by combining subcomponents of the document you are parsing. This also provides an example of how to handle special characters in key names.

❯ curl -H "Accept: application/json" http://whois.arin.net/rest/ip/8.8.8.8 | jq '{startAddr: .net.netBlocks.netBlock.startAddress."$", endAddr: .net.netBlocks.netBlock.endAddress."$", owner: .net.orgRef."@name"}'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1509    0  1509    0     0   8026      0 --:--:-- --:--:-- --:--:--  8026
{
  "startAddr": "8.8.8.0",
  "endAddr": "8.8.8.255",
  "owner": "Google LLC"
}
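
Two related details are worth a quick sketch. Keys with special characters can also be written in bracket form, .["$"], and the -r flag prints a string result without the surrounding JSON quotes, which is usually what you want in a shell variable. The doc value below is a hypothetical excerpt of the response, and the start and owner key names are just illustrative choices.

```shell
# Hypothetical excerpt of the ARIN response.
doc='{"net":{"startAddress":{"$":"8.8.8.0"},"orgRef":{"@name":"Google LLC"}}}'

# Bracket syntax is equivalent to the ."$" form used above.
start=$(printf '%s' "$doc" | jq -r '.net.startAddress["$"]')
owner=$(printf '%s' "$doc" | jq -r '.net.orgRef["@name"]')
printf '%s is registered to %s\n' "$start" "$owner"

# The same paths work inside an object constructor.
summary=$(printf '%s' "$doc" | jq -c '{start: .net.startAddress["$"], owner: .net.orgRef["@name"]}')
printf '%s\n' "$summary"
```

Note that the whole filter is wrapped in single quotes so the shell leaves the $ and the double quotes alone.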

Jq also supports array access. Because the ARIN API doesn’t have any endpoints that generate arrays, we’re moving on to https://api.punkapi.com/v2/beers. This API returns an array of beer recipes, and we’re going to experiment with filtering that output.

Array Access

❯ curl https://api.punkapi.com/v2/beers | jq '.[].name'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 46947    0 46947    0     0   254k      0 --:--:-- --:--:-- --:--:--  254k
"Buzz"
"Trashy Blonde"
"Berliner Weisse With Yuzu - B-Sides"
"Pilsen Lager"
"Avery Brown Dredge"
"Electric India"
"AB:12"
"Fake Lager"
"AB:07"
"Bramling X"
"Misspent Youth"
"Arcade Nation"
"Movember"
"Alpha Dog"
"Mixtape 8"
"Libertine Porter"
"AB:06"
"Russian Doll – India Pale Ale"
"Hello My Name Is Mette-Marit"
"Rabiator"
"Vice Bier"
"Devine Rebel (w/ Mikkeller)"
"Storm"
"The End Of History"
"Bad Pixie"

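You can also pick out specific elements by listing their indices, separated by commas. Here .[0,3] selects the first and fourth beers. (A true slice is written with a colon, as in .[0:3], and returns a sub-array of consecutive elements instead.)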

❯ curl https://api.punkapi.com/v2/beers | jq '.[0,3].name'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 46947    0 46947    0     0   309k      0 --:--:-- --:--:-- --:--:--  309k
"Buzz"
"Pilsen Lager"

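The comma and the colon mean different things inside the brackets, so here is a small offline sketch of both. The beers variable is a hypothetical four-element stand-in for the API response, using the first four names from the listing above.

```shell
# Hypothetical stand-in for the first four entries of the API response.
beers='[{"name":"Buzz"},{"name":"Trashy Blonde"},{"name":"Berliner Weisse With Yuzu - B-Sides"},{"name":"Pilsen Lager"}]'

# Comma: pick the elements at index 0 and index 3.
picked=$(printf '%s' "$beers" | jq -c '[.[0,3].name]')
printf '%s\n' "$picked"

# Colon: slice from index 0 up to, but not including, index 3.
sliced=$(printf '%s' "$beers" | jq -c '.[0:3] | map(.name)')
printf '%s\n' "$sliced"
```

The first filter yields two names, the second yields three.
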
As before, you can also create a new array from an old one:

❯ curl https://api.punkapi.com/v2/beers | jq '[.[0,3] | {name: .name, description: .description}]'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 46947    0 46947    0     0   322k      0 --:--:-- --:--:-- --:--:--  325k
[
  {
    "name": "Buzz",
    "description": "A light, crisp and bitter IPA brewed with English and American hops. A small batch brewed only once."
  },
  {
    "name": "Pilsen Lager",
    "description": "Our Unleash the Yeast series was an epic experiment into the differences in aroma and flavour provided by switching up your yeast. We brewed up a wort with a light caramel note and some toasty biscuit flavour, and hopped it with Amarillo and Centennial for a citrusy bitterness. Everything else is down to the yeast. Pilsner yeast ferments with no fruity esters or spicy phenols, although it can add a hint of butterscotch."
  }
]
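
When the new key has the same name as the old one, jq accepts a shorthand: {name, description} means the same as {name: .name, description: .description}. Combined with map, that reshapes every element of an array. A minimal sketch, assuming a hypothetical two-element array with placeholder descriptions and an extra id field that gets dropped:

```shell
# Hypothetical sample data; the descriptions are placeholders, not the real ones.
beers='[{"id":1,"name":"Buzz","description":"first"},{"id":2,"name":"Trashy Blonde","description":"second"}]'

# map applies the object constructor to each element and collects the results in a new array.
short=$(printf '%s' "$beers" | jq -c 'map({name, description})')
printf '%s\n' "$short"
```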

Yq

Jq is a great tool, but there are also lots of non-JSON documents floating around out on the internet. Fortunately, Andrey Kislyuk has written a jq wrapper called yq. It lets you work with YAML, XML, and TOML documents, running the jq queries you already know how to build. This means that you can easily filter the output of that legacy XML API or those huge YAML documents you have to deal with.