A better version of this blog piece is published in Toyota Connected India Medium Link
What will you see in this post?
- A brief introduction to JMESPath
- What is JSON? Where is it used?
- Is JMESPath the only way to parse JSON documents?
- How to install the JMESPath CLI tool jp
- What are the common JMESPath expressions?
- Application of JMESPath in the AWS CLI
  - We will take practical, tough AWS cases and use a combination of jp commands to parse the JSON outputs
Brief Introduction
In this article, we discuss how to use jmespath expressions to extract elements from JSON documents. The command-line tool jp and the Python module jmespath (import jmespath) are the two popular interfaces to jmespath. Beyond Python, jmespath libraries are also available for JavaScript, Ruby and Go.
In this article, we will learn jmespath expressions through the jp tool.
What is JSON?
- JSON (JavaScript Object Notation) is used everywhere. Typically, it is seen/used
  - as a data interchange format during development
  - as logs in JSON format
  - as configurations in JSON format
  - while transferring data between cloud serverless services
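What are the different JSON data types?
- JSON (JavaScript Object Notation) represents structured data as key-value pairs
- JSON data types (named here by their Python equivalents)
  - string: {"name":"Senthil"}
  - dictionary (a JSON object)
  - list (a JSON array)
  - float or int (a JSON number)
  - boolean
  - null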
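To make the type names above concrete, here is a minimal sketch using Python's standard json module. The sample document is made up for illustration (it is not from any real API); the sketch only shows which Python type each JSON value is decoded into.

```python
import json

# Hypothetical sample document holding one value of each JSON type
sample = (
    '{"name": "Senthil", "specs": {"ram_gb": 8}, "colors": ["black", "blue"], '
    '"price": 899.5, "stock": 12, "in_sale": true, "discount": null}'
)

parsed = json.loads(sample)

# Map each key to the name of the Python type its value was decoded into
type_names = {key: type(value).__name__ for key, value in parsed.items()}

for key, type_name in type_names.items():
    print(f"{key}: {type_name}")
```

Note that JSON itself has a single number type; the split into float and int comes from the Python decoder.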
Examples of Valid JSON
- A typical dictionary type
{"name":"Senthil"}
- A nested dictionary with list data type values
{
"Android Phones": [
[
{
"name": "Samsung Galaxy",
"price": 899
}
]
]
}
- A dictionary with a null value
{"name":null}
- Checking that the above is valid JSON
echo "{\"name\":null}" > string_json.json
python -c "import json; dict_list = json.load(open('string_json.json','r')); print(dict_list)"
{'name': None}
- A JSON document containing only a list of values is also valid
[
"iPhone",
"Samsung Galaxy",
"Google Pixel"
]
- Checking that the above is valid JSON
>> cat list_json.json
[
"iPhone",
"Samsung Galaxy",
"Google Pixel"
]
>> python -c "import json; dict_list = json.load(open('list_json.json','r')); print(dict_list)"
['iPhone', 'Samsung Galaxy', 'Google Pixel']
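The one-liners above can be wrapped into a small reusable check. This is only a sketch: is_valid_json is a helper name chosen here for illustration. It also shows that the text Python prints for a parsed document (single quotes, None) is a Python representation and is not itself valid JSON.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json('{"name": null}'))                                # a dictionary with null
print(is_valid_json('["iPhone", "Samsung Galaxy", "Google Pixel"]'))  # a bare list
print(is_valid_json("{'name': None}"))                                # Python repr, not JSON
```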
What is JMESPath?
- JMESPath stands for JSON Matching Expression Paths (source)
- JMESPath is a query expression language for searching in JSON documents
How to install jp
- If you use a Mac device
brew install jmespath/jmespath/jp
Please note that I have used a macOS terminal for all the examples below
- If you use Linux
sudo wget https://github.com/jmespath/jp/releases/latest/download/jp-linux-amd64 \
-O /usr/local/bin/jp && sudo chmod +x /usr/local/bin/jp
- If you use Windows
scoop install jp
Common JP Expressions
1. Simple Retrieval of Keys
echo '{"field_1":30, "field_2":50}' | jp 'field_2'
50
echo '{"field":{"sub_field":30}}' | jp 'field.sub_field'
30
- In the above examples, we have extracted a specific key by using the . operator, as in field.sub_field
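For readers who think in Python, the . operator behaves roughly like a chain of dictionary lookups. The sketch below is a plain-Python approximation and not the jmespath library; get_path is a hypothetical helper name. Like JMESPath, it gives back null (None) when a key is missing instead of raising an error.

```python
import json
from functools import reduce


def get_path(document, dotted_path: str):
    """Follow a dotted key path such as 'field.sub_field'; return None if a key is missing."""
    def step(node, key):
        # Only dictionaries can be descended into; anything else yields None
        return node.get(key) if isinstance(node, dict) else None

    return reduce(step, dotted_path.split("."), document)


flat_doc = json.loads('{"field_1": 30, "field_2": 50}')
nested_doc = json.loads('{"field": {"sub_field": 30}}')

print(get_path(flat_doc, "field_2"))            # same lookup as: jp 'field_2'
print(get_path(nested_doc, "field.sub_field"))  # same lookup as: jp 'field.sub_field'
print(get_path(nested_doc, "field.missing"))    # missing key
```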
2. Slicing an array or list type field
echo '{"field_1":30, "field_2":50, "field_3":[1,2,3]}' | jp 'field_3[*]'
[
1,
2,
3
]
echo '{"field_1":30, "field_2":50, "field_3":[1,2,3]}' | jp 'field_3[0]'
1
echo '{"field_1":30, "field_2":50, "field_3":[1,2,3]}' | jp '[field_3[0], field_3[2]]'
[
1,
3
]
- In the above examples, we have used syntax such as
  - [*] to extract all elements in an array
  - [field[index], field[another_index]] to extract specific indices of an array
3. Slicing an array of dictionaries products[{...}] to fetch one of the keys, name
cat data.json
{
"products": [
{
"name": "iPhone",
"price": 999
},
{
"name": "Samsung Galaxy",
"price": 899
},
{
"name": "Google Pixel",
"price": 799
},
{
"name": "OnePlus",
"price": 699
}
]
}
jp -f data.json 'products[*].name'
[
"iPhone",
"Samsung Galaxy",
"Google Pixel",
"OnePlus"
]
- In the above example, we have used [*] to look into all values in an array and then show only one field, name
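A rough plain-Python equivalent of products[*].name is a list comprehension. The sketch below inlines the same data as data.json and adds one extra, hypothetical entry without a name to show a detail of JMESPath projections: elements for which the field evaluates to null are left out of the result.

```python
data = {
    "products": [
        {"name": "iPhone", "price": 999},
        {"name": "Samsung Galaxy", "price": 899},
        {"name": "Google Pixel", "price": 799},
        {"name": "OnePlus", "price": 699},
        {"price": 599},  # hypothetical extra entry with no name, added for illustration
    ]
}

# Equivalent of products[*].name: take name from every element, skipping missing ones
names = [item["name"] for item in data["products"] if item.get("name") is not None]

print(names)
```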
4. Filtering based on condition
- Retrieve all values of a specific key name from an array products where price is greater than or equal to a specified value
jp -f data.json 'products[?price >= `799`].name'
[
"iPhone",
"Samsung Galaxy",
"Google Pixel"
]
- In the above example, we have used a condition on a field price to filter an array products and then display only the field name
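The filter reads naturally as "keep the elements where the condition holds, then pick a field". A minimal plain-Python sketch of the same logic, with the contents of data.json inlined and the threshold pulled out into a variable (min_price is a name chosen here):

```python
products = [
    {"name": "iPhone", "price": 999},
    {"name": "Samsung Galaxy", "price": 899},
    {"name": "Google Pixel", "price": 799},
    {"name": "OnePlus", "price": 699},
]

min_price = 799

# Equivalent of products[?price >= `799`].name: filter first, then project the name
matching_names = [p["name"] for p in products if p["price"] >= min_price]

print(matching_names)
```

The backticks in the jp expression mark 799 as a JSON literal (a number), which is why the comparison here is against an int and not a string.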
5. Retrieve multiple values and make a new json
jp -f data.json '{"AndroidPhones":products[?name != `"iPhone"`].[{"android_phone_name":name, "price":price}]}' > android_phones_data.json && cat android_phones_data.json
{
"AndroidPhones": [
[
{"android_phone_name": "Samsung Galaxy",
"price": 899
}
],
[
{"android_phone_name": "Google Pixel",
"price": 799
}
],
[
{"android_phone_name": "OnePlus",
"price": 699
}
]
]
}
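Notice that each phone in the output above sits inside its own one-element list. That comes from the square brackets around {...} in the expression: every matching product is turned into a list holding one dictionary. The plain-Python sketch below reproduces both that nested shape and the flat shape we would get, as far as I can tell, by dropping those square brackets (i.e. using .{...} in place of .[{...}]).

```python
products = [
    {"name": "iPhone", "price": 999},
    {"name": "Samsung Galaxy", "price": 899},
    {"name": "Google Pixel", "price": 799},
    {"name": "OnePlus", "price": 699},
]

non_iphones = [p for p in products if p["name"] != "iPhone"]

# Shape produced by .[{...}]: one single-element list per product
nested = {
    "AndroidPhones": [
        [{"android_phone_name": p["name"], "price": p["price"]}] for p in non_iphones
    ]
}

# Shape produced by .{...}: one dictionary per product, no inner lists
flat = {
    "AndroidPhones": [
        {"android_phone_name": p["name"], "price": p["price"]} for p in non_iphones
    ]
}

print(nested["AndroidPhones"][0])
print(flat["AndroidPhones"][0])
```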
6. Pipe Expressions
A similar result (a flat list of the non-iPhone products, with their original keys) can be produced with pipes, which let us build an expression in modular stages
jp -f data.json '{"Android Phones":products[?name != `"iPhone"`]} | "Android Phones"[*]'
[
{"name": "Samsung Galaxy",
"price": 899
},
{
"name": "Google Pixel",
"price": 799
},
{
"name": "OnePlus",
"price": 699
}
]
7. Built-in Functions
There are many built-in jmespath functions (refer here). Let us cover some of them; the rest follow a similar template.
A. sort_by, min_by, max_by
- Sort an array in ascending order
jp -f data.json 'products[*] | sort_by(@,&price)'
[
{"name": "OnePlus",
"price": 699
},
{
"name": "Google Pixel",
"price": 799
},
{
"name": "Samsung Galaxy",
"price": 899
},
{
"name": "iPhone",
"price": 999
}
]
Note:
- The &key_name syntax (an expression reference) is needed to refer to a key inside a built-in function such as sort_by
- Sort an array in descending order
jp -f data.json 'products | sort_by(@,&price) | reverse(@)'
[
{"name": "iPhone",
"price": 999
},
{
"name": "Samsung Galaxy",
"price": 899
},
{
"name": "Google Pixel",
"price": 799
},
{
"name": "OnePlus",
"price": 699
}
]
Note:
- @ refers to the current node, here the output of the previous stage of the pipe, which is passed into the next stage
- Maximum Element in an array
jp -f data.json 'products | max_by(@,&price)'
{
"name": "iPhone",
"price": 999
}
- Minimum Element in an array
jp -f data.json 'products | min_by(@,&price) | name'
"OnePlus"
jp -u -f data.json 'products | min_by(@,&price) | name'
OnePlus
- Pipe expressions are modular and easy to handle.
- Note the argument -u (unquoted) to get the string without quotes
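If the &price argument feels unusual, it may help to compare it with the key= argument of Python's sorted, max and min: in both cases we pass "how to get the value to compare" and not the value itself. A plain-Python sketch of the four queries above, with data.json inlined:

```python
products = [
    {"name": "iPhone", "price": 999},
    {"name": "Samsung Galaxy", "price": 899},
    {"name": "Google Pixel", "price": 799},
    {"name": "OnePlus", "price": 699},
]


def by_price(product):
    # Plays the role of &price: tells the function which value to compare
    return product["price"]


ascending = sorted(products, key=by_price)           # sort_by(@, &price)
descending = list(reversed(ascending))               # sort_by(@, &price) | reverse(@)
most_expensive = max(products, key=by_price)         # max_by(@, &price)
cheapest_name = min(products, key=by_price)["name"]  # min_by(@, &price) | name

print([p["name"] for p in ascending])
print([p["name"] for p in descending])
print(most_expensive)
print(cheapest_name)
```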
B. contains (official docs)
jp -u -f data.json 'products | contains([].name,`"OnePlus"`)'
true
- contains returns true or false; the simplest example, contains('foobar','bar'), will give true
jp -u -f data.json 'products[?contains(name, `"Plus"`)]'
[
{"name": "OnePlus",
"price": 699
}
]
- We can use contains to match a portion of text in a variable inside an array
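contains does two slightly different jobs in the examples above: membership in an array (is "OnePlus" one of the names?) and substring matching in a string (does the name include "Plus"?). Python's in operator covers both, so a plain-Python sketch of the two queries looks like this:

```python
products = [
    {"name": "iPhone", "price": 999},
    {"name": "Samsung Galaxy", "price": 899},
    {"name": "Google Pixel", "price": 799},
    {"name": "OnePlus", "price": 699},
]

# contains([].name, `"OnePlus"`): is the value an element of the array of names?
has_oneplus = "OnePlus" in [p["name"] for p in products]

# products[?contains(name, `"Plus"`)]: is the text a substring of each name?
plus_products = [p for p in products if "Plus" in p["name"]]

print(has_oneplus)
print(plus_products)
```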
C. join (official docs)
>> jp -f data.json 'products[*].name'
[
"iPhone",
"Samsung Galaxy",
"Google Pixel",
"OnePlus"
]
>> jp -f data.json 'join(`","`,products[*].name)'
"iPhone,Samsung Galaxy,Google Pixel,OnePlus"
>> jp -u -f data.json 'join(`","`,products[*].name)'
iPhone,Samsung Galaxy,Google Pixel,OnePlus
- You can use the -u argument when you want the output to be displayed as plain, unquoted strings instead of valid JSON
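The difference between the quoted and unquoted output is the difference between a JSON string and the raw text inside it. A short plain-Python sketch of the same join, printing both forms:

```python
import json

names = ["iPhone", "Samsung Galaxy", "Google Pixel", "OnePlus"]

# Same separator-first idea as join(`","`, products[*].name)
joined = ",".join(names)

as_json = json.dumps(joined)  # JSON string with surrounding quotes, like jp's default output
as_text = joined              # raw text, like jp -u

print(as_json)
print(as_text)
```

The unquoted form is usually what we want when feeding the value into another shell command.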
D. keys (official docs)
>> jp -f data.json 'keys(@)'
[
"products"
]
>> jp -f data.json 'products[0] | keys(@)'
[
"name",
"price"
]
8. Logical AND (&&) and OR (||)
jp -f data.json 'products[?(price > `699` && price < `999`)]'
[
{"name": "Samsung Galaxy",
"price": 899
},
{
"name": "Google Pixel",
"price": 799
}
]
jp -f data.json 'products[?(contains(name, `"Sam"`) || price < `899`)]'
[
{"name": "Samsung Galaxy",
"price": 899
},
{
"name": "Google Pixel",
"price": 799
},
{
"name": "OnePlus",
"price": 699
}
]
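The && and || operators map directly onto Python's and and or. A plain-Python sketch of the two filters above, with data.json inlined, which can be handy for double-checking which rows a combined condition should return:

```python
products = [
    {"name": "iPhone", "price": 999},
    {"name": "Samsung Galaxy", "price": 899},
    {"name": "Google Pixel", "price": 799},
    {"name": "OnePlus", "price": 699},
]

# products[?(price > `699` && price < `999`)]
mid_range = [p for p in products if p["price"] > 699 and p["price"] < 999]

# products[?(contains(name, `"Sam"`) || price < `899`)]
samsung_or_cheap = [p for p in products if "Sam" in p["name"] or p["price"] < 899]

print([p["name"] for p in mid_range])
print([p["name"] for p in samsung_or_cheap])
```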
Practical AWS Cases
1. Let us analyze an example of AWS CLI JSON output
aws lambda list-functions --output json > aws_example.json && cat aws_example.json
{
"Functions": [
{
"FunctionName": "my-function-1",
"FunctionArn": "arn:aws:lambda:us-east-1:1234567890:function:my-function-1",
"Runtime": "nodejs12.x",
"MemorySize": 128,
"Timeout": 3,
"LastModified": "2023-06-18T10:15:00Z"
},
{
"FunctionName": "my-function-2",
"FunctionArn": "arn:aws:lambda:us-east-1:1234567890:function:my-function-2",
"Runtime": "python3.8",
"MemorySize": 256,
"Timeout": 5,
"LastModified": "2023-06-17T14:30:00Z"
},
{
"FunctionName": "my-function-3",
"FunctionArn": "arn:aws:lambda:us-east-1:1234567890:function:my-function-3",
"Runtime": "java11",
"MemorySize": 512,
"Timeout": 10,
"LastModified": "2023-06-16T09:45:00Z"
}
]
}
Q1. Query all lambda functions running python
- Based on how you want to parse the output, you can have it as a list or just the first element by accessing [0]
jp -f aws_example.json 'Functions[?starts_with(Runtime,`"python"`)]'
[
{"FunctionArn": "arn:aws:lambda:us-east-1:1234567890:function:my-function-2",
"FunctionName": "my-function-2",
"LastModified": "2023-06-17T14:30:00Z",
"MemorySize": 256,
"Runtime": "python3.8",
"Timeout": 5
}
]
jp -f aws_example.json 'Functions[?starts_with(Runtime,`"python"`)] | [0]'
{
"FunctionArn": "arn:aws:lambda:us-east-1:1234567890:function:my-function-2",
"FunctionName": "my-function-2",
"LastModified": "2023-06-17T14:30:00Z",
"MemorySize": 256,
"Runtime": "python3.8",
"Timeout": 5
}
Q2. Query all lambda functions using more than 128 MB of memory
jp -f aws_example.json 'Functions[?MemorySize > `128`]'
[
{"FunctionArn": "arn:aws:lambda:us-east-1:1234567890:function:my-function-2",
"FunctionName": "my-function-2",
"LastModified": "2023-06-17T14:30:00Z",
"MemorySize": 256,
"Runtime": "python3.8",
"Timeout": 5
},
{
"FunctionArn": "arn:aws:lambda:us-east-1:1234567890:function:my-function-3",
"FunctionName": "my-function-3",
"LastModified": "2023-06-16T09:45:00Z",
"MemorySize": 512,
"Runtime": "java11",
"Timeout": 10
}
]
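A plain-Python sketch of Q1 and Q2 follows, using a trimmed copy of the sample output (only the three fields the queries need). One detail worth knowing when relying on | [0]: when nothing matches, JMESPath gives null for the first element, and the sketch mirrors that with a default of None. The ruby lookup is a made-up query added only to show that empty case.

```python
functions = [
    {"FunctionName": "my-function-1", "Runtime": "nodejs12.x", "MemorySize": 128},
    {"FunctionName": "my-function-2", "Runtime": "python3.8", "MemorySize": 256},
    {"FunctionName": "my-function-3", "Runtime": "java11", "MemorySize": 512},
]

# Q1: Functions[?starts_with(Runtime, `"python"`)]
python_functions = [f for f in functions if f["Runtime"].startswith("python")]

# Q1 with | [0]: first match, or None when the list is empty
first_python = next(iter(python_functions), None)
first_ruby = next((f for f in functions if f["Runtime"].startswith("ruby")), None)

# Q2: Functions[?MemorySize > `128`], keeping only the names for brevity
over_128_mb = [f["FunctionName"] for f in functions if f["MemorySize"] > 128]

print(first_python["FunctionName"])
print(first_ruby)
print(over_128_mb)
```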
2. Let us analyze a more complicated example from the official jmespath tutorial.
- It resembles a description of the state of EC2 instances
cat official_example_for_nested.json
{
"reservations": [
{
"instances": [
{"type": "small",
"state": {"name": "running"},
"tags": [{"Key": "Name",
"Values": ["Web"]},
{"Key": "version",
"Values": ["1"]}]},
{"type": "large",
"state": {"name": "stopped"},
"tags": [{"Key": "Name",
"Values": ["Web"]},
{"Key": "version",
"Values": ["1"]}]}
]
}, {
"instances": [
{"type": "medium",
"state": {"name": "terminated"},
"tags": [{"Key": "Name",
"Values": ["Web"]},
{"Key": "version",
"Values": ["1"]}]},
{"type": "xlarge",
"state": {"name": "running"},
"tags": [{"Key": "Name",
"Values": ["DB"]},
{"Key": "version",
"Values": ["1"]}]}
]
}
]
}
Q1. Find all instances that are running and give a count of them
jp -f official_example_for_nested.json 'reservations[].instances[?state.name == `"running"`][]'
[
{"state": {
"name": "running"
},
"tags": [
{
"Key": "Name",
"Values": [
"Web"
]
},
{
"Key": "version",
"Values": [
"1"
]
}
],
"type": "small"
},
{
"state": {
"name": "running"
},
"tags": [
{
"Key": "Name",
"Values": [
"DB"
]
},
{
"Key": "version",
"Values": [
"1"
]
}
],
"type": "xlarge"
}
]
jp -f official_example_for_nested.json 'length(reservations[].instances[?state.name == `"running"`][])'
2
Two instances are running
Note the [] at the end to flatten the list. A simpler example below:
echo "[[0,1],2,3,[4,5,6]]" | jp '[]'
[
0,
1,
2,
3,
4,
5,
6
]
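The flatten operator removes exactly one level of nesting: inner lists are merged into the outer list, while anything nested deeper stays as it is. Here is a plain-Python sketch of that behaviour; flatten_once is a helper name chosen here, and the second input is an extra example added to show the one-level limit.

```python
def flatten_once(items: list) -> list:
    """Merge the elements of any inner list into the outer list, one level only."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(item)  # inner list: splice its elements in
        else:
            flat.append(item)  # non-list value: keep as is
    return flat


print(flatten_once([[0, 1], 2, 3, [4, 5, 6]]))  # same input as the jp example
print(flatten_once([[1, [2, 3]], 4]))           # the deeper list [2, 3] survives
```

In the running-instances query, the filter gives one list per reservation, and the trailing [] merges those into a single list that length can count.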
Q2. Find the status of instances of type large or xlarge
jp -f official_example_for_nested.json 'reservations[].instances[?(type==`"xlarge"` || type==`"large"`)][]'
[
{"state": {
"name": "stopped"
},
"tags": [
{
"Key": "Name",
"Values": [
"Web"
]
},
{
"Key": "version",
"Values": [
"1"
]
}
],
"type": "large"
},
{
"state": {
"name": "running"
},
"tags": [
{
"Key": "Name",
"Values": [
"DB"
]
},
{
"Key": "version",
"Values": [
"1"
]
}
],
"type": "xlarge"
}
]
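The query above returns the whole instance objects, tags included. If only the status is of interest, it can be easier to reason about the logic in plain Python first. The sketch below uses a trimmed copy of the nested example (tags left out) and walks reservations and instances with two nested loops, which is what reservations[].instances[...][] amounts to; it also recomputes the running count from Q1.

```python
data = {
    "reservations": [
        {"instances": [
            {"type": "small", "state": {"name": "running"}},
            {"type": "large", "state": {"name": "stopped"}},
        ]},
        {"instances": [
            {"type": "medium", "state": {"name": "terminated"}},
            {"type": "xlarge", "state": {"name": "running"}},
        ]},
    ]
}

# All instances across all reservations, as one flat list
all_instances = [inst for res in data["reservations"] for inst in res["instances"]]

# Q1: count of running instances
running_count = sum(1 for inst in all_instances if inst["state"]["name"] == "running")

# Q2: type and status only, for large or xlarge instances
large_statuses = [
    {"type": inst["type"], "state": inst["state"]["name"]}
    for inst in all_instances
    if inst["type"] in ("large", "xlarge")
]

print(running_count)
print(large_statuses)
```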
Conclusion
- JMESPath is a great tool to have in your arsenal, especially if you are a heavy cloud user
  - All major cloud providers (AWS, Azure and Oracle Cloud) use jmespath. (Google Cloud has its own variation for parsing JSON that has a lot of similarities to jmespath)
- But in one of the examples, we used python dictionary parsing instead of jp. This is because the jp command would be hard to interpret and debug in tough cases. In those complicated cases, you could parse the JSON in your programming language of choice. The idea is to use jp commands intuitively and not overcomplicate them, so that they stay easy to maintain and interpret.
- There are some alternatives such as jq, JSONPath, etc.
  - jq is a feature-rich command-line JSON processor specifically designed for JSON manipulation
  - jmespath seems to be easier to adopt than jq, and jq is more feature-rich than jmespath | opinionated source
Good Sources
- JMESPath official page | source
- Official examples used | source1
- YCombinator discussion comparing JMESPath and jq | source
- Some sources of tough examples for jp
- Want to practice with a different example?
  - Use the data in the jq tutorial here: JQ Tutorial