Pivotal Knowledge Base

Follow

How to handle inconsistently structured JSON data in OQL

Applies to

GemFire 8 and later

Purpose

The purpose of this article is to show different ways to successfully query inconsistently structured data with OQL.

Description

We start of with an object with nested arrays whose elements we need to query:

{ "name": "Jane Doe", "addresses": [..., { "city": "New York" }, ...] }

To find people who have an address in New York, the following OQL can be used:

SELECT DISTINCT person FROM /people AS person, person.addresses AS address WHERE address.city = 'New York'

However, that query throws an error if we also have the following object in the same region:

{ "name": "John Doe" }

The error stems from the inclusion of person.addresses in the FROM clause because the FROM clause may only include paths that resolve to arrays.

So how can a query be written given that person.addresses may not exist or may not be an array?

A recommended way to handle this kind of situation is using the is_defined clause:

select distinct person from /jsonRegion person where is_defined( person.addresses) and 'New York' in (select a.city from person.addresses a)

So let's say our original object changes to one that has a deeper nesting like the following:

{ 
"name":"Jane Doe",
"addresses":[
{
"phoneNumbers":[
{
"number":"123-456-7891"
},
{
"number":"987-654-4321"
}
]
},
{
"city":"New York"
}
]
}

How do we query for all people who have a phone number "123-456-7891"?

This query also takes into consideration when addresses doesn't exist:

select person.name from (select * from /jsonRegion jr where is_defined(jr.addresses)) person, (select * from person.addresses) a where is_defined(a.phoneNumbers) and '123-456-7891' in (select n.number from a.phoneNumbers n)

Comments

Powered by Zendesk