The Best of Both Worlds

JSONB can handle it

If you want a relational data structure with the the flexibility of storing schema-less data as needed, JSONB in PostgreSQL 9.4 may be the perfect solution.

Prior to the release of PostgrSQL 9.4 (Dec. 2014), deciding whether to go with a SQL or NoSQL database was a critical decision that was difficult to reverse once implemented. Now you can store data with a JSONB data type, which is a significant upgrade from the JSON data type first released in PostgreSQL 9.2 (Sept. 2012). With the introduction of JSON data types, it became convenient to store data as key/value pairs. Using the HSTORE data type is another option that actually performs better than JSON, but nesting data is not natively possible. The best of both worlds seems to be the JSONB data type. You can store deeply nested data and use the binary component to index for better performance. Using JSONB allows your data to remain flexible without predefined attributes and without compromising on performance.

Use Case:

We were given the task of saving data from an excel workbook with multiple sheets in a database. Once we realized the column headers would potentially change, we needed to find a way to store dynamic attributes. Using PostgreSQL 9.4, we created a model that stored the file name and sheet name. Then we created a record for each row that was stored as a JSONB data type. This way the dynamic data (an entire row from a sheet) is always aware of its file and sheet which are required fields.

Querying Examples

Example 1:

In our case, we were using Ruby on Rails. For a query on a dynamic attribute, you need to state the JSONB field name and cast the key to a type and then look for match. Like this...

Entry.where("data ->> 'district'::text = 'Northeast'")

Example 2:

We also wanted to be able to find records that matched value ranges from a specific sheet (static attribute) and region (dynamic attribute). We were able to query with a WHERE clause like this...

"sheet = 'Overview' AND
data ->> 'Region'::text = 'Central' AND
(data ->> '2015 Total Population')::int > 100000 AND
(data ->> '2015 Total Population')::int < 200000"

To search if a json column does or does not exist. These work...

Entry.where("(data -> 'updated values') is null")
Entry.where("(data -> 'updated values') is not null")

To search a json column for a value with a single quote, encapsulate your query string with $$. For example...

Entry.where("(data ->> 'location') = $$Martha's Vineyard$$")

Although, data types were cast in the query for comparison, JSONB preserves data types. So you can retrieve the type this way...['Region'].class

To search a value nested inside of a json hash, use a containment query...

Sheet.where("data @> '#{ { row_1: {col_1: "TPS Report:"} }.to_json}'")

Concerns put to rest

We we were concerned about scale. How would this hold up to 100k's of data? We tested it and it works well. Another initial concern was query syntax — it’s different than standard RDBMS but not too different. It's also supported by RDS. So for us, it was a good fit.

Jul. 15 2015