Mysql left join json array

My article “SQL vs NoSQL: The Differences” noted that the line between SQL and NoSQL databases has become increasingly blurred, with each camp adopting features from the other. MySQL 5.7+ InnoDB databases and PostgreSQL 9.2+ both directly support JSON document types in a single field. In this article, we’ll examine the MySQL 8.0 JSON implementation in more detail.

Note that any database will accept JSON documents as a single string blob. However, MySQL and PostgreSQL support validated JSON data in real key/value pairs rather than a basic string.

Just Because You Can Store JSON …

… it doesn’t follow you should.

Normalization is a technique used to optimize the database structure. The First Normal Form (1NF) rule governs that every column should hold a single value — which is clearly broken by storing multi-value JSON documents.

If you have clear relational data requirements, use appropriate single-value fields. JSON should be used sparingly as a last resort. JSON value fields can’t be indexed, so avoid using it on columns which are updated or searched regularly. In addition, fewer client applications support JSON and the technology is newer, so it could be less stable than other types.

That said, there are good JSON use cases for sparsely populated data or custom attributes.

Create a Table With a JSON Field

Consider a shop selling books. All books have an ID, ISBN, title, publisher, number of pages and other clear relational data. Presume you want to add any number of category tags to each book. You could achieve this in SQL using:

  1. a tag table which stored each tag name with a unique ID, and
  2. a tagmap table with many-to-many records mapping book IDs to tag IDs

It’ll work, but it’s cumbersome and considerable effort for a minor feature. Therefore, you can define a tags JSON field in your MySQL database’s book table:

CREATE TABLE `book` (
  `id` MEDIUMINT(8) UNSIGNED NOT NULL AUTO_INCREMENT,
  `title` VARCHAR(200) NOT NULL,
  `tags` JSON DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=INNODB;

Note that JSON columns can’t have a default value, be used as a primary key, be used as a foreign key, or have an index. You can create secondary indexes on generated virtual columns, but it’s easier and more practical to retain a value in a separate field if indexes are required.

Adding JSON Data

Whole JSON documents can be passed in INSERT or UPDATE statements. For example, our book tags can be passed as an array (inside a string):

INSERT INTO `book` (`title`, `tags`)
VALUES (
  'ECMAScript 2015: A SitePoint Anthology',
  '["JavaScript", "ES2015", "JSON"]'
);

JSON can also be created with these:

  • JSON_ARRAY() function, which creates arrays. For example:

    -- returns [1, 2, "abc"]:
    SELECT JSON_ARRAY(1, 2, 'abc');
    
  • JSON_OBJECT() function, which creates objects. For example:

    -- returns {"a": 1, "b": 2}:
    SELECT JSON_OBJECT('a', 1, 'b', 2);
    
  • JSON_QUOTE() function, which quotes a string as a JSON value. For example:

    -- returns "[1, 2, \"abc\"]":
    SELECT JSON_QUOTE('[1, 2, "abc"]');
    
  • or you can (CAST anyValue AS JSON).

The JSON_TYPE() function allows you to check JSON value types. It should return OBJECT, ARRAY, a scalar type (INTEGER, BOOLEAN, etc), NULL, or an error. For example:

-- returns ARRAY:
SELECT JSON_TYPE('[1, 2, "abc"]');

-- returns OBJECT:
SELECT JSON_TYPE('{"a": 1, "b": 2}');

-- returns an error:
SELECT JSON_TYPE('{"a": 1, "b": 2');

The JSON_VALID() function returns 1 if the JSON is valid or 0 otherwise:

-- returns 1:
SELECT JSON_TYPE('[1, 2, "abc"]');

-- returns 1:
SELECT JSON_TYPE('{"a": 1, "b": 2}');

-- returns 0:
SELECT JSON_TYPE('{"a": 1, "b": 2');

Attempting to insert an invalid JSON document will raise an error and the whole record will not be inserted/updated.

Searching JSON Data

The JSON_CONTAINS() function accepts the JSON document being searched and another to compare against. It returns 1 when a match is found. For example:

-- all books with the 'JavaScript' tag:
SELECT * FROM `book` WHERE JSON_CONTAINS(tags, '["JavaScript"]');

The similar JSON_SEARCH() function returns the path to the given match or NULL when there’s no match. It’s passed the JSON document being searched, 'one' to find the first match, or 'all' to find all matches, and a search string (where % matches any number of characters and _ matches one character in an identical way to LIKE). For example:

-- all books with tags starting 'Java':
SELECT * FROM `book` WHERE JSON_SEARCH(tags, 'one', 'Java%') IS NOT NULL;

JSON Paths

A JSON path targets values and can be used to extract or modify parts of a JSON document. The JSON_EXTRACT() function demonstrates this by extracting one or more values:

-- returns "SitePoint":
SELECT JSON_EXTRACT('{"id": 1, "website": "SitePoint"}', '$.website');

All path definitions start with a $ followed by other selectors:

  • a period followed by a name, such as $.website
  • [N] where N is the position in a zero-indexed array
  • the .[*] wildcard evaluates all members of an object
  • the [*] wildcard evaluates all members of an array
  • the prefix**suffix wildcard evaluates to all paths that begin with the named prefix and end with the named suffix

The following examples refer to the following JSON document:

{
  "a": 1,
  "b": 2,
  "c": [3, 4],
  "d": {
    "e": 5,
    "f": 6
  }
}

Example paths:

  • $.a returns 1
  • $.c returns [3, 4]
  • $.c[1] returns 4
  • $.d.e returns 5
  • $**.e returns [5]

You could extract the name and first tag of your book table using the query:

SELECT
  title, tags->"$[0]" AS `tag1`
FROM `book`;

For a more complex example, presume you have a user table with JSON profile data. For example:

idnameprofile
1 Craig { “email”: [“”, “”], “twitter”: “@craigbuckler” }
2 SitePoint { “email”: [], “twitter”: “@sitepointdotcom” }

You can extract the Twitter name using a JSON path. For example:

SELECT
  name, profile->"$.twitter" AS `twitter`
FROM `user`;

You could use a JSON path in the WHERE clause to only return users with a Twitter account:

SELECT
  name, profile->"$.twitter" AS `twitter`
FROM `user`
WHERE
  profile->"$.twitter" IS NOT NULL;

Modifying Part of a JSON Document

There are several MySQL functions to modify parts of a JSON document using path notation. These include:

  • JSON_SET(doc, path, val[, path, val]...): inserts or updates data in the document
  • JSON_INSERT(doc, path, val[, path, val]...): inserts data into the document
  • JSON_REPLACE(doc, path, val[, path, val]...): replaces data in the document
  • JSON_MERGE(doc, doc[, doc]...): merges two or more document
  • JSON_ARRAY_APPEND(doc, path, val[, path, val]...): appends values to the end of an array
  • JSON_ARRAY_INSERT(doc, path, val[, path, val]...): inserts an array within the document
  • JSON_REMOVE(doc, path[, path]...): removes data from the document

You can therefore add a “technical” tag to any book which already has a “JavaScript” tag:

UPDATE `book`
  SET tags = JSON_MERGE(tags, '["technical"]')
WHERE
  JSON_SEARCH(tags, 'one', 'JavaScript') IS NOT NULL;

Further Information

The MySQL manual provides further information about the JSON data type and the associated JSON functions.

Again, I urge you not to use JSON unless it’s absolutely necessary. You could emulate an entire document-oriented NoSQL database in MySQL, but it would negate many benefits of SQL, and you may as well switch to a real NoSQL system! That said, JSON data types might save effort for more obscure data requirements within an SQL application.