Blogs

Data Transmission for Dynamic Databases

Why database administrators need to pay attention to JSON and NoSQL databases

On the heels of the release of IBM® Informix® Version 12.10xC3 database, references to JavaScript Object Notation (JSON) and NoSQL have been getting a lot of buzz. As a result, database administrators (DBAs) are asking the following questions: “Why do I even want to deal with unstructured data?” and “What is JSON?” Actually, DBAs should care about unstructured data, but they also need to understand when JSON and NoSQL can be useful additions to an application or data store.

XML’s heir

JSON is a standard key/value pair data format for storing and transmitting data in a consistent manner that is easy to read and write, both for humans and for applications. It has largely overtaken XML as the standard for data transmission for dynamic web pages, and is natively supported in most standard programming languages or with commonly available libraries. It has, along with the binary version—Binary JSON (BSON)—become one of the more common data encoding types used for primary record storage in NoSQL databases.

As business application development increases and becomes more web-focused and distributed, new ways to easily package and transmit data have become essential. That requirement may be fine and dandy as far as many DBAs are concerned, but many of them may think they don’t see any change that requires anything other than a standard relational database—which is a perfectly valid thought. The problem is that many application developers these days are moving so quickly and the requirements are changing so rapidly that they can’t always wait for schema changes to make it through the pipe. They need to store their new data structures immediately.   Advanced Data Tools
 

 
In addition, the ability to store JSON data natively means that applications can serve up data directly from the database rather than needing to go through some intermediary application that creates JSON data from relational tables. Also, one of the benefits of JSON is that it can be far more efficient in terms of storage and the SQL needed. The concept of NoSQL is that applications will receive a single record for what they need.

NoSQL’s efficiency

Think about a website designed for people to shop for and buy a car. There are a lot of features that will be consistent among all cars. Attributes such as model, horsepower, and torque. But what about all the extras such as four-wheel drive, all-wheel drive, automatic cup holders, eight possibilities for types of back-up sensors, and dozens of trim levels, not to mention electric or hybrid cars and their differences? A large number of lookup tables and quite a few joins, or just one massively wide row, would be necessary to handle the tracking of all those different variations.

Moreover, every time a new feature is added, the developer would need to adjust the schema or add in new records to the lookup tables. With change management systems that can take days to make schema adjustments, an organization may not be able to afford such a delay. In a NoSQL—JSON—table, the programmer just adds in the new feature as an attribute to those cars for which it is relevant, and the DBA doesn’t need to do anything—no change to the schema is necessary. All the programmer has to do is return the single row that describes the car with all its attributes and display it however he or she wants it to appear on the site.

On the other hand, there are plenty of cases in which people will be storing only six possible attributes for a row, and there may be tens of millions of records. Trying to sort and search through those records in a JSON table is going to be tedious and time-intensive as opposed to using a traditional relational database management system (RDBMS). There are also many cases in which people shoehorn relational data into a nonstructured database, but that practice is wasteful and slow. Consider an example for the car shopping website that demonstrates a good use of JSON:
 

{"model":"Sports Car",
"make":"Model B",
"BHP":"638",
"Special_cash_back":"$3,000",
"CeramicBrakes":"Yes",
"CarID":123643634,
"CaliperColor":"Red"
}
{"model":"Electric Car A",
"make":"Model Z",
"BHP":"248",
"DealerRestrictions":"Can't buy in showroom",
"DefaultEngine":"3-phase, 4-pole, induction electric motor",
"CarID":1236415145,
"BatteryCount": 6,831,
"RangePerCharge":"58 Miles"
}

Now take a look at a traditional relational record expressed in JSON that offers a less-efficient use of the NoSQL methodology because every record in the table has the same fields:

{"First_Name":"Tom",
"Last_Name":"Beebe",
"City":"Annandale",
"State":"Virginia",
"Email":"tom@advancedatatools.com"
}

The DBA’s responsibility

When it comes to JSON and NoSQL database administration, the trick is finding the right tool for the job. A DBA’s job is to figure out the correct way to tell the programmers to manage the data, and then the DBA needs to work with the programmers to find the best balance between development speed and flexibility with data integrity and consistency.

Please share any thoughts or questions in the comments.