On Relations and Relationships: SQL and the Rise of Relational Database Technology

E.F. (“Ted”) Codd conceived of his relational model for databases while working at IBM in 1969. Codd’s approach took a cue from first-order predicate logic, the basis of a large number of other mathematical systems, and was presented in terms of set theory, leaving physical representation and access to the implementer. In June of 1970, Codd laid down much of the groundwork for the model in his article “A Relational Model of Data for Large Shared Data Banks,” published in the Communications of the ACM, a highly regarded professional journal published by the Association for Computing Machinery. Over the next few years Codd and his relational ideas blazed across the academic computing landscape.

As noted by Fabian Pascal:

  • formally a relation is a special set of tuples, representing propositions about the real world.
  • informally a relational table can be viewed as representing an “entity type” with rows representing “entities” of that type.

But note carefully that:

  • “entity” has no precise, formal definition
  • “relationship” can and should be regarded as a special case of entity
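
The formal view above can be made concrete with a small sketch. This is only an illustration in Python, not anything taken from Pascal or Codd: the `Employee` and `Assignment` types, their attributes, and the `restrict` and `project` helpers are all hypothetical names chosen for the example. It shows a relation as a set of tuples (so duplicates collapse and order means nothing) and a “relationship” modeled as just another relation.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    # Hypothetical entity type; each tuple asserts "employee emp_id is named name".
    emp_id: int
    name: str


class Assignment(NamedTuple):
    # A "relationship" between employees and projects, modeled as an ordinary relation.
    emp_id: int
    project: str


# A relation is a set of tuples: stating the same proposition twice adds nothing.
employees = {
    Employee(1, "Ada"),
    Employee(2, "Grace"),
    Employee(1, "Ada"),  # duplicate proposition, collapses in the set
}

assignments = {
    Assignment(1, "Compiler"),
    Assignment(2, "Compiler"),
    Assignment(2, "Scheduler"),
}


def restrict(relation, predicate):
    """Keep only the tuples for which the predicate holds."""
    return {t for t in relation if predicate(t)}


def project(relation, *attrs):
    """Keep only the named attributes; the result is again a set of tuples."""
    return {tuple(getattr(t, a) for a in attrs) for t in relation}


compiler_staff = restrict(assignments, lambda t: t.project == "Compiler")
project_names = project(assignments, "project")
```

Because the results of `restrict` and `project` are themselves sets of tuples, the same operations apply equally to the entity-like relation and the relationship-like one, which is the sense in which a relationship is a special case of entity.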

To support his relational theory, Codd developed a language called ALPHA, which he often used to communicate ideas in an academic context. However, in 1974 and 1975 Raymond Boyce and Don Chamberlin of IBM designed a new fourth-generation language, known as Structured English Query Language, or SEQUEL, to extract information from systems based on Codd’s relational model. The name would later be shortened to SQL, though it is still often pronounced “sequel.” For better or worse, SQL has become popularly known as the relational language, much to the chagrin of certain luminaries of the database world such as Fabian Pascal, C.J. Date and Codd himself. Although SQL was an obvious improvement over many earlier quasi-languages used to perform database queries, like those of CODASYL (which required complex code to answer even the simplest of questions), it is not a fully relational or declarative language. SQL does, more or less, allow users to specify the results they want rather than the physical location of the data, but there are significant procedural elements in the language. Ideally, a purely declarative relational language would entirely shield the user from having to figure out the best way to execute the program. As it is, today’s SQL databases will still show wildly divergent execution times for different expressions of the same logic. Nonetheless, its relative scope and elegance have made SQL the benchmark database query language, and it has become almost synonymous with the relational model.
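
The point about “different expressions of the same logic” can be illustrated with a minimal sketch. It assumes SQLite, accessed through Python’s standard `sqlite3` module; the `dept` and `emp` tables, their columns, and the sample rows are hypothetical. The two queries ask the same question, once with a join and once with a subquery, and return the same rows. Whether the engine executes them the same way depends on its optimizer, so the sketch only retrieves the plans for inspection and makes no claim about what they contain.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE emp (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES dept(dept_id)
    );
    INSERT INTO dept VALUES (1, 'Research'), (2, 'Systems');
    INSERT INTO emp VALUES (10, 'Ada', 1), (11, 'Grace', 2), (12, 'Edgar', 1);
""")

# Formulation 1: join the two tables and filter on the department name.
with_join = """
    SELECT e.name
    FROM emp AS e
    JOIN dept AS d ON d.dept_id = e.dept_id
    WHERE d.name = 'Research'
    ORDER BY e.name
"""

# Formulation 2: the same question phrased with a subquery.
with_subquery = """
    SELECT name
    FROM emp
    WHERE dept_id IN (SELECT dept_id FROM dept WHERE name = 'Research')
    ORDER BY name
"""

join_rows = conn.execute(with_join).fetchall()
subquery_rows = conn.execute(with_subquery).fetchall()


def plan(sql):
    """Return the engine's description of how it intends to run the query."""
    # The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


join_plan = plan(with_join)
subquery_plan = plan(with_subquery)
```

Comparing `join_plan` with `subquery_plan` on a real schema with realistic data volumes is one way to see whether a given engine treats two logically equivalent statements differently.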

Despite all the excitement, it was not until 1979 that the first commercial database product to use SQL was released by Oracle, only two years after its founding. This offering was quickly followed by IBM’s SQL/DS product, the forerunner to DB2. By the mid-1980s the relational bandwagon was definitely getting crowded, with new companies hawking all sorts of “relational” wares. Not only were DB2 and Oracle significant players in the market, but there were also Digital Equipment Corporation’s RDB and Relational Technology’s Ingres, along with a host of lesser-known products. Codd had extended his model further in his aptly titled paper “Extending the Database Relational Model to Capture More Meaning,” published in the December 1979 issue of the ACM Transactions on Database Systems. However, as the marketing departments of commercial database companies beat ever more loudly on the relational drum, Codd became increasingly distressed over what he saw as the unfulfilled promise of relational technology. In 1985, Codd, now president of the Relational Institute and with his own consultancy, put forth 12 basic rules plus nine structural, 18 manipulative and three integrity rules, all of which had to be satisfied for a database to be considered fully relational. More rules would be forthcoming, but Codd assured readers that the current rules would be more than adequate to ensure that a database was “mid-80s” fully relational. In presenting these rules, Codd also clearly demonstrated that no vendor could honestly profess to have a fully relational system. He took the entire industry to task for overstating its conformance to the relational model, and he offered a few scathing criticisms of the then-current draft of the first ANSI SQL standard as well.

In 1989 Codd published his promised revision of the relational model in the book “The Relational Model for Database Management: Version 2.” Needless to say, most relational database vendors fared even worse in Codd’s 1989 relational fidelity tests than they did in his mid-80s tests.

Simplicity was a major objective of Codd’s model. Unfortunately, given the depth and complexity of Codd’s thought, not to mention the arcane mathematical terms in which he often expressed himself, many of his key points continue to be misunderstood by practitioners.