What is a database? What are the kinds of databases? Why do you care? A Database is just a place that is used to so store and organize data (information). Your address book, a spreadsheet, basically anything that has a lot of similar (like) elements, is the basics of a database. In computers, they're used all over the place. And there are so many different ways of organizing that data, that we invent a whole lingo franca around database terminology. This tries to demystify some of the basic terms.
- 1 Kinds of DB's
- 2 Indexing
- 3 SQL
- 4 Relating Tables
- 5 Reporting
- 6 Conclusion
- 7 References
Kinds of DB's
There is a lot of jargon and a whole lingo to Databases. Yet the concepts are pretty easy.
What is a Database?
It is just a place that is used to so store and organize data (information). In computers, usually this is a special program to organize data, and your data itself. But in the real world, it is often just a book or ledger; like phone book, check book and so on.
What are the kinds of Databases?
There are a few kinds of databases; flat, relational, and object oriented.
A Database is just a place that is used to so store and organize data (information). In computers these are programs that go by the catchy acronym; DBMS (Database Management Systems). The data is arranged in tables with columns of attributes and rows of information. Sometimes there are multiple database tables that relate (point) to each others information. These go by the even catchier acronym of RDMBS (Relational Database Management Systems).
What is a flat database
A flat database is usually just one big "table"; like a ledger or checkbook. Databases are usually arranged this same way. There are rows and columns.
You have as many columns (attributes) for a row (entry) as you need to clearly organize your data. A column (or field), is where you break down "attributes" for each entry, like description in one column, check number in another, and something like a deposit or debit in a third column. A separate row (or record), is used for each entry. That way you can look up each entry as a separate line (or unique item) in your ledger.
What is a Relational database
Relational databases just take this to the next level. Instead of a single table or ledger, you have many tables (ledgers) that relate (or have a relationship) to each other.
Imagine your checkbook as our database. Normally when you write a check, you have to write down all the information (Name, Address, Phone#), every time you write a check to them. This means you are frequently re-entering the same information; since many of your checks go to the same places each month.
If you have a separate ledger (address book) of the companies that you commonly deal with, and that has their name, address and phone# in it, then when you write a check in your checkbook, you could just enter the row# (ID number) of that company, instead of writing out all that information each time. Your check ledger (table) has a "relationship" with your address ledger. That's a relational database.
This design can save you time in data entry since you don't have to reenter everything each time. And it can save you storage space since that one reference to the address book is "smaller" than adding all that redundant information each and every time. And lastly, when you make a correction to something in your address book, like they change their phone number, then instantly it is fixed for the entire checkbook; all the checks that point to (reference) that address will get the reflected correction. If you have that "flat file" way, and the information in every entry, then a change to something like the phone# would require you to correct it in every place that item was entered. So changes are easier and work better in relational databases, and you get less degradation over time.
What is a Object-Oriented Database
Object-Oriented Database take these concepts a little further still. Instead of breaking down tables as just relationships, there is often a new type of relationship; inheritance. The idea being that some types of data really borrow (inherit) most of their behaviors from their parents, for example; a business address and a home address are both addresses (and could inherit from that base type) but there are some slight differences; like business addresses may have the company as well as the addressee, a mail station, department, or office number, and people use a full 9 digit postal code, and so on. But most of the information structure (name, street address, street name, city, state, etc.) is the same.
When you inherit, again it can make some things easier to maintain. If I make a change to one "parent", like make the change in structure from having a 5 digit postal code to allowing the longer 9 digit variant, than everyone that is a "child" (inherits from) that parent, also inherits that new behavior. This makes it easier, in some cases, to maintain and manage that data.
The problem is that inheritance in Object-Oriented database can make design more complex; and more people understand simpler relationships, and those relational databases have been around longer. So most databases are not really Object-Oriented. And Object-Oriented is also a bit of a design philosophy, so can be applied to other databases as well. So it is a blurry line.
That is it for the basic kinds of database.
Data by itself has limited value; the important part is getting out the information that you need, when you need it.
Imagine your address book or the phone book as a database (because it is one). By itself it has some value, you can look up names, as long as you know how to spell the name (as it is in alphabetical order). That natural order of the table (data) is known as the index; it is indexed by name (alphabetically).
What if you want to look up someone by address or phone number? Normally you need another book that has things "sorted" (indexed) by address, or you have to just manually look through every entry until you find an address that matches. Having to manually look at ever element to see if one matches is not a pleasant thought, as this takes a lot of time. And this is true of computers -- so if you're doing a lot of reverse lookups, it sometimes helps to have another copy of the table (or at least indexes), that is indexed (sorted) in the order you want to find things.
Now in databases, you can tell the computer to "re-sort" or "re-index" the database; but this can take lots of time or memory, especially if you want to keep both versions of the data around. (Some want to use the sort by name index, others want to use the sort by number index).
Database designers have a complex problem; how do you make queries (ask complex questions) of your database that your database programmers hadn't even thought of... and get the answers you want back?
In the end, they had to create a whole language for databases so that you could make those queries. The most popular language for doing that is called SQL (Structured Query Language).
What SQL does is allow programmers to ask a database a complex question like; find me everyone that lives at an address that contains the number, street, and city that I request. Or in more geeky terms, you can make queries across multiple columns (attributes), and get some useful rows (results) back. This allows you to mine the database (extract the information you want).
SQL quickly took on more responsibilities, and actually became a language to not only help do queries (searches), but to also add, edit, or create database. SQL is now the defacto language for most database. And of course, there are many flavors and interpretations (dialects) of that language, just to keep things complex and confusing.
Databases often break down data into many tables; and these tables relate to each other. So I can have one table of my products (inventory), another that is the raw materials (or sub parts) that go to make a part, another table that is my customers, and yet another that is my sales.
When a customer buys an end product, I probably need to update many tables; I need to add the customer (if not in there), add a sale item, and reduce the inventory by that one part. That simple action (buying something) triggers many smaller actions; that we may want to group into something we call a "transaction". Or a transaction is just an event that triggers (and is made up of) many smaller actions.
If the customer changes their mind part way through the sale (or the database update), we want to be able to back-out of the whole transaction; because we don't want one table to reflect that there was a sale, but the inventory table to not be updated, or vise versa; the whole transaction needs to work together.
Another potential gain of the database is reporting. A report is often just a bunch of complex queries, with some math, and outputting that in some format that makes sense. So I could do something like look through all my inventory, and see what I have fewer than some number of, to know to start making or ordering more of those widgets before I run out. Or another possibility is to look at all the things sold, calculate how much raw materials was required to make those items, and then order more when we run low. Reports are really what we want to get most of time; what can be very complex queries and calculations, put in a standardized and simple and digestible format.
So while the concepts of Database are simple (just ledgers, or spreadsheets), there are a lot of concepts as these ideas have been maturing for centuries before there was computers to help manage them. Computers just make it easier.
The toughest part of most databases is the data is changing constantly (and growing), and the more data you have, the more you want to explore different aspects of it, and how one thing relates to the others. So quickly, things can grow in complexity to the point where you need a special person just to manage the arrangement, maintenance and mining (reporting) on that data. This specialist is known as a DBA (Database Administrator).
And now you know who to bug if you have any more questions about your databases.