New York
home > Database Email

Working with data - databases.
On this page: Other pages:
* Intro
* client-server
* olap oltp
* databases
* tools
* old, mainframe
* Teradata
* Oracle
* DB2
* Sybase
* Microsoft
* mysql
Intorduction home - top of the page - email

Database usually is a collection of tables (and tables, and tables, and tables) - which constitute "relational" database (relationships between tables).  There are also alternative approaches, for example  uses objects instead of tables.

* - databases on Google.
* - databases on Google.
* - portal on DB technologes
*,281922,,00.html - some good articles

Most databases are relationaland and understand standard language - SQL (Structured Query Language). SQL queries let you select, insert, update, find out the location of data, and so forth. See more at sql.html.

Here is the timeline (shows only some databases)
1970 - Dr. E.F. Codd formulated concept of a relational database. ( explained here)
1976 - IBM developed a prototype relational database management system and language (System/R).
1977 - Larry Ellison founded Oracle.
1979 - Oracle Corporation introduced the first commercial SQL database (written in assembler for PDP-11). Teradata is founded.
1983 - IBM released the first version of DB2 (for mainframe - MVS).
1987 - Sybase releases its first commercial DBMS product ("DataServer") for Sun workstations (unix).
    Sybase partner with Microsoft (divide the market - MS - on Windows & OS/2, Sybase - on Unix & VMS).
    Microsoft also parners with Ashton-Tate (DBase).
1989 - Microsoft SQL Server Version 1.0 ships
1991 - NCR (AT&T) acquires Teradata
1993 - Microsoft and Sybase end their partnership
1995 - Sybase - SQL Server 11. Sybase merges with PowerSoft (PowerBuilder).
          Microsoft SQL Server 6.0. DB2 for OS/2 and for AIX
1996 - Sybase Adaptive Server Enterprise 11.5, Microsoft SQL Server 6.5.
1998 - Sybase v.11.9.2. Oracle 8.  Microsoft SQL 7.
2000 - Sybase 12. Oracle 8i. Microsoft 2000. DB2 ver.7.
2001 - Oracle introduces RAC (Real Application Clusters)
2007 - Teradata is separated form NCR
2008 - Sybase 15. Oracle 11. Microsoft SQL Server 2008. DB2 ver. 9.

As of today (January 2008) Sybase is still a very common database on Wall Street. Reasonable question may be why Sybase became so common, when Oracle and IBM had their products on the market much earlier ?".  My guess is that Sybase had Client-Server architecture on Unix at the time when financial companies were starting to use Unix.

Client-Server home - top of the page - email

Database "server" can communicate with many "client" applications (2 layer Client-Server).

There can be also some middle layer(s) - very common is a 3-tier architecture.
  - Front-end (text, forms, etc.) - browser,  Java application, MS Excel, etc.
  - Middle - programms connecting the front-end with the back-end.
  - Back-end - data (database, files).

Fat client - lots of application code and some data are stored on the client side
Thin client - code and data are not stored on the client. They are loaded fromthe server - and run on the client (browser).

There can be many layers (more than 3). Application can be distributed between different locations, use different databases, different clients, and different servers communicating with those databases, clients, and other servers.

In distributed environment of today's large financial institutions, people often use a messaging system (like MQSeries from IBM) and code messsages in XML. Many claim that messaging is much more convenient to use than CORBA or Orbix.

To communicate with the database from your application, you need drivers. Each database has each own drivers for different languages.  Often it is a good idea to use standard drivers:
* ODBC (Open Database Connectivity) - standard and open API for database access (Windows).
* JDBC (Java Database Connectivity) - API for database access from Java

OLTP & OLAP home - top of the page - email

 OLTP - Online Transaction Processing
   - used to update database, typically high volume, response time is important
 OLAP - Online Analytical Processing (also called DSS - Decision Support Systems, EIS - Enterprise Information Systems)
   - used to help review and analyse information in the database. Typically doesn't update database. Queries may be complicated and long.
     Another name of this system type is DSS (Decision Support System). Related terms - "data mining" and "data warehouse" - check their exact meaning on
* - Data Warehousing for Cavemen
* - Bernard Lupin's OLAP site/tutorial
* - The OLAP Report
* - OLAP and DataMining: Bridging the Gap 
* - OLAP Council at 888-818-OLAP 
* - Microsoft SQL Server OLAP Services 7.0 OLAP Manager Tutorial
* - How CRM Differs From OLAP

Since ~2005 - new generation of databases for data analytics:

New grid analytic databases

  • designed to work on a grid of computers - massive parallelism and extensibility
  • optimized for massive loads and fast data retrieval
  • NOT optimized for row-by-row operations (deletes/updates transactions). In fact, architecture is "by column".

Cells of data in a table - [row_id,column_id,value].

  • In classical relational database data is stored on disk by row_id, and external index structures are created to enable searching by column values.
  • In new "by column" architecture data is stored on disk by column, and each column is sorted/indexed by its own value (as if it has built-in "clustered" index). To work with rows we need tobe able to connect cells in "columns" by row_id. Thus each column is optimized to search by value and by row_id.


 REPLICATION (Replication Server) - software technology to copy data from one database (primary server) to another (secondary) in more or less real time. Sybase has a very good replication server. There are 2 sides to replication process: to send information from the primary server (this takes ~ 10% of server load) - and to catch and update on the secondary server (this may put significant load on the secondary server, because you have to do inserts and deletes).

Different databases - and how to use them home - top of the page - email

*  - all databases
*  - several excellent books and a lot of discussions
* - Database Jump Site

* - DB2 universal database (Unix, NT, Mainframe, ...) - read on db2.html page.
* - very popular database - see oracle.html page
* - see sybase.html page.
* - for MS SQL server (also Exchange and Access) - see ms_sql_server.html page.
         ( - - Microsoft Visual Fox Pro )
* - Teradata - datawarehouse engine.

* -
* - object database (C++ or Java programming)

* = - (see also mysql.html) good, fast, support regular expression search, doesn't support transactions
* - free database with transactions (a part of Linux distribution)
* - Illustra database

* -
* - InterSystems Corp. - Cache - good, and very fast
* - good, but slow
* - good sql database

Some older technologies:
* - read about Clipper and Btrieve
* - "Pervasive.SQL" - developed from "Btrieve" and "Scalable"

Tools home - top of the page - email

* - PLATINUM ERwin from Computer Assotiates - a database design tool.
* - Embarcadero - creators of DBArtisan and RapidSQL. Excellent tools !!
   ( read reviews on this site:  - DBArtisan and RapidSQL )
   Work with all modern databases, allow to transparently copy objects between databases of different types or versions. - Aqua Studio - supports multiple databases (Oracle, DB2, Sybase, MS SQL, MySQL, Informix, PostgreSQL, etc.

Short recommendatin: - for all major databases (DB2, Oracle, Sybase, MS SQL Server) - use DBArtisan for administration and SQL Navigator or RapidSQL for development.  For MySQL you can use their own command-line client or GUI.

See more offers on pages related to particular databases:  db2 - sybase - oracle - mysql

Old, Mainframe home - top of the page - email

Some old MainFrame technologies:
* - Computer Associates International (bought Clipper, IDMS, Platinum)
* IDMS - - hierarchical database (non-relational)
* ADABASE (sets of tables)
* IMS - hierarchical database (non-relational) - - used by 95% of Fortune 1000 companies. Serves 200 million end users, managing over 15 million gigabytes of production data. IMS processes over 50 billion transactions every day.  IMS still owns the high-volume online transaction and database management environment. IMS comes with JDBC connectors, Java and XML support, gui administration front end, fastpathing, checkpoint recovery.  Why would IBM continue releasing new versions of IMS when they own DB2?  Because IMS is the fastest at transaction processing.  They recently achieved the following: IMS Runs Over 21,000 Transactions per Second on a Sysplex (Nearly Two Billion Transactions per Day). This IMS fact sheet gives some highlights:
* ISAM - Indexed Sequential Access Method (a file access method supporting both sequential and indexed access).
* VSAM - Virtual Storage Access Method - indexed files - check on
* DB2
* DBase - small database for personal computers, produced by Ashton-Tate in 1979 (1984 - DOS version, 1991 - sold to Borland, 1993,95 - DBase for windows shipped by Borland).

Some old languages used with old databases:
* JCL - Job Control Language
* COBOL-2 - Common Business-Oriented Language
* CICS - Customer Information Control System - adds interactivity to COBOL, works with VSAM files, or with databases (DB-2, IMS, IDMS)