Db2

DB2 is one of IBM's families of relational database management system (RDBMS) (or, as IBM now calls it, data server) software products within IBM's broader Information Management Software line.

Installation
DB2 can be installed using the GUI installer (which is the easiest method), or using the command line installer (which is basically a bash script and requires several post-installation steps to configure the environment). The sections below briefly cover typical installations using either method. Please note that the GUI installer has problems with Compiz and X compositing environments; you will need to disable these if you wish to use the GUI installer. Also note that the GUI installer permits selection of packages to be installed; the command line installer simply installs everything (several hundred megs, even for the client)!

GUI Installation
Visit the DB2 Trial page, and follow the links to download the installation tarball for the package you require (you will need an IBM ID for this; registration is free).

Extract and run as root (the following is an example for the 32-bit x86 client package),

In the window that appears, select "Install a product" from the menu on the left, then click on the "Install New" button at the bottom of the right-hand pane. Follow the steps in the wizard that appears. The defaults should be fine for most uses.

Note: If you are intending to build anything against the DB2 libraries (this includes things like building the db2 functions in PHP with the USE-flag, or compiling the DB2 Python and Ruby drivers) you will need to select a "Custom" installation type on the third page of the wizard then on the fifth page ("Features"), select the "Base application development tools" item under the "Application development tools" category, which is not checked by default. This option includes the headers necessary for compiling the aforementioned drivers. All other defaults should be fine.

Command Line Client Installation
Visit the DB2 Trial page, and follow the "DB2 Client" link to download the installation tarball for the package you require (you will need an IBM ID for this; registration is free). Please note that despite the title of the page being "DB2 for Linux, UNIX and Windows Trial Downloads", the client is actually free and not a trial.

Extract and run as root,

Create the group and the user which will own the instance (yes, even clients need an instance), and then create the instance.

In order to add new nodes and databases to the client's catalog you need SYSADM authority. Currently only db2inst1 (as the only member of the db2iadm1 group) has that authority so to add new nodes and databases to the catalog you'd need to. Obviously that's a bit of a pain for such an innocent task (on a client anyway). Moreover, since this a client instance instead of a server instance, there's no real security risk involved in giving yourself SYSADM authority. So, add your own user to the db2iadm1 group as well:

Command Line Server Installation
Visit the DB2 Trial page, or the DB2 Express-C page, and follow the links to download the installation tarball for the package you require (you will need an IBM ID for this; registration is free). Please note that despite the title of the page being "DB2 for Linux, UNIX and Windows Trial Downloads", the Express-C server is actually free and not a trial. Also note that as of v9.5, only 64-bit platforms are officially supported for server instances (32-bit versions are available "for testing and evaluation purposes only"). However, as Gentoo isn't a supported platform anyway this doesn't really matter!

Extract and run as root,

Create several groups and users which will be used to run various DB2 processes (don't try and run DB2 servers as root, or use IDs potentially used by other services like nobody; always use dedicated users as shown here):


 * db2inst1 (primary group db2iadm1) will be the database instance owner. This is the user that the database server will run under, and any members of the group will have full SYSADM authority within this instance.
 * db2fenc1 (primary group db2fadm1) will be the "fenced procedure" user which is used for running external-code (non-SQL) stored procedures and functions outside the database server process. Such functions can also be run "unfenced" within the database process for improved performance, but at a greater risk to reliability and/or security in the event or a problem.
 * dasusr1 (primary group dasgrp1) will be the administration instance owner. The administration instance is an optional facility which provides services like scheduled tasks to the DB2 Control Center.

Create a mapping for the database server port in /etc/services

Create a database instance owned by the db2inst1 user created above, using the db2fenc1 fenced procedure user also created above, and hosted on the db2inst1 port added to services. Also create an administration instance owned by the dasusr1 user created above.

Switch to the database instance owner, start the instance (if it wasn't already by db2icrt), and attempt to create the SAMPLE database to test the installation,

Cataloging Connections
Every DB2 client instance (including the client bundled with a server instance) has several "catalogs" which hold information required for connecting to DB2 databases. The two most commonly used catalogs are the "node catalog" (which holds details of DB2 server instances) and the "database catalog" which holds details of databases hosted by nodes. To query and configure the content of these catalogs, first switch to the instance owner (or another user with SYSADM authority), which will be db2inst1 if you've followed the installation instructions above,

You can check the content of the node catalog and database catalog with the following commands,

To add entries to the node directory, you need to know the node's hostname or IP address, and what port it is listening on (by default this is 50000). The nodename in the command below has a maximum length of 8 characters, and is arbitrary (it is simply the name you wish to use to refer to the node).

To add entries to the database directory, you need to know the name of the database and which node is hosting it. The dbname in the command below is the actual name of the database (maximum length 8 characters), while the alias is arbitrary (and also has a maximum length of 8 characters).

On some older versions of DB2, you may need to do the following to refresh the catalog cache before attempting to use the new catalog entries:

Adding entries to the node or database catalogs doesn't actually cause the DB2 client to connect to anything or check that the parameters you specify are correct. You will only discover if the parameters you have entered are correct when you attempt to connect to a database on the node, hence testing newly added entries is important. To connect to a database which has been added to the database catalog, use the following syntax,

If you omit the syntax you will be prompted for the password. If you also omit the syntax an "implicitly authorized" connection will be attempted (it will attempt to connect as your currently logged on username - this is only really useful when working locally on a server as remote clients aren't trusted by default). Finally, to disconnect after a successful connection test,

To remove entries from the catalogs, use the following commands,

Configuring PHP for use with DB2
Enable the USE-flag for, source the db2profile script from the instance owner's sqllib directory (the instance owner will be db2inst1 if you've followed the installation instructions above), and (re)emerge ,

Any time you rebuild you will need to  first or the build will fail. Hence, you may wish to add this line to the end of root's script to avoid having to remember in future. Next, if you haven't already got it, install to get access to PECL packages, then install the ibm_db2 PECL package,

The installation will ask (in a horribly confusing manner) for the installation path. You have to enter for the option you wish to change, then specify the DB2 installation path ( by default). To activate the ibm_db2 package, add the following lines to your and  files under the relevant sections:

You should now have access to the ibm_db2 db2 prefixed database functions in PHP, which you can test with the following script, db2test.php

If you wish to use DB2 via PHP Data Objects (PDO) you will need to install PDO itself (if you haven't done so already), then install the DB2 PDO driver,

Again, the installation will ask (in a horribly confusing manner) for the installation path. Once again, enter for the option you wish to change, then specify the DB2 installation path ( by default).

Hints and Tips
The following sections include some hints and tips which may make working with DB2 slightly more pleasant.

Using the DB2 Command Line Processor (CLP) effectively
The DB2 Command Line Processor (CLP) is the main command line interface to DB2 through which you can execute SQL statements, run scripts, perform administration tasks, configure various settings, and obtain information. To new users, especially those familiar with MySQL or PostgreSQL's excellent command line interfaces, the DB2 CLP will feel extremely weird, even primitive. The most obvious difference is that the CLP has no command line history or even editing (okay, that's a slight white lie - it does have these but they're so primitive you don't want to see them ... no, really!).

This is because the CLP doesn't link with the readline library, probably because readline doesn't exist on several of the platforms DB2 is supported on (at this point people who've used the CLP on Windows may pipe up that the CLP appears to have command line editing and history on Windows but actually it doesn't; it's Windows CMD's built-in editing and history support you're seeing there).

Anyway, to learn why this lack of readline support doesn't actually matter you need to learn the multitude of modes in which the CLP can operate...


 * Interactive
 * This is the mode you've probably already discovered. You've run, been presented with a huge chunk of boiler-plate text explaining various options, and been dumped at a "db2 =>" prompt. From here you can run two classes of commands: SQL statements or CLP commands. SQL statements are what you'd expect (the usual INSERT, UPDATE, DELETE, SELECT, and everything else documented under the SQL section of the InfoCenter). CLP commands are DB2 specific commands typically used to perform various administrative tasks like configuring database parameters, backing up a database, or rapidly bulk-loading data (these are documented under CLP Commands in the InfoCenter, or you can obtain a list by entering at the CLP prompt).


 * By default the CLP assumes one statement per line (no statement terminators). While this is fine for simple use, it's generally more useful to run the CLP as so that it requires semi-colon as a statement terminator, allowing you to easily paste in multi-line statements or even entire scripts with a middle-click in the terminal window (you'll come to discover this is all the interactive mode is actually good for).


 * Scripted
 * In this mode the CLP simply runs a script non-interactively from a specified file. The CLP defaults are quite annoying in this mode, so I'd recommend that a typical invocation in this mode use options similar to, where:
 * indicates that the statements are semi-colon terminated,
 * indicates that statements should be echoed before execution (without this knowing which statement failed, if any, can be nightmarish),
 * indicates that execution should stop immediately on an error (without this the script will just plough on to the end), and finally
 * indicates that the CLP should read commands from the file script.sql instead of stdin.


 * Shell
 * This final mode introduces several fascinating capabilities. You can run commands via the CLP straight from your bash prompt (or whatever shell you favour) simply by passing them as an argument (or arguments) to the CLP. The interesting bit is that the CLP preserves connections between invocations, allowing you to take advantage of the shell's own history and editing capabilities for interactive usage, and scripting and control flow capabilities for scripted usage. The rest of this section will be devoted to covering the shell mode as this is by far the most useful mode, but it takes some getting used to!

For starters take a look at the following typical (but very brief) shell mode session:

db2inst1 ~ $ db2 CONNECT TO SAMPLE Database Connection Information Database server       = DB2/LINUXX8664 9.5.2 SQL authorization ID  = DB2INST1 Local database alias  = SAMPLE db2inst1 ~ $ db2 "SELECT COUNT(*) FROM EMP" 1          ---          42   1 record(s) selected. db2inst1 ~ $ db2 CONNECT RESET DB20000I The SQL command completed successfully.

Note that after the initial "CONNECT TO" invocation, the connection is still open & available to the next statement, and is only closed by the eventual "CONNECT RESET" command (which is DB2-ese for "commit & disconnect"). The reason this works is that the CLP is lazy; it doesn't do any work itself. Whenever you tell the CLP to do something with a connection (e.g. open it, close it, execute statement over it, etc.), it actually asks a background process called db2bp to do it on its behalf. If the CLP can't find an active db2bp process, it silently starts one. The db2bp process performs the requested action, and passes the results back to the CLP which displays them. The CLP process then dies, returning control to the shell, while the db2bp process hangs around holding the connection open for subsequent commands.

You can see the db2bp process using, and you can ask the CLP to terminate any associated db2bp process with the (ominously named) "TERMINATE" command, as demonstrated in the following session (note that db2bp persists even after closing the connection - an explicit TERMINATE command is required to kill it):

db2inst1 ~ $ pgrep -l db2bp db2inst1 ~ $ db2 CONNECT TO SAMPLE Database Connection Information Database server       = DB2/LINUXX8664 9.5.2 SQL authorization ID  = DB2INST1 Local database alias  = SAMPLE db2inst1 ~ $ pgrep -l db2bp 14162 db2bp db2inst1 ~ $ db2 CONNECT RESET DB20000I The SQL command completed successfully. db2inst1 ~ $ pgrep -l db2bp 14162 db2bp db2inst1 ~ $ db2 TERMINATE DB20000I The TERMINATE command completed successfully. db2inst1 ~ $ pgrep -l db2bp db2inst1 ~ $

In the first session above, note that I was careful to quote the command. As this command includes an asterisk, any shell on Linux will try and expand this to the name of all files in the current directory ... probably not what you intend (and rather different from DB2's behaviour on Windows where this would work because file globbing is left up to individual applications there)!

Likewise, you'll run into similar problems for parentheses (which bash will try and interpret specially), and file redirection operators like bar (|), less-than (<), and greater-than (>). You may find it simpler to just get in the habit of quoting anything you pass to the CLP at the command line. Always use double quotes to surround commands in these cases otherwise you'll wind up having to escape things if you have any string literals in your command (since in SQL, strings are always surrounded by single quotes).

Something else to be aware of when using the CLP (in any mode as it happens) is that by default the CLP operates in auto-commit mode: after each and every statement is executed an implicit COMMIT is performed. Frequently this is not desired (especially considering that DB2 implements transactional DDL, so it's possible to do interesting things like create a table, insert data into it, then rollback the transaction undoing everything including the table creation).

The CLP can be told not to auto-commit by passing it the option. However, this gets slightly unwieldly in shell mode where "+c" would have to precede every single command executed. Instead you can make this the default by using the DB2OPTIONS environment variable which sets the default command line options for the CLP. For example,

If you recall some of the options from above,


 * will cause the CLP to require a semi-colon terminator for commands
 * will switch off the auto-commit behaviour
 * will tell the CLP to terminate a script immediately if an error occurs

Note that although we've set "-t" in DB2OPTIONS, the CLP won't require semi-colons at the end of commands you execute in shell mode. This is because the last command in a script doesn't need terminating (think of the semi-colon more as a command separator than a command terminator), and in shell mode you're essentially running very short one line scripts. Moreover, it doesn't allow you to execute multiple commands in a single invocation in shell mode. The reasons for this are too complicated to go into here, suffice it to say that the CLP parser is seriously weird, and you're essentially limited to one command per invocation in shell mode (if you want to know more, read this post from the comp.databases.ibm-db2 group).

The "-s" option is also obviously for the benefit of script mode only (and would probably be intensely annoying in interactive mode where at the first error, the CLP will terminate back to the shell!). So, the above set of options is intended for those who will primarily use the shell and scripted modes of the CLP, who are happy performing manual COMMITs (or ROLLBACKs), and who use semi-colon delimiters in their scripts. I suggest you tailor the options to your own preference (you can find all the available CLP options in the InfoCenter). You may also wish to add this export to the end of your script so that they're always set when you login.

The next CLP trick to learn is the "-x" option which causes the CLP to omit the usual header and footer from query results. Here's a brief session demonstrating the difference in output:

db2inst1 ~ $ db2 "SELECT COUNT(*) FROM EMP" 1          ---          42   1 record(s) selected. db2inst1 ~ $ db2 -x "SELECT COUNT(*) FROM EMP" 42 db2inst1 ~ $

Obviously this can come in rather handy when capturing the output of a query into an environment variable:

db2inst1 ~ $ rec_count=$(db2 -x "SELECT COUNT(*) FROM EMP") db2inst1 ~ $ echo "The EMP table contains ${rec_count} records" The EMP table contains         42 records

It's also rather handy when you want to run some SQL conditionally, as you can use bash's syntax to handle the conditionals. Consider the following script, which will only create the table "FOO" if it doesn't already exist:

conditional_create.sh

Some things to note about the above script:


 * The script starts by sourcing the script. In case you missed this in the sections above, this is required before running any DB2 command line utility in Linux/UNIX (for users familiar with DB2 under Windows, this is the equivalent of "db2cmd" on that platform). The "db2profile" script sets up the environment (PATH, LD_LIBRARY_PATH, etc.) for running DB2 commands, and compiling applications which link with the DB2 libraries. You probably want to add this line to the end of your  script for convenience (note that db2profile must be sourced, not just executed, as it needs to alter the environment of the current bash process, not a sub-process).
 * The script checks whether the CLP's exit code is greater than or equal to 4. The exit code of the CLP is a sum of codes (strictly speaking, a "bitmap" of codes) where anything less than 4 indicates that execution was successful although potentially with warnings. The individual codes which are summed to produce the exit code are as follows:
 * 1 = no rows affected by a SELECT/UPDATE/DELETE
 * 2 = warning
 * 4 = database error
 * 8 = CLP error

Finally, this capability also makes it easy to use SQL to generate SQL. For example, there's no statement in DB2's SQL dialect for granting SELECT authority on all tables in a schema to a user. However, we can work around this fairly easily by using a query to generate a whole bunch of GRANT statements from a SELECT on the system catalog, and then pipe this back into another CLP invocation running in interactive mode. The session below first demonstrates what the SELECT statement produces, and then shows what happens when this output is piped into another CLP invocation:

db2inst1 ~ $ db2 -x "SELECT 'GRANT SELECT ON TABLE DB2INST1.' || TABNAME || ' TO USER FRED;'  > FROM SYSCAT.TABLES > WHERE TABSCHEMA = 'DB2INST1'  > WITH UR" GRANT SELECT ON TABLE DB2INST1.ACT TO USER FRED; GRANT SELECT ON TABLE DB2INST1.ADEFUSR TO USER FRED; GRANT SELECT ON TABLE DB2INST1.CL_SCHED TO USER FRED; GRANT SELECT ON TABLE DB2INST1.DEPARTMENT TO USER FRED; GRANT SELECT ON TABLE DB2INST1.DEPT TO USER FRED; ... lots more output chopped ... db2inst1 ~ $ db2 -x "SELECT 'GRANT SELECT ON TABLE DB2INST1.' || TABNAME || ' TO USER FRED;'  FROM SYSCAT.TABLES WHERE TABSCHEMA = 'DB2INST1'  WITH UR" | db2 -t +p DB20000I The SQL command completed successfully. DB20000I The SQL command completed successfully. DB20000I The SQL command completed successfully. DB20000I The SQL command completed successfully. DB20000I The SQL command completed successfully. ... and so on ... db2inst1 ~ $ db2 COMMIT DB20000I The SQL command completed successfully.

Several things to note about the session above:


 * It demonstrates that the CLP is quite happy with multi-line input (and bash is quite good at handling multi-line command editing).
 * After typing the initial query and testing that it produced the output I wanted, I simply recalled it by pressing the up cursor key, and added the piped invocation (this is why the second command doesn't include the secondary ">" prompt at the start of each line). In other words, this is a simple demonstration of history recall and command line editing with the CLP (or more precisely, with the CLP under bash).
 * The parameter for the CLP invocation on the receiving end of the pipe merely suppresses the huge boiler-plate text that the CLP usually prints at the start of interactive mode.
 * The parameter for the CLP invocation on the receiving end of the pipe wasn't strictly necessary as DB2OPTIONS contained.
 * The COMMIT command run at the end was necessary because DB2OPTIONS contained, disabling auto-commit.


 * Using rlwrap

One way to add GNU readline capability to the CLP is to use the rlwrapwrapper. The description from the rlwrap website says

"rlwrap is a wrapper that uses the GNU readline library to allow the editing of keyboard input for any other command. Input history is kept between invocations, separately for each command; history completion and search work as in bash and completion word lists can be specified on the command line."

It should be fairly easy to download and compile.

The only caveat with using this with the CLP is that it starts a new "db2bp" session every time you invoke it. You will need to reconnect to the database (connect to ) every time you start it, and you should end each session with "terminate". Otherwise it will keep the "db2bp" session running, which you will *not* reconnect to the next time you run the CLP; however, you can connect to the lingering "db2bp" by running the command without using the rlwrapwrapper).

Starting and Stopping DB2 (incl. initscript)
DB2 doesn't install an initscript for stopping and starting it. Starting and stopping the database instance is done with the following commands, as the database instance owner,

Likewise, the following commands are used to start and stop the administration instance, as the administration instance owner,

Traditionally, DB2 handles auto-starting at boot time by adding a line to the configuration starting a db2fmcd process, the "DB2 Fault Monitor Coordinator Daemon". This in turn started db2fmd ("DB2 Fault Monitor Daemon") processes, one per database instance configured to auto-start (see below). Each db2fmd process starts it's corresponding database instance, and restarts it in the event of a crash. Likewise, db2fmcd restarts any db2fmd process that crashes, and of course init restarts db2fmcd if it dies. With three layers of auto-restarting processes this is a configuration intended to keep the database running at all costs. However it can be a bit difficult to work with - sometimes it proves insanely difficult to shut down all running DB2 processes (e.g. when installing a fixpack), and of course the procedure of shutting down and restarting things doesn't rely on a nice simple initscript - it's all custom DB2 commands.

If you have a production system that absolutely positively must stay running, you should stick with the traditional configuration (if only because IBM support will probably whinge at anything abnormal - but then, if you're running Gentoo you're unsupported anyway). On the other hand, if you'd much prefer a nice simple initscript and you don't mind handling any crash without the "assistance" of the DB2 Fault Monitor Daemons (although crashes are extremely rare in my experience), then read on:

Firstly, check for and remove (or comment out) any db2fmcd lines at the end of. You may find there's no such line anyway (sometimes the installation doesn't seem to add it although I'm not clear under what circumstances). Next, switch to the database instance owner, and set the instance to auto-start (the initscript below only starts instances explicitly set to auto-start, although the stop action tells all DB2 instances to stop, forcibly if necessary, to ensure safe system shutdowns and reboots),

(The db2iauto command simply sets the DB2AUTOSTART instance-level registry setting to "YES" - you can see this by running as the database instance owner). Finally, copy the following script into and remember to set its execute bits. Then, if you want DB2 to start and stop along with your other default services,

/etc/init.d/db2

Troubleshooting
The comp.databases.ibm-db2 newsgroup (frequented by several DB2 developers and numerous long time users), is an excellent source of help. For those looking for something more immediate, try the #db2 channel on Freenode (but please note it's a fairly quiet channel and it can take some time for the regulars to notice someone's said something!).