TDDB - The Distributed DataBase

What is TDDB?
How to Get TDDB & then install & run it?
How to compile TDDB, and what tools/libraries do I need to compile & run TDDB?
What options can I pass to scons while compiling?
SQL Statements currently supported by TDDB
Data types currently supported by TDDB
Known Limitations/Features/Bugs
Wish list/TODO
Advantages of TDDB
Project Report
FAQ

What is TDDB?

TDDB is an acronym for The Distributed DataBase. Currently, the distribution part is unstable (you can find out more about how TDDB distributes queries here), but you can use the TDDB Engine & Parser as a normal RDBMS. TDDB is fully ACID compliant & supports Transactions. The list of SQL statements supported is given below, along with a list of limitations and features. TDDB is still under heavy development, so there may be a few bugs. Please let us know if you find any, either by sending mail to dhruvbird@gmail.com or sandesh247@gmail.com, or by submitting a Bug/Support/Feature request at http://sourceforge.net/projects/tddb.

How to Get TDDB & then install & run it?

TDDB can be downloaded from CVS, or you can get the latest release version from this page. After untarring the sources, read the file Readme.txt to find out how to run tddb for the first time. See below for how to compile TDDB.

How to compile TDDB, and what tools/libraries do I need to compile & run TDDB?

TDDB currently works only on x86 & x86-64 platforms running Linux or any other OS that supports the tools and libraries mentioned below. You will need SCons to compile TDDB.
The various libraries & header files needed will be detected by scons automatically. However, here is a list:
- GNU Readline
- Pthreads
- g++ version 4.0 or greater (preferably), though people have gotten TDDB to compile successfully with older versions.
- libstdc++ with the ext headers such as those for hash_map
- At least 300MB of free hard disk space.

What options can I pass to scons while compiling?

Here is the list of options, and a brief description of each:
- vdebug=0(default)/1: Specifies whether to enable verbose debugging output. This is mainly for use by developers, and slows things down a lot. Don't use it unless you want to know where TDDB is crashing.
- release=0(default)/1: Specifies whether to enable(0) or disable(1) all assert() statements in the code. Leave this at the default because TDDB is still in Beta stage, and these prevent silly errors from occurring.
- opt=0(default)/1/2: Specifies the optimization level for g++. A higher optimization level results in faster code, but may occasionally produce incorrect code. Don't use 3 unless you know what you are doing. TDDB has been successfully tested with 1.
- alloc=0/1/2(default): Specifies which allocator to use for memory requests through operator new. The default, 2, is the fast Mt_malloc, which is a thread-aware allocator and performs well with multiple threads. 1 is the C-library malloc() memory allocator, while 0 is the debug allocator, which zeros out memory on free but does not actually free the memory. This helps detect memory errors in TDDB.
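
As a rough illustration of how these options combine on the command line, here is a small Python sketch that assembles a scons invocation. The helper name scons_command and its validation are hypothetical and not part of TDDB. The allowed values and defaults are copied from the list above. It assumes the options are passed to scons in the usual key=value form.

```python
# Sketch only: builds the argument list for a scons run from the options
# documented above. Nothing here is shipped with TDDB.

# Allowed values as listed in this document. The text also mentions opt=3,
# so widen the "opt" tuple if your build accepts it.
ALLOWED = {
    "vdebug": (0, 1),
    "release": (0, 1),
    "opt": (0, 1, 2),
    "alloc": (0, 1, 2),
}

# Defaults as documented above.
DEFAULTS = {"vdebug": 0, "release": 0, "opt": 0, "alloc": 2}


def scons_command(**overrides):
    """Return the argv list for a scons build with the given option values."""
    options = dict(DEFAULTS)
    for name, value in overrides.items():
        if name not in ALLOWED:
            raise ValueError(f"unknown option: {name}")
        if value not in ALLOWED[name]:
            raise ValueError(f"{name}={value} is not one of {ALLOWED[name]}")
        options[name] = value
    # Emit every option explicitly so the build line documents itself.
    return ["scons"] + [f"{name}={options[name]}" for name in DEFAULTS]


if __name__ == "__main__":
    # Build with optimization level 1, everything else at its default.
    print(" ".join(scons_command(opt=1)))
```

The returned list can be handed to subprocess.run() from the source directory. Passing only the options you want to change and omitting the rest should be equivalent, since the others fall back to their defaults.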

SQL Statements currently supported by TDDB.
- CREATE TABLE [db_name.]table_name(list of fields [UNIQUE]); Foreign keys or primary keys are not supported as of now.
- DROP TABLE table_name; DROPs a table(deletes it) from the database.
- SELECT field1, field2, field3 AS aliasFN | * FROM table1, table2 AS aliasTN WHERE condition1 [AND|OR] condition2 [AND|OR] conditionN ORDER BY [alias/field](s) [ASC|DESC] LIMIT N; Currently, GROUP BY, HAVING, aggregates, and complex projections such as the sum of 2 fields in table(s) are not supported.
- UPDATE table1, table2, tableN SET field1=[value/field], field2=[value/field], fieldN=[value/field] WHERE condition(s); Here, [value/field] can be a constant, a field name, or an arithmetic expression involving multiple fields.
- DELETE FROM table_name WHERE condition(s); Deletes Rows from a table without deleting the table itself.
- SET TRANSACTION ISOLATION LEVEL [READ COMMITTED|REPEATABLE READ|SERIALIZABLE]; Sets the Transaction Isolation Level for the current Transaction ONLY. If you COMMIT or ROLLBACK the Transaction, the new Transaction gets the default REPEATABLE READ isolation level. Hence, you must explicitly set the Transaction Isolation Level for each new Transaction. This statement also implicitly COMMITs the currently running Transaction (if possible).
- SET AUTOCOMMIT [ON/OFF]; Sets the AUTOCOMMIT mode to ON(enabled) or OFF(disabled) ONLY for the currently running Transaction. Executing this statement will implicitly COMMIT any previous Transaction (if possible) and start a new Transaction. Note: The default behaviour for the client is to start with AUTOCOMMIT disabled. This feature (AUTOCOMMIT) is implemented at the client end, and the server does not recognize this statement. If AUTOCOMMIT is enabled, the client implicitly sends a COMMIT after executing any SQL statement.
- LOCK TABLE(S) table1, table2, tableN IN [READ|INSERT|UPDATE|SERIALIZABLE|EXCLUSIVE] MODE;
- UNLOCK TABLE(S) table1, table2, tableN;
- USE db_name; Where db_name is the default DB to use. This command can be used ONLY at the client end. The TDDB server doesn't recognize this command.
- COMMIT;
- ROLLBACK;
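
Since AUTOCOMMIT is described above as a purely client-side feature, its behaviour can be sketched in a few lines of Python. The class AutocommitClient and the send callable are hypothetical stand-ins, not the actual TDDB client code. The sketch only mirrors what the list above states: the client starts with AUTOCOMMIT disabled, SET AUTOCOMMIT is never forwarded to the server, and a COMMIT follows every statement while AUTOCOMMIT is on.

```python
class AutocommitClient:
    """Hypothetical client wrapper illustrating client-side AUTOCOMMIT."""

    def __init__(self, send):
        # `send` is an assumed callable that ships one statement to the server.
        self.send = send
        # The document says the client starts with AUTOCOMMIT disabled.
        self.autocommit = False

    def execute(self, statement):
        text = statement.strip().rstrip(";").strip()
        words = text.upper().split()
        if not words:
            return  # nothing to run

        # SET AUTOCOMMIT is handled locally; the server never sees it.
        if words[:2] == ["SET", "AUTOCOMMIT"] and len(words) == 3 \
                and words[2] in ("ON", "OFF"):
            # Implicitly commit whatever transaction was running.
            self.send("COMMIT;")
            self.autocommit = (words[2] == "ON")
            return

        self.send(text + ";")
        if self.autocommit:
            # With AUTOCOMMIT on, a COMMIT follows every statement.
            self.send("COMMIT;")


if __name__ == "__main__":
    client = AutocommitClient(print)  # print stands in for the network send
    client.execute("SET AUTOCOMMIT ON;")
    client.execute("DELETE FROM t WHERE a = 1;")
```

The USE db_name command could be intercepted in the same place, since it is also described as client-only.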

Data types currently supported by TDDB.
- INT/INTEGER
- CHAR(N): the maximum value of N is 248
- VARCHAR(N): the maximum value of N is 8100
- CLOB

Known Limitations/Features/Bugs
- The INTEGER/INT data type doesn't support negative numbers. Yes, only non-negative numbers can be used. Furthermore, decrementing an INT beyond 0 results in the INT getting the value NULL.
- You can view the list of tables, and the commands used to create them by selecting from the table 'global.tabtab' as: 'SELECT * FROM global.tabtab;'. This command is treated as a DDL statement for compatibility reasons.
- DDL Commands are also Transaction oriented!! Suppose you CREATE or DROP a Table by mistake, you can revert your action by doing a ROLLBACK. However, if you want to make the changes permanent, use COMMIT.
- You can't Mix DML & DDL commands in a single Transaction. Suppose you CREATE a table, you need to COMMIT or ROLLBACK before you issue a DML Command. You may however, issue another DDL Command in the same Transaction such as creating or dropping another table. Don't try to CREATE & DROP the same table in the same Transaction though.
- Maximum Row Size = 8100 Bytes. That is, the sum of the sizes of all the fields in a Row must not be greater than 8100 Bytes. The sizes of the various data types for this calculation are listed below:

  Data Type     Size in Bytes
  INTEGER       8
  CHAR(N)       N+4
  VARCHAR(N)    32
  CLOB          16
- There is currently no support for Indices, so each Query processed needs to do a Full Table Scan. Even JOINs are performed using the Nested Loops JOIN algorithm. We are working on putting more efficient JOIN algorithms into place, ones which either don't need Indices or create the Indices dynamically, so that JOIN processing can be made faster.
- There is currently NO support for Nested Queries.
- The PRIMARY KEY or the FOREIGN KEY integrity constraints are not currently supported.
- The type checking of the parser is quite weak and needs to be fixed. For example, certain type conversions are performed silently by the engine without the user ever being notified about them. If the user passes a string where an integer is expected, the leading digit characters (if any) are parsed as the integer. In the reverse case, an integer passed where a string is expected is stored as the string representation of that integer.
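
The row-size rule from the list above is easy to check before issuing a CREATE TABLE. The following Python sketch adds up the per-type sizes given in this document and compares the total against the 8100-byte limit. The function names are made up for illustration, and the type strings are assumed to be written as in the data type list (INT/INTEGER, CHAR(N), VARCHAR(N), CLOB).

```python
import re

# Limits and sizes taken from this document.
MAX_ROW_BYTES = 8100
MAX_CHAR_N = 248
MAX_VARCHAR_N = 8100


def column_size(type_name):
    """Size in bytes that one column contributes to the row-size limit."""
    t = type_name.strip().upper()
    if t in ("INT", "INTEGER"):
        return 8
    if t == "CLOB":
        return 16
    m = re.fullmatch(r"(CHAR|VARCHAR)\s*\(\s*(\d+)\s*\)", t)
    if m is None:
        raise ValueError(f"unsupported data type: {type_name}")
    kind, n = m.group(1), int(m.group(2))
    if kind == "CHAR":
        if n > MAX_CHAR_N:
            raise ValueError(f"CHAR(N) allows N up to {MAX_CHAR_N}")
        return n + 4
    if n > MAX_VARCHAR_N:
        raise ValueError(f"VARCHAR(N) allows N up to {MAX_VARCHAR_N}")
    return 32  # VARCHAR counts as a fixed 32 bytes regardless of N


def row_size(column_types):
    """Total size in bytes counted against the row-size limit."""
    return sum(column_size(t) for t in column_types)


def fits_in_row(column_types):
    """True if a table with these column types respects the limit."""
    return row_size(column_types) <= MAX_ROW_BYTES


if __name__ == "__main__":
    columns = ["INT", "CHAR(100)", "VARCHAR(2000)", "CLOB"]
    # 8 + (100 + 4) + 32 + 16 = 160 bytes
    print(row_size(columns), fits_in_row(columns))
```

For example, 32 columns of CHAR(248) come to 32 * 252 = 8064 bytes and fit, while a 33rd such column pushes the total to 8316 bytes and over the limit.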

Wish list/TODO
Things that might find their way into TDDB as and when we get time to work on it.
- GUI Client: Yes!!!! Something similar to MySQL Query Browser. And we are thinking of supporting in-grid editing for Queries involving JOINs too!!!!
- Support for Aggregate functions & non-field Projections.
- Support for Indexes.
- Support for more data types, especially the float, double, numeric, date, time & datetime data types.
- A Query optimizer, along with better JOIN algorithms.
- Automatic Distribution (horizontal fragmentation) of tables across multiple computers running the TDDB Engine, and a co-ordinator (Master) Node. This is what we call Distribution Support.
- Config-file support. Currently, the user has to manually type in the DB file name and the other options. There should be an option for saving these options to a file, and letting the server detect them at startup.
- Support for specifying custom Sizes for the Logging Segment & Dirty Read Segment at DB creation time. Currently, these options are initialized to their defaults.
- Something like an ODBC Driver so that people can easily connect to the TDDB server, and a shared library for the above.
- The TDDB server as a shared library, so that it can be embedded in applications in a way similar to SQLite, rather than being run as a standalone application.

Advantages of TDDB:
Now that I've mentioned many drawbacks/limitations of TDDB, let me highlight a few positives of TDDB ;-)
- TDDB has been a great learning experience for all of us involved in its making, and we got to learn a lot from it. It is a fairly simple system, and since it was designed with adding more features in mind, it should be easy to add possibly non-standard extensions to it too.
- There is just one file involved, so it is fairly simple to take backups. That one file contains the data as well as the logs, so you don't have to worry about backing up both and then making sure that they both belong to the same version (in time) of the Database, so to speak. Even if the server shut down incorrectly, it will detect that at startup (since all the information is stored in that single file), and replay the LOGs if required.

Project Report:

You can download my final year project here
