''Bringing some sanity to database maintenance.''
XML to DDL is a set of python programs to convert an XML representation of a database into a set of DDL (Data Definition Language) statements.
In addition XML to DDL can examine the difference between two XML files and output a sequence of ALTER statements that will update the database to conform to the new schema.
Finallly, XML to DDL can generate HTML documentation of your schema.
You can find more information and download files at the Berlios page
The following is a simple schema XML definition of a database:
<schema> <table name="students" fullname="List of Students" desc="List of students with their full names"> <columns> <column name="id" fullname="Primary Key" inherits="index" key="1" desc="Primary key for the table"/> <column name="student_name" fullname="Student Name" class="name" desc="The full name of the student Can this span multiple lines?"/> </columns> </table> </schema>
Here we run the program indicating output for PostgreSQL:
python xml2ddl.py --dbms postgres schema1.xml
We get the following output:
DROP TABLE students; CREATE TABLE students ( id integer, student_name varchar(80), CONSTRAINT pk_students PRIMARY KEY (id)); COMMENT ON TABLE students IS 'List of students with their full names'; COMMENT ON COLUMN students.id IS 'Primary key for the table'; COMMENT ON COLUMN students.student_name IS 'The full name of the student';
If we run the program again for Firebird:
python xml2ddl.py --dbms firebird schema1.xml
we'll get different output:
DROP TABLE students; CREATE TABLE students ( id integer, student_name varchar(80), CONSTRAINT pk_students PRIMARY KEY (id)); UPDATE RDB$RELATIONS SET RDB$DESCRIPTION = 'List of students with their full names' WHERE RDB$RELATION_NAME = upper('students'); UPDATE RDB$RELATION_FIELDS SET RDB$DESCRIPTION = 'Primary key for the table' WHERE RDB$RELATION_NAME = upper('students') AND RDB$FIELD_NAME = upper('id'); UPDATE RDB$RELATION_FIELDS SET RDB$DESCRIPTION = 'The full name of the student' WHERE RDB$RELATION_NAME = upper('students') AND RDB$FIELD_NAME = upper('student_name');
The example shows a feature of XML to DDL, database independence. Currently the program supports the Firebird, PostgreSQL, and MySQL databases, but more will probably become available.
Another key feature is the ability to examine the changes done to the XML and generate the DDL statements necessary to perform the changes to the database. If this is a new XML schema:
<schema> <table name="students" fullname="List of Students" desc="List of students with their full names"> <columns> <column name="id" fullname="Primary Key" type="integer" key="1" desc="Primary key for the table"/> <column name="student_name" fullname="Student Name" type="varchar" size="80" desc="The full name of the student"/> </columns> </table> </schema>
Running this program:
diffxml2ddl.py --dbms postgres schema2.xml schema1.xml
Produces the following DDL output:
ALTER TABLE students ALTER student_name TYPE varchar(100); ALTER TABLE students ADD email varchar(100); COMMENT ON TABLE students IS 'List of students';
However, an older version of PostgreSQL doesn't support altering the column type:
python diffxml2ddl.py --dbms postgres7 schema2.xml schema1.xml
The a temporary column needs to be created, the data copied over and the old column dropped:
ALTER TABLE students ADD tmp_student_name varchar(80); UPDATE students SET tmp_student_name = student_name; ALTER TABLE students DROP student_name; ALTER TABLE students RENAME tmp_student_name TO student_name; ALTER TABLE students DROP email; COMMENT ON TABLE students IS 'List of students with their full names';
If you find yourself repeating the same attributes in your XML schema over and over you can put these in a dictionary:
<dictionary name="column"> <dict class="key" name="id" fullname="Primary Key" type="integer" null="no" key="1" desc="Primary key for the table" /> </dictionary>
In this example we are telling the parser that the dictionary is for the nodes called column and when it sees the class key, it should put in the the other attributes listed. So using this dictionary this would be equivalent::
... <columns> <column class="key"/> </columns> ...
as:
... <columns> <column name="id" fullname="Primary Key" type="integer" null="no" key="1" desc="Primary key for the table"/> </columns> ...
In addition you can override any attributes in the dictionary, for example this:
... <columns> <column class="key" name="student_id"/> </columns> ...
would then be equivalent to:
... <columns> <column name="student_id" fullname="Primary Key" type="integer" null="no" key="1" desc="Primary key for the table"/> </columns> ...
The dictionaries can also support multiple inheritance through the inherits attribute. Here's a rather contrived example:
<dictionary name="column"> <dict class="index" type="integer" null="no"/> <dict class="pk key="1"/> <dict class="key" inherits="index,pk" name="id" fullname="Primary Key" desc="Primary key for the table"/> </dictionary>
Some of the attributes in the XML are used solely for documentation purposes. For example, fullname has no equivalent in most DBMSs. Another, it deprecated which indicates that a column or table should no longer be used, but hasn't been deleted yet.
Here's how to output the HTML document:
python xml2html.py --file schema.html schema.xml
Storing the schema in this form has some advantages:
Here are the major directions I see XML to DDL going: