VDB4 Filesystem Changes

by Jason Short 18 December 2008

VistaDB 4 has been a major undertaking for us over the past year.  We have had some long and heated internal discussions about architecture changes and directions.  One of the largest sticking points was the file format.  The current VDB3 file format is mostly undocumented and does not meet some of our needs going forward for a Server or Agent system.  We really struggled with the decision to replace it, though, because we know users depend on their data being in a stable format.  After an exhaustive study of the pros and cons, we have decided to change the format.

Once we made that decision we quickly added some other features we consider vital for Server systems, and some we just wanted for robust design reasons.  The list below is by no means complete (and not all of these will be in the first releases), but we wanted to share some of our thought processes and design decisions.  The hope is that the VDB4 format will carry us through to the VistaDB Agent and VistaDB Server releases without further modification.  We will also finally be able to release documentation on the format for third party vendors.

  • Torn Page Detection - Ability to detect and flag partially written pages within the database.  This is an advanced feature for error checking.
  • Logging - Ability to log certain types of actions to be rolled back outside the context of the calling application.  This becomes more important in a server scenario.
  • Background Maintenance - Ability to run background threads to perform free space consolidation, compacting, FTS Index maintenance, etc.
  • Multi file support - In a Server scenario or clustering it becomes vital that we support the concept of multiple files per database.  Sometimes you do this for speed, sometimes for redundancy.
  • Free space allocation - Growth by a fixed or dynamic amount is important for performance.  The ability to reclaim free space without a pack operation is also vital.
  • LOB data - Large Object Data is stored out of band with the current row.  These large chunks are stored in a separate area of the database.  Right now we have no way to stream load or save these chunks.  The ability to partially load a 1GB file stored as a single column is vital for certain usage scenarios.
  • Page / Extent - Our current page system does not map well to NTFS.  With small page sizes we are actually causing the OS to work a lot harder than is really required.  For some very small space savings the IO cache of the drive may suffer severe degradation in performance.  Page tuning is complicated and we want to simplify it.
  • Automatic vs Manual file expansion - Right now the file format has no way to restrict growth.  This is obviously not a good scenario for hosting companies.  There are many valid situations where giving the database a fixed maximum size is a good thing.
  • Automatic vs Manual shrinkage - Databases should be able to shrink and release free space when required, outside of a pack operation.  Being able to flag this for automatic release beyond a certain level of free space is also a goal.
  • Transaction Rollback - Right now a single orphaned transaction forces the app to pack the database in order to recover.  This can be quite cumbersome in some scenarios.  An out of process rollback after xx seconds is another goal for the system.  This would allow other engine instances to roll back transactions when the initiating process has been offline for xx seconds.
  • DBCC type commands - We would like to be able to offer several of the DBCC style functions to DBAs.  The ability to perform a database consistency check without taking the system offline is high on our design goals list.
  • Single User - Multi User - In a server or Agent scenario being able to force the database down to a single user mode is often required.  Right now the only way to do this is through exclusive file access.
  • Database Snapshots - Online backup of open database files is always tricky.  Forcing the apps to be closed in order to back them up is not always the best solution.  An internal engine ability to snapshot the database file off for backup of an open database would be a very nice feature.
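
To make the first item concrete, here is a minimal Python sketch of one common way to detect a torn page: store a checksum with every page and verify it on read.  This is an illustration only, not the VDB4 implementation.  The page size, the 4-byte CRC32 header, and the names seal_page and is_torn are all assumptions made for the example.

```python
import struct
import zlib

PAGE_SIZE = 8192               # hypothetical page size, not the VDB4 value
HEADER = struct.Struct("<I")   # assumed layout: 4-byte CRC32 at the start of each page
PAYLOAD_SIZE = PAGE_SIZE - HEADER.size


def seal_page(payload: bytes) -> bytes:
    """Pad the payload to a full page and prefix it with its CRC32."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload does not fit in one page")
    body = payload.ljust(PAYLOAD_SIZE, b"\x00")
    return HEADER.pack(zlib.crc32(body)) + body


def is_torn(page: bytes) -> bool:
    """Return True when the page is short or its checksum does not match."""
    if len(page) != PAGE_SIZE:
        return True
    (stored,) = HEADER.unpack_from(page)
    return stored != zlib.crc32(page[HEADER.size:])


# Simulate a crash halfway through overwriting an old page with a new one:
# the first half on disk is new, the second half is still the old version.
old_page = seal_page(b"a" * PAYLOAD_SIZE)
new_page = seal_page(b"b" * PAYLOAD_SIZE)
torn_page = new_page[:PAGE_SIZE // 2] + old_page[PAGE_SIZE // 2:]
```

In this sketch is_torn(new_page) is False and is_torn(torn_page) is True, so the engine could flag the page instead of silently returning mixed data.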

I hope you can now see we have spent a lot of time thinking through the changes before we begin.  The new VDB4 format is not in the current Alpha at this time.  We are working hard to get it included for the first customer preview in January.  Right now we are experimenting with the format in test harness applications.  Some of the things we are hammering very hard are fragmentation, free space, and consistency checking.

All of these features will be usable in the VDB4 format.  They may not all make sense in the desktop engine, but most of them will be in every VistaDB engine edition.

 

Comments

22 December 2008 #

Will the file format changes affect speed, and if so, how?


22 December 2008 #

Can an administrator determine who is using the database, their machine ID and their full name?


22 December 2008 #

Hello,


All this is very exciting, but will there be a tool to convert VDB3 files to VDB4? Some kind of migration wizard?


Thank you.


22 December 2008 #

js_vistadb

Yes, we will have a migration wizard from VDB3 to VDB4.


The VDB3 file has some interesting design limits.  One of the problems with allowing the user to choose any page size is that you don't match the OS page size.  This leads to a lot of partial reads at the OS layer, and write stalls as well.  One of our tests for max perf to the drive shows that VDB3 can only sustain about 1.3 mbit in random IO.  The changes in VDB4 allow us to hit 9.7 mbit in the same random IO test.  That is of course not the whole story in performance, but if you can get the data from and to the disk a lot faster it has to help the engine as well.
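
A rough way to see the alignment problem is to count how many OS pages a single database page overlaps.  The Python sketch below is only an illustration: the 4096-byte OS page size and the function names are assumptions, and it models nothing about the real VDB3 or VDB4 IO paths.

```python
OS_PAGE_SIZE = 4096  # assumed OS page / cluster size; real systems vary


def os_pages_touched(db_page_size: int, page_index: int,
                     os_page_size: int = OS_PAGE_SIZE) -> int:
    """Count the OS pages that one database page overlaps."""
    first_byte = db_page_size * page_index
    last_byte = first_byte + db_page_size - 1
    return last_byte // os_page_size - first_byte // os_page_size + 1


def read_amplification(db_page_size: int, page_index: int,
                       os_page_size: int = OS_PAGE_SIZE) -> float:
    """Bytes the OS must move divided by bytes the engine asked for."""
    touched = os_pages_touched(db_page_size, page_index, os_page_size)
    return touched * os_page_size / db_page_size
```

With these assumptions a 4096-byte database page always maps to exactly one OS page.  A 1024-byte page still costs a full 4096-byte OS read (an amplification of 4.0), and a hypothetical 3072-byte page at index 1 straddles two OS pages.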


StaidVB - Administrator?  We do not have that concept in VistaDB.  I think you are thinking about a Server layer.  Only the Server would be able to do that; it has nothing to do with the File or IO layer (the subject of this post).


js_vistadb

22 December 2008 #

Jason, when you say "Multi file support", are you talking about each table and each index in their own file like MySQL or FoxPro?  Is this going to be configurable for the desktop to allow it to have just one file as it does now?


Can you provide more on your thoughts?


22 December 2008 #

js_vistadb

Multi file support is a way to partition data on disk.  Let's say you have 3 drive letters (D, E, F) that are each on a separate physical drive.  You may want to split your database across all 3 for performance reasons.  The database would be logically partitioned.


Yes, we fully intend to keep the one file per database option for ease of use.  Multiple files would be an option for more advanced deployment scenarios.  We may not have it immediately, but we want to ensure we can do it.  VDB3 can't do it because its B-trees contain absolute file pointers rather than file:extent:page pointers.
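
To show why a file:extent:page pointer makes multiple files possible, here is a small Python sketch.  It is not the VDB4 layout: the page size, pages per extent, file names, and the PageAddress and locate names are all hypothetical.

```python
from typing import NamedTuple

PAGE_SIZE = 8192        # hypothetical values, not taken from the VDB4 format
PAGES_PER_EXTENT = 8


class PageAddress(NamedTuple):
    file_id: int   # which physical file holds the page
    extent: int    # extent number within that file
    page: int      # page number within that extent


def locate(addr: PageAddress, files: dict[int, str]) -> tuple[str, int]:
    """Translate a file:extent:page address into (path, byte offset)."""
    if not 0 <= addr.page < PAGES_PER_EXTENT:
        raise ValueError("page number outside the extent")
    offset = (addr.extent * PAGES_PER_EXTENT + addr.page) * PAGE_SIZE
    return files[addr.file_id], offset


# Hypothetical layout: one database spread over three physical drives.
FILES = {
    0: r"D:\data\app.part0",
    1: r"E:\data\app.part1",
    2: r"F:\data\app.part2",
}
```

Here locate(PageAddress(1, 2, 3), FILES) resolves to the file on E: at byte offset 155648.  Because the pointer names a file id rather than an absolute position in one file, a file can be moved to another drive by updating the FILES map without rewriting any index pages.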


js_vistadb
