Performance numbers and VistaDB

written by Jason Short on Friday, June 29 2007

Ok, I have been asked this question enough times recently that I am going to write it up as a blog post and give the long answer as to why we don’t allow performance numbers to be posted (and why I don’t even think they are valid).

The EULA for VistaDB has always had a statement about non-publication of performance numbers. This is something that predates me buying the company. At the time, when I was just a user of the product, I always thought it was weird. Anthony had good reasons, and I didn’t really care. The DB did what I needed and that was what mattered to me.

Now I am the owner and the shoe is on the other foot. I left that clause in because all the same reasons were valid, but people need SOME benchmark that the database works at all. So I am rethinking a way to present data in a usable fashion.

The biggest reasons (and they are still valid) are that ops per second is meaningless without context, and that any numbers given will be obsolete the moment they are published, while things on the Internet live forever. Someone will still be pointing to some "performance number" eight versions from now when it is totally irrelevant.

Vendor optimization

I remember a few years ago trying another database vendor. Their speed demo (everyone had one) showed these huge numbers for performance. I was pretty excited and spent about two weeks learning their system. It was totally object based, which was quite a learning curve, and the performance was terrible. So I went back and looked at the demo. The demo was a seriously artificial example of how not to do performance numbers. All the objects had like one variable in them. Who would need to store that in an object database? The vendor does not use a speed demo anymore, and states that you cannot compare a relational database to an object based one without measuring a lot more than TPS. I agree. It was a bad move for them to put out that demo in the first place.
They now stick to their guns that being object oriented is just different from relational.

VistaDB is different

VistaDB is different as well. We are not based on objects, but being fully managed is a big deal. There are a lot of people looking for a replacement for Access who stumble across us and don’t understand what the big deal is about managed code. There are some benefits to VistaDB being managed, but speed is not one of them. Will it improve in time? Yes. As Microsoft optimizes the frameworks, and as new hardware comes out that Dot Net can take advantage of (64 bit Windows being a good example of a free upgrade built into Dot Net), things will improve. This is a long term commitment by Microsoft to make Dot Net their top tier platform. I would guess in less than 5 years you will not be able to get a development tool from Microsoft that does not rely on Dot Net.

Graphics card manufacturers

A few years ago there were a number of standard graphics benchmarks that were supposed to give users an idea of how each card would perform the same tasks relative to the others. Almost as soon as they came out, graphics card manufacturers started optimizing their drivers and cards for those specific cases to make themselves look better on the test. Did it make the games look better or perform better? No. In many cases two cards with similar numbers would not play the same game at the same frame rate. That is why places like Tom’s Hardware use lots of different games now to benchmark. It is then a true comparison of how the card will perform on your specific game.

TPC benchmarks

So what is that benchmark in the database world? Arguably it is the TPC benchmark.
TPC Benchmark™ App (TPC-App) is an application server and web services benchmark. The workload is performed in a managed environment that simulates the activities of a business-to-business transactional application server operating in a 24x7 environment. TPC-App showcases the performance capabilities of application server systems.
So does this give you a real idea of how well one database performs against another? No, not even close. Microsoft and the other big vendors all tweak their code and setups to get the best performance on that test. And usually the current leaders on the boards are there because of the hardware they throw at the problem, not because of the database vendor at all. I looked at a recent “world record” that was set by a vendor. The hardware for the test cost over $250,000, not to mention the licenses for Windows Server Datacenter edition, etc. That is silly. The benchmarks have become a way for database vendors to show off the numbers, even though less than 1% of their users could probably ever afford that hardware.

Artificial benchmarks are worthless

Sure, we could make up a benchmark that shows a query over 60 million records running in less than 1 second. I have seen that type of claim before from vendors. It would be a very artificial test. And you could then build optimizations in to make sure that ONE test executes insanely fast. I don’t want to do that.

I looked for some sort of open benchmark for databases like Winmark and could not find one that was even remotely close to realistic. If you know of one please let me know.

So what is the answer?

I think some type of benchmark is needed. And I think what VistaDB really needs is a comparative benchmark that shows our own performance over time. How are we doing? Are we improving? This will let people know that we are serious about performance, and that it will continue to improve over time. But I will repeat that data security is much more important than speed to me. Given two solutions to a problem, one that might lose data and one that will not, I will always choose the safer system.

I will work on a benchmark that we will include with our system. It will store our past performance and our current performance. I don’t want people to take this test and make every other database on the planet run against it.
That is not our goal. Our goal is to measure our own performance and make sure it is acceptable. If people choose their database based only upon raw numbers printed somewhere, then that is not for us anyway. There is much more to be gained with managed code than speed. Speed of development is often much more important than execution speed. Safety of the code is another gain. Not having to worry about buffer overflows and memory violations is a big deal.

Thanks as always for being a customer.
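
To make the "measure yourself over time" idea above a little more concrete, here is a minimal sketch in Python. It is not the VistaDB benchmark and says nothing about how that one will be built. The function names, the JSON history file, and the version labels are all assumptions made up for illustration. The only point is the shape of the thing: time a workload that looks like your own application, store the result next to earlier results, and compare against your own previous run rather than against anyone else.

```python
import json
import time
from pathlib import Path


def time_workload(workload, repeats=5):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        best = min(best, time.perf_counter() - start)
    return best


def record_run(history_path, version, seconds):
    """Append one result to a JSON history file and return the full history.

    The file layout (a list of {"version", "seconds"} entries) is an
    assumption for this sketch, not an existing format.
    """
    path = Path(history_path)
    history = json.loads(path.read_text()) if path.exists() else []
    history.append({"version": version, "seconds": seconds})
    path.write_text(json.dumps(history, indent=2))
    return history


def change_vs_previous(history):
    """Percent change of the latest run against the one before it.

    Negative means the latest run was faster. Returns None when there
    are fewer than two runs, since there is nothing to compare yet.
    """
    if len(history) < 2:
        return None
    previous = history[-2]["seconds"]
    latest = history[-1]["seconds"]
    return (latest - previous) / previous * 100.0
```

In use, you would pass `time_workload` a function that runs inserts and queries resembling your real application, call `record_run` with a label for the build you are testing, and look at `change_vs_previous` to see whether that build got faster or slower than the last one on the same machine. The numbers only mean something relative to your own earlier runs, which is the whole idea.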
