Home

I'm quite the emotional person, so I end up saying something along the lines of the subject to oh, say, two or three people, some days. I emotionally invest myself in things that I probably shouldn't, as all of us are probably guilty of that from time to time.

I was particularly mean spirited in my recent post about the TPC-H(tm) benchmark when I suggested that an unnamed analyst should perhaps leave the industry behind. I didn't really mean that.

I choose to give the TPC-H(tm) numbers significance, especially as an indicator of price/performance ratios. Kickfire kicks ass with a PP ratio of $0.81 per QPPH. This is very impressive. You can choose to disagree if you like.

mysql> select now() + interval 2918472 day;
+------------------------------+
| now() + interval 2918472 day |
+------------------------------+
| 9999-12-31 16:14:03          |
+------------------------------+
1 row in set (0.00 sec)

mysql> select now() + interval 2918473 day;
+------------------------------+
| now() + interval 2918473 day |
+------------------------------+
| NULL                         |
+------------------------------+
1 row in set, 1 warning (0.00 sec)


Hehe.

There has been a lot of discussion in the blogosphere recently about benchmarks, and I can't help but notice that a lot of the comments and opinions are harsh, sometimes downright mean, and alas uninformed.

Why do people care so much about a benchmark. IT IS JUST NUMBERS PEOPLE.

Seriously (and this isn't directed at any one person in particular):

Who doesn't like to play "my thing is bigger than your thing" every once in awhile, but you don't have to take it so personally. If a vendor wants to waste money running a huge benchmark for little profit - WHO CARES? There must be some audience for the benchmark results, but if you aren't it, why bother screaming at the top of your lungs that the benchmark is ludicrous, or useless or ineffective. Just ignore it.  You obviously are not the intended audience.  If you are the intended audience and you still don't care, then go play another game.  You've obviously tired of the industry.

People spend way too much time arguing about unimportant things, particularly when they don't understand the underlying technical issues to the fullest extent possible.

I personally want to say:
I think ParAccell's TPC-H result is impressive. Running the TPC-H queries at 30TB is an ENGINEERING MARVEL, regardless of the amount of tech that was necessary to produce the result. Period. If your database can run 30TB TPC-H, then fine, feel free to criticize the 30TB results and BEST THEM, otherwise, give some fucking credit where credit is due. dudes. seriously.

fine print:
I work for Kickfire, who has published audited TPC-H benchmark results on the Kickfire MySQL Appliance.  
My opinions here don't represent Kickfire.  This is my blog, with my own opinions.

I built a new benchmarking machine

  • Jun. 16th, 2009 at 10:14 PM

I've been wanting to do some benchmarking of different databases, tuning of various database parameters, different I/O subsystems, etc. I put myself together a new high-end workstation/mid-range server system to test with.

I stopped by Central Computers the other day, and in one of their nice glass display cases I saw the ASUS Z8NA-D6C motherboard, the first ATX dual socket 1366 motherboard. I decided to make that the centerpiece of my new system.

The goods:
ASUS Z8NA-D6C motherboard
OCZ 850W power supply
2 quadcore E5504 Xeon processors (2GHz/4MB cache)
12GB unbuffered DDR3 super talent memory
Adaptec 5805 unified SATA/SAS w/ 512MB cache
4 Hitachi Deskstar 7200RPM 1TB SATA drives
2 OCZ Vertex 250GB SSD drives
2 Patriot Memory Warp V3 250GB SSD drives
EVGA Nvidia GTX285 video card
Creative X-Fi sound card

So far performance has been great.

Avoid Patriot Memory 256GB WARP v3 drives

  • Jun. 8th, 2009 at 12:06 PM

I recently purchased two of these drives on recommendation from a friend, but they are really terrifically horrible drives, especially for MySQL. I won't be accepting his recommendations anymore. I have the same problem with BOTH drives, and surprisingly the drives seem to have more problems and perform worse in RAID0 than as JBOD.

The drives lock up during random write, even for small block transfers. The only way to get the drive to respond after it 'locks', is to power cycle it or the computer.

The following tests were done on my Adaptec 5805 Dual Core RAID controller so I'm sure it isn't the controller's fault. Just to make sure, I tested a Highpoint 2320, a 3ware 9650se and onboard NVIDIA raid (as a PC enthusiast, I have a lot of RAID controllers to test) and all performed similarly.

Random write to the SSD (anything larger than 512K will lock the drive):



Random write to 4 SATA Hitachi hard disks:

Tags:


It uses MySQL proxy, and MySQL proxy quietly refuses to work with these versions. The client reports 'Lost connection to MySQL server' during the authentication phase with 'error 0'.

I've filed Bug #45167 to track the issue.

As a workaround you can comment out the check for the version in src/network-mysqld.c +920
      if (recv_sock->challenge &&
          recv_sock->challenge->server_version > 50113 && recv_sock->challenge->server_version < 50118) {


        /**
         * Bug #25371
         *
         * COM_CHANGE_USER returns 2 ERR packets instead of one
         *
         * we can auto-correct the issue if needed and remove the second packet
         * Some clients handle this issue and expect a double ERR packet.
         */


        con->state = CON_STATE_ERROR;
      }


Apparently "we can auto-correct the issue", but the code fails to, and instead just returns CON_STATE_ERROR and disconnects. Without writing anything into the log.

YOU HAVE NO IDEA HOW LONG IT TOOK TO FIGURE THIS OUT

I sent mails to the proxy-discuss mailing list, which apparently never showed up. Talking with Jan back and forth, he thought it might be 'old_passwords=1' at first, but that wasn't the problem. Only after sending him debug information directly from strace, did he realize that the above code was at fault, and suggest I comment it out. So thanks Jan, I really appreciate the help, but I wish I didn't have to modify the code.

This isn't because I dislike changing the code, but because Enterprise Manager isn't open source. And it includes GPL components like MySQL proxy, but it isn't easy to replace the proxy component without changing the enterprise monitor packaging which is closed source, and not redistributable, as you have to go through the trial process or download the licensed binaries directly.

str_to_date can be a little ambiguous

  • May. 14th, 2009 at 4:51 PM

The MySQL manual for str_to_date states:
If str contains an illegal date, time, or datetime value, STR_TO_DATE() returns NULL. An
illegal value also produces a warning.

Surely "I'm_not_a_valid_date" is not a valid date, time or datetime value.
mysql> select str_to_date("I'm_not_a_valid_date","I'm_not_a_valid_date");
+------------------------------------------------------------+
| str_to_date("I'm_not_a_valid_date","I'm_not_a_valid_date") |
+------------------------------------------------------------+
| 0000-00-00                                                 |
+------------------------------------------------------------+
1 row in set (0.00 sec)


The problem here is that values in the format string which are not preceded by the percent sign (%) are treated as constant characters which must match the input string exactly. Normally these characters are used as delimiters. For example, the dash characters in the format string '%Y-%m-%d'.

For example:
mysql> select str_to_date('2008?01?01','%Y?%m?%d');
+--------------------------------------+
| str_to_date('2008?01?01','%Y?%m?%d') |
+--------------------------------------+
| 2008-01-01                           |
+--------------------------------------+
1 row in set (0.00 sec)


The ambiguity comes into play when both strings consist ONLY of matching characters with NO format specifications:
mysql> select str_to_date('2008?01?01','2008?01?01');
+----------------------------------------+
| str_to_date('2008?01?01','2008?01?01') |
+----------------------------------------+
| 0000-00-00                             |
+----------------------------------------+
1 row in set (0.00 sec)


I think this should return NULL, because no format characters have been specified, therefore no legal date should be output.

I've filed Bug 44891 to let MySQL decide.

Tags:


From the MySQL manual:
BIT_COUNT(N)

Returns the number of bits that are set in the argument N.

-------

I thought that meant that it returned the number of bits which needed to be set in order to store the value. To store 2^32 you need 33 bits of storage, with only one bit /set/. I expected 33 instead of 1.

#tested on 5.0.45 and 5.1.32-community
mysql> select i, pow(2,i), bit_count(pow(2,i)) from pow2;
+----+------------+---------------------+
| i  | pow(2,i)   | bit_count(pow(2,i)) |
+----+------------+---------------------+
|  1 |          2 |                   1 |
|  2 |          4 |                   1 |
|  3 |          8 |                   1 |
|  4 |         16 |                   1 |
|  5 |         32 |                   1 |
|  6 |         64 |                   1 |
|  7 |        128 |                   1 |
|  8 |        256 |                   1 |
|  9 |        512 |                   1 |
| 10 |       1024 |                   1 |
| 11 |       2048 |                   1 |
| 12 |       4096 |                   1 |
| 13 |       8192 |                   1 |
| 14 |      16384 |                   1 |
| 15 |      32768 |                   1 |
| 16 |      65536 |                   1 |
| 17 |     131072 |                   1 |
| 18 |     262144 |                   1 |
| 19 |     524288 |                   1 |
| 20 |    1048576 |                   1 |
| 21 |    2097152 |                   1 |
| 22 |    4194304 |                   1 |
| 23 |    8388608 |                   1 |
| 24 |   16777216 |                   1 |
| 25 |   33554432 |                   1 |
| 26 |   67108864 |                   1 |
| 27 |  134217728 |                   1 |
| 28 |  268435456 |                   1 |
| 29 |  536870912 |                   1 |
| 30 | 1073741824 |                   1 |
| 31 | 2147483648 |                   1 |
| 32 | 4294967296 |                   1 |
+----+------------+---------------------+
32 rows in set (0.00 sec)


[edit] I'm a boob.  I forgot to subtract one.  all powers of two do only have one bit set.  I'm still not feeling well and taking cold medication.

Tags:


In a blog post today, Stefan of the MySQL documentation team says that the MySQL documents will not be open sourced. Now, I have to go back and review the MySQL CE Keynote that Karen gave, but I'm pretty sure she said to the audience that it was "going to happen". Why would she even bother bringing it up in the keynote if it was never really being considered anyway? There has been little community input on this, and I for one am outraged that a GPL product would not release the documentation as GPL too.

I'm now totally convinced that MySQL does not understand, and will never understand the MySQL community.

Is this an opportunity for Oracle to try to stall community participation in MySQL? I don't want to be paranoid, but this could just be the hints of bold new changes in MySQL strategy, especially considering Karen's keynote mention of GPLing the docs.

I think the Free Software Foundation should protest MySQL's latest move and coordinate a letter-writing campaign in support for open source documentation for open source licensed software. Sun is actively moving to thwart the efforts of the community by not putting the documentation under the GPL license.

Is it time for an open source version of the MySQL documentation to emerge from the community? Keeping manuals in sync between multiple versions of multiple product lines requires a very large team of contributors. An entire team at MySQL is responsible for documentation changes. I'm not sure that the community can sustain useful database innovation while also being burdened under the weight of recreating the omnibus MySQL documentation set.

I went looking for the MySQL 5.4 documentation today, but alas I could not find it. In particular, I went looking for the list of changes between 5.1.34 and 5.4.1.

If 5.4.1 is beta then it should definitely have documentation, at least in beta form. This is especially true in light of the Sun MySQL conference keynote kind of dangling a carrot on a stick, as it were, with a question to the audience about opening the docs up under the GPL. The comments system also needs updated too. I think that if public comments are available on the web site, then those comments should have a rating system attached so that crappy, wrong and useless manual comments can easily be filtered. But I digress.

So, Sun, when are going to release all the MySQL documents under the GPL? Is it going to be in the MySQL 5.4 time frame, or in some far off hazy future? And why are there no 5.4 documents available online, when the binaries are already available for download?

I appreciate the timing of the release of the binaries with the conference, but for me, some documentation about changes and information about how to get the most out of the new release would be beneficial.

In particular I have questions about the new BKA algorithm, any new or changed my.cnf parameters related to the performance changes, and a quick overview of all the new features and changes since MySQL 5.1.

That isn't too much to ask, is it?

Can't wait for table change logs.

  • Apr. 21st, 2009 at 8:43 PM

At today's keynote by Mark Callaghan one of the new options he talked about are table change logs. He mentioned they might be of use to external applications, like Flexviews, which he mentioned but not directly. He asked if the guy who wrote it was in the audience, so I got to wave my hand and yell 'Flexviews!'.

I caught up with Mark at the Facebook party this evening. I had a chance to talk to him not only about the change logs, but also about Kickfire and the SQL chip. He asked me what I thought about working at Kickfire and I smiled and said I love it. I think I said "I've never been able to join a billion row table to a hundred million row table, sort, group and get results back in less than a minute" and I'm sure the smile never left my face.

As far as the table change logs, he verified:

  • The global transaction id will be stored in the table

  • OLD and NEW rows will be available for updated rows



This is great news for Flexviews (http://flexviews.sourceforge.net). Currently triggers must be used to maintain table change logs and this incurs a very high performance penalty for writes. With storage engine level change logs, this penalty is much reduce because the version information for the rows needs to be maintained for rollback purposes anyway. MVCC is great.

I also spoke with the folks over at the Primebase booth about getting similar change-logging enabled for PBXT tables as well.

So far I'm enjoying the conference immensely. See you tomorrow!

Just like this time last year (I hope this isn't a perennial event) everybody is expressing fear, uncertainty and doubt about the future of MySQL in the wake of yet another acquisition. What is going to happen?

MySQL isn't going to disappear tomorrow. Or even in the next ten years. Why?

  • The internet runs on the LAMP stack. There is so much technological infrastructure dedicated to MySQL that anybody would be crazy to think that it is going to disappear anytime soon. Hell, lots of companies still run 4.0 databases.

  • The MySQL codebase is GPL. Even if Oracle just closed up the MySQL office and set everybody packing the source code is ours to use, to repackage, to redistribute and to enhance. Drizzle already exists as a fork and others could emerge. I don't think that forks and alternative distributions are bad.


  • So if MySQL isn't going to disappear, what is going to happen (not that I really know)?

  • Support contracts might get more expensive in the Oracle world.

  • MySQL will continue to provide a low barrier-to-entry and high performance database for small enterprises and OLTP environments. There will still be lots of consulting and support opportunities for these companies. Large companies already running MySQL will be less likely to switch to an alternative technology unless compatibility with their existing apps can be ensured, or there is a clear value prop in terms of price/performance from an alternative technology.
  • The community might move behind a new fork of MySQL. Drizzle is a good alternative, but its goals are not aligned with all use cases of the MySQL database, in particular data warehousing and analytics. This obviously could change in the future.
  • Oracle could actually improve MySQL. MySQL needs improved triggers, stored procedures, materialized views, roles and other features. MySQL 5.1 was in RC for almost two years and a lot of stagnation has happened. A lot of planned features are half-ass or half-baked.

    Honestly, these are probably pretty much the same things people said last year about the Sun acquisition. Has anything /really/ changed over the last year, except community getting shafted by not having enough version numbers remaining? We still have a buggy 5.1 release, the conference is an explosion of marketing diarrhea and nobody is certain about the future of MySQL.

    I find it funny that I started as an Oracle DBA, and I guess technically, I am again!

There are two problems with the CAST function in all versions of MySQL:

bug#43774CAST(CAST(value) as TIME as DATETIME ) returns the wrong value.

example:
mysql> select '0000-00-00 21:00:00' expected, cast(cast('21:00:00' as time) as datetime) actual;
+---------------------+---------------------+
| expected | actual |
+---------------------+---------------------+
| 0000-00-00 21:00:00 | 2021-00-00 00:00:00 |
+---------------------+---------------------+
1 row in set (0.00 sec)

bug#43865CAST( -1.0 as UNSIGNED) yields 0 instead of MAXINT and returns a warning.

example:
create table t1(c1 int signed,
c2 decimal(5,2),
c3 float(5,2)
);

insert into t1 values (-1,-1,-1);

ACTUAL RESULT:

select cast(-1.0 as unsigned), cast(c1 as unsigned), cast(c2 as unsigned), cast(c3 as
unsigned) from t1 \G
*************************** 1. row ***************************
cast(-1.0 as unsigned): 0
cast(c1 as unsigned): 18446744073709551615
cast(c2 as unsigned): 0
cast(c3 as unsigned): 18446744073709551615
1 row in set, 2 warnings (0.00 sec)

mysql> show warnings;
+-------+------+---------------------------------------+
| Level | Code | Message |
+-------+------+---------------------------------------+
| Error | 1292 | Truncated incorrect DECIMAL value: '' |
| Error | 1292 | Truncated incorrect DECIMAL value: '' |
+-------+------+---------------------------------------+
2 rows in set (0.01 sec)

EXPECTED RESULT:
cast(-1.0 as unsigned): 18446744073709551615
cast(c1 as unsigned): 18446744073709551615
cast(c2 as unsigned): 18446744073709551615
cast(c3 as unsigned): 18446744073709551615
1 row in set, 0 warnings (0.01 sec)

Here the division between td1.c1 and td2.c2 is correct:
select td1.c1, td2.c2, td1.c1/td2.c2, -99 / 0.03    
  from testdata td1,         
       testdata td2   
 where td1.c1 = -99 
   and td2.c2 = 0.03 
 limit 1;

+------+------+---------------+------------+
| c1   | c2   | td1.c1/td2.c2 | -99 / 0.03 |
+------+------+---------------+------------+
|  -99 | 0.03 |    -3300.0000 | -3300.0000 |
+------+------+---------------+------------+
1 row in set (0.00 sec)

Here DISTINCT is added to the query.  The result is incorrect:

select distinct td1.c1, td2.c2, td1.c1/td2.c2, -99 / 0.03    
  from testdata td1,         
       testdata td2   
 where td1.c1 = -99 
   and td2.c2 = 0.03 
 limit 1;

+------+------+---------------+------------+
| c1   | c2   | td1.c1/td2.c2 | -99 / 0.03 |
+------+------+---------------+------------+
|  -99 | 0.03 |     -999.9999 | -3300.0000 |
+------+------+---------------+------------+
1 row in set (0.00 sec)


see MySQL bug#41814
http://bugs.mysql.com/bug.php?id=41814

The query returns correct (approximate) results on FLOAT columns:
select distinct td1.c1, td2.c2, td1.c1/td2.c2, -99 / 0.03 
  from testdata2 td1, 
       testdata2 td2 
 where td1.c1 = -99 
   and td2.c2 between 0.02 and 0.04;
+------+------+-------------------+------------+
| c1   | c2   | td1.c1/td2.c2     | -99 / 0.03 |
+------+------+-------------------+------------+
|  -99 | 0.03 | -3300.00007376075 | -3300.0000 |
+------+------+-------------------+------------+
1 row in set (0.00 sec)

Tags:

a question: current_thd macro and UDFs

  • Sep. 17th, 2008 at 1:55 PM

I saw some MySQL UDF code recently that uses the current_thd macro to get a THD object pointing to the current thread.

I was under the impression that this was not safe to do from a UDF? Am I wrong? Is a critical section needed to read or modify it?

The new MySQL 6.0 replacement API for the UDF interface looks like it specifically gives access to a THD object, similar to native functions. Is this just to make the interfaces more similar? Or is this because the current UDF interface (<=5.1) doesn't provide a safe way to access the current THD?

Tags:

MySQL bug #31031

  • Aug. 26th, 2008 at 7:29 PM
big badda boom (upset)
I came across this behavior today while writing some additional comprehensive ALTER TABLE tests for use in the test suite that we run against the Kickfire database appliance to ensure quality.

mysql> CREATE TABLE `t1` (
-> `col1` tinyint(4) DEFAULT NULL,
-> `id` int(11) NOT NULL DEFAULT '0' COMMENT 'min=1,max=65535',
-> PRIMARY KEY (`id`)
-> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.00 sec)

mysql> alter table t1 drop column id,
-> add column id int auto_increment primary key;
ERROR 1089 (HY000): Incorrect prefix key; the used key part isn't a string, the used length is longer than the key part, or the storage engine doesn't support unique prefix keys

This is a regression in 5.0 (and 5.1, I didn't try 6) that was reported almost one year ago. This alter table works just fine in 4.1 and there isn't any reason it should complain about 'prefix index length' when there are no prefix indexes involved.

Ask MySQL to fix this regression in 5.0 before 5.1 GA.

http://bugs.mysql.com/bug.php?id=31031

CentOS 5 ships with version 5.0.22 of MySQL, which has a number of issues, but I discovered a major one that I wasn't aware of until yesterday. There is a problem in the 5.0.22 client that causes empty strings, or strings consisting only of spaces to be displayed as NULL:
http://bugs.mysql.com/bug.php?id=19564

CREATE TABLE test.x1 (c char(10)) ENGINE=MYISAM;
INSERT INTO test.x1 values (' ');
INSERT INTO test.x1 values ('');


-- on a 5.0.22 client (/usr/local/mysql):
mysql> select * from test.x1;
+------+
| c    |
+------+
| NULL |
| NULL |
+------+
2 rows in set (0.00 sec)

-- on a 5.0.37 client (~/bin/mysql):
mysql> select * from test.x1;
+------+
| c    |
+------+
|      |
|      |
+------+
2 rows in set (0.00 sec)

Flexviews for MySQL 1.0.3-alpha released

  • Jul. 15th, 2008 at 6:22 PM

Make sure you set sp_max_recursion_depth - see the instructions file.

changes:
functions/procedures now take parameters in DB,TABLE order.
demo database removed from sourceforge and added into SVN/release tarball.
GRANT USAGE on *.* to flexviews; is now the default permission level of the flexviews user
I removed the mview_ prefix from filenames.
flexviews.add_table() now checks for the existence of the table and the table's mvlog

get it at http://flexviews.sourceforge.net

Tags:

pretty sql

  • Jul. 10th, 2008 at 1:30 PM

I wanted to post an example of how I like to format my SQL on Sheeri's blog, but her blog eats my formatting.

*edit* she fixed my formatting on her blog, but I'll leave these here anyway
*edit* adding some other keywords






I like to CAPITALIZE KEYWORDS and right align them:
SELECT a.c1,
       a.c2,
       b.c1,
       c.cX
  FROM a
  JOIN b
    ON a.c1 = b.c1
   AND a.status > 0
  JOIN c
    ON c.c1 = a.c1
   AND c.c2 = b.c2
   AND c.cX IN (1,5,9)


I like to enclose inline views (dependent subqueries) in parenthesis and match the begin and end parens on expressions for readability:
SELECT a.c1,
       a.c2,
       b.c1,
       c.cX,
       COUNT(*) cnt
  FROM a
  JOIN (SELECT c1, c2             -- dependent subquery
          FROM b,d
         WHERE b.id = d.id
           AND d.status = -1
       ) b
    ON a.c1 = b.c1
   AND a.status > 0
  JOIN c                          -- comments for 
    ON c.c1 = a.c1                -- readability go out here
   AND c.c2 = b.c2
   AND ( c.cX IN (1,5,9) OR
         c.cY < 3
       )
   AND EXISTS ( SELECT 1
                  FROM table
                 WHERE c = 0 
              )
 WHERE c.c1 = 7
HAVING count(*) > 1
 GROUP BY a.c1,
          a.c2,
          b.c1,
          c.cx 

I will be demonstrating the features of Flexviews as well as talking about how you can improve performance by:

  • pre-aggregating important tables for improved query performance

  • pre-calculating joins on InnoDB using materialized views

  • nesting incrementally refreshed materialized views to further improve performance



http://mysql.meetup.com/101/

Profile

[info]swanhart
Justin Swanhart
Kickfire

Advertisement

Spottt Link Exchange

Spottt
Spottt

Syndicate

RSS Atom

Latest Month

July 2009
S M T W T F S
   1234
567891011
12131415161718
19202122232425
262728293031 
Powered by LiveJournal.com