Before a monitoring and notification system for MySQL (and other infrastructure) can be fully conceived, a variety of questions must be asked. I’ve attempted to put some of the most important of them together in this post.
These are what I feel are some best practice recommendations.
#1 Define what is to be monitored.
#2 Determine what level of health monitoring is desirable.
#3 Determine what hosts and services require active checks.
#4 Collect metrics when possible.
#5 Define the alerting/notification requirements.
#6 Define the notification escalation policy based on the SLA for the response.
#7 Pick the right tools.
These are what I feel are some best practice recommendations.
#1 Define what is to be monitored.
- Is it a HOST?
- Is it a SERVICE?
#2 Determine what level of health monitoring is desirable.
- Basic Monitoring Probes (typically active checks)
- Simple TCP/IP port up/down
- Disk space, free memory, swap utilization, etc
- File exists (or does not exist, or is not of at least X size)
- Process is running/not running
- etc
- Simple TCP/IP port up/down
- Log Monitoring (typically passive checks)
- System logs should be collected and analyzed.
- Application logs should be collected an analyzed.
- Any log monitoring software should issue alerts via the notification tool API for notification escalation,etc
- System logs should be collected and analyzed.
- Application monitoring (combination of active/passive)
- Simple service monitoring - send a string to this port and expect a response back(might be validated with regex)
- Complex service monitoring which requires a custom plugin
- Services for which nagios plugins already exist
- Passive alerts sent directly from the application
- Simple service monitoring - send a string to this port and expect a response back(might be validated with regex)
#3 Determine what hosts and services require active checks.
- How often is the service/host checked?
- How is “flapping” handled? Flapping is repeated up/down states during a service check interval.
- Who is notified at failure?
- What is the notification escalation policy
- How do you respond to alerts, quiesce the notification, etc?
#4 Collect metrics when possible.
- Health checks should collect metrics whenever possible. For instance, a “server health” plugin, which monitors CPU,swap,memory, disk space, etc, should return the metrics for the services which are being monitored. This avoids pulling for resource utilization AND alerting.
- Metrics collection may be resource intensive on the monitoring server.
- What is the long term policy for metrics collection, aggregation and reporting?
#5 Define the alerting/notification requirements.
- What is the SLA on the service?
- What is the SLA on the response to an alert?
- Who is to be notified when the service fails?
- What is the escalation policy?
#6 Define the notification escalation policy based on the SLA for the response.
- When an alert is not answered who is it escalated to?
- How is it escalated (email, sms, etc)
- What are the incentives for timely alert/notification response?
- Disincentives for non-timely response?
#7 Pick the right tools.
- I like Nagios. It is pluggable, supports metrics collection and it is reliable.
- Try not to reinvent the wheel too much.
- Try to come up with integrated monitoring/metrics collection and reporting. Keep as little operational data in your operational databases as possible. Limit operational monitoring databases to one month of historical data, in addition to maintaining status and notification information. Move historical information into a dedicated metrics reporting database schema.
edit: I filed MySQL bug#46867 about this issue.
Ask anyone who has learned to count the following two questions. The answer to both of which should be yes.
Q:Is 5 between 1 and 10? A: Yes
Q:Is 5 between 10 and 1? A: Yes
Ask MySQL those same questions:
This is a problem because applications may produce BETWEEN clauses. I don't think most applications specifically check that the lower bounds is lower than the upper bounds in left to right order. Logically, a number between 1 and 10 is also between 10 and 1.
I encountered this on my own benchmarking application that generates BETWEEN clauses.
So generate BETWEEN clauses with caution. Perhaps you would be better off with two >= and <= expressions instead of the more convenient BETWEEN.
Ask anyone who has learned to count the following two questions. The answer to both of which should be yes.
Q:Is 5 between 1 and 10? A: Yes
Q:Is 5 between 10 and 1? A: Yes
Ask MySQL those same questions:
mysql> (select 'Yes'
from dual
where 5 between 1 and 10
)
union
(select 'No'
)
limit 1;
+-----+
| Yes |
+-----+
| Yes |
+-----+
1 row in set (0.00 sec)
mysql> (select 'Yes'
from dual
where 5 between 10 and 1
)
union
(select 'No')
limit 1;
+-----+
| Yes |
+-----+
| No |
+-----+
1 row in set (0.00 sec)
This is a problem because applications may produce BETWEEN clauses. I don't think most applications specifically check that the lower bounds is lower than the upper bounds in left to right order. Logically, a number between 1 and 10 is also between 10 and 1.
I encountered this on my own benchmarking application that generates BETWEEN clauses.
So generate BETWEEN clauses with caution. Perhaps you would be better off with two >= and <= expressions instead of the more convenient BETWEEN.
Unlike COUNT(expression), COUNT(DISTINCT expression) is not a distributable function. That is, the expression can not be computed by looking at only the changed rows in the table change logs.
The following view can not be FAST refreshed with Flexviews today:
However, a dependent child materialization could be created to support the value for COUNT(DISTINCT d):
-- child materialization, dependent subview
--There will be one row for each DISTINCT value of d
The original view could then be rewritten as:
This can easily be done manually with the current implementation, you just have to be careful to always refresh the child materialization first, before refreshing the dependent view.
I am seriously considering porting Flexviews directly into Drizzle. I'm excited about replication plugins as this may make it easy to produce the necessary table change logs to support the materialization logic. Drizzle is becoming completely plugin oriented, so eventually materialized view rewrite and other cool features could be implemented too as optimizer plugins.
The following view can not be FAST refreshed with Flexviews today:
SELECT a a_alias,
b b_alias,
c c_alias,
COUNT(DISTINCT d) d_alias
FROM T1
GROUP BY a, b, c;
However, a dependent child materialization could be created to support the value for COUNT(DISTINCT d):
-- child materialization, dependent subview
--There will be one row for each DISTINCT value of d
SELECT a a_alias,
b b_alias,
c c_alias,
count(*) d_alias_cnt
FROM T1
GROUP BY a, b, c, d;
The original view could then be rewritten as:
SELECT a a_alias,
b b_alias,
c c_alias,
(SELECT COUNT(*)
FROM child_materialization) d_alias
FROM T1
GROUP by a_alias, b_alias, c_alias;
This can easily be done manually with the current implementation, you just have to be careful to always refresh the child materialization first, before refreshing the dependent view.
I am seriously considering porting Flexviews directly into Drizzle. I'm excited about replication plugins as this may make it easy to produce the necessary table change logs to support the materialization logic. Drizzle is becoming completely plugin oriented, so eventually materialized view rewrite and other cool features could be implemented too as optimizer plugins.
The MySQL manual for str_to_date states:
If str contains an illegal date, time, or datetime value, STR_TO_DATE() returns NULL. An
illegal value also produces a warning.
Surely "I'm_not_a_valid_date" is not a valid date, time or datetime value.
The problem here is that values in the format string which are not preceded by the percent sign (%) are treated as constant characters which must match the input string exactly. Normally these characters are used as delimiters. For example, the dash characters in the format string '%Y-%m-%d'.
For example:
The ambiguity comes into play when both strings consist ONLY of matching characters with NO format specifications:
I think this should return NULL, because no format characters have been specified, therefore no legal date should be output.
I've filed Bug 44891 to let MySQL decide.
If str contains an illegal date, time, or datetime value, STR_TO_DATE() returns NULL. An
illegal value also produces a warning.
Surely "I'm_not_a_valid_date" is not a valid date, time or datetime value.
mysql> select str_to_date("I'm_not_a_valid_date","I'm_not_a_valid_date");
+------------------------------------------------------------+
| str_to_date("I'm_not_a_valid_date","I'm_not_a_valid_date") |
+------------------------------------------------------------+
| 0000-00-00 |
+------------------------------------------------------------+
1 row in set (0.00 sec)
The problem here is that values in the format string which are not preceded by the percent sign (%) are treated as constant characters which must match the input string exactly. Normally these characters are used as delimiters. For example, the dash characters in the format string '%Y-%m-%d'.
For example:
mysql> select str_to_date('2008?01?01','%Y?%m?%d');
+--------------------------------------+
| str_to_date('2008?01?01','%Y?%m?%d') |
+--------------------------------------+
| 2008-01-01 |
+--------------------------------------+
1 row in set (0.00 sec)
The ambiguity comes into play when both strings consist ONLY of matching characters with NO format specifications:
mysql> select str_to_date('2008?01?01','2008?01?01');
+----------------------------------------+
| str_to_date('2008?01?01','2008?01?01') |
+----------------------------------------+
| 0000-00-00 |
+----------------------------------------+
1 row in set (0.00 sec)
I think this should return NULL, because no format characters have been specified, therefore no legal date should be output.
I've filed Bug 44891 to let MySQL decide.
In a blog post today, Stefan of the MySQL documentation team says that the MySQL documents will not be open sourced. Now, I have to go back and review the MySQL CE Keynote that Karen gave, but I'm pretty sure she said to the audience that it was "going to happen". Why would she even bother bringing it up in the keynote if it was never really being considered anyway? There has been little community input on this, and I for one am outraged that a GPL product would not release the documentation as GPL too.
I'm now totally convinced that MySQL does not understand, and will never understand the MySQL community.
Is this an opportunity for Oracle to try to stall community participation in MySQL? I don't want to be paranoid, but this could just be the hints of bold new changes in MySQL strategy, especially considering Karen's keynote mention of GPLing the docs.
I think the Free Software Foundation should protest MySQL's latest move and coordinate a letter-writing campaign in support for open source documentation for open source licensed software. Sun is actively moving to thwart the efforts of the community by not putting the documentation under the GPL license.
Is it time for an open source version of the MySQL documentation to emerge from the community? Keeping manuals in sync between multiple versions of multiple product lines requires a very large team of contributors. An entire team at MySQL is responsible for documentation changes. I'm not sure that the community can sustain useful database innovation while also being burdened under the weight of recreating the omnibus MySQL documentation set.
I'm now totally convinced that MySQL does not understand, and will never understand the MySQL community.
Is this an opportunity for Oracle to try to stall community participation in MySQL? I don't want to be paranoid, but this could just be the hints of bold new changes in MySQL strategy, especially considering Karen's keynote mention of GPLing the docs.
I think the Free Software Foundation should protest MySQL's latest move and coordinate a letter-writing campaign in support for open source documentation for open source licensed software. Sun is actively moving to thwart the efforts of the community by not putting the documentation under the GPL license.
Is it time for an open source version of the MySQL documentation to emerge from the community? Keeping manuals in sync between multiple versions of multiple product lines requires a very large team of contributors. An entire team at MySQL is responsible for documentation changes. I'm not sure that the community can sustain useful database innovation while also being burdened under the weight of recreating the omnibus MySQL documentation set.
- Mood:angry
I went looking for the MySQL 5.4 documentation today, but alas I could not find it. In particular, I went looking for the list of changes between 5.1.34 and 5.4.1.
If 5.4.1 is beta then it should definitely have documentation, at least in beta form. This is especially true in light of the Sun MySQL conference keynote kind of dangling a carrot on a stick, as it were, with a question to the audience about opening the docs up under the GPL. The comments system also needs updated too. I think that if public comments are available on the web site, then those comments should have a rating system attached so that crappy, wrong and useless manual comments can easily be filtered. But I digress.
So, Sun, when are going to release all the MySQL documents under the GPL? Is it going to be in the MySQL 5.4 time frame, or in some far off hazy future? And why are there no 5.4 documents available online, when the binaries are already available for download?
I appreciate the timing of the release of the binaries with the conference, but for me, some documentation about changes and information about how to get the most out of the new release would be beneficial.
In particular I have questions about the new BKA algorithm, any new or changed my.cnf parameters related to the performance changes, and a quick overview of all the new features and changes since MySQL 5.1.
That isn't too much to ask, is it?
If 5.4.1 is beta then it should definitely have documentation, at least in beta form. This is especially true in light of the Sun MySQL conference keynote kind of dangling a carrot on a stick, as it were, with a question to the audience about opening the docs up under the GPL. The comments system also needs updated too. I think that if public comments are available on the web site, then those comments should have a rating system attached so that crappy, wrong and useless manual comments can easily be filtered. But I digress.
So, Sun, when are going to release all the MySQL documents under the GPL? Is it going to be in the MySQL 5.4 time frame, or in some far off hazy future? And why are there no 5.4 documents available online, when the binaries are already available for download?
I appreciate the timing of the release of the binaries with the conference, but for me, some documentation about changes and information about how to get the most out of the new release would be beneficial.
In particular I have questions about the new BKA algorithm, any new or changed my.cnf parameters related to the performance changes, and a quick overview of all the new features and changes since MySQL 5.1.
That isn't too much to ask, is it?
Just like this time last year (I hope this isn't a perennial event) everybody is expressing fear, uncertainty and doubt about the future of MySQL in the wake of yet another acquisition. What is going to happen?
MySQL isn't going to disappear tomorrow. Or even in the next ten years. Why?
MySQL isn't going to disappear tomorrow. Or even in the next ten years. Why?
- The internet runs on the LAMP stack. There is so much technological infrastructure dedicated to MySQL that anybody would be crazy to think that it is going to disappear anytime soon. Hell, lots of companies still run 4.0 databases.
- The MySQL codebase is GPL. Even if Oracle just closed up the MySQL office and set everybody packing the source code is ours to use, to repackage, to redistribute and to enhance. Drizzle already exists as a fork and others could emerge. I don't think that forks and alternative distributions are bad.
- Support contracts might get more expensive in the Oracle world.
- MySQL will continue to provide a low barrier-to-entry and high performance database for small enterprises and OLTP environments. There will still be lots of consulting and support opportunities for these companies. Large companies already running MySQL will be less likely to switch to an alternative technology unless compatibility with their existing apps can be ensured, or there is a clear value prop in terms of price/performance from an alternative technology.
- The community might move behind a new fork of MySQL. Drizzle is a good alternative, but its goals are not aligned with all use cases of the MySQL database, in particular data warehousing and analytics. This obviously could change in the future.
- Oracle could actually improve MySQL. MySQL needs improved triggers, stored procedures, materialized views, roles and other features. MySQL 5.1 was in RC for almost two years and a lot of stagnation has happened. A lot of planned features are half-ass or half-baked.
Honestly, these are probably pretty much the same things people said last year about the Sun acquisition. Has anything /really/ changed over the last year, except community getting shafted by not having enough version numbers remaining? We still have a buggy 5.1 release, the conference is an explosion of marketing diarrhea and nobody is certain about the future of MySQL.
I find it funny that I started as an Oracle DBA, and I guess technically, I am again!
So if MySQL isn't going to disappear, what is going to happen (not that I really know)?
There are two problems with the CAST function in all versions of MySQL:
bug#43774CAST(CAST(value) as TIME as DATETIME ) returns the wrong value.
example:
mysql> select '0000-00-00 21:00:00' expected, cast(cast('21:00:00' as time) as datetime) actual;
+---------------------+----------------- ----+
| expected | actual |
+---------------------+----------------- ----+
| 0000-00-00 21:00:00 | 2021-00-00 00:00:00 |
+---------------------+----------------- ----+
1 row in set (0.00 sec)
bug#43865CAST( -1.0 as UNSIGNED) yields 0 instead of MAXINT and returns a warning.
example:
create table t1(c1 int signed,
c2 decimal(5,2),
c3 float(5,2)
);
insert into t1 values (-1,-1,-1);
ACTUAL RESULT:
select cast(-1.0 as unsigned), cast(c1 as unsigned), cast(c2 as unsigned), cast(c3 as
unsigned) from t1 \G
*************************** 1. row ***************************
cast(-1.0 as unsigned): 0
cast(c1 as unsigned): 18446744073709551615
cast(c2 as unsigned): 0
cast(c3 as unsigned): 18446744073709551615
1 row in set, 2 warnings (0.00 sec)
mysql> show warnings;
+-------+------+------------------------ ---------------+
| Level | Code | Message |
+-------+------+------------------------ ---------------+
| Error | 1292 | Truncated incorrect DECIMAL value: '' |
| Error | 1292 | Truncated incorrect DECIMAL value: '' |
+-------+------+------------------------ ---------------+
2 rows in set (0.01 sec)
EXPECTED RESULT:
cast(-1.0 as unsigned): 18446744073709551615
cast(c1 as unsigned): 18446744073709551615
cast(c2 as unsigned): 18446744073709551615
cast(c3 as unsigned): 18446744073709551615
1 row in set, 0 warnings (0.01 sec)
bug#43774CAST(CAST(value) as TIME as DATETIME ) returns the wrong value.
example:
mysql> select '0000-00-00 21:00:00' expected, cast(cast('21:00:00' as time) as datetime) actual;
+---------------------+-----------------
| expected | actual |
+---------------------+-----------------
| 0000-00-00 21:00:00 | 2021-00-00 00:00:00 |
+---------------------+-----------------
1 row in set (0.00 sec)
bug#43865CAST( -1.0 as UNSIGNED) yields 0 instead of MAXINT and returns a warning.
example:
create table t1(c1 int signed,
c2 decimal(5,2),
c3 float(5,2)
);
insert into t1 values (-1,-1,-1);
ACTUAL RESULT:
select cast(-1.0 as unsigned), cast(c1 as unsigned), cast(c2 as unsigned), cast(c3 as
unsigned) from t1 \G
*************************** 1. row ***************************
cast(-1.0 as unsigned): 0
cast(c1 as unsigned): 18446744073709551615
cast(c2 as unsigned): 0
cast(c3 as unsigned): 18446744073709551615
1 row in set, 2 warnings (0.00 sec)
mysql> show warnings;
+-------+------+------------------------
| Level | Code | Message |
+-------+------+------------------------
| Error | 1292 | Truncated incorrect DECIMAL value: '' |
| Error | 1292 | Truncated incorrect DECIMAL value: '' |
+-------+------+------------------------
2 rows in set (0.01 sec)
EXPECTED RESULT:
cast(-1.0 as unsigned): 18446744073709551615
cast(c1 as unsigned): 18446744073709551615
cast(c2 as unsigned): 18446744073709551615
cast(c3 as unsigned): 18446744073709551615
1 row in set, 0 warnings (0.01 sec)
Here the division between td1.c1 and td2.c2 is correct:
see MySQL bug#41814
http://bugs.mysql.com/bug.php?id=41814
The query returns correct (approximate) results on FLOAT columns:
select td1.c1, td2.c2, td1.c1/td2.c2, -99 / 0.03
from testdata td1,
testdata td2
where td1.c1 = -99
and td2.c2 = 0.03
limit 1;
+------+------+---------------+------------+
| c1 | c2 | td1.c1/td2.c2 | -99 / 0.03 |
+------+------+---------------+------------+
| -99 | 0.03 | -3300.0000 | -3300.0000 |
+------+------+---------------+------------+
1 row in set (0.00 sec)
Here DISTINCT is added to the query. The result is incorrect:
select distinct td1.c1, td2.c2, td1.c1/td2.c2, -99 / 0.03
from testdata td1,
testdata td2
where td1.c1 = -99
and td2.c2 = 0.03
limit 1;
+------+------+---------------+------------+
| c1 | c2 | td1.c1/td2.c2 | -99 / 0.03 |
+------+------+---------------+------------+
| -99 | 0.03 | -999.9999 | -3300.0000 |
+------+------+---------------+------------+
1 row in set (0.00 sec)
see MySQL bug#41814
http://bugs.mysql.com/bug.php?id=41814
The query returns correct (approximate) results on FLOAT columns:
select distinct td1.c1, td2.c2, td1.c1/td2.c2, -99 / 0.03
from testdata2 td1,
testdata2 td2
where td1.c1 = -99
and td2.c2 between 0.02 and 0.04;
+------+------+-------------------+------------+
| c1 | c2 | td1.c1/td2.c2 | -99 / 0.03 |
+------+------+-------------------+------------+
| -99 | 0.03 | -3300.00007376075 | -3300.0000 |
+------+------+-------------------+------------+
1 row in set (0.00 sec)
I saw some MySQL UDF code recently that uses the current_thd macro to get a THD object pointing to the current thread.
I was under the impression that this was not safe to do from a UDF? Am I wrong? Is a critical section needed to read or modify it?
The new MySQL 6.0 replacement API for the UDF interface looks like it specifically gives access to a THD object, similar to native functions. Is this just to make the interfaces more similar? Or is this because the current UDF interface (<=5.1) doesn't provide a safe way to access the current THD?
I was under the impression that this was not safe to do from a UDF? Am I wrong? Is a critical section needed to read or modify it?
The new MySQL 6.0 replacement API for the UDF interface looks like it specifically gives access to a THD object, similar to native functions. Is this just to make the interfaces more similar? Or is this because the current UDF interface (<=5.1) doesn't provide a safe way to access the current THD?
I will be demonstrating the features of Flexviews as well as talking about how you can improve performance by:
http://mysql.meetup.com/101/
- pre-aggregating important tables for improved query performance
- pre-calculating joins on InnoDB using materialized views
- nesting incrementally refreshed materialized views to further improve performance
http://mysql.meetup.com/101/
This problem is "documented" in two bugs:
http://bugs.mysql.com/bug.php?id=36073
http://bugs.mysql.com/bug.php?id=20001
Why in the world is this just a "documentation fix"? There have been a number of occasions when I've tried to write generic stored procedures that might want to analyze information about temporary tables. For instance, given a SELECT statement, I want to determine what datatypes each select clause produces. A fairly straightforward way to do this might be to run a "create temporary table xyz as (SELECT ... LIMIT 0), to produce an empty temporary table to analyze. This doesn't work however, because there isn't any information on xyz in the information schema! Worse, SHOW statements don't work as a PREPARED STATEMENT so there isn't any way to determine this information in a programmatic manner!
http://bugs.mysql.com/bug.php?id=36073
http://bugs.mysql.com/bug.php?id=20001
Why in the world is this just a "documentation fix"? There have been a number of occasions when I've tried to write generic stored procedures that might want to analyze information about temporary tables. For instance, given a SELECT statement, I want to determine what datatypes each select clause produces. A fairly straightforward way to do this might be to run a "create temporary table xyz as (SELECT ... LIMIT 0), to produce an empty temporary table to analyze. This doesn't work however, because there isn't any information on xyz in the information schema! Worse, SHOW statements don't work as a PREPARED STATEMENT so there isn't any way to determine this information in a programmatic manner!
mysql> create database if not exists test;
Query OK, 0 rows affected, 1 warning (0.02 sec)
mysql> use test;
Database changed
mysql> CREATE TEMPORARY TABLE x
-> (z int) engine=MYISAM;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT * FROM test.x;
Empty set (0.00 sec)
mysql> SELECT *
-> FROM INFORMATION_SCHEMA.TABLES
-> WHERE TABLE_NAME='x';
Empty set (0.01 sec)I've come up with some interesting workarounds for missing features using @session variables and I'd like to share one with you today: DELETE ... JOIN ... LIMIT N;
okay, so we've got two tables, one with many duplicates and one with no duplicates.
We want to join the tables using 'id', but only want to delete N duplicates from t1.
MySQL's DELETE does not include LIMIT support when JOIN is used, so we need to work around
that using some fancy footwork:
mysql> select * from t1;
+------+------+
| id | abc |
+------+------+
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
+------+------+
8 rows in set (0.00 sec)
mysql> select * from t2;
+----+------+
| id | xyz |
+----+------+
| 1 | 1 |
+----+------+
1 row in set (0.00 sec)
mysql> set @limit=2; set @rowCount=0;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
mysql> DELETE t1.*
-> FROM t1
-> JOIN t2 on (t1.id = t2.id)
-> WHERE t2.id = 1
-> AND CONCAT(t1.id, t1.abc, @rowCount +1)
-> = CONCAT(t1.id, t1.abc,
-> CASE @rowCount
-> WHEN @LIMIT THEN @rowCount:=NULL
-> ELSE @rowCount := @rowCount+1
-> END
-> );
Query OK, 2 rows affected (0.00 sec)
mysql> select * from t1;
+------+------+
| id | abc |
+------+------+
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
+------+------+
6 rows in set (0.00 sec)
how it works:
When MySQL first evaluates the expression, @rowCount is zero. The CONCAT in the where clause concatenates all the values given together, so at first we get CONCAT(1, 1, 1) = CONCAT(1, 1, 1). The next time through, we get CONCAT(1, 1, 2) = CONCAT(1, 1, 2). At the third execution we get CONCAT(1, 1, 3) = CONCAT(1, 1, NULL), which does not evaluate true. At this time @rowCount is now NULL, so subsequent scanned rows evaluate to
CONCAT(1, 1, NULL) = CONCAT(1, 1, NULL), which can never evaluate true (NULL != NULL).
So, what interesting ways have you used @session variables to come up with solutions to intractable problems?
okay, so we've got two tables, one with many duplicates and one with no duplicates.
We want to join the tables using 'id', but only want to delete N duplicates from t1.
MySQL's DELETE does not include LIMIT support when JOIN is used, so we need to work around
that using some fancy footwork:
mysql> select * from t1;
+------+------+
| id | abc |
+------+------+
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
+------+------+
8 rows in set (0.00 sec)
mysql> select * from t2;
+----+------+
| id | xyz |
+----+------+
| 1 | 1 |
+----+------+
1 row in set (0.00 sec)
mysql> set @limit=2; set @rowCount=0;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
mysql> DELETE t1.*
-> FROM t1
-> JOIN t2 on (t1.id = t2.id)
-> WHERE t2.id = 1
-> AND CONCAT(t1.id, t1.abc, @rowCount +1)
-> = CONCAT(t1.id, t1.abc,
-> CASE @rowCount
-> WHEN @LIMIT THEN @rowCount:=NULL
-> ELSE @rowCount := @rowCount+1
-> END
-> );
Query OK, 2 rows affected (0.00 sec)
mysql> select * from t1;
+------+------+
| id | abc |
+------+------+
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
+------+------+
6 rows in set (0.00 sec)
how it works:
When MySQL first evaluates the expression, @rowCount is zero. The CONCAT in the where clause concatenates all the values given together, so at first we get CONCAT(1, 1, 1) = CONCAT(1, 1, 1). The next time through, we get CONCAT(1, 1, 2) = CONCAT(1, 1, 2). At the third execution we get CONCAT(1, 1, 3) = CONCAT(1, 1, NULL), which does not evaluate true. At this time @rowCount is now NULL, so subsequent scanned rows evaluate to
CONCAT(1, 1, NULL) = CONCAT(1, 1, NULL), which can never evaluate true (NULL != NULL).
So, what interesting ways have you used @session variables to come up with solutions to intractable problems?
I need to generate large (1TB-3TB) synthetic MySQL datasets for testing, with a number of requirements:
a) custom output formatting (SQL, CSV, fixed-len row, etc)
b) referential integrity support (ie, child tables should reference PK values, no orphans,etc)
c) able to generate multiple tables in parallel
d) preferably able to operate without a GUI and/or manual intervention
e) uses a well defined templating construct for data generation
f) preferably open source
Does anyone out there know of a product that meets at least most of these requirements?
*edit*
I found a PHP based data generation script (www.generatedata.com) that is extensible in its output formatting, so it should do everything I need it to do.
a) custom output formatting (SQL, CSV, fixed-len row, etc)
b) referential integrity support (ie, child tables should reference PK values, no orphans,etc)
c) able to generate multiple tables in parallel
d) preferably able to operate without a GUI and/or manual intervention
e) uses a well defined templating construct for data generation
f) preferably open source
Does anyone out there know of a product that meets at least most of these requirements?
*edit*
I found a PHP based data generation script (www.generatedata.com) that is extensible in its output formatting, so it should do everything I need it to do.
Brian Moon suggest that community provided example my.cnf files would be a great thing to have on MySQLforge in this recent post: http://doughboy.wordpress.com/2008/05/0 6/example-mycnf-files/
I pulled out the "innodb heavy" config sample file and modified it with the standard settings that I typically start with when setting up a new InnoDB master. I've also modified the comments in the file a bit and have added some of my own too. I removed the sample slave configuration parameters (master-host, etc) because you should be using 'CHANGE MASTER TO'.
He suggested tagging such files with a 'mycnf' tag and very kindly tagged mine after I posted it :)
Feel free to share yours too and please feel free to make any comments about my configuration choices.
You can find it here (along with any other mycnf tagged files):
http://forge.mysql.com/tools/search.p hp?t=tag&k=mycnf
I pulled out the "innodb heavy" config sample file and modified it with the standard settings that I typically start with when setting up a new InnoDB master. I've also modified the comments in the file a bit and have added some of my own too. I removed the sample slave configuration parameters (master-host, etc) because you should be using 'CHANGE MASTER TO'.
He suggested tagging such files with a 'mycnf' tag and very kindly tagged mine after I posted it :)
Feel free to share yours too and please feel free to make any comments about my configuration choices.
You can find it here (along with any other mycnf tagged files):
http://forge.mysql.com/tools/search.p
Today is my first official day with Kickfire. I've spent most of the day reading up about how the appliance works and trying to wrap my head around some of the finer details.
My starting role here is essentially as an internal consultant, which means that I'll be the one that gets the MySQL server related questions from the development team. This is going to allow me a chance to really sink my teeth further into the source code and help implement some really cool tech, which has me quite excited.
Later I will be doing some external consulting (sales, implementation, etc) that will allow me to travel a bit.
Overall, I think this position is a great fit for me and I'm really psyched.
My starting role here is essentially as an internal consultant, which means that I'll be the one that gets the MySQL server related questions from the development team. This is going to allow me a chance to really sink my teeth further into the source code and help implement some really cool tech, which has me quite excited.
Later I will be doing some external consulting (sales, implementation, etc) that will allow me to travel a bit.
Overall, I think this position is a great fit for me and I'm really psyched.
Recently, I came upon Baron Swartz's (xaprb's) great now_usec UDF:
http://www.xaprb.com/blog/2007/10/30/ho w-i-built-the-now_usec-udf-for-mysql/
Hooray..
So in summary: to compile a C++ UDF on OSX, use -dynamiclib instead of -shared
and add -lstdc++ to the compile flags.
http://www.xaprb.com/blog/2007/10/30/ho
I use an Intel Macbook Pro for my primary development environment. GCC on OS X has some interesting quirks. Usually to compile a UDF on GNU/Linux I use the following command line:
gcc -fPIC -Wall -I/usr/include/mysql -shared -o udf_now_usec.so udf_now_usec.cc
When I tried to execute the above on my OS X box, I got some errors:
$ gcc -fPIC -Wall -I../include -shared -o udf_now_usec.so udf_now_usec.cc i686-apple-darwin8-gcc-4.0.1: unrecognized option '-shared' /usr/bin/ld: Undefined symbols: _main ___gxx_personality_v0</b> collect2: ld returned 1 exit status
The first error had me particularly flummoxed for a few moments, as -shared is supposed to work to create shared objects :) It turns out that OSX uses -dynamiclib instead of shared. I have no idea why.
$ gcc -fPIC -Wall -I../include -dynamiclib -o udf_now_usec.so udf_now_usec.cc ld: Undefined symbols: ___gxx_personality_v0 /usr/bin/libtool: internal link edit command failed
Okay, thats better, but there is still an undefined symbol... Well, it turns out after some digging that ___gxx_personality_v0 is defined in the stdc++ library..
Adding -lstdc++ to the compile should fix that...
$ gcc -fPIC -Wall -I../include -dynamiclib -lstdc++ -o $
Hooray..
So in summary: to compile a C++ UDF on OSX, use -dynamiclib instead of -shared
and add -lstdc++ to the compile flags.


