Tuesday, October 16, 2012
Once more about comparison of subquery optimization in MySQL 5.6 and MariaDb 5.5, once more about nothing...
Has it ever happened to you, when attending a party or any other convention, to enter the hall, see a gang of nice people drinking, smiling, laughing, doing small talks, and you saying “hi” to them and in response getting just and empty stare somewhere above your head? And you then – trying to approach a guy looking quite familiar to you – stretch your hand to shake his hand just to watch him demonstratively turn his back to you? And so there you nervously looking around, and stealthily checking your back (no, no tail), and throwing a glance at your toes (still no hooves), and you passing your palm across your sweating forehead (no hints of horns). And “am I plagued?” comes to your head as you start regretting utterly that your appeared here at all, and you start wishing to be disintegrated, annihilated, completely blown out... You don't have to. Calm down. You are invisible. You are invisible for them. This is an Oracle convention, and you are not from Oracle. You do not exist. It's that simple.
Why am I getting so emotional, so mad? What's actually happened? What has forced me to express myself in the language that is not my native? Have I read this this blog? What's about it? The guy says “ I had demonstrated how subquery materialization, introduced in MySQL 5.6.5” , even though the feature was introduced as long ago as in MySQL 6.0 alpha in 2007? Well, big deal: lame wording, nothing more. The guy says about his latest feature: “I implemented”? I've checked: he really implemented it. Quite new code. The code is raw, contains obvious bugs,the first query you submitted from a test suite returned a badly chosen plan? Of course the code is raw (it's only RC after all), of course it contains bugs (any code contains bugs), and, of course you are extremely unlucky. The code exploits the same idea underneath as an unnamed implementation? Well, it's quite disputable. Besides “the same idea” with what? With that in MariaDB code? But MariaDB does not exist! Look at the blogs of MySQL developers. All of them. Look at the presentations from MySQL Connect. No mention of MariaDB. Ergo: it does not exists.
Dear MySQL developers from Oracle, the database engineers from Sun and my former colleagues from MySQL that still stay with Oracle (not too many of them though): Intentionally or unintentionally, with bad will or good will, you effectively destroy the house of FOSS (that is already pretty shaky due to its flimsy architecture) when you defiantly do not notice any other MySQL development, do not give any credit to the developers from the community. When I and my colleagues develop a new feature and give it to the community the only things I and my colleagues ask for is the acknowledgment from the community that I and my colleagues are considered as the authors of the feature. And the fact how much money Oracle spends on the development of its own features cannot deprive us of our authorship. So please, respect our authorship as we respect the authorship of anybody else, including yours. Otherwise there won't be any development community for MySQL. There might be some user community around Oracle's MySQL . This perhaps may be a goal for Oracle, but I doubt that it can be an attractive perspective for Percona, SkySQL, Galera and many other people from MySQL community.
I would like also to remind you about the culture of annotations that is indisputably acknowledged in open publications. Especially I remind about it for the engineers from Norway among whom are a few PhDs and who for sure are familiarized with this culture. Of course, theoretically speaking in any of my articles I can use the result of any theorem without giving any reference to the first publication. Yet, the chances are extremely low that the article will be published in any solid edition. And the chances are high that my next articles won't be accepted under some ridiculous pretext.
What if Oracle implicitly or explicitly prohibits the mention of the achievements of others in the blogs of its employees? Well, it's a hard question and at the moment I don't have any answer. I only know that a malicious corporate order cannot be taken as an excuse for my broken integrity. That's why, anticipating a very high probability of such kind of moral collisions, I preferred joining Monty Program AB instead of joining Oracle.
At the end I should apologize for Sergey Petrunia who published this blog several days ago and then went on vacation. In his blog, Sergey claimed that no implementation of Cost-based choice between Materialization and IN->EXISTS strategies could be found in MySQL 5.6.7. It turns out he was mistaken. His mistake is quite understandable though. He could not see any public commits for this feature, any traces of the feature in the official change log for the release. Anyway, unintentionally he published facts that were not true and he should have delivered his apologies, had he existed. But he doesn't. Neither do I.
Sunday, February 26, 2012
Comparing the optimizer features of MariaDB 5.3 and MySQL 5.6
In his comment for Vadim Tkachenko's post Baron Schwartz writes:
Baron is known as a prominent advocate of the Oracle MySQL products. In this role he probably believes that the less credit he gives to the competing product MariaDB, the better. As a member of the MariaDB team, I, on my part, can hardly be expected to consider it as a fair position.
My intention is not to convince Baron that MariaDB deserves a more gracious attitude from his side, but rather to provide other members of the MySQL community with a guidance how the information on the state of the MySQL optimizer development at MariaDB and at Oracle could be extracted from the product source code.
I conventionally understand “optimizer” here as the piece of the MySQL server code that is responsible for :
- transformations of the submitted query to one with better execution time
- search for the best execution plan of the transformed query.
Usually new, more efficient algorithms to perform relational operations employed by query execution plans are also considered as the area of responsibility of the optimizer. The user oriented functionality directly related to the optimization process (like support of EXPLAIN) also belong to the domain of the optimizer.
Ok, I'm a member of the MySQL optimizer team since 2003, I'm tightly involved in the MariaDB optimizer development, so it's quite probable that I know a lot of details concerning the optimizer code in MariDB. But what about the optimizer development at Oracle? Can I be considered knowledgeable here? I think so, if you take into account the following circumstances:
After the back-port of optimizer features from MySQL 6.0, the MariaDB team decided to add:
And of course, from the very beginning we planned to back-port the code of WL #4777 and to complete the development of the optimizations for derived tables (MWL #106).
We did not experience any problems with the implementations of MWL #68 and MWL #66. MWL #106 promised to be a not so easy project, but by May 2010 it was successfully completed. MWL #90 required some re-engineering work. Yet it was finished in more or less expected time.
Quite unexpectedly MWL #89 and MWL #121 turned out to be the tasks at which we stumbled badly (especially the first one). So badly that I had enough time to finish the task I had never planned for MariaDB 5.3: an implementation of the block-based hash join algorithm (MWL #128). This task required some re-engineering work for the other block-based join algorithms though.
Anyway, by the end of July 2011 (too much for a development cycle, anyone of us agrees upon this!) all optimizers features were in and we released MariaDB 5.3.0 beta. 7 months more and now we are ready to release 5.3.5 GA.
What about the back-ported optimizer features in MySQL 5.6?
Anyway, if you really want to compare the gains you get from optimizer features of MySQL 5.6 with those of MariaDB 5.3 I would recommend you to build the MySQL 5.6 tree with subquery optimizations.
Did we make such comparison? Yes, we did with the DBT-3 benchmark on the databases of scale factor 10 (~30GB), 30 (~80GB), 100 (270GB) for InnoDB and MyISAM.
Where does MariaDB win? On Q3, Q5, Q7, Q8, Q10. Why? Mainly because when employing BKA we can use MRR for primary keys in InnoDB and MRR with sorted keys for MyISAM/InnoDB. How much do we win? It depends the engine and the database scale factor (and the type the disk system, of course).
For example, when executing Q7 for the InnoDB database scale factor 10 we won 257s : 1886s , while when executing Q3 we won 288s : 691, when executing Q8 – 330s : 952s (a conventional HDD was used).
For the InnoDB scale factor 100 we had the following ratio
Q3. 980s : 2008s
Q7. 1393s : 8498s
Q8. 1118s : 3209s.
Do we lose sometimes? Never for MyISAM. Yet on Q9 and Q12 for InnoDB we did. We did not lose too much, but quite noticeably : 589s : 403s on Q9, 465s : 336s on Q12 if to execute on the database of scale factor 10.
The actual cause of this loss is to be investigated yet, but the interesting thing is that we do not lose here if we do not ask for MRR key sorting.
What about subquery optimizations for DBT-3? Here MySQL 5.6 and MariaDB 5.3 show approximately the same results. A more disappointing fact is that, with the exception of Q18 execution of which on MariaDB 5.2 takes years while on MariaDB 5.3 it takes a couple of minutes, other queries with subqueries from DBT-3 do not benefit too much from the new subquery optimizations.
Comparing optimizer features of MariaDB 5.3 and MySQL 5.6 I should mention also other optimizer improvements that do not overlap.
For MariaDB 5.3 these are:For MySQL 5.6 these are:(BTW, have you ever tried to execute Q7 or Q8 with optimizer trace turned on and to look through the results?).
The MariaDB 5.5 tree is green in Buildbot and I expect the first alpha release these days. MariaDB 5.5 will not include much of new development from MP AB. It will be mainly a merge of MySQL 5.5 with MariaDB 5.3. Yet some cute optimizer features will appear there.
I already blogged about one of them. Another feature will allow us to use subquery optimizations for some [NOT] EXISTS subqueries. There will be a couple of other very specific optimizer features developed for our valuable customers.
The optimizer code of MariaDB and that of MySQL diverge more and more. Does it mean that we won't be able to back-port interesting features from MySQL 5.6 into MariaDB? Not at all. We still understand the optimizer internals pretty well.
Do Oracle's MySQL optimizer guys understand it as well? Judging the evolution of their bug fixes I could say that definitely their understanding is getting better and better. I can't say that they've already reached our level, but it may happen in the future.
I almost forgot to give you their names in order you could easily detect their patches in the huge mysql trunk tree to follow their progress yourself and to be able to make judgments yourself rather the trust some shaky speculations. Here they are (in no particular order):
And here we are, those who actively work on optimizer features in MariaDB:
"...I speculate that when MySQL 5.6 is GA, the official MySQL from
Oracle will have a clear advantage over GA versions of MariaDB in
several common types of workloads. When will MariaDB based on MySQL
5.6 be released? I suppose that when MariaDB based on 5.5 is
finished, we will have a rule of thumb that might be useful to
estimate the lag. I’d expect (but I could be wrong) that it will
take somewhat longer to port to 5.6, because unlike the 5.5
codebase where Oracle, Percona, and Monty Program made a lot of
changes in somewhat disparate parts of the server, in MySQL 5.6
there will be a lot of changes that will potentially conflict —
in MySQL 5.6 there are extensive changes to the query optimizer
and the replication codebase, which have been changed a lot in
MariaDB as well. These changes might be difficult to merge."
Baron is known as a prominent advocate of the Oracle MySQL products. In this role he probably believes that the less credit he gives to the competing product MariaDB, the better. As a member of the MariaDB team, I, on my part, can hardly be expected to consider it as a fair position.
My intention is not to convince Baron that MariaDB deserves a more gracious attitude from his side, but rather to provide other members of the MySQL community with a guidance how the information on the state of the MySQL optimizer development at MariaDB and at Oracle could be extracted from the product source code.
I conventionally understand “optimizer” here as the piece of the MySQL server code that is responsible for :
- transformations of the submitted query to one with better execution time
- search for the best execution plan of the transformed query.
Usually new, more efficient algorithms to perform relational operations employed by query execution plans are also considered as the area of responsibility of the optimizer. The user oriented functionality directly related to the optimization process (like support of EXPLAIN) also belong to the domain of the optimizer.
Ok, I'm a member of the MySQL optimizer team since 2003, I'm tightly involved in the MariaDB optimizer development, so it's quite probable that I know a lot of details concerning the optimizer code in MariDB. But what about the optimizer development at Oracle? Can I be considered knowledgeable here? I think so, if you take into account the following circumstances:
- the main bulk of the optimizer features of MySQL 5.6 were back-ported from ill-fated MySQL 6.0
- these features were either written by my colleagues/me from Monty Program AB or were partly designed by me or were reviewed by me
- the MySQL 5.6 code is still open (lp:mysql-server/trunk) and I regularly review the optimizer related fixes from there and back-port them to the MariaDB code if I find them appropriate
- the MySQL optimizer features that were developed by Oracle proper are not too complex yet and can be accessed by any expert
- subquery semi-join optimizations (developed for MySQL-6.0 by Sergey Petrunia in 2007-2008)
- subquery materialization (developed for MySQL-6.0 by Timour Katchaounov in 2007)
- disk sweep implementation of the multi-range read (MRR) interface (developed for MySQL 6.0 by Sergey Petrunia in 2007)
- batched key access join algorithm (developed for MySQL 6.0 by me, Igor Babaev, in 2007-2008)
- index condition pushdown (ICP) (developed for MySQL 6.0 by Sergey Petrunia in 2007)
- optimizations for derived tables: merge of derived tables, late materialization of derived tables/views, indexes for derived tables/views (incomplete development for MySQL 6.0 by Evgeny Potemkin was done in 2008-2009, see WL #4777)
After the back-port of optimizer features from MySQL 6.0, the MariaDB team decided to add:
- null-aware check of IN/NOT non-correlated subquery predicates (MWL #68)
- smart choice between materialization of not-semijoin IN subqueries and transformations of the IN subquery predicates into EXISTS subqueries (MWL #89)
- inside-out execution of non-semijoin subqueries (MWL #90)
- subquery cache for correlated subqueries (MWL #66)
And of course, from the very beginning we planned to back-port the code of WL #4777 and to complete the development of the optimizations for derived tables (MWL #106).
We did not experience any problems with the implementations of MWL #68 and MWL #66. MWL #106 promised to be a not so easy project, but by May 2010 it was successfully completed. MWL #90 required some re-engineering work. Yet it was finished in more or less expected time.
Quite unexpectedly MWL #89 and MWL #121 turned out to be the tasks at which we stumbled badly (especially the first one). So badly that I had enough time to finish the task I had never planned for MariaDB 5.3: an implementation of the block-based hash join algorithm (MWL #128). This task required some re-engineering work for the other block-based join algorithms though.
Anyway, by the end of July 2011 (too much for a development cycle, anyone of us agrees upon this!) all optimizers features were in and we released MariaDB 5.3.0 beta. 7 months more and now we are ready to release 5.3.5 GA.
What about the back-ported optimizer features in MySQL 5.6?
- April 2011,MySQL 5.6.2:
- MRR is released (yet, with no MRR for innodb primary keys)
- ICP is released - October 2011, MySQL 5.6.3:
- optimizations for derived tables are released
- BKA join algorithm is released - December 2011, MySQL 5.6.4:
- no new optimizer features - ???, MySQL 5.6.5
- ???
Anyway, if you really want to compare the gains you get from optimizer features of MySQL 5.6 with those of MariaDB 5.3 I would recommend you to build the MySQL 5.6 tree with subquery optimizations.
Did we make such comparison? Yes, we did with the DBT-3 benchmark on the databases of scale factor 10 (~30GB), 30 (~80GB), 100 (270GB) for InnoDB and MyISAM.
Where does MariaDB win? On Q3, Q5, Q7, Q8, Q10. Why? Mainly because when employing BKA we can use MRR for primary keys in InnoDB and MRR with sorted keys for MyISAM/InnoDB. How much do we win? It depends the engine and the database scale factor (and the type the disk system, of course).
For example, when executing Q7 for the InnoDB database scale factor 10 we won 257s : 1886s , while when executing Q3 we won 288s : 691, when executing Q8 – 330s : 952s (a conventional HDD was used).
For the InnoDB scale factor 100 we had the following ratio
Q3. 980s : 2008s
Q7. 1393s : 8498s
Q8. 1118s : 3209s.
Do we lose sometimes? Never for MyISAM. Yet on Q9 and Q12 for InnoDB we did. We did not lose too much, but quite noticeably : 589s : 403s on Q9, 465s : 336s on Q12 if to execute on the database of scale factor 10.
The actual cause of this loss is to be investigated yet, but the interesting thing is that we do not lose here if we do not ask for MRR key sorting.
What about subquery optimizations for DBT-3? Here MySQL 5.6 and MariaDB 5.3 show approximately the same results. A more disappointing fact is that, with the exception of Q18 execution of which on MariaDB 5.2 takes years while on MariaDB 5.3 it takes a couple of minutes, other queries with subqueries from DBT-3 do not benefit too much from the new subquery optimizations.
Comparing optimizer features of MariaDB 5.3 and MySQL 5.6 I should mention also other optimizer improvements that do not overlap.
For MariaDB 5.3 these are:For MySQL 5.6 these are:(BTW, have you ever tried to execute Q7 or Q8 with optimizer trace turned on and to look through the results?).
The MariaDB 5.5 tree is green in Buildbot and I expect the first alpha release these days. MariaDB 5.5 will not include much of new development from MP AB. It will be mainly a merge of MySQL 5.5 with MariaDB 5.3. Yet some cute optimizer features will appear there.
I already blogged about one of them. Another feature will allow us to use subquery optimizations for some [NOT] EXISTS subqueries. There will be a couple of other very specific optimizer features developed for our valuable customers.
The optimizer code of MariaDB and that of MySQL diverge more and more. Does it mean that we won't be able to back-port interesting features from MySQL 5.6 into MariaDB? Not at all. We still understand the optimizer internals pretty well.
Do Oracle's MySQL optimizer guys understand it as well? Judging the evolution of their bug fixes I could say that definitely their understanding is getting better and better. I can't say that they've already reached our level, but it may happen in the future.
I almost forgot to give you their names in order you could easily detect their patches in the huge mysql trunk tree to follow their progress yourself and to be able to make judgments yourself rather the trust some shaky speculations. Here they are (in no particular order):
- Roy Lyseng (from Sun, Norway)
- Jorgen Loland (from Sun, Norway),
- Ole John Aske (from Sun, Norway),
- Jon Olav Hauglid (from Sun, Norway),
- Olav Sandstaa (from Sun, Norway),
- Oystein Grovlen (from Sun, Norway),
- Tor Didriksen (from Sun, Norway),
- Guilhem Bichot (from MySQL AB, original Maria team, France)
- Evgeny Potemkin (from MySQL AB, original optimizer team, Russia)
And here we are, those who actively work on optimizer features in MariaDB:
- Sergey Petrunia (with MySQL/MariaDB since November 2003, Russia)
- Timour Katchaounov (with MySQL/MariaDB since April 2004, Bulgaria)
- Sanja Byelkin (the author of the query cache, the author the original subquery code, one who implemented views in MySQL, Ukraine)
- Igor Babaev (with MySQL/MariaDB since December 2002, USA/Russia)
and, of course, the last, but not the least, - Michael (Monty) Widenius (the main reviewer of the optimizer code now, Finland)
Saturday, December 31, 2011
A 3-way join that touches only indexes
Can an execution of 3-way join use only indexes and not touch table rows at all? If we have MyISAM tables it's just impossible. Yet with InnoDB tables it would be possible if we could exploit so called extended keys – the regular secondary keys extended by the components of the primary key. The fact is the InnoDB engine works fine if you pass a key extended by primary key components, and, it uses the key to the full length without trimming it up to the base key fields. In the result we have a more narrow search and numerous obvious benefits from it.
Let's see how extended keys could be employed by execution for the following query built over a DBT-3/TPC-H database with one added index defined on p_retailprice.
(The query asks for orderkeys of the orders on 1992-07-01 that ordered parts with retail price greater than 2095.)
The query could be executed by the following execution plan:
What prevents the current MariaDB/MySQL optimizer from choosing such an apparently efficient plan? A trivial shortcoming of the optimizer: it does not consider extended keys when looking for possible index accesses to join a table.
Quite surprisingly this defect attracted my attention when I investigated the efficiency of index condition push-down (that, btw, exploits extended keys to the full measure) at the latest MySQL UC. Since it did not seem too difficult to fix this problem I decided to do it as soon as I came back from the conference. Indeed, it took me less than a week to produce a working variant that made the join optimizer, the range optimizer and min/max optimizations to be aware of extended keys. The implementation was fast and robust, but rather cumbersome since it used iterator classes to look through parts of the extended keys. It required quite a few changes in the server code.
Then we, at MP, became extremely busy with the first MariaDB 5.3 beta release. So it was only this fall that I managed to find some time for an alternative implementation. The new implentation just expanded the key definitions with additional key parts when filling the TABLE_SHARE structures by the info read from frm files. It allowed to keep the changes in the optimizer code minimal.
You can see this implementation in this tree on Launchpad. The patch was applied to the latest MariaDB 5.3 build. Yet, with a minor modifications it could be easily applied to any of the MySQL/MariaDB/PerconaServer or even Drizzle releases. When experimenting with the tree from Launchpad bear in mind that the optimizer switch must have the flag 'extended_keys' set to 'on' to enable the feature.
Were other people in the MySQL community also annoyed with the deficiency of the MySQL optimizer fixed by the patch? Yes, yes. See for example Domas's blog . So I expect quite a lot of interest towards the published patch. The patch has all chances to appear pretty soon in the first beta release of MariaDB 5.5 that is MariaDB 5.3.3-rc merged with the latest release of MySQL 5.5.
To intrigue you more I copy the EXPLAIN output returned by the patch for the above query:
Let's see how extended keys could be employed by execution for the following query built over a DBT-3/TPC-H database with one added index defined on p_retailprice.
select o_orderkey
from part, lineitem, orders
where p_retailprice > 2095 and o_orderdate='1992-07-01'
and o_orderkey=l_orderkey and p_partkey=l_partkey;
(The query asks for orderkeys of the orders on 1992-07-01 that ordered parts with retail price greater than 2095.)
The query could be executed by the following execution plan:
- Scan the entries of the index i_p_retailprice where p_retailprice>2095 and read p_partkey values from the extended keys.
- For each value p_partkey make an index look-up into the table lineitem employing index i_l_partkey and fetch the values of l_orderkey from the extended index.
- For each fetched value of l_orderkey append it to the date '1992-07-01' and use the resulted key for a index look-up by index i_o_orderdate to fetch the values of o_orderkey from the found index entries.
What prevents the current MariaDB/MySQL optimizer from choosing such an apparently efficient plan? A trivial shortcoming of the optimizer: it does not consider extended keys when looking for possible index accesses to join a table.
Quite surprisingly this defect attracted my attention when I investigated the efficiency of index condition push-down (that, btw, exploits extended keys to the full measure) at the latest MySQL UC. Since it did not seem too difficult to fix this problem I decided to do it as soon as I came back from the conference. Indeed, it took me less than a week to produce a working variant that made the join optimizer, the range optimizer and min/max optimizations to be aware of extended keys. The implementation was fast and robust, but rather cumbersome since it used iterator classes to look through parts of the extended keys. It required quite a few changes in the server code.
Then we, at MP, became extremely busy with the first MariaDB 5.3 beta release. So it was only this fall that I managed to find some time for an alternative implementation. The new implentation just expanded the key definitions with additional key parts when filling the TABLE_SHARE structures by the info read from frm files. It allowed to keep the changes in the optimizer code minimal.
You can see this implementation in this tree on Launchpad. The patch was applied to the latest MariaDB 5.3 build. Yet, with a minor modifications it could be easily applied to any of the MySQL/MariaDB/PerconaServer or even Drizzle releases. When experimenting with the tree from Launchpad bear in mind that the optimizer switch must have the flag 'extended_keys' set to 'on' to enable the feature.
Were other people in the MySQL community also annoyed with the deficiency of the MySQL optimizer fixed by the patch? Yes, yes. See for example Domas's blog . So I expect quite a lot of interest towards the published patch. The patch has all chances to appear pretty soon in the first beta release of MariaDB 5.5 that is MariaDB 5.3.3-rc merged with the latest release of MySQL 5.5.
To intrigue you more I copy the EXPLAIN output returned by the patch for the above query:
MariaDB [dbt3sf10]> explain
-> select o_orderkey
-> from part, lineitem, orders
-> where p_retailprice > 2095 and o_orderdate='1992-07-01'
-> and o_orderkey=l_orderkey and p_partkey=l_partkey\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: part
type: range
possible_keys: PRIMARY,i_p_retailprice
key: i_p_retailprice
key_len: 9
ref: NULL
rows: 100
Extra: Using where; Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: lineitem
type: ref
possible_keys: PRIMARY,i_l_suppkey_partkey,i_l_partkey,i_l_orderkey,i_l_orderkey_quantity
key: i_l_partkey
key_len: 5
ref: dbt3sf10.part.p_partkey
rows: 15
Extra: Using index
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: orders
type: ref
possible_keys: PRIMARY,i_o_orderdate
key: i_o_orderdate
key_len: 8
ref: const,dbt3sf10.lineitem.l_orderkey
rows: 1
Extra: Using index
3 rows in set (0.00 sec)
Tuesday, August 4, 2009
In Defense of MRR
A few days before OSCon Henrik Ingo, our newly hired COO, forwarded me a post of Mark Callaghan with the following plaint on the state of certain latest optimizations that had been introduced into the MySQL Server, namely, Multi-Range Read (MRR), Index Condition Pushdown (ICP) and Batched Key Access (BKA).
Further in the post, Mark points to the slew of bugs allegedly in the implementation of ICP for the InnoDb engine that forced managers at Sun/Oracle to ask Sergey Petrunia, the author of the code, to turn the feature off , though for this engine only. The patch was submitted and the next alpha release of the Azalea version will appear with this optimization disabled. At the same time Sergey was asked to disable the implementation of the MRR interface for InnoDB too.
Why? Because at Sun they were not sure who the culprit was. They had a choir of unhappy customers (reported bugs #36981, #34591, #35080, #37415, #34590, #37208, #40992, #42580, #43617), some of whom blamed ICP, some – MRR, and some - both. Taking into account a subdued grumble of some undisputed authorities who first planned to port the BKA code into their product and then put the project on hold due to a mere fact that the MRR code was said to be “utterly crappy”, what would you do if you were responsible for making decisions? Right. Condemn both suspects to minimize the consequences of a mistake. None of them is worth a single tear of our customers. What? Are they are not even of the same kind? This makes them even more guilty. This makes them an organized gang. You are still not sure they are desperate criminals? Open the bug reports, read them. Do they make you cry too? Are you convinced at last?
Anyway, the verdict runs as follows:
End of the story. Justice reigns. The villains deserve capital punishment, but we are civilized people and we believe that properly applied corrective measures have a good chance of improving any scoundrel. As for now a full isolation of the convicts would be not only in the interest of good folks, but in the interest of the poor creatures as well.
At last all of us can exhale with a relief and rejoice the life where there are no bugs just because everything that could attract them has been mortified.
I used to know MRR for InndDb when he was a just a child. There was nothing wrong in how he was raising up. He was brought up together with his siblings, first with MRR for MyISAM, MRR for NDB Cluster, later with MRR for Falcon. That's true that they have different biological mothers, but all these children were conceived by the same father. Moreover the implementations of the MRR interface for MyISAM and for InnoDB were based on the same code and were engine independent. If there had been any genetic defect in the code it would have manifested itself for MyISAM as well.
The implementation of the MRR interface for MyISAM/InnoDb employs other handler functions such as index_read and rnd_pos. Either there are some subtleties or side effects in the implementations of these functions for InnoDB, or the accusations are false.
So I decided to spend some of my time to investigate the case on my own. First I looked through the bug reports.
The reports for bugs #36981, #34591, #35080, #37415 say that with the settings:
no crashes are observed for the reported cases.
The report for bug #34590 is certain that the setting
is enough to prevent crashing.
The analysis made by Paul DuBois for bug #40992 leads to the conclusion that only settings for engine_condition_pushdown matter when trying to cause a crash. It's worth noting that the test case for this bug resembles pretty much the one reported by Shane Bester for bug #36981.
The reports for bugs #37208, #42580, #43617 complain about wrong results when engine_condition_pushdown is 'on'. Once more the report for #42580 says that the problem appears when both optimizer_use_mrr and engine_condition_pushdown are set to default values.
After I had read all these reports, it became clear for me that there were no direct evidences of MRR's bad behavior, at least from the cases that were submitted.
To check that the MRR code had nothing to do the reported problems, I built the server from the latest mysql-azalea tree, reverted the patch from bug #45029, and ran the test cases with engine_condition_pushdown set to 'off'. All the test cases passed through without crashes and with correct results. (Of course I could run only those test cases that I was able to extract from bug database easily. These were the test cases for bugs: #36981, #34590, #42580, #43617.)
I continued my investigation. First I debugged the test case for bug #42580 that returned a wrong result and looked quite simple. After several attempts to force my way through the InnoDB native code finally I came to the offending lines in innobase/row/row0sel.c:
It was Sergey's code and the intention was to mask in the null bit of an index field in the internal InnoDB row cache. It could be properly done with the code like this:
I applied the fix and all the wrong results (bugs #37208, #42580, #43617) went away.
The crashes remained though. Several hours of additional debugging for the test case of bug #36981 brought me to a really bad memory overwrite in the build_template function from innobase/handler/ha_innodb.cc. By a pure chance the overwrite did not cause a problem for my test case. The crash was caused by usage of wrong template structures for reading row fields from mysql buffers. The crash happened in the code that had been added by Sergey in the row_search_for_mysql function to evaluate conditions pushed down to indexes. This problem can not be easily fixed since with ICP we may need two arrays of prebuilt template structures when executing the SELECT FOR UPDATE queries or multi-UPDATE/DELETE queries: one for the fields of the scanning index, another for the fields of the clustering index.
As a temporary solution I could suggest to block usage of ICP for such queries. It can be done with the following code:
This code prevents crashes for all reported test cases.
Only two days of investigations (I literally spent only a week-end for it, and, I would have spent much less time if I were familiar a little bit with the InnoDB code) convinced me that MRR for InnoDB is absolutely innocent, while ICP for InnoDB, though being guilty of serious misdeeds, should not undergo any severe punishment, as it's quite naturally to expect some faults from such a young feature and we don't have to employ any penitentiary institutions to correct these deviations.
A different attitude to this misdemeanor would make me doubt that we are really supportive for young talents: we turn them down should they manifest some erratic behavior .
I want to be clear here. Stating that MRR is innocent, I don't want to say that the MRR code is absolutely clean. It's a relatively new code, so most probably it still contains serious bugs. Sergey has recently pointed me to the problem of unlocked gaps for InnoDB. A similar problem should exist for Index Merge. If it's resolved there, why can't the same solution be applied for MMR? If it's not resolved there, should we disable Index Merge for InnoDB as well?
And what about you? How do you find MRR/ICP? Guilty, or NOT guilty? Should the case be appealed?
On a side note, I would like to add that MRR and ICP for InnoDB are really smart optimizations. MRR for InnoDB allows us to accumulate primary keys for multiple lookups in a buffer before fetching data in a sequential manner. The more keys you accumulate, the less disk sweeps you need to fetch the data. ICP allows us to reject a row as soon as the condition over the index fields is not satisfied.
MRR and ICP can interplay, or can be used independently. They both can be very helpful for BKA . Moreover, in fact, it does not make too much sense to use BKA for InnoDB/MyISAM without MRR.
The three features put together can give you a boost of performance for join queries that involve many rows. Yet, this will be the subject of a separate blog.
> I have seen descriptions for each of these features that describes
> them in isolation. Is there one page where they are described
> together? And if there isn't can I convince Sergey and Igor to write a
> new blog post?
Further in the post, Mark points to the slew of bugs allegedly in the implementation of ICP for the InnoDb engine that forced managers at Sun/Oracle to ask Sergey Petrunia, the author of the code, to turn the feature off , though for this engine only. The patch was submitted and the next alpha release of the Azalea version will appear with this optimization disabled. At the same time Sergey was asked to disable the implementation of the MRR interface for InnoDB too.
Why? Because at Sun they were not sure who the culprit was. They had a choir of unhappy customers (reported bugs #36981, #34591, #35080, #37415, #34590, #37208, #40992, #42580, #43617), some of whom blamed ICP, some – MRR, and some - both. Taking into account a subdued grumble of some undisputed authorities who first planned to port the BKA code into their product and then put the project on hold due to a mere fact that the MRR code was said to be “utterly crappy”, what would you do if you were responsible for making decisions? Right. Condemn both suspects to minimize the consequences of a mistake. None of them is worth a single tear of our customers. What? Are they are not even of the same kind? This makes them even more guilty. This makes them an organized gang. You are still not sure they are desperate criminals? Open the bug reports, read them. Do they make you cry too? Are you convinced at last?
Anyway, the verdict runs as follows:
[25 Jun 4:00] Paul DuBois Noted in 5.4.4 changelog.
The Multi-Range Read access method does not work reliably for InnoDB
and has been disabled for InnoDB tables.
End of the story. Justice reigns. The villains deserve capital punishment, but we are civilized people and we believe that properly applied corrective measures have a good chance of improving any scoundrel. As for now a full isolation of the convicts would be not only in the interest of good folks, but in the interest of the poor creatures as well.
At last all of us can exhale with a relief and rejoice the life where there are no bugs just because everything that could attract them has been mortified.
I used to know MRR for InndDb when he was a just a child. There was nothing wrong in how he was raising up. He was brought up together with his siblings, first with MRR for MyISAM, MRR for NDB Cluster, later with MRR for Falcon. That's true that they have different biological mothers, but all these children were conceived by the same father. Moreover the implementations of the MRR interface for MyISAM and for InnoDB were based on the same code and were engine independent. If there had been any genetic defect in the code it would have manifested itself for MyISAM as well.
The implementation of the MRR interface for MyISAM/InnoDb employs other handler functions such as index_read and rnd_pos. Either there are some subtleties or side effects in the implementations of these functions for InnoDB, or the accusations are false.
So I decided to spend some of my time to investigate the case on my own. First I looked through the bug reports.
The reports for bugs #36981, #34591, #35080, #37415 say that with the settings:
set optimizer_use_mrr='disable';
set engine_condition_pushdown=off;
no crashes are observed for the reported cases.
The report for bug #34590 is certain that the setting
set engine_condition_pushdown=off;
is enough to prevent crashing.
The analysis made by Paul DuBois for bug #40992 leads to the conclusion that only settings for engine_condition_pushdown matter when trying to cause a crash. It's worth noting that the test case for this bug resembles pretty much the one reported by Shane Bester for bug #36981.
The reports for bugs #37208, #42580, #43617 complain about wrong results when engine_condition_pushdown is 'on'. Once more the report for #42580 says that the problem appears when both optimizer_use_mrr and engine_condition_pushdown are set to default values.
After I had read all these reports, it became clear for me that there were no direct evidences of MRR's bad behavior, at least from the cases that were submitted.
To check that the MRR code had nothing to do the reported problems, I built the server from the latest mysql-azalea tree, reverted the patch from bug #45029, and ran the test cases with engine_condition_pushdown set to 'off'. All the test cases passed through without crashes and with correct results. (Of course I could run only those test cases that I was able to extract from bug database easily. These were the test cases for bugs: #36981, #34590, #42580, #43617.)
I continued my investigation. First I debugged the test case for bug #42580 that returned a wrong result and looked quite simple. After several attempts to force my way through the InnoDB native code finally I came to the offending lines in innobase/row/row0sel.c:
*(prebuilt->fetch_cache[prebuilt->n_fetch_cached] + offs) ^=
(*(remainder_buf + offs) & templ->mysql_null_bit_mask);
It was Sergey's code and the intention was to mask in the null bit of an index field in the internal InnoDB row cache. It could be properly done with the code like this:
null_byte= prebuilt->fetch_cache[prebuilt->n_fetch_cached]+offs;
(*null_byte)&= ~templ->mysql_null_bit_mask;
(*null_byte)|= (*(remainder_buf + offs) &
templ->mysql_null_bit_mask);
I applied the fix and all the wrong results (bugs #37208, #42580, #43617) went away.
The crashes remained though. Several hours of additional debugging for the test case of bug #36981 brought me to a really bad memory overwrite in the build_template function from innobase/handler/ha_innodb.cc. By a pure chance the overwrite did not cause a problem for my test case. The crash was caused by usage of wrong template structures for reading row fields from mysql buffers. The crash happened in the code that had been added by Sergey in the row_search_for_mysql function to evaluate conditions pushed down to indexes. This problem can not be easily fixed since with ICP we may need two arrays of prebuilt template structures when executing the SELECT FOR UPDATE queries or multi-UPDATE/DELETE queries: one for the fields of the scanning index, another for the fields of the clustering index.
As a temporary solution I could suggest to block usage of ICP for such queries. It can be done with the following code:
if (file->active_index == file->pushed_idx_cond_keyno &&
file->active_index != MAX_KEY &&
index == prebuilt->index)
do_idx_cond_push= need_second_pass= TRUE;
This code prevents crashes for all reported test cases.
Only two days of investigations (I literally spent only a week-end for it, and, I would have spent much less time if I were familiar a little bit with the InnoDB code) convinced me that MRR for InnoDB is absolutely innocent, while ICP for InnoDB, though being guilty of serious misdeeds, should not undergo any severe punishment, as it's quite naturally to expect some faults from such a young feature and we don't have to employ any penitentiary institutions to correct these deviations.
A different attitude to this misdemeanor would make me doubt that we are really supportive for young talents: we turn them down should they manifest some erratic behavior .
I want to be clear here. Stating that MRR is innocent, I don't want to say that the MRR code is absolutely clean. It's a relatively new code, so most probably it still contains serious bugs. Sergey has recently pointed me to the problem of unlocked gaps for InnoDB. A similar problem should exist for Index Merge. If it's resolved there, why can't the same solution be applied for MMR? If it's not resolved there, should we disable Index Merge for InnoDB as well?
And what about you? How do you find MRR/ICP? Guilty, or NOT guilty? Should the case be appealed?
On a side note, I would like to add that MRR and ICP for InnoDB are really smart optimizations. MRR for InnoDB allows us to accumulate primary keys for multiple lookups in a buffer before fetching data in a sequential manner. The more keys you accumulate, the less disk sweeps you need to fetch the data. ICP allows us to reject a row as soon as the condition over the index fields is not satisfied.
MRR and ICP can interplay, or can be used independently. They both can be very helpful for BKA . Moreover, in fact, it does not make too much sense to use BKA for InnoDB/MyISAM without MRR.
The three features put together can give you a boost of performance for join queries that involve many rows. Yet, this will be the subject of a separate blog.
Thursday, July 2, 2009
Newly born...
Yesterday I still worked for Sun Microsystems as a Principal Engineer & MySQL Sr. Architect. Today was my first day of work at Monty Program, Inc, a subsidiary of a tiny company established by Monty Widenius in February. Yet I didn't even make the top ten. If people want not to miss the train they have to hurry up.
Am I happy? Oh, yeah...
No more waking up with the question constantly drilling my mind: “What am I doing here?” What are all of us, MySQL Server developers, doing without Monty? Waiting for the time when all our options are vested? I can't . That's too long for me. I'm already too old to wait any more.
Besides, we've already lost at least 3 years. We have to do what we planned to do in 2005. We have to raise the Server to the level where any RDBMS that claims to be called mature should be.
So who is newly born? Me? In a way, yes. This is my second reincarnation for the history of MySQL, after the first one that happened in December 2002 .
Also newly born is this blog where I'm planning to share with you, from time to time, my observations on the Server development at Monty Program and on interesting patches in the new server code that other people contribute. However, I don't want to limit myself only to this topic. What else am I going to share with you? You'll see soon enough...
Am I happy? Oh, yeah...
No more waking up with the question constantly drilling my mind: “What am I doing here?” What are all of us, MySQL Server developers, doing without Monty? Waiting for the time when all our options are vested? I can't . That's too long for me. I'm already too old to wait any more.
Besides, we've already lost at least 3 years. We have to do what we planned to do in 2005. We have to raise the Server to the level where any RDBMS that claims to be called mature should be.
So who is newly born? Me? In a way, yes. This is my second reincarnation for the history of MySQL, after the first one that happened in December 2002 .
Also newly born is this blog where I'm planning to share with you, from time to time, my observations on the Server development at Monty Program and on interesting patches in the new server code that other people contribute. However, I don't want to limit myself only to this topic. What else am I going to share with you? You'll see soon enough...
Subscribe to:
Posts (Atom)