S
Sergei
Guest
Let me start with a little story. You sit in your house near the fireplace in the living room and need a book from the library… Eh, no, sorry, wrong century. You’re building a robotic arm that will open your beer or brew your coffee or supply you with whatever other drinks of your choice… while you’ll be building the next robotic arm. So, you — soldering iron in one hand and Arduino in another — ask your little brother to bring a box with specific resistors (that you unexpectedly run out of) from the cellar. The problem — your brother is small and cannot tell a resistor from a respirator. You explain that it’s small thing with two wires sticking out of it. And he starts going back and forth brining you boxes after boxes of different small things with two wires.
This is approximately where we were in MySQL when NDB Cluster was just added. The use wants to find a row, say WHERE number_of_wires=2 AND size='small' AND type='resistor' and ohm=10000. And the optimizer devises an execution plan to use the “ref” access for the number_of_wires column. And NDB Cluster engine starts retrieving all rows that have number_or_wires=2 from all nodes and send all these rows over the network to the server only to see them rejected over and over, because they don’t satisfy the rest of the WHERE condition.
Thus, you see, originally engine condition pushdown was implemented for NDB Cluster. With engine condition pushdown the server can send the complete WHERE clause to NDB, and the latter will send it to all the nodes, so only rows that actually satisfy the complete WHERE clause will be sent over the network back to the server. If the WHERE clause is too complex for nodes to understand (for example, it involves a stored function) the NDB can push only a part of it and let the server enforce the rest.
It wasn’t all perfect, though. If engine condition pushdown is used the server needs to prepare a condition for pushing. In a join it means extracting parts of the WHERE clause that don’t depend on data from tables which weren’t read yet. It creates a sizable overhead for every query even if it doesn’t use NDB Cluster. And most queries don’t. So, engine condition pushdown was disabled by default.
Fast forward to MariaDB. MariaDB doesn’t have NDB Cluster, but it has many other storage engines. For some of them, for example, for SphinxSE, engine condition pushdown is not just a useful optimization, but a requirement, SphinxSE simply cannot function properly if engine condition pushdown is disabled. We’ve solved it in MariaDB 5.2 by implementing an override — the engine could activate engine condition pushdown for its tables, even if it would be disabled for all other tables. No overhead for queries that don’t use SphinxSE. Later other engines started to use this override too.
That’s how we ended up in a strange situation, engine condition pushdown is disabled globally by default, but de facto all engines that can use engine condition pushdown, use it anyway, via this override. And if a user enables engine condition pushdown, it adds no additional optimizations or functionality, but only an overhead of the server preparing pushdown conditions for engines that cannot use them.
This is why in MariaDB 10.1.1 we deprecate engine_condition_pushdown flag. You can still write
SET optimizer_switch="engine_condition_pushdown=on";
but it will not do anything besides printing a deprecation warning. Starting from MariaDB 10.1.1 engine condition pushdown is always enabled for engines that support it and always disabled for engines that do not support it.
Continue reading...
This is approximately where we were in MySQL when NDB Cluster was just added. The use wants to find a row, say WHERE number_of_wires=2 AND size='small' AND type='resistor' and ohm=10000. And the optimizer devises an execution plan to use the “ref” access for the number_of_wires column. And NDB Cluster engine starts retrieving all rows that have number_or_wires=2 from all nodes and send all these rows over the network to the server only to see them rejected over and over, because they don’t satisfy the rest of the WHERE condition.
Thus, you see, originally engine condition pushdown was implemented for NDB Cluster. With engine condition pushdown the server can send the complete WHERE clause to NDB, and the latter will send it to all the nodes, so only rows that actually satisfy the complete WHERE clause will be sent over the network back to the server. If the WHERE clause is too complex for nodes to understand (for example, it involves a stored function) the NDB can push only a part of it and let the server enforce the rest.
It wasn’t all perfect, though. If engine condition pushdown is used the server needs to prepare a condition for pushing. In a join it means extracting parts of the WHERE clause that don’t depend on data from tables which weren’t read yet. It creates a sizable overhead for every query even if it doesn’t use NDB Cluster. And most queries don’t. So, engine condition pushdown was disabled by default.
Fast forward to MariaDB. MariaDB doesn’t have NDB Cluster, but it has many other storage engines. For some of them, for example, for SphinxSE, engine condition pushdown is not just a useful optimization, but a requirement, SphinxSE simply cannot function properly if engine condition pushdown is disabled. We’ve solved it in MariaDB 5.2 by implementing an override — the engine could activate engine condition pushdown for its tables, even if it would be disabled for all other tables. No overhead for queries that don’t use SphinxSE. Later other engines started to use this override too.
That’s how we ended up in a strange situation, engine condition pushdown is disabled globally by default, but de facto all engines that can use engine condition pushdown, use it anyway, via this override. And if a user enables engine condition pushdown, it adds no additional optimizations or functionality, but only an overhead of the server preparing pushdown conditions for engines that cannot use them.
This is why in MariaDB 10.1.1 we deprecate engine_condition_pushdown flag. You can still write
SET optimizer_switch="engine_condition_pushdown=on";
but it will not do anything besides printing a deprecation warning. Starting from MariaDB 10.1.1 engine condition pushdown is always enabled for engines that support it and always disabled for engines that do not support it.
Continue reading...