MySQL Is Driving Me Insane – GROUPed VIEWs and Multi-Table UPDATEs

I was pretty happy to discover that the latest releases of MySQL (5.0+) allow for creating VIEWs and offer improved subqueries. I’ve been trying to incorporate some of this into my latest little game project, but a few irritations in MySQL are making it a miserable process.

The biggest problem is that the new MySQL VIEWs automatically resort to TEMPTABLE if you throw a GROUP BY into a query. As a result, it’s impossible to create certain types of VIEWs on very large tables, because your logical WHERE restriction on the VIEW behaves as if you issued GROUP BY…HAVING on the underlying query instead of WHERE somerestriction[s]…GROUP BY. If your query would return 10,000 rows (yes, I know that’s only relatively “very large” but I need to keep my query result times in the sub-second range and I only need 32 of those rows) before HAVING is applied, well, that’s what you get with the VIEW. For example:

CREATE VIEW myview AS SELECT foo.ID FROM foo_table foo GROUP BY foo.bar;

In that little snippet, you’d think that you could “SELECT * FROM myview WHERE foo.ID = 1” and restrict the result set, but the WHERE restriction is applied as a HAVING. This is really, really unfortunate, as it makes the VIEW painfully, unusably slow when you have more than a tiny handful of rows.

The second annoyance is that while MySQL supports multi-table UPDATEs, it behaves in an illogical (to me) fashion when trying to do the following:

UPDATE foo JOIN bar ON somecondition SET foo.somefield = foo.somefield + 1;

The logical behavior seems to me to be that for each “bar”, “somefield” would get incremented by 1. However, in practice, MySQL updates “somefield” once and ignores the rest of the results returned by the “bar” JOIN. It’s almost the same problem discussed here and here, but dumping/reloading the tables seems irrelevant (as does my MySQL version – 5.0.22 currently) and I seem to be bumping up against a restriction briefly mentioned at the bottom of the MySQL “UPDATE Syntax” documentation page – “Currently, you cannot update a table and select from the same table in a subquery.” [Emphasis mine] I’m not the only person to have raised questions about UPDATE’s odd multi-table behavior.

I was hoping to use something like the second to mitigate the problems created by the VIEW shortcomings, but that doesn’t seem to be happening. The best solution I’ve encountered seems to be to use group functions in a subquery as per:

UPDATE foo
(SELECT SUM(somevalue), id FROM bar WHERE somecondition GROUP BY something) bar_alias
SET foo.somefield = foo.somefield + bar_alias.somevalue
WHERE foo.id = bar_alias.id;

That’s much less elegant and less flexible than using either an UPDATE…JOIN myview or UPDATE…join_query which I’d hoped to be able to do.

Here’s to hoping that the VIEW gets optimized better for the future. It would definitely make my life easier to be able to use GROUP BY in a VIEW without choking up my system. Or maybe “Currently” will become “Long ago” and I can use self-joining table UPDATEs, which is better than nothing.

And if you happen to know any way around either of these that doesn’t involve eliminating the GROUP BY on a VIEW or using a SUM() on the UPDATE, please tell me!

Leave a Reply

Your email address will not be published. Required fields are marked *