Quantcast

mm_webform_file_download query very slow on large databases

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

mm_webform_file_download query very slow on large databases

McBride, Ian S.
Short version: we recommend upgrading to webform 7.x-4.2 or applying the patch provided here https://www.drupal.org/node/2213945

We attempted to upgrade the main Middlebury site to Drupal 7 this weekend and, while the upgrade completed successfully, the site was too slow to remain online and we had to roll back. It turns out that the culprit was the query in mm_webform_file_download() which is run every time a file is accessed to check if it's a file associated with a webform submission and, therefore, needs another access check.

The query compares f.fid (the file ID) with the values of d.data (the value of every component in every webform submission). Prior to webform 7.x-4.2 there is not an index on webform_submitted_data.data. For us, this meant that every file access check was taking about a second to execute on the database, because we have 1,562,169 rows in webform_submitted_data.

I wanted to let you all know that this may be an issue causing performance degradation on your sites and there's a simple patch available that adds an index to the appropriate column.

Ian McBride
[hidden email]<mailto:[hidden email]>
skype: [hidden email]<mailto:[hidden email]>
Web Technologies & Services
Information Technology Services
Middlebury College

---
You are currently subscribed to monster_menus as: [hidden email].
To unsubscribe click here: http://lists.middlebury.edu/u?id=685503.6b071f880fe6a965a128164e6d09ea81&n=T&l=monster_menus&o=711321
or send a blank email to [hidden email]
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: mm_webform_file_download query very slow on large databases

Adam Franco
Administrator
After doing some more testing, it turns out that the patch in webform 7.x-4.2 won't completely solve this issue. Because the mm_webform_file_download($uri) function is joining between an integer file_managed.fid column and a text webform_submitted_data.data column, the new data index in webform-7.x0-4.2 won't be used.

The query in mm_webform_file_download($uri) can be sped up by replacing:
'INNER JOIN {webform_submitted_data} d ON STRCMP(f.fid,d.data) = 0 ' .
with
'INNER JOIN {webform_submitted_data} d ON CONVERT(f.fid USING utf8) = d.data ' .

After we are fully tested and verify that this doesn't cause other problems I'll create an issue with this patch.



Adam

--

Adam Franco
Senior Software Developer
Information Technology Services
Middlebury College
Middlebury, VT 05753
[hidden email]
802.443.2244

On Tue, Feb 3, 2015 at 11:19 AM, McBride, Ian S. <[hidden email]> wrote:
Short version: we recommend upgrading to webform 7.x-4.2 or applying the patch provided here https://www.drupal.org/node/2213945

We attempted to upgrade the main Middlebury site to Drupal 7 this weekend and, while the upgrade completed successfully, the site was too slow to remain online and we had to roll back. It turns out that the culprit was the query in mm_webform_file_download() which is run every time a file is accessed to check if it's a file associated with a webform submission and, therefore, needs another access check.

The query compares f.fid (the file ID) with the values of d.data (the value of every component in every webform submission). Prior to webform 7.x-4.2 there is not an index on webform_submitted_data.data. For us, this meant that every file access check was taking about a second to execute on the database, because we have 1,562,169 rows in webform_submitted_data.

I wanted to let you all know that this may be an issue causing performance degradation on your sites and there's a simple patch available that adds an index to the appropriate column.

Ian McBride
[hidden email]<mailto:[hidden email]>
skype: [hidden email]<mailto:[hidden email]>
Web Technologies & Services
Information Technology Services
Middlebury College

---
You are currently subscribed to monster_menus as: [hidden email].
To unsubscribe click here: http://lists.middlebury.edu/u?id=685438.780c6126d238396bdd2f98c1d84c15c7&n=T&l=monster_menus&o=711321
or send a blank email to [hidden email]

---

You are currently subscribed to monster_menus as: [hidden email].

To unsubscribe click here: http://lists.middlebury.edu/u?id=685503.6b071f880fe6a965a128164e6d09ea81&n=T&l=monster_menus&o=711332

(It may be necessary to cut and paste the above URL if the line is broken)

or send a blank email to [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: mm_webform_file_download query very slow on large databases

Dan Wilga-2
In reply to this post by McBride, Ian S.
It's understandable that simply adding an index wouldn't help. Frankly, I doubt that CONVERT will be any faster, though; in my experience any sort of JOIN using columns of disparate character sets (which causes an implict CONVERT) is very slow.

I think the better solution would be to add a new integer column to webform_submitted_data which mirrors what's in the text "data" column. But, of course, this would require a pretty hefty change in webform.

On 2/3/15 2:52 PM, Adam Franco wrote:
After doing some more testing, it turns out that the patch in webform 7.x-4.2 won't completely solve this issue. Because the mm_webform_file_download($uri) function is joining between an integer file_managed.fid column and a text webform_submitted_data.data column, the new data index in webform-7.x0-4.2 won't be used.

The query in mm_webform_file_download($uri) can be sped up by replacing:
'INNER JOIN {webform_submitted_data} d ON STRCMP(f.fid,d.data) = 0 ' .
with
'INNER JOIN {webform_submitted_data} d ON CONVERT(f.fid USING utf8) = d.data ' .

After we are fully tested and verify that this doesn't cause other problems I'll create an issue with this patch.



Adam

--

Adam Franco
Senior Software Developer
Information Technology Services
Middlebury College
Middlebury, VT 05753
[hidden email]
802.443.2244

On Tue, Feb 3, 2015 at 11:19 AM, McBride, Ian S. <[hidden email]> wrote:
Short version: we recommend upgrading to webform 7.x-4.2 or applying the patch provided here https://www.drupal.org/node/2213945

We attempted to upgrade the main Middlebury site to Drupal 7 this weekend and, while the upgrade completed successfully, the site was too slow to remain online and we had to roll back. It turns out that the culprit was the query in mm_webform_file_download() which is run every time a file is accessed to check if it's a file associated with a webform submission and, therefore, needs another access check.

The query compares f.fid (the file ID) with the values of d.data (the value of every component in every webform submission). Prior to webform 7.x-4.2 there is not an index on webform_submitted_data.data. For us, this meant that every file access check was taking about a second to execute on the database, because we have 1,562,169 rows in webform_submitted_data.

I wanted to let you all know that this may be an issue causing performance degradation on your sites and there's a simple patch available that adds an index to the appropriate column.

Ian McBride
[hidden email]<mailto:[hidden email]>
skype: [hidden email]<mailto:[hidden email]>
Web Technologies & Services
Information Technology Services
Middlebury College

---
You are currently subscribed to monster_menus as: [hidden email].
To unsubscribe click here: http://lists.middlebury.edu/u?id=685438.780c6126d238396bdd2f98c1d84c15c7&n=T&l=monster_menus&o=711321
or send a blank email to [hidden email]

---

You are currently subscribed to monster_menus as: [hidden email].

To unsubscribe click here: http://lists.middlebury.edu/u?id=685500.19fa7de7038497527f6a88cf1629251d&n=T&l=monster_menus&o=711332

(It may be necessary to cut and paste the above URL if the line is broken)

or send a blank email to [hidden email]


---

You are currently subscribed to monster_menus as: [hidden email].

To unsubscribe click here: http://lists.middlebury.edu/u?id=685503.6b071f880fe6a965a128164e6d09ea81&n=T&l=monster_menus&o=711338

(It may be necessary to cut and paste the above URL if the line is broken)

or send a blank email to [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: mm_webform_file_download query very slow on large databases

Adam Franco
Administrator
In reply to this post by McBride, Ian S.
Hi Dan,

I've been doing testing on the query in question on a copy of our production data which has 1,759,838 rows in webform_submitted_data, 5,866 rows in webform_component, and 60,704 rows in file_managed.

The original version of the query takes 0.85 seconds to execute for webform file uris and 4.75 seconds for non-webform file uris:

SELECT
    f.*, d.nid
FROM
    file_managed f
    INNER JOIN webform_submitted_data d ON f.fid = d.data
    INNER JOIN webform_component c ON d.nid = c.nid AND d.cid = c.cid AND c.type = 'file'
WHERE
    f.uri = 'sites/www.middlebury.edu/files/webform/xxxxx.pdf'

The EXPLAIN results indicate that while there is an index on data, it is never used.

In contrast, the new version of this query with CONVERT() takes only 0.0006 seconds for all files:

SELECT
    f.*, d.nid
FROM
    file_managed f
    INNER JOIN webform_submitted_data d ON CONVERT(f.fid USING utf8) = d.data
    INNER JOIN webform_component c ON d.nid = c.nid AND d.cid = c.cid AND c.type = 'file'
WHERE
    f.uri = 'sites/www.middlebury.edu/files/webform/xxxxx.pdf'

As far as I can tell, the reason for the drastic speed-up is that the execution plan is shifted to select first from file_managed using the `uri` index from the WHERE clause, then only convert 0 or 1 values before joining to webform_submitted_data -- rather than having to convert all of the values in either table.

I found that the same effect could be achieved by rewriting the query to generate the file_managed result in a subquery, a method slightly less at the whims of the internals of the MySQL's execution planning:

SELECT
    f.*, d.nid
FROM
    (SELECT
        f.*, CONVERT(f.fid USING utf8) AS fidstring
    FROM
        file_managed f 
    WHERE
        f.uri = 'sites/www.middlebury.edu/files/webform/fakhreddine-acls.pdf'
     ) AS f
    INNER JOIN webform_submitted_data d ON fidstring = d.data
    INNER JOIN webform_component c ON d.nid = c.nid AND d.cid = c.cid AND c.type = 'file'

I'd be happy to do addition tests if there are edge cases you can think of, but rewriting the query seems to be a much simpler option than maintaining derived data in its own column.

Best,
Adam

--

Adam Franco
Senior Software Developer
Information Technology Services
Middlebury College
Middlebury, VT 05753
[hidden email]
802.443.2244

On Tue, Feb 3, 2015 at 3:37 PM, Dan Wilga <[hidden email]> wrote:
It's understandable that simply adding an index wouldn't help. Frankly, I doubt that CONVERT will be any faster, though; in my experience any sort of JOIN using columns of disparate character sets (which causes an implict CONVERT) is very slow.

I think the better solution would be to add a new integer column to webform_submitted_data which mirrors what's in the text "data" column. But, of course, this would require a pretty hefty change in webform.

On 2/3/15 2:52 PM, Adam Franco wrote:
After doing some more testing, it turns out that the patch in webform 7.x-4.2 won't completely solve this issue. Because the mm_webform_file_download($uri) function is joining between an integer file_managed.fid column and a text webform_submitted_data.data column, the new data index in webform-7.x0-4.2 won't be used.

The query in mm_webform_file_download($uri) can be sped up by replacing:
'INNER JOIN {webform_submitted_data} d ON STRCMP(f.fid,d.data) = 0 ' .
with
'INNER JOIN {webform_submitted_data} d ON CONVERT(f.fid USING utf8) = d.data ' .

After we are fully tested and verify that this doesn't cause other problems I'll create an issue with this patch.



Adam

--

Adam Franco
Senior Software Developer
Information Technology Services
Middlebury College
Middlebury, VT 05753
[hidden email]<mailto:[hidden email]>
<a href="tel:802.443.2244" value="+18024432244">802.443.2244

On Tue, Feb 3, 2015 at 11:19 AM, McBride, Ian S. <[hidden email]<mailto:[hidden email]>> wrote:
Short version: we recommend upgrading to webform 7.x-4.2 or applying the patch provided here https://www.drupal.org/node/2213945

We attempted to upgrade the main Middlebury site to Drupal 7 this weekend and, while the upgrade completed successfully, the site was too slow to remain online and we had to roll back. It turns out that the culprit was the query in mm_webform_file_download() which is run every time a file is accessed to check if it's a file associated with a webform submission and, therefore, needs another access check.

The query compares f.fid (the file ID) with the values of d.data (the value of every component in every webform submission). Prior to webform 7.x-4.2 there is not an index on webform_submitted_data.data. For us, this meant that every file access check was taking about a second to execute on the database, because we have 1,562,169 rows in webform_submitted_data.

I wanted to let you all know that this may be an issue causing performance degradation on your sites and there's a simple patch available that adds an index to the appropriate column.

Ian McBride
[hidden email]<mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>
skype: [hidden email]<mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>
Web Technologies & Services
Information Technology Services
Middlebury College

---
You are currently subscribed to monster_menus as: [hidden email]<mailto:[hidden email]>.
To unsubscribe click here: http://lists.middlebury.edu/u?id=685438.780c6126d238396bdd2f98c1d84c15c7&n=T&l=monster_menus&o=711321
or send a blank email to [hidden email]<mailto:[hidden email]>


---

You are currently subscribed to monster_menus as: [hidden email]<mailto:[hidden email]>.

To unsubscribe click here: http://lists.middlebury.edu/u?id=685500.19fa7de7038497527f6a88cf1629251d&n=T&l=monster_menus&o=711332

(It may be necessary to cut and paste the above URL if the line is broken)

or send a blank email to [hidden email]<mailto:[hidden email]>


---

You are currently subscribed to monster_menus as: [hidden email]<mailto:[hidden email]>.

To unsubscribe click here: http://lists.middlebury.edu/u?id=685438.780c6126d238396bdd2f98c1d84c15c7&n=T&l=monster_menus&o=711338

(It may be necessary to cut and paste the above URL if the line is broken)

or send a blank email to [hidden email]<mailto:[hidden email]>

---

You are currently subscribed to monster_menus as: [hidden email].

To unsubscribe click here: http://lists.middlebury.edu/u?id=685503.6b071f880fe6a965a128164e6d09ea81&n=T&l=monster_menus&o=711349

(It may be necessary to cut and paste the above URL if the line is broken)

or send a blank email to [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: mm_webform_file_download query very slow on large databases

grahamtk
In reply to this post by McBride, Ian S.
I dont know if your drupal 7 site performed slow in more areas than the
mentioned sql query, but our drupal 7 installation on http://www.nmbu.no
responded until very recently below expectations with average execution time
for logged in users at about 1.5 seconds per page and around 150 MB memory
used per page load.

When we upgraded to php 5.5, apache 2.4 and mysql 5.5 and this blew us away.
we now have page execution times of around 500ms and memory usage of around
35MB. This is a huge improvement and has saved a lot of frustration :)

cheers.
Øyvind



--
View this message in context: http://monster-menus.2910260.n2.nabble.com/mm-webform-file-download-query-very-slow-on-large-databases-tp7573036p7573040.html
Sent from the Monster Menus mailing list archive at Nabble.com.

---
You are currently subscribed to monster_menus as: [hidden email].
To unsubscribe click here: http://lists.middlebury.edu/u?id=685503.6b071f880fe6a965a128164e6d09ea81&n=T&l=monster_menus&o=711373
or send a blank email to [hidden email]
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: mm_webform_file_download query very slow on large databases

grahamtk
This post has NOT been accepted by the mailing list yet.
In reply to this post by McBride, Ian S.
I dont know if your drupal 7 site performed slow in more areas than the mentioned sql query, but our drupal 7 installation on http://www.nmbu.no responded until very recently below expectations with average execution time for logged in users at about 1.5 seconds per page and around 150 MB memory used per page load.

When we upgraded to php 5.5, apache 2.4 and mysql 5.5 and this blew us away. we now have page execution times of around 500ms and memory usage of around 35MB. This is a huge improvement and has saved a lot of frustration :)

cheers.
Øyvind
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: mm_webform_file_download query very slow on large databases

Dan Wilga-2
In reply to this post by McBride, Ian S.
Hi Adam,

As long as you're happy with your tests, I'm happy. Go ahead and work up a patch using either of the faster methods.

Is the index on the fid column still needed? I'm guessing not, though it would be good to test, since not having to add that would be a plus.

By the way, are you using actual MySQL or MariaDB? The latter might have better performance in this area, potentially skewing the tests.

On 2/3/15 4:53 PM, Adam Franco wrote:
Hi Dan,

I've been doing testing on the query in question on a copy of our production data which has 1,759,838 rows in webform_submitted_data, 5,866 rows in webform_component, and 60,704 rows in file_managed.

The original version of the query takes 0.85 seconds to execute for webform file uris and 4.75 seconds for non-webform file uris:

SELECT
    f.*, d.nid
FROM
    file_managed f
    INNER JOIN webform_submitted_data d ON f.fid = d.data
    INNER JOIN webform_component c ON d.nid = c.nid AND d.cid = c.cid AND c.type = 'file'
WHERE
    f.uri = 'sites/www.middlebury.edu/files/webform/xxxxx.pdf'

The EXPLAIN results indicate that while there is an index on data, it is never used.

In contrast, the new version of this query with CONVERT() takes only 0.0006 seconds for all files:

SELECT
    f.*, d.nid
FROM
    file_managed f
    INNER JOIN webform_submitted_data d ON CONVERT(f.fid USING utf8) = d.data
    INNER JOIN webform_component c ON d.nid = c.nid AND d.cid = c.cid AND c.type = 'file'
WHERE
    f.uri = 'sites/www.middlebury.edu/files/webform/xxxxx.pdf'

As far as I can tell, the reason for the drastic speed-up is that the execution plan is shifted to select first from file_managed using the `uri` index from the WHERE clause, then only convert 0 or 1 values before joining to webform_submitted_data -- rather than having to convert all of the values in either table.

I found that the same effect could be achieved by rewriting the query to generate the file_managed result in a subquery, a method slightly less at the whims of the internals of the MySQL's execution planning:

SELECT
    f.*, d.nid
FROM
    (SELECT
        f.*, CONVERT(f.fid USING utf8) AS fidstring
    FROM
        file_managed f 
    WHERE
        f.uri = 'sites/www.middlebury.edu/files/webform/fakhreddine-acls.pdf'
     ) AS f
    INNER JOIN webform_submitted_data d ON fidstring = d.data
    INNER JOIN webform_component c ON d.nid = c.nid AND d.cid = c.cid AND c.type = 'file'

I'd be happy to do addition tests if there are edge cases you can think of, but rewriting the query seems to be a much simpler option than maintaining derived data in its own column.

Best,
Adam

--

Adam Franco
Senior Software Developer
Information Technology Services
Middlebury College
Middlebury, VT 05753
[hidden email]
802.443.2244

On Tue, Feb 3, 2015 at 3:37 PM, Dan Wilga <[hidden email]> wrote:
It's understandable that simply adding an index wouldn't help. Frankly, I doubt that CONVERT will be any faster, though; in my experience any sort of JOIN using columns of disparate character sets (which causes an implict CONVERT) is very slow.

I think the better solution would be to add a new integer column to webform_submitted_data which mirrors what's in the text "data" column. But, of course, this would require a pretty hefty change in webform.

On 2/3/15 2:52 PM, Adam Franco wrote:
After doing some more testing, it turns out that the patch in webform 7.x-4.2 won't completely solve this issue. Because the mm_webform_file_download($uri) function is joining between an integer file_managed.fid column and a text webform_submitted_data.data column, the new data index in webform-7.x0-4.2 won't be used.

The query in mm_webform_file_download($uri) can be sped up by replacing:
'INNER JOIN {webform_submitted_data} d ON STRCMP(f.fid,d.data) = 0 ' .
with
'INNER JOIN {webform_submitted_data} d ON CONVERT(f.fid USING utf8) = d.data ' .

After we are fully tested and verify that this doesn't cause other problems I'll create an issue with this patch.



Adam

--

Adam Franco
Senior Software Developer
Information Technology Services
Middlebury College
Middlebury, VT 05753
[hidden email]<mailto:[hidden email]>
<a moz-do-not-send="true" href="tel:802.443.2244" value="+18024432244">802.443.2244

On Tue, Feb 3, 2015 at 11:19 AM, McBride, Ian S. <[hidden email]<mailto:[hidden email]>> wrote:
Short version: we recommend upgrading to webform 7.x-4.2 or applying the patch provided here https://www.drupal.org/node/2213945

We attempted to upgrade the main Middlebury site to Drupal 7 this weekend and, while the upgrade completed successfully, the site was too slow to remain online and we had to roll back. It turns out that the culprit was the query in mm_webform_file_download() which is run every time a file is accessed to check if it's a file associated with a webform submission and, therefore, needs another access check.

The query compares f.fid (the file ID) with the values of d.data (the value of every component in every webform submission). Prior to webform 7.x-4.2 there is not an index on webform_submitted_data.data. For us, this meant that every file access check was taking about a second to execute on the database, because we have 1,562,169 rows in webform_submitted_data.

I wanted to let you all know that this may be an issue causing performance degradation on your sites and there's a simple patch available that adds an index to the appropriate column.

Ian McBride
[hidden email]<mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>
skype: [hidden email]<mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>
Web Technologies & Services
Information Technology Services
Middlebury College

---

You are currently subscribed to monster_menus as: [hidden email].

To unsubscribe click here: http://lists.middlebury.edu/u?id=685503.6b071f880fe6a965a128164e6d09ea81&n=T&l=monster_menus&o=711375

(It may be necessary to cut and paste the above URL if the line is broken)

or send a blank email to [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: mm_webform_file_download query very slow on large databases

McBride, Ian S.
In reply to this post by McBride, Ian S.
We’re running MySQL 5.6.10 Enterprise Server – Advanced Edition (Commercial).

Ian McBride
[hidden email]<mailto:[hidden email]>
skype: [hidden email]<mailto:[hidden email]>
Web Technologies & Services
Information Technology Services
Middlebury College

From: Dan Wilga <[hidden email]<mailto:[hidden email]>>
Reply-To: Monster Development <[hidden email]<mailto:[hidden email]>>
Date: Wednesday, February 4, 2015 at 8:50 AM
To: Monster Development <[hidden email]<mailto:[hidden email]>>
Subject: Re: mm_webform_file_download query very slow on large databases

Hi Adam,

As long as you're happy with your tests, I'm happy. Go ahead and work up a patch using either of the faster methods.

Is the index on the fid column still needed? I'm guessing not, though it would be good to test, since not having to add that would be a plus.

By the way, are you using actual MySQL or MariaDB? The latter might have better performance in this area, potentially skewing the tests.

On 2/3/15 4:53 PM, Adam Franco wrote:
Hi Dan,

I've been doing testing on the query in question on a copy of our production data which has 1,759,838 rows in webform_submitted_data, 5,866 rows in webform_component, and 60,704 rows in file_managed.

The original version of the query takes 0.85 seconds to execute for webform file uris and 4.75 seconds for non-webform file uris:

SELECT
    f.*, d.nid
FROM
    file_managed f
    INNER JOIN webform_submitted_data d ON f.fid = d.data
    INNER JOIN webform_component c ON d.nid = c.nid AND d.cid = c.cid AND c.type = 'file'
WHERE
    f.uri = 'sites/www.middlebury.edu/files/webform/xxxxx.pdf<http://www.middlebury.edu/files/webform/xxxxx.pdf>'

The EXPLAIN results indicate that while there is an index on data, it is never used.

In contrast, the new version of this query with CONVERT() takes only 0.0006 seconds for all files:

SELECT
    f.*, d.nid
FROM
    file_managed f
    INNER JOIN webform_submitted_data d ON CONVERT(f.fid USING utf8) = d.data
    INNER JOIN webform_component c ON d.nid = c.nid AND d.cid = c.cid AND c.type = 'file'
WHERE
    f.uri = 'sites/www.middlebury.edu/files/webform/xxxxx.pdf<http://www.middlebury.edu/files/webform/xxxxx.pdf>'

As far as I can tell, the reason for the drastic speed-up is that the execution plan is shifted to select first from file_managed using the `uri` index from the WHERE clause, then only convert 0 or 1 values before joining to webform_submitted_data -- rather than having to convert all of the values in either table.

I found that the same effect could be achieved by rewriting the query to generate the file_managed result in a subquery, a method slightly less at the whims of the internals of the MySQL's execution planning:

SELECT
    f.*, d.nid
FROM
    (SELECT
        f.*, CONVERT(f.fid USING utf8) AS fidstring
    FROM
        file_managed f
    WHERE
        f.uri = 'sites/www.middlebury.edu/files/webform/fakhreddine-acls.pdf<http://www.middlebury.edu/files/webform/fakhreddine-acls.pdf>'
     ) AS f
    INNER JOIN webform_submitted_data d ON fidstring = d.data
    INNER JOIN webform_component c ON d.nid = c.nid AND d.cid = c.cid AND c.type = 'file'

I'd be happy to do addition tests if there are edge cases you can think of, but rewriting the query seems to be a much simpler option than maintaining derived data in its own column.

Best,
Adam

--

Adam Franco
Senior Software Developer
Information Technology Services
Middlebury College
Middlebury, VT 05753
[hidden email]<mailto:[hidden email]>
802.443.2244

On Tue, Feb 3, 2015 at 3:37 PM, Dan Wilga <[hidden email]<mailto:[hidden email]>> wrote:
It's understandable that simply adding an index wouldn't help. Frankly, I doubt that CONVERT will be any faster, though; in my experience any sort of JOIN using columns of disparate character sets (which causes an implict CONVERT) is very slow.

I think the better solution would be to add a new integer column to webform_submitted_data which mirrors what's in the text "data" column. But, of course, this would require a pretty hefty change in webform.

On 2/3/15 2:52 PM, Adam Franco wrote:
After doing some more testing, it turns out that the patch in webform 7.x-4.2 won't completely solve this issue. Because the mm_webform_file_download($uri) function is joining between an integer file_managed.fid column and a text webform_submitted_data.data column, the new data index in webform-7.x0-4.2 won't be used.

The query in mm_webform_file_download($uri) can be sped up by replacing:
'INNER JOIN {webform_submitted_data} d ON STRCMP(f.fid,d.data) = 0 ' .
with
'INNER JOIN {webform_submitted_data} d ON CONVERT(f.fid USING utf8) = d.data ' .

After we are fully tested and verify that this doesn't cause other problems I'll create an issue with this patch.



Adam

--

Adam Franco
Senior Software Developer
Information Technology Services
Middlebury College
Middlebury, VT 05753
[hidden email]<mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>
802.443.2244<tel:802.443.2244>

On Tue, Feb 3, 2015 at 11:19 AM, McBride, Ian S. <[hidden email]<mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>> wrote:
Short version: we recommend upgrading to webform 7.x-4.2 or applying the patch provided here https://www.drupal.org/node/2213945

We attempted to upgrade the main Middlebury site to Drupal 7 this weekend and, while the upgrade completed successfully, the site was too slow to remain online and we had to roll back. It turns out that the culprit was the query in mm_webform_file_download() which is run every time a file is accessed to check if it's a file associated with a webform submission and, therefore, needs another access check.

The query compares f.fid (the file ID) with the values of d.data (the value of every component in every webform submission). Prior to webform 7.x-4.2 there is not an index on webform_submitted_data.data. For us, this meant that every file access check was taking about a second to execute on the database, because we have 1,562,169 rows in webform_submitted_data.

I wanted to let you all know that this may be an issue causing performance degradation on your sites and there's a simple patch available that adds an index to the appropriate column.

Ian McBride
[hidden email]<mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>><mailto:[hidden email]<mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>>
skype: [hidden email]<mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>><mailto:[hidden email]<mailto:[hidden email]><mailto:[hidden email]<mailto:[hidden email]>>>
Web Technologies & Services
Information Technology Services
Middlebury College


---

You are currently subscribed to monster_menus as: [hidden email]<mailto:[hidden email]>.

To unsubscribe click here: http://lists.middlebury.edu/u?id=685439.7e7cbccf9bb225cf8471bffe1cb67503&n=T&l=monster_menus&o=711375

(It may be necessary to cut and paste the above URL if the line is broken)

or send a blank email to [hidden email]<mailto:[hidden email]>

---
You are currently subscribed to monster_menus as: [hidden email].
To unsubscribe click here: http://lists.middlebury.edu/u?id=685503.6b071f880fe6a965a128164e6d09ea81&n=T&l=monster_menus&o=711376
or send a blank email to [hidden email]
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: mm_webform_file_download query very slow on large databases

grahamtk
In reply to this post by grahamtk
Hi!
More on performance, and my attempt to stay on the topic of this mail
thread:

About the non indexed join in the mm_webform_file_download query:
Slightly increasing this might have helped the non-indexed join a bit:
join_buffer_size = 6M
Getting the index on is ofc. better.

*My notes on mysql tuning: *
( would love to get comments on this!)

I mentioned our huge performance increase with php 5.5 and mysql 5.5 on
ubuntu 12.04,
and that most settings were standard. I believed this to be the case,
but it turns out that the upgrade kept the my.cnf settings I had entered.

These are optimized after inspiration from various sources like
stackexchange,
drupal.org, and by trying out adjustments while keeping an eye on the mysql
throughput measurements
and xhprof + devel module query time output,
I have not had time (nor competence) to do this thoroughly scientific.

table_cache   = 2024                                
table_definition_cache = 2024                        
table_open_cache = 16384                            

query_cache_limit = 2M
query_cache_size        = 64M
innodb_flush_method            = O_DIRECT   << I just heard this is not
better than the default.
innodb_log_files_in_group      = 2                  
innodb_log_buffer_size = 4M                          
innodb_additional_mem_pool_size = 4M                
#innodb_log_file_size           = 128M              
innodb_flush_log_at_trx_commit = 2                  
innodb_file_per_table          = 1                  
innodb_buffer_pool_size        = 4G                  
                                                     
skip-name-resolve                                    
                                                                                                         



--
View this message in context: http://monster-menus.2910260.n2.nabble.com/mm-webform-file-download-query-very-slow-on-large-databases-tp7573036p7573044.html
Sent from the Monster Menus mailing list archive at Nabble.com.

---
You are currently subscribed to monster_menus as: [hidden email].
To unsubscribe click here: http://lists.middlebury.edu/u?id=685503.6b071f880fe6a965a128164e6d09ea81&n=T&l=monster_menus&o=711555
or send a blank email to [hidden email]
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: mm_webform_file_download query very slow on large databases

grahamtk
This post has NOT been accepted by the mailing list yet.
In reply to this post by grahamtk
Hi!
More on performance, and my attempt to stay on the topic of this mail thread:

About the non indexed join in the mm_webform_file_download query:
Slightly increasing this might have helped the non-indexed join a bit:
join_buffer_size = 6M
Getting the index on is ofc. better.

My notes on mysql tuning:
( would love to get comments on this!)

I mentioned our huge performance increase with php 5.5 and mysql 5.5 on ubuntu 12.04,
and that most settings were standard. I believed this to be the case,
but it turns out that the upgrade kept the my.cnf settings I had entered.

These are optimized after inspiration from various sources like stackexchange,
drupal.org, and by trying out adjustments while keeping an eye on the mysql throughput measurements
and xhprof + devel module query time output,
I have not had time (nor competence) to do this thoroughly scientific.

table_cache   = 2024                                
table_definition_cache = 2024                        
table_open_cache = 16384                            

query_cache_limit = 2M
query_cache_size        = 64M
innodb_flush_method            = O_DIRECT   << I just heard this is not better than the default.
innodb_log_files_in_group      = 2                  
innodb_log_buffer_size = 4M                          
innodb_additional_mem_pool_size = 4M                
#innodb_log_file_size           = 128M              
innodb_flush_log_at_trx_commit = 2                  
innodb_file_per_table          = 1                  
innodb_buffer_pool_size        = 4G                  
                                                     
skip-name-resolve                                    
                                                                                                         
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: mm_webform_file_download query very slow on large databases

Adam Franco
Administrator
In reply to this post by grahamtk
Thanks for the feedback all. A patch is now available at https://www.drupal.org/node/2423831
Please post any feedback and I'll add the patch to 7.x-1.x in a few days if there is none.

By the way, it doesn't look like mm_webform runs any queries that might use an index on file_managed.fid. That can likely be removed.


Adam

--

Adam Franco
Senior Software Developer
Information Technology Services
Middlebury College
Middlebury, VT 05753
[hidden email]
802.443.2244

On Thu, Feb 5, 2015 at 11:14 AM, grahamtk <[hidden email]> wrote:
Hi!
More on performance, and my attempt to stay on the topic of this mail
thread:

About the non indexed join in the mm_webform_file_download query:
Slightly increasing this might have helped the non-indexed join a bit:
join_buffer_size = 6M
Getting the index on is ofc. better.

*My notes on mysql tuning: *
( would love to get comments on this!)

I mentioned our huge performance increase with php 5.5 and mysql 5.5 on
ubuntu 12.04,
and that most settings were standard. I believed this to be the case,
but it turns out that the upgrade kept the my.cnf settings I had entered.

These are optimized after inspiration from various sources like
stackexchange,
drupal.org, and by trying out adjustments while keeping an eye on the mysql
throughput measurements
and xhprof + devel module query time output,
I have not had time (nor competence) to do this thoroughly scientific.

table_cache   = 2024
table_definition_cache = 2024
table_open_cache = 16384

query_cache_limit = 2M
query_cache_size        = 64M
innodb_flush_method            = O_DIRECT   << I just heard this is not
better than the default.
innodb_log_files_in_group      = 2
innodb_log_buffer_size = 4M
innodb_additional_mem_pool_size = 4M
#innodb_log_file_size           = 128M
innodb_flush_log_at_trx_commit = 2
innodb_file_per_table          = 1
innodb_buffer_pool_size        = 4G

skip-name-resolve




--
View this message in context: http://monster-menus.2910260.n2.nabble.com/mm-webform-file-download-query-very-slow-on-large-databases-tp7573036p7573044.html
Sent from the Monster Menus mailing list archive at Nabble.com.

---
You are currently subscribed to monster_menus as: [hidden email].
To unsubscribe click here: http://lists.middlebury.edu/u?id=685438.780c6126d238396bdd2f98c1d84c15c7&n=T&l=monster_menus&o=711555
or send a blank email to [hidden email]

---

You are currently subscribed to monster_menus as: [hidden email].

To unsubscribe click here: http://lists.middlebury.edu/u?id=685503.6b071f880fe6a965a128164e6d09ea81&n=T&l=monster_menus&o=711798

(It may be necessary to cut and paste the above URL if the line is broken)

or send a blank email to [hidden email]

Loading...