Archive for April, 2007

Ebay Archive chewing on MySQL server resources

Sunday, April 29th, 2007

I received an email from Dreamhost this morning with regards to the Ebay Archive database. It seems that the queries that it was performing wasn’t as optimised as I thought.

# Query_time: 0  Rows_examined: 156795
use  ebayarchive;
SELECT item_number
FROM items_pending
WHERE  item_number = 290108127963;

150k+ rows just for a search?
Sounds like it’s not even using the index I had…
I looked up how to find useful information about queries on google and it says I should put EXPLAIN before the query, so I did.

SQL query: EXPLAIN SELECT item_number
FROM  items_pending
WHERE item_number = 200103800946;

id       select_type       table       type       possible_keys       key       key_len       ref       rows       Extra
1     SIMPLE     items_pending     index     item_number     item_number     16     NULL     158537     Using where; Using index

But it was using the index… hmm, how interesting. How about we try to put the number in quotes?

SQL query: EXPLAIN SELECT item_number
FROM  items_pending
WHERE item_number = '200103800946';

id  select_type  table  type  possible_keys  key   key_len  ref  rows  Extra
1  SIMPLE  items_pending  ref   item_number  item_number  16  const  1  Using where; Using index

Ah ha! What a difference that makes… !

As for the other query that was taking up alot of resources:

EXPLAIN SELECT item_number, category, item_name, item_url, item_end_time
FROM items_pending
ORDER BY item_end_time
LIMIT 0 , 20;

id  select_type  table  type  possible_keys  key  key_len  ref  rows  Extra
1  SIMPLE  items_pending  index  NULL  item_end_time  4  NULL  158537

Doesn’t look promising. It seems like ORDER BY is causing the query not to perform as best it can. So we’ll have to rewrite this query in a different way and also cache the result as the file that uses this query processes each individual item.

Something like this works fine:

EXPLAIN SELECT item_number, category, item_name, item_url, item_end_time
FROM items_pending_disabled
WHERE item_end_time <= 1177143110

id  select_type  table  type  possible_keys  key  key_len  ref  rows  Extra
1  SIMPLE  items_pending_disabled  range  item_end_time  item_end_time  4  NULL  1714  Using where

We’ll have to cache all the rows and then increment the number by say 200-500 and repeat the query. Better a few rows than thousands. Now I’ve just got to wait for Dreamhost to hopefully re-enable the table they disabled.

Ebay Archive categories expanding

Tuesday, April 3rd, 2007

So the Ebay Archive is going fine. There were a few problems initially such as the date/time sorting not working as I decided to use varchar for the database and copy the Ebay date/time instead of using int for the database and convert the Ebay date/time to a unix timestamp; I guess this shows that it’s better to do things properly instead of doing things quickly.

Another issue is the fact that each item takes about 50Kb for the HTML file. This is pretty big and if you 27,000 entries so far, it’s already 1.3 GB… ouch! Lucky for me I’m with Dreamhost which offer 200GB diskspace plus 1GB extra gets added on each week. I’ve now made a header and footer page for the HTML pages which cuts the size by half to 25kb each. Now there’s the problem of going through 27,000 HTML files and removing the parts that aren’t needed.

Finally I’m hoping to expand the categories the Ebay Archive covers to include Cameras, & Accessories, Electrical and Mobile phones.

Edit: Those Ebay Categories above have been added.

phpBB Mod: Use Email as Username

Tuesday, April 3rd, 2007

I’ve created a simple phpBB Mod called Use Email as Username which allows you to insert the email address that a user registers with as their username.