Matthew Hutchinson

about

Matt is a web developer from Northern Ireland. He currently runs Hiddenloop and works in Dublin. Want to find out just a little bit more?

An audio feed is available for the latest articles at matthewhutchinson.net, find it here.

RadRails

posted on Wednesday, the 21st of December 2005 at 18:03

If you're developing Ruby on Rails web apps on Windows (or any platform really), why not try RadRails?

The IDE itself runs on top of a very light version of Eclipse. It allows database queries to be fired off right in the console and has integrated support for working with Subversion repositories. It's promising all you could ask for under one roof.

I came across this recently and first impressions are good. There is even a blog and podcast you can follow on its development.

No longer a chemist

posted on Tuesday, the 20th of December 2005 at 03:51

I’ve just had my first experience of blog comment spam. Despite Typo’s best efforts, about 20 comments have been posted across all my blog articles over the last week or so, selling all kinds of nasty stuff.

And much to your disappointment, I have removed these comments, so you will have to use another site to get your prescription drugs – or check your spam inbox.

For now you can only comment on articles less than 7 days old, until I jump into Typo and sort this out a little better.

Rails 1.0

posted on Wednesday, the 14th of December 2005 at 01:22

has been released! There’s even a new website for the framework, designed by the guys at 37Signals. Quoted from the site:

15 months after the first public release, Rails has arrived at the big 1.0. What a journey! We’ve gone through thousands of revisions, tickets, and patches from hundreds of contributors to get here. I’m incredibly proud at the core committer team, the community, and the ecosystem we’ve raised around this framework.

Skribe Development Weekly

posted on Friday, the 9th of December 2005 at 08:41

Welcome to the very first edition of the Skribe Development Weekly, a summary of the latest work carried out on the Skribe Project over the last 7 days.

If you need a run-down on what Skribe is, check out this earlier posting.

Not much to report this week. I spent a good few hours setting up our new (shared) Dreamhost server for Ruby on Rails development, using 2 Subversion repositories (one for production, one for development). Thanks go to Steve for letting us use his new Dreamhost account.

I also posted a lengthy guide to the team on how to connect and work with this repository and set up a local Windows machine for Ruby on Rails development, with Apache2/FastCGI/MySQL, and TortoiseSVN (as a Subversion client).

I intend to summarise and improve this guide for posting on my blog. Some (not so apparent) things cropped up when I was going through the whole process.

Tom continues to investigate Ruby, and is happily working away on the Top 12 Ruby on Rails Tutorials. Steve is performing a competitor review, looking at what we are up against and brainstorming ideas for our first release.

All in all a productive week. The todo list keeps getting bigger and bigger, and hopefully by next week a few logo designs will be ready for unveiling. Stay tuned for the next installment.

PHP pushups

posted on Thursday, the 8th of December 2005 at 03:18

I recently had the rather unpleasant task of writing some PHP to compare a CSV file (with some 22,000+ entries) with a MySQL database. With the CSV file holding the master copy of the data, the script would update, insert and delete rows in MySQL to match. It needed to run as a daily cron job on my (shared) Dreamhost box.

This would normally be simple enough, using a status field on the CSV file to indicate fields that had been updated. Unfortunately there was no status field, and none could be added. In fact the CSV file could not be modified at all. The only way to check if a row had been modified was to do a field by field comparison on every row.

I started off with a single script that imported the CSV into an array, and also extracted all rows from the DB table. With some looping to search through all rows and all fields in each row, I got the script to work. Great! (I thought.)

But with 22,000 rows each looping over ~22,000 other rows (22,000 × 22,000 = 484 million loops), the script took minutes to execute, and if left long enough it ate up 100% CPU (through PHP). Even using an exponential back-off search took too long.

On Dreamhost, if any script you run nears 100% CPU usage, it's killed automatically. A major rethink was required. So I decided to split the script in two.
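To put a number on that, here is a minimal sketch in Python (not the original PHP) that counts the id comparisons a naive nested scan makes. The function name and the small id ranges are made up for illustration; the point is only that the worst case grows as rows × rows.

```python
def nested_loop_comparisons(csv_ids, db_ids):
    """Count id comparisons in a naive scan: for each CSV id, walk the DB ids until a match."""
    comparisons = 0
    for csv_id in csv_ids:
        for db_id in db_ids:
            comparisons += 1
            if csv_id == db_id:
                break  # stop scanning once this CSV row has found its partner
    return comparisons


# Worst case: no id ever matches, so every possible pair is compared.
print(nested_loop_comparisons(range(0, 200), range(1000, 1200)))  # 40000, i.e. 200 * 200

# The same worst case at the size of the real file.
print(22_000 * 22_000)  # 484000000
```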

  • script 1 – would create a temporary table in the database and simply import the CSV file into it, row by row.
  • script 2 – running a few minutes later, would then compare the two tables using MySQL queries (rather than a PHP loop search). After performing all updates, inserts and deletes, the temporary table would be destroyed.

The comparison script (2) works by looking for id matches between the two tables, and marking any rows found. If a match is found, both rows are fetched and a field by field comparison is made to check if an UPDATE statement is needed.

Finally, any rows in the master CSV file not marked as found are added, and any rows in the DB not marked as found are deleted.

Using 2 tables for the comparison, rather than looping and searching in PHP, meant that the strain was now on MySQL (rather than PHP). Dreamhost seems to tolerate this, and the PHP script execution time is reduced from minutes to seconds.
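The two-table idea can be sketched roughly as follows. This is not the original script: it uses Python and an in-memory SQLite database as stand-ins for PHP and MySQL, the table names (`items`, `staging`) and columns (`id`, `name`, `price`) are invented for the example, and instead of marking rows as found it lets the database work out the three sets (changed, new, gone) with joins and subqueries.

```python
import csv
import io
import sqlite3


def sync_from_csv(conn, csv_text):
    """Make the hypothetical `items` table match the CSV, which is the master copy."""
    cur = conn.cursor()

    # Step 1: load the CSV into a temporary staging table, row by row.
    cur.execute("CREATE TEMP TABLE staging (id INTEGER PRIMARY KEY, name TEXT, price TEXT)")
    rows = [(int(r[0]), r[1], r[2]) for r in csv.reader(io.StringIO(csv_text))]
    cur.executemany("INSERT INTO staging (id, name, price) VALUES (?, ?, ?)", rows)

    # Step 2a: ids present in both tables whose fields differ need an UPDATE.
    # (Plain <> comparisons; columns that may hold NULL would need extra care.)
    changed = cur.execute(
        "SELECT s.id, s.name, s.price FROM staging AS s "
        "JOIN items AS i ON i.id = s.id "
        "WHERE s.name <> i.name OR s.price <> i.price"
    ).fetchall()
    cur.executemany(
        "UPDATE items SET name = ?, price = ? WHERE id = ?",
        [(name, price, row_id) for row_id, name, price in changed],
    )

    # Step 2b: ids that exist only in the CSV are inserted.
    cur.execute(
        "INSERT INTO items (id, name, price) "
        "SELECT s.id, s.name, s.price FROM staging AS s "
        "WHERE s.id NOT IN (SELECT id FROM items)"
    )
    inserted = cur.rowcount

    # Step 2c: ids that exist only in the database are deleted.
    cur.execute("DELETE FROM items WHERE id NOT IN (SELECT id FROM staging)")
    deleted = cur.rowcount

    # Step 3: the staging table is no longer needed.
    cur.execute("DROP TABLE staging")
    conn.commit()
    return {"updated": len(changed), "inserted": inserted, "deleted": deleted}


# Tiny demonstration with made-up data.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price TEXT)")
demo.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "apple", "0.50"), (2, "pear", "0.75"), (3, "plum", "0.20")],
)
print(sync_from_csv(demo, "1,apple,0.50\n2,pear,0.80\n4,fig,1.10\n"))
```

In the demo, the pear's price changes (one update), the fig is new (one insert) and the plum is missing from the CSV (one delete). The CSV itself is only read, never modified.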

And why am I explaining all this, you ask?

1. So I can remember what on earth I did.
2. I’m curious to know if anyone can think of a better way to do this, bearing in mind the limiting factors: the CSV file CANNOT be altered in any way, the script has to execute in seconds, and the CPU usage cannot approach 100%.