Decompressing .gz on a web server.

By jozhall ·
I have an issue that I have never faced before, so I am asking the following question on several forums in the hope that I can find a solution.

I am building a database-driven web site using PHP and MySQL. It's a pretty basic setup, nothing fancy or anything that I haven't done before. However, the data that will go into the database comes to me as comma-delimited text files compressed in the gzip format. There are around 100 of these gzip files that need to be pulled off an FTP server and loaded into the database.

Now here's the kicker: the database needs to be updated on a daily basis, which means the gzip files must be downloaded from the FTP server every morning. They then need to be decompressed into comma-delimited text files, and each one needs to be loaded into the MySQL database.

Now, if this were just a few text files, following the procedure above wouldn't be a big deal. But there are over 100 of these comma-delimited text files compressed with gzip, and all of them need to be inserted into the database every morning ASAP so that we stay competitive.

So, my question is: is there any server-side utility that can decompress a large collection of gzip files and then insert them into a MySQL database? Or is there something I can do (a PHP script or otherwise) to streamline this process?
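Since gzip is present on virtually every Linux server, the decompression half can be a few lines of shell. A minimal sketch (the function name and directory layout are my own assumptions, not from the thread):

```shell
#!/bin/sh
# decompress_all: run gzip -d on every .gz file in the given directory.
# gzip -d replaces each file.csv.gz with file.csv in place.
decompress_all() {
    dir=$1
    for f in "$dir"/*.gz
    do
        gzip -d "$f"
    done
}

# example: decompress_all /var/data/ftp-download
```

MySQL also ships a stock command-line bulk loader, mysqlimport, which wraps LOAD DATA INFILE, so with a loop like this the whole morning run can stay in one shell script.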

Here is some info on the server and software that I am running:

Operating system: Linux
Kernel version:
Apache version: 1.3.37
PERL version: 5.8.7
PHP version: 4.4.7
MySQL version: 4.1.21-standard


2 total posts (Page 1 of 1)  

All Answers

Yes there is

by Saurondor In reply to Decompressing .gz on a we ...

I'd use a cron job that is triggered at the desired time.
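For the scheduling part, a single crontab entry is enough; a sketch, where the time of day and the script path are assumptions:

```crontab
# run the import script every day at 5:00 AM (path and time are examples)
0 5 * * * /usr/local/bin/daily_import.sh >> /var/log/daily_import.log 2>&1
```

Redirecting stdout and stderr to a log file makes it easier to see which morning runs failed.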

Make it run a shell script that FTPs to the server, downloads the gzip files, decompresses them, and then inserts the data into the MySQL database.

The flow would be something like this:

# get remote files
# you'll need a .netrc file with the login credentials and what to do; google .netrc

# for each downloaded file, ungzip
for file in %destination directory%/*.gz
do
    gzip -d "$file"
done

# for each ungzipped file, insert into MySQL
# (if there are various files in the gzip, nest the following command in another loop)
echo "%csv insert query here%" | mysql -u sqluser --password=sqlpassword targetDatabase

# clean directories if needed
rm %destination directory%/*    # or: rm -Rf %target directory%

Your insert query will be of the LOAD DATA INFILE type (see the MySQL manual):

LOAD DATA [LOCAL] INFILE 'file_name'
INTO TABLE tbl_name
[CHARACTER SET charset_name]
[FIELDS
    [TERMINATED BY 'string']
    [ESCAPED BY 'char']
]
[LINES
    [STARTING BY 'string']
    [TERMINATED BY 'string']
]
[SET col_name = expr,...]
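Filled in with a hypothetical file and table name (neither comes from the thread), the query can be generated per file and piped straight into the mysql client as in the flow above:

```shell
#!/bin/sh
# Build a LOAD DATA INFILE statement for one comma-delimited file.
# The file path and table name below are examples only.
make_load_query() {
    file=$1
    table=$2
    printf "LOAD DATA INFILE '%s' INTO TABLE %s FIELDS TERMINATED BY ',';" "$file" "$table"
}

# usage, one file at a time:
# make_load_query /var/data/products.csv products | mysql -u sqluser --password=sqlpassword targetDatabase
```

LINES TERMINATED BY '\n' is the default, so only the field delimiter needs to be spelled out for a plain comma-delimited file.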
