Question

  • Creator
    Topic
  • #2260356

    Decompressing .gz on a web server.

    Locked

    by jozhall ·

    I have an issue that I have never faced therefore I am asking the following question on several forums in hopes that can find a solution.

    I am building a database driven web site using PHP and MySQL. It?s a pretty basic setup nothing fancy or anything that I haven?t done before. However, the data that will go into the database is coming to me via comma delimited text files that are compressed in the Gzip format. There are around 100 of these gzip files that need to be pulled off a FTP server and loaded into database.

    Now here?s the kicker: The database needs to be updated on a daily basis. Which means that the gzip files must be downloaded from the FTP server every morning. They then need to be decompressed into the comma delimited text files and then each one needs to be loaded into the MySQL database.

    Now, if this was for just a few text files then following the procedure above wouldn?t be that big of a deal. But, the thing is, that there are over 100 of these comma delimited text files that have been compressed in gzip. All of these files need to be inserted into the database every morning ASAP, to stay competitive with our competition.

    So, my question is: is there any server side utility that can decompress a large collection of gzip files and then insert them into a MySQL database? Or is there something that I can do (PHP script wise or other) to minimize this process?

    Here is some info on the server and software that I am running:

    Operating system Linux
    Kernel version 2.6.17.11-grsechg
    Apache version 1.3.37
    PERL version 5.8.7
    PHP version 4.4.7
    MySQL version 4.1.21-standard

All Answers

  • Author
    Replies
    • #2621443

      Clarifications

      by jozhall ·

      In reply to Decompressing .gz on a web server.

      Clarifications

    • #2617967

      Yes there is

      by saurondor ·

      In reply to Decompressing .gz on a web server.

      I’d use a cron job that is triggered at the desired time

      Make it run a shell script that FTPs to the server, downloads the gzip file, decompresses it and then inserts into the MySQL database.

      The flow would be something like this

      // get remote files
      ftp mydataserver.com
      // you’ll need a .netrc file with the login credentials and what to do in the file. google .netrc

      // for all files downloaded, ungzip
      for $file=files in destination directory
      {
      gzip -d $file
      // if there are various files in the gzip then you’ll need to nest the following command
      // for all ungziped files insert into mysql
      echo “%csv insert query here%” | mysql -u sqluser –password=sqlpassword targetDatabase
      }

      // clean directories if needed
      rm $file or rm -Rf %target directory%

      Your insert query will be of the type (see mysql manual)

      LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE ‘file_name’
      [REPLACE | IGNORE]
      INTO TABLE tbl_name
      [CHARACTER SET charset_name]
      [FIELDS
      [TERMINATED BY ‘string’]
      [[OPTIONALLY] ENCLOSED BY ‘char’]
      [ESCAPED BY ‘char’]
      ]
      [LINES
      [STARTING BY ‘string’]
      [TERMINATED BY ‘string’]
      ]
      [IGNORE number LINES]
      [(col_name_or_user_var,…)]
      [SET col_name = expr,…]

Viewing 1 reply thread