08 Feb 2012, 22:56
Lewis Parker (3 posts)

Bzip2 kept complaining about the output from wget. I switched to curl and all seemed to work. Specifically I used:

curl http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2 | bzcat | ${HBASE_HOME}/bin/hbase shell import_from_wikipedia.rb

Note: I also switched to bzcat, which decompresses to stdout by default.
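
A quick way to check the bzcat behavior locally, without pulling the full dump, is sketched below. It assumes bzip2 and bzcat are on the PATH; the temporary sample file and its one-line XML content are made up for illustration and are not part of the Wikipedia dump or the import script.

```shell
# Build a tiny bzip2 stream locally (hypothetical sample data, not the real dump).
sample=$(mktemp)
printf '<page><title>Example</title></page>\n' | bzip2 -c > "$sample"

# bzcat writes the decompressed data to stdout, so it can sit in the middle of a pipe.
via_bzcat=$(bzcat < "$sample")

# bzip2 -dc is the explicit form: -d decompresses, -c writes to stdout.
via_bzip2=$(bzip2 -dc < "$sample")

echo "$via_bzcat"
rm -f "$sample"
```

If both forms print the same line, either one should be able to feed the hbase shell step in the pipe above.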

It seems to work just fine.

P.S. I’m on Mac OS X 10.7.3

09 Feb 2012, 08:25
Jim R. Wilson (85 posts)

Hi Lew,

Thanks for letting us know! I wrote that part while testing against Ubuntu, so maybe there’s a difference in the wget implementation that ships with OS X? I’ll see if I can find something that works cross-platform and work it in.

Thanks again,

– Jim

13 Feb 2012, 09:37
Jim R. Wilson (85 posts)

Thanks again, Lew. Tested bzcat on Ubuntu, works great! Next release will have a more streamlined pipe.

14 Feb 2012, 04:44
Lewis Parker (3 posts)

Sweet! Happy to provide something helpful. Can’t wait for the next release.
