12 May 2016, 20:30
Bernard Kaiflin (14 posts)

On my machine (OS X El Capitan 10.11.4), the lib_trigrams example fails: line 85 of lib_trigrams.erl calls zlib:gunzip, which raises a data_error:

9> c(lib_trigrams).
{ok,lib_trigrams}
10> lib_trigrams:make_tables().
** exception error: data_error
     in function  zlib:call/3 
     in call from zlib:inflate/2 
     in call from zlib:gunzip/1 
     in call from lib_trigrams:for_each_trigram_in_the_english_language/2 (lib_trigrams.erl, line 85)
     in call from timer:tc/3 (timer.erl, line 197)
     in call from lib_trigrams:make_tables/0 (lib_trigrams.erl, line 20)
11> 

If I try to gunzip the file manually, it fails with a CRC error:

$ gunzip 354984si.ngl.gz 
gunzip: invalid compressed data--crc error
gunzip: 354984si.ngl.gz: uncompress failed
$ gunzip -V
Apple gzip 251

I downloaded the original archive from the Moby project (http://icon.shef.ac.uk/Moby/). It is the last file on the page:

Moby Words
610,000+ words and phrases. The largest word list in the world
mwords.tar.Z [4.0MB].

After decompression, the original 354984si.ngl file can be found in the words directory. Then adjust lib_trigrams.erl around line 85 so that it reads the uncompressed file directly:

for_each_trigram_in_the_english_language(F, A0) ->
%%    {ok, Bin0} = file:read_file("354984si.ngl.gz"),
    {ok, Bin} = file:read_file("354984si.ngl"),
%%    Bin = zlib:gunzip(Bin0),
    scan_word_list(binary_to_list(Bin), F, A0).

Now it works:

2> c(lib_trigrams). 
{ok,lib_trigrams}
3> lib_trigrams:make_tables().
Counting - No of trigrams=3357707 time/trigram=0.2026630078205156
Ets ordered Set size=19.0346696570414 time/trigram=0.9076205875021257
Ets set size=19.033922063358563 time/trigram=0.6052576952068778
Module sets size=9.433978132884777 time/trigram=2.0565415624412733
ok
4> 
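
Instead of commenting lines out, the same workaround could be written as a fallback: try the gzipped file first, and read the uncompressed copy only if decompression fails. The following is a minimal sketch, not the book's code. The module name word_list and the function read/2 are hypothetical; only file:read_file/1 and zlib:gunzip/1 are the calls lib_trigrams.erl already uses.

```erlang
-module(word_list).
-export([read/2]).

%% Hypothetical helper (not part of lib_trigrams.erl).
%% Returns {ok, Binary} with the word list contents, preferring the
%% gzipped file and falling back to the uncompressed one.
read(GzFile, PlainFile) ->
    case read_gz(GzFile) of
        {ok, Bin} ->
            {ok, Bin};
        {error, _Reason} ->
            %% Gzipped file is missing or corrupt: use the plain copy.
            file:read_file(PlainFile)
    end.

%% Read and decompress a gzip file. zlib:gunzip/1 raises an error
%% (for example data_error) on corrupt input, so turn that into a tuple.
read_gz(File) ->
    case file:read_file(File) of
        {ok, Bin0} ->
            try
                {ok, zlib:gunzip(Bin0)}
            catch
                error:Reason -> {error, Reason}
            end;
        {error, _} = Err ->
            Err
    end.
```

Under that assumption, for_each_trigram_in_the_english_language/2 could start with {ok, Bin} = word_list:read("354984si.ngl.gz", "354984si.ngl") and then call scan_word_list(binary_to_list(Bin), F, A0) as before.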