pack_isam
is an extra utility that you get when you order more than 10
licenses or extended support. Since pack_isam
is distributed only in
binary form, pack_isam
is available only on some platforms.
Of course, all future updates to pack_isam
are included in the
price. pack_isam
may at some time be included as standard when
we get some kind of turnover for MySQL.
pack_isam
works by compressing each column in the table separately.
The information needed to decompress columns is read into memory when the
table is opened. This results in much better performance when accessing
individual records, since you only have to uncompress exactly one record, not
a much larger disk block like when using Stacker on MS-DOS.
Usually, pack_isam
packs the data file 40%-70%.
MySQL uses memory mapping (mmap()
) on compressed tables and
falls back to normal read/write file usage if mmap()
doesn't work.
There are currently two limitations with pack_isam
:
-
After packing, the table is read only.
-
It can't pack
BLOB
columns, yet.
Fixing these limitations is on our TODO list but with low priority.
pack_isam
is invoked like this:
shell> pack_isam [options] filename ...
Each filename should be the name of an index (`.ISM') file. If you
are not in the database directory, you should specify the pathname to the
file. It is permissible to omit the `.ISM' extension.
pack_isam
supports the following options:
-b, --backup
-
Make a backup of the table as
nom_table.OLD
.
-#, --debug=debug_options
-
Output debug log. The
debug_options
string often is
'd:t:o,filename'
.
-f, --force
-
Force packing of the table even if it becomes bigger or if the temporary file
exists. (
pack_isam
creates a temporary file named `nom_table.TMD'
while it compresses the table. If you kill pack_isam
, the `.TMD'
file may not be deleted. Normally, pack_isam
exits with an error if
it finds that `nom_table.TMD' exists. With --force
,
pack_isam
packs the table anyway.
-?, --help
-
Display a help message and exit.
-j big_nom_table, --join=big_nom_table
-
Join all tables named on the command line into a single table
big_nom_table
. All tables that are to be combined
MUST be identical (same column names and types, same indexes, etc.)
-p #, --packlength=#
-
Specify the record length storage size, in bytes. The value should be 1, 2
or 3. (
pack_isam
stores all rows with length pointers of 1, 2 or 3
bytes. In most normal cases, pack_isam
can determine the right length
value before it begins packing the file, but it may notice during the packing
process that it could have used a shorter length. In this case,
pack_isam
will print a note that the next time you pack the same file,
you could use a shorter record length.)
-s, --silent
-
Silent mode. Write output only when errors occur.
-t, --test
-
Don't pack table, only test packing it.
-T dir_name, --tmp_dir=dir_name
-
Use the named directory as the location in which to write the temporary table.
-v, --verbose
-
Verbose mode. Write info about progress and packing result.
-V, --version
-
Display version information and exit.
-w, --wait
-
Wait and retry if table is in use.
If the
mysqld
server was invoked with the --skip-locking
option, it is not a good idea to invoke pack_isam
if the table might
be updated during the packing process.
The sequence of commands shown below illustrates a typical table compression
session:
shell> ls -l station.*
-rw-rw-r-- 1 monty my 994128 Apr 17 19:00 station.ISD
-rw-rw-r-- 1 monty my 53248 Apr 17 19:00 station.ISM
-rw-rw-r-- 1 monty my 5767 Apr 17 19:00 station.frm
shell> isamchk -dvv station
ISAM file: station
Isam-version: 2
Creation time: 1996-03-13 10:08:58
Recover time: 1997-02-02 3:06:43
Data records: 1192 Deleted blocks: 0
Datafile: Parts: 1192 Deleted data: 0
Datafile pointer (bytes): 2 Keyfile pointer (bytes): 2
Max datafile length: 54657023 Max keyfile length: 33554431
Recordlength: 834
Record format: Fixed length
table description:
Key Start Len Index Type Root Blocksize Rec/key
1 2 4 unique unsigned long 1024 1024 1
2 32 30 multip. text 10240 1024 1
Field Start Length Type
1 1 1
2 2 4
3 6 4
4 10 1
5 11 20
6 31 1
7 32 30
8 62 35
9 97 35
10 132 35
11 167 4
12 171 16
13 187 35
14 222 4
15 226 16
16 242 20
17 262 20
18 282 20
19 302 30
20 332 4
21 336 4
22 340 1
23 341 8
24 349 8
25 357 8
26 365 2
27 367 2
28 369 4
29 373 4
30 377 1
31 378 2
32 380 8
33 388 4
34 392 4
35 396 4
36 400 4
37 404 1
38 405 4
39 409 4
40 413 4
41 417 4
42 421 4
43 425 4
44 429 20
45 449 30
46 479 1
47 480 1
48 481 79
49 560 79
50 639 79
51 718 79
52 797 8
53 805 1
54 806 1
55 807 20
56 827 4
57 831 4
shell> pack_isam station.ISM
Compressing station.ISM: (1192 records)
- Calculating statistics
normal: 20 empty-space: 16 empty-zero: 12 empty-fill: 11
pre-space: 0 end-space: 12 table-lookups: 5 zero: 7
Original trees: 57 After join: 17
- Compressing file
87.14%
shell> ls -l station.*
-rw-rw-r-- 1 monty my 127874 Apr 17 19:00 station.ISD
-rw-rw-r-- 1 monty my 55296 Apr 17 19:04 station.ISM
-rw-rw-r-- 1 monty my 5767 Apr 17 19:00 station.frm
shell> isamchk -dvv station
ISAM file: station
Isam-version: 2
Creation time: 1996-03-13 10:08:58
Recover time: 1997-04-17 19:04:26
Data records: 1192 Deleted blocks: 0
Datafile: Parts: 1192 Deleted data: 0
Datafilepointer (bytes): 3 Keyfile pointer (bytes): 1
Max datafile length: 16777215 Max keyfile length: 131071
Recordlength: 834
Record format: Compressed
table description:
Key Start Len Index Type Root Blocksize Rec/key
1 2 4 unique unsigned long 10240 1024 1
2 32 30 multip. text 54272 1024 1
Field Start Length Type Huff tree Bits
1 1 1 constant 1 0
2 2 4 zerofill(1) 2 9
3 6 4 no zeros, zerofill(1) 2 9
4 10 1 3 9
5 11 20 table-lookup 4 0
6 31 1 3 9
7 32 30 no endspace, not_always 5 9
8 62 35 no endspace, not_always, no empty 6 9
9 97 35 no empty 7 9
10 132 35 no endspace, not_always, no empty 6 9
11 167 4 zerofill(1) 2 9
12 171 16 no endspace, not_always, no empty 5 9
13 187 35 no endspace, not_always, no empty 6 9
14 222 4 zerofill(1) 2 9
15 226 16 no endspace, not_always, no empty 5 9
16 242 20 no endspace, not_always 8 9
17 262 20 no endspace, no empty 8 9
18 282 20 no endspace, no empty 5 9
19 302 30 no endspace, no empty 6 9
20 332 4 always zero 2 9
21 336 4 always zero 2 9
22 340 1 3 9
23 341 8 table-lookup 9 0
24 349 8 table-lookup 10 0
25 357 8 always zero 2 9
26 365 2 2 9
27 367 2 no zeros, zerofill(1) 2 9
28 369 4 no zeros, zerofill(1) 2 9
29 373 4 table-lookup 11 0
30 377 1 3 9
31 378 2 no zeros, zerofill(1) 2 9
32 380 8 no zeros 2 9
33 388 4 always zero 2 9
34 392 4 table-lookup 12 0
35 396 4 no zeros, zerofill(1) 13 9
36 400 4 no zeros, zerofill(1) 2 9
37 404 1 2 9
38 405 4 no zeros 2 9
39 409 4 always zero 2 9
40 413 4 no zeros 2 9
41 417 4 always zero 2 9
42 421 4 no zeros 2 9
43 425 4 always zero 2 9
44 429 20 no empty 3 9
45 449 30 no empty 3 9
46 479 1 14 4
47 480 1 14 4
48 481 79 no endspace, no empty 15 9
49 560 79 no empty 2 9
50 639 79 no empty 2 9
51 718 79 no endspace 16 9
52 797 8 no empty 2 9
53 805 1 17 1
54 806 1 3 9
55 807 20 no empty 3 9
56 827 4 no zeros, zerofill(2) 2 9
57 831 4 no zeros, zerofill(1) 2 9
The information printed by pack_isam
is described below:
normal
-
The number of columns for which no extra packing is used.
empty-space
-
The number of columns containing
values that are only spaces; these will occupy 1 bit.
empty-zero
-
The number of columns containing
values that are only binary 0's; these will occupy 1 bit.
empty-fill
-
The number of integer columns that don't occupy the full byte range of their
type; these are changed to a smaller type (for example, an
INTEGER
column may be changed to MEDIUMINT
).
pre-space
-
The number of decimal columns that are stored with leading space. In this
case, each value will contain a count for the number of leading spaces.
end-space
-
The number of columns that have a lot of trailing space. In this case, each
value will contain a count for the number of trailing spaces.
table-lookup
-
The column had only a small number of different values, and that were
converted to an
ENUM
before Huffman compression.
zero
-
The number of columns for which all values are zero.
Original trees
-
The initial number of Huffman trees.
After join
-
The number of distinct Huffman trees left after joining
trees to save some header space.
After a table has been compressed, isamchk -dvv
prints additional
information about each field:
Type
-
The field type may contain the following descriptors:
constant
-
All rows have the same value.
no endspace
-
Don't store endspace.
no endspace, not_always
-
Don't store endspace and don't do end space compression for all values.
no endspace, no empty
-
Don't store endspace. Don't store empty values.
table-lookup
-
The column was converted to an
ENUM
.
zerofill(n)
-
The most significant
n
bytes in the value are always 0 and are not
stored.
no zeros
-
Don't store zeros.
always zero
-
0 values are stored in 1 bit.
Huff tree
-
The Huffman tree associated with the field
Bits
-
The number of bits used in the Huffman tree.