Audio::MPEG - Encoding and decoding of MPEG Audio (MP3)
use Audio::MPEG;
Audio::MPEG is a Perl interface to the LAME and MAD MPEG audio Layers I, II, and III encoding and decoding libraries.
I have been building a fairly extensive MP3 library, and decided to write some software to help manage the collection. It's turned out to be a rather cool piece of software (incidentally, I will be releasing it under the GPL shortly), with both a web and command line interface, good searching, integrated ripping, archive statistics, etc.
However, I also wanted to be able to stream audio, and verify the integrity of files in the archive. It is certainly possible to stream audio (even with re-encoding at a different bitrate) without resorting to writing interface glue like this module, but verification of the files was clumsy at best (e.g. scanning stdout/err for strings), and useless at worst.
Thus, Audio::MPEG was born.
LAME is arguably the best quality MPEG encoder available (certainly the best GPL encoder). Portions of the code have been optimized to take advantage of some of the advanced features of Intel/AMD processors, but even on non-optimized machines, such as the PowerPC, it performs quite well (faster than real-time on machines from the late 1990s onward).
MAD is a relatively new MPEG decoding library. I chose it after struggling
to clean up the MPEG decoding library included with LAME (which is based
on Michael Hipp's mpg123(1) implementation). In the end, I was very pleased
with the results. MAD performs its decoding with an internal precision
of 24 bits (pro-level quality) with fixed-point arithmetic. The code
is very clean, and seems rock-solid. Although it may seem that it should
be faster than the mpg123(1) library due to the use of fixed-point arithmetic,
it is in fact about 60% or so of the speed (due to the higher resolution
audio). However, the ease of coding against MAD, and the higher
precision of the output more than make up for the slower decoding.
Audio::MPEG can export the data at its highest precision for programs that wish to manipulate the data at the higher resolution.
I have only tested this on a Linux 2.4.x system so far, but I see no reason why it should not work on any Un*x variant. In fact, it may actually even work on a Windoze box (the underlying LAME and MAD libraries apparently compile somehow on them). I am doing no special magic with the interface, so presumably it will work under Windows. As you can probably tell, I don't really care if it does (I may start caring if M$ releases the source code to Windows under GPL, BSD, or Artistic licenses...). But, for you poor, misguided souls that insist upon running Windows, I expect that there should be little problem getting it to work.
You would think that with encoding/decoding audio, which is quite a compute-intensive task, Perl would be much slower than the equivalent pure C programs. Surprise... it is only about 3% slower (!). Even with the mechanism I use here (Perl->C->Perl for every frame), Perl 5.6.1 on Linux 2.4.4 (PowerPC 7500) performs just fantastically. So, the moral of this paragraph is to run your own performance tests, but there's no need to think that your own Perl encoder/decoder will be inferior to a pure C/C++ implementation. The only drawback is that, depending upon how much buffer space you use for reading, memory usage will be at least 3 times as much (eh... RAM is cheap...).
This is simply the package that bootstraps the XS library, and there is no external interface.
Once a stream has started to be decoded, the object may only be used for that stream (due to state information kept in the object).
Method returns the length of data, in bytes, that has not been decoded yet.
Method returns 1 if a frame was decoded (successfully or not), and 0 if it ran out of data before finishing decoding.
Upon return, the program should check $obj->err. If it is > 0, then a decoding error has occurred, and no PCM synthesis is possible (i.e. the frame should be skipped). See the EXAMPLES section later in this document.
Method returns 1 if a frame was parsed (successfully or not), and 0 if it ran out of data before finishing parsing.
Upon return, the program should check $obj->err. If it is > 0, then a decoding error has occurred (i.e. the frame should be skipped). See the EXAMPLES section later in this document.
If the second parameter is 1, a full decoding of the MP3 file will occur. If it is undef, only the frame headers will be decoded, not the audio data.
This may be further tuned by passing a third parameter that indicates the number of errors to be found before declaring the verification a failure.
Method returns 1 if file is OK, undef if damaged.
This creates a new object. Each object has its own private context, so it is possible to have more than one object created at a time.
The parameters for new are as follows:
All PCM formats are in the native byte order (i.e. big- or little-endian) of the machine that generates the output. Default is 5.
Currently, only Sun mulaw and WAV formats have headers.
The parameters for new are as follows:
Note that this setting is independent of the input sample frequency: LAME will resample if required.
Values of the samples may range from -1.0 to +1.0.
The output is a scalar that contains zero or more MP3 frames.
Values of the samples may range from -32768 to +32767.
The output is a scalar that contains zero or more MP3 frames.
Important note: the file handle must have been opened read/write (e.g. open(FILE, "+>$filename")).
Below are a few examples to show how to use this module.
This will take an input MP3 file, and create an output WAV file that is fixed to a sample rate of 44.1 kHz and 2 channels (it will resample the input if needed). This is useful when creating WAV files for burning an audio CD (where it is necessary to have all input at 44.1 kHz, 2 channels).
use Audio::MPEG;
my $in_file = shift || "test.mp3";
my $out_file = shift || "test.wav";
open(IN, "<$in_file") || die "$in_file: $!";
open(OUT, ">$out_file") || die "$out_file: $!";
my $mp3 = Audio::MPEG::Decode->new;
my ($in, $wav, $wav_len);
while (my $read_bytes = read(IN, $in, 40_000)) {
    $mp3->buffer($in);
    while ($mp3->decode_frame) {
        if (not $mp3->err_ok) {
            printf("Frame: %u: %s\n", $mp3->current_frame,
                $mp3->errstr);
            next;
        }
        $mp3->synth_frame;
        if (not $wav) {
            $wav = Audio::MPEG::Output->new({ type => 'wave' });
            print OUT $wav->header;
        }
        my $out = $wav->encode($mp3->pcm);
        $wav_len += length($out);
        print OUT $out;
    }
}
if (seek(OUT, 0, 0)) {
    print OUT $wav->header($wav_len);
}
This will take an input MP3 file, and create an output MP3 file that is VBR encoded (128 kbps average is the default).
use Audio::MPEG;
my $in_file = shift || "test.mp3";
my $out_file = shift || "test2.mp3";
open(IN, "<$in_file") || die "$in_file: $!";
# Important: OUT is opened r/w if it is a real file
open(OUT, "+>$out_file") || die "$out_file: $!";
my $mp3_in = Audio::MPEG::Decode->new;
my ($in, $pcm, $mp3_out);
while (my $read_bytes = read(IN, $in, 40_000)) {
    $mp3_in->buffer($in);
    while ($mp3_in->decode_frame) {
        if (not $mp3_in->err_ok) {
            printf("Frame: %u: %s\n", $mp3_in->current_frame,
                $mp3_in->errstr);
            next;
        }
        $mp3_in->synth_frame;
        if (not $pcm) {
            $pcm = Audio::MPEG::Output->new({
                out_sample_rate => $mp3_in->sample_rate,
                out_channels => $mp3_in->channels
            });
        }
        my $pcm_stream = $pcm->encode($mp3_in->pcm);
        if (not $mp3_out) {
            $mp3_out = Audio::MPEG::Encode->new({
                vbr => "vbr",
                in_sample_rate => $mp3_in->sample_rate,
                in_channels => $mp3_in->channels
            });
        }
        print OUT $mp3_out->encode_float($pcm_stream);
    }
}
print OUT $mp3_out->encode_flush;
$mp3_out->encode_vbr_flush(*OUT);
If it is desired to perform audio processing on a PCM stream, it is a simple matter of converting the output scalar from Audio::MPEG::Output to an array. Processing can then be done on this array, and it can be transformed back into an opaque scalar for input into Audio::MPEG::Encode (if the output is to be encoded as an MP3).
Additionally, if the processing is to be accomplished by a C routine, all
that is required is for the C program to know the format of the scalar
(and the usual Perl XS SvPV() routine can be used to access the data).
The output of Audio::MPEG::Output is a (possibly) interleaved PCM stream. What this means is that if the output is 2 channels, the first sample is the left channel, the second is the right channel, the third the left channel, etc. If the output is a single channel, all samples are the left channel.
The format is in the native endian of the machine the program ran on (with the exception of the WAVE format - this is always little-endian).
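my @a = unpack('C*', $out);   # each byte read as an unsigned 8-bit value
@a = unpack('s*', $out);      # signed 16-bit values, native byte order
@a = unpack('l*', $out);      # signed 32-bit values, native byte order
@a = unpack('f*', $out);      # single-precision floats, native format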
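The unpack templates above read samples in the machine's native byte order, which does not necessarily match the always-little-endian WAVE data mentioned earlier. The following is a minimal sketch, assuming Perl 5.10 or later (for the `<` byte-order modifier) and a hand-built test string rather than real module output, of how 16-bit WAVE samples could be unpacked portably:

```perl
use strict;
use warnings;

# Three signed 16-bit samples, packed little-endian as in WAVE sample data.
# (Illustrative test data, not output produced by Audio::MPEG.)
my $wave_bytes = pack('s<*', -32768, 0, 32767);

# 's<*' forces little-endian interpretation regardless of the host byte order.
my @samples = unpack('s<*', $wave_bytes);

print join(', ', @samples), "\n";          # -32768, 0, 32767
print unpack('H*', $wave_bytes), "\n";     # 00800000ff7f
```

On a little-endian host 's*' and 's<*' give the same result; the explicit modifier only matters when the code may also run on big-endian machines.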
After unpacking, the next step is to demultiplex the interleaved array. If the number of channels is 1, you are done (it is a mono signal). If the number of channels is 2, then $a[0] is the first left channel sample, $a[1] is the first right channel sample, $a[2] is the second left channel sample, etc.
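The demultiplexing step described above could be sketched as follows. The helper name `deinterleave` is hypothetical (it is not part of Audio::MPEG), and the sample values are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: split an interleaved sample list into one
# array reference per channel.
sub deinterleave {
    my ($channels, @samples) = @_;
    my @out = map { [] } 1 .. $channels;
    for my $i (0 .. $#samples) {
        push @{ $out[$i % $channels] }, $samples[$i];
    }
    return @out;
}

my @a = (10, -10, 20, -20, 30, -30);       # L R L R L R
my ($left, $right) = deinterleave(2, @a);

print "left:  @$left\n";                   # left:  10 20 30
print "right: @$right\n";                  # right: -10 -20 -30
```

With a channel count of 1 the same helper simply returns a single array holding every sample.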
As for the range of values for each element, these are determined by the byte size (except of course for float). For example, pcm16 will be in the range of -32768 to +32767. Keep this in mind when coding your analysis routines.
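When an analysis routine expects floating-point input, it can help to normalize integer samples first. Below is a rough sketch of converting between the 16-bit range and the -1.0 to +1.0 float range. The helper names and the divide-by-32768 convention are assumptions chosen for illustration; the module's own internal scaling may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: map 16-bit integers onto -1.0 .. (just under) +1.0.
sub pcm16_to_float { return map { $_ / 32768 } @_ }

# Hypothetical helper: map floats back to 16-bit integers,
# rounding to nearest and clipping at the limits of the range.
sub float_to_pcm16 {
    return map {
        my $v = int($_ * 32768 + ($_ < 0 ? -0.5 : 0.5));
        $v > 32767 ? 32767 : $v < -32768 ? -32768 : $v;
    } @_;
}

my @floats = pcm16_to_float(-32768, 0, 16384);
my @ints   = float_to_pcm16(1.0, -1.0, 0.25);

print "@floats\n";    # -1 0 0.5
print "@ints\n";      # 32767 -32768 8192
```

Clipping matters because +1.0 scales to 32768, which is one past the largest value a signed 16-bit sample can hold.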
The input to Audio::MPEG::Encode is a (possibly) interleaved PCM stream. If a Perl array contains data that you wish to encode into an MP3, it must be transformed into an opaque scalar representing the (possibly) interleaved PCM data. Please see the discussion above for details as to how the stream is formatted and scaled.
encode_float(): $in = pack('f*', @a);
encode16(): $in = pack('s*', @a);
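If the left and right channels are held in separate arrays, they need to be interleaved before packing. A minimal sketch, using made-up sample values and assuming both channels have the same length, might look like this:

```perl
use strict;
use warnings;

# Illustrative per-channel data; real data would come from your own processing.
my @left  = (0.5, 0.25, -0.5);
my @right = (-0.5, 0.0, 0.5);

# Interleave as L, R, L, R, ... to match the stream layout described above.
my @a = map { ($left[$_], $right[$_]) } 0 .. $#left;

# Opaque scalar in the form expected by the float encoding method.
my $in = pack('f*', @a);

print scalar(@a), " samples, ", length($in), " bytes\n";
```

The resulting `$in` is the kind of scalar that would then be handed to encode_float(); packing the same array with 's*' after scaling to the 16-bit range would produce input for encode16().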
Below is a simple example of reducing the volume of a sample by a factor of 2 (about 6 dB). Please note that, for production use, any signal processing should be written in C (and linked as an XS module) due to the much faster speed of C over Perl for this type of processing (remember that a typical song will contain millions of samples...).
my @a = unpack('f*', $out);
for (my $i = 0; $i <= $#a; $i++) {
$a[$i] /= 2.0;
}
$out = pack('f*', @a);
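To apply an arbitrary gain expressed in decibels rather than a fixed divisor, the linear amplitude factor is 10 ** (dB / 20). The sketch below uses a hypothetical helper, `db_to_gain`, and made-up sample values; it is meant only to show the arithmetic:

```perl
use strict;
use warnings;

# Hypothetical helper: convert a gain in decibels to a linear amplitude factor.
sub db_to_gain {
    my ($db) = @_;
    return 10 ** ($db / 20);
}

my $gain = db_to_gain(-6);          # close to, but not exactly, 0.5
my @a    = (1.0, -0.5, 0.25);       # illustrative float samples

# Scale every sample in place.
$_ *= $gain for @a;

printf "%.3f\n", $gain;             # 0.501
```

A factor of exactly 2 corresponds to roughly 6.02 dB, which is why dividing by 2.0 is commonly described as a 6 dB reduction.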
mpg123(1) or madplay(1) before submitting a bug report.
Peter Timofejew <peter@timofejew.com>
The current version may always be found on CPAN, as well as at http://timofejew.com/audiompeg/
The libraries required to build and use Audio::MPEG can also be found at http://timofejew.com/audiompeg/
Copyright (c) 2001 Peter Timofejew. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
lame(1), madplay(1), mpg123(1), mp3blaster(1), sox(1), MP3::Info(3)