Soundfile¶
The soundfile module introduces featureflow.Node subclasses that know how to process low-level audio samples and common audio encodings.
Input¶
-
class zounds.soundfile.AudioMetaData(uri=None, samplerate=None, channels=None, licensing=None, description=None, tags=None, **kwargs)[source]¶
Encapsulates metadata about a source audio file, including things like text descriptions and licensing information.
Parameters: - uri (requests.Request or str) – uri may be either a string representing a network resource or a local file, or a requests.Request instance
- samplerate (int) – the samplerate of the source audio
- channels (int) – the number of channels of the source audio
- licensing (str) – the licensing agreement (if any) that applies to the source audio
- description (str) – a text description of the source audio
- tags (str) – text tags that apply to the source audio
- kwargs (dict) – other arbitrary properties of the source audio
Raises: ValueError – when uri is not provided
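To illustrate the contract described above (a required uri plus arbitrary extra keyword properties) without requiring the zounds package, here is a minimal, hypothetical sketch; the class name MetaDataSketch is an illustration only, not part of zounds:

```python
class MetaDataSketch:
    """Minimal stand-in mirroring the AudioMetaData contract described above."""

    def __init__(self, uri=None, samplerate=None, channels=None,
                 licensing=None, description=None, tags=None, **kwargs):
        # the docs state a ValueError is raised when uri is not provided
        if uri is None:
            raise ValueError('uri is required')
        self.uri = uri
        self.samplerate = samplerate
        self.channels = channels
        self.licensing = licensing
        self.description = description
        self.tags = tags
        # arbitrary extra properties become attributes
        for key, value in kwargs.items():
            setattr(self, key, value)


meta = MetaDataSketch(
    uri='/path/to/audio.wav',
    samplerate=44100,
    channels=2,
    licensing='CC-BY',
    description='a field recording',
    tags='nature ambient',
    recorded_by='someone')  # arbitrary kwarg
print(meta.samplerate)    # 44100
print(meta.recorded_by)   # someone
```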
Chunksize¶
-
class zounds.soundfile.ChunkSizeBytes(samplerate, duration, channels=2, bit_depth=16)[source]¶
A convenience class to help describe a chunksize in bytes for the featureflow.ByteStream in terms of audio sample batch sizes.
Parameters: - samplerate (SampleRate) – the samples-per-second factor
- duration (numpy.timedelta64) – the length of desired chunks in seconds
- channels (int) – the audio channels factor
- bit_depth (int) – the bit depth factor
Examples
>>> from zounds import ChunkSizeBytes, Seconds, SR44100
>>> chunksize = ChunkSizeBytes(SR44100(), Seconds(30))
>>> chunksize
ChunkSizeBytes(samplerate=SR44100(f=2.2675736e-05, d=2.2675736e-05)...
>>> int(chunksize)
5292000
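The 5292000 figure in the doctest above is just the product of the four constructor factors; a quick sketch of that arithmetic in plain Python (no zounds required):

```python
# Reproduce the byte count from the example above:
# samples/second * seconds * channels * bytes per sample
samplerate = 44100           # SR44100
duration_seconds = 30        # Seconds(30)
channels = 2                 # default
bytes_per_sample = 16 // 8   # default bit_depth=16

chunk_bytes = samplerate * duration_seconds * channels * bytes_per_sample
print(chunk_bytes)  # 5292000
```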
Processing Nodes¶
-
class zounds.soundfile.AudioStream(sum_to_mono=True, needs=None)[source]¶
AudioStream expects to process a raw stream of bytes (e.g. one produced by featureflow.ByteStream) and produces chunks of AudioSamples.
Parameters: - sum_to_mono (bool) – True if this node should return an AudioSamples instance with a single channel
- needs (Feature) – a processing node that produces a byte stream (e.g. ByteStream)
Here’s how you’d typically see AudioStream used in a processing graph.

import featureflow as ff
import zounds

chunksize = zounds.ChunkSizeBytes(
    samplerate=zounds.SR44100(),
    duration=zounds.Seconds(30),
    bit_depth=16,
    channels=2)

@zounds.simple_in_memory_settings
class Document(ff.BaseModel):
    meta = ff.JSONFeature(
        zounds.MetaData,
        store=True,
        encoder=zounds.AudioMetaDataEncoder)

    raw = ff.ByteStreamFeature(
        ff.ByteStream,
        chunksize=chunksize,
        needs=meta,
        store=False)

    pcm = zounds.AudioSamplesFeature(
        zounds.AudioStream,
        needs=raw,
        store=True)

synth = zounds.NoiseSynthesizer(zounds.SR11025())
samples = synth.synthesize(zounds.Seconds(10))
raw_bytes = samples.encode()
_id = Document.process(meta=raw_bytes)
doc = Document(_id)
print(doc.pcm.__class__)  # an AudioSamples instance
-
class zounds.soundfile.Resampler(samplerate=None, needs=None)[source]¶
Resampler expects to process AudioSamples instances (e.g. those produced by an AudioStream node), and will produce a new stream of AudioSamples at a new sampling rate.
Parameters: - samplerate (AudioSampleRate) – the desired sampling rate. If none is provided, the default is SR44100
- needs (Feature) – a processing node that produces AudioSamples
Here’s how you’d typically see Resampler used in a processing graph.

import featureflow as ff
import zounds

chunksize = zounds.ChunkSizeBytes(
    samplerate=zounds.SR44100(),
    duration=zounds.Seconds(30),
    bit_depth=16,
    channels=2)

@zounds.simple_in_memory_settings
class Document(ff.BaseModel):
    meta = ff.JSONFeature(
        zounds.MetaData,
        store=True,
        encoder=zounds.AudioMetaDataEncoder)

    raw = ff.ByteStreamFeature(
        ff.ByteStream,
        chunksize=chunksize,
        needs=meta,
        store=False)

    pcm = zounds.AudioSamplesFeature(
        zounds.AudioStream,
        needs=raw,
        store=True)

    resampled = zounds.AudioSamplesFeature(
        zounds.Resampler,
        samplerate=zounds.SR22050(),
        needs=pcm,
        store=True)

synth = zounds.NoiseSynthesizer(zounds.SR11025())
samples = synth.synthesize(zounds.Seconds(10))
raw_bytes = samples.encode()
_id = Document.process(meta=raw_bytes)
doc = Document(_id)
print(doc.pcm.samplerate.__class__.__name__)        # SR11025
print(doc.resampled.samplerate.__class__.__name__)  # SR22050
-
class zounds.soundfile.OggVorbis(needs=None)[source]¶
OggVorbis expects to process a stream of raw bytes (e.g. one produced by featureflow.ByteStream) and produces a new byte stream where the original audio samples are ogg-vorbis encoded.
Parameters: needs (Feature) – a feature that produces a byte stream (e.g. featureflow.ByteStream)
Here’s how you’d typically see OggVorbis used in a processing graph.

import featureflow as ff
import zounds

chunksize = zounds.ChunkSizeBytes(
    samplerate=zounds.SR44100(),
    duration=zounds.Seconds(30),
    bit_depth=16,
    channels=2)

@zounds.simple_in_memory_settings
class Document(ff.BaseModel):
    meta = ff.JSONFeature(
        zounds.MetaData,
        store=True,
        encoder=zounds.AudioMetaDataEncoder)

    raw = ff.ByteStreamFeature(
        ff.ByteStream,
        chunksize=chunksize,
        needs=meta,
        store=False)

    ogg = zounds.OggVorbisFeature(
        zounds.OggVorbis,
        needs=raw,
        store=True)

synth = zounds.NoiseSynthesizer(zounds.SR11025())
samples = synth.synthesize(zounds.Seconds(10))
raw_bytes = samples.encode()
_id = Document.process(meta=raw_bytes)
doc = Document(_id)

# fetch and decode a section of audio
ts = zounds.TimeSlice(zounds.Seconds(2))
print(doc.ogg[ts].shape)  # 22050