
Send Audio Data Represented as a NumPy Array from Python to JavaScript

I have a TTS (text-to-speech) system that produces audio in numpy-array form whose data type is np.float32. This system is running in the backend and I want to transfer the data fr

Solution 1:

Convert wav array of values to bytes

Right after synthesis, you can convert the numpy array of wav samples to a bytes object and then encode it with base64.

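import base64
import io
from scipy.io.wavfile import write

# wav: numpy array of samples, sr: sample rate in Hz
byte_io = io.BytesIO()
write(byte_io, sr, wav)
# getvalue() returns the whole buffer regardless of the current stream position
wav_bytes = byte_io.getvalue()

audio_data = base64.b64encode(wav_bytes).decode('UTF-8')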

This string can be used directly as the src of an HTML audio tag (here in a Flask template):

<audio controls src="data:audio/wav;base64, {{ audio_data }}"></audio>

So all you need to do is convert wav and sr into audio_data, a base64 string representing the raw .wav file, and pass it as a parameter to render_template in your Flask app. (This solution needs no separate request to send the audio.)

Or, if you send audio_data in a response, use it in the .js file that handles the response to construct a data URL (the same value that would go in the src attribute in HTML):

// get audio_data from response

let snd = new Audio("data:audio/wav;base64, " + audio_data);
snd.play()

This works because:

Audio(url) Return value: A new HTMLAudioElement object, configured to be used for playing back the audio from the file specified by url. The new object's preload property is set to auto and its src property is set to the specified URL or null if no URL is given. If a URL is specified, the browser begins to asynchronously load the media resource before returning the new object.


Solution 2:

Your sample does not work as is: it does not play.

However, playback works with:

  • StarWars3.wav: OK. retrieved from cs.uic.edu
  • your sample encoded in PCM16 instead of PCM32: OK (check the wav metadata)
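To check the wav metadata mentioned above, one option is to read the format tag and bit depth from the file's 'fmt ' chunk. The following is a minimal sketch using only the standard library; read_wav_format is a hypothetical helper name, not something from the original answer. In the WAVE format, tag 1 means integer PCM and tag 3 means IEEE float, which is what scipy.io.wavfile.write produces for a float32 array.

```python
import io
import struct
import wave


def read_wav_format(data):
    """Return the main fields of the 'fmt ' chunk of a RIFF/WAVE byte string."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # first chunk starts right after the 12-byte RIFF header
    while pos + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            tag, channels, rate, _byte_rate, _block_align, bits = struct.unpack_from(
                "<HHIIHH", data, pos + 8
            )
            return {
                "format_tag": tag,        # 1 = integer PCM, 3 = IEEE float
                "channels": channels,
                "sample_rate": rate,
                "bits_per_sample": bits,
            }
        # Skip to the next chunk; chunks are padded to an even size.
        pos += 8 + chunk_size + (chunk_size & 1)
    raise ValueError("no 'fmt ' chunk found")


# Demo: build a tiny 16-bit mono WAV in memory and inspect it.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
    w.setframerate(22050)
    w.writeframes(b"\x00\x00" * 4)

info = read_wav_format(buf.getvalue())
print(info)
```

For a file on disk, the same helper can be called on the result of open(path, "rb").read().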

Flask

from flask import Flask, render_template, json
import base64

app = Flask(__name__)

with open("sample_16.wav", "rb") as binary_file:
    # Read the whole file at once
    data = binary_file.read()
    wav_file = base64.b64encode(data).decode('UTF-8')

@app.route('/wav')
def hello_world():
    data = {"snd": wav_file}
    res = app.response_class(response=json.dumps(data),
        status=200,
        mimetype='application/json')
    return res

@app.route('/')
def stat():
    return render_template('index.html')

if __name__ == '__main__':
    app.run(debug = True)

js


  <audio controls></audio>
  <script>
    ;(async _ => {
      const res = await fetch('/wav')
      let {snd: b64buf} = await res.json()
      document.querySelector('audio').src="data:audio/wav;base64, "+b64buf;
    })()
  </script>

Original Poster Edit

So, what I ended up doing (using this solution) that solved my problem was to:

  • First, change the data type from np.float32 to np.int16:
    wav = (wav * np.iinfo(np.int16).max).astype(np.int16)
  • Write the numpy array to a temporary wav file using scipy.io.wavfile:
    from scipy.io import wavfile
    wavfile.write(".tmp.wav", sr, wav)
  • Read the bytes back from the temporary file:
    with open(".tmp.wav", "rb") as fin:
        wav = fin.read()
  • Delete the temporary file:
    import os
    os.remove(".tmp.wav")
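The temporary file in the steps above is not strictly needed. As a minimal sketch, assuming mono audio with samples in [-1.0, 1.0], the same conversion can be done in memory. It uses the standard-library wave module rather than scipy.io.wavfile so that it runs without extra dependencies, and float_to_wav_base64 is a hypothetical helper name, not part of the original answer.

```python
import array
import base64
import io
import wave


def float_to_wav_base64(samples, sample_rate):
    """Encode mono float samples as a base64 string of a 16-bit PCM WAV file.

    `samples` can be any iterable of floats (a 1-D numpy array also works).
    """
    # Clip to [-1.0, 1.0], then scale to the int16 range (truncating toward zero).
    pcm = array.array(
        "h", (int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)   # assumption: mono audio
        w.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(pcm.tobytes())
    # The resulting string can follow "data:audio/wav;base64," in a src attribute.
    return base64.b64encode(buf.getvalue()).decode("ascii")


audio_data = float_to_wav_base64([0.0, 0.5, -0.5, 1.0, -1.0, 2.0], 22050)
```

Clipping before scaling is a design choice added here: without it, a sample slightly outside [-1.0, 1.0] would overflow the int16 range.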
