ELEC484 Phase Vocoder Kelley Fea


Overview

  • Analysis Phase
  • Synthesis Phase
  • Transformation Phase
      • Time Stretching
      • Pitch Shifting
      • Robotization
      • Whisperization
  • To Do
      • Denoising
      • Stable/Transient Components Separation


Analysis Phase


  • Based on Bernardini’s document
  • pv_analyze.m
      • Inputs: inx, w, Ra
      • Uses hanningz.m to create window
      • Modulates signal with window
      • Applies fftshift to each windowed slice, then takes its FFT
      • Outputs: Mod_y, Ph_y (Moduli and Phase)

pv_analyze.m

function [Mod_y, Ph_y] = pv_analyze(inx, w, Ra)
% pv_analyze.m for ELEC484 Project Phase 1
% Analysis phase... based on Bernardini
% inx = original signal (column vector)
% w   = desired window size
% Ra  = analysis hop size

% Number of samples in inx
xrow = size(inx, 1);

% Create Hanning window
% using the hanningz code found in Bernardini
win = hanningz(w);

% Figure out the number of windows required
num_win = ceil( (xrow - w + Ra) / Ra );

% Matrix for storing the spectrum of each time slice
Y = zeros(w, num_win);

% Modulation of the signal with the window happens here
count = 1;
for k = 1:num_win
    % the frame normally ends w - 1 samples after count
    frame_end = w - 1;
    % if count + frame_end goes past the end of inx, shorten the frame
    if ( count + frame_end >= xrow )
        frame_end = xrow - count;
    end
    % determine where the end of the window is
    win_end = frame_end + 1;
    % Windowed time slice (ts), zero-padded to w samples if shortened
    ts = zeros(w, 1);
    ts(1:win_end) = inx( count : count + frame_end ) .* win( 1 : win_end );
    % FFT of ts; fftshift swaps the halves of ts so the window centre
    % moves to the first sample
    Y(:, k) = fft( fftshift(ts) );
    % Increment count by hop size
    count = count + Ra;
end % End for loop

% Set output values for Moduli and Phase and return the matrices
Mod_y = abs(Y);
Ph_y = angle(Y);
end % End pv_analyze.m


Synthesis Phase


  • Also based on Bernardini’s document
  • pv_synthesize.m
      • Inputs: Mod_y, Ph_y, w, Rs, Ra
      • Uses hanningz.m to create window
      • Calculates difference between actual and target phases (delta phi)
      • Recombines Moduli and Phase into array of complex numbers
      • Performs IFFT and overlap-add
      • Sums all samples using tapering window
      • Final result is normalized by its peak value
      • Output: outx

pv_synthesize.m

function outx = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra )
% pv_synthesize.m for ELEC484 Project Phase 1
% Rs = synthesis hop size (whole number of samples)

% Set number of bins and frames based on the size of the phase matrix
[ num_bins, num_frames ] = size(Ph_y);
% delta_phi and phase_inc hold one column per pair of consecutive frames
delta_phi = zeros( num_bins, num_frames-1 );
phase_inc = zeros( num_bins, num_frames-1 );
% Ph_x (synthesis phase) is the same size as Ph_y
Ph_x = zeros( num_bins, num_frames );
% Create tapering window
win = hanningz(w);

% Phase unwrapping to recover precise phase value of each bin
% omega is the nominal phase increment over Ra samples for each bin
omega = 2 * pi * Ra * (0 : num_bins - 1)' / num_bins;
for idx = 2 : num_frames
    ddx = idx - 1;
    % delta_phi is the difference between the actual and target phases
    % princarg is a separate function
    delta_phi(:,ddx) = princarg( Ph_y(:,idx) - Ph_y(:,ddx) - omega );
    % phase_inc = the phase increment per sample for each bin
    phase_inc(:,ddx) = ( omega + delta_phi(:,ddx) ) / Ra;
end % End for loop

% Recombining the moduli and phase...
% the initial phase is the same
Ph_x(:,1) = Ph_y(:,1);
for idx = 2 : num_frames
    ddx = idx - 1;
    Ph_x(:,idx) = Ph_x(:,ddx) + Rs * phase_inc(:,ddx);
end
% Recombine into array of complex numbers (1i avoids a shadowed i)
Z = Mod_y .* exp( 1i * Ph_x );

% IFFT and overlap add
% Create X of specified size
X = zeros( ( num_frames * Rs ) + w, 1 );
count = 1;
for idx = 1 : num_frames
    endx = count + w - 1;
    real_ifft = fftshift( real( ifft( Z(:,idx) ) ) );
    X(count:endx) = X(count:endx) + real_ifft .* win;
    count = count + Rs;
end
% sum of all samples multiplied by tapering window
k = sum( hanningz(w) .* win ) / Rs;
X = X / k;
% Dividing by the peak magnitude keeps the output within [-1, 1]
outx = X / max(abs(X));
end % end pv_synthesize.m
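The phase-unwrapping step in pv_synthesize.m can be checked on a single sinusoid. The sketch below is an illustration only and is not part of the project code: the window size, hop size, test frequency, and the inline copies of hanningz and princarg are assumptions made for the example. It measures the phase of one bin in two frames taken Ra samples apart and recovers the frequency of the sinusoid from delta_phi.

```matlab
% Inline copies of the helpers so the sketch runs on its own
hanningz = @(n) 0.5 * (1 - cos(2*pi*(0:n-1)' / n));
princarg = @(p) p - 2*pi*round(p / (2*pi));

w  = 64;                       % window size (assumed)
Ra = 16;                       % analysis hop size (assumed)
w_true = 2*pi*5.3 / w;         % test frequency in rad/sample, between bins 5 and 6
n  = (0 : w + Ra - 1)';
x  = cos(w_true * n);

win = hanningz(w);
F1 = fft( x(1:w)       .* win );   % frame starting at sample 1
F2 = fft( x(Ra+1:Ra+w) .* win );   % frame starting one hop later

b = 5;                             % zero-based bin nearest to the test frequency
omega = 2*pi*Ra*b / w;             % nominal phase advance of bin b over Ra samples
delta_phi = princarg( angle(F2(b+1)) - angle(F1(b+1)) - omega );
phase_inc = (omega + delta_phi) / Ra;   % estimated frequency in rad/sample

fprintf('true %.5f  estimated %.5f rad/sample\n', w_true, phase_inc);
```

The estimate should land close to the true frequency even though that frequency falls between two bin centres, which is the reason the synthesis stage works with phase_inc rather than with the raw bin frequency.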


hanningz.m

Used because hann(n) returns a symmetric window, which has the wrong periodicity for overlap-add; hanningz gives the periodic form:

w = .5*(1 - cos(2*pi*(0:n-1)'/(n)));
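Why the periodicity matters can be seen by overlap-adding the window with itself. The sketch below is a small check under assumed values (window length 8, hop of half a window, 6 frames), with the hanningz formula copied inline. In the fully overlapped region the shifted windows should sum to a constant.

```matlab
hanningz = @(n) 0.5 * (1 - cos(2*pi*(0:n-1)' / n));   % same formula as hanningz.m

n   = 8;          % window length (small, for illustration)
hop = n / 2;      % 50% overlap
win = hanningz(n);

num_frames = 6;
acc = zeros((num_frames - 1)*hop + n, 1);
for f = 0 : num_frames - 1
    idx = f*hop + (1:n)';
    acc(idx) = acc(idx) + win;      % overlap-add the bare window
end

% Away from the two ends every sample is covered by exactly two windows
steady = acc(hop+1 : end-hop);
disp(max(abs(steady - 1)));         % deviation of the summed windows from 1
```

If the Signal Processing Toolbox is available, hann(n, 'periodic') should return the same values as this formula.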


princarg.m

Returns the principal argument of the nominal initial phase of each frame, i.e. wraps a phase value into the range [-pi, pi]

a = Phasein / (2*pi);
k = round(a);
Phase = Phasein - k*2*pi;
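A few sample values make the wrapping concrete. This is a minimal sketch with the three lines above folded into one anonymous function; the input values are arbitrary.

```matlab
princarg = @(Phasein) Phasein - 2*pi*round(Phasein / (2*pi));

p_in  = [3*pi/2; -7*pi/4; 0.3; 2*pi + 0.3];
p_out = princarg(p_in);     % wraps to -pi/2, pi/4, 0.3 and 0.3

disp([p_in, p_out]);
```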

Cosine Wave Test 1 (w = Ra = Rs)

[Figure: Spectrum of Waveforms For Circular Convolution; panels: input, Output]

Cosine Wave (Ra = Rs = w/8)

[Figure: Waveforms For Circular Convolution; panels: input, Output]

Cosine Wave – Zoom

[Figure: Waveforms For Circular Convolution, samples 300 to 500; panels: input, Output]

Toms Diner

[Figure: Waveforms For Circular Convolution; panels: input, Output]

Piano

[Figure: Waveforms For Circular Convolution; panels: input, Output]


Figure 8.1 (DAFX)


Time Stretching

Modify hop size ratio between analysis (Ra) and synthesis (Rs)

% Analysis function
[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);
% Do Time Stretching here %
% Modify hop size ratio: hop_ratio = Rs / Ra
hop_ratio = 2;
Rs = hop_ratio * Ra;
% Synthesis function
X2 = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
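The effect of the hop ratio on the output length can be worked out from the frame count in pv_analyze.m and the buffer size in pv_synthesize.m. The numbers below (600 input samples, w = 64, Ra = 8) are assumptions picked for illustration, not the settings used for the plots.

```matlab
N  = 600;     % input length in samples (assumed)
w  = 64;      % window size (assumed)
Ra = 8;       % analysis hop size (assumed)

num_win = ceil( (N - w + Ra) / Ra );     % frame count, as in pv_analyze.m

hop_ratio = [0.5, 1, 2];
Rs = hop_ratio * Ra;                     % synthesis hop sizes; must be whole numbers
out_len = num_win * Rs + w;              % length of X allocated in pv_synthesize.m

disp([hop_ratio; Rs; out_len]);
```

The allocated buffer grows roughly in proportion to Rs/Ra; it is slightly longer than the overlap-added signal, so the tail of the output is zero.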

Ratio = Rs/Ra = 0.5

[Figure: Waveforms For Time Stretching - 0.5 (cosine); panels: input, Output]

Toms Diner

[Figure: Waveforms For Time Stretching - 0.5; panels: input, Output]

Piano

[Figure: Waveforms For Time Stretching - 0.5; panels: input, Output]

Ratio = Rs/Ra = 2

[Figure: Waveforms For Time Stretching - 2 (cosine); panels: input, Output]

Toms Diner

[Figure: Waveforms For Time Stretching - 2; panels: input, Output]

Piano

[Figure: Waveforms For Time Stretching - 2; panels: input, Output]


Pitch Shifting

Attempted by multiplying the phase by a factor

% Analysis function
[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);
% Do Pitch Shifting here %
Ph_y = princarg(Ph_y*1.5);
% Synthesis function
X4 = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
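For comparison, a commonly used route to pitch shifting, which is not the method attempted on this slide, is to time-stretch by the pitch ratio and then resample by the same ratio so the duration returns to the original. The sketch below shows only the resampling half on a plain cosine; the signal length, cycle count, and factor are assumptions, and linear interpolation with interp1 stands in for a proper resampler.

```matlab
N  = 1200;                        % input length (assumed)
c  = 20;                          % cycles over the whole input (assumed)
x  = cos(2*pi*c*(0:N-1)' / N);

factor = 1.5;                     % pitch ratio (assumed)
t  = (0 : factor : N-1)';         % read positions, 1.5 input samples apart
y  = interp1((0:N-1)', x, t, 'linear');

% Dominant frequency in cycles per sample, taken from the FFT peak
X = abs(fft(x));  [~, ix] = max(X(1:floor(N/2)));
M = length(y);
Y = abs(fft(y));  [~, iy] = max(Y(1:floor(M/2)));
fx = (ix - 1) / N;
fy = (iy - 1) / M;

fprintf('input %.5f, output %.5f cycles/sample\n', fx, fy);
```

Reading the signal 1.5 times faster raises its frequency by 1.5 and shortens it by the same factor, which is why the time-stretch stage would have to lengthen the signal first.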

Pitch Shifting – Cosine

[Figure: Waveforms For Pitch Shifting - 0.5; panels: input, Output]

Pitch Shifting – Toms Diner

[Figure: Waveforms For Pitch Shifting - 0.5; panels: input, Output]

Pitch Shifting – Piano

[Figure: Waveforms For Pitch Shifting - 0.5; panels: input, Output]


Robotization

Set phase (Ph_y) to zero

% Analysis function

[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);

% Do Robotization here %

Ph_y = zeros(size(Ph_y));

% Synthesis function

xout = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
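What zeroing the phase does to a single frame can be seen without the full vocoder. The sketch below uses an arbitrary 16-sample test frame (an assumption for illustration) and resynthesizes it from its moduli alone.

```matlab
n = (0:15)';
x = sin(2*pi*3*n/16 + 0.7) + 0.5*cos(2*pi*5*n/16);   % arbitrary test frame

Mod = abs(fft(x));                 % moduli of the frame
y   = real(ifft(Mod));             % resynthesis with every phase set to zero

% The moduli are unchanged ...
mod_err = max(abs(abs(fft(y)) - Mod));
% ... and the zero-phase frame is symmetric about its first sample
sym_err = max(abs(y(2:end) - flipud(y(2:end))));

disp([mod_err, sym_err]);
```

Every frame is turned into a symmetric pulse with the same spectrum magnitude, and those pulses repeat at the hop interval, which is what gives the output its fixed, robotic pitch.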

Robotization – Cosine

[Figure: Waveforms For Robotization; panels: input, Output]

Robotization – Toms Diner

[Figure: Waveforms For Robotization; panels: input, Output]

Robotization – Piano

[Figure: Waveforms For Robotization; panels: input, Output]


Whisperization

deliberately impose a random phase on a time-frequency representation

% Analysis function

[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);

% Do Whisperization here %

Ph_y = ( 2*pi * rand(size(Ph_y, 1), size(Ph_y, 2)) );

% Synthesis function

xout = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
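The same single-frame view applies to whisperization. The sketch below, again on an arbitrary 16-sample test frame chosen for illustration, keeps the moduli and replaces the phase with uniform random values.

```matlab
n = (0:15)';
x = sin(2*pi*3*n/16 + 0.7) + 0.5*cos(2*pi*5*n/16);   % arbitrary test frame

X   = fft(x);
Mod = abs(X);
Ph  = 2*pi * rand(size(X));        % random phase in (0, 2*pi), as on the slide
Y   = Mod .* exp(1i * Ph);         % same moduli, randomized phase
y   = real(ifft(Y));               % real part kept, as in pv_synthesize.m

mod_err = max(abs(abs(Y) - Mod));  % moduli of Y match those of X
disp(mod_err);
```

Because the random phase is not conjugate-symmetric, taking the real part changes the spectrum of y somewhat relative to Y; the sketch only shows that the modified spectrum itself carries the original moduli.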

Whisperization – Cosine

[Figure: Waveforms For Whisperization; panels: input, Output]

Whisperization – Toms Diner

[Figure: Waveforms For Whisperization; panels: input, Output]

Whisperization – Piano

[Figure: Waveforms For Whisperization; panels: input, Output]


Denoising

emphasize some specific areas of a spectrum


Stable Components Separation

Calculate the instantaneous frequency by taking the derivative of the phase along the time axis.

Check if this frequency is within its “stable range”.

Use the frequency bin or not for the reconstruction.
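The three steps above can be sketched as follows. This is one possible reading, not an implementation from the project: it assumes that "stable range" means the measured phase advance of a bin stays within a threshold of its nominal advance, and the phase matrix, hop size, and threshold are all made-up values.

```matlab
princarg = @(p) p - 2*pi*round(p / (2*pi));

num_bins = 8;
Ra = 2;                                            % analysis hop size (assumed)
omega = 2*pi*Ra*(0:num_bins-1)' / num_bins;        % nominal phase advance per hop

% Hypothetical phase matrix: each bin advances by omega plus a known deviation
dev  = [0.01; 1.00; 0.02; -2.00; 0.05; 0.50; -0.03; 1.50];
Ph_y = zeros(num_bins, 3);
Ph_y(:,2) = princarg(Ph_y(:,1) + omega + dev);
Ph_y(:,3) = princarg(Ph_y(:,2) + omega + dev);

% Step 1: phase derivative along time, expressed as deviation from nominal
delta_phi = princarg( diff(Ph_y, 1, 2) - repmat(omega, 1, 2) );
inst_freq = (repmat(omega, 1, 2) + delta_phi) / Ra;   % rad/sample

% Step 2: test against the stable range (threshold is an assumption)
thresh = 0.1;
stable = abs(delta_phi) < thresh;

% Step 3: stable is a mask; true means the bin is kept for reconstruction
disp(stable);
```

Multiplying the moduli of the later frames by this mask would keep the stable components, and multiplying by its complement would keep the transient ones.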


Transient Components Separation


Conclusion

Rest of effects need to be properly implemented:
  • Stable/Transient Components Separation
  • Denoising


Questions?

Thank you!