Here we compare our speech editing system directly with that of Tan et al. (2021) on the VCTK dataset. The decoding process is shown in the figure on the right.
original text: for that reason cover should not be given
original audio
target text: for that reason cover is impossible to be given
Tan et al. 2021
our $\text{A}^3\text{T}$
target text: for that theoretical and realistic reason cover should not be given
Tan et al. 2021
our $\text{A}^3\text{T}$
original text: some have accepted it as a miracle without physical explanation
original audio
target text: some have accepted it as an undeniable fact without physical explanation
Tan et al. 2021
our $\text{A}^3\text{T}$
target text: some have accepted it as a miracle never seen before without physical explanation
Tan et al. 2021
our $\text{A}^3\text{T}$
original text: the idea has potential for the future.