Overview

Among the many AI-driven innovations in the music industry, Singing Voice Synthesis (SVS) technology is generating significant interest. As with everything else, though, it’s not all sunshine and roses: some ethical dilemmas are giving folks a headache.

SVS has been around for more than two decades. Its goal is to learn human-like singing voices from input conditions such as speech, lyrics, and melodies. The field has several threads of focus, such as replicating existing vocals, refining singing expression to improve performances, and converting speech to singing voice. However, since projects often involve the works of existing artists, debates have arisen over the unclear standards and practices around ownership. Interestingly, public awareness of this topic is currently quite limited, which points to a lack of clear methods for handling these issues.

Concerns

While we may still doubt the quality of AI’s musicality, there are growing concerns that automation could drive music production costs to zero, thereby undermining artists’ value. Additional issues arise when voice samples are combined with other samples to create new voices, leaving the original voice data provider without any monetary benefit. Consequently, people often worry about the future value of human musicians if AI takes over rather than simply serving as a collaborative tool.

The potential misuse of this technology is another significant concern. Synthesized singing voices sound increasingly realistic and are becoming almost indistinguishable from human voices. Given deepfakes and potential scams (e.g., voice phishing), it may be necessary to develop technologies that counteract such misuse.

AI replicating the voice of a deceased person has also raised public concern. How right is it to use the personal records of a deceased person without their consent and profit from them? Although the family or close associates of the deceased artist may hold the rights, they are not the artists themselves, and this distinction raises further moral questions. Additionally, the authenticity of the work and respect for the artist’s legacy come into question when AI-generated vocals are used in new productions.

Sunny side

Designers of machine learning algorithms primarily considered collaboration between humans and AI a big area of opportunity, and none believed that AI was a replacement for humans, partly because of the value attributed to human skills. Their perspective focused on AI’s role in augmenting human abilities, similar to how Auto-Tune is commonly used to manipulate a singer’s voice.

While this synergy between AI and humans is often presented as ideal, it is still limited in how well it captures ethical concerns and the asymmetric relationship between machines and humans. Despite these challenges, AI can help improve singing expression, for example by converting speech to singing voices, giving singers new creative tools to explore in their work and making performances more engaging and enjoyable for audiences.

Summary

To conclude, Singing Voice Synthesis offers both opportunities and challenges in music. While it has the potential to enhance performances and open up new avenues for artistic expression, it simultaneously raises ethical dilemmas, intellectual property issues, questions about the value of human musicians in an increasingly automated world, and the risk of misuse. To ensure a responsible and ethical integration of AI, it would be worth working out guidelines and methods for balancing creativity, collaboration, and ethics.

“One Note Samba” by SKYGGE, made using the artificial intelligence tool Flow-Machines