Talkens transforms your digital assets from a generic image that resides inside a blockchain into an intelligent asset class infused with artificial intelligence. From an innovation standpoint, by combining synthetic voices, artificial intelligence and non-fungible tokens, Talkens paves the way for dynamic NFTs, the next evolutionary step for the digital collectables market.
Talkens, the first AI experiment focused on giving life to non-fungible tokens (NFTs), is gearing up to change the face of the NFT world forever with the help of artificial intelligence, the most powerful instrument available to humanity. Organized as a community-first project, Talkens has made it its mission to save NFTs by giving them a voice, the gift of speech, to create a new type of intelligent asset class that reimagines the concept of art and how we interact with it on a deeper level. To achieve its goal, Talkens has created 10 000 unique synthetic voices for NFTs that can speak your story and your ideas.
Born from Scale, Humans.ai’s launchpad, Talkens.ai leverages the innovative technology developed by Humans.ai to empower the NFT community to bring their digital assets to life by giving them a voice. Through Talkens, NFT holders can take their digital collectables to the next level, turning them from a jpeg image that resides on top of the blockchain into a dynamic and smart asset class capable of speaking with a unique synthetic voice. In the first phase, Talkens is compatible with the Bored Ape Yacht Club NFT collection and the Humans.ai NFT collection, and the list of compatible NFT collections will expand in the future. If it has a face and a mouth, Talkens will give it a voice.
Most people greeted the NFT phenomenon in its early days with a dismissive smirk, often suggesting that NFTs are a passing fad or a get-rich-quick scheme that brings little to no actual value. Talkens aims to dismantle any doubts surrounding the staying power and value of NFTs and to demonstrate to the whole world that the artworks of the future will no longer be static assets that hang on the wall or gather dust on a shelf. The artwork of the future will share a close and intimate relationship with technology to enable newfound levels of interactivity. Beauty will no longer be in the eye of the beholder. Beauty will stem from how we interact with our art and how we will use it to generate new content and value.
Talkens is leveraging Humans’ AI technology to transform NFTs from a static collectable with limited functionality to a human-centric and dynamic asset class. The NFT phenomenon took the market by storm in recent years, and it doesn’t show any signs of slowing down soon. At their core, NFTs are an abstract concept, usually represented by an identification number and a unique image.
The innovation behind NFTs stems from their connection with blockchain, a type of distributed ledger technology that acts as the backbone of cryptocurrencies like Bitcoin and Ethereum. But this general characterization applies only to the first iteration of the NFT concept, so why stop there? Talkens aims to make things interesting by mixing innovative AI technology with blockchain to save NFTs.
Talkens is all about innovations
Generating unique synthetic voices indistinguishable from real human voices starting from scratch is still an impossible task with the technology currently available, as human voices have a wide spectrum of natural variations and fluctuations that make it very difficult for a machine to mimic or reproduce.
The creation process of the Talkens voices started from a database of approximately 50 real vocal identities, equally distributed between males and females. To generate new voices from the existing dataset, the team behind Talkens worked on a complex AI algorithm that interpolates existing voices, combining their characteristics like pitch, tonality, intensity and frequency to generate new voice identities.
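The interpolation idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a voice identity can be reduced to three scalar features and that a new voice is a weighted average of two seed voices. The `VoiceProfile` class, its fields, and both functions are hypothetical names, and the actual Talkens algorithm is not described in enough detail to reproduce here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceProfile:
    """Hypothetical voice identity described by three scalar features."""
    pitch_hz: float
    tonality: float      # assumed to be normalised to [0, 1]
    intensity_db: float


def interpolate(voices, weights):
    """Blend voice profiles as a weighted average of their features."""
    if not voices or len(voices) != len(weights):
        raise ValueError("need one weight per voice")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    total = sum(weights)
    norm = [w / total for w in weights]
    return VoiceProfile(
        pitch_hz=sum(w * v.pitch_hz for w, v in zip(norm, voices)),
        tonality=sum(w * v.tonality for w, v in zip(norm, voices)),
        intensity_db=sum(w * v.intensity_db for w, v in zip(norm, voices)),
    )


def generate_voices(seed_voices, count, rng):
    """Create `count` new profiles by blending random pairs of seed voices."""
    new_voices = []
    for _ in range(count):
        first, second = rng.sample(seed_voices, 2)
        mix = rng.random()  # blend factor in [0, 1)
        new_voices.append(interpolate([first, second], [mix, 1.0 - mix]))
    return new_voices


# Illustrative seed data, not taken from the Talkens dataset.
seeds = [
    VoiceProfile(110.0, 0.2, 60.0),
    VoiceProfile(220.0, 0.8, 70.0),
    VoiceProfile(165.0, 0.5, 65.0),
]
voices = generate_voices(seeds, 5, random.Random(42))
```

Scaling the same loop from 50 seed identities to 10 000 outputs is only a matter of changing `count`, although a production system would presumably interpolate in a learned embedding space rather than over three hand-picked features.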
Talkens is the first project to investigate this topic on a large scale in the NFT domain, managing to generate 10 000 voices starting from an initial dataset of 50 vocal identities. A major point of innovation regarding speech synthesis is the fact that Talkens managed to achieve with its technology something referred to as pure voice synthesis. By leveraging complex AI neural networks and Natural Language Processing (NLP) techniques, Talkens can generate speech without requiring any prerecording of audio.
Besides giving a unique synthetic voice to regular NFTs, Talkens also adds motion to the equation by animating the jpeg image, giving it lip movements that accurately synchronize with the speech and subtle head movements to make everything look more natural. Of course, the first criterion an NFT needs to fulfil in order to be eligible for a makeover à la Talkens is that it represents a humanoid with a face and a mouth, like those in the Bored Apes collection.
Similar to the voices, the lip animation starts from sample videos of people pronouncing different words and pieces of text. After that, a neural network detects patterns like what the lips look like when the speaker says different vowels and consonants, as well as the subtle micro-expressions the face makes when speaking.
This is also known as mapping, a process in which the neural network learns to distinguish between different phonemes and the movement of the lips. A mapping process is also performed on the jpeg image. For example, the neural network, aided by a human, maps the image, outlining its key features like eyes, nose, and ears, focusing especially on the lips. In the future, we hope to perfect the algorithm to completely automate the mapping process of NFTs. A mathematical algorithm is also employed to deform the image according to a series of offset key points provided by the neural network to simulate the sensation of motion.
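To make the deformation step more concrete, here is a minimal sketch of one simple way to spread key-point offsets over an image: inverse-distance weighting. The source does not name the mathematical algorithm Talkens uses, so this technique is a stand-in chosen for brevity, and the function name and parameters are hypothetical.

```python
def displacement_at(point, keypoints, offsets, power=2.0):
    """Estimate how far a pixel should move, given offsets at key points.

    Each key point pulls the pixel by its own offset, weighted by the
    inverse of its distance to the pixel raised to `power`.
    """
    if not keypoints or len(keypoints) != len(offsets):
        raise ValueError("need one offset per key point")
    px, py = point
    sum_dx = sum_dy = sum_w = 0.0
    for (kx, ky), (dx, dy) in zip(keypoints, offsets):
        dist_sq = (px - kx) ** 2 + (py - ky) ** 2
        if dist_sq == 0.0:
            # The pixel sits exactly on a key point: use its offset as-is.
            return (float(dx), float(dy))
        weight = 1.0 / dist_sq ** (power / 2.0)
        sum_dx += weight * dx
        sum_dy += weight * dy
        sum_w += weight
    return (sum_dx / sum_w, sum_dy / sum_w)


# Two illustrative lip-corner key points and the offsets a network might predict.
lip_keypoints = [(0.0, 0.0), (10.0, 0.0)]
lip_offsets = [(2.0, 0.0), (0.0, 2.0)]
shift = displacement_at((5.0, 0.0), lip_keypoints, lip_offsets)
```

Applying this function to every pixel yields a displacement field that can be used to warp the image for one animation frame. Plain inverse-distance weighting does not fade to zero far from the key points, so a real implementation would likely add a falloff or use a different warping method.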
When an AI Serum is used to inject life into a regular NFT, the neural networks work in tandem to ensure a seamless transition between different modalities and to generate an AI NFT, which in this case is an animated video of the NFT with speech added to it. When the user enters text, the neural networks convert it into phonemes (the smallest units of sound that distinguish one word from another in a particular language), which are in turn transformed into speech using the voice. Once the audio part is generated, the neural network translates the audio segment into an animation that fully articulates the words, making the NFT move its lips in perfect synchronization with the words it says.
Voice DNA encapsulated inside Talkens AI NFTs
Technology is the vehicle that can make the seemingly impossible possible. The Talkens AI NFT is a new twist on the non-fungible token concept. Each Talkens AI NFT has a unique 3D design generated with the help of artificial intelligence. What makes Talkens AI NFTs truly unique, besides the awesome design, is the fact that each one encapsulates a voice DNA, a unique synthetic voice characterized by different pitches, tonalities and ways of expressing themselves that can be used to bring regular NFTs to life. With a total supply of 10 000 unique AI NFTs, Talkens.ai aims to give the community control over the digital source of life of the Metaverse and become the voice of a generation.
In the initial stages, Talkens will focus on using synthetic voices to preserve the anonymity of the users. So, the library of 10 000 voices cannot be linked to any real-life individuals to avoid ownership leakages. This design choice also addresses a sensitive topic: people misusing the voices of real humans to spread false messages, misinformation, hate speech, and any other type of harmful content.
Talkens AI NFTs aren’t just standard NFTs with an audio file embedded in them, far from it. From a utility standpoint, Talkens AI NFTs act as miniature engines that can generate AI Serum, a consumable that can be combined with a standard NFT, like the ones from the Bored Ape Yacht Club collection, to inject a voice and bring it to life. Each AI Serum is single-use, meaning that after you create your AI NFT, the AI Serum gets burned. Your standard NFT stays in your collection, safe and sound, and you become the proud owner of an AI NFT. The rarity of a Talkens AI NFT determines how much AI Serum it can generate. This cap is in place to avoid an inflation of voices, which may, in the long run, affect the value of the collectable.
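The serum mechanics described above (a rarity-based cap on generation and a burn on use) can be modelled with a small sketch. The rarity tiers, the cap values, and all class and method names are assumptions made for illustration. The source does not publish the real tiers or numbers, and on-chain logic would live in a smart contract rather than in Python.

```python
from dataclasses import dataclass

# Hypothetical caps: the source only says that rarity determines the amount.
SERUM_CAP_BY_RARITY = {"common": 1, "rare": 3, "legendary": 5}


@dataclass
class AISerum:
    """Single-use consumable tied to the voice that produced it."""
    voice_token_id: int
    burned: bool = False

    def apply_to(self, standard_nft_id):
        """Burn the serum and describe the resulting animated AI NFT."""
        if self.burned:
            raise RuntimeError("this AI Serum has already been used")
        self.burned = True
        # The standard NFT is only referenced, so the owner keeps it.
        return {"source_nft": standard_nft_id,
                "voice_token_id": self.voice_token_id}


@dataclass
class TalkensAINFT:
    """Voice-carrying NFT that can mint a limited number of serums."""
    token_id: int
    rarity: str
    serums_generated: int = 0

    def generate_serum(self):
        cap = SERUM_CAP_BY_RARITY[self.rarity]
        if self.serums_generated >= cap:
            raise RuntimeError("serum cap reached for this rarity")
        self.serums_generated += 1
        return AISerum(voice_token_id=self.token_id)


voice_nft = TalkensAINFT(token_id=7, rarity="rare")
serum = voice_nft.generate_serum()
ai_nft = serum.apply_to("BAYC #1234")  # illustrative identifier
```

Keeping the cap on the voice NFT and the burn flag on the serum mirrors the two separate limits in the text: how many serums a voice can produce, and how many times each serum can be used.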
By paying one $HEART per word, NFT owners can animate their favourite NFT and generate a video that they can share with their friends or post on social media. To make an NFT talk, you simply need to introduce your desired text or audio input. The technology will do all the heavy lifting.
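As a quick illustration of the pricing rule, the sketch below estimates the cost of a script at one $HEART per word. It assumes that a "word" is any whitespace-separated token. The source does not say how words are counted, and the function name is hypothetical.

```python
HEART_PER_WORD = 1  # rate stated in the text


def cost_in_heart(text):
    """Return the $HEART cost of a script, counting whitespace-separated words."""
    return len(text.split()) * HEART_PER_WORD


script = "Hello world from my ape"
price = cost_in_heart(script)  # five words, so five $HEART under this assumption
```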
Compared to regular NFTs, Talkens aims to expand the concept, giving it a wider range of possible applications by focusing on the transition from one modality to another, namely text to audio and audio to video.
Talkens AI NFTs are able to take text input and use the synthetic voice encapsulated inside it to create an audio output. We manage to do this by translating the text and language into phonemes. Phonemes are the smallest unit of sound that distinguishes one word from another word in a language. Based on these phonemes, the underlying technology performs the transition from text to audio.
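The text-to-phoneme step can be pictured with a toy example. Talkens relies on neural networks for this conversion. The sketch below swaps in a plain dictionary lookup with a two-word, hand-written lexicon in ARPAbet-style symbols, purely to show what the output of such a step looks like. The lexicon and function name are illustrative assumptions.

```python
# Tiny hand-written lexicon using ARPAbet-style phoneme symbols.
TOY_LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def text_to_phonemes(text):
    """Convert text to a flat phoneme list using the toy lexicon."""
    phonemes = []
    for raw_word in text.lower().split():
        word = raw_word.strip(".,!?;:")
        if not word:
            continue
        if word not in TOY_LEXICON:
            raise KeyError(f"no pronunciation for {word!r}")
        phonemes.extend(TOY_LEXICON[word])
    return phonemes


sequence = text_to_phonemes("Hello, world!")
```

A speech synthesizer would then take a sequence like this, together with the parameters of the chosen voice, and render it as audio.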
Like traditional NFTs, every Talkens AI NFT is unique due to a combination of frequencies, parameters and audio coefficients applied to each voice. This means that Talkens AI NFTs are highly tradeable.
The voices by themselves represent a feature, a characteristic that somebody can own and use for different types of scenarios. This is what Talkens did with its AI Serums, which at their core are neural networks that can be combined with static NFTs like the Bored Apes or other NFT collections to give static images the gift of speech, bringing them to life.
The broad innovation brought by Talkens is the fact that it empowers the NFT community and, by extension, everyone to create content at scale. Furthermore, Talkens makes your digital collectables work for you, as you can use AI Serum to create unique AI NFTs and sell them. To make this process as natural as possible, Talkens incorporated all the modalities (text, audio and video) and fine-tuned the technology to make the transition between them as seamless as possible. As such, Talkens gives users the ability to animate their NFTs and make them say things in different languages. Creating an end-to-end platform in which users can choose a voice (a Talkens AI NFT) and a face (a static NFT) and make them talk just by entering text, without needing to worry about the complex inner workings that take place in the backend, is an innovation by itself.
The innovation beyond the voice
At first glance, the main innovation behind the Talkens experiment is the fact that it manages to bring NFTs to life, turning them from a generic image that resides inside a blockchain into an intelligent asset class that is infused with artificial intelligence. But if we look beyond the initial scope of Talkens, we can deduce that the project has other far-reaching implications like:
Innovating the NFT market
When it comes to innovation, people often have the wrong assumption that you need to invent something from scratch or else it doesn’t count. The best example that contradicts this faulty logic is blockchain, a technology that is perceived as a major disruptor for its ability to ensure trust, consensus, ownership, transparency and in-depth traceability to the information it stores.
Under closer scrutiny though, it becomes evident that blockchain is composed of multiple preexisting technologies and concepts, some of them decades older than the concept of blockchain. So, when it comes to innovation, it’s not only about developing new things but also about how you use existing concepts and technologies in an innovative way to deliver something unique.
Talkens.ai is currently the only project in the NFT market that brings together multiple innovative technologies like artificial intelligence, machine learning, deep learning, natural language processing, audio processing, speech synthesis, and computer vision. This ensemble of technologies is combined into an end-to-end system that gives end-users the power to save their NFTs by giving them a new purpose and meaning.