Entering the Voice Information Era

We are entering an era in which voice is being transformed from audio into information. The contributing factors are behavioral changes driven by smartphones and voice-controlled speakers, advances in infrastructure technology, and falling infrastructure prices. Today we use products like Amazon Alexa, Apple Siri and Google Voice with voice commands, and get a response back with each request. Behind the scenes, these products convert voice into text via Automatic Speech Recognition (ASR), use Natural Language Processing (NLP) to interpret the request, and return the results either visually or as voice using Text to Speech (TTS). The rapid growth and ease of use of these products have instilled behavioral changes in consumers, making us more comfortable with voice and more likely to use it instead of traditional user interfaces.

Every major part of the technology infrastructure required to convert voice into information is available from the cloud vendors Amazon, Google, and Microsoft. The three major voice services they provide are ASR, TTS, and NLP. For example, the ASR services from Amazon, Google and Microsoft are priced at around 2.4 cents/minute. Going forward, all the cloud vendors are investing in Deep Learning to reduce the training workload, to improve the accuracy of the transcribed text, and to scale the complexity of ASR. In parallel, falling hardware prices combined with Nvidia GPU cloud infrastructure (for both training and inference) will dramatically reduce the prices of ASR services.
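To put that price in perspective, here is a minimal sketch of the arithmetic, assuming the approximate 2.4 cents/minute figure quoted above and simple per-minute billing. The function name and the meeting counts are our own illustration, not any vendor's pricing API.

```python
# Illustrative only: the rate is the approximate figure quoted in the text,
# not a current price from any vendor.
ASR_CENTS_PER_MINUTE = 2.4


def transcription_cost_dollars(minutes: float,
                               cents_per_minute: float = ASR_CENTS_PER_MINUTE) -> float:
    """Return the cost in dollars of transcribing `minutes` of audio."""
    return minutes * cents_per_minute / 100.0


if __name__ == "__main__":
    # One hour-long meeting.
    print(f"One 60-minute meeting: ${transcription_cost_dollars(60):.2f}")
    # A hypothetical team holding twenty such meetings a month.
    print(f"Twenty meetings: ${transcription_cost_dollars(60 * 20):.2f}")
```

At this rate an hour of conversation costs about $1.44 to transcribe, which is why further price drops make always-on transcription of meetings plausible.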

What can we do as the speech infrastructure services ASR, NLP, and TTS improve and the prices come down?

To make a prediction, let’s take a look at the evolution of voice communication services over the last decade. The technology infrastructure for voice communications, such as high-bandwidth codecs, Acoustic Echo Cancellation (AEC), and broadband, set the stage for complete solutions like Skype, WeChat, Line, WebEx, WhatsApp, and many others. These complete solutions offered a better experience and benefited from network effects to become the dominant players in voice communication.

Along similar lines, we expect complete solutions that leverage the speech infrastructure services to emerge and become the dominant players. At Synqq, we are excited to leverage the infrastructure available in this new era to develop the world’s first voice assistant for work. Our focus is to provide voice commands that let you access all your information faster and unlock the precious knowledge contained within the voice conversations that happen in everyday meetings. So next time, if you miss an important moment in your meeting, do not worry, just talk to Synqq!

Tapping into Voice at Work

At work, we’re always trying to be more productive: to get more out of doing less, whether it’s our time, our tools, or our method of communication. Digital tools like email, collaboration apps, and messengers have helped us do that, but there’s one method of communication everyone uses at work that has never been optimized: voice. Voice, which remains the most used, most effective, and fastest form of communication at work, is also the most difficult to store and consume. We’re best at exchanging information with each other by voice, but we consume information best visually. What if we could “see” what was said in our conversations instead?

Advancements in Automatic Speech Recognition (ASR) from Google, Microsoft, and Baidu have brought the word error rate down to about 4.9%. However, Speech-to-Text transcription for conversations at work requires another level of innovation. Firstly, ASR systems are not aware of the context of our conversations and so cannot interpret the keywords, names and entities we refer to. As a result, they recognize those words with poor accuracy. Secondly, ASR systems have no concept of who said what in the conversation, which is required to preserve the structure of the conversation. Typically, after most conversations, we would like to know what one person said about a given topic. And even short conversations contain paragraphs of text when transcribed, which are a pain to read through to get a few sentences of useful information.
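For readers unfamiliar with the metric, word error rate is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the recognized text into the reference text, divided by the number of words in the reference. The sketch below computes it with a standard word-level edit distance. It is our own illustration of that conventional definition, not the evaluation code behind the 4.9% figure, and the sample sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # substitution or exact match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    said = "please check in the code before the meeting"
    recognized = "please checkin the code before meeting"
    print(f"WER: {word_error_rate(said, recognized):.3f}")  # prints WER: 0.375
```

Note that a single misrecognized keyword or name counts the same as any other word here, even though it may be the one word the listener cared about.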

[Figure: history of ASR]

[Figure: speech recognition word error rate (WER)]

Now that we’re in the era of Infinite Computing and Artificial Intelligence, it’s finally possible to address this by developing the voice NLP infrastructure. We need something that not only takes everything down in our meetings, but also knows our specific context and what’s important to us. Something that will let us be engaged in our meetings, while keeping everyone on the same page afterward. And that will let us get more out of doing less. This is the future of voice in enterprise we’re building at Synqq.

Future of Voice in Enterprise

We talk with our co-workers, customers, partners, and other stakeholders in business every day. In every  meeting, whether in-person, web conference or over the phone, we use voice as the primary medium of communication. Be it project meetings, sales calls, support calls or interviews, precious information is contained in these conversations. Today, most of this information is lost, as it is neither feasible nor practical to capture voice conversations with existing technology. And even if we are able to capture the voice conversations, we can’t search what is inside them, we can’t see who said what, and we can’t get to the key moments. Have you ever tried to find a single sentence from an hour-long recording? Voice conversations are like dark matter.

Why? First, it isn’t easy to handle voice in all situations, and the devices we use to capture voice determine the quality. To get the right quality, voice needs to be handled differently in each kind of conversation. It’s easy to capture voice in a Web Conference, but how can it be done for in-person meetings or phone calls? Once voice is captured, how can we uniquely identify each speaker? How can we make sure each speaker’s voice is heard equally well?

Second, transforming voice into a visual, searchable stream of information is hard. The words we use in our conversations depend on the context. They also depend on our business domain. For example, the word “checkin” in a software context means putting software in a repository, but in an airline context it is two words, “check-in”. The intent and entities differ based on the context, the user and the domain. And each of us pauses differently between sentences, which makes it hard to segment speech into sentences.
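To see why pauses are a fragile signal, consider the simplest possible segmenter: split wherever the silence between two words exceeds a fixed threshold. The sketch below is a deliberately naive illustration, not how Synqq segments speech. The `(text, start, end)` word format, the function name and the timings are all hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical input format: (text, start_seconds, end_seconds) for each word.
Word = Tuple[str, float, float]


def segment_by_pause(words: List[Word], max_gap: float) -> List[str]:
    """Start a new segment whenever the silence before a word exceeds max_gap seconds."""
    segments: List[List[str]] = []
    last_end: Optional[float] = None
    for text, start, end in words:
        if last_end is None or start - last_end > max_gap:
            segments.append([])  # first word, or a pause long enough to split on
        segments[-1].append(text)
        last_end = end
    return [" ".join(segment) for segment in segments]


if __name__ == "__main__":
    words = [
        ("we", 0.0, 0.2), ("shipped", 0.25, 0.7), ("today", 0.75, 1.1),
        ("next", 1.6, 1.9), ("steps", 1.95, 2.4),
    ]
    # The same audio splits differently depending on the threshold chosen.
    print(segment_by_pause(words, max_gap=0.3))
    print(segment_by_pause(words, max_gap=0.6))
```

With a 0.3-second threshold the half-second pause produces two segments; with 0.6 seconds everything runs together. No single threshold fits every speaker, which is the difficulty described above.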

At Synqq, we have pioneered new technology to handle voice in all situations and have made seminal advancements in Natural Language Processing that are going to change the way we handle voice conversations in enterprise. The current era of infinite computing makes it affordable. Voice conversations will no longer be the dark matter of the past. Voice conversations can become the searchable record for every enterprise.

Era of Infinite Computing

We’re living in the era of Infinite Computing with the creation of Cloud Platforms like Amazon, Google, and Microsoft. Our daily lives are transformed by services like Google, Facebook, and Uber. The phone revolution leverages the cloud services to talk to anyone on the planet, get to any place, order anything, and get entertained without taking up all your storage. These services are at the forefront of the future services to come in the era of Infinite Computing.

Why is it so important? As we enter the era of Infinite Computing, technology companies will be able to build sophisticated AI models to organize, classify, and predict things in ways limited only by our imagination. For example, the 1.3 trillion pictures we took on our phones in the last year are automatically organized and classified by the social networks. The transportation industry has been transformed by services like Uber, Lyft, and Didi, and in the future, we will see fleets of self-driving vehicles that are feasible only in the era of Infinite Computing. Every industry will be transformed in the era of Infinite Computing.

What can the era of Infinite Computing do for our daily work? Can it save us from the daily chores of capturing all the information we need and organizing it? Can it enable us to recall the snippet that matters at any time with a tap or by using voice? Can we gain superhuman memory for ourselves and our teams? We believe all this is possible. This era of Infinite Computing enables us to develop personalized machine learning models to do all the heavy lifting so we and our teams can achieve superhuman things.

“Age of Accelerations”

I recently read Tom Friedman’s book “Thank You for Being Late”. One of the key takeaways is that we live in the “age of accelerations” due to the technology industry. Wireless networks and phones enable us to capture information easily and transfer it onto the Internet, which is then distributed by social networks and search engines for consumption in near real time. Information is flowing at a high velocity compared to earlier ages.

We are living in this new age when it comes to exchanging personal and, dare we say, more “fun” information, but the business world seems to be operating at a much slower pace set in the era of personal computers.

There are several problems with the current flow of information at work.

Firstly, the tools we currently use, such as the Office Suite or Google Docs, still live in the PC era. For the average user, these applications are mostly one-dimensional, used simply to take in raw text, voice, or images, but they don’t transform that information.

Secondly, the information captured is subject to the interpretation of the person capturing it. When a photo is taken and shared on social media platforms, filters may be added, but the integrity of the image is essentially not compromised. A written file has this flaw from the start, because the end product can only be created by someone actively trying to create it.

Thirdly, systems currently in place in the workspace require users to manually organize information and/or enter it into a workflow, which is a dated process and doesn’t live up to the “age of accelerations”. Our expectations have changed with products like Google, which automatically organizes the Internet, and Facebook, which gets all the information from our friends and media in a jiffy!

In this new age, shouldn’t we be able to capture, organize, share and recall information as quickly and as effortlessly as we are used to with search engines and social networks? And Artificial Intelligence and voice should enable us to take actions and follow through to get things done at an accelerated pace to compete in the “age of accelerations”.

What did you hear?

The average working adult spends 31.5% of their time listening. According to Adler, Rosenfeld, and Proctor, in the 8th edition of their book “Interplay: The Process of Interpersonal Communication”, we spend 70% of our time communicating and 45% of that time listening (0.70 × 0.45 = 31.5%). We spend only 21% of our time speaking, so with that much more practice at listening, you would assume we would become much better at it.

In her latest book, “WHAT? Did You Really Say What I Think I Heard?” NY Times Business Bestselling author Sharon Drew Morgen says listening and correctly hearing what is intended is not always easy.  “The problem is our brain. As listeners, we think there is a direct transmission between words spoken and our interpretation. But the reality is far murkier: just as our eyes take in light and our brains interpret captured images, our ears take in sound and our brains interpret meaning. That means we all see and hear the world uniquely, according to our mental models and filters, and are at effect of what our brains allow us to hear, not necessarily what’s said.”

So, what if we don’t hear what was really said, or intended? Deals can be lost, relationships can be ruined. The cost of misunderstanding can be significant or even incalculable. With so much research and professional help, there are ways to improve. But humans are imperfect and thankfully, technology can help.

Cloud, mobile, voice, and AI technologies are evolving to the point that we can capture everything that was said and segment it into a structure that is easily organized, searchable, and sharable, so that you can find what matters from any conversation. We’re not talking about simple voice recordings or complete transcription of voice recordings to text. No one wants to have to re-listen to everything that was said, nor do they always want to read everything that was said. But being able to go back to just what was said about any topic would be a game changer. It’s possible now, and we will share it with the world soon!

“What did you say?”

Millions of people still write notes on paper. According to research published by Mueller (Princeton) and Oppenheimer (UCLA), “taking notes by hand is better than taking notes on a laptop for remembering conceptual information long term.” But it’s hard to take notes by hand and simultaneously contribute to business discussions. And most people can only write legibly at a rate of 15-20 words per minute, while spoken communication averages over 110 words per minute, so even a fast note-taker captures fewer than one word in five.

Trendy journal maker Moleskine says the number of young, tech-savvy entrepreneurs who use traditional leather-bound journals is on the rise. The company reported sales of over 17 million journals in 2015, and has developed a kind of cult following! While writing things down helps you remember them, finding what you need later in these beautiful journals and note pads can be arduous.

Digital note-taking apps like Evernote and OneNote also seem to have a cult-like following. These apps are more than just a digital version of the same fundamental process, and most people can type faster than they can write with a pen on paper. However, they have the same basic challenge: You either suffer the distraction of taking notes during the meeting or discussion, or you create notes after the fact. And you still have to organize the notes manually.

We make some of our most significant contributions to work in the form of conversations with others — using our voice. So, it makes sense to have a digital record of what we say.  But the act of taking notes arguably lessens our effectiveness as participants, and it is still hard to search and find the gems of the conversations.

What if you could have a perfect record of what was said or shared without re-reading and re-listening?  Perhaps blending AI, voice and cloud can change the world of meetings. We think so! Watch for our announcement on September 18. Meetings will never be the same, we promise.

TiEcon 2017 names Synqq as a TiE50 Winner

We are excited to announce that Synqq has been selected as a “2017 TiE50 Winner” for the prestigious TiE50 Awards Program recognizing the world’s most innovative tech startups. Winners were announced on Friday, May 5th at the Santa Clara Convention Center during TiE Silicon Valley’s annual tech entrepreneurship conference, TiEcon.

“TiE50 has become a global brand that attracts thousands of companies worldwide. We screened more than 1,303 companies this year and selected the most innovative 50 companies as TiE50 Winners. TiE50 company presentations were well received at the conference. As a not-for-profit, our vigorous screening and judging adopt multiple regression techniques to ensure process integrity,” said program co-chair Daniel Zimmer.

“This highly successful program is in its ninth year and has been a major draw at TiEcon. The TiE50 Program is one of TiE Silicon Valley’s most successful programs,” said Sanjay Shirole, program co-chair. “This year, the screening committee included 31 accomplished domain experts and influencers who participated. Our judges included senior executives, venture capitalists, and marquee tech entrepreneurs.”

About TiEcon:

TiEcon is an annual tech entrepreneurship conference put on by TiE Silicon Valley, a not-for-profit organization in its 25th year. The conference attracts loyal participation from top technology companies, leading venture capital firms, and global service providers. TiEcon 2017 attracted 6,000+ attendees from across the world, ranging from CEOs of established companies and first-time entrepreneurs creating new companies to leading investment professionals and corporate executives. TiEcon was listed by Worth Magazine as one of the 10 best conferences for ideas and entrepreneurship along with TED and the World Economic Forum.

About TiE:

The Indus Entrepreneurs (TiE) is a non-profit founded in 1992 in Silicon Valley by a group of successful entrepreneurs, corporate executives, and senior professionals. TiE is a 320,000+ network with 14,000+ members and operates cohesively through 61 chapter locations in 18 countries.

Thank you for arriving “Just in Time”

Are millennials better prepared for the “Age of Accelerations”?

In his latest book “Thank You for Being Late”, Thomas Friedman asserts that we live in the “age of accelerations” – due to Moore’s Law, market globalization, and climate change. The problem is that our ability to adapt is only improving at a linear scale, while technology is advancing exponentially. According to Dr. Eric “Astro” Teller, CEO of Google X, technology is already outpacing our ability to adapt, and we are going to have to figure out how to learn faster.

Millennials, and especially the younger half of this latest and largest generation to join the US workforce, might be the best equipped to thrive in this new era. Twenty-somethings, dubbed by Goldman Sachs the “world’s first digital natives”, are the best educated and most technically savvy generation of all time.

Every generation builds on the knowledge, experience, and strengths of those who came before them. However, this is the first generation to grow up with smartphones in hand, and two related observations lead me to the theory that they are uniquely equipped to thrive in the age of accelerations:

  1. Millennials are extremely agile and resourceful with respect to finding the information needed to solve problems. In fact, twenty-somethings grew up with Google and internet search and seem to work in more of a “just-in-time” rather than a “just-in-case” model. This is important because, in a world that changes so much faster, we need to know that we are equipped with the latest facts, information, and opinions to produce the best outcome.
  2. Millennials are conditioned to share what they know, observe and think. They are also quick to turn to friends when they need to know more or get the advice of someone they trust.

They are also often very efficient and effective communicators. They share and consume content in a style that is short form, with a fluid multimodal adaptability that makes use of the best form of any content to have the best impact.

Of course, rapidly advancing technology enables better and easier sharing and communication. We are just now beginning to see a new wave of apps, inspired and developed by millennials, that will literally transform the way we work. In the age of accelerations and Moore’s Law, we can now deliver a safe, low-cost, cloud-based, AI-powered network that learns about you, your schedule and the people with whom you interact. It predicts who you need to share with and manages the organization necessary to make this form and frequency of “micro-sharing” highly useful for work (see Synqq for a great, if biased, example)!

We are going to see significant benefits from the 94 million millennials and their rising influence in our US workplace… and equally on a global scale. More sharing, openness and greater transparency are in fact the foundation of strong families, organizations, communities, and societies. It’s also the foundation of learning and adapting to technological, market, and environmental change.

This generation has arrived just-in-time with a uniquely valuable style of work that all of us will need to adopt so we can thrive in this new era.

Let me know what you think!

Why don’t we share more at work?

Why can’t we share with our work colleagues the way we do with our friends?

Face it: social networking can cause problems at work. At some workplaces, it is outright banned.  Meanwhile, the sharing we do in our private lives does not directly equate to the sharing that would be helpful at work. And yet, collaboration builds trust and improves the speed and effectiveness of execution for any team at work. How can these apparent contradictions be resolved?

One obvious challenge is that the technology we use to create, capture and share information at work was first designed in the PC era. Word, PowerPoint, and Excel are large, feature-rich applications that are intended for long-form content, not for quick interaction or the sharing of real-time information.

Furthermore, when we create long-form content, we share it by placing files in a folder and sending a link or an attachment via email. Not only is finding these documents problematic, but the medium is a poor fit for a world defined by texting.

We are conditioned to share often with short-form, just-in-time, and multimedia content. We take our phone or laptop to every meeting and every other kind of encounter.  We are comfortable typing a short text note, taking a picture of a whiteboard, or recording an audio note. But finding this content, and then sharing it, does not work with the old-school distribution method of email.

We need another way. But do we need yet another network or app, even if it offers better sharing, organization, and retrieval of short-form, just-in-time content for work?

A recent article, written by Jay Greene in The Wall Street Journal, declared that “Collaboration Is Great, But One Tool Is Enough.” Greene makes the case that there are too many different collaboration platforms — and instead of investing in yet another one, most companies would be better off consolidating around one.

But sticking with a long-form content platform is not the right answer. Even Microsoft has revealed a new, short-form, ephemeral messaging app (called Microsoft Teams) to address part of this challenge and to cut back on the insane volume of email. The problem with this strategy is that it now relies upon two inefficient methods of distributing the shared content, neither of which is optimized for retrieval.

At Synqq, we start by enabling easy and efficient utilization of phone and browser features for creating, capturing and sharing short-form, just-in-time, multimedia content.  And then we make it work for you and your business through our AI model and context from the phone and browser. Synqq is so powerful because it automatically organizes and retrieves shared content with just a tap or two on your phone or with a simple voice command.

More than any other product, Synqq enables you to share more and to share more often. You can create new content or organize your long form content by adding a link to a Google or Word doc to any Synqq note.  You’ll never again have to search through endless email trails.  Instead, you stay in ‘Synqq’ with everyone and everything that matters.  You’ll save time, be your best, and your business relationships will thrive.