Top 5 Limitations of Machine Learning in an Enterprise Setting

Astounding technological breakthroughs have been made in the field of Artificial Intelligence (AI) and its sub-field Machine Learning (ML) in the last couple of years. Machines can now be trained to mimic complex cognitive functions such as informed decision-making, deductive reasoning, and inference. Robots behaving like humans are no longer science fiction, but a reality in multiple industry practices today. As a matter of fact, human society is gradually becoming more reliant on smart machines to solve day-to-day challenges and make decisions. A good example of a simple machine learning use case that has completely permeated our day-to-day lives is the spam filter, which decides whether a message is junk based on how closely it matches emails previously tagged as spam.

“A.I … is more profound than … electricity or fire”
– Sundar Pichai

However, these basic applications have evolved into ‘deep learning’, enabling software to complete complex tasks with significant implications for the way business is conducted. Amid all the hype surrounding these game-changing technologies, the reality that often gets lost between the fears and the headline victories like Cortana, Alexa, Google Duplex, Waymo, and AlphaGo is that AI technologies have several limitations that will still take a substantial amount of effort to overcome. This post explores some of those limitations.

i. Machine Learning Algorithms Require Massive Stores of Training Data

AI systems are ‘trained’, not programmed. This means that they require enormous amounts of data to perform complex tasks at the level of humans. Although data is being created at an accelerating pace and the computing power needed to process it efficiently is available, massive data sets are not simple to create or obtain for most business use cases. Deep learning relies on an algorithm called backpropagation, which adjusts the weights between nodes so that an input translates to the right output. In supervised learning, neural nets are trained to recognize photographs, for example, using millions or even billions of previously labeled examples. And every slight variation in an assigned task calls for another large data set to conduct additional training. The major limitation is that neural networks simply require too much ‘brute force’ to function at a level similar to human intellect.
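
To make the weight-adjustment idea concrete, here is a minimal sketch of backpropagation on a toy network with a single hidden unit. The network shape, the four training samples, the starting weights and the learning rate are all assumptions chosen for illustration; real deep learning models have millions of weights and need far more data.

```python
import math

# Toy data (an assumption for illustration): negative inputs -> -1, positive -> +1.
SAMPLES = [(-2.0, -1.0), (-1.0, -1.0), (1.0, 1.0), (2.0, 1.0)]


def forward(x, w1, b1, w2, b2):
    """Forward pass: input -> hidden tanh unit -> linear output."""
    h = math.tanh(w1 * x + b1)
    y = w2 * h + b2
    return h, y


def train(samples, epochs=500, lr=0.1):
    """Full-batch gradient descent; the gradients come from backpropagation."""
    w1, b1, w2, b2 = 0.5, 0.0, 0.5, 0.0  # arbitrary starting weights
    n = len(samples)
    losses = []
    for _ in range(epochs):
        g_w1 = g_b1 = g_w2 = g_b2 = 0.0
        loss = 0.0
        for x, target in samples:
            h, y = forward(x, w1, b1, w2, b2)
            err = y - target
            loss += 0.5 * err * err
            # Backward pass: apply the chain rule from the output back to the input.
            g_w2 += err * h
            g_b2 += err
            d_hidden = err * w2 * (1.0 - h * h)  # tanh'(z) = 1 - tanh(z)^2
            g_w1 += d_hidden * x
            g_b1 += d_hidden
        losses.append(loss / n)
        # Nudge every weight against its average gradient.
        w1 -= lr * g_w1 / n
        b1 -= lr * g_b1 / n
        w2 -= lr * g_w2 / n
        b2 -= lr * g_b2 / n
    return (w1, b1, w2, b2), losses


weights, losses = train(SAMPLES)
print(f"average loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Even this four-sample toy needs hundreds of passes over the data to settle, which hints at why realistic tasks call for so many labeled examples.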

This limitation can be eased by coupling deep learning with learning techniques that don’t rely heavily on labeled training data, such as ‘unsupervised’ and reinforcement learning. For example, deep reinforcement learning models learn via trial and error as opposed to via example. The model is optimized over multiple steps by penalizing unfavorable steps and rewarding effective ones.
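
As a rough illustration of learning by trial and error, the sketch below uses tabular Q-learning, a much simpler relative of deep reinforcement learning, on a made-up five-cell corridor. The environment, the reward values (a penalty of -1 per step, +10 at the goal) and all hyperparameters are assumptions for the example, not part of any real system.

```python
import random

N_STATES = 5                 # corridor cells 0..4; cell 4 is the goal
ACTIONS = ["left", "right"]


def step(state, action):
    """Move one cell. Every step is penalized (-1); reaching the goal pays +10."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    if nxt == N_STATES - 1:
        return nxt, 10.0, True
    return nxt, -1.0, False


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                         # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])   # exploit
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Move the estimate toward the reward plus the discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q


q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

No labeled examples are supplied anywhere: the agent only sees rewards and penalties, and in this setup it should end up preferring "right" in every cell.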

ii. Labeling Training Data Is a Tedious Process

Supervised machine learning using deep neural networks forms the basis for much of today’s AI. Labeling is a requisite stage of data processing in supervised learning, because this style of model training relies on predefined target attributes from historical data. Data labeling is simply the process of cleaning up raw data and organizing it for cognitive systems (machines) to ingest. Deep learning requires lots of labeled data, and while labeling is not rocket science, it is still a laborious task to complete. If unlabeled data is fed into a supervised model, it is not going to get smarter over time. An algorithm can only develop the ability to make decisions, perceive, and behave in a way that is consistent with the environment it will be required to navigate if a human has mapped the target attributes for it.

To establish what is in the data, a time-consuming process of manually spotting and labeling items is required. However, promising new techniques are coming up, like in-stream supervision, where data is labeled during natural usage. App designers can accomplish this by ‘sneaking in’ features in the design that inherently grow training data. High-quality data collection from users can be used to enhance machine learning over time.

iii. Machines Cannot Explain Themselves

Researchers at MIT hypothesize that the human brain has an intuitive physics engine. This basically means that although the information we collect via our senses is noisy and imprecise, we still draw conclusions about what we think will likely happen. For decades, this kind of common sense has been the most difficult challenge in the field of Artificial Intelligence. The large majority of AI-based models currently deployed are based on statistical machine learning, which relies on tons of training data to build a statistical model and offers little insight into why a particular output was produced. This is the main reason why adoption of some AI tools is still low in areas where explainability is crucial. A good example is regulation such as GDPR, which is widely read as providing a ‘right to explanation’.

Whether the decision is good or bad, having visibility into how and why it was made is crucial, so that human expectations can be brought in line with how the algorithm actually behaves. There are techniques that can be used to interpret complicated machine learning models like neural networks. A nascent approach is Local Interpretable Model-Agnostic Explanations (LIME), which attempts to pinpoint the parts of the input data a trained ML model depends on most to create predictions, by feeding it inputs similar to the initial ones and observing how the predictions vary.
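
The perturb-and-observe idea can be sketched in a few lines. Note that this is a simplified, occlusion-style approximation and not the LIME library or its weighted linear surrogate: it removes one word at a time and records how much the score moves. The `black_box` spam scorer and its weights are hypothetical stand-ins for a trained model, and the message is assumed to contain no repeated words.

```python
import math

# Hypothetical word weights standing in for a trained model's internals.
SPAM_WEIGHTS = {"free": 2.0, "winner": 1.5, "meeting": -1.0}


def black_box(words):
    """Opaque model: returns a spam probability for a list of words."""
    z = sum(SPAM_WEIGHTS.get(w, 0.0) for w in words) - 1.0
    return 1.0 / (1.0 + math.exp(-z))


def explain(words, model):
    """Rank words by how much the prediction changes when each one is removed."""
    base = model(words)
    impact = {}
    for i, word in enumerate(words):
        perturbed = words[:i] + words[i + 1:]   # a nearby input without this word
        impact[word] = base - model(perturbed)
    return sorted(impact.items(), key=lambda kv: abs(kv[1]), reverse=True)


message = ["free", "winner", "meeting", "today"]
for word, delta in explain(message, black_box):
    print(f"{word:8s} {delta:+.3f}")
```

A positive value means the word pushed the message toward "spam", a negative value means it pulled the score down. The explanation only ever calls the model from the outside, which is what makes this family of techniques model-agnostic.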

iv. There is Bias in the Data

As AI and machine learning algorithms are deployed, there will likely be more instances in which bias finds its way into algorithms and data sets. In some instances, models that are seemingly performing well may actually be picking up noise in the data. As important as transparency is, it is unbiased decision making that builds trust. The reliability of an AI solution depends on the quality of its inputs. For example, facial recognition has had a large impact on social media, human resources, law enforcement and other applications. But biases in the data sets used to train facial recognition applications can lead to inexact outcomes. If the training data is not neutral, the outcomes will inherently amplify the discrimination and bias that lie in the data set. The best way to mitigate such risks is to collect data from multiple random sources. A heterogeneous dataset limits the exposure to bias and results in higher quality ML solutions.

v. A.I Algorithms Don’t Collaborate

Despite the multiple breakthroughs in deep learning and neural networks, AI models still lack the ability to generalize to conditions that vary from the ones they encountered in training. AI models have difficulty transferring their experience from one set of circumstances to another. This means that anything a model has achieved for a specific use case will only be applicable to that use case. As a result, organizations are forced to continuously commit resources to train new models, even when the use cases are relatively similar. A solution to this scenario comes in the form of transfer learning, in which knowledge obtained from one task is reused in related situations where little labeled data is available. As this and other generalized approaches mature, organizations will have the ability to build new applications more rapidly.



Author: Gabriel Lando

Can Artificial Intelligence (AI) Enrich Content Collaboration? Or Is It Just a Lipstick?

Is Artificial Intelligence (AI) the new lipstick? Sure, it is being put on many pigs. Can artificial intelligence improve Enterprise File Sharing and Sync (EFSS), Enterprise Content Management (ECM) and collaboration? We want to explore whether there are some obvious collaboration use cases that can be improved using machine learning. In this article, we will not venture into AI techniques, the impact of AI, or the evolution of AI. We are interested in exploring how EFSS benefits from “machine learning”, a technique that allows systems to learn by digesting data. The key is ‘learning’: a system that can learn and evolve, as opposed to one that is explicitly programmed. Machine learning is not a new technology; many applications, such as search engines (Google, Bing), already use it.

In the past year, many large players, such as Google, Amazon, and Microsoft, have started offering AI tools and infrastructure as a service. With many of the basic building blocks available, developers can focus on building the right models and applications. Here are a few scenarios in Enterprise File Sharing and Sync (EFSS) and Enterprise Content Collaboration where we can apply machine learning soon.


Search is a significant part of our everyday life. Google and Amazon have made the search box the center of navigation. For instance, a decade ago, the top half of the Amazon homepage was filled with links, which have now been replaced by a search box at the top. However, search hasn’t yet taken a significant role in enterprise collaboration. Every day, we search for files that don’t fit simple search criteria. Think of a search that goes ‘looking for a design proposal from vendor X that I received six months back.’ Today, we manually sort through files to find the one that satisfies such criteria. We could use simple query processing, a crawler, and a sophisticated ranker to surface file search results based on estimated relevance. Such a search feature can continue to learn and improve to provide better results each time. We already have many machine learning algorithms and techniques available to index files, identify relevance, and rank search results by relevance. What remains is a focused effort from solution providers to apply them to enterprise scenarios.
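
As a rough idea of what the ranking step might look like, here is a minimal sketch that scores files against a query using TF-IDF, a classic relevance heuristic rather than a learned ranker. The file names and descriptions are invented for illustration, and a real system would also index file contents, dates and sharing history.

```python
import math
from collections import Counter

# Hypothetical index: file name -> text extracted by a crawler.
FILES = {
    "proposal.pdf": "design proposal from vendor acme",
    "budget.xlsx": "quarterly budget report",
    "logo.png": "design draft for logo",
}


def tokenize(text):
    return text.lower().split()


def rank(query, docs):
    """Return file names ordered by TF-IDF relevance to the query."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for words in tokenized.values():
        doc_freq.update(set(words))        # count each term once per file
    scores = {}
    for name, words in tokenized.items():
        term_freq = Counter(words)
        score = 0.0
        for term in tokenize(query):
            if term in term_freq:
                # Rare terms get a higher weight than common ones.
                idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
                score += (term_freq[term] / len(words)) * idf
        scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)


print(rank("vendor design proposal", FILES))
# ['proposal.pdf', 'logo.png', 'budget.xlsx']
```

A learning-based ranker would start from signals like these and then adjust the weights based on which results users actually open.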

Predict and organize relevant content

A technique in machine learning called unsupervised learning involves building a model by supplying it with many examples, but without telling it what to look for. The model learns to recognize logical groups (clusters) based on factors that are not specified in advance, revealing patterns within a data set. Imagine your files being automatically organized based on the projects you are working on. Any file would have a set of related files just one click away. Wouldn’t such a feature provide a significant productivity boost?
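
One common way to find such groups is k-means clustering. The sketch below is a bare-bones version that assumes each file has already been turned into a small numeric feature vector; the two-dimensional vectors here are made up, and the choice of the first k points as starting centroids is a simplification.

```python
def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns one cluster label per point."""
    centroids = [list(p) for p in points[:k]]   # simplistic initialization
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical file features, e.g. (marketing keyword score, engineering keyword score).
file_vectors = [(1, 1), (9, 9), (1, 2), (8, 9), (2, 1), (9, 8)]
print(kmeans(file_vectors, k=2))
# [0, 1, 0, 1, 0, 1]
```

Nobody told the algorithm which files belong to which project; the two groups fall out of the distances between the vectors alone.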


Collaboration across different languages will be simplified by the many advanced translation tools available today. The Google Cloud Translation API provides a straightforward way to translate a string from and to many languages. Translation of user comments and metadata, such as tags and image information, can be very useful for any large organization that works with partners and vendors across the globe. When translation is combined with machine learning, translation within an enterprise can improve by learning domain knowledge (medical, law, technology, etc.) and internal jargon. Systems can extract the right metadata, apply domain knowledge, and translate it for employees, partners, and customers, so they can easily communicate and collaborate.

User Interface

Interaction with EFSS applications need not be just clicks and text. Users can have more engaging experiences that include conversational interactions, e.g., users could just say “open the sales report that I shared with my manager last week.” Personal assistants, such as Siri, Cortana, and Alexa, already provide such conversational interfaces for many personal and home scenarios. Though it sounds complex, some of the technology, such as automatic speech recognition for converting speech to text and natural language understanding, is available through Amazon APIs. Converting the conversation into an actual query might not be as complex as it sounds.

Security and Risk Assessment

Machine learning has an excellent application in monitoring network traffic patterns to spot abnormal activities that might be caused by cyber-attacks or other malicious activity. Solutions like FileCloud use some of these techniques to identify ransomware and highlight potential threats. Similar techniques can identify compliance risks by analyzing whether any documents being shared contain personally identifiable information (PII), such as credit card numbers, or personal health information (PHI). Systems can predict and warn of security risks before a breach happens.
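
The simplest version of "spotting abnormal activity" is to learn what normal looks like from past data and flag anything far outside it. The sketch below uses a plain z-score test as a statistical baseline; it is not a description of how FileCloud or any other product works, and the hourly file-modification counts and the threshold of 3 standard deviations are assumptions.

```python
import statistics


def find_anomalies(baseline, observations, threshold=3.0):
    """Return indexes of observations that deviate strongly from the baseline."""
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline)
    if stdev == 0:
        return []  # a flat baseline gives no scale to compare against
    return [
        i for i, value in enumerate(observations)
        if abs(value - mean) / stdev > threshold
    ]


# Hypothetical files modified per hour by one user during normal weeks.
normal_hours = [100, 104, 98, 101, 97, 103, 99, 102]
# New observations; a ransomware-style burst shows up in the third hour.
today = [101, 99, 250, 103]

print(find_anomalies(normal_hours, today))
# [2]
```

Production systems replace the single number with many signals (file types touched, time of day, source addresses) and a trained model, but the principle of comparing new behavior against a learned profile is the same.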

These ideas are just a linear extrapolation of the near future. Even these simple linear extrapolations look promising and interesting. Many predict that, within a few years, almost every device and service will have intelligence embedded in it. In the future, the concept of files and folders might be replaced by some other form of data abstraction. As AI and collaboration continue to evolve, the resulting applications may turn out exponentially better than our linear extrapolations, and our current thoughts could appear naïve. Let’s just hope AI doesn’t evolve the way Musk puts it: “with artificial intelligence, we’re summoning the demon.”

The Intelligent Cloud: Artificial Intelligence (AI) Meets Cloud Computing

If you thought that mobile communications and the Internet have drastically changed the world, just wait. The coming years will prove to be even more disruptive and mind-blowing. Over the last few years, cloud computing has been lauded as the next big disruption in technology, and true to that billing it has become a mainstream element of modern software solutions, just as common as databases or websites. But is there a next phase for cloud computing? Is it an intelligent cloud?

Artificial intelligence (AI) is the type of technology with the capacity not only to enhance current cloud platform incumbents but also to power an entirely new generation of cloud computing technologies. AI is moving beyond simple chat applications, like scheduling support and customer service, to impact the enterprise in more profound ways as automation and intelligent systems develop further to serve critical enterprise functions. AI is bound to become ubiquitous in every industry where decision-making is being fundamentally transformed by ‘Thinking Machines’. The need for smarter and faster decision making and for the management of big data is the driving factor behind the trend.

Remember Moore’s Law? In 1965, Gordon Moore, who went on to co-found Intel, observed that the number of transistors per square inch on integrated circuits had doubled each year since their invention. For roughly the next 50 years, Moore’s Law largely held, with the doubling period later revised to about two years. In the process, multiple sectors like robotics and biotechnology saw remarkable innovation, because computers and the machines that ran on them became faster and smaller with time as the transistors on integrated circuits became more efficient. Now, something even more extraordinary is happening. Accelerating technologies such as big data and artificial intelligence are converging to trigger the next major wave of change. This ‘digital transformation’ will reshape every aspect of the enterprise, including cloud computing.

Artificial intelligence (AI) is expected to burgeon in the enterprise in 2017. Several IT players, including today’s top IT companies, have heavily invested in the space with plans to increase efforts in the foreseeable future.

Although AI has been around since the 1950s, advances in networking and graphics processing units, along with demand for big data, have put it back at the forefront of several companies’ minds and strategies. Given the recent explosion of data from the Internet of Things (IoT) and applications, and the necessity for quicker, real-time decision making, AI is well on its way to becoming a key differentiator and requirement for major cloud providers.

AI-First Enterprises

In a market that has long been dominated by four major companies (IBM, Amazon, Microsoft, and Google), an AI-first approach has the potential to disrupt the current dynamic.

“I think we will evolve in computing from a mobile-first to an AI-first world.”

– Sundar Pichai, Chief Executive of Google

The consumer world is not new to AI-based systems; products like Siri, Cortana and Alexa have been making our lives easier for a while now. However, the enterprise applications for AI are completely different. An AI-first enterprise approach should be designed to allow business leaders and data professionals to collect, organize, secure and govern data efficiently so they can gain the insights they require to become a cognitive business. In order to maintain a competitive advantage, businesses today have to get insights from data; however, acquiring those insights is complex and requires work from skilled data scientists. The ability to make predictions for strategic and tactical purposes has eluded many enterprises due to prohibitive resource requirements.

Cloud computing addresses the two largest hurdles for AI in the enterprise: access to abundant, low-cost computing and a means to leverage large volumes of data.


Today, this new breed of platform, AI as a Service (AIaaS), can be applied to all the data that enterprises have been collecting. Major cloud providers are making AI more accessible “as-a-service” via open source platforms. For enterprises with an array of complex issues to solve, the need for disparate platforms working together can’t be ignored. This is why making machine learning and other variations of AI applications and technology available via open source is critical to the enterprise. By leveraging AI-as-a-service, businesses can build solutions to a wide range of problems.

As machine learning becomes more popular as a service, organizations will have to decide the level at which they want to be involved. While the power of cognitive intelligence is undeniably high, wanting to use it and being able to use it are two completely different things. For this reason, most companies will opt to use a PaaS vendor to manage their entire cycle of data intelligence as opposed to an in-house attempt, allowing them to focus on powering and developing their applications. When looking for an AI provider, you have to ask the right questions. The ideal vendor should be in a position to elucidate both how they handle data and how they intend to solve your specific enterprise problem.

There are multiple digital trends that have the potential to be disruptive; the best way to secure smarter business processes, more agility, and increased productivity is to plan ahead for the change and impact that is coming. The main differentiating factor between competing vendors in this space will be how the technology is applied to improve business processes and strategies.

Author: Gabriel Lando
