Data Knowledge Graph

When you are building data products and filtering data files, it is important to keep track of what you have combined to make a new data set and what you have removed. This feature has saved us countless hours.

From an audit perspective we can build a complete history of a dataset – when it was added to the platform, how it was processed and when/who/where it was delivered / downloaded. This removes a time-draining communication burden from our teams.

We can also add commentary and narratives to a data set. This helps us build transparency and persistent-state knowledge about data.
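One way to picture this kind of dataset history is as a small graph of datasets, each carrying a log of events and a link to the datasets it was built from. The following is a minimal sketch only, not the Knowledge Leaps implementation: the class names, fields and the example dataset names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Event:
    action: str   # hypothetical labels, e.g. "added", "filtered", "delivered"
    actor: str    # who performed the action
    detail: str   # free-text description or commentary
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class Dataset:
    name: str
    parents: List["Dataset"] = field(default_factory=list)  # datasets this one was built from
    events: List[Event] = field(default_factory=list)

    def log(self, action, actor, detail=""):
        self.events.append(Event(action, actor, detail))

    def history(self):
        """Return (dataset name, event) pairs for this dataset and all its ancestors, sorted by time."""
        seen, collected, stack = set(), [], [self]
        while stack:
            node = stack.pop()
            if id(node) in seen:
                continue
            seen.add(id(node))
            collected.extend((node.name, event) for event in node.events)
            stack.extend(node.parents)
        return sorted(collected, key=lambda pair: pair[1].timestamp)


# Hypothetical usage: a raw file is added, then filtered into a new dataset.
raw = Dataset("raw_sales")
raw.log("added", "alice", "uploaded to the platform")
uk_only = Dataset("uk_only", parents=[raw])
uk_only.log("filtered", "bob", "removed non-UK rows")
for name, event in uk_only.history():
    print(name, event.action, event.actor, event.detail)
```

Because each derived dataset points back at its parents, the audit trail for a delivered file can be read off the graph rather than reconstructed from emails.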

AI: A Working Assumption

Building a system that is 100% autonomous and makes its own decisions is both hard and high risk. Given that Amazon, with all its resources and smarts, uses human input for the low/no consequence AI built into Alexa, it is fairly safe to assume that *all* other firms making AI claims have a human involved in at least one critical step.

On Maxima: Search, Life and Software.

Until recently, I have wrestled with why people I knew growing up in a small village in the UK stayed in the village when there was a whole world of opportunities awaiting discovery. I have come to realize that life is a search process. A search for purpose, contentment and security. As with most search algorithms, some are better than others. Some people’s search algorithms stop when they discover a local maximum – such as village life in the UK. Other algorithms drive people to travel much further.

Software development follows similar principles to a search algorithm. While we might think that we are heading towards a peak when we start out building an application, we soon discover that the landscape we are searching is evolving. If we rush too quickly to a peak we might find that we settle on a local rather than a global maximum. Facebook is a good example of the impact of search speed. The reason that Facebook prevailed is that the many social networking sites that came before it provided the company with a long list of technical and business mistakes to avoid. A major lesson was controlled growth – in other words, a slow search algorithm – which meant avoiding the strong temptation, especially where a social network is concerned, to grow very rapidly.

This is an example of a good search process and how it has to be a slow one for long-term success. A slow search allows a process to find a stable solution. The Simulated Annealing Algorithm is a good example of this. The random perturbations applied to the search result diminish over time as the solution gets closer to the optimum search result. The occasional randomness ensures the search doesn’t get stuck on a solution.
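For readers who have not met simulated annealing, here is a minimal sketch of the idea in Python. The landscape function, step size, starting temperature and cooling rate are all illustrative assumptions, and the sketch makes no promise of reaching the global peak on any given run.

```python
import math
import random


def simulated_annealing(score, start, step=1.0, temp=2.0, cooling=0.995, iters=5000, seed=0):
    """Search for a high-scoring x; the tolerance for downhill moves shrinks as temp cools."""
    rng = random.Random(seed)
    current = best = start
    for _ in range(iters):
        candidate = current + rng.uniform(-step, step)  # random perturbation
        delta = score(candidate) - score(current)
        # Always accept an improvement; accept a worse move with probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
        if score(current) > score(best):
            best = current
        temp *= cooling  # cool down: worse moves become steadily less likely
    return best


def landscape(x):
    # Illustrative landscape: a local peak (height 1) near x = -2, a higher peak (height 2) near x = 3.
    return math.exp(-(x + 2) ** 2) + 2 * math.exp(-(x - 3) ** 2)


# Start on the local peak and see where the search ends up.
result = simulated_annealing(landscape, start=-2.0)
print(result, landscape(result))
```

Early on, the high temperature lets the search walk off the local peak. As the temperature falls, it behaves more and more like a plain hill climb and settles.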

We have also been running our own slow search algorithm as we build Knowledge Leaps. We have been at this for a long time. App development began five years ago, but we started its design at least eight years ago. While we wanted to go faster, we have been resource constrained. The advantage of this is that we have built a resilient and fault-tolerant application. The slow development process has also helped foster our design philosophy. When we began, we wanted to be super-technical and build a new scripting language for data engineering. Over time our beliefs have mellowed as we have seen the benefits of a No Code / Low Code solution. Incorporating this philosophy into Knowledge Leaps has made the application that much more user friendly and stable.

ALEXA: State-of-the-art AI

If you want to see how good Alexa is at answering people’s questions you should sign on to Alexa Answers and see the questions Alexa cannot answer. This site has gamified helping Alexa answer these questions. I spent a week doing this and figured out a pretty good workflow to stay in the top 10 of the leaderboard.

The winning strategy is to use Google. You copy the question into Google and paste Google’s answer back into the Alexa Answers website for it to be played back to the person who asked it. The clever thing is that since it is impossible to legally web-scrape Google.com at a commercially viable rate, Amazon have found a way of harnessing the power of Google without a) having to pay, b) violating Google.com’s TOS, and c) getting caught stealing Google’s IP.

After doing this for a week, the interesting thing to note is why Alexa could not answer these questions. Most of them are interpretation errors: Alexa misheard the question (e.g. connor virus, coronda virus, instead of coronavirus). The remainder of the errors arise because the question assumes Alexa knows the context (e.g. Is fgtv dead? – he’s a YouTube star), and without the subject of the question being a known entity in Alexa’s knowledge graph, the results are ambiguous. Rather than be wrong, Alexa declines to answer.

Obviously this is where the amazing pattern matching abilities of the human brain come in. We can look at the subject of the question and the results and choose the most probable correct answer. Amazon can then augment Alexa’s knowledge graph using these results. This is probably in violation of Google’s IP if Amazon intentionally set out to do this.

Having a human being perform the hard task in a learning loop is something that we have also employed in building our platform. Knowledge Leaps can take behavioral data and tease out price sensitivity signals, using purchase data, as well as semantic signals in survey data.

High Praise

On a demo of our application to a prospective customer, the instant feedback was “this looks easier to use than Alteryx”. We’ll take that sort of compliment any day of the week.

Human Analysts Guarantee Bias

During an interview between Shane Parrish and Daniel Kahneman, one of the many interesting comments made was around how to make better decisions. Kahneman said that despite studying decision-making for many years, he was still prone to his own biases. Knowing about your biases doesn’t make them any easier to overcome.

His recommendation to avoid bias in your decision making is to devolve as many decisions as you can to an algorithm. Translating what he is saying to analytical and statistical jobs suggests that no matter how hard we try, we always approach analysis with biases that are hard to overcome. Sometimes our own personal biases are exaggerated by external incentive models. Whether you are evaluating your boss’s latest pet idea, or writing a research report for a paying client, delivering the wrong message can be costly, even if it is the right thing to do.

Knowledge Leaps has an answer. We have built two useful tools to overcome human bias in analysis. The first is a narrative builder that can be applied to any dataset to identify the objective narrative captured in the data. Our toolkit can surface a narrative without introducing human biases.

The second tool we built removes bias by using many pairs of eyes to analyze data and average out any potential analytical bias. Instead of a single human (i.e. bias prone analyst) looking at a data set our tool lets lots of people look at it simultaneously and share their individual interpretation of the data. Across many analysts, this tool will remove bias through collaboration and transparency.
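The averaging argument can be illustrated with a small simulation. This is only a sketch under a strong assumption: each analyst's bias is modeled as independent random noise centered on zero around the true value, so a bias shared by every analyst would not be captured here. The function name and parameters are hypothetical.

```python
import random
import statistics


def pooled_estimate(true_value, n_analysts, bias_sd=1.0, seed=0):
    """Average the readings of n_analysts, each off by an independent random bias (assumed model)."""
    rng = random.Random(seed)
    readings = [true_value + rng.gauss(0, bias_sd) for _ in range(n_analysts)]
    return statistics.mean(readings)


# Compare the typical error of one analyst with that of a pool of 100, over 200 simulated projects.
true_value = 10.0
errors_single = [abs(pooled_estimate(true_value, 1, seed=s) - true_value) for s in range(200)]
errors_pooled = [abs(pooled_estimate(true_value, 100, seed=s) - true_value) for s in range(200)]
print(statistics.mean(errors_single), statistics.mean(errors_pooled))
```

Under this model the pooled error shrinks roughly with the square root of the number of analysts, which is the sense in which many pairs of eyes average out individual bias.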

Get in touch to learn more. doug@knowledgeleaps.com.

Sign Up For Our Newsletter

Occasionally we will send out an email newsletter containing some detailed and interesting case studies based on the data we have access to. Sign up below.

Building An Agile Market Research Tool

For the past five years we have been building our product Knowledge Leaps, an agile market research tool. We use it to power our own business serving some of the most demanding clients on the planet.

To build an innovative market research tool I had to leave the industry. I spent 17 years working in market research and experienced an industry that struggled to innovate. There are many reasons why innovation failed to flourish, one of which lies in the fact that it is a service industry. Service businesses are successful when they focus their human effort on revenue generation (as it should be). Since the largest cost base in research is people, there is no economic incentive to invest in the long term, especially as the industry has come under economic pressure in recent years. The same could be said of many service businesses that have been disrupted by technology. Taxi drivers are a good example of this effect.

This wouldn’t be the first time market research innovations have come from firms that are outside of the traditional market research category definition. For example, SurveyMonkey was founded by a web developer with no prior market research experience. Qualtrics was founded by a business school professor and his son, again with no prior market research industry experience.

Stepping outside of the industry and learning how other types of businesses are managing data, using data and extracting information from it has been enlightening. It has also helped us build an abstracted solution. While we can focus on market research use-cases, we have built a platform that fosters analytics collaboration and an open-data philosophy, so finding new uses for it is a frequent occurrence.

To talk tech-speak, what we have done is to productize a service. We have taken the parts of the market research process which happen frequently and are expensive and turned them into a product. A product that delivers the story in data without bias. It does it really quickly too. Visit the site or email us at support@knowledgeleaps.com to find out more.

Science Fiction and the No-Free-Lunch Theory

In a lot of science fiction films one, or more, of the following are true:

  1. Technology exists that allows you to travel through the universe at the “speed of light.”
  2. Technology that allows autonomous vehicles to navigate complicated 2-D and 3-D worlds exists.
  3. Technology exists that allows robots to communicate with humans in real-time detecting nuances in language.
  4. Handheld weapons have been developed that fire bursts of lethal high energy that require little to no charging.

Yet, despite these amazing technological advances the kill ratio is very low. While it is fiction, I find it puzzling that this innovation inconsistency persists in many films and stories.

This is the no-free-lunch theory in action. Machines that are developed to be good at a specific task are not good at doing other tasks. This will have ramifications in many areas, especially those that require solving multiple challenges. Autonomous vehicles, for example, need to be good at four things:

  1. Navigating from point A to B
  2. Complying with road rules and regulations.
  3. Negotiating position and priority with other vehicles on the road.
  4. Not killing, or harming, humans and animals.

Of this list, 1) and 2) are low level. 3) is challenging to solve as it requires some programmed personality. Imagine two cars using the same autonomous software meet at a junction at the very same time: one of them needs to give way to the other. This requires some degree of assertiveness to be built in. I am not sure this is trivial to solve.

Finally, 4) is probably really hard to solve since it requires 99.99999% success in incidents that occur every million miles. There may never be enough training data.
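A rough back-of-the-envelope calculation shows the scale of the problem. It rests on assumptions that go beyond the text above: that one safety-critical incident occurs per million miles, that incidents are independent, and that we want 95% confidence, from a failure-free record, that the failure rate per incident is below 1 in 10 million (i.e. 99.99999% success). The function name is hypothetical.

```python
import math


def trials_needed(failure_rate, confidence=0.95):
    """Failure-free independent trials needed to be `confidence` sure the true rate is below failure_rate."""
    # Solve (1 - failure_rate) ** n <= 1 - confidence for n.
    return math.ceil(math.log(1 - confidence) / math.log1p(-failure_rate))


failure_rate = 1e-7            # 99.99999% success per incident
miles_per_incident = 1_000_000  # assumption: one incident every million miles

incidents = trials_needed(failure_rate)
miles = incidents * miles_per_incident
print(f"{incidents:,} failure-free incidents, about {miles:.1e} miles")
```

Under these assumptions the answer is on the order of 30 million failure-free incidents, or tens of trillions of miles of driving, which is why the training and validation data may never be sufficient.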

AI Developer, A Job For Life.

Last year we wrote about No Free Lunch Theory (NFLT) and how it relates to AI (among other things). In this recent Wired article, this seems to be coming true. Deep Learning, the technology that helped AI make significant leaps in performance, has limitations. These limitations, as reported in the article, cannot necessarily be overcome with more compute power.

As NFLT states (paraphrased): being good at doing X means an algorithm cannot also be good at doing Not X. A Deep Learning model’s success in one area is no guarantee of success in other areas. In fact the opposite tends to be true. This is the NFLT in action, and in many ways specialized instances of AI-based systems were an inevitable consequence of it.

This has implications for the broader adoption of AI. For example, there can be no out-of-the-box AI “system”. Implementing an AI solution based on the current state of the art is much like building a railway system. It needs to adapt to the local terrain. A firm can’t take a system from another firm or AI-solutions provider and hope it will be a turn-key operation. I guess it’s in the name, “Deep Learning”. The “Deep” refers to deep domain, i.e. a specific use-case, and not necessarily deep thinking.

This is great news if you are an AI developer or have experience in building AI-systems. You are the house builder of the modern age and your talents will always be in demand – unless someone automates AI-system implementation.

UPDATE: A16Z wrote this piece – which supports my thesis.