Until recently, I wrestled with why people I knew growing up in a small village in the UK stayed in the village when there was a whole world of opportunities awaiting discovery. I have come to realize that life is a search process: a search for purpose, contentment and security. As with most search algorithms, some are better than others. Some people's search algorithms stop when they discover a local maximum, such as village life in the UK. Other algorithms drive people to travel much further.
Software development follows similar principles to a search algorithm. While we might think that we are heading towards a peak when we start out building an application, we soon discover that the landscape we are searching is evolving. If we rush too quickly to a peak, we might find that we have settled on a local rather than a global maximum. Facebook is a good example of the impact of search speed. The reason that Facebook prevailed is that the many social networking sites that came before it provided the company with a long list of technical and business mistakes to avoid. A major lesson was controlled growth, in other words a slow search algorithm: avoiding the strong temptation, especially where a social network is concerned, to grow very rapidly.
This is an example of a good search process, and of how it has to be a slow one for long-term success. A slow search allows a process to find a stable solution. The simulated annealing algorithm is a good example of this. The random perturbations applied to the search result diminish over time as the solution gets closer to the optimum. The occasional randomness ensures the search doesn't get stuck on a local solution.
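To make the idea concrete, here is a minimal sketch of simulated annealing in Python. The bumpy `objective` function, the cooling schedule and all parameter values are assumptions chosen for illustration, not anything taken from a real product. The search starts near a local peak, takes large random steps while the "temperature" is high, and takes smaller and smaller steps as it cools.

```python
import math
import random


def objective(x):
    # Hypothetical landscape: several local maxima, global maximum of 1.0 at x = 0.
    return math.cos(3.0 * x) - 0.1 * x * x


def simulated_annealing(start, steps=20000, t_start=2.0, t_end=1e-3, seed=0):
    """Maximize objective() starting from `start`; returns (best_x, best_score)."""
    rng = random.Random(seed)
    current, current_score = start, objective(start)
    best, best_score = current, current_score
    for step in range(steps):
        # Geometric cooling: the temperature falls smoothly from t_start to t_end.
        fraction = step / max(steps - 1, 1)
        temperature = t_start * (t_end / t_start) ** fraction
        # The random perturbation shrinks as the temperature falls.
        candidate = current + rng.gauss(0.0, 1.0) * temperature
        candidate_score = objective(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; sometimes accept a worse move so the
        # search can escape a local maximum. This gets rarer as it cools.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score


# Start close to a local peak (around x = 4.2) rather than the global one.
best_x, best_score = simulated_annealing(start=4.0)
print(f"best x = {best_x:.3f}, score = {best_score:.3f}")
```

A pure hill-climber started at the same point would stop at the nearest peak. The early, large perturbations are what give the slow search a chance to find a better one.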
We have also been running our own slow search algorithm as we build Knowledge Leaps. We have been at this for a long time. App development began five years ago, but we started its design at least eight years ago. While we wanted to go faster, we have been resource constrained. The advantage of this is that we have built a resilient and fault-tolerant application. The slow development process has also helped shape our design philosophy. When we began, we wanted to be super-technical and build a new scripting language for data engineering. Over time our beliefs have mellowed as we have seen the benefits of a No Code / Low Code solution. Incorporating this philosophy into Knowledge Leaps has made the application that much more user friendly and stable.
If you want to see how good Alexa is at answering people's questions, you should sign on to Alexa Answers and see the questions Alexa cannot answer. This site has gamified helping Alexa answer these questions. I spent a week doing this and figured out a pretty good workflow to stay in the top 10 of the leaderboard.
The winning strategy is to use Google. You copy the question into Google and paste the answer Google returns back into the Alexa Answers website, for it to be played back to the person who asked it. The clever thing is that since it is impossible to legally web-scrape Google.com at a commercially viable rate, Amazon have found a way of harnessing the power of Google without a) having to pay, b) violating Google.com's TOS, and c) getting caught stealing Google's IP.
After doing this for a week, the interesting thing to note is why Alexa could not answer these questions. Most of them are interpretation errors: Alexa misheard the question (e.g. connor virus or coronda virus instead of coronavirus). The remainder of the errors arise because the question assumes Alexa knows the context (e.g. "Is fgtv dead?", where fgtv is a YouTube star). Without the subject of the question being a known entity in Alexa's knowledge graph, the results are ambiguous. Rather than be wrong, Alexa declines to answer.
Obviously this is where the amazing pattern matching abilities of the human brain come in. We can look at the subject of the question and the results and choose the most probable correct answer. Amazon can then augment Alexa's knowledge graph using these results. This is probably in violation of Google's IP if Amazon intentionally set out to do this.
Having a human being perform the hard task in a learning loop is something that we have also employed in building our platform. Knowledge Leaps can take behavioral data and tease out price sensitivity signals, using purchase data, as well as semantic signals in survey data.
On a demo of our application to a prospective customer, the instant feedback was "this looks easier to use than Alteryx". We'll take that sort of compliment any day of the week.
During an interview between Shane Parrish and Daniel Kahneman, one of the many interesting comments made was around how to make better decisions. Kahneman said that despite studying decision-making for many years, he was still prone to his own biases. Knowing about your biases doesn't make them any easier to overcome.
His recommendation to avoid bias in your decision making is to devolve as many decisions as you can to an algorithm. Translating what he is saying to analytical and statistical jobs suggests that no matter how hard we try, we always approach analysis with biases that are hard to overcome. Sometimes our own personal biases are exaggerated by external incentive models. Whether you are evaluating your boss's latest pet idea or writing a research report for a paying client, delivering an unwelcome message can be costly, even if it is the right thing to do.
Knowledge Leaps has an answer. We have built two useful tools to overcome human bias in analysis. The first is a narrative builder that can be applied to any dataset to identify the objective narrative captured in the data. Our toolkit can surface a narrative without introducing human biases.
The second tool we built removes bias by using many pairs of eyes to analyze data and average out any potential analytical bias. Instead of a single human (i.e. a bias-prone analyst) looking at a dataset, our tool lets lots of people look at it simultaneously and share their individual interpretations of the data. Across many analysts, this tool will remove bias through collaboration and transparency.
Get in touch to learn more. email@example.com.
For the past five years we have been building our app Knowledge Leaps, an agile market research tool. We use it to power our own business serving some of the most demanding clients on the planet.
To build an innovative market research tool I had to leave the industry. I spent 17 years working in market research and experienced an industry that struggled to innovate. There are many reasons why innovation failed to flourish, one of which lies in the fact that it is a service industry. Service businesses are successful when they focus their human effort on revenue generation (as it should be). Since the largest cost in research is people, there is no economic incentive to invest in the long term, especially as the industry has come under economic pressure in recent years. The same could be said of many service businesses that have been disrupted by technology, taxi drivers being a good example of this effect.
This wouldn't be the first time market research innovations have come from firms that are outside of the traditional market research category definition. For example, SurveyMonkey was founded by a web developer with no prior market research experience. Qualtrics was founded by a business school professor and his son, again with no prior market research industry experience.
Stepping outside of the industry and learning how other types of businesses are managing data, using data and extracting information from it has been enlightening. It has also helped us build an abstracted solution. While we can focus on market research use cases, we have built a platform that fosters analytics collaboration and an open-data philosophy, so finding new uses for it is a frequent occurrence.
To talk tech-speak, what we have done is to productize a service. We have taken the parts of the market research process which happen frequently and are expensive, and turned them into a product: a product that delivers the story in data without bias. It does it really quickly too. Visit the site or email us firstname.lastname@example.org to find out more.
In a lot of science fiction films one, or more, of the following are true:
- Technology exists that allows you to travel through the universe at the "speed of light."
- Technology that allows autonomous vehicles to navigate complicated 2-D and 3-D worlds exists.
- Technology exists that allows robots to communicate with humans in real-time detecting nuances in language.
- Handheld weapons have been developed that fire bursts of lethal high energy that require little to no charging.
Yet, despite these amazing technological advances the kill ratio is very low. While it is fiction, I find it puzzling that this innovation inconsistency persists in many films and stories.
This is the no-free-lunch theory in action. Machines that are developed to be good at a specific task are not good at doing other tasks. This will have ramifications in many areas, especially those that require solving multiple challenges. Autonomous vehicles, for example, need to be good at four things:
1) Navigating from point A to B.
2) Complying with road rules and regulations.
3) Negotiating position and priority with other vehicles on the road.
4) Not killing, or harming, humans and animals.
Of this list, 1) and 2) are low level. 3) is challenging to solve as it requires some programmed personality. Imagine two cars using the same autonomous software meeting at a junction at the very same time: one of them needs to give way to the other. This requires some degree of assertiveness to be built in. I am not sure this is trivial to solve.
Finally, 4) is probably really hard to solve since it requires 99.99999% success in incidents that occur every million miles. There may never be enough training data.
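A rough back-of-the-envelope calculation shows the scale of the problem. The sketch below rests on loud assumptions of my own: each incident is treated as an independent pass/fail trial, one incident happens per million miles, and "proving" the target means seeing enough failure-free incidents to be 95% confident that the failure rate is below 1 in 10 million (the statistical "rule of three").

```python
import math


def trials_needed(max_failure_rate, confidence=0.95):
    """Failure-free trials needed to claim the true failure rate is below max_failure_rate."""
    # Solve (1 - p)^n <= 1 - confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-max_failure_rate))


failure_rate = 1e-7              # 99.99999% success means 1 failure in 10 million incidents
miles_per_incident = 1_000_000   # assumption: one incident every million miles

incidents = trials_needed(failure_rate)
miles = incidents * miles_per_incident

print(f"incidents needed: {incidents:,}")
print(f"miles needed:     {miles:.2e}")
```

Under these assumptions the answer is roughly 30 million failure-free incidents, or about 3 x 10^13 miles of driving, which is one way of seeing why the training and validation data may never be sufficient.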
Last year we wrote about No Free Lunch Theory (NFLT) and how it relates to AI (among other things). According to this recent Wired article, it seems to be coming true. Deep Learning, the technology that helped AI make significant leaps in performance, has limitations. These limitations, as reported in the article, cannot necessarily be overcome with more compute power.
As NFLT states (paraphrased): being good at doing X means an algorithm cannot also be good at doing Not X. A Deep Learning model's success in one area is no guarantee that it will have success in other areas. In fact the opposite tends to be true. This is the NFLT in action, and in many ways specialized instances of AI-based systems were an inevitable consequence of it.
This has implications for the broader adoption of AI. For example, there can be no out-of-the-box AI "system". Implementing an AI solution based on the current state of the art is much like building a railway system. It needs to adapt to the local terrain. A firm can't take a system from another firm or AI-solutions provider and hope it will be a turn-key operation. I guess it's in the name, "Deep Learning". The "Deep" refers to deep domain, i.e. a specific use case, and not necessarily deep thinking.
This is great news if you are an AI developer or have experience in building AI-systems. You are the house builder of the modern age and your talents will always be in demand - unless someone automates AI-system implementation.
UPDATE: A16Z wrote this piece - which supports my thesis.
The tools available to produce charts and visualize data are sadly lacking in a critical area. While much focus has been placed on producing interesting visualizations, one problem has yet to be solved: it is all too easy to separate the Data layer from the Presentation layer in a chart. It is easy for the context of a chart to be lost when it becomes separated from its source. When that happens we lose meaning and we potentially introduce bias and ambiguity.
In plain English, when you produce a chart in Excel or Google Sheets, the source data is in the same document. When you embed that chart in a PowerPoint or Google slide deck you lose some of the source information. When you convert that presentation into a PDF and email it to someone, you risk losing all connections to the source. Step by step it becomes too easy to remove context from a chart.
Yes, you can label the chart. You can cite your source. But neither is a foolproof method. These are like luggage tags: while they are attached they work, but they are all too easy to remove.
In analytics, reproducibility and transparency are critical to building a credible story. Where did the data come from? Could someone else remake the chart by following the same instructions (source, series information, filters applied, etc.)? Do the results stand up to objective scrutiny?
At Knowledge Leaps, we are building a system that ensures reproducibility and transparency by binding the context of the data and its "recipe" to the chart itself. This is built into the latest release of our application.
When charts are created we bind them to their source data (easy) and we bind the "recipe". We then make them easily searchable and discoverable, unhindered by whatever silo they would otherwise sit in, i.e. slide, presentation, folder, etc.
The end benefit is that data and charts can be shared without loss of the underlying source information. People not actively involved in creating the chart can interpret and understand its content without any ambiguity.
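As an illustration of what binding a recipe to a chart could look like, here is a minimal Python sketch. It is not how Knowledge Leaps is implemented: the `ChartRecipe` and `BoundChart` classes, their fields, the `find_by_source` helper and the example file name are all hypothetical, chosen only to show the idea that the source, series and filters travel with the chart wherever it goes.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ChartRecipe:
    """Hypothetical 'recipe': everything needed to remake the chart."""
    source: str          # where the data came from
    series: tuple        # which columns or series were plotted
    filters: tuple = ()  # filters applied before plotting


@dataclass(frozen=True)
class BoundChart:
    """A chart that carries its recipe instead of relying on a label."""
    title: str
    values: tuple
    recipe: ChartRecipe

    def fingerprint(self):
        # Hash chart and recipe together, so any change to either is detectable.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def find_by_source(charts, source):
    """Search charts by where their data came from, not by which slide holds them."""
    return [chart for chart in charts if chart.recipe.source == source]


all_stores = BoundChart(
    title="Weekly sales",
    values=(120, 135, 128),
    recipe=ChartRecipe(source="sales_2020.csv", series=("week", "sales")),
)
north_only = BoundChart(
    title="Weekly sales",
    values=(40, 52, 47),
    recipe=ChartRecipe(
        source="sales_2020.csv",
        series=("week", "sales"),
        filters=("region == 'North'",),
    ),
)

matches = find_by_source([all_stores, north_only], "sales_2020.csv")
print(len(matches), all_stores.fingerprint() != north_only.fingerprint())
```

Two charts with the same title but different filters end up with different fingerprints, which is the kind of ambiguity a removable label cannot guard against.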
Today we rolled out our new charting feature. This release marks an important milestone in the development of Knowledge Leaps (KL).
Our vision for the platform has always been to build a data analysis application platform that lets a firm harness the power of distributed computing and a distributed workforce.
Charts and data get siloed in organisations because they are buried in containers. Most charts are contained on a slide in a PowerPoint presentation that sits in a folder on a server somewhere in your company's data center.
We have turned this on its head in our latest release. Charts that are produced in KL remain open and accessible to all users. We have also built in a collaborative interpretation feature where a group of people spread across locations can interpret data as part of a team rather than alone. This shares the burden of work and builds more resilient insights, since people with different perspectives can build the best-in-class narrative.