We Eat Our Own Dog Food

I first heard the phrase "do the dogs eat the dog food" on a start-up podcast. The idea is that if your firm builds a product for customers, your firm should also use it.

I then read this adaptation of the phrase and thought it applies to us. We ship features and code that help our customers and that help us do our jobs better. We make "dog food" and we eat it. So, if the UI for a new feature is clunky or an implementation doesn't quite hit the mark, we know about it because our team will tell us.

Feature Release: July 13

Today, we released a new set of features. The primary feature is a new auditing tool that helps data engineers quickly profile a dataset in terms of column cardinality, row count and the constituent file count. This simple feature gives a quick snapshot of a dataset and flags potential data issues. In a production pipeline, this prevents corrupted data from being dispatched.

Data Audit Icon

Clicking the icon runs the audit. Once it completes, all of the information is shown on each dataset's information page.
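
For readers curious what such an audit computes, here is a minimal sketch in Python. The function name, the CSV layout and the pandas-based approach are illustrative assumptions, not our actual implementation.

    import glob
    import pandas as pd

    def audit_dataset(path_glob):
        # Profile a dataset stored as one or more CSV files:
        # constituent file count, total row count, and per-column cardinality.
        files = glob.glob(path_glob)
        df = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)
        return {
            "file_count": len(files),
            "row_count": len(df),
            "column_cardinality": {c: df[c].nunique() for c in df.columns},
        }

    # Example: profile every CSV in a (hypothetical) dataset directory.
    # print(audit_dataset("data/sales/*.csv"))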

Forecasting Using Prior Distributions

We have been building some product forecasting models using Monte Carlo methods. Sales distributions are often skewed right: a long tail of large orders pulls the mean above the typical value. Normal approximations therefore tend to inflate forecast estimates, since the distribution is not centered around the mean. Furthermore, the standard deviation of a skewed distribution is itself inflated by the tail, so it produces estimates with very wide margins of error.
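
A quick illustration of the problem, using a lognormal distribution as a stand-in for right-skewed sales data (the parameters are arbitrary):

    import numpy as np

    rng = np.random.default_rng(42)
    # Lognormal as a stand-in for right-skewed daily sales.
    sales = rng.lognormal(mean=3.0, sigma=1.0, size=10_000)

    print(f"mean:   {sales.mean():.1f}")      # pulled upward by the long tail
    print(f"median: {np.median(sales):.1f}")  # the typical sale, noticeably lower
    print(f"std:    {sales.std():.1f}")       # inflated by the tail values

On a sample like this the mean lands around 33 while the median sits near 20, so a normal fit centered on the mean overstates the typical sale.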

To overcome this, we use a Monte Carlo simulator that draws from the sales distribution at random. Building up a sample of many simulated estimates not only gives a more accurate point estimate, it also helps us calculate more realistic margins of error.
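
Here is a minimal sketch of that idea as a simple bootstrap: resample historical daily sales with replacement many times, then read the forecast and its margin of error straight from the simulated totals. The function name, the median point estimate and the 90% interval are illustrative choices, not a description of our production models.

    import numpy as np

    def monte_carlo_forecast(historical_sales, horizon_days, n_sims=10_000, seed=0):
        # Simulate `n_sims` possible futures by drawing `horizon_days`
        # daily sales at random (with replacement) from the history.
        rng = np.random.default_rng(seed)
        draws = rng.choice(historical_sales, size=(n_sims, horizon_days))
        totals = draws.sum(axis=1)
        return {
            "forecast": np.median(totals),             # point estimate
            "interval_90": (np.percentile(totals, 5),  # realistic margin of error
                            np.percentile(totals, 95)),
        }

    # Example, reusing the synthetic skewed sales from above:
    # print(monte_carlo_forecast(sales, horizon_days=30))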

Feature Release: July 3

New features rolled out this week:

  • Apply filters and mapping files to other filters and mapping files. This feature helps create randomized lists and sub-filters based on new criteria. For example, extract a list of userIDs from a data file and apply gender from a lookup table. Then filter this list by gender to create a specific list of users. This new file can then be sampled randomly to produce a new list of random userIDs that meet a specific criterion (a sketch of the workflow follows below).
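
A minimal sketch of that workflow in Python; the file names, the gender column and the pandas-based approach are all illustrative assumptions.

    import pandas as pd

    # Extract a list of userIDs from a data file (hypothetical file names).
    users = pd.read_csv("events.csv", usecols=["userID"]).drop_duplicates()

    # Apply gender from a lookup table (a "mapping file" with userID, gender columns).
    lookup = pd.read_csv("user_gender_lookup.csv")
    users = users.merge(lookup, on="userID", how="left")

    # Filter the mapped list by gender to create a specific list of users.
    female_users = users[users["gender"] == "F"]

    # Sample that list at random to produce a new list of random userIDs.
    random_users = female_users.sample(n=500, random_state=1)
    random_users["userID"].to_csv("random_female_userIDs.csv", index=False)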