Today, along with Joel Spolsky and Jude Allred, I’m excited to introduce HASH, the company we founded a little over a year ago. We believe that most bad things in the world are the product of some form of information failure. From averting economic collapse and the outbreak of war and disease, to choosing the right life partner or university degree, we’re on a mission to help everybody make the right decisions and overcome information failure.
Brilliant innovators have sought to organize the world’s information and make it accessible to all, and the next step on this journey is to make that information understandable and usable to everybody.
While high-tech, highly funded organizations like hedge funds are able to process vast swathes of the world’s information efficiently for minute gains and millisecond edges in economic trades, the vast majority of businesses and individuals have no systematic way of parsing the wealth of signals contained in the world around them.
Simulation has the power to unlock a better world: advancing our understanding and appreciation of the world around us. Not only are simulations useful cognitive tools for humans, but they have the potential to be rich, machine-readable representations of real world problems as well. As such, they are universal interfaces for both humans and AI — and in our view the best bet we have for growing connective tissue that bridges both human and machine learning.
We hope to enable better human and automated decision-making: bringing about the rational resolution of conflict, the reduction and elimination of market failures, and supporting people to achieve happier, healthier lives. We don’t want to wait for this.
If you can’t wait to get started either, sign up now – or read on to find out more.
I used to run a digital consultancy in London that developed sites, software, and ran data-driven campaigns for a wide range of clients: from private equity firms and startups through to government clients of the largest size.
From time to time we’d encounter really interesting problems, such as how to track the spread of a behaviorally-driven disease (for example, a sexually transmitted infection), assess the effectiveness of interventions against it (e.g. informational advertising campaigns), and optimize ad-spend (i.e. target the influential nodes in a network that are most likely to stymie disease spread).
It turns out that a single gold standard exists in both epidemiology and behavioral advertising for answering these types of questions, which is ‘agent-based modeling’ (ABM). ABMs work as follows…
ABMs can be constructed from ground-up first principles, and are useful for counterfactual “what if” hypothesis testing, enabling safe exploration of the digital twins of real-world systems. That makes multi-agent simulations useful for a whole lot more than predicting the spread of disease and information through networks.
A whole range of complex systems problems defy attempts at predictive modeling. These are typically problems characterized by nonlinearity, emergence, adaptation, interdependence and feedback loops between agents. The resulting “black swan events” are by definition not reflected in existing patterns and historical data, and are therefore missed entirely by approaches that rely on that data alone.
No systems exist in true isolation — all are part of our complex real world — and as such all business, policy, and human problems are ultimately problems of understanding complex systems. Smart abstraction enables us to discount most of the extraneous world, most of the time, but it’s hard to know what might be interesting when, and under which circumstances.
In some systems this doesn’t matter, but for other questions, such as how we can contribute to a more stable economy or good foreign relations, the answers can be matters of life and death. In order to fully understand these high-impact, critical-risk problems, we need to generatively search the space around them based on the observable dynamics of those systems. Pattern recognition and analysis of historical outputs alone are good for cheap base-casing, but provide little understanding of problems’ tails.
Because the space around problems representing all possible configurations of the world is so much greater than the historical space in which problems have been observed, there is sometimes a temptation to write off proper scientific simulation as infeasible. But simulation, properly used, doesn’t seek to simulate every possible version of the world that might ever occur (infinite, of course). Rather, it helps its users understand which versions are likely to occur, and brings attention to novel scenarios which, due to their emergent nature, might not have occurred to human analysts alone.
Crises like the 2007–08 financial crash became disasters precisely because decision-makers didn’t understand or account for the underlying dynamics of complex systems (in this case, the economy). Well-intentioned pieces of regulation such as Basel II put in place capital reserve requirements which, when combined with mark-to-market accounting practices, led to asset fire sales, with market participants forced to sell into declining markets, deepening the trough.
While historical and present-value data can be used to pre-populate and backtest agent-based models, such data is not required to construct ABMs, which opens the door to explicit formal modeling in a wide variety of domains where machine learning cannot be readily applied today.
Moreover, simulations combine the benefits of formal modeling with the richness of qualitative description, making them highly explainable and easy for humans to understand. In contrast to models that are often black boxes, agent-based simulations are inherently inspectable, and users can step through time to see exactly how outcomes are arrived at, and what factors contribute.
So why, then, do they remain so little discussed, unappreciated and underutilized?
Simulations are time-consuming and costly to build, as well as expensive to maintain, run and support. They require knowledge of specialist tools, frameworks, and even weird proprietary programming languages. The resulting simulations are often not particularly portable or repurposeable, and where simulation logic is the product of conjecture or lacks calibration, this can lead to a false sense of confidence or security which may compound existing poor decision-making.
Although simulations claim prominent users across the worlds of supply chain, manufacturing, finance, defense, and more, market-leading agent-based modeling software packages today run north of $10k/user/year, and are based on dated technologies and paradigms which don’t lend themselves well to distributed computation at real scale. Their user interfaces haven’t been touched since the 1990s, the developer experience they offer is equally dated, they don’t run in the browser at all, they can’t be used on mobile devices, and users often need to deploy special software just to access them.
For the most part these simulations are toy models, built to showcase specific dynamics, and lack interoperability. Once built, models are siloed, and there’s little sharing or building on the work of others. Most models are scoped down so far to ensure they run in a timely fashion that they capture only a fraction of the dynamics within the systems they represent. Rather than build rich virtual worlds and selectively include relevant parts on a per-experiment basis, modelers create cheap toy abstractions which fail to inspire confidence amongst users and are much less easily explored. There’s deep, justifiable skepticism as to whether toy models are truly ‘scientific’, and, on the flip side, as to whether more complex models can be appropriately calibrated and parameterized.
Throw into the mix problems finding appropriately granular agent-level data, difficulties translating domain expertise into code, and a wide range of structural barriers to creating ABMs and it’s not hard to see why general purpose simulation remains out of favor and rarely used in business today.
When faced with a lot of systemic problems, we want to build system-level solutions. HASH aims to ‘solve simulation’ by vertically integrating the entire stack, providing a unified platform for building, running, and learning from simulations.
Today we’re launching publicly two parts of HASH:
All HASH simulations consist of agents (represented by descriptive schemas) and behaviors (which are generally pure functions). Behaviors drive agents, while datasets can be used to instantiate or update agents within simulations based on real-world observations, or to backtest and calibrate models. Behaviors and datasets are mapped to appropriate subjects and schemas, making them easily discoverable by model-builders using H-Index, and in future cross-linkable within H-Core.
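To make the agent-and-behavior idea concrete, here is a minimal sketch in plain Python. It is not HASH's actual API: the `Agent` fields, the `move_right` and `catch_infection` behaviors, and the `step` and `run` helpers are all hypothetical names chosen for illustration. It assumes only what is described above: agents are described by a schema, and behaviors are pure functions that return a new agent state rather than mutating the old one.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple

# Hypothetical behavior signature: (agent, read-only snapshot of the world) -> new agent.
Behavior = Callable[["Agent", Tuple["Agent", ...]], "Agent"]


@dataclass(frozen=True)
class Agent:
    """Illustrative agent schema; the field names are assumptions."""
    agent_id: int
    position: int
    infected: bool
    behaviors: Tuple[Behavior, ...] = ()


def move_right(agent: Agent, world: Tuple[Agent, ...]) -> Agent:
    # Pure: returns a copy with an updated position, leaving the input untouched.
    return replace(agent, position=agent.position + 1)


def catch_infection(agent: Agent, world: Tuple[Agent, ...]) -> Agent:
    # Pure: become infected if an infected agent shared our position last step.
    exposed = any(
        other.infected
        and other.position == agent.position
        and other.agent_id != agent.agent_id
        for other in world
    )
    return replace(agent, infected=agent.infected or exposed)


def step(world: Tuple[Agent, ...]) -> Tuple[Agent, ...]:
    # Every agent reads the same snapshot of the previous step.
    next_world = []
    for agent in world:
        for behavior in agent.behaviors:
            agent = behavior(agent, world)
        next_world.append(agent)
    return tuple(next_world)


def run(world: Tuple[Agent, ...], steps: int) -> List[Tuple[Agent, ...]]:
    # Keep every intermediate state so the run can be inspected step by step.
    history = [world]
    for _ in range(steps):
        world = step(world)
        history.append(world)
    return history


initial_world = (
    Agent(agent_id=0, position=0, infected=True, behaviors=(move_right,)),
    Agent(agent_id=1, position=2, infected=False, behaviors=(catch_infection,)),
)
history = run(initial_world, steps=3)
```

Because each behavior returns a new state, `history` holds every step of the run, which is one way to get the step-through inspectability that agent-based simulations offer.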
Simulations, datasets and behaviors are all accessible on the H-Index. Today, everything within the H-Index is available free of charge. Envisaged as something of a cross between GitHub and a package manager, H-Index will in future also become a marketplace, facilitating the purchase and sale of paid behaviors, datasets and simulations. We imagine consultancies publishing components for free to establish credibility and expertise, then selling more complete simulations and consultancy services atop.
Our future plans for H-Index involve explicit Git-like support for forking, branching, reporting issues and making pull requests: functionality that, like the use of package managers, is now second nature to most software developers.
The impact of these changes on developer workflow is significant: as H-Index grows, domain experts with limited programming knowledge will be able to fork and adapt, or wholesale incorporate, existing behaviors into simulations, enabling them to model complex dynamics without the need to write vast swathes of custom code from scratch.
Our current lineup is not, however, complete. Although our blazing fast HASH Engine enables simulation at unparalleled speed, it is currently only available through our H-Core web-based IDE, which necessarily constrains it to the memory and CPU available to the browser tab, which in many cases is severely limited. This has meant that while H-Engine is designed to handle truly world-sized simulations, our early beta users have been limited to building relatively small-scale models in our platform. This makes H-Core in its current iteration comparable to something like NetLogo, the academic agent-based modeling tool: useful for illustrating the impact of heterogeneous agents within complex systems, and helpful in explaining to users the dynamics of these systems, but limited in its capacity to model real-world environments with a high degree of fidelity or at scale. Because of these current constraints, tools for running optimization experiments (parameter sweeps, Monte Carlo simulations, and more exotic reinforcement learning) have been hidden away for now, but are very much priorities for us.
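For readers unfamiliar with these experiment types, the sketch below shows roughly what a parameter sweep combined with Monte Carlo runs looks like. It is plain Python, not HASH's experiment tooling: the toy `simulate` model, its `infection_prob` parameter, and the `sweep` helper are all assumptions made for illustration. Each parameter value is run many times with different random seeds, and the outcomes are averaged.

```python
import random
from statistics import mean
from typing import Dict, List


def simulate(infection_prob: float, seed: int,
             population: int = 50, steps: int = 20) -> int:
    """Toy stochastic model (illustrative only): returns the final infected count."""
    rng = random.Random(seed)  # seeded, so each run is reproducible
    infected = 1
    for _ in range(steps):
        new_cases = 0
        # Each susceptible agent meets one randomly chosen agent per step.
        for _ in range(population - infected):
            met_infected = rng.random() < infected / population
            if met_infected and rng.random() < infection_prob:
                new_cases += 1
        infected += new_cases
    return infected


def sweep(probs: List[float], runs: int) -> Dict[float, float]:
    # Parameter sweep: one entry per parameter value.
    # Monte Carlo: average the outcome over `runs` independently seeded runs.
    return {
        p: mean([simulate(p, seed) for seed in range(runs)])
        for p in probs
    }


results = sweep([0.0, 0.5, 1.0], runs=30)
for prob, avg_infected in results.items():
    print(f"infection_prob={prob}: mean final infected = {avg_infected}")
```

Even this toy version runs 90 simulations for three parameter values, which hints at why such experiments quickly outgrow the resources of a single browser tab.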
To this end, today we’re releasing our roadmap for unlocking these features and the use of simulation for everyday ‘real world’ decision-making:
You can find out more in our public feature roadmap at hash.ai/roadmap
We started as just two people a little over a year ago, and are now a team of ~10. I’m incredibly proud of the team we’ve built, and what we’ve achieved in this time.
We’re excited to meet users of HASH and have launched a Slack community which can be accessed via the icon in the bottom-right of any page at hash.ai — we’ll be around to help you build your models, answer your questions, and take your feature suggestions and bug reports.
To eliminate information failure, we need to build tools that have never been created before to solve problems that can’t be solved today. We need to give people superpowers, and that’s what we’re on a mission to do.
If you’d like to build a model with HASH, you can sign up at hash.ai/signup
If you want to join us on our mission of helping everybody make the right decisions, you can help publish simulations, behaviors and data to H-Index, or apply for any one of our open roles at hash.ai/careers
And finally, if you’re a business decision-maker interested in learning how HASH can be applied to help you, get in touch at hash.ai/contact
We’re grateful to HASH’s early investors for their support: amazing community creators such as Stack Overflow founder Joel Spolsky, and Kaggle founder Anthony Goldbloom, as well as Ash Fontana and Lee Edwards from Zetta Venture Partners and Root Ventures. We’re excited to kick off our public mission.
Founder and CEO of HASH