Mission

We believe that most bad things in the world are the product of some form of information failure: from economic collapse and outbreaks of war and disease, to failing to choose the right life partner or university degree. We’re on a mission to improve imagination and help everybody make the right decisions, overcoming information failure once and for all.

The well-trodden path

Brilliant innovators have sought to organize the world’s information and make it accessible to all, and the next step on this journey is to make that information understandable to and usable by everybody.

While high-tech, highly funded organizations like hedge funds are able to process vast swathes of the world’s information efficiently for minute gains and millisecond edges in economic trades, the vast majority of businesses and individuals have no systematic way of parsing the wealth of signals contained in the world around them.

Simulations have the power to unlock a better world, advancing our understanding and appreciation of the environments around us. They can alert us to possibilities never considered, highlight hidden relationships between the agents within them, and force us to make explicit our assumptions about how the world works.

Not only are simulations useful cognitive tools for humans; they also have the potential to be rich, machine-readable representations of real-world problems. As such, we see simulation as a universal interface for both humans and AI — and in our view the best bet we have for growing connective tissue that bridges human and machine learning.

We hope to enable better human and automated decision-making: bringing about the rational resolution of conflicts of all kinds, reducing and eliminating market failures, and helping people lead happier, healthier lives. We don’t want to wait for this.

How Do Multi-Agent Simulations Work?

Multi-agent simulations of the sort HASH helps people create and explore work as follows…

  • Agents represent actors: be they individuals, companies, households, machinery in a factory, or anything else. Different models look at systems at differing levels of granularity; in theory, an ‘agent’ could be a molecule.
  • Agents have properties: these are values attached to agents, which vary between them. In the case of a person, a property might be a boolean like ‘is registered voter’ (Y/N), a multi-choice category like ‘party affiliation’, or a numeric value like ‘annual income’.
  • Agents exist in environments — and often multiple at once — e.g. geospatially and on a network graph.
  • Agents are driven by behaviors: behaviors are essentially code that explains how agents should interact with and react to the world around them (see the sketch below this list).
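
To make these pieces concrete, here is a minimal sketch in plain Python of an agent with properties and a behavior. It is illustrative only, not HASH’s own APIs; all names are invented.

```python
import random

# An agent is a bag of properties; here, a person in a contact network.
def make_person(person_id, infected=False):
    return {
        "id": person_id,          # identity
        "infected": infected,     # a boolean property
        "neighbors": [],          # network environment: ids of contacts
    }

# A behavior: code describing how an agent interacts with the world around it.
def spread_infection(agent, population, transmission_rate):
    if not agent["infected"]:
        return
    for neighbor_id in agent["neighbors"]:
        if random.random() < transmission_rate:
            population[neighbor_id]["infected"] = True
```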

These simulations can be constructed from the ground up, on first principles, and are useful for counterfactual “what if” hypothesis testing, enabling safe exploration of digital twins of real-world systems. For example, in epidemiology they can be used to forecast the spread of disease throughout society, and in public health to model how information spreads in response. These insights can inform better policies and more effective public-health advertising, reducing the ultimate burden of disease.
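
Continuing the illustrative sketch above, a counterfactual “what if” experiment amounts to running the same model under different assumptions and comparing outcomes. The population size, contact structure, and rates below are arbitrary:

```python
def run_epidemic(num_people, transmission_rate, steps, seed=0):
    random.seed(seed)
    population = {i: make_person(i, infected=(i == 0)) for i in range(num_people)}
    for person in population.values():
        # A random contact network: each person knows a handful of others.
        person["neighbors"] = random.sample(range(num_people), k=5)
    for _ in range(steps):
        for person in list(population.values()):
            spread_infection(person, population, transmission_rate)
    return sum(p["infected"] for p in population.values())

# The counterfactual: what if an intervention halved the transmission rate?
baseline  = run_epidemic(1000, transmission_rate=0.10, steps=20)
mitigated = run_epidemic(1000, transmission_rate=0.05, steps=20)
print(f"infected without intervention: {baseline}; with: {mitigated}")
```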

Solving Problems Traditional Data Science Can’t

A whole range of complex-systems problems defy attempts at predictive modeling. These are typically problems characterized by nonlinearity, emergence, adaptation, interdependence, and feedback loops between agents. Similarly, “black swan” events are by definition not reflected in existing patterns and historical data, and are therefore missed entirely by many traditional approaches that simply seek to detect anomalies in existing data, or to extrapolate from it into the future.

No real-world system truly exists in isolation — all are part of our complex universe — and as such all business, policy, and human problems are ultimately problems of understanding complex systems. Smart abstraction enables us to discount much of the extraneous world, most of the time, but it’s frequently hard to anticipate what might prove relevant, when, and under which set(s) of circumstances.

In some systems this doesn’t matter; but for other questions, like how we can contribute to a more stable economy or good foreign relations, getting it wrong can be a matter of life and death. To fully understand these high-impact, critical-risk problems, we need to generatively search the space around them, based on the observable dynamics of those systems. Pattern recognition and analysis of historical outputs alone are good for cheap base-casing, but provide little understanding of problems’ tails.

Because the space of all possible configurations of the world is so much greater than the historical space in which problems have actually been observed, it is sometimes tempting to write off proper scientific simulation as infeasible. But simulation, properly used, doesn’t seek to enumerate every possible version of the world that might ever occur (infinite, of course); rather, it helps its users understand which versions are likely to occur, and draws attention to novel scenarios which, due to their emergent nature, might not previously have been considered by human analysts or experts.

Crises like the 2007–08 financial crash became disasters precisely because decision-makers didn’t understand or account for the underlying dynamics of complex systems — in this case, the economy. Well-intentioned regulation such as Basel II put in place capital reserve requirements which, combined with mark-to-market accounting practices, led to asset fire sales: market participants were forced to sell into declining markets, deepening the trough.

While historical and present-day data can be used to pre-populate and backtest agent-based models (ABMs), such data isn’t required to construct them, which opens the door to explicit formal modeling in a wide variety of domains where machine learning cannot readily be applied today.
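
As a rough illustration of what backtesting and calibration can look like, here is a plain-Python grid search over one free parameter; `simulate` and `observed` are stand-ins supplied by the modeler, not HASH features:

```python
def calibrate(simulate, observed, candidate_rates):
    """Pick the parameter value whose simulated output best matches history."""
    best_rate, best_error = None, float("inf")
    for rate in candidate_rates:
        simulated = simulate(rate)  # e.g. a per-step series of model outputs
        # Sum of squared errors against the observed historical series.
        error = sum((s - o) ** 2 for s, o in zip(simulated, observed))
        if error < best_error:
            best_rate, best_error = rate, error
    return best_rate
```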

Moreover, simulations combine the benefits of formal modeling with the richness of qualitative description, making them highly explainable and easy for humans to understand. In contrast to black-box models, agent-based simulations are inherently inspectable: users can step through time to see exactly how outcomes are arrived at, and which factors contribute.
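
In code terms, that inspectability can be as simple as snapshotting the whole world after every step, so any outcome can be replayed and traced back to the interactions that produced it. The sketch below is generic plain Python; `world`, `step_fn`, and the snapshot approach are illustrative, not HASH features:

```python
import copy

def run_with_trace(world, step_fn, steps):
    """Run a simulation, recording the full world state after every step."""
    trace = [copy.deepcopy(world)]
    for _ in range(steps):
        step_fn(world)                      # apply every agent's behaviors once
        trace.append(copy.deepcopy(world))  # snapshot for later inspection
    return trace
```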

So why, then, do simulations remain so little discussed, unappreciated, and underutilized?

Problems with Multi-Agent Simulation Today

Simulations are time-consuming and costly to build, as well as expensive to maintain, run and support. They require knowledge of specialist tools, frameworks, and even obscure proprietary programming languages. The resulting simulations are often not particularly portable or repurposable, and where simulation logic is the product of conjecture or lacks calibration, it can create a false sense of confidence or security which may compound existing poor decision-making.

Although simulations claim prominent users across the worlds of supply chain, manufacturing, finance, defense, and more, market-leading agent-based modeling software packages today run north of $10k/user/year, and are based on dated technologies and paradigms which don’t lend themselves well to distributed computation at real scale. Their user interfaces haven’t been touched since the 1990s, the developer experience they offer is equally dated, they don’t run in the browser at all, they can’t be used on mobile devices, and users often need to deploy special software just to access them.

For the most part these simulations are toy models, built to showcase specific dynamics, and lack interoperability. Once built, models are siloed, with little sharing or building on the work of others. Most models are scoped down so far, to ensure they run in a timely fashion, that they capture only a fraction of the dynamics within the systems they represent. Rather than building rich virtual worlds and selectively including relevant parts on a per-experiment basis, modelers create cheap toy abstractions which fail to inspire confidence amongst users and are much less easily explored. There’s deep, justifiable skepticism as to whether toy models are truly ‘scientific’, and, on the flip side, as to whether more complex models can be appropriately calibrated and parameterized.

Throw into the mix problems finding appropriately granular agent-level data, difficulties translating domain expertise into code, and a wide range of structural barriers to creating ABMs, and it’s not hard to see why general-purpose simulation remains out of favor and rarely used in business today.

Simulation For Everybody

When faced with systemic problems, we want to build system-level solutions. HASH aims to ‘solve simulation’ by vertically integrating the entire stack, providing a unified platform for building, running, and learning from simulations.

So far we have publicly launched the first two parts of HASH’s platform for multi-agent simulation:

  1. HASH Core (hCore): a web-based developer environment and viewer for simulations.
  2. HASH Index (hIndex): a collection of simulations and modular component parts.

All HASH simulations consist of agents (represented by descriptive schemas) and behaviors (which are generally pure functions). Behaviors drive agents, while datasets can be used to instantiate or update agents within simulations based on real-world observations, or to backtest and calibrate models. Behaviors and datasets are mapped to appropriate subjects and schemas, making them easily discoverable by model-builders using hIndex, and in future cross-linkable within hCore.
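
For a flavor of what a behavior looks like in practice, the sketch below assumes the `(state, context)` behavior signature used in hCore, where `state` is the agent’s own mutable state and `context.neighbors()` returns neighboring agents; the `temperature` property and the averaging rule are invented for illustration:

```python
# A behavior as a pure function: read neighbors from the context,
# update only the agent's own state.
def behavior(state, context):
    neighbor_temps = [n["temperature"] for n in context.neighbors()]
    if neighbor_temps:
        state["temperature"] = sum(neighbor_temps) / len(neighbor_temps)
```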

Simulations, datasets and behaviors are all accessible on hIndex, and today everything within it is available free of charge. Envisaged as something of a cross between GitHub and a package manager, hIndex will in future be extended to become a marketplace as well, facilitating the purchase and sale of paid behaviors, datasets and simulations. We imagine consultancies publishing components for free to establish credibility and expertise, then selling more complete simulations and consultancy services atop.

Our future plans for hIndex involve explicit Git-like support for forking, branching, reporting issues and making pull requests — functionality that, like the use of package managers, is now second nature to most software developers.

The impact of these changes to developer workflow is significant: as hIndex grows, domain experts with limited programming knowledge will be able to fork and adapt existing behaviors, or incorporate them wholesale into their simulations, enabling them to model complex dynamics without writing vast swathes of custom code from scratch.

Our current lineup is not, however, complete.

Upcoming Plans

Although our blazing-fast HASH Engine enables simulation at unparalleled speed, it is currently only available through our hCore web-based IDE, which necessarily constrains it to the memory and CPU available to the browser tab, which in many cases is severely limited. This has meant that while hEngine is designed to handle truly world-sized simulations, our early beta users have been limited to building relatively small-scale models on our platform. This makes hCore in its current iteration comparable to something like NetLogo, the academic agent-based modeling tool: useful for illustrating the impact of heterogeneous agents within complex systems, and helpful in explaining the dynamics of those systems to users, but limited in its capacity to model real-world environments at scale or with a high degree of fidelity. Because of these current constraints, tools for running optimization experiments (parameter sweeps, Monte Carlo simulations, and more exotic reinforcement learning) have been hidden away for now — but they are very much priorities for us.
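
To give a sense of what those experiment types involve, here is a generic sketch of a parameter sweep with Monte Carlo repetition. It is plain Python with placeholder dynamics, not the forthcoming hCore experiments interface:

```python
import random
import statistics

def run_model(transmission_rate, seed):
    """Stand-in for a single simulation run, returning one outcome metric."""
    random.seed(seed)
    return random.gauss(mu=100 * transmission_rate, sigma=5)  # placeholder

# Parameter sweep: evaluate each candidate value of the parameter...
for rate in [0.05, 0.10, 0.15, 0.20]:
    # ...with Monte Carlo repetition: many seeded runs, then aggregate.
    outcomes = [run_model(rate, seed) for seed in range(100)]
    print(f"rate={rate:.2f}  "
          f"mean={statistics.mean(outcomes):.1f}  "
          f"stdev={statistics.stdev(outcomes):.1f}")
```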

You can learn more on our roadmap about our plans to unlock these features, along with simulation for everyday use in ‘real world’ decision-making.

Two more parts of the HASH platform will be released in 2020, expanding our reach from 7,500+ registered ‘hobbyist’ users to the entire open-source community, and to enterprise users specifically.

2020Q3 – HASH Cloud:

  • hCloud is the part of our platform that enables users to run simulations in the cloud with just one click, from within HASH’s existing hCore authoring and viewing interface (and, upon hEngine’s open-source launch, via the command line as well).
  • Alongside this we’ll be exposing an ‘experiments’ interface in hCore that opens the door to deriving commercial insight from simulations at scale.
  • Through hCloud, users will be able to access simulation and experiment results programmatically, to drive algorithms and applications outside of HASH.

2020Q4 – HASH Engine:

  • We’ll be open-sourcing HASH Engine, the simulation engine at the heart of HASH, later this year.
  • Written in Rust, with JavaScript and Python bindings already in existence, hEngine is the ultra-fast actor system that underpins all computation in HASH.
  • Our goal is to make the platform accessible to everybody, and enabling folks to run hEngine locally and within closed systems is a significant part of this.
  • We’re currently aiming to release a public version of hEngine under an open-source license by the end of 2020. 

We’re excited to meet users of HASH and have launched a Slack community which can be accessed via the icon in the bottom-right of any page at hash.ai — we’ll be around to help you build your models, answer your questions, and take your feature suggestions and bug reports.

To eliminate information failure, we need to build tools that have never been created before to solve problems that can’t be solved today. We need to give people superpowers, and that’s what we’re on a mission to do.

If you’d like to build a model with HASH, you can sign up at hash.ai/signup

If you want to join us on our mission of helping everybody make the right decisions, you can help publish simulations, behaviors and data to hIndex, or apply for any one of our open roles at hash.ai/careers

And finally, if you’re a business decision-maker interested in learning how HASH can be applied to help you, get in touch at hash.ai/contact

We’re grateful to HASH’s early investors for their support: in particular Stack Overflow founder Joel Spolsky, Kaggle founder Anthony Goldbloom, Ash Fontana from Zetta Venture Partners and Lee Edwards from Root Ventures.