AO's mainnet launch created a new challenge: how do you efficiently query and analyze millions of messages across the network's history? Arweave's GQL is great for retrieving raw data, but building analytics or explorers on top of it can be a battle: it requires client-side aggregation and handling the quirks of multiple protocol formats. With messages buried inside Arweave blocks, the ecosystem needed a way to surface them easily.

Atlas v0.2.0 solves this by indexing every single AO mainnet message into a queryable database, making light work of queries that used to be impossible.

Today, Atlas is the API powering Lunar - AO's only mainnet explorer - serving as critical infra for the AO ecosystem.

The first version of the Atlas explorer we announced focused mainly on fair launch project data and the related pre-bridge oracles. v0.1.0 covered two angles: the Atlas Rust framework and the explorer’s dashboard. With v0.2.0, we are focusing on a new part of the framework: network statistics and full-mainnet indexing.

What can you build with Atlas?

Atlas is to AO as the Etherscan API is to Ethereum. This opens up a lot of use cases for devs in the ecosystem:

Data analysis dashboards and explorers

Build AO block explorers (like Lunar!), transaction viewers, or debugging tools without complex logic or handling protocol format differences.

Notification systems

Watch for specific tags or message patterns in real-time as blocks are finalized. Alert users when their processes receive messages, or trigger workflows based on on-chain events.

Data pipelines and research tools

Export historical AO data for analysis or training. Run complex aggregations server-side rather than downloading and processing millions of messages locally.

Financial analysis tools

Build tax reporting tooling and per-dApp spend analysis, useful both for personal finance as a user and for ecosystem-wide revenue calculations.

Legacy (ao.TN.1) Statistics Indexer

Before taking on the features introduced for ao.N.1, let’s look at the new Legacy-related feature: the ao.TN.1 network statistics indexer.

(source: lunar.ar.io)

The API server now has support for 3 new endpoints designed specifically for AO explorer statistics:

  • GET /explorer/blocks?limit=100 - emits the last N indexed blocks.
  • GET /explorer/day?day=YYYY-MM-DD - per-block unique counts + summed-over-block totals for the given date (defaults to today).
  • GET /explorer/days?limit=N - same payload as /explorer/day, aggregated for the last N days (defaults to 7).
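As a rough illustration of how a client might address these three endpoints, the sketch below builds the request paths from typed options. The `ExplorerQuery` enum and `explorer_path` function are hypothetical names made up for this example, not part of Atlas; the HTTP client and the host are left out on purpose, since this post does not name a base URL.

```rust
/// Hypothetical query options for the legacy (ao.TN.1) explorer endpoints.
pub enum ExplorerQuery<'a> {
    /// Last `limit` indexed blocks.
    Blocks { limit: u32 },
    /// Stats for one date (`YYYY-MM-DD`); `None` leaves the server default (today).
    Day { day: Option<&'a str> },
    /// Stats for the last `limit` days; `None` leaves the server default (7).
    Days { limit: Option<u32> },
}

/// Builds the request path, including the query string, for a given query.
pub fn explorer_path(query: &ExplorerQuery) -> String {
    match query {
        ExplorerQuery::Blocks { limit } => format!("/explorer/blocks?limit={}", limit),
        ExplorerQuery::Day { day: Some(d) } => format!("/explorer/day?day={}", d),
        ExplorerQuery::Day { day: None } => "/explorer/day".to_string(),
        ExplorerQuery::Days { limit: Some(n) } => format!("/explorer/days?limit={}", n),
        ExplorerQuery::Days { limit: None } => "/explorer/days".to_string(),
    }
}
```

Omitting an optional parameter simply drops it from the path, so the documented server-side defaults apply.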

These statistics power the live metrics you see on Lunar: block times, message counts, and network activity. Without Atlas, this data would require constant polling and client-side aggregation across thousands of GQL queries.

Mainnet (ao.N.1) Indexer

Just like Atlas provides network statistics for ao.TN.1, Atlas does the same for mainnet:

  • GET /mainnet/explorer/blocks?limit=100 - emits the last N indexed blocks.
  • GET /mainnet/explorer/day?day=YYYY-MM-DD - per-block unique counts + summed-over-block totals for the given date (defaults to today).
  • GET /mainnet/explorer/days?limit=N - same payload as /mainnet/explorer/day, aggregated for the last N days (defaults to 7).

These endpoints are also used by Lunar for its mainnet statistics widgets.

However, the coolest new feature introduced specifically for ao.N.1 is indexing every single message. Yes, you heard that right: every single mainnet message, not just Arweave block summaries.

Mainnet Messages Indexer

As Arweave blocks are finalized, all AO messages contained in that block are written into a dedicated messages table, while their tags are extracted into a separate tags table. This keeps message metadata compact and makes tag-based queries fast and flexible. Indexer progress is tracked independently so ingestion can resume safely and deterministically if interrupted.
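To make the messages/tags split concrete, here is a minimal sketch of how one message could be turned into one messages-table row plus one tags-table row per tag, along with a resume helper for the progress tracking. All type names, field names, and the `resume_height` rule are assumptions for illustration; the actual Atlas schema is not shown in this post.

```rust
/// Illustrative shape of an AO message found in a finalized Arweave block.
pub struct RawMessage {
    pub id: String,
    pub block_height: u32,
    pub tags: Vec<(String, String)>,
}

/// Row for the messages table: compact metadata, no tags.
#[derive(Debug, PartialEq)]
pub struct MessageRow {
    pub id: String,
    pub block_height: u32,
}

/// Row for the tags table: one row per (message, tag) pair.
#[derive(Debug, PartialEq)]
pub struct TagRow {
    pub message_id: String,
    pub block_height: u32,
    pub key: String,
    pub value: String,
}

/// Splits a message into its messages-table row and its tags-table rows.
pub fn split_message(msg: &RawMessage) -> (MessageRow, Vec<TagRow>) {
    let message = MessageRow {
        id: msg.id.clone(),
        block_height: msg.block_height,
    };
    let tags: Vec<TagRow> = msg
        .tags
        .iter()
        .map(|(key, value)| TagRow {
            message_id: msg.id.clone(),
            block_height: msg.block_height,
            key: key.clone(),
            value: value.clone(),
        })
        .collect();
    (message, tags)
}

/// Height to resume from: the block after the last fully indexed one,
/// or the configured start height if nothing has been indexed yet.
pub fn resume_height(last_indexed: Option<u32>, start: u32) -> u32 {
    match last_indexed {
        Some(height) => height + 1,
        None => start,
    }
}
```

Keeping tags in their own rows means a tag filter only touches the tags table, and the message row stays small regardless of how many tags a message carries.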

In each Arweave block, Atlas scans for AO mainnet messages under the two data protocol tag formats currently in use, which differ in the letter case of their tag keys. Each data protocol is tracked from its “genesis” block, as shown below (internally we label them A and B):

// ao mainnet data protocols  
// mainnet txs use 2 types of tag key formats:  
// type A follows the lower-case tag key format  
// type B follows the Header-Case tag key format  
pub const DATA_PROTOCOL_A_START: u32 = 1_594_020; // Jan 22 2025  
pub const DATA_PROTOCOL_B_START: u32 = 1_616_999; // Feb 25 2025  
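One way these start heights might be used is to decide which protocol formats are worth scanning for at a given block. The sketch below assumes that each start height is inclusive and that format A stays in use after format B appears; the `DataProtocol` enum and `protocols_to_scan` function are hypothetical names, not Atlas internals.

```rust
// Start heights repeated from the snippet above so this block stands alone.
pub const DATA_PROTOCOL_A_START: u32 = 1_594_020;
pub const DATA_PROTOCOL_B_START: u32 = 1_616_999;

/// Hypothetical label for the two tag key formats.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataProtocol {
    /// lower-case tag keys
    A,
    /// Header-Case tag keys
    B,
}

/// Returns the protocol formats that can appear at `height`.
/// Assumption: start heights are inclusive and A remains active after B starts.
pub fn protocols_to_scan(height: u32) -> Vec<DataProtocol> {
    let mut active = Vec::new();
    if height >= DATA_PROTOCOL_A_START {
        active.push(DataProtocol::A);
    }
    if height >= DATA_PROTOCOL_B_START {
        active.push(DataProtocol::B);
    }
    active
}
```

Blocks below the first start height yield an empty list, so the indexer can skip them entirely.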

All data is stored in a self-hosted (bare-metal :D) ClickHouse cluster using the ReplacingMergeTree engine. This allows the indexer to reprocess blocks when needed without introducing duplicates, while still converging toward a single canonical view of mainnet data. Self-hosting also saves the four-figure monthly bill that a managed ClickHouse deployment would cost.
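The replay-without-duplicates behavior can be pictured with a small in-memory model. ReplacingMergeTree collapses rows that share a sorting key during background merges, keeping the row with the highest version (or the last inserted row on a tie); the sketch below models only that end state. The `Row` shape, the `(block_height, message_id)` key, and the `version` column are assumptions, since the actual Atlas table definition is not given here.

```rust
use std::collections::HashMap;

/// Illustrative row; the real Atlas schema may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub message_id: String,
    pub block_height: u32,
    /// Assumed version column: higher means more recent ingest.
    pub version: u64,
}

/// Models the state after merges: one row per (block_height, message_id),
/// keeping the highest version; on equal versions the later row wins.
pub fn collapse(rows: &[Row]) -> Vec<Row> {
    let mut latest: HashMap<(u32, &str), &Row> = HashMap::new();
    for row in rows {
        let key = (row.block_height, row.message_id.as_str());
        let keep_existing = latest
            .get(&key)
            .map_or(false, |existing| existing.version > row.version);
        if !keep_existing {
            latest.insert(key, row);
        }
    }
    let mut out: Vec<Row> = latest.into_values().cloned().collect();
    // Sort for a stable, readable result; HashMap iteration order is arbitrary.
    out.sort_by(|a, b| {
        a.block_height
            .cmp(&b.block_height)
            .then_with(|| a.message_id.cmp(&b.message_id))
    });
    out
}
```

Because collapsing is idempotent, replaying a block any number of times converges to the same set of rows. In ClickHouse itself the merge happens in the background, so a query may still see duplicates until parts are merged unless it deduplicates at read time.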

A glimpse into performance

Atlas runs on a self-hosted ClickHouse cluster using the ReplacingMergeTree engine, which makes full mainnet message indexing both fast and inexpensive.

High ingest throughput

Atlas comfortably handles hundreds of thousands of rows per second on modest hardware, which is enough to index every ao.N.1 message in real time without batching or lag.

Fast queries at scale

Typical block-range or tag-filtered queries scan millions of rows in milliseconds to low seconds, even as historical data grows.

Safe reindexing

ReplacingMergeTree allows blocks to be replayed cheaply and deterministically, without expensive deduplication passes or data rewrites.

Atlas Indexer REST API

In v0.2.0, the Atlas server exposes the following REST API methods. The list will keep growing, since the relational nature of the indexed data allows for many more query methods:

  • GET /mainnet/messages/recent - returns recently indexed ao mainnet messages.
  • GET /mainnet/messages/block/{height} - returns the indexed ao messages at a given Arweave block height (settled messages).
  • GET /mainnet/messages/tags?key=<TAG_NAME>&value=<TAG_VALUE>&protocol=<A|B>&limit= - (case sensitive) returns the ao messages for the given tag KV filter, and data protocol (A|B).
  • GET /mainnet/info - returns AO mainnet indexer info.
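A client could assemble these paths along the following lines. This is a sketch only: `Protocol`, `encode`, `tags_query_path`, and `block_messages_path` are hypothetical helpers, and percent-encoding the tag key and value is a general URL precaution rather than a documented Atlas requirement.

```rust
/// Hypothetical label for the `protocol` query parameter.
#[derive(Debug, Clone, Copy)]
pub enum Protocol {
    A,
    B,
}

impl Protocol {
    fn as_str(self) -> &'static str {
        match self {
            Protocol::A => "A",
            Protocol::B => "B",
        }
    }
}

/// Percent-encodes every byte except RFC 3986 unreserved characters.
fn encode(input: &str) -> String {
    let mut out = String::new();
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Path for the tag filter endpoint. Tag keys and values are case sensitive,
/// so they are passed through unchanged apart from percent-encoding.
pub fn tags_query_path(key: &str, value: &str, protocol: Protocol, limit: u32) -> String {
    format!(
        "/mainnet/messages/tags?key={}&value={}&protocol={}&limit={}",
        encode(key),
        encode(value),
        protocol.as_str(),
        limit
    )
}

/// Path for the messages settled at a given Arweave block height.
pub fn block_messages_path(height: u32) -> String {
    format!("/mainnet/messages/block/{}", height)
}
```

Since the filter is case sensitive, the key passed in should already match the format of the chosen protocol (lower-case for A, Header-Case for B).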

Atlas vs Arweave GQL (ao.N.1 queries)

Arweave GQL is used as the source of truth for AO messages data queries. Atlas exists because querying that data at scale (especially across the full lifetime of mainnet) is inefficient and complex.

With Arweave GQL

  • Queries are over blocks, not datasets
  • Results are capped (~100 txs per request)
  • Pagination is mandatory
  • Queries are network-bound
  • Aggregations happen client-side
  • Two tag formats (A | B) mean duplicated queries
  • Block-height conditionals leak into application logic

With Atlas

  • Direct tag and message queries with different conditions
  • Arbitrary block or date ranges in one request
  • Tags are normalized and indexed (A|B)
  • Aggregations are native (count, uniq, sums)
  • One query covers all historical blocks
  • Quirks of the ao message data protocols are absorbed by an ingest-time abstraction layer.
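As an illustration of what absorbing the two key formats at ingest time could look like, the sketch below classifies a tag key as lower-case (type A) or Header-Case (type B) and derives a case-insensitive lookup key. This is an assumption about one possible approach, not a description of Atlas's actual ingest logic; `KeyFormat`, `key_format`, and `canonical_key` are hypothetical names.

```rust
/// Hypothetical classification of a tag key's letter case.
#[derive(Debug, PartialEq, Eq)]
pub enum KeyFormat {
    /// No upper-case ASCII letters, e.g. "data-protocol" (type A style).
    Lower,
    /// Every hyphen-separated segment starts upper-case, e.g. "Data-Protocol" (type B style).
    Header,
    /// Anything else, including empty or mixed keys.
    Other,
}

/// Classifies a tag key by the case of its hyphen-separated segments.
pub fn key_format(key: &str) -> KeyFormat {
    if key.is_empty() {
        return KeyFormat::Other;
    }
    if !key.chars().any(|c| c.is_ascii_uppercase()) {
        return KeyFormat::Lower;
    }
    let header_case = key.split('-').all(|segment| {
        segment
            .chars()
            .next()
            .map_or(false, |c| c.is_ascii_uppercase())
    });
    if header_case {
        KeyFormat::Header
    } else {
        KeyFormat::Other
    }
}

/// Lower-cased form of a key, usable as a format-independent lookup key.
pub fn canonical_key(key: &str) -> String {
    key.to_ascii_lowercase()
}
```

With something like this applied once per tag during ingestion, downstream queries would not need to know which format a given message used.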

Concrete example

“Scan ao.N.1 messages for a tag across mainnet history”

  • GQL: two queries per block (A + B), paginated, merged client-side
  • Atlas: one indexed query
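To get a feel for the difference, the GQL side of this comparison can be estimated with simple arithmetic. The sketch below assumes one query per protocol format per block, a page size of 100 results (matching the approximate cap mentioned earlier), and at least one request per query even when it returns nothing. The function names and input shape are hypothetical, and the counts in the usage note are made-up inputs, not measurements.

```rust
/// Assumed GQL page size, per the ~100 txs per request cap noted above.
pub const GQL_PAGE_SIZE: u64 = 100;

/// Requests needed to page through `matches` results for one query.
/// Even an empty result costs one request.
fn pages(matches: u64) -> u64 {
    std::cmp::max(1, (matches + GQL_PAGE_SIZE - 1) / GQL_PAGE_SIZE)
}

/// Estimated GQL request count for a scan, given per-block match counts
/// as (format A matches, format B matches).
pub fn gql_requests(per_block_matches: &[(u64, u64)]) -> u64 {
    per_block_matches
        .iter()
        .map(|&(a, b)| pages(a) + pages(b))
        .sum()
}
```

For example, three blocks with (0, 0), (100, 101), and (250, 10) matches would take 2 + 3 + 4 = 9 GQL requests under these assumptions, while the Atlas side stays at a single indexed query.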

Why does this matter?

GQL is excellent for retrieving and verifying raw data (Arweave blocks -> ao messages dataitems).

Atlas is built for asking questions of that data with an abstracted interface: repeatedly, cheaply, and without protocol-specific logic.

What’s next?

As we can see, Atlas is steadily evolving into an open source data API for AO, as well as an FLP data indexing and analytics framework, serving data utilities to the permaweb ecosystem at scale. Next, we will keep maintaining the hosted version of Atlas, keep its indexer in sync with Arweave’s block height tip, add more REST API query methods, and work towards further UI integrations for richer data visualization. Stay tuned for more!