How to run a data availability server (DAS)
Description
The Data Availability Server, daserver
, allows storage and retrieval of transaction data batches for Arbitrum AnyTrust chains. It can be run in two modes: either committee member, or mirror.
Committee members accept time-limited requests to store data batches from an Arbitrum AnyTrust sequencer, and if they store the data then they return a signed certificate promising to store that data. Committee members and mirrors both respond to requests to retrieve the data batches.
The data batches are content addressed with a keccak256 tree-based hashing scheme called dastree
. The hash is part of the Data Availability Certificate placed on L1 and that hash is used by the Nitro node to retrieve the data from daservers
.
Mirrors exist to replicate and serve the data to provide resiliency to the network in the case of committee members going down, and to make it so committee members don't need to serve requests for the data directly. Mirrors may also provide archived data beyond the limited time that committee members are required to store the data.
This document gives sample configurations for daserver
in committee member and mirror mode.
Interfaces
There are two main interfaces, a REST interface supporting only GET operations and intended for public use, and an RPC interface intended for use only by the AnyTrust sequencer. Mirrors listen on the REST interface only and respond to queries on /get-by-hash/<hex encoded data hash>
. The response is always the same for a given hash so it is cacheable; it contains a cache-control
header specifying the object is immutable and to cache for up to 28 days. The REST interface has a health check on /health
which will return 200 if the underlying storage is working, otherwise 503.
Committee members listen on the REST interface and additionally listen on the RPC interface for das_store
RPC messages from the sequencer. The sequencer signs its requests and the committee member checks the signature. The RPC interface also has a health check that checks the underlying storage that responds requests with RPC method das_healthCheck
.
IPFS is an alternative interface serving batch retrieval. A mirror can be configured to sync and pin batches to its local IPFS repository, then act as a node in the IPFS peer-to-peer network. A Nitro node that is configured to use IPFS that is syncing an AnyTrust chain will use the batch hashes from L1 to find the batch data on the IPFS peer-to-peer network. Depending on network configuration, that Nitro node may then also act as an IPFS node serving the batch data.
Storage
daserver
can be configured to use one or more of four storage backends; S3, files on local disk, database on disk, and IPFS. If more than one is selected, store requests must succeed to all of them for it to be considered successful, and retrieve requests only require one to succeed.
Please give us feedback if there are other storage backends you would like supported.
Caching
An in-memory cache can be enabled to avoid needing to access underlying storage for retrieve requests.
Synchronizing state
daserver
also has an optional REST aggregator which, in the case that a data batch is not found in cache or storage, queries for that batch from a list other of REST servers, and then stores that batch locally. This is how committee members that miss storing a batch (not all committee members are required by the AnyTrust protocol to report success in order to post the batch's certificate to L1) can automatically repair gaps in data they store, and how mirrors can sync. A public list of REST endpoints is published online, which daserver
can be configured to download and use, and additional endpoints can be specified in configuration.
Image:
offchainlabs/nitro-node:v2.1.1-e9d8842
Usage of daserver
Options for both committee members and mirrors:
# Server options
--enable-rest enable the REST server listening on rest-addr and rest-port
--log-level int log level; 1: ERROR, 2: WARN, 3: INFO, 4: DEBUG, 5: TRACE (default 3)
--rest-addr string REST server listening interface (default "localhost")
--rest-port uint REST server listening port (default 9877)
# L1 options
--data-availability.l1-node-url string URL for L1 node, only used in standalone daserver; when running as part of a node that node's L1 configuration is used
--data-availability.sequencer-inbox-address string L1 address of SequencerInbox contract
# Storage options
--data-availability.local-db-storage.data-dir string directory in which to store the database
--data-availability.local-db-storage.discard-after-timeout discard data after its expiry timeout
--data-availability.local-db-storage.enable enable storage/retrieval of sequencer batch data from a database on the local filesystem
--data-availability.local-file-storage.data-dir string local data directory
--data-availability.local-file-storage.enable enable storage/retrieval of sequencer batch data from a directory of files, one per batch
--data-availability.s3-storage.access-key string S3 access key
--data-availability.s3-storage.bucket string S3 bucket
--data-availability.s3-storage.discard-after-timeout discard data after its expiry timeout
--data-availability.s3-storage.enable enable storage/retrieval of sequencer batch data from an AWS S3 bucket
--data-availability.s3-storage.object-prefix string prefix to add to S3 objects
--data-availability.s3-storage.region string S3 region
--data-availability.s3-storage.secret-key string S3 secret key
--data-availability.ipfs-storage.enable enable storage/retrieval of sequencer batch data from IPFS
--data-availability.ipfs-storage.profiles string comma separated list of IPFS profiles to use
--data-availability.ipfs-storage.read-timeout duration timeout for IPFS reads, since by default it will wait forever. Treat timeout as not found (default 1m0s)
# Cache options
--data-availability.local-cache.enable Enable local in-memory caching of sequencer batch data
--data-availability.local-cache.expiration duration Expiration time for in-memory cached sequencer batches (default 1h0m0s)
# REST fallback options
--data-availability.rest-aggregator.enable enable retrieval of sequencer batch data from a list of remote REST endpoints; if other DAS storage types are enabled, this mode is used as a fallback
--data-availability.rest-aggregator.online-url-list string a URL to a list of URLs of REST das endpoints that is checked at startup; additive with the url option
--data-availability.rest-aggregator.urls strings list of URLs including 'http://' or 'https://' prefixes and port numbers to REST DAS endpoints; additive with the online-url-list option
--data-availability.rest-aggregator.sync-to-storage.check-already-exists check if the data already exists in this DAS's storage. Must be disabled for fast sync with an IPFS backend (default true)
--data-availability.rest-aggregator.sync-to-storage.eager eagerly sync batch data to this DAS's storage from the rest endpoints, using L1 as the index of batch data hashes; otherwise only sync lazily
--data-availability.rest-aggregator.sync-to-storage.eager-lower-bound-block uint when eagerly syncing, start indexing forward from this L1 block. Only used if there is no sync state
--data-availability.rest-aggregator.sync-to-storage.retention-period duration period to retain synced data (defaults to forever) (default 2562047h47m16.854775807s)
--data-availability.rest-aggregator.sync-to-storage.state-dir string directory to store the sync state in, ie the block number currently synced up to, so that we don't sync from scratch each time
Options only for committee members:
--enable-rpc enable the HTTP-RPC server listening on rpc-addr and rpc-port
--rpc-addr string HTTP-RPC server listening interface (default "localhost")
--rpc-port uint HTTP-RPC server listening port (default 9876)
--data-availability.key.key-dir string the directory to read the bls keypair ('das_bls.pub' and 'das_bls') from; if using any of the DAS storage types exactly one of key-dir or priv-key must be specified
--data-availability.key.priv-key string the base64 BLS private key to use for signing DAS certificates; if using any of the DAS storage types exactly one of key-dir or priv-key must be specified
Options generating/using JSON config:
--conf.dump print out currently active configuration file
--conf.file strings name of configuration file
Options for producing Prometheus metrics:
--metrics enable metrics
--metrics-server.addr string metrics server address (default "127.0.0.1")
--metrics-server.port int metrics server port (default 6070)
--metrics-server.update-interval duration metrics server update interval (default 3s)
Some options are not shown because they are only used by nodes, or they are experimental/advanced. A complete list of options can be found by running daserver --help
Sample deployments
Sample committee member
Using daserver
as a committee member requires:
- A BLS private key to sign the Data Availability Certificates it returns to clients (the sequencer aka batch poster) requesting to Store data.
- The Ethereum L1 address of the sequencer inbox contract, in order to find the batch poster signing address.
- An Ethereum L1 RPC endpoint to query the sequencer inbox contract.
- A persistent volume to write the stored data to if using one of the local disk modes.
- A S3 bucket, and credentials (secret key, access key) of an IAM user that is able to read and write from it if you are using the S3 mode.
Once the DAS is set up, the local public key in das_bls.pub
should be communicated out-of-band to the operator of the chain, along with a protocol (http/https), host, and port of the RPC server that can be reached by the sequencer, so that it can be added to the committee keyset.
Set up persistent volume
This is the persistent volume for storing the DAS database and BLS keypair.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: das-server
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 200Gi
storageClassName: gp2
Generate key
The BLS keypair must be generated using the datool keygen
utility. It can be passed to the dasever
executable by file or on the command line.
In this sample deployment we use a k8s deployment to run datool keygen
to create it as a file on the volume that the DAS will use. After this deployment has run once, the deployment can be torn down and deleted.
apiVersion: apps/v1
kind: Deployment
metadata:
name: das-server
spec:
replicas: 1
selector:
matchLabels:
app: das-server
template:
metadata:
labels:
app: das-server
spec:
containers:
- command:
- bash
- -c
- |
mkdir -p /home/user/data/keys
/usr/local/bin/datool keygen --dir /home/user/data/keys
sleep infinity
image: offchainlabs/nitro-node:v2.1.1-e9d8842
imagePullPolicy: Always
resources:
limits:
cpu: "4"
memory: 10Gi
requests:
cpu: "4"
memory: 10Gi
ports:
- containerPort: 9876
protocol: TCP
volumeMounts:
- mountPath: /home/user/data/
name: data
volumes:
- name: data
persistentVolumeClaim:
claimName: das-server
Create committee member DAS deployment
This deployment sets up a DAS server using the Arbitrum Nova Mainnet. It uses the L1 inbox contract at 0x211e1c4c7f1bf5351ac850ed10fd68cffcf6c21b. For the Arbitrum Nova Mainnet you must specify a Mainnet Ethereum L1 RPC endpoint.
This configuration sets up two storage types. To disable any of them, remove the --data-availability.(local-db-storage|s3-storage).enable option, and the other options for that storage type can also be removed. If updating an existing deployment from that is using the local files on disk storage type, you should use at least local-file-storage
. It sets the storage backends to discard the data after timeout.
apiVersion: apps/v1
kind: Deployment
metadata:
name: das-server
spec:
replicas: 1
selector:
matchLabels:
app: das-server
strategy:
rollingUpdate:
maxSurge: 0
maxUnavailable: 50%
type: RollingUpdate
template:
metadata:
labels:
app: das-server
spec:
containers:
- command:
- bash
- -c
- |
mkdir -p /home/user/data/badgerdb
/usr/local/bin/daserver --data-availability.l1-node-url <YOUR ETHEREUM L1 RPC ENDPOINT>
--enable-rpc --rpc-addr '0.0.0.0' --enable-rest --rest-addr '0.0.0.0' --log-level 3 --data-availability.local-db-storage.enable --data-availability.local-db-storage.data-dir /home/user/data/badgerdb --data-availability.local-db-storage.discard-after-timeout --data-availability.s3-storage.enable --data-availability.s3-storage.access-key "<YOUR ACCESS KEY>" --data-availability.s3-storage.bucket <YOUR BUCKET> --data-availability.s3-storage.region <YOUR REGION> --data-availability.s3-storage.secret-key "<YOUR SECRET KEY>" --data-availability.s3-storage.object-prefix "YOUR OBJECT KEY PREFIX/" --data-availability.s3-storage.discard-after-timeout --data-availability.key.key-dir /home/user/data/keys --data-availability.local-cache.enable --data-availability.rest-aggregator.enable --data-availability.rest-aggregator.online-url-list "https://nova.arbitrum.io/das-servers" --data-availability.sequencer-inbox-address '0x211e1c4c7f1bf5351ac850ed10fd68cffcf6c21b'
image: offchainlabs/nitro-node:v2.1.1-e9d8842
imagePullPolicy: Always
resources:
limits:
cpu: "4"
memory: 10Gi
requests:
cpu: "4"
memory: 10Gi
ports:
- containerPort: 9876
hostPort: 9876
protocol: TCP
- containerPort: 9877
hostPort: 9877
protocol: TCP
volumeMounts:
- mountPath: /home/user/data/
name: data
readinessProbe:
failureThreshold: 3
httpGet:
path: /health/
port: 9877
scheme: HTTP
initialDelaySeconds: 5
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 1
volumes:
- name: data
persistentVolumeClaim:
claimName: das-server
Sample mirror
Using daserver
as a mirror requires:
- The Ethereum L1 address of the sequencer inbox contract, for syncing all batch data.
- An Ethereum L1 RPC endpoint to query the sequencer inbox contract.
- A persistent volume to write the stored data to if using one of the local disk modes.
- A S3 bucket, and credentials (secret key, access key) of an IAM user that is able to read and write from it if you are using the S3 mode.
The mirror does not require a BLS key since it will not be accepting store requests from the sequencer.
Once the mirror is set up, please communicate a URL to reach it to the chain operator so they can add it to the public mirror list.
Set up persistent volume
This is the persistent volume for storing the DAS database.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: das-mirror
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 200Gi
storageClassName: gp2
Create mirror DAS deployment
This deployment sets up a DAS server using the Arbitrum Nova Mainnet. It uses the L1 inbox contract at 0x211e1c4c7f1bf5351ac850ed10fd68cffcf6c21b. For the Arbitrum Nova Mainnet you must specify a Mainnet Ethereum L1 RPC endpoint.
This configuration sets up two storage types. To disable any of them, remove the --data-availability.(local-file-storage|s3-storage).enable option, and the other options for that storage type can also be removed. It sets the storage backends to keep all data forever.
apiVersion: apps/v1
kind: Deployment
metadata:
name: das-mirror
spec:
replicas: 1
selector:
matchLabels:
app: das-mirror
strategy:
rollingUpdate:
maxSurge: 0
maxUnavailable: 50%
type: RollingUpdate
template:
metadata:
labels:
app: das-mirror
spec:
containers:
- command:
- bash
- -c
- |
mkdir -p /home/user/data/badgerdb
mkdir -p /home/user/data/syncState
/usr/local/bin/daserver --data-availability.l1-node-url <YOUR ETHEREUM L1 RPC ENDPOINT>
--enable-rest --rest-addr '0.0.0.0' --log-level 3 --data-availability.local-db-storage.enable --data-availability.local-db-storage.data-dir /home/user/data/badgerdb --data-availability.s3-storage.enable --data-availability.s3-storage.access-key "<YOUR ACCESS KEY>" --data-availability.s3-storage.bucket <YOUR BUCKET> --data-availability.s3-storage.region <YOUR REGION> --data-availability.s3-storage.secret-key "<YOUR SECRET KEY>" --data-availability.s3-storage.object-prefix "YOUR OBJECT KEY PREFIX/" --data-availability.local-cache.enable --data-availability.rest-aggregator.enable --data-availability.rest-aggregator.urls "http://your-committee-member.svc.cluster.local:9877" --data-availability.rest-aggregator.online-url-list "https://nova.arbitrum.io/das-servers" --data-availability.rest-aggregator.sync-to-storage.eager --data-availability.rest-aggregator.sync-to-storage.eager-lower-bound-block 15025611 --data-availability.sequencer-inbox-address '0x211e1c4c7f1bf5351ac850ed10fd68cffcf6c21b' --data-availability.rest-aggregator.sync-to-storage.state-dir /home/user/data/syncState
image: offchainlabs/nitro-node:v2.1.1-e9d8842
imagePullPolicy: Always
resources:
limits:
cpu: "4"
memory: 10Gi
requests:
cpu: "4"
memory: 10Gi
ports:
- containerPort: 9877
hostPort: 9877
protocol: TCP
volumeMounts:
- mountPath: /home/user/data/
name: data
readinessProbe:
failureThreshold: 3
httpGet:
path: /health/
port: 9877
scheme: HTTP
initialDelaySeconds: 5
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 1
volumes:
- name: data
persistentVolumeClaim:
claimName: das-mirror
Create IPFS mirror DAS deployment
This deployment sets up a DAS server using the Arbitrum Nova Mainnet. It uses the L1 inbox contract at 0x211e1c4c7f1bf5351ac850ed10fd68cffcf6c21b. For the Arbitrum Nova Mainnet you must specify a Mainnet Ethereum L1 RPC endpoint.
This configuration sets up the daserver
as an IPFS node. Port 4001 for the server should be exposed to the internet for IPFS p2p communication to work.
If this is the first IPFS mirror set up, it will take a very long time to sync due to trying to find the data in IPFS first. Add this configuration option to skip this step and sync from the REST endpoints: --data-availability.rest-aggregator.sync-to-storage.check-already-exists=false
apiVersion: apps/v1
kind: Deployment
metadata:
name: das-mirror
spec:
replicas: 1
selector:
matchLabels:
app: das-mirror
strategy:
rollingUpdate:
maxSurge: 0
maxUnavailable: 50%
type: RollingUpdate
template:
metadata:
labels:
app: das-mirror
spec:
containers:
- command:
- bash
- -c
- |
mkdir -p /home/user/data/ipfsRepo
mkdir -p /home/user/data/syncState
/usr/local/bin/daserver --data-availability.l1-node-url <YOUR ETHEREUM L1 RPC ENDPOINT> --enable-rest --rest-addr '0.0.0.0' --log-level 3 --data-availability.ipfs-storage.enable --data-availability.ipfs-storage.repo-dir /home/user/data/ipfsRepo --data-availability.rest-aggregator.enable --data-availability.rest-aggregator.urls "http://your-committee-member.svc.cluster.local:9877" --data-availability.rest-aggregator.online-url-list "https://nova.arbitrum.io/das-servers" --data-availability.rest-aggregator.sync-to-storage.eager --data-availability.rest-aggregator.sync-to-storage.eager-lower-bound-block 15025611 --data-availability.sequencer-inbox-address '0x211e1c4c7f1bf5351ac850ed10fd68cffcf6c21b' --data-availability.rest-aggregator.sync-to-storage.state-dir /home/user/data/syncState
image: offchainlabs/nitro-node:v2.1.1-e9d8842
imagePullPolicy: Always
resources:
limits:
cpu: "4"
memory: 10Gi
requests:
cpu: "4"
memory: 10Gi
ports:
- containerPort: 9877
hostPort: 9877
protocol: TCP
- containerPort: 4001
hostPort: 4001
protocol: TCP
volumeMounts:
- mountPath: /home/user/data/
name: data
readinessProbe:
failureThreshold: 3
httpGet:
path: /health/
port: 9877
scheme: HTTP
initialDelaySeconds: 5
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 1
volumes:
- name: data
persistentVolumeClaim:
claimName: das-mirror
Testing
Basic validation: Health check data is present
In the docker image there is the datool
utility that can be used to Store and Retrieve messages from a DAS. We will take advantage of a data hash that will always be present if the health check is enabled.
From the pod:
$ /usr/local/bin/datool client rest getbyhash --url http://localhost:9877 --data-hash 0x8b248e2bd8f75bf1334fe7f0da75cc7c1a34e00e00a22a96b7a43d580d250f3d
Message: Test-Data
If you do not have the health check configured yet, you can trigger one manually as follows:
$ curl http://localhost:9877/health
Using curl to check the REST endpoint
$ curl https://<HOST>/<PATH>/get-by-hash/8b248e2bd8f75bf1334fe7f0da75cc7c1a34e00e00a22a96b7a43d580d250f3d
{"data":"VGVzdC1EYXRh"}
Further validation: Using store interface directly
The Store interface of daserver
validates that requests to store data are signed by the Batch Poster's ECDSA key, identified via a call to the Sequencer Inbox contract on L1. It can also be configured to accept Store requests signed with another ECDSA key of your choosing. This could be useful for running load tests, canaries, or troubleshooting your own infrastructure. Using this facility, a load test could be constructed by writing a script to store arbitrary amounts of data at an arbitrary rate; a canary could be constructed to store and retrieve data on some interval.
Generate an ECDSA keypair:
$ /usr/local/bin/datool keygen --dir /dir-of-your-choice/ --ecdsa
Then add the following configuration option to daserver
:
--data-availability.extra-signature-checking-public-key /dir-of-your-choice/ecdsa.pub
OR
--data-availability.extra-signature-checking-public-key 0x<contents of ecdsa.pub>
Now you can use the datool
utility to send Store requests signed with the ecdsa private key:
$ /usr/local/bin/datool rpc store --url http://localhost:9876 --message "Hello world" --signing-key /tmp/ecdsatest/ecdsa
OR
$ /usr/local/bin/datool client rpc store --url http://localhost:9876 --message "Hello world" --signing-key "0x<contents of ecdsa>"
The above command outputs the Hex Encoded Data Hash:
which can be used to retrieve the data:
$ /usr/local/bin/datool client rest getbyhash --url http://localhost:9877 --data-hash 0x052cca0e379137c975c966bcc69ac8237ac38dc1fcf21ac9a6524c87a2aab423
Message: Hello world
The retention period defaults to 24h but can be configured for datool client rpc store
with the option:
--das-retention-period
Deployment recommendations
The REST interface is cacheable, consider using a CDN or caching proxy in front of your REST endpoint.
If you are running a mirror, the REST interface on your committee member does not have to be exposed publicly. Your mirrors can sync on your private network from the REST interface of your committee member and other public mirrors.
Metrics
If metrics are enabled in configuration, then several useful metrics are available at the configured port (default 6070), at path debug/metrics
or debug/metrics/prometheus
.
Metric | Description |
---|---|
arb_das_rest_getbyhash_requests | Count of REST GetByHash calls |
arb_das_rest_getbyhash_success | Successful REST GetByHash calls |
arb_das_rest_getbyhash_failure | Failed REST GetByHash calls |
arb_das_rest_getbyhash_bytes | Bytes retrieved with REST GetByHash calls |
arb_das_rest_getbyhash_duration (p50, p75, p95, p99, p999, p9999) | Duration of REST GetByHash calls (ns) |
arb_das_rpc_store_requests | Count of RPC Store calls |
arb_das_rpc_store_success | Successful RPC Store calls |
arb_das_rpc_store_failure | Failed RPC Store calls |
arb_das_rpc_store_bytes | Bytes retrieved with RPC Store calls |
arb_das_rpc_store_duration (p50, p75, p95, p99, p999, p9999) | Duration of RPC Store calls (ns) |