Deploy Akash Provider with kubeadm, containerd, gvisor
This write-up walks you through the Akash Provider deployment using kubeadm
12th of July 2021: originally published for Akash 0.12.0.
30th of October 2021: updated for Akash 0.14.0 and a multi-master/worker node setup. Also added HA support through very simplistic round-robin DNS A records.
5th of December 2021: updated for Akash 0.14.1, which brings important provider updates.
This write-up walks you through the necessary configuration & setup steps required to run the Akash Provider on your own Linux distro. (I used x86_64 Ubuntu Focal.)
The steps to register and activate Akash Provider are also included.
We are going to be using containerd, so there is no need to install Docker!
I have not used kubespray as the official documentation suggests, because I want more control over every gear in the system and I do not want to install Docker.
Preparation
Hostname
Set hostname to something meaningful:
hostnamectl set-hostname akash-single.domainXYZ.com
If you are going to deploy the recommended multi-master setup with 3 master (control plane) nodes and N worker nodes, you can choose the following hostnames:
akash-master-01.domainXYZ.com
akash-master-02.domainXYZ.com
akash-master-03.domainXYZ.com
akash-worker-01.domainXYZ.com
...
akash-worker-NN.domainXYZ.com
In the examples below I've used *.ingress.nixaid.com, the actual address I am using on my Akash provider. You will want to replace it with your own domain name.
Enable netfilter and kernel IP forwarding (routing)
kube-proxy needs net.bridge.bridge-nf-call-iptables enabled
cat <<EOF | tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
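Optionally, you can double-check that the module is loaded and the sysctls took effect:
lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward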
Swap file
It is recommended to disable and remove the swap file.
swapon -s
swapoff -a
sed -i '/swap/d' /etc/fstab
rm /swapfile
Better entropy, better performance
Low entropy is bad and can impact any process relying on it!
To improve performance, install haveged or rng-tools package.
apt -y install haveged
Back in the day, I remember seeing the dockerd daemon stall and stop responding to commands such as docker logs and docker-compose up/logs. I figured that was because the server did not have enough entropy; you can check it by running the cat /proc/sys/kernel/random/entropy_avail command.
Install containerd
wget https://github.com/containerd/containerd/releases/download/v1.5.7/containerd-1.5.7-linux-amd64.tar.gz
tar xvf containerd-1.5.7-linux-amd64.tar.gz -C /usr/local/
wget -O /etc/systemd/system/containerd.service https://raw.githubusercontent.com/containerd/containerd/v1.5.7/containerd.service
mkdir /etc/containerd
systemctl daemon-reload
systemctl start containerd
systemctl enable containerd
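A quick sanity check that containerd is up and responding (optional):
ctr version
systemctl status containerd --no-pager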
Install CNI plugins
Container Network Interface (CNI) plugins - required for most pod networks.
cd
mkdir -p /etc/cni/net.d /opt/cni/bin
CNI_ARCH=amd64
CNI_VERSION=1.0.1
CNI_ARCHIVE=cni-plugins-linux-${CNI_ARCH}-v${CNI_VERSION}.tgz
wget https://github.com/containernetworking/plugins/releases/download/v${CNI_VERSION}/${CNI_ARCHIVE}
tar -xzf $CNI_ARCHIVE -C /opt/cni/bin
Install crictl
crictl is a CLI for the Kubelet Container Runtime Interface (CRI) - required by kubeadm and kubelet.
INSTALL_DIR=/usr/local/bin
mkdir -p $INSTALL_DIR
CRICTL_VERSION="v1.22.0"
CRICTL_ARCHIVE="crictl-${CRICTL_VERSION}-linux-amd64.tar.gz"
wget "https://github.com/kubernetes-sigs/cri-tools/releases/download/${CRICTL_VERSION}/${CRICTL_ARCHIVE}"
tar -xzf $CRICTL_ARCHIVE -C $INSTALL_DIR
chown -Rh root:root $INSTALL_DIR
Update /etc/crictl.yaml with the following lines:
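A typical /etc/crictl.yaml pointing crictl at the containerd socket (assuming the default socket path) looks like this:
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 30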
Install runc
runc is the default OCI runtime used by non-Akash deployments (i.e. standard Kubernetes containers such as the kube, etcd, and Calico pods).
apt install -y runc
(Only on workers) Install gVisor (runsc) and runc
gVisor (runsc) is an application kernel for containers that provides efficient defense-in-depth anywhere.
See the container runtimes comparison here.
apt -y install software-properties-common
curl -fsSL https://gvisor.dev/archive.key | apt-key add -
add-apt-repository "deb [arch=amd64,arm64] https://storage.googleapis.com/gvisor/releases release main"
apt update
apt install -y runsc
Configure containerd to use gVisor
Now that Kubernetes is going to use containerd (you will see this later, when we bootstrap it using kubeadm), we need to configure containerd to use the gVisor runtime.
Remove the "runsc" runtime (the last two lines) on NoSchedule'able master nodes.
Update /etc/containerd/config.toml:
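A minimal sketch of the relevant part, based on the gVisor containerd quick start (you can also start from containerd config default and merge this in); the last two lines are the runsc runtime mentioned above, which the non-scheduling master nodes don't need:
version = 2
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runsc]
  runtime_type = "io.containerd.runsc.v1"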
And restart containerd service:
systemctl restart containerd
gVisor (runsc) does not work with systemd-cgroup or cgroup v2 yet; there are two open issues if you wish to follow them up:
systemd-cgroup support #193
Support cgroup v2 in runsc #3481
Install Kubernetes
Install the latest stable kubeadm, kubelet, and kubectl, and add a kubelet systemd service
INSTALL_DIR=/usr/local/bin
RELEASE="$(curl -sSL https://dl.k8s.io/release/stable.txt)"
cd $INSTALL_DIR
curl -L --remote-name-all https://storage.googleapis.com/kubernetes-release/release/${RELEASE}/bin/linux/amd64/{kubeadm,kubelet,kubectl}
chmod +x {kubeadm,kubelet,kubectl}
RELEASE_VERSION="v0.9.0"
curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubelet/lib/systemd/system/kubelet.service" | sed "s:/usr/bin:${INSTALL_DIR}:g" | tee /etc/systemd/system/kubelet.service
mkdir -p /etc/systemd/system/kubelet.service.d
curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubeadm/10-kubeadm.conf" | sed "s:/usr/bin:${INSTALL_DIR}:g" | tee /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
cd
systemctl enable kubelet
Deploy Kubernetes cluster using kubeadm
Feel free to adjust podSubnet & serviceSubnet and other control plane configuration to your needs.
For more flags refer to https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/control-plane-flags/
Make sure to set "kubernetesVersion" to the version you have downloaded the binaries for (https://dl.k8s.io/release/stable.txt)
You need to run kubeadm init on 1 master only!
You will use kubeadm join to join the other master (control plane) nodes and worker nodes later.
Uncomment controlPlaneEndpoint for a multi-master deployment or if you plan to scale your master nodes.
Set controlPlaneEndpoint to the same value you have set --cluster-public-hostname to. That hostname should resolve to the public IP of the Kubernetes cluster.
Pro-tip: you can register the same DNS A record multiple times, pointing to multiple Akash master nodes, and then set controlPlaneEndpoint to that DNS A record so you get DNS round-robin balancing out of the box (see the example records below)! ;)
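For example (hypothetical records; replace with your own masters' public IPs):
akash-master-lb.domainXYZ.com.  300  IN  A  203.0.113.11
akash-master-lb.domainXYZ.com.  300  IN  A  203.0.113.12
akash-master-lb.domainXYZ.com.  300  IN  A  203.0.113.13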
cat > kubeadm-config.yaml << EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: cgroupfs
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
criSocket: unix:///run/containerd/containerd.sock # --cri-socket=unix:///run/containerd/containerd.sock
##kubeletExtraArgs:
##root-dir: /mnt/data/kubelet
imagePullPolicy: "Always"
localAPIEndpoint:
advertiseAddress: "0.0.0.0"
bindPort: 6443
---
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
kubernetesVersion: "stable"
#controlPlaneEndpoint: "akash-master-lb.domainXYZ.com:6443"
networking:
podSubnet: "10.233.64.0/18" # --pod-network-cidr, taken from kubespray
serviceSubnet: "10.233.0.0/18" # --service-cidr, taken from kubespray
EOF
Download necessary dependencies to your master (control plane) node, run preflight check and pre-pull the images:
apt -y install ethtool socat conntrack
kubeadm init phase preflight --config kubeadm-config.yaml
kubeadm config images pull --config kubeadm-config.yaml
Now, if you are ready to initialize a single-node configuration (where you have only a single master node, which will also be running pods):
kubeadm init --config kubeadm-config.yaml
If you are planning to run a multi-master deployment, make sure to add --upload-certs
to the kubeadm init command as follows:
kubeadm init --config kubeadm-config.yaml --upload-certs
You can always run kubeadm init phase upload-certs --upload-certs --config kubeadm-config.yaml followed by kubeadm token create --config kubeadm-config.yaml --print-join-command anytime later to get your kubeadm join command!
You do not need to run the upload-certs phase for a single-node master deployment.
If you see the "Your Kubernetes control-plane has initialized successfully!" message, then everything went well and you now have your Kubernetes control-plane node at your service!
You will also see kubeadm output a kubeadm join command with a --token. Keep it safe, as this command is required for joining more nodes (worker nodes, data nodes) later, depending on what type of architecture you want.
With a multi-master deployment, you will see a kubeadm join command with additional --control-plane --certificate-key arguments! Make sure to use them when joining more master nodes to your cluster!
Check your nodes
You can either set the KUBECONFIG variable whenever you want to talk to your Kubernetes cluster using the kubectl command, OR make a symlink to ~/.kube/config.
Keep your /etc/kubernetes/admin.conf safe as it's your Kubernetes admin key, which lets you do everything with your K8s cluster. That config will also be used by the Akash Provider service, as you will see later.
(Multi-master deployments) Newly joined master nodes will automatically receive the admin.conf file from the source master node. So more backups for you! ;)
mkdir ~/.kube
ln -sv /etc/kubernetes/admin.conf ~/.kube/config
kubectl get nodes -o wide
Install Calico network
Kubernetes is of no use without a network plugin:
# kubectl describe node akash-single.domainXYZ.com | grep -w Ready
Ready False Wed, 28 Jul 2021 09:47:09 +0000 Wed, 28 Jul 2021 09:46:52 +0000 KubeletNotReady container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
cd
curl https://docs.projectcalico.org/manifests/calico.yaml -O
kubectl apply -f calico.yaml
(Optional) Allow the master node to schedule Pods
By default, your K8s cluster will not schedule Pods on the control-plane (master) node for security reasons. Either remove the taints on the master so that you can schedule pods on it using the kubectl taint nodes command, OR use kubeadm join to join worker nodes which will run Calico (but first make sure to perform the preparation steps on them: install the CNI plugins, install crictl, and configure containerd to use gVisor).
Remove the taints on the master if you are running a single-master deployment:
# kubectl describe node akash-single.domainXYZ.com |grep Taints
Taints: node-role.kubernetes.io/master:NoSchedule
# kubectl taint nodes --all node-role.kubernetes.io/master-
Check your nodes and pods
# kubectl get nodes -o wide --show-labels
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME LABELS
akash-single.domainXYZ.com Ready control-plane,master 4m24s v1.22.1 149.202.82.160 <none> Ubuntu 20.04.2 LTS 5.4.0-80-generic containerd://1.4.8 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=akash-single.domainXYZ.com,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
# kubectl describe node akash-single.domainXYZ.com | grep -w Ready
Ready True Wed, 28 Jul 2021 09:51:09 +0000 Wed, 28 Jul 2021 09:51:09 +0000 KubeletReady kubelet is posting ready status. AppArmor enabled
# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-78d6f96c7b-kkszw 1/1 Running 0 3m33s
kube-system calico-node-ghgz8 1/1 Running 0 3m33s
kube-system coredns-558bd4d5db-2shqz 1/1 Running 0 4m7s
kube-system coredns-558bd4d5db-t9r75 1/1 Running 0 4m7s
kube-system etcd-akash-single.domainXYZ.com 1/1 Running 0 4m26s
kube-system kube-apiserver-akash-single.domainXYZ.com 1/1 Running 0 4m24s
kube-system kube-controller-manager-akash-single.domainXYZ.com 1/1 Running 0 4m23s
kube-system kube-proxy-72ntn 1/1 Running 0 4m7s
kube-system kube-scheduler-akash-single.domainXYZ.com 1/1 Running 0 4m21s
(Optional, almost) Install NodeLocal DNSCache
You do not have to install NodeLocal DNSCache if you have an Akash version with this patch https://github.com/arno01/akash/commit/5c81676bb8ad9780571ff8e4f41e54565eea31fd
PR https://github.com/ovrclk/akash/pull/1440
Issue https://github.com/ovrclk/akash/issues/1339#issuecomment-889293170
Use NodeLocal DNSCache for better performance,
https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/
NodeLocal DNSCache service installation is simple:
kubedns=$(kubectl get svc kube-dns -n kube-system -o 'jsonpath={.spec.clusterIP}')
domain="cluster.local"
localdns="169.254.25.10"
wget https://raw.githubusercontent.com/kubernetes/kubernetes/v1.22.1/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml
sed -i "s/__PILLAR__LOCAL__DNS__/$localdns/g; s/__PILLAR__DNS__DOMAIN__/$domain/g; s/__PILLAR__DNS__SERVER__/$kubedns/g" nodelocaldns.yaml
kubectl create -f nodelocaldns.yaml
Now you need to tell kubelet about the new NodeLocal DNSCache service. You have to do the following on each node of your Kubernetes cluster:
Modify clusterDNS in your /var/lib/kubelet/config.yaml to use 169.254.25.10 (NodeLocal DNSCache) instead of the default 10.233.0.10, and restart the kubelet service:
systemctl restart kubelet
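If you prefer to script that edit (the same substitution is used again in the scaling section below), assuming the default 10.233.0.10 clusterDNS from the kubeadm config above:
sed -i 's/10.233.0.10/169.254.25.10/g' /var/lib/kubelet/config.yaml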
To make sure you are using NodeLocal DNSCache, you can create a Pod and check that the nameserver inside it is 169.254.25.10:
/ # cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 169.254.25.10
options ndots:5
Optional: IPVS mode.
NOTE: cross-service communication (container X to service Y within the same Pod) does not work in the IPVS mode due to this line https://github.com/ovrclk/akash/blob/7c39ea403/provider/cluster/kube/builder.go#L599 in the "akash-deployment-restrictions" network policy. There might be another way to make it work though: one can try the kubespray deployment with the kube_proxy_mode toggle enabled and see if it works that way.
https://www.linkedin.com/pulse/iptables-vs-ipvs-kubernetes-vivek-grover/
https://forum.akash.network/t/akash-provider-support-ipvs-kube-proxy-mode/720
If you want to run kube-proxy in the IPVS mode one day (instead of the default IPTABLES one), you would need to repeat the steps from the "Install NodeLocal DNSCache" section above, except that to modify the nodelocaldns.yaml file you should use the following command instead:
sed -i "s/__PILLAR__LOCAL__DNS__/$localdns/g; s/__PILLAR__DNS__DOMAIN__/$domain/g; s/,__PILLAR__DNS__SERVER__//g; s/__PILLAR__CLUSTER__DNS__/$kubedns/g" nodelocaldns.yaml
Switch kube-proxy to the IPVS mode by setting mode: to "ipvs":
kubectl edit configmap kube-proxy -n kube-system
And restart the kube-proxy:
kubectl -n kube-system delete pod -l k8s-app=kube-proxy
Configure Kubernetes to use gVisor
Set up the gvisor (runsc) Kubernetes RuntimeClass.
The deployments created by Akash Provider will use it by default.
cat <<'EOF' | kubectl apply -f -
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
name: gvisor
handler: runsc
EOF
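Verify the RuntimeClass got created:
kubectl get runtimeclass gvisor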
Check that your gVisor and K8s DNS are working as expected
cat > dnstest.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
name: dnsutils
namespace: default
spec:
runtimeClassName: gvisor
containers:
- name: dnsutils
image: gcr.io/kubernetes-e2e-test-images/dnsutils:1.3
command:
- sleep
- "3600"
imagePullPolicy: IfNotPresent
restartPolicy: Always
EOF
# kubectl apply -f dnstest.yaml
# kubectl exec -i -t dnsutils -- sh
/ # dmesg
[ 0.000000] Starting gVisor...
[ 0.459332] Reticulating splines...
[ 0.868906] Synthesizing system calls...
[ 1.330219] Adversarially training Redcode AI...
[ 1.465972] Waiting for children...
[ 1.887919] Generating random numbers by fair dice roll...
[ 2.302806] Accelerating teletypewriter to 9600 baud...
[ 2.729885] Checking naughty and nice process list...
[ 2.999002] Granting licence to kill(2)...
[ 3.116179] Checking naughty and nice process list...
[ 3.451080] Creating process schedule...
[ 3.658232] Ready!
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope global dynamic
inet6 ::1/128 scope global dynamic
2: eth0: <UP,LOWER_UP> mtu 1480
link/ether 9e:f1:a0:ee:8a:55 brd ff:ff:ff:ff:ff:ff
inet 10.233.85.133/32 scope global dynamic
inet6 fe80::9cf1:a0ff:feee:8a55/64 scope global dynamic
/ # ip r
127.0.0.0/8 dev lo
::1 dev lo
169.254.1.1 dev eth0
fe80::/64 dev eth0
default via 169.254.1.1 dev eth0
/ # netstat -nr
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
169.254.1.1 0.0.0.0 255.255.255.255 U 0 0 0 eth0
0.0.0.0 169.254.1.1 0.0.0.0 UG 0 0 0 eth0
/ # cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.233.0.10
options ndots:5
/ # ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=42 time=5.671 ms
^C
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 5.671/5.671/5.671 ms
/ # ping google.com
PING google.com (172.217.13.174): 56 data bytes
64 bytes from 172.217.13.174: seq=0 ttl=42 time=85.075 ms
^C
--- google.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 85.075/85.075/85.075 ms
/ # nslookup kubernetes.default.svc.cluster.local
Server: 10.233.0.10
Address: 10.233.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.233.0.1
/ # exit
# kubectl delete -f dnstest.yaml
If you see "Starting gVisor...", it means Kubernetes is able to run containers using gVisor (runsc).
You are going to see 169.254.25.10 instead of 10.233.0.10 nameserver if you are using NodeLocal DNSCache.
The network test won't work (i.e. ping 8.8.8.8 will fail) once you apply network-policy-default-ns-deny.yaml. This is expected.
(Optional) Encrypt etcd
etcd is a consistent and highly-available key value store used as Kubernetes' backing store for all cluster data.
Kubernetes uses etcd to store all its data – its configuration data, its state, and its metadata. Kubernetes is a distributed system, so it needs a distributed data store like etcd. etcd lets any of the nodes in the Kubernetes cluster read and write data.
⚠️ Storing the raw encryption key in the EncryptionConfig only moderately improves your security posture, compared to no encryption. Please use kms provider for additional security.
️️⚠️ Make sure you have the same ENCRYPTION_KEY across all control plane nodes! (the ones running kube-apiserver). Just copy /etc/kubernetes/encrypt/config.yaml
file across them.
# Run this only once and remember the value of that key!
ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)
mkdir /etc/kubernetes/encrypt
cat > /etc/kubernetes/encrypt/config.yaml <<EOF
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
- secrets
providers:
- aescbc:
keys:
- name: key1
secret: ${ENCRYPTION_KEY}
- identity: {}
EOF
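If you are running multiple control plane nodes, you can copy the file over to them, for example (hypothetical hostnames):
ssh root@akash-master-02.domainXYZ.com mkdir -p /etc/kubernetes/encrypt
scp /etc/kubernetes/encrypt/config.yaml root@akash-master-02.domainXYZ.com:/etc/kubernetes/encrypt/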
Update your /etc/kubernetes/manifests/kube-apiserver.yaml in the following way so kube-apiserver knows where to read the secret from:
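A sketch of the typical change, following the upstream "Encrypting Secret Data at Rest" guide and adapted to the /etc/kubernetes/encrypt path used above: add the --encryption-provider-config flag to the kube-apiserver command and mount the directory into the static pod:
spec:
  containers:
  - command:
    - kube-apiserver
    # ... existing flags ...
    - --encryption-provider-config=/etc/kubernetes/encrypt/config.yaml
    volumeMounts:
    # ... existing volumeMounts ...
    - name: encryption-config
      mountPath: /etc/kubernetes/encrypt
      readOnly: true
  volumes:
  # ... existing volumes ...
  - name: encryption-config
    hostPath:
      path: /etc/kubernetes/encrypt
      type: DirectoryOrCreate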
kube-apiserver will automatically restart when you save the /etc/kubernetes/manifests/kube-apiserver.yaml file. (This can take a minute or two, be patient.)
# crictl ps | grep apiserver
10e6f4b409a4b 106ff58d43082 36 seconds ago Running kube-apiserver 0 754932bb659c5
Don't forget to do the same across all your control plane (kube-apiserver) nodes!
Encrypt all secrets using the encryption key you have just added:
kubectl get secrets -A -o json | kubectl replace -f -
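To verify that a newly written secret is actually stored encrypted, you can read it straight from etcd on a control plane node (requires etcdctl; "secret1" in the "default" namespace is a hypothetical secret name) and check that the stored value starts with k8s:enc:aescbc:v1: rather than plain text:
ETCDCTL_API=3 etcdctl \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry/secrets/default/secret1 | hexdump -C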
(Optional) IPv6 support
If you wish to enable IPv6 support in your Kubernetes cluster, then please refer to this page https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/dual-stack-support/
Configure Kubernetes for the Akash Provider service
If you are updating Akash provider from 0.12 to 0.14, please make sure to follow these steps https://github.com/ovrclk/akash/blob/9e1a7aa5ccc894e89d84d38485b458627a287bae/script/provider_migrate_to_hostname_operator.md
mkdir akash-provider
cd akash-provider
wget https://raw.githubusercontent.com/ovrclk/akash/mainnet/main/pkg/apis/akash.network/v1/crd.yaml
kubectl apply -f ./crd.yaml
wget https://raw.githubusercontent.com/ovrclk/akash/mainnet/main/pkg/apis/akash.network/v1/provider_hosts_crd.yaml
kubectl apply -f ./provider_hosts_crd.yaml
wget https://raw.githubusercontent.com/ovrclk/akash/mainnet/main/_docs/kustomize/networking/network-policy-default-ns-deny.yaml
kubectl apply -f ./network-policy-default-ns-deny.yaml
wget https://raw.githubusercontent.com/ovrclk/akash/mainnet/main/_run/ingress-nginx-class.yaml
kubectl apply -f ./ingress-nginx-class.yaml
wget https://raw.githubusercontent.com/ovrclk/akash/mainnet/main/_run/ingress-nginx.yaml
kubectl apply -f ./ingress-nginx.yaml
# NOTE: in this example the Kubernetes node is called "akash-single.domainXYZ.com" and it's going to be the ingress node too.
# In the perfect environment that would not be the master (control-plane) node, but rather the worker nodes!
kubectl label nodes akash-single.domainXYZ.com akash.network/role=ingress
# Check the label got applied:
# kubectl get nodes -o wide --show-labels
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME LABELS
akash-single.domainXYZ.com Ready control-plane,master 10m v1.22.1 149.202.82.160 <none> Ubuntu 20.04.2 LTS 5.4.0-80-generic containerd://1.4.8 akash.network/role=ingress,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=akash-single.domainXYZ.com,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
git clone --depth 1 -b mainnet/main https://github.com/ovrclk/akash.git
cd akash
kubectl apply -f _docs/kustomize/networking/namespace.yaml
kubectl kustomize _docs/kustomize/akash-services/ | kubectl apply -f -
cat >> _docs/kustomize/akash-hostname-operator/kustomization.yaml <<'EOF'
images:
- name: ghcr.io/ovrclk/akash:stable
newName: ghcr.io/ovrclk/akash
newTag: 0.14.1
EOF
kubectl kustomize _docs/kustomize/akash-hostname-operator | kubectl apply -f -
Get a wildcard DNS record
In my case I'm going to be using <anything>.ingress.nixaid.com, which resolves to the IP of my Kubernetes node(s). Preferably only the worker nodes!
A *.ingress.nixaid.com resolves to 149.202.82.160
And akash-provider.nixaid.com is going to resolve to the IP of the Akash Provider service itself that I'm going to be running. (The Akash Provider service listens on port 8443/tcp.)
Pro-tip: you can register the same wildcard DNS A record multiple times, pointing to multiple Akash worker nodes, so you get DNS round-robin balancing out of the box! ;)
Creating the Akash Provider on the Akash Blockchain
Now that we've got our Kubernetes configured, up & running, it's time to get the Akash Provider running.
NOTE: You don't have to run Akash Provider service on your Kubernetes cluster directly. You can run it anywhere. It only needs to be able to access your Kubernetes cluster over the internet.
Create Akash user
We are going to be running akash provider under the akash user.
useradd akash -m -U -s /usr/sbin/nologin
mkdir /home/akash/.kube
cp /etc/kubernetes/admin.conf /home/akash/.kube/config
chown -Rh akash:akash /home/akash/.kube
Install Akash client
su -s /bin/bash - akash
wget https://github.com/ovrclk/akash/releases/download/v0.14.1/akash_0.14.1_linux_amd64.zip
unzip akash_0.14.1_linux_amd64.zip
mv /home/akash/akash_0.14.1_linux_amd64/akash /usr/bin/
chown root:root /usr/bin/akash
Configure Akash client
su -s /bin/bash - akash
mkdir ~/.akash
export KUBECONFIG=/home/akash/.kube/config
export PROVIDER_ADDRESS=akash-provider.nixaid.com
export AKASH_NET="https://raw.githubusercontent.com/ovrclk/net/master/mainnet"
export AKASH_NODE="$(curl -s "$AKASH_NET/rpc-nodes.txt" | shuf -n 1)"
export AKASH_CHAIN_ID="$(curl -s "$AKASH_NET/chain-id.txt")"
export AKASH_KEYRING_BACKEND=file
export AKASH_PROVIDER_KEY=default
export AKASH_FROM=$AKASH_PROVIDER_KEY
Check the variables:
$ set |grep ^AKASH
AKASH_CHAIN_ID=akashnet-2
AKASH_FROM=default
AKASH_KEYRING_BACKEND=file
AKASH_NET=https://raw.githubusercontent.com/ovrclk/net/master/mainnet
AKASH_NODE=http://135.181.181.120:28957
AKASH_PROVIDER_KEY=default
Now create the default key:
$ akash keys add $AKASH_PROVIDER_KEY --keyring-backend=$AKASH_KEYRING_BACKEND
Enter keyring passphrase:
Re-enter keyring passphrase:
- name: default
type: local
address: akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0
...
Make sure to keep your mnemonic seed somewhere safe as it's the only way to recover your account and funds on it!
If you want to restore your key from your mnemonic seed, add the --recover flag to the akash keys add ... command.
Configure Akash provider
$ cat provider.yaml
host: https://akash-provider.nixaid.com:8443
attributes:
- key: region
value: europe ## change this to your region!
- key: host
value: akash ## feel free to change this to whatever you like
- key: organization # optional
value: whatever-your-Org-is ## change this to your org.
- key: tier # optional
value: community
Fund your Akash provider's wallet
You will need about 10 AKT (Akash Token) to get you started.
Your wallet must have sufficient funding, as placing a bid on an order on the blockchain requires a 5 AKT deposit. This deposit is fully refunded after the bid is won/lost.
Purchase AKT at one of the exchanges mentioned here https://akash.network/token/
To query the balance of your wallet:
# Put here your address which you've got when created one with "akash keys add" command.
export AKASH_ACCOUNT_ADDRESS=akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0
$ akash \
--node "$AKASH_NODE" \
query bank balances "$AKASH_ACCOUNT_ADDRESS"
Denomination: 1 akt = 1000000 uakt (akt*10^6)
Register your provider on the Akash Network
$ akash tx provider create provider.yaml \
--from $AKASH_PROVIDER_KEY \
--keyring-backend=$AKASH_KEYRING_BACKEND \
--node=$AKASH_NODE \
--chain-id=$AKASH_CHAIN_ID \
--gas-prices="0.025uakt" \
--gas="auto" \
--gas-adjustment=1.15
If you want to change the parameters of your provider.yaml, then use the akash tx provider update command with the same arguments.
After registering your provider on the Akash Network, I was able to see my host there:
$ akash \
--node "$AKASH_NODE" \
query provider list -o json | jq -r '.providers[] | [ .attributes[].value, .host_uri, .owner ] | @csv' | sort -d
"australia-east-akash-provider","https://provider.akashprovider.com","akash1ykxzzu332txz8zsfew7z77wgsdyde75wgugntn"
"equinix-metal-ams1","akash","mn2-0","https://provider.ams1p0.mainnet.akashian.io:8443","akash1ccktptfkvdc67msasmesuy5m7gpc76z75kukpz"
"equinix-metal-ewr1","akash","mn2-0","https://provider.ewr1p0.mainnet.akashian.io:8443","akash1f6gmtjpx4r8qda9nxjwq26fp5mcjyqmaq5m6j7"
"equinix-metal-sjc1","akash","mn2-0","https://provider.sjc1p0.mainnet.akashian.io:8443","akash10cl5rm0cqnpj45knzakpa4cnvn5amzwp4lhcal"
"equinix-metal-sjc1","akash","mn2-0","https://provider.sjc1p1.mainnet.akashian.io:8443","akash1cvpefa7pw8qy0u4euv497r66mvgyrg30zv0wu0"
"europe","nixaid","https://akash-provider.nixaid.com:8443","akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0"
"us-west-demo-akhil","dehazelabs","https://73.157.111.139:8443","akash1rt2qk45a75tjxzathkuuz6sq90jthekehnz45z"
"us-west-demo-caleb","https://provider.akashian.io","akash1rdyul52yc42vd8vhguc0t9ryug9ftf2zut8jxa"
"us-west-demo-daniel","https://daniel1q84.iptime.org","akash14jpkk4n5daspcjdzsrylgw38lj9xug2nznqnu2"
"us-west","https://ssymbiotik.ekipi365.com","akash1j862g3efcw5xcvn0402uwygrwlzfg5r02w9jw5"
Create the provider certificate
You must issue a transaction to the blockchain to create a certificate associated with your provider:
akash tx cert create server $PROVIDER_ADDRESS \
--chain-id $AKASH_CHAIN_ID \
--keyring-backend $AKASH_KEYRING_BACKEND \
--from $AKASH_PROVIDER_KEY \
--node=$AKASH_NODE \
--gas-prices="0.025uakt" --gas="auto" --gas-adjustment=1.15
Starting the Akash Provider
The Akash provider will need the Kubernetes admin config. We have already copied it to /home/akash/.kube/config before.
Create the start-provider.sh file, which will start the Akash Provider.
But first, create the key-pass.txt file with the password you set when you created the provider's key.
echo "Your-passWoRd" | tee /home/akash/key-pass.txt
Make sure to set --cluster-public-hostname to the hostname that resolves to the public IP of the Kubernetes cluster. This is the same hostname you set controlPlaneEndpoint to earlier.
cat > /home/akash/start-provider.sh << 'EOF'
#!/usr/bin/env bash
export AKASH_NET="https://raw.githubusercontent.com/ovrclk/net/master/mainnet"
export AKASH_NODE="$(curl -s "$AKASH_NET/rpc-nodes.txt" | shuf -n 1)"
cd /home/akash
( sleep 2s; cat key-pass.txt; cat key-pass.txt ) | \
/usr/bin/akash provider run \
--chain-id akashnet-2 \
--node $AKASH_NODE \
--keyring-backend=file \
--from default \
--fees 5000uakt \
--kubeconfig /home/akash/.kube/config \
--cluster-k8s true \
--deployment-ingress-domain ingress.nixaid.com \
--deployment-ingress-static-hosts true \
--bid-price-strategy scale \
--bid-price-cpu-scale 0.0011 \
--bid-price-memory-scale 0.0002 \
--bid-price-storage-scale 0.00009 \
--bid-price-endpoint-scale 0 \
--bid-deposit 5000000uakt \
--balance-check-period 24h \
--minimum-balance 5000000 \
--cluster-node-port-quantity 1000 \
--cluster-public-hostname akash-master-lb.domainXYZ.com \
--bid-timeout 10m0s \
--withdrawal-period 24h0m0s \
--log_level warn
EOF
Make sure it's executable:
chmod +x /home/akash/start-provider.sh
Create the akash-provider.service systemd service so the Akash provider starts automatically:
cat > /etc/systemd/system/akash-provider.service << 'EOF'
[Unit]
Description=Akash Provider
After=network.target
[Service]
User=akash
Group=akash
ExecStart=/home/akash/start-provider.sh
KillSignal=SIGINT
Restart=on-failure
RestartSec=15
StartLimitInterval=200
StartLimitBurst=10
#LimitNOFILE=65535
[Install]
WantedBy=multi-user.target
EOF
Start the Akash provider:
systemctl daemon-reload
systemctl start akash-provider
systemctl enable akash-provider
Check the logs:
journalctl -u akash-provider --since '5 min ago' -f
Akash detects the node as follows:
D[2021-06-29|11:33:34.190] node resources module=provider-cluster cmp=service cmp=inventory-service node-id=akash-single.domainXYZ.com available-cpu="units:<val:\"7050\" > attributes:<key:\"arch\" value:\"amd64\" > " available-memory="quantity:<val:\"32896909312\" > " available-storage="quantity:<val:\"47409223602\" > "
cpu units: 7050 / 1000 = 7 CPU (the server actually has 8 CPUs; it must have reserved 1 CPU for whatever the provider node itself is running, which is a smart thing)
available memory: 32896909312 / (1024^3) = 30.63Gi (the server has 32Gi RAM)
available storage: 47409223602 / (1024^3) = 44.15Gi (this one is a bit weird, as I've got just 32Gi available on the rootfs "/")
Deploying on our own Akash provider
In order to get your Akash client configured on your client side, please refer to the first 4 steps in https://nixaid.com/solana-on-akashnet/ or https://docs.akash.network/guides/deploy
Now that we have our own Akash Provider running, let's try to deploy something on it.
I'll deploy the echoserver service, which returns interesting information to the client when queried over the HTTP/HTTPS port.
$ cat echoserver.yml
---
version: "2.0"
services:
echoserver:
image: gcr.io/google_containers/echoserver:1.10
expose:
- port: 8080
as: 80
to:
- global: true
#accept:
# - my.host123.com
profiles:
compute:
echoserver:
resources:
cpu:
units: 0.1
memory:
size: 128Mi
storage:
size: 128Mi
placement:
akash:
#attributes:
# host: nixaid
#signedBy:
# anyOf:
# - "akash1365yvmc4s7awdyj3n2sav7xfx76adc6dnmlx63" ## AKASH
pricing:
echoserver:
denom: uakt
amount: 100
deployment:
echoserver:
akash:
profile: echoserver
count: 1
Note that I've commented out the signedBy directive, which is typically used by clients to make sure they are deploying on a trusted provider. Leaving it commented out means that you can deploy to any Akash provider you want, not necessarily a signed one.
You can use the akash tx audit attr create command for signing attributes on your Akash Provider if you wish your clients to use the signedBy directive.
akash tx deployment create echoserver.yml \
--from default \
--node $AKASH_NODE \
--chain-id $AKASH_CHAIN_ID \
--gas-prices="0.025uakt" --gas="auto" --gas-adjustment=1.15
Now that the deployment has been announced to the Akash network, let's look at our Akash Provider's side.
Here is what a successful reservation looks like from the Akash provider's point of view:
Reservation fulfilled is what we are looking for.
Jun 30 00:00:46 akash1 start-provider.sh[1029866]: I[2021-06-30|00:00:46.122] syncing sequence cmp=client/broadcaster local=31 remote=31
Jun 30 00:00:53 akash1 start-provider.sh[1029866]: I[2021-06-30|00:00:53.837] order detected module=bidengine-service order=order/akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1
Jun 30 00:00:53 akash1 start-provider.sh[1029866]: I[2021-06-30|00:00:53.867] group fetched module=bidengine-order order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1
Jun 30 00:00:53 akash1 start-provider.sh[1029866]: I[2021-06-30|00:00:53.867] requesting reservation module=bidengine-order order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1
Jun 30 00:00:53 akash1 start-provider.sh[1029866]: D[2021-06-30|00:00:53.868] reservation requested module=provider-cluster cmp=service cmp=inventory-service order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1 resources="group_id:<owner:\"akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h\" dseq:1585829 gseq:1 > state:open group_spec:<name:\"akash\" requirements:<signed_by:<> > resources:<resources:<cpu:<units:<val:\"100\" > > memory:<quantity:<val:\"134217728\" > > storage:<quantity:<val:\"134217728\" > > endpoints:<> > count:1 price:<denom:\"uakt\" amount:\"2000\" > > > created_at:1585832 "
Jun 30 00:00:53 akash1 start-provider.sh[1029866]: I[2021-06-30|00:00:53.868] Reservation fulfilled module=bidengine-order order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1
Jun 30 00:00:53 akash1 start-provider.sh[1029866]: D[2021-06-30|00:00:53.868] submitting fulfillment module=bidengine-order order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1 price=357uakt
Jun 30 00:00:53 akash1 start-provider.sh[1029866]: I[2021-06-30|00:00:53.932] broadcast response cmp=client/broadcaster response="Response:\n TxHash: BDE0FE6CD12DB3B137482A0E93D4099D7C9F6A5ABAC597E17F6E94706B84CC9A\n Raw Log: []\n Logs: []" err=null
Jun 30 00:00:53 akash1 start-provider.sh[1029866]: I[2021-06-30|00:00:53.932] bid complete module=bidengine-order order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1
Jun 30 00:00:56 akash1 start-provider.sh[1029866]: I[2021-06-30|00:00:56.121] syncing sequence cmp=client/broadcaster local=32 remote=31
Now that the Akash provider's got reservation fulfilled, we should be able to see it as a bid (offer) on the client side:
$ akash query market bid list --owner=$AKASH_ACCOUNT_ADDRESS --node $AKASH_NODE --dseq $AKASH_DSEQ
...
- bid:
bid_id:
dseq: "1585829"
gseq: 1
oseq: 1
owner: akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h
provider: akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0
created_at: "1585836"
price:
amount: "357"
denom: uakt
state: open
escrow_account:
balance:
amount: "50000000"
denom: uakt
id:
scope: bid
xid: akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0
owner: akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0
settled_at: "1585836"
state: open
transferred:
amount: "0"
denom: uakt
...
Let's create the leases now (accept the bid offered by the Akash Provider):
akash tx market lease create \
--chain-id $AKASH_CHAIN_ID \
--node $AKASH_NODE \
--owner $AKASH_ACCOUNT_ADDRESS \
--dseq $AKASH_DSEQ \
--gseq $AKASH_GSEQ \
--oseq $AKASH_OSEQ \
--provider $AKASH_PROVIDER \
--from default \
--gas-prices="0.025uakt" --gas="auto" --gas-adjustment=1.15
Now we can see "lease won" on the provider's side:
Jun 30 00:03:42 akash1 start-provider.sh[1029866]: D[2021-06-30|00:03:42.479] ignoring group module=bidengine-order order=akash15yd3qszmqausvzpj7n0y0e4pft2cu9rt5gccda/1346631/1/1 group=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1
Jun 30 00:03:42 akash1 start-provider.sh[1029866]: I[2021-06-30|00:03:42.479] lease won module=bidengine-order order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1 lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0
Jun 30 00:03:42 akash1 start-provider.sh[1029866]: I[2021-06-30|00:03:42.480] shutting down module=bidengine-order order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1
Jun 30 00:03:42 akash1 start-provider.sh[1029866]: I[2021-06-30|00:03:42.480] lease won module=provider-manifest lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0
Jun 30 00:03:42 akash1 start-provider.sh[1029866]: I[2021-06-30|00:03:42.480] new lease module=manifest-manager deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829 lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0
Jun 30 00:03:42 akash1 start-provider.sh[1029866]: D[2021-06-30|00:03:42.480] emit received events skipped module=manifest-manager deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829 data=<nil> leases=1 manifests=0
Jun 30 00:03:42 akash1 start-provider.sh[1029866]: I[2021-06-30|00:03:42.520] data received module=manifest-manager deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829 version=77fd690d5e5ec8c320a902da09a59b48dc9abd0259d84f9789fee371941320e7
Jun 30 00:03:42 akash1 start-provider.sh[1029866]: D[2021-06-30|00:03:42.520] emit received events skipped module=manifest-manager deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829 data="deployment:<deployment_id:<owner:\"akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h\" dseq:1585829 > state:active version:\"w\\375i\\r^^\\310\\303 \\251\\002\\332\\t\\245\\233H\\334\\232\\275\\002Y\\330O\\227\\211\\376\\343q\\224\\023 \\347\" created_at:1585832 > groups:<group_id:<owner:\"akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h\" dseq:1585829 gseq:1 > state:open group_spec:<name:\"akash\" requirements:<signed_by:<> > resources:<resources:<cpu:<units:<val:\"100\" > > memory:<quantity:<val:\"134217728\" > > storage:<quantity:<val:\"134217728\" > > endpoints:<> > count:1 price:<denom:\"uakt\" amount:\"2000\" > > > created_at:1585832 > escrow_account:<id:<scope:\"deployment\" xid:\"akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829\" > owner:\"akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h\" state:open balance:<denom:\"uakt\" amount:\"5000000\" > transferred:<denom:\"uakt\" amount:\"0\" > settled_at:1585859 > " leases=1 manifests=0
Send the manifest to finally deploy the echoserver service on your Akash Provider!
akash provider send-manifest echoserver.yml \
--node $AKASH_NODE \
--dseq $AKASH_DSEQ \
--provider $AKASH_PROVIDER \
--from default
The provider got the manifest ("manifest received"), and the kube-builder module has "created service" under the c9mdnf8o961odir96rdcflt9id95rq2a2qesidpjuqd76 namespace:
Jun 30 00:06:16 akash1 start-provider.sh[1029866]: I[2021-06-30|00:06:16.122] syncing sequence cmp=client/broadcaster local=32 remote=32
Jun 30 00:06:21 akash1 start-provider.sh[1029866]: D[2021-06-30|00:06:21.413] inventory fetched module=provider-cluster cmp=service cmp=inventory-service nodes=1
Jun 30 00:06:21 akash1 start-provider.sh[1029866]: D[2021-06-30|00:06:21.413] node resources module=provider-cluster cmp=service cmp=inventory-service node-id=akash-single.domainXYZ.com available-cpu="units:<val:\"7050\" > attributes:<key:\"arch\" value:\"amd64\" > " available-memory="quantity:<val:\"32896909312\" > " available-storage="quantity:<val:\"47409223602\" > "
Jun 30 00:06:26 akash1 start-provider.sh[1029866]: I[2021-06-30|00:06:26.122] syncing sequence cmp=client/broadcaster local=32 remote=32
Jun 30 00:06:35 akash1 start-provider.sh[1029866]: I[2021-06-30|00:06:35.852] manifest received module=manifest-manager deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829
Jun 30 00:06:35 akash1 start-provider.sh[1029866]: D[2021-06-30|00:06:35.852] requests valid module=manifest-manager deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829 num-requests=1
Jun 30 00:06:35 akash1 start-provider.sh[1029866]: D[2021-06-30|00:06:35.853] publishing manifest received module=manifest-manager deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829 num-leases=1
Jun 30 00:06:35 akash1 start-provider.sh[1029866]: D[2021-06-30|00:06:35.853] publishing manifest received for lease module=manifest-manager deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829 lease_id=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0
Jun 30 00:06:35 akash1 start-provider.sh[1029866]: I[2021-06-30|00:06:35.853] manifest received module=provider-cluster cmp=service
Jun 30 00:06:36 akash1 start-provider.sh[1029866]: D[2021-06-30|00:06:36.023] provider/cluster/kube/builder: created service module=kube-builder service="&Service{ObjectMeta:{echoserver 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[akash.network:true akash.network/manifest-service:echoserver akash.network/namespace:c9mdnf8o961odir96rdcflt9id95rq2a2qesidpjuqd76] map[] [] [] []},Spec:ServiceSpec{Ports:[]ServicePort{ServicePort{Name:0-80,Protocol:TCP,Port:80,TargetPort:{0 8080 },NodePort:0,AppProtocol:nil,},},Selector:map[string]string{akash.network: true,akash.network/manifest-service: echoserver,akash.network/namespace: c9mdnf8o961odir96rdcflt9id95rq2a2qesidpjuqd76,},ClusterIP:,Type:ClusterIP,ExternalIPs:[],SessionAffinity:,LoadBalancerIP:,LoadBalancerSourceRanges:[],ExternalName:,ExternalTrafficPolicy:,HealthCheckNodePort:0,PublishNotReadyAddresses:false,SessionAffinityConfig:nil,IPFamily:nil,TopologyKeys:[],},Status:ServiceStatus{LoadBalancer:LoadBalancerStatus{Ingress:[]LoadBalancerIngress{},},},}"
Jun 30 00:06:36 akash1 start-provider.sh[1029866]: I[2021-06-30|00:06:36.121] syncing sequence cmp=client/broadcaster local=32 remote=32
Jun 30 00:06:36 akash1 start-provider.sh[1029866]: D[2021-06-30|00:06:36.157] provider/cluster/kube/builder: created rules module=kube-builder rules="[{Host:623n1u4k2hbiv6f1kuiscparqk.ingress.nixaid.com IngressRuleValue:{HTTP:&HTTPIngressRuleValue{Paths:[]HTTPIngressPath{HTTPIngressPath{Path:/,Backend:IngressBackend{Resource:nil,Service:&IngressServiceBackend{Name:echoserver,Port:ServiceBackendPort{Name:,Number:80,},},},PathType:*Prefix,},},}}}]"
Jun 30 00:06:36 akash1 start-provider.sh[1029866]: D[2021-06-30|00:06:36.222] deploy complete module=provider-cluster cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0 manifest-group=akash
Let's see the lease status from the client side:
akash provider lease-status \
--node $AKASH_NODE \
--dseq $AKASH_DSEQ \
--provider $AKASH_PROVIDER \
--from default
{
"services": {
"echoserver": {
"name": "echoserver",
"available": 1,
"total": 1,
"uris": [
"623n1u4k2hbiv6f1kuiscparqk.ingress.nixaid.com"
],
"observed_generation": 1,
"replicas": 1,
"updated_replicas": 1,
"ready_replicas": 1,
"available_replicas": 1
}
},
"forwarded_ports": {}
}
We've got it!
Let's query it:
$ curl 623n1u4k2hbiv6f1kuiscparqk.ingress.nixaid.com
Hostname: echoserver-5c6f84887-6kh9p
Pod Information:
-no pod information available-
Server values:
server_version=nginx: 1.13.3 - lua: 10008
Request Information:
client_address=10.233.85.136
method=GET
real path=/
query=
request_version=1.1
request_scheme=http
request_uri=http://623n1u4k2hbiv6f1kuiscparqk.ingress.nixaid.com:8080/
Request Headers:
accept=*/*
host=623n1u4k2hbiv6f1kuiscparqk.ingress.nixaid.com
user-agent=curl/7.68.0
x-forwarded-for=CLIENT_IP_REDACTED
x-forwarded-host=623n1u4k2hbiv6f1kuiscparqk.ingress.nixaid.com
x-forwarded-port=80
x-forwarded-proto=http
x-real-ip=CLIENT_IP_REDACTED
x-request-id=8cdbcd7d0c4f42440669f7396e206cae
x-scheme=http
Request Body:
-no body in request-
Our deployment on our own Akash provider is working as expected! Hooray!
Let's see how our deployment actually looks from the Kubernetes point of view on our Akash Provider:
# kubectl get all -A -l akash.network=true
NAMESPACE NAME READY STATUS RESTARTS AGE
c9mdnf8o961odir96rdcflt9id95rq2a2qesidpjuqd76 pod/echoserver-5c6f84887-6kh9p 1/1 Running 0 2m37s
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
c9mdnf8o961odir96rdcflt9id95rq2a2qesidpjuqd76 service/echoserver ClusterIP 10.233.47.15 <none> 80/TCP 2m37s
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
c9mdnf8o961odir96rdcflt9id95rq2a2qesidpjuqd76 deployment.apps/echoserver 1/1 1 1 2m38s
NAMESPACE NAME DESIRED CURRENT READY AGE
c9mdnf8o961odir96rdcflt9id95rq2a2qesidpjuqd76 replicaset.apps/echoserver-5c6f84887 1 1 1 2m37s
# kubectl get ing -A
NAMESPACE NAME CLASS HOSTS ADDRESS PORTS AGE
c9mdnf8o961odir96rdcflt9id95rq2a2qesidpjuqd76 echoserver <none> 623n1u4k2hbiv6f1kuiscparqk.ingress.nixaid.com localhost 80 8m47s
# kubectl -n c9mdnf8o961odir96rdcflt9id95rq2a2qesidpjuqd76 describe ing echoserver
Name: echoserver
Namespace: c9mdnf8o961odir96rdcflt9id95rq2a2qesidpjuqd76
Address: localhost
Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
Host Path Backends
---- ---- --------
623n1u4k2hbiv6f1kuiscparqk.ingress.nixaid.com
/ echoserver:80 (10.233.85.137:8080)
Annotations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Sync 8m9s (x2 over 9m5s) nginx-ingress-controller Scheduled for sync
# crictl pods
POD ID CREATED STATE NAME NAMESPACE ATTEMPT RUNTIME
4c22dba05a2c0 5 minutes ago Ready echoserver-5c6f84887-6kh9p c9mdnf8o961odir96rdcflt9id95rq2a2qesidpjuqd76 0 runsc
...
The client can read their deployment's logs too:
akash \
--node "$AKASH_NODE" \
provider lease-logs \
--dseq "$AKASH_DSEQ" \
--gseq "$AKASH_GSEQ" \
--oseq "$AKASH_OSEQ" \
--provider "$AKASH_PROVIDER" \
--from default \
--follow
[echoserver-5c6f84887-6kh9p] Generating self-signed cert
[echoserver-5c6f84887-6kh9p] Generating a 2048 bit RSA private key
[echoserver-5c6f84887-6kh9p] ..............................+++
[echoserver-5c6f84887-6kh9p] ...............................................................................................................................................+++
[echoserver-5c6f84887-6kh9p] writing new private key to '/certs/privateKey.key'
[echoserver-5c6f84887-6kh9p] -----
[echoserver-5c6f84887-6kh9p] Starting nginx
[echoserver-5c6f84887-6kh9p] 10.233.85.136 - - [30/Jun/2021:00:08:00 +0000] "GET / HTTP/1.1" 200 744 "-" "curl/7.68.0"
[echoserver-5c6f84887-6kh9p] 10.233.85.136 - - [30/Jun/2021:00:27:10 +0000] "GET / HTTP/1.1" 200 744 "-" "curl/7.68.0"
After done testing, it's time to close the deployment:
akash tx deployment close \
--node $AKASH_NODE \
--chain-id $AKASH_CHAIN_ID \
--dseq $AKASH_DSEQ \
--owner $AKASH_ACCOUNT_ADDRESS \
--from default \
--gas-prices="0.025uakt" --gas="auto" --gas-adjustment=1.15
The provider sees it as expected: "deployment closed", "teardown request", ...:
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: I[2021-06-30|00:28:44.828] deployment closed module=provider-manifest deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: I[2021-06-30|00:28:44.828] manager done module=provider-manifest deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: D[2021-06-30|00:28:44.829] teardown request module=provider-cluster cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0 manifest-group=akash
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: D[2021-06-30|00:28:44.830] shutting down module=provider-cluster cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0 manifest-group=akash cmp=deployment-monitor
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: D[2021-06-30|00:28:44.830] shutdown complete module=provider-cluster cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0 manifest-group=akash cmp=deployment-monitor
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: D[2021-06-30|00:28:44.837] teardown complete module=provider-cluster cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0 manifest-group=akash
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: D[2021-06-30|00:28:44.837] waiting on dm.wg module=provider-cluster cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0 manifest-group=akash
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: D[2021-06-30|00:28:44.838] waiting on withdrawal module=provider-cluster cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0 manifest-group=akash
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: D[2021-06-30|00:28:44.838] shutting down module=provider-cluster cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0 manifest-group=akash cmp=deployment-withdrawal
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: D[2021-06-30|00:28:44.838] shutdown complete module=provider-cluster cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0 manifest-group=akash cmp=deployment-withdrawal
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: I[2021-06-30|00:28:44.838] shutdown complete module=provider-cluster cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0 manifest-group=akash
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: I[2021-06-30|00:28:44.838] manager done module=provider-cluster cmp=service lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1/akash1nxq8gmsw2vlz3m68qvyvcf3kh6q269ajvqw6y0
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: D[2021-06-30|00:28:44.838] unreserving capacity module=provider-cluster cmp=service cmp=inventory-service order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: I[2021-06-30|00:28:44.838] attempting to removing reservation module=provider-cluster cmp=service cmp=inventory-service order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: I[2021-06-30|00:28:44.838] removing reservation module=provider-cluster cmp=service cmp=inventory-service order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1
Jun 30 00:28:44 akash1 start-provider.sh[1029866]: I[2021-06-30|00:28:44.838] unreserve capacity complete module=provider-cluster cmp=service cmp=inventory-service order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/1585829/1/1
Jun 30 00:28:46 akash1 start-provider.sh[1029866]: I[2021-06-30|00:28:46.122] syncing sequence cmp=client/broadcaster local=36 remote=36
Tearing down the cluster
Just in case you want to destroy your Kubernetes cluster:
systemctl disable akash-provider
systemctl stop akash-provider
kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
###kubectl delete node <node name>
kubeadm reset
iptables -F && iptables -t nat -F && iptables -t nat -X && iptables -t mangle -F && iptables -t mangle -X && iptables -t raw -F && iptables -t raw -X && iptables -X
ip6tables -F && ip6tables -t nat -F && ip6tables -t nat -X && ip6tables -t mangle -F && ip6tables -t mangle -X && ip6tables -t raw -F && ip6tables -t raw -X && ip6tables -X
ipvsadm -C
conntrack -F
## if Weave Net was used:
weave reset   ## or: ip link delete weave
## if Calico was used:
ip link
ip link delete cali*
ip link delete vxlan.calico
modprobe -r ipip
A bit of troubleshooting / getting out of the following situation:
## if you get the following error during "crictl rmp -a" (deleting all pods using crictl):
removing the pod sandbox "f89d5f4987fbf80790e82eab1f5634480af814afdc82db8bca92dc5ed4b57120": rpc error: code = Unknown desc = sandbox network namespace "/var/run/netns/cni-65fbbdd0-8af6-8c2a-0698-6ef8155ca441" is not fully closed
ip netns ls
ip -all netns delete
ps -ef|grep -E 'runc|runsc|shim'
ip r
pidof runsc-sandbox |xargs -r kill
pidof /usr/bin/containerd-shim-runc-v2 |xargs -r kill -9
find /run/containerd/io.containerd.runtime.v2.task/ -ls
rm -rf /etc/cni/net.d
systemctl restart containerd
###systemctl restart docker
Scaling your Akash provider horizontally
You can scale your Akash Provider should you want to add more space for new deployments.
To do that, acquire a new bare-metal or VPS host and repeat all the steps up to (but not including) "Deploy Kubernetes cluster using kubeadm".
Run the following commands on your new master (control-plane) or worker node:
apt update
apt -y dist-upgrade
apt autoremove
apt -y install ethtool socat conntrack
mkdir -p /etc/kubernetes/manifests
## If you are using NodeLocal DNSCache
sed -i -s 's/10.233.0.10/169.254.25.10/g' /var/lib/kubelet/config.yaml
Generate the token on your existing master (control-plane) node.
You are going to need it in order to join your new master / worker nodes.
If adding new master nodes, make sure to run the upload-certs phase. This is to avoid copying /etc/kubernetes/pki manually from your existing master node to the new master nodes:
kubeadm init phase upload-certs --upload-certs --config kubeadm-config.yaml
Generate the token which you will use for joining your new master or worker node to your kubernetes cluster:
kubeadm token create --config kubeadm-config.yaml --print-join-command
To join any number of the master (control-plane) nodes run the following command:
kubeadm join akash-master-lb.domainXYZ.com:6443 --token REDACTED.REDACTED --discovery-token-ca-cert-hash sha256:REDACTED --control-plane --certificate-key REDACTED
To join any number of the worker nodes run the following command:
kubeadm join akash-master-lb.domainXYZ.com:6443 --token REDACTED.REDACTED --discovery-token-ca-cert-hash sha256:REDACTED
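Once the join completes, the new nodes should show up from any master node:
kubectl get nodes -o wide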
Scale the ingress
Now that you have more than one worker node, you can scale the ingress-nginx-controller to increase service availability.
To do this, you just need to run the following commands.
Label all the workers with the akash.network/role=ingress
label:
kubectl label nodes akash-worker-<##>.nixaid.com akash.network/role=ingress
Scale the ingress-nginx-controller
to the number of workers you have:
kubectl -n ingress-nginx scale deployment ingress-nginx-controller --replicas=<number of worker nodes>
And now register new DNS A records under the *.ingress.nixaid.com wildcard name for the IPs of your worker nodes running the ingress-nginx-controller. :-)
Example:
$ dig +noall +answer anything.ingress.nixaid.com
anything.ingress.nixaid.com. 1707 IN A 167.86.73.47
anything.ingress.nixaid.com. 1707 IN A 185.211.5.95
Known issues and workarounds
- Akash Provider is bleeding https://github.com/ovrclk/akash/issues/1363
- provider would not pull the new image when the tag is same https://github.com/ovrclk/akash/issues/1354
- Dangling containers when manifest's got an error https://github.com/ovrclk/akash/issues/1353
- [netpol] akash-deployment-restrictions prevents PODs from accessing kube-dns over 53/udp, 53/tcp in pod subnet https://github.com/ovrclk/akash/issues/1339
Donate
Please consider donating to me if you found this article useful.
Email me or DM me on Twitter https://twitter.com/andreyarapov for the donation.
References
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/#configuring-the-kubelet-cgroup-driver
https://storage.googleapis.com/kubernetes-release/release/stable.txt
https://gvisor.dev/docs/user_guide/containerd/quick_start/
https://github.com/containernetworking/cni#how-do-i-use-cni
https://docs.projectcalico.org/getting-started/kubernetes/quickstart
https://kubernetes.io/docs/concepts/overview/components/
https://matthewpalmer.net/kubernetes-app-developer/articles/how-does-kubernetes-use-etcd.html
https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/
https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/
https://docs.akash.network/operator/provider
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#tear-down