1 - Installing kubectl

Installing the kubectl command-line tool

kubectl is the Kubernetes command-line tool. It lets you run commands against a Kubernetes cluster.

You can use kubectl to deploy applications, inspect and manage cluster resources, and view logs.

https://kubernetes.io/docs/tasks/tools/

1.1 - Installing kubectl on Ubuntu

Installing and configuring kubectl

See the official Kubernetes documentation:

Step-by-step installation

This is the same procedure as the kubeadm installation later; the only difference is that here only the kubectl tool is needed, not kubeadm or kubelet.

Run the following commands:

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl

sudo mkdir -p /etc/apt/keyrings  # make sure the keyring directory exists
curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-archive-keyring.gpg

echo "deb [signed-by=/etc/apt/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update

For now we pin k8s to version 1.23.14:

sudo apt-get install kubectl=1.23.14-00
# sudo apt-get install kubelet=1.23.14-00 kubeadm=1.23.14-00 kubectl=1.23.14-00
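
Hint: apt upgrade would still move a pinned package forward later; holding it prevents that (a sketch; hold whichever of the packages you installed):

sudo apt-mark hold kubectl      # keep apt upgrade from replacing the pinned version
# sudo apt-mark unhold kubectl  # to allow upgrades again later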

Direct installation

Not recommended: this always installs the latest version, and the install directory is /usr/local/bin/.

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
rm kubectl

If /usr/local/bin/ is not on your PATH, adjust the PATH:

export PATH=/usr/local/bin:$PATH

Verify:

kubectl version --output=yaml

The output is:

clientVersion:
  buildDate: "2023-06-14T09:53:42Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/amd64
kustomizeVersion: v5.0.1

The connection to the server localhost:8080 was refused - did you specify the right host or port?

(The connection-refused line at the end is expected: no cluster is configured yet, so kubectl can only report the client version.)

Configuration

oh-my-zsh auto-completion

With oh-my-zsh this becomes even simpler (oh-my-zsh is strongly recommended): just add kubectl to the plugins list in oh-my-zsh.
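
For example, the plugins line in ~/.zshrc might look like this (the git entry is just illustrative):

plugins=(git kubectl)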

Then add the following to ~/.zshrc:

# k8s auto complete
alias k=kubectl
complete -F __start_kubectl k

After source ~/.zshrc it is ready to use; auto-completion also works when running kubectl through the k alias, which is very convenient.

2 - Installing kubernetes with kubeadm

Installing kubernetes with kubeadm

2.1 - Installing kubernetes on Ubuntu

Installing kubernetes on Ubuntu

2.1.1 - Installing kubernetes on Ubuntu 22.04 with kubeadm

Installing kubernetes on Ubuntu 22.04 with kubeadm

Using Ubuntu Server 22.04 as the example; see the official Kubernetes documentation:

Preparation

Check the docker version

Check the containerd configuration

sudo vi /etc/containerd/config.toml

Make sure the file either does not exist or that the following line is commented out:

# disabled_plugins = ["cri"]

After the change, restart containerd:

sudo systemctl restart containerd.service

Note: without this change, the k8s installation fails with "CRI v1 runtime API is not implemented".

Disable swap

Check with the free -m command:

$ free -m        
              total        used        free      shared  buff/cache   available
Mem:          15896        1665       11376          20        2854       13819
Swap:             0           0           0

If the Swap row is not all zeros, swap is enabled and must be turned off.

Two things need to be done:

  1. Do not create a swap partition when installing the OS; if one exists, delete it

  2. Even without a swap partition, swap may still be enabled (typically via a swap file); use sudo vi /etc/fstab to find the swap line:

    # prefix this swap line with # to disable swap
    /swapfile                                 none            swap    sw              0       0
    

    Check again with free -m after a reboot, or disable swap right away as shown below.
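
To turn swap off immediately without a reboot, a minimal sketch (the fstab edit above is still what makes the change permanent):

sudo swapoff -a   # disable all active swap right now
free -m           # the Swap row should now show 0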

Install kubeadm

Run the following commands:

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl

sudo mkdir -p /etc/apt/keyrings  # make sure the keyring directory exists
curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-archive-keyring.gpg

echo "deb [signed-by=/etc/apt/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update

Install the latest version

sudo apt-get install -y kubelet kubeadm kubectl

After installation, check the kubectl version:

kubectl version --output=yaml

The output is:

clientVersion:
  buildDate: "2023-06-14T09:53:42Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/amd64
kustomizeVersion: v5.0.1

The connection to the server localhost:8080 was refused - did you specify the right host or port?

(The connection-refused line is expected at this point: no cluster is running yet, so only the client version is shown.)

Check the kubeadm version:

kubeadm version 
kubeadm version: &version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-14T09:52:26Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}

Check the kubelet version:

kubelet --version
Kubernetes v1.27.3

Install a specific version

If you want a specific version:

sudo apt-get install kubelet=1.23.14-00 kubeadm=1.23.14-00 kubectl=1.23.14-00

The list of available versions is here:

https://packages.cloud.google.com/apt/dists/kubernetes-xenial/main/binary-amd64/Packages

Install k8s

Reference: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

sudo kubeadm init --pod-network-cidr=10.244.0.0/16 -v=9
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.100.40 -v=9

Note that we set --pod-network-cidr=10.244.0.0/16 here because we will use a CNI network with Flannel later; without this setting Flannel keeps reporting errors. If the machine has multiple network interfaces, use --apiserver-advertise-address to specify the IP address to use.

The kubeadm init output looks like this:

......

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.0.57:6443 --token gwr923.gctdq2sr423mrwp7 \
	--discovery-token-ca-cert-hash sha256:ad86f4eb0d430fc1bdf784ae655dccdcb14881cd4ca8d03d84cd2135082c4892 

To work as a regular user, follow the hint above:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

After installation the node is in NotReady state:

$ kubectl get node  
NAME        STATUS     ROLES                  AGE    VERSION
skyserver   NotReady   control-plane,master   3m7s   v1.23.5

kubectl describe shows that this is because no network plugin is installed:

$ kubectl describe node ubuntu2204 
Name:               ubuntu2204
Roles:              control-plane

......

  Ready            False   Wed, 28 Jun 2023 16:53:27 +0000   Wed, 28 Jun 2023 16:52:41 +0000   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized

Install flannel as the pod network add-on:

kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml

Note: sometimes the raw.githubusercontent.com domain is DNS-poisoned and resolves to 127.0.0.1, making it unreachable. The workaround is to visit https://ipaddress.com/website/raw.githubusercontent.com, pick the fastest of the listed IP addresses, and add a record to /etc/hosts, e.g. 185.199.111.133 raw.githubusercontent.com
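
For example, once a fast IP has been picked (185.199.111.133 is just the sample address from above), the record can be appended in one line:

echo "185.199.111.133 raw.githubusercontent.com" | sudo tee -a /etc/hosts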

After a short wait the node status changes to Ready:

$ kubectl get node                                                                                           
NAME        STATUS   ROLES                  AGE     VERSION
skyserver   Ready    control-plane,master   4m52s   v1.23.5

Finally, for a single-node test cluster, remove the master/control-plane taint so that workloads can be scheduled on the k8s master node:

# the taint used to be named master:
# kubectl taint nodes --all node-role.kubernetes.io/master-
# newer versions renamed the taint to control-plane (the name master was dropped)
kubectl taint nodes --all node-role.kubernetes.io/control-plane-

You can compare the Taints section of the node info with kubectl describe node skyserver before and after. Before removing the taints:

Taints:             node.kubernetes.io/not-ready:NoExecute
                    node-role.kubernetes.io/master:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule

After removing the taints:

Taints:             <none>

Common problems

CRI v1 runtime API is not implemented

If you see an error like this (newer versions):

[preflight] Some fatal errors occurred:
	[ERROR CRI]: container runtime is not running: output: time="2023-06-28T16:12:49Z" level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
, error: exit status 1

Or this error (somewhat older versions):

[preflight] Some fatal errors occurred:
	[ERROR CRI]: container runtime is not running: output: E1125 11:16:01.799551   14661 remote_runtime.go:948] "Status from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
time="2022-11-25T11:16:01+08:00" level=fatal msg="getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
, error: exit status 1

Both are caused by the default containerd configuration disabling the CRI plugin. Open /etc/containerd/config.toml and you will see this line:

disabled_plugins = ["cri"]

Comment out this line, then restart containerd:

sudo systemctl restart containerd.service

Then retry kubeadm init.

Reference:

Control plane does not start or keeps restarting

Installing the latest versions (1.27 / 1.25) reports success, but the control plane never comes up and port 6443 refuses connections:

k get node       

E0628 16:34:50.966940    6581 memcache.go:265] couldn't get current server API group list: Get "https://192.168.0.57:6443/api?timeout=32s": read tcp 192.168.0.57:41288->192.168.0.1:7890: read: connection reset by peer - error from a previous attempt: read tcp 192.168.0.57:41276->192.168.0.1:7890: read: connection reset by peer

In use, the control plane was frequently unstable, with many pods restarting over and over and logs reporting: pod sandbox changed.

Versions verified to have the problem:

  • kubeadm: 1.27.3 / 1.25.6
  • kubelet: 1.27.3 / 1.25.6
  • docker: 24.0.2 / 20.10.21

Try rolling back the docker version. The k8s 1.27 changelog,

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md, mentions docker v20.10.21 (the +incompatible suffix is just Go modules' marker for pre-module v2+ packages):

github.com/docker/docker: v20.10.18+incompatible → v20.10.21+incompatible

Checking my earlier installation notes, it happens that I had used this exact docker version before, and it was stable. So I decided to switch to it:

VERSION_STRING=5:20.10.21~3-0~ubuntu-jammy
sudo apt-get install docker-ce=$VERSION_STRING docker-ce-cli=$VERSION_STRING containerd.io docker-buildx-plugin docker-compose-plugin

For k8s, pin to 1.23.14, a version verified to work:

sudo apt-get install kubelet=1.23.14-00 kubeadm=1.23.14-00 kubectl=1.23.14-00

Note: both k8s 1.27.3 and 1.25.6 were verified to have this problem. The cause is unclear for now, so stay pinned to 1.23.14.

To be investigated later.

Starting over after a failure

If the installation fails and you need to start over, or you want to tear down an existing installation:

  1. Run kubeadm reset
  2. Delete the .kube directory
  3. Run kubeadm init again

If the network settings have changed, the network must be reset completely; see the next chapter.

Joining nodes to the cluster

If there are multiple kubernetes nodes (i.e. multiple machines), the other nodes need to be joined to the cluster; see the next chapter.

2.1.2 - Post-installation configuration for Kubernetes

Post-installation configuration for Kubernetes

Configure kubectl auto-completion

zsh configuration

macOS uses zsh by default. To enable kubectl auto-completion, add the following to ~/.zshrc:

# note: this line must go at the very top of the file
autoload -Uz compinit && compinit -i

......

# k8s auto complete
source <(kubectl completion zsh)
alias k=kubectl
complete -F __start_kubectl k

For convenience this also defines k as an alias for kubectl, with auto-completion enabled for the alias as well.

Using oh-my-zsh

With oh-my-zsh this becomes even simpler (oh-my-zsh is strongly recommended): just add kubectl to the plugins list in oh-my-zsh.

Then add the following to ~/.zshrc:

# k8s auto complete
alias k=kubectl
complete -F __start_kubectl k

After source ~/.zshrc it is ready to use; auto-completion also works when running kubectl through the k alias, which is very convenient.

Showing the kubectl context in use

https://github.com/ohmyzsh/ohmyzsh/tree/master/plugins/kubectx

This plugin adds a kubectx_prompt_info() function. It displays the name of the kubectl context in use (kubectl config current-context).

You can use it to customize your prompt and know whether you are on the prod cluster.

To use it, edit ~/.zshrc (see the sketch after this list):

  1. Add "kubectx" to plugins
  2. Add the line RPS1='$(kubectx_prompt_info)'

It takes effect after source ~/.zshrc, showing the kubectl context name at the right edge of the prompt; by default kubectl config current-context outputs "kubernetes-admin@kubernetes".
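
Putting the two steps together, the relevant ~/.zshrc lines would look roughly like this (plugins other than kubectl and kubectx are illustrative):

plugins=(git kubectl kubectx)
RPS1='$(kubectx_prompt_info)'   # show the current context at the right edge of the prompt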

For a friendlier display, the names can be mapped to more readable labels such as dev, stage, prod:

kubectx_mapping[kubernetes-admin@kubernetes]="dev"

Note: this should be handy when switching between multiple k8s environments; to be explored when the need arises.

Operating the k8s cluster from another machine

If k8s is installed on the local machine, kubectl and the other command-line tools are already set up locally during installation, and kubeadm init prints this hint at the end:

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

After following the hint above, you can operate the installed k8s cluster locally with the kubectl command-line tool.

If we want to operate the k8s cluster conveniently from another machine, rather than having to ssh into the machine hosting the control plane first, we can simply install kubectl on that machine and set up the kubeconfig file.

The steps:

  1. Install kubectl: similar to the steps earlier

  2. Configure the kubeconfig:

    mkdir -p $HOME/.kube
    # copy the cluster's config file to this machine
    cp -i /path/to/cluster/config $HOME/.kube/config
    

If there are multiple k8s clusters to manage, you can pass the --kubeconfig parameter to kubectl to pick the kubeconfig file to use:

kubectl --kubeconfig /home/sky/.kube/skyserver get nodes

Typing "--kubeconfig /home/sky/.kube/skyserver" every time gets tiring; you can set a temporary environment variable to select the kubeconfig file for the current terminal, e.g.:

export KUBECONFIG=$HOME/.kube/skyserver
k get nodes

# when no longer needed, close the terminal or unset it
unset KUBECONFIG

If you need to operate several clusters at the same time and keep switching between them, use contexts for flexible switching, for example:
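
A minimal sketch of context-based switching (the context name shown is kubeadm's default; yours may differ):

kubectl config get-contexts                             # list contexts in the kubeconfig
kubectl config use-context kubernetes-admin@kubernetes  # switch the active context
kubectl config current-context                          # confirm which context is active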

Disabling docker and k8s updates

docker and k8s installed via apt are automatically upgraded to the latest version on apt upgrade, which is not necessarily safe and usually unnecessary.

Consider disabling the apt updates for docker and k8s: cd /etc/apt/sources.list.d and comment out the contents of the docker and k8s source files with "#". They can be re-enabled whenever needed.
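
A one-line sketch of the comment-out step (the filenames are the usual ones; check what ls shows on your machine):

sudo sed -i 's/^deb/# deb/' /etc/apt/sources.list.d/docker.list /etc/apt/sources.list.d/kubernetes.list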

2.1.3 - Adding nodes with kubeadm join

Adding nodes to a kubernetes cluster with the kubeadm join command

See the official Kubernetes documentation:

Preparation

When k8s is installed with kubeadm init, it prints a hint like this:

Then you can join any number of worker nodes by running the following on each as root:

sudo kubeadm join 192.168.0.41:6443 --token 5ezixq.itmxvdgey8uduysr \
        --discovery-token-ca-cert-hash sha256:d641cec650bdee479a3e7479b558ab68886f7c41ef89f2857099776ed72bcaae

The token used here can be obtained with kubeadm token list:

$ kubeadm token list                                                                                                                       
TOKEN                     TTL         EXPIRES                USAGES                   DESCRIPTION                                                EXTRA GROUPS
5ezixq.itmxvdgey8uduysr   12h         2021-12-28T04:22:54Z   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token

Since a token's TTL is not very long (24 hours by default), you may find no usable token left. In that case create a new token on the cluster (note: run the command on the node hosting the control plane, since it reads local files):

$ kubeadm token create
omkq4t.v6nnkj4erms2ipyf
$ kubeadm token list  
TOKEN                     TTL         EXPIRES                USAGES                   DESCRIPTION                                                EXTRA GROUPS
omkq4t.v6nnkj4erms2ipyf   23h         2021-12-29T09:19:23Z   authentication,signing   <none>                                                     system:bootstrappers:kubeadm:default-node-token

The discovery-token-ca-cert-hash can be generated with:

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'

d641cec650bdee479a3e7479b558ab68886f7c41ef89f2857099776ed72bcaae
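
Alternatively, kubeadm can emit a complete, ready-to-run join command (fresh token included) in one step:

kubeadm token create --print-join-command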

Run kubeadm join

The output looks like this:

$ sudo kubeadm join 192.168.0.41:6443 --token 5ezixq.itmxvdgey8uduysr \
        --discovery-token-ca-cert-hash sha256:d641cec650bdee479a3e7479b558ab68886f7c41ef89f2857099776ed72bcaae

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W1228 00:04:48.056252   78445 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

On this machine, running kubectl fails because there is no local api server to connect to:

$ k get nodes   
The connection to the server localhost:8080 was refused - did you specify the right host or port?

Run the command on another machine and you can see the node was added successfully:

$ k get nodes
NAME         STATUS   ROLES                  AGE    VERSION
skyserver    Ready    control-plane,master   11h    v1.23.1
skyserver2   Ready    <none>                 4m1s   v1.23.1

Troubleshooting

Pods fail to start

Pods scheduled to a particular node fail to start and stay stuck in ContainerCreating:

$ k get pods -A
NAMESPACE              NAME                                         READY   STATUS              RESTARTS      AGE
kubernetes-dashboard   dashboard-metrics-scraper-799d786dbf-6wksz   0/1     ContainerCreating   0             8h

Describing the pod shows it was scheduled to node skywork2 and failed with "cni0" already has an IP address different from 10.244.2.1/24:

k describe pods dashboard-metrics-scraper-799d786dbf-hqlg6 -n kubernetes-dashboard 
Name:           dashboard-metrics-scraper-799d786dbf-hqlg6
Namespace:      kubernetes-dashboard
Priority:       0
Node:           skywork2/192.168.0.20
......
  Warning  FailedCreatePodSandBox  17s (x4 over 20s)   kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "41479d55f5428ec9a36267170dd1516f996bcf9d49f772d98c2fc79230f64830" network for pod "dashboard-metrics-scraper-799d786dbf-hqlg6": networkPlugin cni failed to set up pod "dashboard-metrics-scraper-799d786dbf-hqlg6_kubernetes-dashboard" network: failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.2.1/24

This is because the node had run kubeadm init before this kubeadm join, and some network configuration was left behind after kubeadm reset.

The fix is to reset the network completely and then join again:

sudo -i
kubeadm reset -f
systemctl stop kubelet
systemctl stop docker
rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/*
rm -rf /etc/cni/
rm -rf /etc/kubernetes/
ifconfig cni0 down
ifconfig flannel.1 down
ifconfig docker0 down
ip link delete cni0
ip link delete flannel.1
systemctl start docker
systemctl start kubelet

After everything is cleaned up, run kubeadm join again.

Note: after running kubeadm reset on the node, kubectl get nodes on the master may keep listing the node for a long time. To be safe, manually run kubectl delete node skywork2 once.

References:

2.1.4 - Deploying and accessing the Dashboard

Dashboard

References:

Deploy the dashboard

Check the current dashboard version at:

https://github.com/kubernetes/dashboard/releases

Pick the dashboard version that matches your kubernetes version:

  • dashboard 2.7: fully compatible with k8s 1.25
  • dashboard 2.6.1: fully compatible with k8s 1.24
  • dashboard 2.5.1: fully compatible with k8s 1.23

Deploy with:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.1/aio/deploy/recommended.yaml

The version number can be looked up at https://github.com/kubernetes/dashboard/releases

After a successful deployment you can see the two kubernetes-dashboard pods:

$ k get pods -A          
NAMESPACE              NAME                                         READY   STATUS    RESTARTS      AGE
kubernetes-dashboard   dashboard-metrics-scraper-799d786dbf-krhln   1/1     Running   0             11m
kubernetes-dashboard   kubernetes-dashboard-6b6b86c4c5-ptstx        1/1     Running   0             8h

And the two kubernetes-dashboard services:

$ k get services -A 
NAMESPACE              NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                  AGE
kubernetes-dashboard   dashboard-metrics-scraper   ClusterIP   10.103.242.118   <none>        8000/TCP                 8h
kubernetes-dashboard   kubernetes-dashboard        ClusterIP   10.106.3.227     <none>        443/TCP                  8h

Access the dashboard

See the official guide: https://github.com/kubernetes/dashboard/blob/master/docs/user/accessing-dashboard/README.md

The dashboard above was deployed with the recommended config, matching what the guide expects.

The current cluster info:

$ kubectl cluster-info
Kubernetes control plane is running at https://192.168.0.41:6443
CoreDNS is running at https://192.168.0.41:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

kubectl proxy

kubectl proxy starts a local proxy server reachable only via localhost, suitable only for a local single cluster:

$ k proxy          
Starting to serve on 127.0.0.1:8001

kubectl port-forward

$ kubectl port-forward -n kubernetes-dashboard service/kubernetes-dashboard 8080:443
Forwarding from 127.0.0.1:8080 -> 8443
Forwarding from [::1]:8080 -> 8443

Similarly, this is only reachable locally, at https://localhost:8080.

NodePort

Run:

kubectl -n kubernetes-dashboard edit service kubernetes-dashboard

Change type: ClusterIP to type: NodePort:

apiVersion: v1
...
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
  resourceVersion: "343478"
  selfLink: /api/v1/namespaces/kubernetes-dashboard/services/kubernetes-dashboard
  uid: 8e48f478-993d-11e7-87e0-901b0e532516
spec:
  clusterIP: 10.100.124.90
  externalTrafficPolicy: Cluster
  ports:
  - port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
  sessionAffinity: None
  type: ClusterIP
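
Instead of editing the service interactively, the same change can be applied with a one-line patch (a sketch with the same effect):

kubectl -n kubernetes-dashboard patch service kubernetes-dashboard -p '{"spec":{"type":"NodePort"}}'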

Check which node port was allocated:

$ kubectl -n kubernetes-dashboard get service kubernetes-dashboard
NAME                   TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kubernetes-dashboard   NodePort   10.106.3.227   <none>        443:32212/TCP   9h

Here port 32212 was allocated.

Next comes the node IP. For a single-node cluster the node IP is simply the master node's IP, available from kubectl cluster-info. For a multi-node cluster you need to find which node the kubernetes-dashboard service was deployed to:

$ k get pods -A -o wide | grep kubernetes-dashboard
kubernetes-dashboard   dashboard-metrics-scraper-799d786dbf-krhln   1/1     Running   0             32m   10.244.1.3     skyserver2   <none>           <none>
kubernetes-dashboard   kubernetes-dashboard-6b6b86c4c5-ptstx        1/1     Running   0             9h    10.244.1.2     skyserver2   <none>           <none>

As shown, the kubernetes-dashboard service is on node skyserver2, whose IP is 192.168.0.50, so the combined address is:

https://192.168.0.50:32212

Or, for convenience, bind each node's name to its IP address by editing the hosts file with sudo vi /etc/hosts and adding:

# node IP
192.168.0.10            skywork
192.168.0.20            skywork2
192.168.0.40            skyserver
192.168.0.50            skyserver2

After that the dashboard can be reached at https://skyserver2:32212.

Special note: how browsers treat sites with self-signed certificates

The browser can connect to the address, but because the site uses a self-signed certificate it refuses access with a "this connection is not secure" error.

Per-browser behavior:

  • Edge: refuses access; the magic phrase thisisunsafe works (there is no input box; just click the page so it has focus and type it)
  • Firefox: refuses by default; choosing "Accept the Risk and Continue" allows normal access
  • Chrome: untested; the magic phrase thisisunsafe should work
  • Safari: refuses by default; click "Show details" -> "visit this website" -> "visit website" to bypass the restriction

Reference:

Log in to the Dashboard

Log in with a token

A token can be obtained simply with:

kubectl -n kube-system describe $(kubectl -n kube-system get secret -n kube-system -o name | grep namespace) | grep token

The output is:

$ kubectl -n kube-system describe $(kubectl -n kube-system get secret -n kube-system -o name | grep namespace) | grep token
Name:         namespace-controller-token-r87br
Type:  kubernetes.io/service-account-token
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6ImNuYUVPT3FRR0dVOFBmN3pFeW81Y1p5R004RVh6VGtJUUpfSHo1ZVFMUVEifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJuYW1lc3BhY2UtY29udHJvbGxlci10b2tlbi1yODdiciIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJuYW1lc3BhY2UtY29udHJvbGxlciIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImU2NjU3ODI3LTc4NTUtNDAzOC04MmJjLTlmMjI0OWM3NzYyZiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTpuYW1lc3BhY2UtY29udHJvbGxlciJ9.sVRT_x5NB4sqYwyyqn2Mm3hKg1jhvCsCDMbm_JY-3a19tknzwv_ZPpGOHWrPxmCG45_-tHExi7BbbGK1ZAky2UjtEpxmtVNR6yqHRMYvXtqifqHI4yS6ig-t5WiZ0a4h1q6xZfWsM9nlINSTGQbguCCN2kXUYyAZ0HPdPhdFtmyH9_fjI-FXQOPeK9t9GfWn9Nm52T85spzriwOMY96fFXZ3YaiuzfY5aBtGoxLwDu7O2GOazBmeFaRzEEGR0RjgdM7WPFmtDvbaidIJDPkLznqftqwUFeWHjz6-toO8iaKW_QKHFBvZTQ6uXSc__tbcSYyThu3Ty97-Ml8TArhacw

Copy the token shown and submit it to log in.

Reference:

Log in with a kubeconfig file

Add the token to the kubeconfig file (path ~/.kube/config):

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: XXXXXX==
    server: https://192.168.0.41:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: XXXXX==
    client-key-data: XXXX=
    token: eyJhbGciOiJSUzI1NiIsImtpZCI6ImNuYUVPT3FRR0dVOFBmN3pFeW81Y1p5R004RVh6VGtJUUpfSHo1ZVFMUVEifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJuYW1lc3BhY2UtY29udHJvbGxlci10b2tlbi1yODdiciIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJuYW1lc3BhY2UtY29udHJvbGxlciIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImU2NjU3ODI3LTc4NTUtNDAzOC04MmJjLTlmMjI0OWM3NzYyZiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTpuYW1lc3BhY2UtY29udHJvbGxlciJ9.sVRT_x5NB4sqYwyyqn2Mm3hKg1jhvCsCDMbm_JY-3a19tknzwv_ZPpGOHWrPxmCG45_-tHExi7BbbGK1ZAky2UjtEpxmtVNR6yqHRMYvXtqifqHI4yS6ig-t5WiZ0a4h1q6xZfWsM9nlINSTGQbguCCN2kXUYyAZ0HPdPhdFtmyH9_fjI-FXQOPeK9t9GfWn9Nm52T85spzriwOMY96fFXZ3YaiuzfY5aBtGoxLwDu7O2GOazBmeFaRzEEGR0RjgdM7WPFmtDvbaidIJDPkLznqftqwUFeWHjz6-toO8iaKW_QKHFBvZTQ6uXSc__tbcSYyThu3Ty97-Ml8TArhacw

The generated kubeconfig file has no token field by default; just add one.

Then submit this kubeconfig file on the login page. Compared with token login, you don't need to fetch the token every time; save the file once and it is much more convenient afterwards.

2.1.5 - Deploying metrics-server

metrics-server

Install metrics-server

A k8s cluster installed via kubeadm does not include metrics-server by default, so it has to be installed manually.

Note: do not install it exactly as the official docs describe; it will not work without the changes below.

Modify the api server

First check whether the cluster's api server has the API aggregator enabled:

ps -ef | grep apiserver 

Compare with:

ps -ef | grep apiserver | grep enable-aggregator-routing

It is not enabled by default, so the k8s apiserver manifest needs to be modified:

sudo vi /etc/kubernetes/manifests/kube-apiserver.yaml

Add --enable-aggregator-routing=true:

apiVersion: v1
kind: Pod
......
spec:
  containers:
  - command:
    - kube-apiserver
	......
    - --enable-bootstrap-token-auth=true
    - --enable-aggregator-routing=true  # add this line

The api server restarts automatically; verify shortly afterwards with:

ps -ef | grep apiserver | grep enable-aggregator-routing

Download and modify the install manifest

Download the manifest, using the latest version directly:

wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

To install a specific version instead, see the https://github.com/kubernetes-sigs/metrics-server/releases/ page.

Edit the downloaded components.yaml: add --kubelet-insecure-tls and change --kubelet-preferred-address-types:

  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP   # change this line; the default is InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls  # add this line

Then install:

$ k apply -f components.yaml

serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

Wait a moment and check that it starts:

$ kubectl get pod -n kube-system | grep metrics-server
metrics-server-5979f785c8-lmtq5     1/1     Running   0                46s

Verify by checking the service info:

$ kubectl describe svc metrics-server -n kube-system

Name:              metrics-server
Namespace:         kube-system
Labels:            k8s-app=metrics-server
Annotations:       <none>
Selector:          k8s-app=metrics-server
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.98.127.10
IPs:               10.98.127.10
Port:              https  443/TCP
TargetPort:        https/TCP
Endpoints:         10.244.0.37:4443		# try pinging this IP address
Session Affinity:  None
Events:            <none>

Usage

A quick check of basic usage:

$ kubectl top nodes
NAME        CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
skyserver   384m         1%     1687Mi          1% 

$ kubectl top pods -n kube-system 
NAME                                CPU(cores)   MEMORY(bytes)   
coredns-64897985d-9z82d             2m           19Mi            
coredns-64897985d-wkzc7             2m           20Mi            
etcd-skyserver                      23m          77Mi            
kube-apiserver-skyserver            74m          282Mi           
kube-controller-manager-skyserver   24m          58Mi            
kube-flannel-ds-lnl72               4m           39Mi            
kube-proxy-8g26s                    1m           37Mi            
kube-scheduler-skyserver            5m           23Mi            
metrics-server-5979f785c8-lmtq5     4m           21Mi 

2.1.6 - Installing kubernetes on Ubuntu 20.04 with kubeadm

Installing kubernetes on Ubuntu with kubeadm

See the official Kubernetes documentation:

Preparation

Disable the firewall

systemctl disable firewalld && systemctl stop firewalld

Install docker and bridge-utils

The node must have docker (or another container runtime) and bridge-utils (used to manipulate the linux bridge) installed.

Check the docker version:

$ docker --version
Docker version 20.10.21, build baeda1f

bridge-utils can be installed via apt:

sudo apt-get install bridge-utils

Configure iptables

Make sure the br_netfilter module is loaded; this can be checked by running lsmod | grep br_netfilter:

$ lsmod | grep br_netfilter
br_netfilter           32768  0
bridge                307200  1 br_netfilter

To load it explicitly, run sudo modprobe br_netfilter.

So that iptables on the Linux node can see bridged traffic, make sure net.bridge.bridge-nf-call-iptables is set to 1 in the sysctl configuration. Run:

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system

Disable swap

Check with the free -m command:

$ free -m        
              total        used        free      shared  buff/cache   available
Mem:          15896        1665       11376          20        2854       13819
Swap:             0           0           0

If the Swap row is not all zeros, swap is enabled and must be turned off.

Two things need to be done:

  1. Do not create a swap partition when installing the OS; if one exists, delete it

  2. Even without a swap partition, swap may still be enabled (typically via a swap file); use sudo vi /etc/fstab to find the swap line:

    # prefix this swap line with # to disable swap
    /swapfile                                 none            swap    sw              0       0
    

    Check again with free -m after a reboot.

Set docker's cgroup driver

docker's default cgroup driver is cgroupfs, as docker info shows:

$ docker info | grep "Cgroup Driver"
 Cgroup Driver: cgroupfs

Since kubernetes v1.22, if the user does not set the cgroupDriver field under KubeletConfiguration, kubeadm defaults it to systemd.

So docker's cgroup driver must be changed to systemd. Open docker's config file (create it if it does not exist):

sudo vi /etc/docker/daemon.json

Add:

{
"exec-opts": ["native.cgroupdriver=systemd"]
}

After the change, restart docker:

systemctl restart docker

# check again after the restart
docker info | grep "Cgroup Driver"

Otherwise, during installation, the cgroup driver mismatch makes kubeadm init time out because the kubelet cannot start, with errors like:

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

systemctl status kubelet shows the kubelet exiting with an error, and journalctl -xeu kubelet shows messages like:

Dec 26 22:31:21 skyserver2 kubelet[132861]: I1226 22:31:21.438523  132861 docker_service.go:264] "Docker Info" dockerInfo=&{ID:AEON:SBVF:43UK:WASV:YIQK:QGGA:7RU3:IIDK:DV7M:6QLH:5ICJ:KT6R Containers:2 ContainersRunning:0 ContainersPaused:>
Dec 26 22:31:21 skyserver2 kubelet[132861]: E1226 22:31:21.438616  132861 server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"c>
Dec 26 22:31:21 skyserver2 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- An ExecStart= process belonging to unit kubelet.service has exited.
-- 
-- The process' exit code is 'exited' and its exit status is 1.

Reference:

Install kubeadm

Following the official documentation, run:

sudo -i
apt-get update
apt-get install -y apt-transport-https ca-certificates curl
curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list

apt-get update
apt-get install -y kubelet kubeadm kubectl

This installs the latest kubernetes:

......
Setting up conntrack (1:1.4.6-2build2) ...
Setting up kubectl (1.25.4-00) ...
Setting up ebtables (2.0.11-4build2) ...
Setting up socat (1.7.4.1-3ubuntu4) ...
Setting up cri-tools (1.25.0-00) ...
Setting up kubernetes-cni (1.1.1-00) ...
Setting up kubelet (1.25.4-00) ...
Created symlink /etc/systemd/system/multi-user.target.wants/kubelet.service → /lib/systemd/system/kubelet.service.
Setting up kubeadm (1.25.4-00) ...
Processing triggers for man-db (2.10.2-1) ...
Processing triggers for doc-base (0.11.1) ...
Processing 1 added doc-base file...

# check the versions
$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean", BuildDate:"2022-11-09T13:35:06Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"linux/amd64"}

$ kubelet --version
Kubernetes v1.25.4

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean", BuildDate:"2022-11-09T13:36:36Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
The connection to the server localhost:8080 was refused - did you specify the right host or port?

If you want a specific version:

apt-get install kubelet=1.23.5-00 kubeadm=1.23.5-00 kubectl=1.23.5-00

apt-get install kubelet=1.23.14-00 kubeadm=1.23.14-00 kubectl=1.23.14-00

apt-get install kubelet=1.24.8-00 kubeadm=1.24.8-00 kubectl=1.24.8-00

The list of available versions is here:

https://packages.cloud.google.com/apt/dists/kubernetes-xenial/main/binary-amd64/Packages

Since kubernetes 1.25, the default is to use

Install k8s

sudo kubeadm init --pod-network-cidr=10.244.0.0/16 -v=9
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.100.40 -v=9

Note that we set --pod-network-cidr=10.244.0.0/16 here because we will use a CNI network with Flannel later; without it Flannel keeps reporting errors. If the machine has multiple network interfaces, use --apiserver-advertise-address to specify the IP address to use.

If you hit this error:

[preflight] Some fatal errors occurred:
	[ERROR CRI]: container runtime is not running: output: E1125 11:16:01.799551   14661 remote_runtime.go:948] "Status from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
time="2022-11-25T11:16:01+08:00" level=fatal msg="getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
, error: exit status 1

run the following and then retry kubeadm init:

$ rm -rf /etc/containerd/config.toml 
$ systemctl restart containerd.service

The kubeadm init output looks like this:

......

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.100.40:6443 --token uq5nqn.bppygpcqty6icec4 \
	--discovery-token-ca-cert-hash sha256:51c13871cd25b122f3a743040327b98b1c19466d01e1804aa2547c047b83632b 

To work as a regular user, follow the hint above:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

After installation the node is in NotReady state:

$ kubectl get node  
NAME        STATUS     ROLES                  AGE    VERSION
skyserver   NotReady   control-plane,master   3m7s   v1.23.5

kubectl describe shows that this is because no network plugin is installed:

$ kubectl describe node skyserver
Name:               skyserver
Roles:              control-plane,master
......
  Ready            False   Thu, 24 Mar 2022 13:57:21 +0000   Thu, 24 Mar 2022 13:57:06 +0000   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

Install flannel:

kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml

Note: sometimes the raw.githubusercontent.com domain is DNS-poisoned and resolves to 127.0.0.1, making it unreachable. The workaround is to visit https://ipaddress.com/website/raw.githubusercontent.com, pick the fastest of the listed IP addresses, and add a record to /etc/hosts, e.g. 185.199.111.133 raw.githubusercontent.com

After a short wait the node status changes to Ready:

$ kubectl get node                                                                                           
NAME        STATUS   ROLES                  AGE     VERSION
skyserver   Ready    control-plane,master   4m52s   v1.23.5

Finally, for a single-node test setup, remove the master taint so that workloads can run on the k8s master node:

kubectl taint nodes --all node-role.kubernetes.io/master-

You can compare the Taints section of the node info with kubectl describe node skyserver before and after. Before removing the taint:

Taints:             node.kubernetes.io/not-ready:NoExecute
                    node-role.kubernetes.io/master:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule

After removing the taint:

Taints:             <none>

Common problems

Sometimes the coredns pods fail to be created:

$ k get pods -A                                                                                              
NAMESPACE     NAME                                READY   STATUS              RESTARTS   AGE
kube-system   coredns-64897985d-9z82d             0/1     ContainerCreating   0          82s
kube-system   coredns-64897985d-wkzc7             0/1     ContainerCreating   0          82s

The problem is with flannel:

$ k describe pods -n kube-system coredns-64897985d-9z82d
......
  Warning  FailedCreatePodSandBox  100s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "675b91ac9d25f0385d3794847f47c94deac2cb712399c21da59cf90e7cccb246" network for pod "coredns-64897985d-9z82d": networkPlugin cni failed to set up pod "coredns-64897985d-9z82d_kube-system" network: open /run/flannel/subnet.env: no such file or directory
  Normal   SandboxChanged          97s (x12 over 108s)  kubelet            Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  96s (x4 over 99s)    kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "b46dcd8abb9ab0787fdb2ab9f33ebf052c2dd1ad091c006974a3db7716904196" network for pod "coredns-64897985d-9z82d": networkPlugin cni failed to set up pod "coredns-64897985d-9z82d_kube-system" network: open /run/flannel/subnet.env: no such file or directory

The fix is to re-run:

kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml

Note: this problem has only been encountered once.

Starting over after a failure

If the installation fails and you need to start over, or you want to tear down an existing installation:

  1. Run kubeadm reset
  2. Delete the .kube directory
  3. Run kubeadm init again

If the network settings have changed, the network must be reset completely; see the next chapter.

Joining nodes to the cluster

If there are multiple kubernetes nodes (i.e. multiple machines), the other nodes need to be joined to the cluster; see the next chapter.

2.2 - Installing kubernetes on Debian 12

Installing kubernetes on Debian 12

2.2.1 - Installing kubernetes on Debian 12

Installing kubernetes on Debian 12

Preparation

System update

Make sure the debian system is fully up to date, remove software that is no longer needed, and clean up unused packages:

sudo apt update && sudo apt full-upgrade -y
sudo apt autoremove
sudo apt autoclean

If the kernel was updated, it is best to reboot.

Swap partition

Installing Kubernetes requires that the machine have no active swap; a quick check-and-disable sketch follows.
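
A minimal sketch for checking and disabling swap (the sed pattern assumes a standard swap entry in /etc/fstab):

swapon --show                                # no output means swap is already off
sudo swapoff -a                              # disable active swap immediately
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab   # comment out swap entries so this survives reboots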

Enable kernel modules

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# sysctl params required by setup, params persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

# Apply sysctl params without reboot
sudo sysctl --system

Install docker

Remove unofficial Docker packages:

for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt-get remove $pkg; done

Install:

sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Start docker and enable it at boot:

sudo systemctl enable docker --now

Install golang 1.20 or later

This prepares for manually building cri-dockerd below.

https://golang.org/dl/

Download the latest golang:

mkdir -p ~/temp
mkdir -p ~/work/soft/gopath
cd ~/temp
wget https://go.dev/dl/go1.22.4.linux-amd64.tar.gz
sudo tar -C /usr/local -xzf go1.22.4.linux-amd64.tar.gz

Edit:

vi ~/.zshrc

Add the following:

export GOPATH=/home/sky/work/soft/gopath
export PATH=/usr/local/go/bin:$GOPATH/bin:$PATH

Run:

source ~/.zshrc

go version
go env

Install cri-dockerd

Note: golang 1.20 or later must be installed first.

mkdir -p ~/temp
cd ~/temp
git clone https://github.com/Mirantis/cri-dockerd.git
cd cri-dockerd
make cri-dockerd
sudo mkdir -p /usr/local/bin
sudo install -o root -g root -m 0755 cri-dockerd /usr/local/bin/cri-dockerd
sudo install packaging/systemd/* /etc/systemd/system
sudo sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service
sudo systemctl daemon-reload
sudo systemctl enable cri-docker.service
sudo systemctl enable --now cri-docker.socket

Install helm

In preparation for installing the dashboard later:

curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
sudo apt-get install apt-transport-https --yes
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm

Install kubernetes

Install kubeadm / kubelet / kubectl

sudo apt-get update
sudo apt-get install -y apt-transport-https

Assume the kubernetes version to install is 1.29:

export K8S_VERSION=1.29

curl -fsSL https://pkgs.k8s.io/core:/stable:/v${K8S_VERSION}/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v${K8S_VERSION}/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list

Install kubelet, kubeadm, and kubectl:

sudo apt update
sudo apt install kubelet kubeadm kubectl -y

Prevent these three packages from auto-updating:

sudo apt-mark hold kubelet kubeadm kubectl

Verify the installation:

kubectl version --client && echo && kubeadm version
Client Version: v1.29.6
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3

kubeadm version: &version.Info{Major:"1", Minor:"29", GitVersion:"v1.29.6", GitCommit:"062798d53d83265b9e05f14d85198f74362adaca", GitTreeState:"clean", BuildDate:"2024-06-11T20:22:13Z", GoVersion:"go1.21.11", Compiler:"gc", Platform:"linux/amd64"}

zsh tweaks

Add the following to ~/.zshrc:

# k8s auto complete
alias k=kubectl
complete -F __start_kubectl k

After source ~/.zshrc it is ready to use; auto-completion also works when running kubectl through the k alias, which is very convenient.

Disable auto-updates

There is no need to keep docker / helm / kubernetes on their latest versions, so their auto-updates can be disabled. The relevant apt source files:

cd /etc/apt/sources.list.d
ls
docker.list  helm-stable-debian.list  kubernetes.list
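
Since kubelet / kubeadm / kubectl are already pinned with apt-mark hold above, commenting out the docker and helm sources is enough; a sketch (run from /etc/apt/sources.list.d as above):

sudo sed -i 's/^deb/# deb/' docker.list helm-stable-debian.list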

Initialize the cluster

sudo kubeadm init --pod-network-cidr 10.244.0.0/16 --cri-socket unix:///var/run/cri-dockerd.sock --apiserver-advertise-address=192.168.6.224

The output is:

sudo kubeadm init --pod-network-cidr 10.244.0.0/16 --cri-socket unix:///var/run/cri-dockerd.sock --apiserver-advertise-address=192.168.6.224
I0621 05:51:22.665581   20837 version.go:256] remote version is much newer: v1.30.2; falling back to: stable-1.29
[init] Using Kubernetes version: v1.29.6
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'

[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [debian12 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.6.224]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [debian12 localhost] and IPs [192.168.6.224 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [debian12 localhost] and IPs [192.168.6.224 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 3.500697 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node debian12 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node debian12 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: 1x7fi1.kjxn00med7dd3xwx
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.6.224:6443 --token 1x7fi1.kjxn00med7dd3xwx \
	--discovery-token-ca-cert-hash sha256:51037fa4e37f485e10cb8ddfe8ec23e57d0dcd6698e5982f01449b6b6ca843e5

Follow the hints from the output:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

At this point the node will be in NotReady state:

kubectl get node  
NAME       STATUS     ROLES           AGE    VERSION
debian12   NotReady   control-plane   4m7s   v1.29.6

A network plugin still needs to be installed; choose either flannel or Calico.

For a single test node, remove the master/control-plane taint:

kubectl taint nodes --all node-role.kubernetes.io/control-plane-

(Optional) Install flannel

kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml

(Optional) Install Calico

https://docs.tigera.io/calico/latest/getting-started/kubernetes/self-managed-onprem/onpremises#install-calico

Check for the latest version; at the time of writing it is v3.28:

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml

Install the dashboard

Check the current dashboard version at:

https://github.com/kubernetes/dashboard/releases

Pick the dashboard version compatible with your kubernetes version:

  • dashboard 7.5: fully compatible with k8s 1.29

The latest versions must be installed with helm:

helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard

The output is:

helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard
"kubernetes-dashboard" has been added to your repositories
Release "kubernetes-dashboard" does not exist. Installing it now.
NAME: kubernetes-dashboard
LAST DEPLOYED: Fri Jun 21 06:23:53 2024
NAMESPACE: kubernetes-dashboard
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
*************************************************************************************************
*** PLEASE BE PATIENT: Kubernetes Dashboard may need a few minutes to get up and become ready ***
*************************************************************************************************

Congratulations! You have just installed Kubernetes Dashboard in your cluster.

To access Dashboard run:
  kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard-kong-proxy 8443:443

NOTE: In case port-forward command does not work, make sure that kong service name is correct.
      Check the services in Kubernetes Dashboard namespace using:
        kubectl -n kubernetes-dashboard get svc

Dashboard will be available at:
  https://localhost:8443

The dashboard services at this point:

kubectl -n kubernetes-dashboard get services 
NAME                                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
kubernetes-dashboard-api               ClusterIP   10.107.22.93     <none>        8000/TCP                        17m
kubernetes-dashboard-auth              ClusterIP   10.102.201.198   <none>        8000/TCP                        17m
kubernetes-dashboard-kong-manager      NodePort    10.103.64.84     <none>        8002:30161/TCP,8445:31811/TCP   17m
kubernetes-dashboard-kong-proxy        ClusterIP   10.97.134.204    <none>        443/TCP                         17m
kubernetes-dashboard-metrics-scraper   ClusterIP   10.98.177.211    <none>        8000/TCP                        17m
kubernetes-dashboard-web               ClusterIP   10.109.72.203    <none>        8000/TCP                        17m

You can visit http://ip:31811 to reach the kong manager bundled with the dashboard.

Older versions were accessed through the kubernetes-dashboard service; the new versions are accessed through kubernetes-dashboard-kong-proxy.

For convenience, access the dashboard via a node port. Run:

kubectl -n kubernetes-dashboard edit service kubernetes-dashboard-kong-proxy

Then change type: ClusterIP to type: NodePort, and check which node port was allocated:

kubectl -n kubernetes-dashboard get service kubernetes-dashboard-kong-proxy

The output is:

$ kubectl -n kubernetes-dashboard get service kubernetes-dashboard-kong-proxy 

NAME                              TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
kubernetes-dashboard-kong-proxy   NodePort   10.97.134.204   <none>        443:30730/TCP   24m

It can then be opened directly in a browser:

https://192.168.0.101:30730/

Create a user and log in to the dashboard

Reference: Creating sample user

Create the admin-user service account:

vi admin-user-ServiceAccount.yaml

Content:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard

Run:

k create -f admin-user-ServiceAccount.yaml

Then bind the role:

vi admin-user-ClusterRoleBinding.yaml

Content:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard

Run:

k create -f admin-user-ClusterRoleBinding.yaml

Then create a token:

kubectl -n kubernetes-dashboard create token admin-user

The output is:

$ kubectl -n kubernetes-dashboard create token admin-user
eyJhbGciOiJSUzI1NiIsImtpZCI6IjdGczc3STI1VVA1OFpKdF9zektMVVFtZjd1NXRDRU8xTTZpZ1VYbDdKWFEifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNzE5MDM2NDM0LCJpYXQiOjE3MTkwMzI4MzQsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJhZG1pbi11c2VyIiwidWlkIjoiNGY4YmQ3YjAtZjM2OS00MjgzLWJlNmItMThjNjUyMzE0YjQ0In19LCJuYmYiOjE3MTkwMzI4MzQsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbi11c2VyIn0.GOYLXoCCeaZPQ-kuJgx0d4KzRnLkHDHJArAjOwRqg49WIhAl3Hb8O2oD6at2jFgItO-xihFm3D3Ru2jXnPnMhvir0BJ5LBnumH0xDakZ4PrwvCAQADv8KR1ZuzMHlN5yktJ14eSo_UN1rZarq5P1DnbAIHRmgtIlRL2Hfl_Bamkuoxpwr06v50nJHskW7K3A2LjUlgv5rdS7FckIPaD5apmag7NyUi7FP1XEItUX20tF7jy5E5Gv9mI_HDGMTVMxawY4IAvipRcKVQ3tAypVOOMhrqGsfBprtWUkwmyWW8p0jHcAmqq-WX-x-vN70qI4Y2RipKGd4d6z39zPEPCsow

This token can now be used on the kubernetes-dashboard login page.

For convenience, store the token in a Secret:

vi admin-user-Secret.yaml

Content:

apiVersion: v1
kind: Secret
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
  annotations:
    kubernetes.io/service-account.name: "admin-user"   
type: kubernetes.io/service-account-token

Run:

k create -f admin-user-Secret.yaml

After that the token can be fetched at any time with:

kubectl get secret admin-user -n kubernetes-dashboard -o jsonpath={".data.token"} | base64 -d

Install metrics server

Download:

cd ~/work/soft/k8s
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Edit the downloaded components.yaml: add --kubelet-insecure-tls and change --kubelet-preferred-address-types:

  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP   # change this line; the default is InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls  # add this line

Then install:

k apply -f components.yaml

Wait a moment and check that it starts:

$ kubectl get pod -n kube-system | grep metrics-server

Verify by checking the service info:

kubectl describe svc metrics-server -n kube-system

A quick check of basic usage:

kubectl top nodes
kubectl top pods -n kube-system 

References

3 - Installing kubernetes with minikube

Installing kubernetes with minikube

3.1 - minikube overview

minikube overview

3.2 - Installing with minikube on Ubuntu

Installing kubernetes with minikube on Ubuntu

See the official docs:

https://kubernetes.io/docs/tasks/tools/install-minikube/

Preparation

  1. VT-x or AMD-v virtualization support must be enabled in the machine's BIOS
  2. Install a hypervisor: on Linux, VirtualBox or KVM

Install VirtualBox

See here for the details:

https://skyao.io/learning-linux-mint/daily/system/virtualbox.html

Install kubectl

Reference:

https://kubernetes.io/docs/tasks/tools/install-kubectl/

Run the following commands:

curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl

chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
kubectl version

Note: if you upgrade Minikube, be sure to repeat the steps above so kubectl is updated to the latest version as well; otherwise problems can occur, such as minikube dashboard failing to open the browser.

If you need to reinstall, clean up first:

rm -rf ~/.kube/
sudo rm -rf /usr/bin/kubectl

Install Minikube

Install commands:

curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 \
  && sudo install minikube-linux-amd64 /usr/local/bin/minikube

minikube version

When a new version is released, re-running the commands above installs it again.

Running k8s with minikube

Reference: https://kubernetes.io/docs/getting-started-guides/minikube/

Warning: be sure to set a proxy

Remember to set a proxy; otherwise image pulls fail because of network blocking, and kubectl get pod will show pods stuck in the ContainerCreating status:

minikube start --docker-env http_proxy=http://192.168.31.152:8123 --docker-env https_proxy=http://192.168.31.152:8123 --docker-env no_proxy=localhost,127.0.0.1,::1,192.168.31.0/24,192.168.99.0/24

With transparent global proxying, a plain minikube start is enough.

Note: the proxy here must be an HTTP proxy, so you cannot point it directly at a shadowsocks address; use the HTTP proxy provided by polipo, and set polipo's proxyAddress so it does not listen only on 127.0.0.1.

If you ran minikube start without the proxy set up correctly, the only fix is to delete the VM and start over; setting the proxy afterwards has no effect:

minikube stop
minikube delete

Following the article above you can test whether minikube works properly. To open the k8s dashboard, run:

minikube dashboard

Notes

Some commands used, kept here for reference.

kubectl:

  • kubectl get pod
  • kubectl get pods --all-namespaces
  • kubectl get service
  • kubectl describe po hello-minikube-180744149-lj0rd

minikube:

  • minikube dashboard
  • minikube status
  • minikube service hello-minikube --url
  • curl $(minikube service hello-minikube --url)

3.3 - Installing with minikube on macOS

Installing kubernetes with minikube on macOS

Install VirtualBox

https://www.virtualbox.org/wiki/Downloads

Download the mac build, e.g. the "OS X hosts" package under "VirtualBox 6.0.4 platform packages", and install the downloaded image file.

Then download the VirtualBox 6.0.4 Oracle VM VirtualBox Extension Pack and double-click to install it.

Install kubectl

brew install kubernetes-cli

Install minikube

brew cask install minikube

Test afterwards:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-13T23:15:13Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:28:14Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/amd64"}

Start minikube

In an environment with unrestricted network access, installing the latest kubernetes is as simple as:

minikube start

Without global proxying, you can specify a registry mirror, and also pin the kubernetes version to install:

minikube start --memory=8192 --cpus=4 --disk-size=20g  --registry-mirror=https://docker.mirrors.ustc.edu.cn --kubernetes-version=v1.12.5 --docker-env http_proxy=http://192.168.0.40:8123 --docker-env https_proxy=http://192.168.0.40:8123 --docker-env no_proxy=localhost,127.0.0.1,::1,192.168.0.0/24,192.168.99.0/24


minikube start --memory=8192 --cpus=4 --disk-size=20g --kubernetes-version=v1.12.5 

In practice the download still failed; possibly http_proxy=http://192.168.0.40:8123 https_proxy=http://192.168.0.40:8123 needs to be prefixed to the minikube start command as well. To be verified later.

Start the dashboard

The standard way: run minikube dashboard, and the browser opens automatically at http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/#!/login

Note: this worked fine in all earlier versions, but recently, with kubernetes 1.13.2 installed by minikube v0.33.1, it failed with:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-13T23:15:13Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:28:14Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/amd64"}

$ minikube version
minikube version: v0.33.1

$ minikube dashboard
Enabling dashboard ...
Verifying dashboard health ...
Launching proxy ...
Verifying proxy health ...

http://127.0.0.1:51695/api/v1/namespaces/kube-system/services/http:kubernetes-dashboard:/proxy/ is not responding properly: Temporary Error: unexpected response code: 503
Temporary Error: unexpected response code: 503
Temporary Error: unexpected response code: 503
Temporary Error: unexpected response code: 503

This makes the dashboard impossible to open, so fall back to kubectl proxy:

$ kubectl proxy
Starting to serve on 127.0.0.1:8001

Then open the address manually: http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/#!/login

Reference: https://stackoverflow.com/questions/52916548/minikube-dashboard-returns-503-error-on-macos

And then the dashboard login problem came up again.

4 - Installing kubernetes in Docker Desktop

Installing kubernetes in Docker Desktop

4.1 - Installing kubernetes with Docker Desktop on macOS

Installing kubernetes with Docker Desktop on macOS

This is the easiest way to get a working kubernetes on a mac.

Installation

After Docker Desktop is installed, open "Preferences", select "Kubernetes" on the left, and enable it.

Docker Desktop then installs the latest version of kubernetes by itself.

The biggest advantage of this approach is its simplicity: as long as the network is fine, a few mouse clicks are basically all it takes.