Kubernetes 1.17.5 手工部署
构建企业级的高可用架构
- Kubernetes 1.17.5-rc2
- Docker 19.03.6-ce
- Etcd 3.3.7
- Flanneld 0.11.0
- calico xx.xx
- 插件:
- Coredns
- Dashboard
- Heapster (influxdb、grafana)
- Metrics-Server
- EFK (elasticsearch、fluentd、kibana)
- 镜像仓库:
- docker registry
- harbor
主要配置策略
kube-apiserver:
- 使用 keepalived 和 haproxy 实现 3 节点高可用
- 关闭非安全端口 8080 和匿名访问
- 在安全端口 6443 接收 https 请求
- 严格的认证和授权策略 (x509、token、RBAC)
- 开启 bootstrap token 认证,支持 kubelet TLS bootstrapping
- 使用 https 访问 kubelet、etcd,加密通信
kube-controller-manager:
- 3 节点高可用;
- 关闭非安全端口 10252,在安全端口 10257 接收 https 请求;
- 使用 kubeconfig 访问 apiserver 的安全端口;
- 自动 approve kubelet 证书签名请求 (CSR),证书过期后自动轮转;
- 各 controller 使用自己的 ServiceAccount 访问 apiserver;
kube-scheduler:
- 3 节点高可用;
- 使用 kubeconfig 访问 apiserver 的安全端口;
kubelet:
- 使用 kubeadm 动态创建 bootstrap token,而不是在 apiserver 中静态配置;
- 使用 TLS bootstrap 机制自动生成 client 和 server 证书,过期后自动轮转;
- 在 KubeletConfiguration 类型的 JSON 文件配置主要参数;
- 关闭只读端口,在安全端口 10250 接收 https 请求,对请求进行认证和授权,拒绝匿名访问和非授权访问;
- 使用 kubeconfig 访问 apiserver 的安全端口;
kube-proxy:
- 使用 kubeconfig 访问 apiserver 的安全端口;
- 在 KubeProxyConfiguration 类型的 JSON 文件配置主要参数;
- 使用 ipvs 代理模式;
集群插件:
- DNS:使用功能、性能更好的 coredns;
- Dashboard:支持登录认证;
- Metric:heapster、metrics-server,使用 https 访问 kubelet 安全端口;
- Log:Elasticsearch、Fluend、Kibana;
- Registry 镜像库:docker-registry、harbor;
环境初始化
- 配置相关的yum源: https://mirrors.tuna.tsinghua.edu.cn/help/epel/
内核升级
# 查看当前的内核以及系统
[root@k8s-master-1 system]# uname -r
3.10.0-862.el7.x86_64
[root@k8s-master-1 system]# cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
# 开始升级
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
yum clean all && yum makecache
yum --disablerepo=\* --enablerepo=elrepo-kernel repolist
yum --disablerepo=\* --enablerepo=elrepo-kernel list kernel*
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt.x86_64
yum remove kernel-tools-libs.x86_64 kernel-tools.x86_64
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt-tools.x86_64
rpm -qa | grep kernel
# reboot 选择新的内核版本进行启动
# 升级完成后可以移除旧的内核
yum remove kernel-3.10.0-862.el7.x86_64
docker安装
- 推荐使用的版本19.03
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum makecache fast
yum install -y --setopt=obsoletes=0 docker-ce-19.03.6-3.el7
systemctl restart docker
systemctl enable docker
修改主机名
# 根据你的规划来命名
hostnamectl set-hostname 主机名(k8s-master-01)
关闭selinux
# 关闭 SELinux,否则后续 K8S 挂载目录时可能报错 Permission denied:
# setenforce 0
# grep SELINUX /etc/selinux/config
SELINUX=disabled
加载 IPVS 与 br_netfilter 内核模块
modprobe br_netfilter
modprobe ip_vs
关闭dnsmasq
#linux 系统开启了 dnsmasq 后(如 GUI 环境),将系统 DNS Server 设置为 127.0.0.1,这会导致 docker 容器无法解析域名,需要关闭它:
service dnsmasq stop
systemctl disable dnsmasq
关闭swap分区
swapoff -a
#防止开机自动挂载 swap 分区,可以注释 /etc/fstab 中相应的条目:
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
关闭防火墙
systemctl stop firewalld
systemctl disable firewalld
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat
iptables -P FORWARD ACCEPT
安装相关的需要的软件
yum install -y epel-release
yum install -y conntrack ipvsadm ipset jq sysstat curl iptables libseccomp lrzsz wget tree
本地hosts
[root@k8s-master-1 system]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.42.170 k8s-master-1 master01 etcd01.k8s.io k8s-master-1.k8s.io
192.168.42.171 k8s-master-2 master02 etcd02.k8s.io k8s-master-2.k8s.io
192.168.42.172 k8s-master-3 master03 etcd03.k8s.io k8s-master-3.k8s.io
192.168.42.173 k8s-node-1
192.168.42.174 k8s-node-2
时间同步
# 调整系统 TimeZone,其他同步方式也可以
timedatectl set-timezone Asia/Shanghai
# 更新系统时间
ntpdate cn.pool.ntp.org
# 将当前的 UTC 时间写入硬件时钟
timedatectl set-local-rtc 0
# 重启依赖于系统时间的服务
systemctl restart rsyslog
systemctl restart crond
相关的系统参数调优
- tcp_tw_recycle 和 Kubernetes 的 NAT 冲突,必须关闭 ,否则会导致服务不通;
- 关闭不使用的 IPV6 协议栈,防止触发 docker BUG;
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv6.conf.all.disable_ipv6 = 1 #禁用ipv6
net.ipv6.conf.default.disable_ipv6 = 1 #禁用ipv6
net.ipv6.conf.lo.disable_ipv6 = 1 #禁用ipv6
net.ipv4.neigh.default.gc_stale_time = 120 #决定检查过期多久邻居条目
net.ipv4.conf.all.rp_filter = 0 #关闭反向路由校验
net.ipv4.conf.default.rp_filter = 0 #关闭反向路由校验
net.ipv4.conf.default.arp_announce = 2 #始终使用与目标IP地址对应的最佳本地IP地址作为ARP请求的源IP地址
net.ipv4.conf.lo.arp_announce = 2 #始终使用与目标IP地址对应的最佳本地IP地址作为ARP请求的源IP地址
net.ipv4.conf.all.arp_announce = 2 #始终使用与目标IP地址对应的最佳本地IP地址作为ARP请求的源IP地址
net.ipv4.ip_forward = 1 #启用ip转发功能
net.ipv4.tcp_max_tw_buckets = 5000 #表示系统同时保持TIME_WAIT套接字的最大数量
net.ipv4.tcp_syncookies = 1 #表示开启SYN Cookies。当出现SYN等待队列溢出时,启用cookies来处理
net.ipv4.tcp_max_syn_backlog = 1024 #接受SYN同包的最大客户端数量
net.ipv4.tcp_synack_retries = 2 #活动TCP连接重传次数
net.bridge.bridge-nf-call-ip6tables = 1 #要求iptables对bridge的数据进行处理
net.bridge.bridge-nf-call-iptables = 1 #要求iptables对bridge的数据进行处理
net.bridge.bridge-nf-call-arptables = 1 #要求iptables对bridge的数据进行处理
net.netfilter.nf_conntrack_max = 2310720 #修改最大连接数
fs.inotify.max_user_watches=89100 #同一用户同时可以添加的watch数目
fs.may_detach_mounts = 1 #允许文件卸载
fs.file-max = 52706963 #系统级别的能够打开的文件句柄的数量
fs.nr_open = 52706963 #单个进程可分配的最大文件数
vm.overcommit_memory=1 #表示内核允许分配所有的物理内存,而不管当前的内存状态如何
vm.panic_on_oom=0 #内核将检查是否有足够的可用内存供应用进程使用
vm.swappiness = 0 #关注swap
net.ipv4.tcp_keepalive_time = 600 #修复ipvs模式下长连接timeout问题,小于900即可
net.ipv4.tcp_keepalive_intvl = 30 #探测没有确认时,重新发送探测的频度
net.ipv4.tcp_keepalive_probes = 10 #在认定连接失效之前,发送多少个TCP的keepalive探测包
vm.max_map_count=524288 #定义了一个进程能拥有的最多的内存区域
EOF
sysctl --system
# 立即生效
sysctl -p /etc/sysctl.d/k8s.conf
# 挂载需要使用到的内核系统
mount -t cgroup -o cpu,cpuacct none /sys/fs/cgroup/cpu,cpuacct
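可以用下面的命令快速确认内核模块与关键 sysctl 是否已经生效(仅为检查示例):
# 确认 ip_vs / br_netfilter 模块已加载
lsmod | grep -E 'ip_vs|br_netfilter'
# 确认关键内核参数已生效
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward vm.swappiness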
创建工作目录
# 所有master节点
mkdir -p /opt/k8s/{bin,cert,tag}
mkdir -p /etc/{kubernetes,etcd}
mkdir -p /var/lib/etcd
chown -R k8s /opt/k8s
chown -R k8s /etc/kubernetes
chown -R k8s /var/lib/etcd
# 主master:
mkdir -p /etc/etcd/cert
chown -R k8s /etc/etcd/cert
mkdir -p /var/lib/etcd && chown -R k8s /var/lib/etcd
机器初始化
- 初始化工具: yum install -y net-tools telnet tree nmap sysstat lrzsz dos2unix bind-utils
dns配置
- 指定DNS安装: yum -y install bind
主配置文件修改参数
#监听的53写当前机器的端口号即可
listen-on port 53 { IP; };
#哪一些客户端能检查DNS解析(默认是localhost,any就是所有)
allow-query { any; };
#指向上一级的DNS地址
forwarders { ip; };
#节省资源
dnssec-enable no;
dnssec-validation no;
#检查是否有报错
named-checkconf
配置区域文件(主机域,业务域)
- 编辑zone: vim /etc/named.rfc1912.zones
zone "host.com" IN {
type master;
file "host.com.zone";
allow-update { ip; };
};
zone "k8s.com" IN {
type master;
file "k8s.com.zone";
allow-update { ip; };
};
配置区域数据文件
- vim /var/named/host.com.zone
$ORIGIN host.com.
$TTL 600 ; 10 minutes
@ IN SOA dns.host.com. dnsadmin.host.com. (
2020051401 ; serial
10800 ; refresh (3 hours)
900 ; retry (15 minutes)
604800 ; expire (1 week)
86400 ; minimum (1 day)
)
NS dns.host.com.
$TTL 60 ; 1 minute
dns A 172.16.128.60
master01 A 172.16.128.60
master02 A 172.16.128.61
master03 A 172.16.128.62
node01 A 172.16.128.63
node02 A 172.16.128.64
node03 A 172.16.128.65
node04 A 172.16.128.66
node05 A 172.16.128.67
node06 A 172.16.128.68
ceph01 A 172.16.128.69
ceph02 A 172.16.128.70
ceph03 A 172.16.128.71
- vim /var/named/k8s.com.zone (其中serial需要改成自己当天的时间,底部的主机名和ip也修改成自己的)
- 修改后: systemctl restart named (查看是否有报错,多数报错都是自己配置zone文件不对)
$ORIGIN k8s.com.
$TTL 600 ; 10 minutes
@ IN SOA dns.k8s.com. dnsadmin.k8s.com. (
2020051401 ; serial
10800 ; refresh (3 hours)
900 ; retry (15 minutes)
604800 ; expire (1 week)
86400 ; minimum (1 day)
)
NS dns.k8s.com.
$TTL 60 ; 1 minute
dns A 172.16.128.60
检查域名解析是否生效
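可以在任意节点用 dig 指向自建 DNS 做一次解析测试(检查示例,域名按自己 zone 文件里实际配置的记录调整):
dig -t A master01.host.com @172.16.128.60 +short
dig -t A dns.k8s.com @172.16.128.60 +short
# 能返回对应的 A 记录 IP 即表示解析生效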
一键批量替换
- 这个是我的host地址
#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.128.60 zhc-k8s-master-test-01 master01 k8s-m1 k8s-master01
172.16.128.61 zhc-k8s-master-test-02 master02 k8s-m2 k8s-master02
172.16.128.62 zhc-k8s-master-test-03 master03 k8s-m3 k8s-master03
172.16.128.63 zhc-k8s-master-test-04 node01 k8s-n1
172.16.128.64 zhc-k8s-master-test-05 node02 k8s-n2
172.16.128.65 zhc-k8s-master-test-06 node03 k8s-n3
172.16.128.66 zhc-k8s-node-test-01 node04 k8s-n4
172.16.128.67 zhc-k8s-node-test-02 node05 k8s-n5
172.16.128.68 zhc-k8s-node-test-03 node06 k8s-n6
172.16.128.69 zhc-k8s-ceph-test-01 ceph01 k8s-rook-ceph01
172.16.128.70 zhc-k8s-ceph-test-02 ceph02 k8s-rook-ceph02
172.16.128.71 zhc-k8s-ceph-test-03 ceph03 k8s-rook-ceph03
- 批量把各节点的 DNS 指向自建的 bind:
for i in `tail -12 /etc/hosts | awk '{print $3}'`;do ssh $i "echo 'nameserver 172.16.128.60' > /etc/resolv.conf";done
CA认证初始化
方式一 Cfssl
自签需要准备的是根证书(权威机构),先要创建CA证书的请求文件
签发证书工具准备
mkdir -p /root/ca
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 -O /usr/bin/cfssl
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 -O /usr/bin/cfssl-json
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64 -O /usr/bin/cfssl-certinfo
chmod +x /usr/bin/cfssl*
创建根证书
- CA 证书是集群所有节点共享的,_只需要创建一个 CA 证书_,后续创建的所有证书都由它签名。
- CA 证书的配置文件,用于配置根证书的使用场景 (profile) 和具体参数 (usage,过期时间、服务端认证、客户端认证、加密等),后续在签名其它证书时需要指定特定场景。
cat > ca-config.json <<EOF
{
"signing": {
"default": {
"expiry": "175200h"
},
"profiles": {
"server": {
"expiry": "175200h",
"usages": [
"signing",
"key encipherment",
"server auth"
]
},
"client": {
"expiry": "175200h",
"usages": [
"signing",
"key encipherment",
"client auth"
]
},
"peer": {
"expiry": "175200h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
},
"kubernetes": {
"expiry": "175200h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF
创建证书签名请求文件
cat > ca-csr.json <<EOF
{
"CN": "kubernetes-ca",
"hosts": [
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "guangzhou",
"L": "guangzhou",
"O": "k8s",
"OU": "ops"
}
],
"ca": {
"expiry": "175200h"
}
}
EOF
生成证书
- cfssl gencert -initca ca-csr.json | cfssl-json -bare ca -
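生成后可以用 cfssl-certinfo 确认根证书的 CN、有效期等信息(检查示例):
# 生成出来的文件: ca.pem ca-key.pem ca.csr
ls ca*
cfssl-certinfo -cert ca.pem | head -n 20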
方式二 Openssl
生成CA证书
- openssl genrsa -out ca.key 2048
- openssl req -x509 -new -nodes -key ca.key -subj "/CN=k8s" -days 36500 -out ca.crt
kubernetes证书和私钥
openssl genrsa -out server.key 2048
cat >csr.conf <<EOF
[ req ]
default_bits = 2048
prompt = no
default_md = sha256
req_extensions = req_ext
distinguished_name = dn
[ dn ]
C = <country>
ST = <state>
L = <city>
O = <organization>
OU = <organization unit>
CN = k8s
[ req_ext ]
subjectAltName = @alt_names
[ alt_names ]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster
DNS.5 = kubernetes.default.svc.cluster.local
IP.1 = 172.26.6.1
IP.2 = 10.254.0.1
[ v3_ext ]
authorityKeyIdentifier=keyid,issuer:always
basicConstraints=CA:FALSE
keyUsage=keyEncipherment,dataEncipherment
extendedKeyUsage=serverAuth,clientAuth
subjectAltName=@alt_names
EOF
openssl req -new -key server.key -out server.csr -config csr.conf
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt -days 10000 -extensions v3_ext -extfile csr.conf
openssl x509 -noout -text -in ./server.crt
脚本生成证书懒人模式
cd /opt/k8s/cert/
# github上提供的证书自动生成脚本(基于openssl)
git clone https://github.com/iKubernetes/k8s-certs-generator.git
cd k8s-certs-generator
# 先生成etcd证书,执行时需要填写域名,详情查看/etc/hosts
bash gencerts.sh etcd
# 查看证书的类型
ls
apiserver-etcd-client.crt
apiserver-etcd-client.key
ca.crt
ca.key
client.crt
client.key
peer.crt
peer.key
server.crt
server.key
创建所有的证书
# 脚本设置证书的参数
[root@k8s-master-1 k8s-certs-generator]# bash gencerts.sh k8s
Enter Domain Name [ilinux.io]: k8s.io # 验证的域名
Enter Kubernetes Cluster Name [kubernetes]: # 证书名称
Enter the IP Address in default namespace
of the Kubernetes API Server[10.96.0.1]: # 一般默认都是这个网段可以做修改
Enter Master servers name[master01 master02 master03]: k8s-master-1 k8s-master-2 k8s-master-3 #根据自己的主机名进行修改
# 查看生产出来的所有的CA证书
[root@k8s-master-1 k8s-certs-generator]# tree kubernetes/
kubernetes/
├── CA
│ ├── ca.crt
│ └── ca.key
├── front-proxy
│ ├── front-proxy-ca.crt
│ ├── front-proxy-ca.key
│ ├── front-proxy-client.crt
│ └── front-proxy-client.key
├── ingress
│ ├── ingress-server.crt
│ ├── ingress-server.key
│ └── patches
│ └── ingress-tls.patch
├── k8s-master-1
│ ├── auth
│ │ ├── admin.conf
│ │ ├── controller-manager.conf
│ │ └── scheduler.conf
│ ├── pki
│ │ ├── apiserver.crt
│ │ ├── apiserver-etcd-client.crt
│ │ ├── apiserver-etcd-client.key
│ │ ├── apiserver.key
│ │ ├── apiserver-kubelet-client.crt
│ │ ├── apiserver-kubelet-client.key
│ │ ├── ca.crt
│ │ ├── ca.key
│ │ ├── front-proxy-ca.crt
│ │ ├── front-proxy-ca.key
│ │ ├── front-proxy-client.crt
│ │ ├── front-proxy-client.key
│ │ ├── kube-controller-manager.crt
│ │ ├── kube-controller-manager.key
│ │ ├── kube-scheduler.crt
│ │ ├── kube-scheduler.key
│ │ ├── sa.key
│ │ └── sa.pub
│ └── token.csv
├── k8s-master-2
│ ├── auth
│ │ ├── admin.conf
│ │ ├── controller-manager.conf
│ │ └── scheduler.conf
│ ├── pki
│ │ ├── apiserver.crt
│ │ ├── apiserver-etcd-client.crt
│ │ ├── apiserver-etcd-client.key
│ │ ├── apiserver.key
│ │ ├── apiserver-kubelet-client.crt
│ │ ├── apiserver-kubelet-client.key
│ │ ├── ca.crt
│ │ ├── ca.key
│ │ ├── front-proxy-ca.crt
│ │ ├── front-proxy-ca.key
│ │ ├── front-proxy-client.crt
│ │ ├── front-proxy-client.key
│ │ ├── kube-controller-manager.crt
│ │ ├── kube-controller-manager.key
│ │ ├── kube-scheduler.crt
│ │ ├── kube-scheduler.key
│ │ ├── sa.key
│ │ └── sa.pub
│ └── token.csv
├── k8s-master-3
│ ├── auth
│ │ ├── admin.conf
│ │ ├── controller-manager.conf
│ │ └── scheduler.conf
│ ├── pki
│ │ ├── apiserver.crt
│ │ ├── apiserver-etcd-client.crt
│ │ ├── apiserver-etcd-client.key
│ │ ├── apiserver.key
│ │ ├── apiserver-kubelet-client.crt
│ │ ├── apiserver-kubelet-client.key
│ │ ├── ca.crt
│ │ ├── ca.key
│ │ ├── front-proxy-ca.crt # 聚合层在整合第三方资源时候需要使用上的认证
│ │ ├── front-proxy-ca.key
│ │ ├── front-proxy-client.crt
│ │ ├── front-proxy-client.key
│ │ ├── kube-controller-manager.crt
│ │ ├── kube-controller-manager.key
│ │ ├── kube-scheduler.crt
│ │ ├── kube-scheduler.key
│ │ ├── sa.key
│ │ └── sa.pub
│ └── token.csv # 令牌认证时候所属的用户名和他所在的组和ID
└── kubelet
├── auth
│ ├── bootstrap.conf
│ └── kube-proxy.conf
└── pki
├── ca.crt
├── kube-proxy.crt
└── kube-proxy.key
16 directories, 80 files
# 复制到相应的节点上去
cp -r k8s-master-1/* /etc/kubernetes/
scp -rp k8s-master-3/* master03:/etc/kubernetes/
scp -rp k8s-master-2/* master02:/etc/kubernetes/
# 分别到master02和master03上去执行
mv /etc/kubernetes/pki /etc/kubernetes/cert
etcd高可用集群
配置结构体
/opt/etcd/
├── certs
│ ├── ca.pem
│ ├── etcd-peer-key.pem
│ └── etcd-peer.pem
└── sh
└── etcd-start.sh
1.创建请求证书
下载etcd文件
- https://github.com/etcd-io/etcd/releases
解压到
- ~/binary-install/etcd/
创建软连接并授权
- ln -s /root/binary-install/etcd/etcdctl /usr/bin/
- ln -s /root/binary-install/etcd/etcd /usr/bin/
- ln -s /root/binary-install/bin/* /usr/bin/
- chmod +x /root/binary-install/etcd/*
创建对等证书生成证书签名请求(csr)的JSON配置文件
- hosts 字段指定授权使用该证书的 etcd 节点 IP 或域名列表,这里把可能运行 etcd 的 6 个节点都列在其中,用于以后扩展etcd集群
cat > etcd-peer-csr.json <<EOF
{
"CN": "etcd-peer",
"hosts": [
"127.0.0.1",
"172.16.128.60",
"172.16.128.61",
"172.16.128.62",
"172.16.128.63",
"172.16.128.64",
"172.16.128.65"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "guangzhou",
"L": "guangzhou",
"O": "k8s",
"OU": "ops"
}
]
}
EOF
- 签证 : cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer etcd-peer-csr.json | cfssl-json -bare etcd-peer
签发client端证书
cat > client-csr.json <<EOF
{
"CN": "k8s-etc-client",
"hosts": [
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "guangzhou",
"L": "guangzhou",
"O": "k8s",
"OU": "ops"
}
]
}
EOF
#执行client证书的签发
[root@k8s-m1 certs]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client-csr.json | cfssl-json -bare client
#得到的证书
client.csr
client-csr.json
client-key.pem
client.pem
善后事宜
- 创建工作目录: mkdir -p /data/www/{etcd,logs} /opt/etcd/{certs,sh} /data/www/logs/{kube-apiserver,etcd-server}
- 拷贝到etcd工作目录: cp -r ca.pem etcd-peer-key.pem etcd-peer.pem /opt/etcd/certs/
- 拷贝到所有的k8s-master节点: scp -r ca.pem etcd-peer-key.pem etcd-peer.pem master-x:/opt/etcd/certs/
2.创建启动模板
- 创建位置 : /opt/etcd/sh
cat > /opt/etcd/sh/etcd-start.sh <<EOF
#!/bin/sh
/usr/bin/etcd --name etcd-master01 \
--data-dir /data/www/etcd/ \
--listen-peer-urls https://172.16.128.60:2380 \
--listen-client-urls https://172.16.128.60:2379,http://127.0.0.1:2379 \
--quota-backend-bytes 8000000000 \
--initial-advertise-peer-urls https://172.16.128.60:2380 \
--advertise-client-urls https://172.16.128.60:2379,http://127.0.0.1:2379 \
--initial-cluster etcd-master01=https://172.16.128.60:2380,etcd-master02=https://172.16.128.61:2380,etcd-master03=https://172.16.128.62:2380 \
--ca-file /opt/etcd/certs/ca.pem \
--cert-file /opt/etcd/certs/etcd-peer.pem \
--key-file /opt/etcd/certs/etcd-peer-key.pem \
--client-cert-auth \
--trusted-ca-file /opt/etcd/certs/ca.pem \
--peer-ca-file /opt/etcd/certs/ca.pem \
--peer-cert-file /opt/etcd/certs/etcd-peer.pem \
--peer-key-file /opt/etcd/certs/etcd-peer-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file /opt/etcd/certs/ca.pem \
--log-output stdout
EOF
chmod +x /opt/etcd/sh/etcd-start.sh
参数详细解释
User:指定以 k8s 账户运行;
WorkingDirectory、--data-dir:指定工作目录和数据目录为 /var/lib/etcd,需在启动服务前创建这个目录;
--name:指定节点名称,当 --initial-cluster-state 值为 new 时,--name 的参数值必须位于 --initial-cluster 列表中;
--cert-file、--key-file:etcd server 与 client 通信时使用的证书和私钥;
--trusted-ca-file:签名 client 证书的 CA 证书,用于验证 client 证书;
--peer-cert-file、--peer-key-file:etcd 与 peer 通信使用的证书和私钥;
--peer-trusted-ca-file:签名 peer 证书的 CA 证书,用于验证 peer 证书;
3.创建supervisor配置文件
[program:etcd-server]
command=/opt/etcd/sh/etcd-start.sh
numprocs=1
directory=/opt/etcd
autostart=true
autorestart=true
startsecs=22
startretries=3
exitcodes=0,2
stopsignal=QUIT
stopwaitsecs=10
user=root
redirect_stderr=false
stdout_logfile=/data/www/logs/etcd-server/etcd.log
stdout_logfile_maxbytes=64MB
stdout_logfile_backups=4
stdout_capture_maxbytes=1MB
stdout_events_enabled=false
stderr_logfile=/data/www/logs/etcd-server/etcd-err.log
stderr_logfile_maxbytes=64MB
stderr_logfile_backups=4
stderr_capture_maxbytes=1MB
stderr_events_enabled=false
3个节点的supervisor启动
- supervisorctl update && supervisorctl start etcd-server
- 如果有报错去 /data/www/logs/etcd-server/etcd-err.log 查看具体报错原因
健康状态
3.启动并且查看集群状态
#单个etcd健康检查
ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 endpoint health
#多个etcd健康状态检查:
# cluster-health 是 etcd v2 API 的子命令,这里使用 v2 的证书参数;v3 对应的命令是 endpoint health
[root@k8s-m1 certs]# ETCDCTL_API=2 etcdctl \
--endpoints="https://172.16.128.60:2379,https://172.16.128.61:2379,https://172.16.128.62:2379" \
--ca-file=ca.pem \
--cert-file=client.pem \
--key-file=client-key.pem cluster-health
member 22ff37890acab3a0 is healthy: got healthy result from http://127.0.0.1:2379
member 7d048f282a9f3287 is healthy: got healthy result from http://127.0.0.1:2379
member b95acb2a296fae3c is healthy: got healthy result from http://127.0.0.1:2379
cluster is healthy
#查看集群的主节点
etcdctl member list
#查看集群的所有keys
ETCDCTL_API=3 etcdctl \
--endpoints="https://172.16.128.60:2379,https://172.16.128.61:2379,https://172.16.128.62:2379" \
--cacert=ca.pem \
--cert=client.pem \
--key=client-key.pem get / --prefix --keys-only
k8s系统组件部署
创建环境
# 创建运行者的用户及其工作目录
useradd -r kube
mkdir /var/run/kubernetes
chown kube:kube /var/run/kubernetes
1.Apiserver
kubelet 首次启动时向 kube-apiserver 发送 TLS Bootstrapping 请求,kube-apiserver 验证 kubelet 请求中的 token 是否与它配置的 token 一致,如果一致则自动为 kubelet 生成证书和秘钥
完整配置图
/opt/kube-apiserver/
├── certs
│ ├── apiserver-key.pem
│ ├── apiserver.pem
│ ├── ca-key.pem
│ ├── ca.pem
│ ├── client-key.pem
│ └── client.pem
├── conf
│ └── add-audit.yaml
└── sh
└── kube-apiserver.sh
metrics-server证书签发
- 注意: "CN": "system:metrics-server" 一定是这个,因为后面授权时用到这个名称,否则会报禁止匿名访问
cat > metrics-server-csr.json <<EOF
{
"CN": "system:metrics-server",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "guangzhou",
"L": "guangzhou",
"O": "k8s",
"OU": "system"
}
]
}
EOF
#签证命令
[root@k8s-m1 certs]# cfssl gencert \
-ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes \
metrics-server-csr.json | cfssl-json -bare metrics-server
#拷贝到工作目录
cp -r metrics-server.pem metrics-server-key.pem /opt/kube-apiserver/certs
apiserver签发证书
- 172.16.128.240 为 keepalived 的 VIP 地址;将未来会扩展的 apiserver 节点 IP 最好都写上,VIP 最好做成域名形式;同时要包含 service 网段的第一个 IP(kubernetes service 的 ClusterIP,即 10.168.0.1)以及预留给 coredns 的 IP
cat > apiserver-csr.json <<EOF
{
"CN": "apiserver",
"hosts": [
"127.0.0.1",
"172.16.128.240",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local",
"172.16.128.60",
"172.16.128.61",
"172.16.128.62",
"172.16.128.63",
"172.16.128.64",
"10.168.0.2"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "guangzhou",
"L": "guangzhou",
"O": "k8s",
"OU": "ops"
}
]
}
EOF
手动方式
#Openssl方式
openssl ecparam -name secp521r1 -genkey -noout -out sa.key
openssl ec -in sa.key -outform PEM -pubout -out sa.pub
chmod 0600 sa.key
# 需要定制化可以手工做,不需要则用自动生成的
#cfssl方式(主要方式)
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server apiserver-csr.json | cfssl-json -bare apiserver
#生成的文件
apiserver.csr
apiserver-csr.json
apiserver-key.pem
apiserver.pem
#拷贝到工作目录
cp -r ca.pem client.pem client-key.pem apiserver-key.pem apiserver.pem ca-key.pem /opt/kube-apiserver/certs/
#再拷贝到其他master节点
bootstrap token 有两种方式:一种是使用固定的 token 认证;另一种是利用 kubeadm 动态创建 24H 过期的 token,但需要给每一个 kubelet 签发不同的 token,稍微麻烦些。本次采用固定 token 的方式。
Create token.csv
# 创建bootstrap Token证书
BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
cat > token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF
[root@k8s-m1 tmp]# cat token.csv
5c2acbd76fc3883824a2b4c5641f539e,kubelet-bootstrap,10001,"system:kubelet-bootstrap"
生成apiserver启动文件
- 注意点 : 证书位置确保正确,etcd确保正确
cat > /opt/kube-apiserver/sh/kube-apiserver.sh <<EOF
#!/bin/bash
/usr/bin/kube-apiserver \
--apiserver-count 3 \
--audit-log-path /data/www/logs/kube-apiserver/audit-log \
--audit-policy-file /opt/kube-apiserver/conf/add-audit.yaml \
--authorization-mode RBAC \
--client-ca-file /opt/kube-apiserver/certs/ca.pem \
--requestheader-client-ca-file /opt/kube-apiserver/certs/ca.pem \
--enable-admission-plugins NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota \
--enable-bootstrap-token-auth true \
--etcd-cafile /opt/kube-apiserver/certs/ca.pem \
--etcd-certfile /opt/kube-apiserver/certs/client.pem \
--etcd-keyfile /opt/kube-apiserver/certs/client-key.pem \
--etcd-servers https://172.16.128.60:2379,https://172.16.128.61:2379,https://172.16.128.62:2379 \
--service-account-key-file /opt/kube-apiserver/certs/ca-key.pem \
--service-cluster-ip-range 10.168.0.0/16 \
--service-node-port-range 3000-29999 \
--target-ram-mb=1024 \
--kubelet-client-certificate /opt/kube-apiserver/certs/client.pem \
--kubelet-client-key /opt/kube-apiserver/certs/client-key.pem \
--logtostderr=true \
--log-dir /data/www/logs/kube-apiserver \
--tls-cert-file /opt/kube-apiserver/certs/apiserver.pem \
--tls-private-key-file /opt/kube-apiserver/certs/apiserver-key.pem \
--encryption-provider-config /opt/kube-apiserver/conf/encryption-config.yaml \
--proxy-client-cert-file /opt/kube-apiserver/certs/metrics-server.pem \
--proxy-client-key-file /opt/kube-apiserver/certs/metrics-server-key.pem \
--requestheader-extra-headers-prefix X-Remote-Extra- \
--requestheader-group-headers X-Remote-Group \
--requestheader-username-headers X-Remote-User \
--runtime-config api/all=true \
--enable-aggregator-routing true \
--token-auth-file /opt/kube-apiserver/conf/token.csv \
--requestheader-allowed-names aggregator \
--allow-privileged true \
--v 4
EOF
chmod +x /opt/kube-apiserver/sh/kube-apiserver.sh
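启动参数里引用的 /opt/kube-apiserver/conf/encryption-config.yaml 上文没有单独给出,下面是一个最小示例(假设只用 aescbc 加密 secrets,密钥随机生成),仅供参考:
# 生成 32 字节随机密钥并写入加密配置(示例)
ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)
cat > /opt/kube-apiserver/conf/encryption-config.yaml <<EOF
kind: EncryptionConfiguration
apiVersion: apiserver.config.k8s.io/v1
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: ${ENCRYPTION_KEY}
      - identity: {}
EOF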
参数关键项目解释
--encryption-provider-config:启用 secret 落盘加密特性(对应上面的 encryption-config.yaml);
--authorization-mode=Node,RBAC: 开启 Node 和 RBAC 授权模式,拒绝未授权的请求;
--enable-admission-plugins:启用 ServiceAccount 和 NodeRestriction;
--service-account-key-file:签名 ServiceAccount Token 的公钥文件,kube-controller-manager 的 --service-account-private-key-file 指定私钥文件,两者配对使用;
--tls-*-file:指定 apiserver 使用的证书、私钥和 CA 文件。--client-ca-file 用于验证 client (kube-controller-manager、kube-scheduler、kubelet、kube-proxy 等)请求所带的证书;
--kubelet-client-certificate、--kubelet-client-key:如果指定,则使用 https 访问 kubelet APIs;需要为证书对应的用户(上面 kubernetes*.pem 证书的用户为 kubernetes) 用户定义 RBAC 规则,否则访问 kubelet API 时提示未授权;
--bind-address: 不能为 127.0.0.1,否则外界不能访问它的安全端口 6443;
--insecure-port=0:关闭监听非安全端口(8080);
--service-cluster-ip-range: 指定 Service Cluster IP 地址段;
--service-node-port-range: 指定 NodePort 的端口范围;
--runtime-config=api/all=true: 启用所有版本的 APIs,如 autoscaling/v2alpha1;
--enable-bootstrap-token-auth:启用 kubelet bootstrap 的 token 认证;
--apiserver-count=3:指定集群运行模式,多台 kube-apiserver 会通过 leader 选举产生一个工作节点,其它节点处于阻塞状态;
User=k8s:使用 k8s 账户运行;
配置add-audit.yaml
apiVersion: audit.k8s.io/v1beta1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]
  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]
  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]
  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"
  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]
  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]
  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.
  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"
启动apiserver
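本文其余组件都交给 supervisor 托管,apiserver 这里没有贴出对应的 ini;下面是参照 etcd 写法补的一个示例(文件名、日志路径为假设),启动后再做下面的端口与 etcd 检查:
cat > /etc/supervisord.d/kube-apiserver.ini <<EOF
[program:kube-apiserver]
command=/opt/kube-apiserver/sh/kube-apiserver.sh
numprocs=1
directory=/opt/kube-apiserver
autostart=true
autorestart=true
startsecs=22
startretries=3
user=root
redirect_stderr=false
stdout_logfile=/data/www/logs/kube-apiserver/apiserver-stdout.log
stdout_logfile_maxbytes=64MB
stdout_logfile_backups=4
stderr_logfile=/data/www/logs/kube-apiserver/apiserver-stderr.log
stderr_logfile_maxbytes=64MB
stderr_logfile_backups=4
EOF
supervisorctl update && supervisorctl start kube-apiserver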
netstat -lnput|grep 6443 #查看端口是否已经监听
# 进行etcd健康检查
ETCDCTL_API=3 etcdctl \
--endpoints=https://etcd01.k8s.io:2379 \
--cacert=/opt/k8s/cert/ca.crt \
--cert=/opt/k8s/cert/client.crt \
--key=/opt/k8s/cert/client.key \
get /registry/ --prefix --keys-only
配置kubectl
[root@k8s-master-1 auth]# mkdir ~/.kube
[root@k8s-master-1 auth]# cp /etc/kubernetes/auth/admin.conf ~/.kube/config
[root@k8s-master-1 auth]# kubectl config view --kubeconfig=/etc/kubernetes/auth/admin.conf
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: DATA+OMITTED
server: https://k8s-master-1.k8s.io:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: k8s-admin # 当前集群的用户
name: k8s-admin@kubernetes
current-context: k8s-admin@kubernetes
kind: Config
preferences: {}
users:
- name: k8s-admin
user:
client-certificate-data: REDACTED
client-key-data: REDACTED
创建ClusterRoleBinding授予用户组操作的权限
# 将之前创建的共享Token对应的bootstrap用户(或用户组)绑定到内置的system:node-bootstrapper这个ClusterRole上
# 如果不设置会导致Node节点无法Join到master中
# 可以绑定用户也可以绑定组二选一
[root@k8s-master-1 kubernetes]# cat token.csv
e96666.837357b060f2573b,"system:bootstrapper",10001,"system:bootstrappers"
[root@k8s-master-1 kubernetes]# kubectl create clusterrolebinding system:bootstrapper --user=system:bootstrapper --clusterrole=system:node-bootstrapper
clusterrolebinding.rbac.authorization.k8s.io/system:bootstrapper created
检查当前的集群信息
[root@k8s-master-1 kubernetes]# kubectl get svc kubernetes
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 48m
[root@k8s-master-1 kubernetes]# kubectl cluster-info
Kubernetes master is running at https://k8s-master-1.k8s.io:6443
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
[root@k8s-master-1 kubernetes]# kubectl get all --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 49m
[root@k8s-master-1 kubernetes]# kubectl get componentstatuses
NAME STATUS MESSAGE ERROR
controller-manager Unhealthy Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused # 这里还没有安装
scheduler Unhealthy Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused # 这里还没有安装
etcd-0 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
# 注意事项
如果执行 kubectl 命令式时输出如下错误信息,则说明使用的 ~/.kube/config 文件不对,请切换到正确的账户后再执行该命令: The connection to the server localhost:8080 was refused - did you specify the right host or port? 执行 kubectl get componentstatuses 命令时,apiserver 默认向 127.0.0.1 发送请求。当 controller-manager、scheduler 以集群模式运行时,有可能和 kube-apiserver 不在一台机器上,这时 controller-manager 或 scheduler 的状态为 Unhealthy,但实际上它们工作正常。
授予kubernetes证书访问kubelet-API的权限
# 在执行 kubectl exec、run、logs 等命令时,apiserver 会转发到 kubelet。这里定义 RBAC 规则,授权 apiserver 调用 kubelet API
kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes
2.Controller-Manager
- 与 kube-apiserver 的安全端口通信时;
- 在安全端口(https,10257) 输出 prometheus 格式的 metrics;
配置文件
- mkdir -p /opt/kube-controller-manager/{certs,sh} /data/www/logs/kube-controller-manager/
- cd /opt/kube-controller-manager/sh
- 确保对应的证书在指定的目录下
cat > controller-run.sh <<EOF
#!/bin/bash
/usr/bin/kube-controller-manager \
--cluster-cidr 172.7.0.0/16 \
--leader-elect true \
--log-dir /data/www/logs/kube-controller-manager \
--master http://127.0.0.1:8080 \
--allocate-node-cidrs true \
--service-account-private-key-file /opt/kube-controller-manager/certs/ca-key.pem \
--service-cluster-ip-range 10.168.0.0/16 \
--root-ca-file /opt/kube-controller-manager/certs/ca.pem \
--cluster-signing-cert-file /opt/kube-controller-manager/certs/ca.pem \
--cluster-signing-key-file /opt/kube-controller-manager/certs/ca-key.pem \
--feature-gates RotateKubeletServerCertificate=true \
--experimental-cluster-signing-duration 86700h0m0s \
--node-monitor-grace-period 40s \
--node-monitor-period 5s \
--controllers=*,bootstrapsigner,tokencleaner \
--pod-eviction-timeout 5m0s \
--v 4
EOF
参数详解
--port=0:关闭监听 http /metrics 的请求,同时 --address 参数无效,--bind-address 参数有效;
--secure-port=10257、--bind-address=0.0.0.0: 在所有网络接口监听 10257 端口的 https /metrics 请求;
--kubeconfig:指定 kubeconfig 文件路径,kube-controller-manager 使用它连接和验证 kube-apiserver;
--cluster-signing-*-file:签名 TLS Bootstrap 创建的证书;
--experimental-cluster-signing-duration:指定 TLS Bootstrap 证书的有效期;
--root-ca-file:放置到容器 ServiceAccount 中的 CA 证书,用来对 kube-apiserver 的证书进行校验;
--service-account-private-key-file:签名 ServiceAccount 中 Token 的私钥文件,必须和 kube-apiserver 的 --service-account-key-file 指定的公钥文件配对使用;
--service-cluster-ip-range :指定 Service Cluster IP 网段,必须和 kube-apiserver 中的同名参数一致;
--leader-elect=true:集群运行模式,启用选举功能;被选为 leader 的节点负责处理工作,其它节点为阻塞状态;
--feature-gates=RotateKubeletServerCertificate=true:开启 kublet server 证书的自动更新特性;
--controllers=*,bootstrapsigner,tokencleaner:启用的控制器列表,tokencleaner 用于自动清理过期的 Bootstrap token;
--horizontal-pod-autoscaler-*:custom metrics 相关参数,支持 autoscaling/v2alpha1;
--tls-cert-file、--tls-private-key-file:使用 https 输出 metrics 时使用的 Server 证书和秘钥;
--use-service-account-credentials=true:各 controller 使用自己的 ServiceAccount(而不是共用凭据)访问 apiserver;
设置supervisor启动参数
cat > /etc/supervisord.d/kube-controller-manager.ini <<EOF
[program:kube-controller-manager]
command=/opt/kube-controller-manager/sh/controller-run.sh
numprocs=1
directory=/opt/kube-controller-manager/sh
autostart=true
autorestart=true
startsecs=22
startretries=3
exitcodes=0,2
stopsignal=QUIT
stopwaitsecs=10
user=root
redirect_stderr=false
stdout_logfile=/data/www/logs/kube-controller-manager/controll-stdout.log
stdout_logfile_maxbytes=64MB
stdout_logfile_backups=4
stdout_capture_maxbytes=1MB
stdout_events_enabled=false
stderr_logfile=/data/www/logs/kube-controller-manager/controll-stderr.log
stderr_logfile_maxbytes=64MB
stderr_logfile_backups=4
stderr_capture_maxbytes=1MB
stderr_events_enabled=false
EOF
启动服务
supervisorctl update
supervisorctl start kube-controller-manager
查看集群controller-manager leader info
[root@k8s-master-1 etcd]# kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
annotations:
control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"k8s-master-02_f114f5b9-6ee7-11e9-8b96-000c29299465","leaseDurationSeconds":15,"acquireTime":"2019-05-05T03:45:09Z","renewTime":"2019-05-05T03:45:38Z","leaderTransitions":11}'
creationTimestamp: "2019-05-05T03:36:38Z"
name: kube-controller-manager
namespace: kube-system
resourceVersion: "6536"
selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
uid: fd25486d-6ee6-11e9-aaef-000c29ccaf3e
[root@k8s-master-1 etcd]# kubectl get componentstatuses # 检查集群状态信息
3.scheduler
- mkdir -p /opt/kube-scheduler/{sh,certs} /data/www/logs/kube-scheduler/
配置文件信息
cat > /opt/kube-scheduler/sh/scheduler-run.sh <<EOF
#!/bin/bash
/usr/bin/kube-scheduler \
--master http://127.0.0.1:8080 \
--leader-elect true \
--alsologtostderr true \
--log-dir /data/www/logs/kube-scheduler \
--logtostderr false \
--v=2
EOF
参数详解
--address:在 127.0.0.1:10251 端口接收 http /metrics 请求;kube-scheduler 目前还不支持接收 https 请求;
--kubeconfig:指定 kubeconfig 文件路径,kube-scheduler 使用它连接和验证 kube-apiserver;
--leader-elect=true:集群运行模式,启用选举功能;被选为 leader 的节点负责处理工作,其它节点为阻塞状态;
配置supervisor文件
[program:kube-scheduler]
command=sh /opt/kube-scheduler/sh/scheduler-run.sh
numprocs=1
directory=/opt/kube-scheduler
autostart=true
autorestart=true
startsecs=22
startretries=3
exitcodes=0,2
stopsignal=QUIT
stopwaitsecs=10
user=root
redirect_stderr=false
stdout_logfile=/data/www/logs/kube-scheduler/scheduler-stdout.log
stdout_logfile_maxbytes=64MB
stdout_logfile_backups=4
stdout_capture_maxbytes=1MB
stdout_events_enabled=false
stderr_logfile=/data/www/logs/kube-scheduler/scheduler-stderr.log
stderr_logfile_maxbytes=64MB
stderr_logfile_backups=4
stderr_capture_maxbytes=1MB
stderr_events_enabled=false
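把上面的内容保存为 /etc/supervisord.d/kube-scheduler.ini(文件名为假设)后更新并启动(示例):
supervisorctl update
supervisorctl start kube-scheduler
supervisorctl status kube-scheduler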
查看当前的Leader
[root@k8s-master-1 system]# kubectl get endpoints kube-scheduler --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
annotations:
control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"k8s-master-1_cac7d876-6ee9-11e9-a46c-000c29ccaf3e","leaseDurationSeconds":15,"acquireTime":"2019-05-05T03:56:58Z","renewTime":"2019-05-05T03:58:06Z","leaderTransitions":1}'
creationTimestamp: "2019-05-05T03:55:01Z"
name: kube-scheduler
namespace: kube-system
resourceVersion: "15410"
selfLink: /api/v1/namespaces/kube-system/endpoints/kube-scheduler
uid: 8ebfa2c2-6ee9-11e9-aaef-000c29ccaf3e
查看当前的组件信息
高可用校验
# 做到这个时候我们手动关闭k8s-master-01节点看看etcd/control-manage/scheduler是否会故障转移
# 首先先将配置文件从master01传递到其他的master节点上去
for i in master02 master03;do scp -r ~/.kube $i:~/;done
# 停止master01机器
[root@k8s-master-02 ~]# kubectl get pod
Unable to connect to the server: dial tcp 192.168.42.170:6443: connect: no route to host
# 如果出现这个问题则修改配置文件 vim ~/.kube/config,把 server 指向存活的 apiserver(或 VIP)。下图展示停止master01后集群仍然高可用
分发到其他Master节点
- 分发无状态配置文件: cd /opt && for i in {master02,master03};do scp -r kube-apiserver kube-scheduler kube-controller-manager $i:/opt;done
- 启动组件: for i in {master02,master03};do ssh $i "supervisorctl update && supervisorctl start all";done
- 查看状态: kubectl get cs
6.kubelet
文件结构图
- tmp-bootstrap 目录是动态签发 bootstrap-token 时使用的,如果使用固定 token 可以忽略
/opt/kubelet/
├── certs
│ ├── apiserver-key.pem
│ ├── apiserver.pem
│ ├── ca-key.pem
│ ├── ca.pem
│ ├── client-key.pem
│ ├── client.pem
│ ├── kubelet-key.pem
│ └── kubelet.pem
├── config
│ ├── bootstrap.kubeconfig
│ ├── csr-crb.yaml
│ ├── kubelet-config.json
│ └── kubelet.kubeconfig
├── sh
│ └── kubelet-run.sh
└── tmp-bootstrap
├── kubelet-bootstrap-k8s-master01.kubeconfig
├── kubelet-bootstrap-k8s-master02.kubeconfig
├── kubelet-bootstrap-k8s-master03.kubeconfig
├── kubelet-bootstrap-k8s-node01.kubeconfig
├── kubelet-bootstrap-k8s-node02.kubeconfig
├── kubelet-bootstrap-k8s-node03.kubeconfig
├── kubelet-bootstrap-k8s-node04.kubeconfig
├── kubelet-bootstrap-k8s-node05.kubeconfig
└── kubelet-bootstrap-k8s-node06.kubeconfig
建立TLS bootstrap secret来提供自动签证使用
设置config文件
- 如果为了方便可以不用给每一个节点都创建一个Bootstrap,只需要创建一个共享使用即可,以下是创建所有的
#如果APIserver不指定token则用自动生成(如果有指定则忽略--token-auth-file /opt/kube-apiserver/conf/token.csv)
export BOOTSTRAP_TOKEN=$(kubeadm token create \
--description kubelet-bootstrap-token \
--groups system:bootstrappers:${node_node_name} \
--kubeconfig ~/.kube/config)
# 定义所有集群节点的名称,以及 apiserver 的地址(keepalived VIP)
node_node_name="k8s-master01 k8s-master02 k8s-master03 k8s-node01 k8s-node02 k8s-node03 k8s-node04 k8s-node05 k8s-node06"
export KUBE_APISERVER="https://172.16.128.240:7443"
# 使用固定 token 时,只取 token.csv 的第一列作为 token
export BOOTSTRAP_TOKEN=$(awk -F',' '{print $1}' /opt/kube-apiserver/conf/token.csv)
#复制以下所有执行
for node_node_name in $node_node_name
do
echo This Bootstrap Host is $node_node_name
# 创建 token(该步骤会对所有的k8s节点进行bootstrap证书创建)
# 设置集群参数
kubectl config set-cluster kubernetes \
--certificate-authority=/root/certs/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=kubelet-bootstrap-${node_node_name}.kubeconfig
# 设置客户端认证参数
kubectl config set-credentials kubelet-bootstrap \
--token=${BOOTSTRAP_TOKEN} \
--kubeconfig=kubelet-bootstrap-${node_node_name}.kubeconfig
# 设置上下文参数
kubectl config set-context default \
--cluster=kubernetes \
--user=kubelet-bootstrap \
--kubeconfig=kubelet-bootstrap-${node_node_name}.kubeconfig
# 设置默认上下文
kubectl config use-context default --kubeconfig=kubelet-bootstrap-${node_node_name}.kubeconfig
done
# 授权kubelet可以创建 csr
kubectl create clusterrolebinding kubeadm:kubelet-bootstrap \
--clusterrole system:node-bootstrapper --group system:bootstrappers
# 查看创建出来的token
kubeadm token list
- 拷贝到所有的节点:
for i in $node_node_name;do scp -r kubelet-bootstrap-$i.kubeconfig $i:/opt/kubelet/config/bootstrap.kubeconfig;done
自动批准csr
cat <<EOF | kubectl apply -f -
# Approve all CSRs for the group "system:bootstrappers"
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: auto-approve-csrs-for-group
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
  apiGroup: rbac.authorization.k8s.io
EOF
允许kubelet自动续期自己的证书
cat <<EOF | kubectl apply -f -
# Approve renewal CSRs for the group "system:nodes"
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: auto-approve-renewals-for-nodes
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
  apiGroup: rbac.authorization.k8s.io
EOF
- 绑定组关系:
kubectl create clusterrolebinding system-node-role-bound --clusterrole=system:node --group=system:nodes
准备证书
- 签发kubelet的证书
- hosts字段最好预先规划好之后要增加的节点信息
- mkdir -p /opt/kubelet/{sh,certs,config}
cat > /root/certs/kubelet-csr.json <<EOF
{
"CN": "kubelet-node",
"hosts": [
"127.0.0.1",
"172.16.128.60",
"172.16.128.61",
"172.16.128.62",
"172.16.128.63",
"172.16.128.64",
"172.16.128.65",
"172.16.128.66",
"172.16.128.67",
"172.16.128.68",
"172.16.128.69",
"172.16.128.70",
"172.16.128.71",
"172.16.128.72"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "guangzhou",
"L": "guangzhou",
"O": "k8s",
"OU": "ops"
}
]
}
EOF
生成证书
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server kubelet-csr.json | cfssl-json -bare kubelet
#kubelet需要使用的证书
apiserver-key.pem
apiserver.pem
ca-key.pem
ca.pem
client-key.pem
client.pem
kubelet-key.pem
kubelet.pem
创建admin证书
cat > /root/certs/admin-csr.json <<EOF
{
"CN": "admin",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "guangzhou",
"L": "guangzhou",
"O": "system:masters",
"OU": "system"
}
]
}
EOF
- 生成证书: cfssl gencert -ca-key=ca-key.pem -ca=ca.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssl-json -bare admin
设置集群用户查看对应集群状态
- 创建用户kubernetes并且指定rolebinding
- cd /opt/kubelet/config
#1.创建Cluster kubernetes(--server指定的为你创建的keepalived的VIP地址)
kubectl config set-cluster kubernetes \
--certificate-authority=/root/certs/ca.pem \
--embed-certs=true \
--server=https://172.16.128.240:7443 \
--kubeconfig=kubelet.kubeconfig
#2.创建用户
kubectl config set-credentials admin \
--client-certificate=/root/certs/admin.pem \
--client-key=/root/certs/admin-key.pem \
--embed-certs=true \
--kubeconfig=kubelet.kubeconfig
#3.设置用户上下文context
kubectl config set-context kubernetes-context \
--cluster=kubernetes \
--user=admin \
--kubeconfig=kubelet.kubeconfig
#4.切换到用户kubernetes的context上下文
kubectl config use-context kubernetes-context --kubeconfig=kubelet.kubeconfig
绑定k8s系统RBAC权限
- 建立绑定关系: kubectl apply -f /opt/kubelet/config/k8s-node.yaml
- 检查是否binding成功: kubectl get clusterrolebinding k8s-node
cat > /opt/kubelet/config/k8s-node.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: k8s-node
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:node
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: k8s-node
EOF
准备pause镜像
- pause:是 Pod 中最先启动的容器,用来 hold 住 Pod 的 network、UTS 等 namespace,接受来自 CNI 的网络和 IP 设置,之后一直处于 pause 状态,等待后续业务容器加入它的 namespace、共享同一个网络协议栈;该镜像必须存在,否则 Pod 起不来
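后面 kubelet 的 --pod-infra-container-image 指向的是私有仓库里的 pause 镜像;可以先从公网拉取后改名推到自己的 harbor(示例,harbor.ym 是上文假设的私有仓库地址,镜像源不可用时换成任意可访问的 pause 镜像即可):
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1 harbor.ym/k8s/pause:latest
docker push harbor.ym/k8s/pause:latest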
生成kubelet-config.json配置文件
cat > /opt/kubelet/config/kubelet-config.json <<EOF
{
"kind": "KubeletConfiguration",
"apiVersion": "kubelet.config.k8s.io/v1beta1",
"authentication": {
"x509": {
"clientCAFile": "/opt/kubelet/certs/ca.pem"
},
"webhook": {
"enabled": true,
"cacheTTL": "2m0s"
},
"anonymous": {
"enabled": false
}
},
"authorization": {
"mode": "Webhook",
"webhook": {
"cacheAuthorizedTTL": "5m0s",
"cacheUnauthorizedTTL": "30s"
}
},
"tlsCertFile": "/opt/kubelet/certs/kubelet.pem",
"tlsPrivateKeyFile": "/opt/kubelet/certs/kubelet-key.pem",
"address": "0.0.0.0",
"port": 10250,
"readOnlyPort": 0,
"cgroupDriver": "systemd",
"hairpinMode": "promiscuous-bridge",
"serializeImagePulls": false,
"rotateCertificates": true,
"featureGates": {
"RotateKubeletClientCertificate": true,
"RotateKubeletServerCertificate": true
},
"maxPods": 512,
"failSwapOn": false,
"containerLogMaxSize": "10Mi",
"containerLogMaxFiles": 5,
"clusterDomain": "cluster.local.",
"clusterDNS": ["10.254.0.2"]
}
EOF
创建启动文件
- 根据每一个节点的名称修改 --hostname-override(必须和 kube-proxy 的 hostnameOverride 保持一致)
cat > /opt/kubelet/sh/kubelet-run.sh <<EOF
#!/bin/bash
/usr/bin/kubelet \
--hostname-override k8s-master01 \
--kubeconfig /opt/kubelet/config/kubelet.kubeconfig \
--config /opt/kubelet/config/kubelet-config.json \
--bootstrap-kubeconfig /opt/kubelet/config/bootstrap.kubeconfig \
--log-dir /data/www/logs/kube-kubelet \
--pod-infra-container-image harbor.ym/k8s/pause:latest \
--cni-conf-dir /root/binary-install/cni/ \
--root-dir /data/kubelet \
--v 5
EOF
创建kubelet的supervisor启动配置
cat > /etc/supervisord.d/kube-kubelet.ini <<EOF
[program:kube-kubelet]
command=sh /opt/kubelet/sh/kubelet-run.sh
numprocs=1
directory=/opt/kubelet/sh/
autostart=true
autorestart=true
startsecs=22
startretries=3
exitcodes=0,2
stopsignal=QUIT
stopwaitsecs=10
user=root
redirect_stderr=false
stdout_logfile=/data/www/logs/kube-kubelet/kubelet.stdout.log
stdout_logfile_maxbytes=64MB
stdout_logfile_backups=4
stdout_capture_maxbytes=1MB
stdout_events_enabled=false
stderr_logfile=/data/www/logs/kube-kubelet/kubelet.stderr.log
stderr_logfile_maxbytes=64MB
stderr_logfile_backups=4
stderr_capture_maxbytes=1MB
stderr_events_enabled=false
EOF
#启动kubelet
mkdir -p /data/www/logs/kube-kubelet && supervisorctl update
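kubelet 起来后,可以确认 bootstrap 的 CSR 已被自动批准、节点已注册(检查示例):
# CSR 应为 Approved,Issued 状态
kubectl get csr
# 新节点短暂 NotReady 属于正常,网络插件就绪后会变为 Ready
kubectl get nodes -o wide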
效果图
- 将配置分发到其他master节点和node节点
- kubelet 和 kube-proxy 的 hostname 需要保持一致才能使用;分发给其他节点时,切记同步更新 kubelet 的 --hostname-override 参数和 kube-proxy 的 hostnameOverride 参数
- 改配置和拷贝数据到其他master:
cd /opt && for i in {master02,master03};do scp -r kubelet kube-proxy $i:/opt && ssh $i "sed -i 's#master01#$i#g' /opt/kubelet/sh/kubelet-run.sh; sed -i 's#master01#$i#g' /opt/kube-proxy/sh/kube-proxy-run.sh";done
- 拷贝supervisor文件并启动:
for i in {master02,master03};do ssh $i "mkdir -p /data/www/logs/kube-kubelet /data/www/logs/kube-proxy && supervisorctl update && supervisorctl start kube-kubelet kube-proxy";done
- 查看节点状态:
curl -sSL --cacert /root/certs/ca.pem --cert /root/certs/admin.pem --key /root/certs/admin-key.pem https://172.16.128.240:8443/api/v1/nodes/k8s-node01/proxy/configz | jq '.kubeletconfig|.kind="KubeletConfiguration"|.apiVersion="kubelet.config.k8s.io/v1beta1"'
- 故障问题查看: https://blog.csdn.net/qq_21816375/article/details/86693867
最终效果图
给节点打上标签和污点
# 给所有的master打上污点
master="k8s-master01 k8s-master02 k8s-master03"
for a in $master;do kubectl taint nodes $a node-role.kubernetes.io/master="":NoSchedule ;done
# 给所有的node节点标签
nodename="node01 node02 node03 node04 node05 node06"
for i in $nodename;do kubectl label node k8s-$i node-role.kubernetes.io/worker=worker ;done
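打完之后可以确认一下(检查示例,节点名按实际为准):
kubectl get nodes --show-labels
kubectl describe node k8s-master01 | grep Taints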
5.coredns
- mkdir -p ~/binary-install/coredns/ && cd ~/binary-install/coredns/
- 执行克隆: git clone https://github.com/coredns/deployment.git
# 1.指定特定的CoreDNS IP
[root@k8s-m1 kubernetes]# vim /root/binary-install/coredns/deployment/kubernetes/deploy.sh
111 if [[ -z $CLUSTER_DNS_IP ]]; then
112 # Default IP to kube-dns IP
113 #CLUSTER_DNS_IP=$(kubectl get service --namespace kube-system kube-dns -o jsonpath="{.spec.clusterIP}")
# 修改这个指定IP地址(默认是从集群中获取)这个IP必须和kube-proxy中保持一致
114 CLUSTER_DNS_IP='10.168.0.2'
# 2.执行创建CoreDNS
[root@k8s-m1 kubernetes]# ./deploy.sh | kubectl apply -f -
# 3.查看是否成功创建
[root@k8s-m1 kubernetes]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-59c6ddbf5d-p5vvw 1/1 Running 0 157m
检验是否成功
mkdir -p ~/ns
cat > ~/ns/busybox-test.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox:1.28.4
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF
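创建该 Pod 后,用 nslookup 验证 coredns 能否解析集群内外的域名(检查示例):
kubectl apply -f ~/ns/busybox-test.yaml
kubectl exec -it busybox -- nslookup kubernetes.default
kubectl exec -it busybox -- nslookup www.baidu.com
# 能返回 kubernetes.default.svc.cluster.local 对应的 ClusterIP 即表示 DNS 正常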
4.网络组件
| 网络选择方案 | draft |
| --- | --- |
| flannel | true |
| calico | true |
- kubernetes 要求集群内各节点(包括 master 节点)能通过 Pod 网段互联互通。flannel 使用 vxlan 技术为各节点创建一个可以互通的 Pod 网络,使用的端口为 UDP 8472,需要开放该端口如果使用公有云
Calico
安装calico网络,使用IPIP模式
Flanneld
- flannel 第一次启动时,从 etcd 获取 Pod 网段信息,为本节点分配一个未使用的 /24 段地址,然后创建 flannel.1(也可能是其它名称,如 flannel1 等) 接口
- flannel 将分配的 Pod 网段信息写入 /run/flannel/docker 文件,docker 后续使用这个文件中的环境变量设置 docker0 网桥
- 缺点flannel不支持网络策略
下载二进制文件
- 下载地址(版本自由选择): https://github.com/coreos/flannel/releases
wget https://github.com/coreos/flannel/releases/download/v0.12.0/flannel-v0.12.0-linux-amd64.tar.gz
tar xvf flannel-v0.12.0-linux-amd64.tar.gz
cp flanneld /opt/k8s/bin/
cp mk-docker-opts.sh /opt/k8s/bin/
ln -s /root/binary-install/flanneld/flanneld /usr/bin/
目录结构
/opt/flanneld/
├── certs
│ ├── ca.pem
│ ├── flanneld-key.pem
│ └── flanneld.pem
├── config
│ └── subnet.env
└── sh
└── flannel-run.sh
证书签证
cat > /root/certs/flanneld-csr.json <<EOF
{
"CN": "flanneld",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "guangzhou",
"L": "guangzhou",
"O": "k8s",
"OU": "ops"
}
]
}
EOF
# 签发证书
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes flanneld-csr.json | cfssl-json -bare flanneld
# 创建目录
mkdir -p /opt/flanneld/{certs,sh,config} /data/www/logs/flanneld
# 拷贝文件
cp -r ca.pem flanneld-key.pem flanneld.pem /opt/flanneld/certs
# 分发到所有节点
scp -r xxxxx
写入etcd网段地址
- Network : 设置指定的网段
[root@k8s-m1 certs]# ETCDCTL_API=2 etcdctl \
--endpoints="https://172.16.128.60:2379,https://172.16.128.61:2379,https://172.16.128.62:2379" \
--ca-file=/opt/flanneld/certs/ca.pem \
--cert-file=/opt/flanneld/certs/flanneld.pem \
--key-file=/opt/flanneld/certs/flanneld-key.pem \
set /kubernetes/network/config '{"Network":"10.168.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}'
#输出的信息
{"Network":"10.168.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}
- 查看某一Pod网段对应的 flanneld 进程监听的IP和网络参数
[root@k8s-m1 certs]# ETCDCTL_API=2 etcdctl \
--endpoints="https://172.16.128.60:2379,https://172.16.128.61:2379,https://172.16.128.62:2379" \
--ca-file=/opt/flanneld/certs/ca.pem \
--cert-file=/opt/flanneld/certs/flanneld.pem \
--key-file=/opt/flanneld/certs/flanneld-key.pem \
get /kubernetes/network/subnets/10.168.76.0-24
[root@k8s-m1 certs]# ETCDCTL_API=2 etcdctl --endpoints="https://172.16.128.60:2379,https://172.16.128.61:2379,https://172.16.128.62:2379" --ca-file=/opt/flanneld/certs/ca.pem --cert-file=/opt/flanneld/certs/flanneld.pem --key-file=/opt/flanneld/certs/flanneld-key.pem set /coreos.com/network/config '{ "Network": "172.17.0.0/16", "Backend": {"Type":"vxlan"}}'
平滑迁移flannel到Calico
提示:生产中切记不要这么操作风险极大,最好的办法是把业务迁移到新的集群中在进行相应的备份后方可更换网络插件,十分不建议生产后更改网络方式
备份好etcd
ETCDCTL_API=3 etcdctl \
> --endpoints="https://172.16.128.60:2379,https://172.16.128.61:2379,https://172.16.128.62:2379" \
> --cacert=ca.pem \
> --cert=client.pem \
> --key=client-key.pem snapshot save ~/back/etcd-snapshot-`date +%Y%m%d`.db
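有了快照之后,需要恢复时可以用 snapshot restore 在每个 etcd 节点重建数据目录(示例,--name/--initial-cluster 按上文 etcd 启动脚本里的值填写,快照文件名按实际替换):
ETCDCTL_API=3 etcdctl snapshot restore ~/back/etcd-snapshot-$(date +%Y%m%d).db \
--name etcd-master01 \
--initial-cluster etcd-master01=https://172.16.128.60:2380,etcd-master02=https://172.16.128.61:2380,etcd-master03=https://172.16.128.62:2380 \
--initial-advertise-peer-urls https://172.16.128.60:2380 \
--data-dir /data/www/etcd-restore
# 确认无误后停掉 etcd,把 /data/www/etcd-restore 替换到原数据目录再启动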
准备启动文件
cat > /opt/flanneld/sh/flannel-run.sh <<EOF
#!/bin/bash
/usr/bin/flanneld \
--public-ip=172.16.128.60 \
--etcd-cafile=/opt/flanneld/certs/ca.pem \
--etcd-certfile=/opt/flanneld/certs/flanneld.pem \
--etcd-keyfile=/opt/flanneld/certs/flanneld-key.pem \
--etcd-endpoints https://172.16.128.60:2379,https://172.16.128.61:2379,https://172.16.128.62:2379 \
--etcd-prefix=/kubernetes/network \
--iface=eth0 \
--subnet-file=/opt/flanneld/config/subnet.env \
--healthz-port=2401
EOF
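前面提到 flannel 会把子网信息写到 subnet.env,docker 需要用它来设置 docker0 的网段;一个参考做法是用发布包里的 mk-docker-opts.sh 生成 docker 启动参数(示例,路径按本文约定假设,ExecStart 按自己 docker.service 原有参数保留):
# 由 subnet.env 生成 DOCKER_NETWORK_OPTIONS(含 --bip 等参数)
/opt/k8s/bin/mk-docker-opts.sh -f /opt/flanneld/config/subnet.env -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
cat /run/flannel/docker
# 让 docker.service 引用这个环境文件(systemd drop-in 示例)
mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/flannel.conf <<EOF
[Service]
EnvironmentFile=/run/flannel/docker
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// \$DOCKER_NETWORK_OPTIONS
EOF
systemctl daemon-reload && systemctl restart docker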
准备supervisor文件
cat > /etc/supervisord.d/kube-flanneld.ini <<EOF
[program:kube-flanneld]
command=/opt/flanneld/sh/flannel-run.sh
numprocs=1
directory=/opt/flanneld/
autostart=true
autorestart=true
startsecs=22
startretries=3
exitcodes=0,2
stopsignal=QUIT
stopwaitsecs=10
user=root
redirect_stderr=false
stdout_logfile=/data/www/logs/flanneld/flanneld-stdout.log
stdout_logfile_maxbytes=64MB
stdout_logfile_backups=4
stdout_capture_maxbytes=1MB
stdout_events_enabled=false
stderr_logfile=/data/www/logs/flanneld/flanneld-stderr.log
stderr_logfile_maxbytes=64MB
stderr_logfile_backups=4
stderr_capture_maxbytes=1MB
stderr_events_enabled=false
EOF
- 更新配置: supervisorctl update
- 查看日志: tailf /data/www/logs/flanneld/flanneld-stdout.log
7.kube-proxy
Kube-proxy是实现Service的关键插件,kube-proxy会在每台节点上执行,然后监听API Server的Service与Endpoint资源物件的改变,然后来依据变化执行iptables来实现网路的转发。如果为了扩展性建议一个DaemonSet来执行,并且建立一些需要的Certificates。但是这里是完全二进制部署。
目录结构
/opt/kube-proxy/
├── certs
│ ├── ca.pem
│ ├── kube-proxy-key.pem
│ └── kube-proxy.pem
├── config
│ ├── kube-proxy.kubeconfig
│ └── kube-proxy.yaml
└── sh
└── kube-proxy-run.sh
签发证书
- 创建工作目录: mkdir -p /opt/kube-proxy/{sh,certs,config} /data/www/logs/kube-proxy
- 去到制作证书的目录: /root/certs/
cat > /root/certs/kube-proxy-csr.json <<EOF
{
"CN": "system:kube-proxy",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "guangzhou",
"L": "guangzhou",
"O": "system:masters",
"OU": "system"
}
]
}
EOF
#签证
cfssl gencert \
-ca-key=ca-key.pem \
-ca=ca.pem \
-config=ca-config.json \
-profile=kubernetes kube-proxy-csr.json | cfssl-json -bare kube-proxy
#拷贝指定的证书
cp -r ca.pem kube-proxy.pem kube-proxy-key.pem /opt/kube-proxy/certs
创建kube-proxy.kubeconfig文件
cd /opt/kube-proxy/config/
# 配置集群(记得改VIP地址)
kubectl config set-cluster kubernetes \
--certificate-authority=/root/certs/ca.pem \
--embed-certs=true \
--server=https://VIP:port \
--kubeconfig=kube-proxy.kubeconfig
# 配置客户端认证
kubectl config set-credentials kube-proxy \
--client-certificate=/opt/kube-proxy/certs/kube-proxy.pem \
--client-key=/opt/kube-proxy/certs/kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=kube-proxy.kubeconfig
# 配置关联
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-proxy \
--kubeconfig=kube-proxy.kubeconfig
# 配置默认关联
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
# 检查状态
kubectl config view --kubeconfig=kube-proxy.kubeconfig
配置kube-proxy.yaml
- 需要修改成和本机kubelet一致的host名称:
hostnameOverride: k8s-master01
cat > /opt/kube-proxy/config/kube-proxy.yaml <<EOF
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
clientConnection:
  kubeconfig: /opt/kube-proxy/config/kube-proxy.kubeconfig
clusterCIDR: 10.168.0.0/16
healthzBindAddress: 127.0.0.1:10256
hostnameOverride: k8s-master01
kind: KubeProxyConfiguration
metricsBindAddress: 172.16.128.60:10249
mode: "ipvs"
EOF
创建启动文件
cat > /opt/kube-proxy/sh/kube-proxy-run.sh <<EOF
#!/bin/sh
/usr/bin/kube-proxy \
--config=/opt/kube-proxy/config/kube-proxy.yaml \
--logtostderr=true \
--v=4
EOF
配置supervisor
cat > /etc/supervisord.d/kube-proxy.ini <<EOF
[program:kube-proxy]
command=sh /opt/kube-proxy/sh/kube-proxy-run.sh
numprocs=1
directory=/opt/kube-proxy/
autostart=true
autorestart=true
startsecs=22
startretries=3
exitcodes=0,2
stopsignal=QUIT
stopwaitsecs=10
user=root
redirect_stderr=false
stdout_logfile=/data/www/logs/kube-proxy/proxy.log
stdout_logfile_maxbytes=64MB
stdout_logfile_backups=4
stdout_capture_maxbytes=1MB
stdout_events_enabled=false
stderr_logfile=/data/www/logs/kube-proxy/proxy-err.log
stderr_logfile_maxbytes=64MB
stderr_logfile_backups=4
stderr_capture_maxbytes=1MB
stderr_events_enabled=false
EOF
- 更新配置: supervisorctl update
- 检查日志: tailf /data/www/logs/kube-proxy/proxy-err.log
- 将配置分发到其他master节点和node节点
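kube-proxy 以 ipvs 模式运行后,可以用 ipvsadm 确认虚拟服务器规则已经生成(检查示例):
# 应能看到 kubernetes 等 service 的 ClusterIP 对应的转发条目
ipvsadm -Ln | head -n 20
kubectl get svc --all-namespaces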
8.监控组件
在了解Metrics-Server之前,必须要事先了解下Metrics API的概念。Metrics API相比于之前的监控采集方式(heapster)是一种新的思路,官方希望核心指标的监控应该是稳定的,版本可控的,且可以直接被用户访问(例如通过使用 kubectl top 命令),或由集群中的控制器使用(如HPA),和其他的Kubernetes APIs一样。官方废弃heapster项目,就是为了将核心资源监控作为一等公民对待,即像pod、service那样直接通过api-server或者client直接访问,不再是安装一个heapster来汇聚且由heapster单独管理。
- 用于支持自动扩缩容的 CPU/memory HPA metrics:metrics-server
- 通用的监控方案:使用第三方可以获取 Prometheus 格式监控指标的监控系统,如 Prometheus Operator
- 事件传输:使用第三方工具来传输、归档 kubernetes events
- 如果有harbor最好提前去下载镜像改名然后再改下资源列表清单
- 下载资源列表清单地址: metrics-server-github地址
- 最新Bug K8S>1.19.x 必须把docker升级到19版本且重启所有节点的kubelet
结构体
/root/binary-install/metrics/
├── aggregated-metrics-reader.yaml
├── auth-delegator.yaml
├── auth-reader.yaml
├── metrics-apiservice.yaml
├── metrics-server-deployment.yaml
├── metrics-server-service.yaml
└── resource-reader.yaml
修改参数
- metrics-server-deployment.yaml
# 可以去官方wget发布的realse
vim metrics-server-deployment.yaml
# 需要修改的配置
spec:
.....
spec:
hostNetwork: true #
args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname,InternalDNS,ExternalDNS
- --kubelet-use-node-status-port
#不要验证Kubelets提供的服务证书的CA。仅用于测试目的。
- --kubelet-insecure-tls
# 如果有健康检测的403报错则可以加这个
- --authorization-always-allow-paths=/livez,/readyz,/openapi/v2
- 授权:
kubectl create clusterrolebinding custom-metric-with-cluster-admin --clusterrole=cluster-admin --user=system:metrics-server
- 检测采集到的节点指标:
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .
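metrics-server 正常工作后,kubectl top 就可以直接出数据(检查示例,部署后需等待一两分钟采集;label 以实际部署清单为准):
kubectl -n kube-system get pod -l k8s-app=metrics-server
kubectl top nodes
kubectl top pods -n kube-system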
9.k8s-Dashboard
部署步骤
# 下载yaml文件
wget -O kubernetes-dashboard.yaml https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.1/aio/deploy/recommended.yaml
签发证书
# 生成证书配置文件
cat > dashboard-csr.json <<EOF
{
"CN": "k8s-dashboard",
"hosts": [
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "guangzhou",
"L": "guangzhou",
"O": "k8s",
"OU": "ops"
}
]
}
EOF
# 签证
cfssl gencert \
-ca-key=ca-key.pem \
-ca=ca.pem \
-config=ca-config.json \
-profile=kubernetes dashboard-csr.json | cfssl-json -bare dashboard
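recommended.yaml 默认用 --auto-generate-certificates 自动生成证书;若要使用上面自签的证书,需要替换 kubernetes-dashboard-certs 这个 secret 并调整启动参数。登录认证最常用的做法是建一个管理员 ServiceAccount 再取它的 token(示例,账号名 dashboard-admin 为假设):
kubectl apply -f kubernetes-dashboard.yaml
# 创建登录用的管理员账号并授权(名称为示例)
kubectl -n kubernetes-dashboard create serviceaccount dashboard-admin
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kubernetes-dashboard:dashboard-admin
# 取 token,粘贴到 dashboard 登录页即可
kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | awk '/dashboard-admin-token/{print $1}') | grep ^token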