1 - Introduction

An introduction to Ubuntu Server

1.1 - Introducing Ubuntu Server

An introduction to Ubuntu Server

Ubuntu Server

Ubuntu is an open-source operating system widely used on personal computers, smartphones, servers, in cloud computing, and on smart IoT devices.

Ubuntu Server is currently the most popular cloud guest operating system on public clouds.

Why choose it

My main reason for choosing Ubuntu Server is that I have always used the Ubuntu family on the desktop, first Ubuntu itself and more recently the Ubuntu-based Linux Mint, so I am already comfortable with Ubuntu and with tools such as apt-get.

I therefore prefer the Ubuntu Server line on servers as well.

1.2 - Ubuntu Server Releases

An overview of Ubuntu Server releases

Ubuntu Server has several mainstream releases; to be safe, an LTS release is generally recommended.

Ubuntu Server 23.04

Release 23.04 ships with Linux kernel 6.2.0-20.

Ubuntu Server 22.04

Release 22.04.2 ships with Linux kernel 5.15.0-60.

Ubuntu Server 20.04

Release 20.04.3 ships with Linux kernel 5.4.0.

Ubuntu Server 18.04

Fairly old by now; not recommended unless necessary.

1.3 - Ubuntu Server Resources

Collected Ubuntu Server resources

Official resources

Chinese-language sites

  • Ubuntu Chinese Wiki: reportedly the community recognized by the Ubuntu project, responsible for the distribution's Chinese translation and development and for providing localization support.

2 - Installation

Installing and configuring Ubuntu Server, plus basic software

2.1 - Installing Ubuntu Server

Installing Ubuntu Server

Notes

Disconnect the network during installation

After installation completes, Ubuntu Server automatically starts updating. Since no domestic mirror has been configured yet, this is very slow, can take as long as 30 minutes, cannot be interrupted, and is a complete waste of time.

The best practice is therefore to install without a network connection, which makes the installation very fast (about 3 minutes on an SSD). After installation, configure the mirrors and then run apt update and apt upgrade.

Installing on physical hardware

A routine installation; nothing special.

My usual disk partitioning scheme:

  1. EFI partition: 200 or 300 MB
  2. Root: / gets all the space except the timeshift partition
  3. timeshift backup partition: usually 50-100 GB reserved as the timeshift backup partition; very practical.

It is generally advisable to partition the disk with a dedicated partitioning tool before installing, because the Ubuntu installer's partitioning options are fairly crude. But if Ubuntu gets a whole disk to itself, the setup is simple enough that the Ubuntu installer will do.

Choose the custom option:

  1. "reformat": wipe the whole disk first, removing all existing partitions
  2. "use as boot device": mark the disk as the boot device, so the installer creates an EFI partition by default, 512 MB in size
  3. "add gpt partition": create a partition in the disk's free space, taking all space except the timeshift partition, formatted as ext4 and mounted at /
  4. "add gpt partition": create a partition in the remaining space, formatted as ext4 and mounted at /timeshift

Note: when installing alongside Windows, there is no need to create an extra ESP; the Ubuntu Server installer automatically selects the ESP where Windows lives, and this cannot be changed. (Not verified on recent releases.)

Be sure to select the OpenSSH service during installation.

Installing in a virtual machine

Installing Ubuntu Server is very easy, and with VMware it is easier still: VMware handles the installer's settings automatically and the installation completes unattended.

Installing on ESXi

Create a new virtual machine in ESXi, boot from the Ubuntu Server ISO, and step through the installer.

For the boot firmware, choose EFI and enable Secure Boot.

Installing on RAID

References:

You cannot simply use the "Create Software RAID (md)" feature to turn two disks into a RAID 0/1 volume directly; that fails because there is nowhere for the boot partition:

If you put all disks into RAIDS or LVM VGs, there will be nowhere to put the boot partition.

The basic approach is to partition both disks as you would for a normal Ubuntu Server installation, keeping the partition layout identical on both. The partitions are:

  • ESP: used only on the first disk; the second disk just needs a partition of the same size
  • a partition for "/"
  • a partition for "/timeshift"

Then build a RAID 0 md device from each pair of prepared partitions, format them as ext4, and mount them at "/" and "/timeshift" respectively.

Then install Ubuntu Server as usual.

2.2 - Common Post-install Issues

Issues commonly seen after installing Ubuntu Server

Login shows an error about failing to connect to ubuntu.com

After logging in over SSH, you sometimes (mainly on Ubuntu 22.04) see a message like this:

ssh sky@192.168.0.152
Welcome to Ubuntu 22.04.2 LTS (GNU/Linux 5.15.0-77-generic x86_64)

......

0 updates can be applied immediately.

Enable ESM Apps to receive additional future security updates.
See https://ubuntu.com/esm or run: sudo pro status

Failed to connect to https://changelogs.ubuntu.com/meta-release-lts. Check your Internet connection or proxy settings


Last login: Mon Jun 26 08:10:26 2023 from 192.168.0.90

The fix is as follows:

sudo rm /var/lib/ubuntu-release-upgrader/release-upgrade-available
/usr/lib/ubuntu-release-upgrader/release-upgrade-motd

Log in again and the error message is gone:

ssh sky@192.168.0.152
Welcome to Ubuntu 22.04.2 LTS (GNU/Linux 5.15.0-77-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

  System information as of Mon Jun 26 03:15:45 PM UTC 2023

  System load: 0.00537109375      Memory usage: 2%   Processes:       258
  Usage of /:  1.5% of 441.87GB   Swap usage:   0%   Users logged in: 0


Expanded Security Maintenance for Applications is not enabled.

0 updates can be applied immediately.

Enable ESM Apps to receive additional future security updates.
See https://ubuntu.com/esm or run: sudo pro status


Last login: Mon Jun 26 08:31:13 2023 from 192.168.0.90

References:

2.3 - Configuring Timeshift

Installing and configuring Timeshift to back up and restore the Ubuntu system

2.3.1 - About Timeshift

An introduction to Timeshift

Timeshift is a free and open-source tool that creates incremental snapshots of the file system. Snapshots can be created in either RSYNC or BTRFS mode.

Project:

https://github.com/teejee2008/timeshift

References:

Right after Ubuntu Server is installed, install Timeshift and take a backup first thing, so that you can roll back at any point during subsequent configuration.

2.3.2 - Installing and Configuring Timeshift

Installing and configuring Timeshift

Installation

sudo apt install timeshift

When it finishes, take a look:

$ sudo timeshift

Timeshift v20.03 by Tony George (teejeetech@gmail.com)

Syntax:

  timeshift --check
  timeshift --create [OPTIONS]
  timeshift --restore [OPTIONS]
  timeshift --delete-[all] [OPTIONS]
  timeshift --list-{snapshots|devices} [OPTIONS]

Configuration

After a default install, we need to edit Timeshift's configuration file before the first run; otherwise Timeshift will simply pick some ext4 partition as the backup area by default.

Take a look at the current disks:

$ sudo fdisk -l
Disk /dev/loop0: 55.45 MiB, 58130432 bytes, 113536 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/loop1: 70.32 MiB, 73728000 bytes, 144000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/loop2: 32.3 MiB, 33865728 bytes, 66144 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/nvme0n1: 838.37 GiB, 900185481216 bytes, 219771846 sectors
Disk model: MZ1LB960HBJR-000FB                      
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 131072 bytes / 131072 bytes
Disklabel type: gpt
Disk identifier: 7C431E31-78CA-4600-9C2F-C68D10E793CC

Device             Start       End   Sectors  Size Type
/dev/nvme0n1p1       256    131327    131072  512M EFI System
/dev/nvme0n1p2    131328 196739327 196608000  750G Linux filesystem
/dev/nvme0n1p3 196739328 219771391  23032064 87.9G Linux filesystem

Here /dev/nvme0n1p3 is the partition I reserved for Timeshift, placed on the NVMe disk to keep backup and restore fast.

$ sudo lsblk -f
NAME        FSTYPE   LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINT
loop0       squashfs                                                  0   100% /snap/core18/2128
loop1       squashfs                                                  0   100% /snap/lxd/21029
loop2       squashfs                                                  0   100% /snap/snapd/12704
nvme0n1                                                                        
├─nvme0n1p1 vfat           72C9-B4E4                             504.9M     1% /boot/efi
├─nvme0n1p2 ext4           a83415e6-ed69-4932-9d08-1e87d7510dc1  689.1G     1% /
└─nvme0n1p3 ext4           9b22569d-9410-48cc-b994-10257b2d0498   81.5G     0% /run/timeshift/backup

Note the UUID of nvme0n1p3, then edit the configuration: open it with sudo vi /etc/timeshift/timeshift.json and set backup_device_uuid to the UUID of nvme0n1p3:

{
  "backup_device_uuid" : "9b22569d-9410-48cc-b994-10257b2d0498",
  "parent_device_uuid" : "",
  "do_first_run" : "true",
  "btrfs_mode" : "false",
  "include_btrfs_home" : "false",
  "stop_cron_emails" : "true",
  "schedule_monthly" : "false",
  "schedule_weekly" : "false",
  "schedule_daily" : "false",
  "schedule_hourly" : "false",
  "schedule_boot" : "false",
  "count_monthly" : "2",
  "count_weekly" : "3",
  "count_daily" : "5",
  "count_hourly" : "6",
  "count_boot" : "5",
  "snapshot_size" : "0",
  "snapshot_count" : "0",
  "exclude" : [
  ],
  "exclude-apps" : [
  ]
}

Run the timeshift command and you can see the configuration take effect:

sudo timeshift --list
First run mode (config file not found)
Selected default snapshot type: RSYNC

/dev/nvme0n1p3 is mounted at: /run/timeshift/backup, options: rw,relatime,stripe=32

Device : /dev/nvme0n1p3
UUID   : 9b22569d-9410-48cc-b994-10257b2d0498
Path   : /run/timeshift/backup
Mode   : RSYNC
Status : No snapshots on this device
First snapshot requires: 0 B

No snapshots found

2.3.3 - Creating Timeshift Snapshots

Backing up by creating Timeshift snapshots with the create command

Creating a snapshot

The command

The command to create a snapshot is:

sudo timeshift --create --comments "first backup after install" --tags O

Tag types:

  • O: Ondemand, the default, generally used for manually created snapshots
  • B: Boot
  • H: Hourly
  • D: Daily
  • W: Weekly
  • M: Monthly

Example

This is the first snapshot, taken right after the operating system and Timeshift were installed:

$ sudo timeshift --create --comments "first backup after install" 

/dev/nvme0n1p6 is mounted at: /run/timeshift/backup, options: rw,relatime

------------------------------------------------------------------------------
Estimating system size...
Creating new snapshot...(RSYNC)
Saving to device: /dev/nvme0n1p6, mounted at path: /run/timeshift/backup
Synching files with rsync...
Created control file: /run/timeshift/backup/timeshift/snapshots/2022-01-06_08-19-32/info.json
RSYNC Snapshot saved successfully (28s)
Tagged snapshot '2022-01-06_08-19-32': ondemand

When it finishes, check:

$ sudo timeshift --list

/dev/nvme0n1p6 is mounted at: /run/timeshift/backup, options: rw,relatime

Device : /dev/nvme0n1p6
UUID   : 208eb500-fd49-4580-b4ea-3b126d5b0fe4
Path   : /run/timeshift/backup
Mode   : RSYNC
Status : OK
1 snapshots, 96.1 GB free

Num     Name                 Tags  Description                 
------------------------------------------------------------------------------
0    >  2022-01-06_08-19-32  O     first backup after install  

2.4 - Basic Configuration

Basic configuration work after installing Ubuntu Server

2.4.1 - Configuring Update Sources

Configuring Ubuntu Server's update sources

Configuring update sources

Before doing any updates or installing software, it is advisable to configure the update sources first to ensure good speed.

If the server is in China, consider pointing apt at a domestic mirror; the speed is much better.

First back up the source list:

sudo cp /etc/apt/sources.list /etc/apt/sources.list_original
sudo vi /etc/apt/sources.list

Then edit the /etc/apt/sources.list file.
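Alternatively, instead of pasting one of the full lists below, the default list can be rewritten in place with sed. This is only a sketch: it assumes the stock entries point at archive.ubuntu.com and security.ubuntu.com, so verify that against your actual sources.list first (USTC is used as the target mirror here):

```shell
# Rewrite the stock Ubuntu mirror hosts in place
# (back up sources.list first, as above).
sudo sed -i \
  -e 's|http://archive.ubuntu.com|https://mirrors.ustc.edu.cn|g' \
  -e 's|http://security.ubuntu.com|https://mirrors.ustc.edu.cn|g' \
  /etc/apt/sources.list
```

The same two -e expressions work for any of the mirrors listed below; only the replacement host changes.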

Ubuntu 23.04

Aliyun mirror:

deb http://mirrors.aliyun.com/ubuntu/ lunar main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ lunar main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ lunar-security main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ lunar-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ lunar-updates main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ lunar-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ lunar-proposed main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ lunar-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ lunar-backports main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ lunar-backports main restricted universe multiverse

USTC mirror:

deb https://mirrors.ustc.edu.cn/ubuntu/ lunar main restricted universe multiverse
# deb-src https://mirrors.ustc.edu.cn/ubuntu/ lunar main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ lunar-updates main restricted universe multiverse
# deb-src https://mirrors.ustc.edu.cn/ubuntu/ lunar-updates main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ lunar-backports main restricted universe multiverse
# deb-src https://mirrors.ustc.edu.cn/ubuntu/ lunar-backports main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ lunar-security main restricted universe multiverse
# deb-src https://mirrors.ustc.edu.cn/ubuntu/ lunar-security main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ lunar-proposed main restricted universe multiverse
# deb-src https://mirrors.ustc.edu.cn/ubuntu/ lunar-proposed main restricted universe multiverse

Ubuntu 22.04

Aliyun mirror (unaccountably slow):

deb http://mirrors.aliyun.com/ubuntu/ jammy main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ jammy main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ jammy-security main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ jammy-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ jammy-updates main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ jammy-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ jammy-proposed main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ jammy-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ jammy-backports main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ jammy-backports main restricted universe multiverse

USTC mirror (very fast):

deb https://mirrors.ustc.edu.cn/ubuntu/ jammy main restricted universe multiverse
# deb-src https://mirrors.ustc.edu.cn/ubuntu/ jammy main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse
# deb-src https://mirrors.ustc.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse
# deb-src https://mirrors.ustc.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ jammy-security main restricted universe multiverse
# deb-src https://mirrors.ustc.edu.cn/ubuntu/ jammy-security main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse
# deb-src https://mirrors.ustc.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse

Ubuntu 20.04

Aliyun mirror:

deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
# deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse

USTC mirror:

deb https://mirrors.ustc.edu.cn/ubuntu/ focal main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-security main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-security main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse

Servers on Tencent Cloud or Alibaba Cloud come preconfigured with the Tencent/Aliyun mirrors; just use them as-is, the speed is excellent.

Occasionally the Aliyun or USTC mirror becomes unavailable (this never used to happen, but has been frequent lately); in that case try switching to another mirror.

NetEase (163) mirror:

deb http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse

Tsinghua mirror:

deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse

Updating the system

First bring all software up to date; a simple apt upgrade does it:

sudo apt update
sudo apt upgrade

Removing PPA repositories

Once too many PPA repositories have been added, apt update becomes much slower.

Since most software is not updated often and we rarely need updates immediately, it is advisable to disable these PPAs.

From the terminal

PPA repositories live in:

$ cd /etc/apt/sources.list.d
$ ls -l
git-core-ubuntu-ppa-focal.list

Open the file of the PPA you want to disable, such as the git PPA file above, and comment out its contents.

Do not delete the file outright: if you need updates later, you can simply uncomment the lines instead of hunting down the PPA address again.
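The commenting-out can also be scripted. A sketch, using the git PPA file above as the example (the filename differs for each PPA):

```shell
# Prefix each active "deb" line with "# ",
# disabling the PPA but keeping the file for later.
sudo sed -i 's/^deb/# deb/' /etc/apt/sources.list.d/git-core-ubuntu-ppa-focal.list
```

Re-enabling it later is the reverse substitution (s/^# deb/deb/).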

From the UI

"Start menu" -> "Administration" -> "Software Sources" -> "PPA": disable the PPAs of software that does not need timely updates.

2.4.2 - Changing the Hostname

Changing Ubuntu Server's hostname

Background

When cloning virtual machines on ESXi or another virtualization platform, you run into duplicate hostnames, so it is best to permanently change the hostname after cloning.

Check the current hostname:

hostname

Ubuntu 20.04

The hostname can be changed with hostnamectl:

sudo hostnamectl set-hostname newNameHere

Afterwards, also update the hostname in the hosts file:

sudo nano /etc/hosts

When done, just reboot:

sudo reboot
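The steps above can be combined into a short sketch (the new name skyserver4 is hypothetical, and the sed line assumes the old name appears verbatim in /etc/hosts):

```shell
OLD=$(hostname)                 # current hostname
NEW=skyserver4                  # hypothetical new name
sudo hostnamectl set-hostname "$NEW"
# Rewrite the old name in /etc/hosts so name resolution stays consistent.
sudo sed -i "s/\b${OLD}\b/${NEW}/g" /etc/hosts
```

Reboot afterwards as above.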

References

2.4.3 - SSH Login

Logging into the server over SSH as a new user

Installing OpenSSH (optional)

Install SSH software so you can SSH in from outside rather than working on the server directly. OpenSSH is the usual choice, and some servers ship with it preinstalled. If yours does not:

sudo apt-get install openssh-server

Logging in with a password

Run:

ssh sky@ubuntu.server.ip

and enter the password.

Logging in with a key

To make things even more convenient and avoid typing the password every time, you can also log in automatically via authorized_keys.

Upload the local .ssh/id_rsa.pub file to the Ubuntu Server machine:

scp ~/.ssh/id_rsa.pub sky@192.168.0.10:/home/sky 

On the Ubuntu Server machine, run:

mkdir -p .ssh
touch ~/.ssh/authorized_keys
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys

From then on, simply typing "ssh ubuntu.server.ip" logs in automatically.

Login shortcuts

If the server IP addresses are fixed, you can simplify further by adding the following to the local /etc/hosts file:

# local machine list
192.168.100.10            skywork
192.168.100.20            skywork2
192.168.100.30            skydownload
192.168.100.40            skyserver
192.168.100.50            skyserver2
192.168.100.60            skyserver3

After that, simply typing "ssh skyserver" is enough.

Logging in on a specific port

SSH uses port 22 by default. When port mapping leaves port 22 unreachable, pass the actual port to ssh with the -p option.

For example, for a server whose port 22 is mapped through the router's port 2122, the remote SSH command is:

ssh -p 2122 sky@dev.sky.io

Edit the local ~/.bash_profile or ~/.zshrc file and add the following, so a single short command SSHes into each remote server:

# ssh to home
alias sshwork="ssh sky@skywork"
alias sshwork2="ssh sky@skywork2"
alias sshserver="ssh sky@skyserver"
alias sshserver2="ssh sky@skyserver2"
alias sshserver3="ssh sky@skyserver3"
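An alternative to the /etc/hosts entries plus shell aliases is the OpenSSH client configuration file ~/.ssh/config, which also handles the non-standard port case. A sketch reusing the example hosts above (the alias names are arbitrary):

```
# ~/.ssh/config
Host skyserver
    HostName 192.168.100.40
    User sky

Host skydev
    HostName dev.sky.io
    User sky
    Port 2122
```

After this, a plain "ssh skyserver" or "ssh skydev" works, and scp and rsync pick up the same settings.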

Additional notes

Tencent Cloud

For Linux machines purchased on Tencent Cloud, log in via SSH:

https://cloud.tencent.com/document/product/1207/44643

Special note: before the first login from a local SSH client to a Linux instance, you must reset the default user's (root's) password or bind an SSH key; otherwise the connection simply fails with an error.

2.4.4 - Installing and Configuring Git

Installing and configuring git

See:

https://skyao.io/learning-git/docs/installation/

2.4.5 - Installing zsh as the Default Shell

Installing and configuring zsh and Oh My Zsh to replace the default bash

Background

zsh is extremely powerful, but its configuration is so complex that at first only power users ran it. Later, some of them built the Oh My Zsh framework to simplify zsh configuration and make it far easier to pick up, so that more people could benefit.

Website:

https://github.com/ohmyzsh/ohmyzsh

Installation

First install zsh:

sudo apt install zsh zsh-doc

Then install Oh My Zsh:

sh -c "$(wget -O- https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"

DNS pollution:

If DNS pollution resolves raw.githubusercontent.com to 127.0.0.1 or 0.0.0.1 and makes it unreachable, edit the hosts file:

sudo vi /etc/hosts

and add one line:

199.232.68.133 raw.githubusercontent.com

When asked whether to make zsh your default shell, answer Y:

Do you want to change your default shell to zsh? [Y/n] Y
Changing the shell...

Configuration

Disabling paste escaping

Oh My Zsh enables automatic escaping of pasted text by default, which tends to cause problems, so disabling it is recommended.

vi ~/.zshrc

Edit the .zshrc file:

# Uncomment the following line if pasting URLs and other text is messed up.
# just uncomment this line
DISABLE_MAGIC_FUNCTIONS="true"

Configuring plugins

https://github.com/ohmyzsh/ohmyzsh/wiki/Plugins

Oh My Zsh keeps its plugins under ~/.oh-my-zsh/plugins by default, and there are a lot of them:

➜  ~ cd .oh-my-zsh/plugins 
➜  plugins git:(master) ls
adb                composer        frontend-search           ipfs              n98-magerun            redis-cli      terraform
ag                 copybuffer      fzf                       isodate           nanoc                  repo           textastic
aliases            copydir         gas                       iterm2            ng                     ripgrep        textmate
alias-finder       copyfile        gatsby                    jake-node         nmap                   ros            thefuck
ansible            cp              gb                        jenv              node                   rsync          themes
ant                cpanm           gcloud                    jfrog             nomad                  ruby           thor
apache2-macports   dash            geeknote                  jhbuild           npm                    rust           tig
arcanist           debian          gem                       jira              npx                    rustup         timer
archlinux          deno            genpass                   jruby             nvm                    rvm            tmux
asdf               dircycle        gh                        jsontools         oc                     safe-paste     tmux-cssh
autoenv            direnv          git                       jump              octozen                salt           tmuxinator
autojump           dirhistory      git-auto-fetch            kate              osx                    samtools       torrent
autopep8           dirpersist      git-escape-magic          keychain          otp                    sbt            transfer
aws                django          git-extras                kitchen           pass                   scala          tugboat
battery            dnf             gitfast                   knife             paver                  scd            ubuntu
bazel              dnote           git-flow                  knife_ssh         pep8                   screen         ufw
bbedit             docker          git-flow-avh              kops              percol                 scw            universalarchive
bedtools           docker-compose  github                    kubectl           per-directory-history  sdk            urltools
bgnotify           docker-machine  git-hubflow               kubectx           perl                   sfdx           vagrant
boot2docker        doctl           gitignore                 kube-ps1          perms                  sfffe          vagrant-prompt
bower              dotenv          git-lfs                   lando             phing                  shell-proxy    vault
branch             dotnet          git-prompt                laravel           pip                    shrink-path    vim-interaction
brew               droplr          glassfish                 laravel4          pipenv                 singlechar     vi-mode
bundler            drush           globalias                 laravel5          pj                     spring         virtualenv
cabal              eecms           gnu-utils                 last-working-dir  please                 sprunge        virtualenvwrapper
cake               emacs           golang                    lein              pm2                    ssh-agent      vscode
cakephp3           ember-cli       gpg-agent                 lighthouse        pod                    stack          vundle
capistrano         emoji           gradle                    lol               postgres               sublime        wakeonlan
cargo              emoji-clock     grails                    lxd               pow                    sublime-merge  wd
cask               emotty          grc                       macports          powder                 sudo           web-search
catimg             encode64        grunt                     magic-enter       powify                 supervisor     wp-cli
celery             extract         gulp                      man               profiles               suse           xcode
chruby             fabric          hanami                    marked2           pyenv                  svcat          yarn
chucknorris        fancy-ctrl-z    helm                      mercurial         pylint                 svn            yii
cloudfoundry       fasd            heroku                    meteor            python                 svn-fast-info  yii2
codeclimate        fastfile        history                   microk8s          rails                  swiftpm        yum
coffee             fbterm          history-substring-search  minikube          rake                   symfony        z
colemak            fd              hitchhiker                mix               rake-fast              symfony2       zbell
colored-man-pages  firewalld       hitokoto                  mix-fast          rand-quote             systemadmin    zeus
colorize           flutter         homestead                 mongocli          rbenv                  systemd        zoxide
command-not-found  fnm             httpie                    mosh              rbfu                   taskwarrior    zsh-interactive-cd
common-aliases     forklift        invoke                    mvn               react-native           terminitor     zsh-navigation-tools
compleat           fossil          ionic                     mysql-macports    rebar                  term_tab       zsh_reload

The ones I use most:

  • git
  • golang
  • rust / rustup
  • docker / docker-compose / docker-machine
  • kubectl
  • npm / node
  • mvn
  • sudo
  • helm
  • redis-cli
  • ubuntu / ufw
  • wd
  • zsh-autosuggestions
  • zsh-syntax-highlighting
  • history-substring-search

Summary

The plugins finally enabled are as follows:

plugins=(git golang rust docker docker-compose docker-machine kubectl npm node mvn sudo helm redis-cli ubuntu ufw wd zsh-autosuggestions zsh-syntax-highlighting history-substring-search)

Configuring a theme

https://github.com/ohmyzsh/ohmyzsh/wiki/Themes

Sticking with the default for now.

Appendix: usage of common plugins

The git plugin

Enabling the git plugin gives you a set of abbreviated git commands.

For more details, see:

https://github.com/ohmyzsh/ohmyzsh/tree/master/plugins/git

The wd plugin

Jump quickly to frequently used directories.

First install wd:

wget --no-check-certificate https://github.com/mfaerevaag/wd/raw/master/install.sh -O - | sh

Usage:

# enter a directory
cd work/code/learning
pwd
/home/sky/work/code/learning
# register it with wd
wd add learning
# afterwards, jump straight to it with wd
wd learning

# for convenience, register the usual learning-notes directories with wd, prefixed with l
/home/sky/work/code/learning/learning-rust
wd add lrust

Detailed usage: https://github.com/mfaerevaag/wd

The sudo plugin

Press Esc twice to prepend sudo to the current command automatically.

The zsh-autosuggestions plugin

Smart command completion based on your history: as you type, the suggested completion appears dimmed; press the right arrow key to accept it.

Install with:

git clone https://github.com/zsh-users/zsh-autosuggestions $ZSH_CUSTOM/plugins/zsh-autosuggestions

The zsh-syntax-highlighting plugin

A fish-shell-style syntax highlighting plugin: commands are highlighted as you type, according to the theme.

Install with:

git clone https://github.com/zsh-users/zsh-syntax-highlighting.git $ZSH_CUSTOM/plugins/zsh-syntax-highlighting

The history-substring-search plugin

A history search plugin. When used together with zsh-syntax-highlighting, it must be configured after the syntax highlighting plugin.

Install with:

git clone https://github.com/zsh-users/zsh-history-substring-search.git $ZSH_CUSTOM/plugins/history-substring-search

References

2.4.6 - Configuring the Network

Configuring the network

Problem: a two-minute hang at boot

Booting pauses for two minutes, showing:

A start job is running for wait for Network to be configured

Only after the 2-minute timeout does the boot continue.

After logging in, check:

$ sudo systemctl status systemd-networkd-wait-online.service

The output is:

$ sudo systemctl status systemd-networkd-wait-online.service
systemd-networkd-wait-online.service - Wait for Network to be Configured
     Loaded: loaded (/lib/systemd/system/systemd-networkd-wait-online.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Sat 2022-01-15 08:12:57 UTC; 7min ago
       Docs: man:systemd-networkd-wait-online.service(8)
    Process: 1272 ExecStart=/lib/systemd/systemd-networkd-wait-online (code=exited, status=1/FAILURE)
   Main PID: 1272 (code=exited, status=1/FAILURE)

Jan 15 08:10:57 skywork2 systemd[1]: Starting Wait for Network to be Configured...
Jan 15 08:11:28 skywork2 systemd-networkd-wait-online[1272]: managing: ens1
Jan 15 08:11:28 skywork2 systemd-networkd-wait-online[1272]: managing: enp6s0
Jan 15 08:11:28 skywork2 systemd-networkd-wait-online[1272]: managing: ens1
Jan 15 08:11:28 skywork2 systemd-networkd-wait-online[1272]: managing: enp6s0
Jan 15 08:12:57 skywork2 systemd-networkd-wait-online[1272]: Event loop failed: Connection timed out
Jan 15 08:12:57 skywork2 systemd[1]: systemd-networkd-wait-online.service: Main process exited, code=exited, status=1/FAILURE
Jan 15 08:12:57 skywork2 systemd[1]: systemd-networkd-wait-online.service: Failed with result 'exit-code'.
Jan 15 08:12:57 skywork2 systemd[1]: Failed to start Wait for Network to be Configured.

The problem lies in the configuration of the NICs:

networkctl 
IDX LINK   TYPE       OPERATIONAL SETUP      
  1 lo     loopback   carrier     unmanaged  
  2 enp5s0 ether      no-carrier  configuring
  3 enp6s0 ether      routable    configured 
  4 ens1   ether      routable    configured 
  5 ibs1d1 infiniband off         unmanaged  

5 links listed.

Fix 1: shorten the timeout

cd /etc/systemd/system/network-online.target.wants/
sudo vi systemd-networkd-wait-online.service

Add a TimeoutStartSec line under [Service]:

[Service]
Type=oneshot
ExecStart=/lib/systemd/systemd-networkd-wait-online
RemainAfterExit=yes
TimeoutStartSec=15sec			# add this line

Boot then continues after 15 seconds instead of hanging for two minutes, though this treats the symptom rather than the cause.

TBD: I found that DHCP on my 40G network is very slow to hand out an IP address, taking about 30 seconds, which slows down boot. Even though the timeout above lets the OS continue booting, by the time you reach the desktop the 40G interface still has no IP, so its address is unreachable. DHCP on the gigabit network is very fast.

Fix 2: configure the network

The real fix is still to configure the network properly. Inspect the current NICs with the ip command:

ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:e0:4c:68:f7:da brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.50/24 brd 192.168.0.255 scope global dynamic enp4s0
       valid_lft 81706sec preferred_lft 81706sec
    inet6 fe80::2e0:4cff:fe68:f7da/64 scope link 
       valid_lft forever preferred_lft forever
3: enp5s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether 00:e0:4c:54:17:3a brd ff:ff:ff:ff:ff:ff
4: enp6s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether 00:e0:4c:54:17:3b brd ff:ff:ff:ff:ff:ff
5: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 48:0f:cf:ef:08:11 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.50/24 brd 10.0.0.255 scope global dynamic ens4
       valid_lft 38533sec preferred_lft 38533sec
    inet6 fe80::4a0f:cfff:feef:811/64 scope link 
       valid_lft forever preferred_lft forever
6: ibs4d1: <BROADCAST,MULTICAST> mtu 4092 qdisc noop state DOWN group default qlen 256
    link/infiniband a0:00:03:00:fe:80:00:00:00:00:00:00:48:0f:cf:ff:ff:ef:08:12 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff

References:

But the approach above does nothing for ports you do not intend to use (for example, ones with no cable plugged in). Such ports need to be disabled outright.

systemctl | grep net-devices        
  sys-subsystem-net-devices-enp4s0.device                                                   loaded active plugged   RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller                                      
  sys-subsystem-net-devices-enp5s0.device                                                   loaded active plugged   RTL810xE PCI Express Fast Ethernet controller                                                  
  sys-subsystem-net-devices-enp6s0.device                                                   loaded active plugged   RTL810xE PCI Express Fast Ethernet controller                                                  
  sys-subsystem-net-devices-ens4.device                                                     loaded active plugged   MT27520 Family [ConnectX-3 Pro] (InfiniBand FDR/Ethernet 10Gb/40Gb 2-port 544+FLR-QSFP Adapter)
  sys-subsystem-net-devices-ibs4d1.device                                                   loaded active plugged   MT27520 Family [ConnectX-3 Pro] (InfiniBand FDR/Ethernet 10Gb/40Gb 2-port 544+FLR-QSFP Adapter)

In practice many of the methods found online do not work; for example, the down/up commands of ifconfig or ip do not survive a reboot.

The NICs currently managed by systemd:

$ networkctl 
IDX LINK   TYPE       OPERATIONAL SETUP      
  1 lo     loopback   carrier     unmanaged  
  2 enp4s0 ether      routable    configured 
  3 enp5s0 ether      no-carrier  configuring
  4 enp6s0 ether      no-carrier  configuring
  5 ens4   ether      routable    configured 
  6 ibs4d1 infiniband off         unmanaged  

Here, enp5s0 and enp6s0 are the two NICs we want to disable.

Following the systemd.network documentation, we place two files under /usr/lib/systemd/network declaring these NICs unmanaged:

cd /usr/lib/systemd/network
sudo vi 01-disable-enp5s0.network

Create the file with the following content:

[Match]
MACAddress=00:e0:4c:54:17:3a

[Link]
Unmanaged=yes
Then the second file:

cd /usr/lib/systemd/network
sudo vi 02-disable-enp6s0.network

Create the file with the following content:

[Match]
MACAddress=00:e0:4c:54:17:3b

[Link]
Unmanaged=yes

Note: match by MAC address in [Match], not by name. In testing, after matching by name and setting Unmanaged, interface names could change on reboot and break the match. Always match on the MAC address.
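The MAC address needed for a [Match] section can be read straight from sysfs. A minimal sketch (using lo as the example interface, since it exists on every machine; substitute enp5s0 etc.):

```shell
# print the MAC address of a given interface from sysfs
iface=lo
cat /sys/class/net/"$iface"/address
```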

After rebooting, check the result:

$ networkctl
IDX LINK   TYPE       OPERATIONAL SETUP     
  1 lo     loopback   carrier     unmanaged 
  2 enp4s0 ether      routable    configured
  3 enp5s0 ether      off         unmanaged 			# now disabled
  4 enp6s0 ether      off         unmanaged  			# now disabled
  5 ens4   ether      routable    configured
  6 ibs4d1 infiniband off         unmanaged 			# this port was already disabled

Adding a NIC back under management

One machine ended up with a NIC in state DOWN and unusable, cause unknown:

$ ip addr                                     
......
5: ens4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 70:10:6f:aa:2a:81 brd ff:ff:ff:ff:ff:ff
    
$ networkctl
IDX LINK   TYPE       OPERATIONAL SETUP     
  1 lo     loopback   carrier     unmanaged 
  2 enp4s0 ether      routable    configured
  3 enp5s0 ether      off         unmanaged 
  4 enp6s0 ether      off         unmanaged 
  5 ens4   ether      off         unmanaged 
  6 ibs4d1 infiniband off         unmanaged 

The fix:

cd /usr/lib/systemd/network
vi 03-ens4-dhcp.network

With the following content:

[Match]
MACAddress=70:10:6f:aa:2a:81

[Link]
Unmanaged=no

[Network]
DHCP=yes

Reboot, and the interface comes back up via DHCP.

2.4.7 - Installing Homebrew

Homebrew is a package manager for linux and mac that makes it easy to install other software

Installation

Prerequisites:

sudo apt-get install build-essential

Run the installer:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
==> This script will install:
/home/linuxbrew/.linuxbrew/bin/brew
/home/linuxbrew/.linuxbrew/share/doc/homebrew
/home/linuxbrew/.linuxbrew/share/man/man1/brew.1
/home/linuxbrew/.linuxbrew/share/zsh/site-functions/_brew
/home/linuxbrew/.linuxbrew/etc/bash_completion.d/brew
/home/linuxbrew/.linuxbrew/Homebrew
==> The following new directories will be created:
/home/linuxbrew/.linuxbrew/bin
/home/linuxbrew/.linuxbrew/etc
/home/linuxbrew/.linuxbrew/include
/home/linuxbrew/.linuxbrew/lib
/home/linuxbrew/.linuxbrew/sbin
/home/linuxbrew/.linuxbrew/share
/home/linuxbrew/.linuxbrew/var
/home/linuxbrew/.linuxbrew/opt
/home/linuxbrew/.linuxbrew/share/zsh
/home/linuxbrew/.linuxbrew/share/zsh/site-functions
/home/linuxbrew/.linuxbrew/var/homebrew
/home/linuxbrew/.linuxbrew/var/homebrew/linked
/home/linuxbrew/.linuxbrew/Cellar
/home/linuxbrew/.linuxbrew/Caskroom
/home/linuxbrew/.linuxbrew/Frameworks

==> Downloading and installing Homebrew...

==> Installation successful!

==> Homebrew has enabled anonymous aggregate formulae and cask analytics.
Read the analytics documentation (and how to opt-out) here:
  https://docs.brew.sh/Analytics
No analytics data has been sent yet (nor will any be during this install run).

==> Homebrew is run entirely by unpaid volunteers. Please consider donating:
  https://github.com/Homebrew/brew#donations

==> Next steps:
- Run these two commands in your terminal to add Homebrew to your PATH:
    echo 'eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"' >> /home/sky/.zprofile
    eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"
- Install Homebrew's dependencies if you have sudo access:
    sudo apt-get install build-essential
  For more information, see:
    https://docs.brew.sh/Homebrew-on-Linux
- We recommend that you install GCC:
    brew install gcc
- Run brew help to get started
- Further documentation:
    https://docs.brew.sh

Follow the prompts and run:

echo 'eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"' >> /home/sky/.zprofile
eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"

The installer recommends GCC, so install it:

brew install gcc


2.4.8 - [Archived] Adding a new user

Add a new user for day-to-day use

The ubuntu server 20.04 installer already requires creating a user name during installation, rather than leaving only a root account.

Adding the user

A default installation has only the root account, which obviously should not be used directly.

Note: with a VMware automated install, a user name is entered up front and is available right after installation, in which case this step can be skipped.

So add a day-to-day user with sudo privileges, so that root can be obtained when needed:

sudo adduser sky
sudo adduser sky sudo

The password can later be changed with passwd:

sudo passwd sky

2.5 - Network configuration

Network configuration after installing Ubuntu Server

2.5.1 - Configuring a static IP

Configure a static IP address

Normally DHCP is used and the IP address is pinned to the MAC address on the router. But with a large number of VMs, editing the router's static bindings every time is tedious, so it is still worth knowing how to set a static IP when needed.

Using NetworkManager

Applies to ubuntu 20.04.

First install network-manager:

sudo apt install network-manager

To be safe, back up the original config file:

sudo cp /etc/netplan/00-installer-config.yaml /etc/netplan/00-installer-config.yaml.original

Edit the config file under /etc/netplan, e.g. 00-installer-config.yaml:

sudo vi /etc/netplan/00-installer-config.yaml

On the NIC that needs a static address, set dhcp4: false and configure it by hand (apply afterwards with sudo netplan apply):

network:
  version: 2
  renderer: NetworkManager
  ethernets:
    wan1:
      match:
        macaddress: 00:0c:29:23:d3:de
      set-name: wan1
      dhcp4: false
      addresses: [192.168.0.21/24]
      gateway4: 192.168.0.1
      nameservers:
        addresses: [192.168.0.1]
    wan2:
      match:
        macaddress: 48:0f:cf:ef:08:11
      set-name: wan2
      dhcp4: true

Using networkd

Applies to ubuntu 22.04 / 22.10 / 23.04.

To be safe, back up the original config file:

sudo cp /etc/netplan/00-installer-config.yaml /etc/netplan/00-installer-config.yaml.original

Edit the config file under /etc/netplan, e.g. 00-installer-config.yaml:

sudo vi /etc/netplan/00-installer-config.yaml

Configure by hand (apply afterwards with sudo netplan apply):

network:
  renderer: networkd
  ethernets:
    ens160:
      addresses:
        - 192.168.0.56/24
      nameservers:
        addresses: [192.168.0.1]
      routes:
        - to: default
          via: 192.168.0.1
  version: 2

2.5.2 - Proxy shortcut commands

Shell shortcuts to enable and disable the network proxy at any moment

Enabling the proxy manually

Add the following to .zshrc:

# proxy
alias proxyon='export all_proxy=socks5://192.168.0.1:7891;export http_proxy=http://192.168.0.1:7890;export https_proxy=http://192.168.0.1:7890;export no_proxy=127.0.0.1,localhost,local,.local,.lan,192.168.0.0/16,10.0.0.0/16'
alias proxyoff='unset all_proxy http_proxy https_proxy no_proxy'
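The aliases above only manipulate environment variables, so their effect is easy to check. A minimal sketch of the on/off round trip (using the same proxy address as above):

```shell
# proxyon boils down to export, proxyoff to unset
export http_proxy=http://192.168.0.1:7890
echo "${http_proxy:-unset}"
unset http_proxy
echo "${http_proxy:-unset}"
```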

Background: my proxy runs on the router, with HTTP on port 7890 and SOCKS5 on port 7891.

Configuring a proxy for git

vi ~/.ssh/config

With the following content:

Host github.com
HostName github.com
User git
# http proxy
#ProxyCommand socat - PROXY:192.168.0.1:%h:%p,proxyport=7890
# socks5 proxy
ProxyCommand nc -v -x 192.168.0.1:7891 %h %p

2.5.3 - Installing and configuring Samba file sharing

Install Samba on linux mint for file sharing

Installing samba

Install directly via apt, then set up the data directory:

sudo apt-get install samba

cd
mkdir -p data/samba
chmod 777 data/samba

Configuring samba

sudo vi /etc/samba/smb.conf

Open the config file and append the following at the end:

[share]
path = /home/sky/data/samba
valid users = sky
writable = yes

Create a samba user:

sudo smbpasswd -a sky

Restart the samba service:

sudo service smbd restart

Accessing the share

From another Linux machine, use the address smb://172.168.0.10; from Windows use \\172.168.0.10


2.5.4 - Installing and configuring NFS file sharing

Install NFS on ubuntu server for file sharing

Configuring the NFS server

Installing nfs-kernel-server

sudo apt update
sudo apt install nfs-kernel-server -y

Check the nfs-server status:

$ sudo systemctl status nfs-server

● nfs-server.service - NFS server and services
     Loaded: loaded (/lib/systemd/system/nfs-server.service; enabled; vendor pr>
     Active: active (exited) since Wed 2021-12-29 00:45:44 CST; 5min ago
   Main PID: 758742 (code=exited, status=0/SUCCESS)
      Tasks: 0 (limit: 154080)
     Memory: 0B
     CGroup: /system.slice/nfs-server.service

Dec 29 00:45:43 skyserver systemd[1]: Starting NFS server and services...
Dec 29 00:45:44 skyserver systemd[1]: Finished NFS server and services.

Creating the NFS share directory

sudo mkdir /mnt/nfs-share

To let every client access all files, change the owner and permissions:

sudo chown nobody:nogroup /mnt/nfs-share
sudo chmod -R 777 /mnt/nfs-share

Granting clients access to the NFS server

Open the file with sudo vi /etc/exports and grant each client access:

/mnt/nfs-share client-IP(rw,sync,no_subtree_check)

With multiple clients, either repeat the line per client or grant a whole subnet at once:

/mnt/nfs-share 192.168.0.0/24(rw,sync,no_subtree_check)
/mnt/nfs-share 10.0.0.0/24(rw,sync,no_subtree_check)

Parameter meanings:

  • rw (read and write)
  • sync (write changes to disk before applying them)
  • no_subtree_check (avoid subtree checking)
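A typo in /etc/exports only surfaces when exportfs runs, so a quick regex check of a line beforehand can help; a sketch, assuming the IPv4/CIDR form used in this section:

```shell
# sanity-check an exports line: path, space, IPv4/CIDR, options in parentheses
line='/mnt/nfs-share 192.168.0.0/24(rw,sync,no_subtree_check)'
if echo "$line" | grep -Eq '^/[^ ]+ [0-9]+(\.[0-9]+){3}/[0-9]+\([a-z_,]+\)$'; then
  echo "looks valid"
else
  echo "malformed"
fi
```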

Run the following to export:

sudo exportfs -a

Configuring the firewall

Either disable the firewall or add firewall rules:

sudo ufw allow from 192.168.0.0/24 to any port nfs
sudo ufw allow from 10.0.0.0/24 to any port nfs

Adding more disks to the NFS share

The server has one 4T SSD and two old 3T disks, to be shared over NFS for access from other machines.

Disk and partition information can be obtained with fdisk:

$ fdisk -l
......
Disk /dev/sda: 2.75 TiB, 3000878383104 bytes, 5861090592 sectors
Device     Start        End    Sectors  Size Type
/dev/sda1   2048 5861089279 5861087232  2.7T Linux filesystem

Disk /dev/sdb: 2.75 TiB, 3000592982016 bytes, 5860533168 sectors
Device     Start        End    Sectors  Size Type
/dev/sdb1   2048 5860532223 5860530176  2.7T Linux filesystem

Disk /dev/nvme1n1: 3.5 TiB, 3840755982336 bytes, 7501476528 sectors
Device     Start        End    Sectors  Size Type
/dev/nvme1n1p1  2048 7501475839 7501473792  3.5T Linux filesystem

Then look up each partition's UUID for later use:

$ ls -l /dev/disk/by-uuid/
......
lrwxrwxrwx 1 root root 10 Jan 16 12:34 7c3a3aca-9cde-48a0-957b-eead5b2ab7dc -> ../../sda1
lrwxrwxrwx 1 root root 10 Jan 16 12:34 fcae6bde-4789-4afe-b164-c7189a0bdf5f -> ../../sdb1
lrwxrwxrwx 1 root root 15 Jan 17 01:35 561fe530-4888-4759-97db-f36f607ca18e -> ../../nvme1n1p1

$ sudo mkdir /mnt/d
$ sudo mkdir /mnt/e
$ sudo mkdir /mnt/f

Add the mount entries with sudo vi /etc/fstab:

# two old disks
/dev/disk/by-uuid/7c3a3aca-9cde-48a0-957b-eead5b2ab7dc /mnt/e ext4 defaults 0 1
/dev/disk/by-uuid/fcae6bde-4789-4afe-b164-c7189a0bdf5f /mnt/f ext4 defaults 0 1
# one ssd disk
/dev/disk/by-uuid/561fe530-4888-4759-97db-f36f607ca18e /mnt/d ext4 defaults 0 1

Run sudo mount -av to mount them immediately.

Add them to the NFS share:

sudo chown nobody:nogroup /mnt/d
sudo chmod -R 777 /mnt/d
sudo chown nobody:nogroup /mnt/e
sudo chmod -R 777 /mnt/e
sudo chown nobody:nogroup /mnt/f
sudo chmod -R 777 /mnt/f

Add the access grants with sudo vi /etc/exports:

/mnt/d 192.168.0.0/24(rw,sync,no_subtree_check)
/mnt/d 10.0.0.0/24(rw,sync,no_subtree_check)
/mnt/e 192.168.0.0/24(rw,sync,no_subtree_check)
/mnt/e 10.0.0.0/24(rw,sync,no_subtree_check)
/mnt/f 192.168.0.0/24(rw,sync,no_subtree_check)
/mnt/f 10.0.0.0/24(rw,sync,no_subtree_check)

Run sudo exportfs -a to take effect immediately.

Configuring the NFS client

Install the client package

sudo apt update
sudo apt install nfs-common

Mounting the NFS server locally

Create local directories to mount the NFS server onto:

sudo mkdir -p /mnt/nfs-skyserver
sudo mkdir -p /mnt/d
sudo mkdir -p /mnt/e
sudo mkdir -p /mnt/f

Mount the server's shared directories onto these local directories:

sudo mount 10.0.0.40:/mnt/nfs-share /mnt/nfs-skyserver
sudo mount 10.0.0.40:/mnt/d /mnt/d
sudo mount 10.0.0.40:/mnt/e /mnt/e
sudo mount 10.0.0.40:/mnt/f /mnt/f

Verify:

cd /mnt/nfs-skyserver 
touch a.txt

Go back to the server side and check that the file appeared.

For convenience, create some symlinks:

mkdir -p ~/data
cd ~/data
ln -s /mnt/nfs-skyserver skyserver
ln -s /mnt/d d
ln -s /mnt/e e
ln -s /mnt/f f

Making the mounts permanent

The mounts above disappear after a reboot, and /mnt/nfs-skyserver reverts to an ordinary directory.

To remount automatically at boot, open the file with sudo vi /etc/fstab and add:

# nfs from skyserver
10.0.0.40:/mnt/nfs-share /mnt/nfs-skyserver   nfs   defaults,timeo=15,retrans=5,_netdev	0 0
10.0.0.40:/mnt/d /mnt/d   nfs   defaults,timeo=15,retrans=5,_netdev	0 0
10.0.0.40:/mnt/e /mnt/e   nfs   defaults,timeo=15,retrans=5,_netdev	0 0
10.0.0.40:/mnt/f /mnt/f   nfs   defaults,timeo=15,retrans=5,_netdev	0 0

Don't set the timeout too long, so that other machines don't block for long at boot when the skyserver machine happens to be powered off.
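Per nfs(5), timeo is given in tenths of a second, so the timeo=15 above means a 1.5-second wait before the first retransmission:

```shell
# timeo is in deciseconds (see nfs(5)); convert timeo=15 to seconds
awk 'BEGIN { printf "%.1f s\n", 15 / 10 }'
```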


2.5.5 - NFS file sharing on RAID

Set up RAID-backed NFS file sharing on ubuntu server

Background

The installed ubuntu 20.04 runs kernel 5.4:

$ uname -a
Linux switch99 5.4.0-192-generic #212-Ubuntu SMP Fri Jul 5 09:47:39 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

The disks are two Samsung PM983a 900G 22110 enterprise SSDs:

lspci | grep Non-Volatile
03:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
04:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983

The two disks form a raid0 array carrying two partitions: md0p1 holds the ubuntu system, and md0p2 is the backup partition for timeshift:

Disk /dev/md0: 1.65 TiB, 1799020871680 bytes, 3513712640 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 1048576 bytes
Disklabel type: gpt
Disk identifier: 9B47B927-9C18-4026-AD6D-E68F9F3F8751

Device          Start        End    Sectors  Size Type
/dev/md0p1       2048 3405912063 3405910016  1.6T Linux filesystem
/dev/md0p2 3405912064 3513706495  107794432 51.4G Linux filesystem

Disk speed test

Writing 1G / 10G / 100G files, the speeds are:

$ dd if=/dev/zero of=/home/sky/temp/1g.img bs=1G count=1 oflag=dsync 
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.39956 s, 767 MB/s

$ dd if=/dev/zero of=/home/sky/temp/10g.img bs=1G count=10 oflag=dsync
10+0 records in
10+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 12.4784 s, 860 MB/s

$ dd if=/dev/zero of=/home/sky/temp/100g.img bs=1G count=100 oflag=dsync
100+0 records in
100+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 127.14 s, 845 MB/s
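The rate dd reports can be cross-checked from the byte count and elapsed time (dd uses decimal megabytes, i.e. 10^6 bytes):

```shell
# recompute the 100 GiB run: 107374182400 bytes in 127.14 s
awk 'BEGIN { printf "%.0f MB/s\n", 107374182400 / 127.14 / 1000000 }'
```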

$ ls -lh
total 112G
-rw-rw-r-- 1 sky sky 100G Aug  2 09:43 100g.img
-rw-rw-r-- 1 sky sky  10G Aug  2 09:41 10g.img
-rw-rw-r-- 1 sky sky 1.0G Aug  2 09:40 1g.img

Reading the three files back, speeds are all around 3 GB/s:

$ dd if=/home/sky/temp/1g.img of=/dev/null bs=8M
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.366136 s, 2.9 GB/s

$ dd if=/home/sky/temp/10g.img of=/dev/null bs=8M
1280+0 records in
1280+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 3.29498 s, 3.3 GB/s

$ dd if=/home/sky/temp/100g.img of=/dev/null bs=8M
12800+0 records in
12800+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 37.4048 s, 2.9 GB/s

Disappointingly, raid0 brings no read/write speedup at all here; performance stays at single-disk level. The cause is unclear for now.

A later plan is to replace these two small (1T) disks with one large-capacity disk (2T or 4T).

Setting up the NAS server side

Installing the NFS server

# install
sudo apt install nfs-kernel-server -y

# start now and enable at boot
sudo systemctl start nfs-kernel-server
sudo systemctl enable nfs-kernel-server

# verify
sudo systemctl status nfs-kernel-server
Jan 29 20:40:15 skynas3 systemd[1]: Starting nfs-server.service - NFS server and services...
Jan 29 20:40:15 skynas3 exportfs[1422]: exportfs: can't open /etc/exports for reading
Jan 29 20:40:16 skynas3 systemd[1]: Finished nfs-server.service - NFS server and services.

Configuring the UFW firewall

The firewall must be configured after installing nfs. First install ufw:

sudo apt install ufw -y

The first mandatory step after installation is to allow ssh logins:

sudo ufw allow ssh
sudo ufw enable

Then allow access to nfs:

sudo ufw allow from 192.168.0.0/16 to any port nfs

Reload ufw and check its status:

sudo ufw reload
sudo ufw status

You can see that port 2049 is now open for nfs.

$ sudo ufw status
[sudo] password for sky: 
Firewall reloaded
Status: active

To                         Action      From
--                         ------      ----
22/tcp                     ALLOW       Anywhere                  
2049                       ALLOW       192.168.0.0/16            
22/tcp (v6)                ALLOW       Anywhere (v6) 

Note: for reasons unknown, nfs became unreachable once ufw was enabled; the allow rule above did not take effect. For now the workaround is to disable the firewall with sudo ufw disable.

Preparing the shared directories

For easier management later, a pseudo filesystem is used:

sudo mkdir -p /data/{share,pve-share}

sudo chown -R nobody:nogroup /data/share
sudo chown -R nobody:nogroup /data/pve-share

Create the export directories:

sudo mkdir -p /exports/{share,pve-share}

sudo chown -R nobody:nogroup /exports

Edit /etc/fstab to bind-mount the pseudo filesystem into the exports:

sudo vi /etc/fstab

Add the following:

# nfs exports
/data/share /exports/share     none bind
/data/pve-share /exports/pve-share    none bind

Configuring the nfs exports

sudo vi /etc/exports

Edit the nfs exports content (note: fsid=0 designates the NFSv4 pseudo-root and should normally appear on at most one export line):

/exports/share   192.168.0.0/16(rw,no_root_squash,no_subtree_check,crossmnt,fsid=0)
/exports/pve-share   192.168.0.0/16(rw,no_root_squash,no_subtree_check,crossmnt,fsid=0)

Restart nfs-kernel-server and check its status:

sudo systemctl restart nfs-kernel-server
sudo systemctl status nfs-kernel-server

Output:

● nfs-server.service - NFS server and services
     Loaded: loaded (/lib/systemd/system/nfs-server.service; enabled; vendor preset: enabled)
     Active: active (exited) since Sun 2024-08-04 01:31:09 UTC; 3h 35min ago
    Process: 10626 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS)
    Process: 10627 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=0/SUCCESS)
   Main PID: 10627 (code=exited, status=0/SUCCESS)

Aug 04 01:31:08 switch99 systemd[1]: Starting NFS server and services...
Aug 04 01:31:09 switch99 systemd[1]: Finished NFS server and services.

Verify:

ps -ef | grep nfs

Output:

root       10636       2  0 01:31 ?        00:00:00 [nfsd]
root       10637       2  0 01:31 ?        00:00:00 [nfsd]
root       10638       2  0 01:31 ?        00:00:00 [nfsd]
root       10639       2  0 01:31 ?        00:00:00 [nfsd]
root       10640       2  0 01:31 ?        00:00:00 [nfsd]
root       10641       2  0 01:31 ?        00:00:00 [nfsd]
root       10642       2  0 01:31 ?        00:00:00 [nfsd]
root       10643       2  0 01:31 ?        00:00:01 [nfsd]

Check the current export list:

sudo showmount -e

Output:

Export list for switch99:
/exports/pve-share 192.168.0.0/16
/exports/share     192.168.0.0/16

Summary

  1. None of this actually depends on RAID.
  2. A performance gain of exactly zero from ubuntu's software raid was completely unexpected; I plan to drop it.

2.5.6 - Installing and configuring sftp

Install sftp on ubuntu server

Preparing the group and user

sudo addgroup sftpgroup
sudo useradd -m sftpuser -g sftpgroup
sudo passwd sftpuser

sudo chmod 700 /home/sftpuser/

Configuring the ssh service

Edit /etc/ssh/sshd_config and add the following (note: sshd requires the ChrootDirectory and all its parent directories to be owned by root and not writable by group or others):

Match group sftpgroup
ChrootDirectory /home
X11Forwarding no
AllowTcpForwarding no
ForceCommand internal-sftp

2.6 - Kernel configuration

Kernel update and configuration for Ubuntu Server

2.6.1 - Updating the Linux kernel

Update the Linux kernel

Simple minor-version updates

When logging in to ubuntu server over ssh, you sometimes see a prompt like:

10 updates can be applied immediately.
To see these additional updates run: apt list --upgradable

See what they are:

sudo apt list --upgradable
[sudo] password for sky: 
Listing... Done
linux-generic/focal-proposed 5.4.0.97.101 amd64 [upgradable from: 5.4.0.96.100]
linux-headers-generic/focal-proposed 5.4.0.97.101 amd64 [upgradable from: 5.4.0.96.100]
linux-image-generic/focal-proposed 5.4.0.97.101 amd64 [upgradable from: 5.4.0.96.100]
linux-libc-dev/focal-proposed 5.4.0-97.110 amd64 [upgradable from: 5.4.0-96.109]

These are usually minor-version update prompts; here 5.4.0-96 is installed and 5.4.0-97 is being offered.

Upgrading is simple:

sudo apt upgrade

The slow part of the process is that existing dkms modules have to be recompiled against the new kernel, which usually takes a while.

After the upgrade, reboot and check dkms:

dkms status
iser, 4.9, 5.4.0-94-generic, x86_64: installed
iser, 4.9, 5.4.0-96-generic, x86_64: installed
iser, 4.9, 5.4.0-97-generic, x86_64: installed
isert, 4.9, 5.4.0-94-generic, x86_64: installed
isert, 4.9, 5.4.0-96-generic, x86_64: installed
isert, 4.9, 5.4.0-97-generic, x86_64: installed
kernel-mft-dkms, 4.15.1, 5.4.0-94-generic, x86_64: installed
kernel-mft-dkms, 4.15.1, 5.4.0-96-generic, x86_64: installed
kernel-mft-dkms, 4.15.1, 5.4.0-97-generic, x86_64: installed
knem, 1.1.4.90mlnx1, 5.4.0-94-generic, x86_64: installed
knem, 1.1.4.90mlnx1, 5.4.0-96-generic, x86_64: installed
knem, 1.1.4.90mlnx1, 5.4.0-97-generic, x86_64: installed
mlnx-ofed-kernel, 4.9, 5.4.0-94-generic, x86_64: installed
mlnx-ofed-kernel, 4.9, 5.4.0-96-generic, x86_64: installed
mlnx-ofed-kernel, 4.9, 5.4.0-97-generic, x86_64: installed
rshim, 1.18, 5.4.0-94-generic, x86_64: installed
rshim, 1.18, 5.4.0-96-generic, x86_64: installed
rshim, 1.18, 5.4.0-97-generic, x86_64: installed
srp, 4.9, 5.4.0-94-generic, x86_64: installed
srp, 4.9, 5.4.0-96-generic, x86_64: installed
srp, 4.9, 5.4.0-97-generic, x86_64: installed

There are many modules here because several kernel versions are installed; the unused old kernels get removed later.

Manual major-version updates

Major-version updates have to be done by hand.

The 22.04 release felt unstable in trials, with some odd issues I didn't want to chase, so I stayed on 20.04 but wanted to move the kernel from 5.4 to something newer, such as 5.15.

First see which versions are available:


sudo apt update
# the apt list command below only works under bash; switch temporarily
bash
sudo apt list linux-headers-5.15.*-*-generic linux-image-5.15.*-*-generic
linux-headers-5.15.0-33-generic/focal-updates,focal-security 5.15.0-33.34~20.04.1 amd64
linux-headers-5.15.0-41-generic/focal-updates,focal-security 5.15.0-41.44~20.04.1 amd64
linux-headers-5.15.0-43-generic/focal-updates,focal-security 5.15.0-43.46~20.04.1 amd64
linux-headers-5.15.0-46-generic/focal-updates,focal-security 5.15.0-46.49~20.04.1 amd64
linux-headers-5.15.0-48-generic/focal-updates,focal-security 5.15.0-48.54~20.04.1 amd64
linux-headers-5.15.0-50-generic/focal-updates,focal-security 5.15.0-50.56~20.04.1 amd64
linux-headers-5.15.0-52-generic/focal-updates,focal-security 5.15.0-52.58~20.04.1 amd64
linux-headers-5.15.0-53-generic/focal-updates,focal-security 5.15.0-53.59~20.04.1 amd64
linux-headers-5.15.0-56-generic/focal-updates,focal-security 5.15.0-56.62~20.04.1 amd64
linux-headers-5.15.0-57-generic/focal-updates,focal-security 5.15.0-57.63~20.04.1 amd64
linux-headers-5.15.0-58-generic/focal-updates,focal-security 5.15.0-58.64~20.04.1 amd64
linux-image-5.15.0-33-generic/focal-updates,focal-security 5.15.0-33.34~20.04.1 amd64
linux-image-5.15.0-41-generic/focal-updates,focal-security 5.15.0-41.44~20.04.1 amd64
linux-image-5.15.0-43-generic/focal-updates,focal-security 5.15.0-43.46~20.04.1 amd64
linux-image-5.15.0-46-generic/focal-updates,focal-security 5.15.0-46.49~20.04.1 amd64
linux-image-5.15.0-48-generic/focal-updates,focal-security 5.15.0-48.54~20.04.1 amd64
linux-image-5.15.0-50-generic/focal-updates,focal-security 5.15.0-50.56~20.04.1 amd64
linux-image-5.15.0-52-generic/focal-updates,focal-security 5.15.0-52.58~20.04.1 amd64
linux-image-5.15.0-53-generic/focal-updates,focal-security 5.15.0-53.59~20.04.1 amd64
linux-image-5.15.0-56-generic/focal-updates,focal-security 5.15.0-56.62~20.04.1 amd64
linux-image-5.15.0-57-generic/focal-updates,focal-security 5.15.0-57.63~20.04.1 amd64
linux-image-5.15.0-58-generic/focal-updates,focal-security 5.15.0-58.64~20.04.1 amd64

Try the newest, 5.15.0-58:

sudo apt install linux-headers-5.15.0-58-generic linux-image-5.15.0-58-generic

Reboot after installation and check:

uname -a
Linux skyserver 5.15.0-58-generic #64~20.04.1-Ubuntu SMP Fri Jan 6 16:42:31 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

The kernel is now updated to 5.15.

This only upgrades the kernel; the release itself stays unchanged:

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.5 LTS
Release:	20.04
Codename:	focal


2.6.2 - Removing Linux kernels

Remove unused Linux kernels

After several upgrades the system accumulates multiple kernel versions; old, unused kernels can be removed.

Reference:

https://askubuntu.com/questions/1253347/how-to-easily-remove-old-kernels-in-ubuntu-20-04-lts

vi remove_old_kernels.sh

Create the file with the following content:

#!/bin/bash
# Run this script without any param for a dry run
# Run the script with root and with exec param for removing old kernels after checking
# the list printed in the dry run

uname -a
IN_USE=$(uname -a | awk '{ print $3 }')
if [[ $IN_USE == *-generic ]]
then
  IN_USE=${IN_USE::-8}
fi
echo "Your in use kernel is $IN_USE"

OLD_KERNELS=$(
    dpkg --list |
        grep -v "$IN_USE" |
        grep -v "linux-headers-generic" |
        grep -v "linux-image-generic"  |
        grep -Ei 'linux-image|linux-headers|linux-modules' |
        awk '{ print $2 }'
)
echo "Old Kernels to be removed:"
echo "$OLD_KERNELS"

if [ "$1" == "exec" ]; then
    for PACKAGE in $OLD_KERNELS; do
        yes | apt purge "$PACKAGE"
    done
fi
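The version-trimming logic at the top of the script (stripping the 8-character "-generic" suffix with ${IN_USE::-8}, a bash-only expansion) can be checked in isolation:

```shell
# bash: ${var::-8} drops the trailing 8 characters ("-generic")
IN_USE="5.4.0-97-generic"
if [[ $IN_USE == *-generic ]]; then
  IN_USE=${IN_USE::-8}
fi
echo "$IN_USE"
```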

Run

bash ./remove_old_kernels.sh

to see the kernel versions and related packages that would be removed. Once everything looks right, run

sudo bash ./remove_old_kernels.sh exec

to perform the actual removal.

Afterwards, reboot and run:

dpkg --list | grep -Ei 'linux-image|linux-headers|linux-modules' 

to check the remaining kernels:

ii  linux-headers-5.4.0-97               5.4.0-97.110                            all          Header files related to Linux kernel version 5.4.0
ii  linux-headers-5.4.0-97-generic       5.4.0-97.110                            amd64        Linux kernel headers for version 5.4.0 on 64 bit x86 SMP
ii  linux-headers-generic                5.4.0.97.101                            amd64        Generic Linux kernel headers
ii  linux-image-5.4.0-97-generic         5.4.0-97.110                            amd64        Signed kernel image generic
ii  linux-image-generic                  5.4.0.97.101                            amd64        Generic Linux kernel image
ii  linux-modules-5.4.0-97-generic       5.4.0-97.110                            amd64        Linux kernel extra modules for version 5.4.0 on 64 bit x86 SMP
ii  linux-modules-extra-5.4.0-97-generic 5.4.0-97.110                            amd64        Linux kernel extra modules for version 5.4.0 on 64 bit x86 SMP

2.7 - Hardware configuration

Hardware-related configuration for Ubuntu Server

2.7.1 - Viewing CPU frequencies

View the real-time frequency of each CPU core

cpufreq-info

This requires cpufrequtils:

sudo apt-get install cpufrequtils

Then run:

$ cpufreq-info
                  
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 4294.55 ms.
  hardware limits: 1.20 GHz - 3.50 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 1.20 GHz and 3.50 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 1.20 GHz.
analyzing CPU 1:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 1
  CPUs which need to have their frequency coordinated by software: 1
  maximum transition latency: 4294.55 ms.
  hardware limits: 1.20 GHz - 3.50 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 1.20 GHz and 3.50 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 1.20 GHz.
......

For a quick overview of the real-time frequency of every core:

$ cpufreq-info | grep "current CPU frequency"
  current CPU frequency is 1.38 GHz.
  current CPU frequency is 1.23 GHz.
  current CPU frequency is 1.20 GHz.
  current CPU frequency is 1.20 GHz.
  current CPU frequency is 1.20 GHz.
  current CPU frequency is 1.20 GHz.
  current CPU frequency is 1.20 GHz.
......

Or refresh every second:

watch -n1 "grep 'cpu MHz' /proc/cpuinfo"

auto-cpufreq

https://snapcraft.io/auto-cpufreq


2.7.2 - CPU stress testing

Stress-test the CPU

The main goal is to watch CPU frequencies under load and confirm the CPUs run in performance mode.

Note: on an x99 board this also verifies whether the modded (turbo-unlock) BIOS is in effect.

sysbench

Install sysbench:

sudo apt install sysbench

Run the CPU stress test:

sysbench cpu --threads=40 run

Gratifyingly, with the modded BIOS both e5 2666 v3 CPUs on the dual-socket x99 board run at an all-core 3.5 GHz:

cpufreq-info | grep "current CPU"
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.
  current CPU frequency is 3.49 GHz.

stress


2.7.3 - Power mode

Set the CPU power-governor mode

Viewing the current mode

$ cpufreq-info
                  
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 4294.55 ms.
  hardware limits: 1.20 GHz - 3.50 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 1.20 GHz and 3.50 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 1.20 GHz.

Setting the mode

Set the governor to "performance":

sudo bash -c 'for i in {0..31}; do cpufreq-set -c $i -g performance; done'

Set the governor to "powersave":

sudo bash -c 'for i in {0..31}; do cpufreq-set -c $i -g powersave; done'

Set the governor to "ondemand" (note: with the intel_pstate driver shown above, only performance and powersave are available; ondemand applies to the acpi-cpufreq driver):

sudo bash -c 'for i in {0..31}; do cpufreq-set -c $i -g ondemand; done'
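The loops above hardcode cores {0..31}. A variant that derives the core count from nproc instead; echo only prints the commands here so the sketch is safe to run, drop it to actually apply the governor:

```shell
# derive the core count instead of hardcoding the range
ncores=$(nproc)
for ((i = 0; i < ncores; i++)); do
  echo sudo cpufreq-set -c "$i" -g performance
done
```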


3 - Development environment

Setting up a development and test environment on Ubuntu Server

3.1 - Common tools

Common tools for development work on Ubuntu Server

3.1.1 - sdkman

Manage multiple SDK versions

SDKMAN manages multiple versions of SDKs side by side and switches between them quickly.

sdkman supports macos and linux; see:

https://skyao.io/learning-macos/docs/programing/common/sdkman.html

Installing sdkman

macOS

$ curl -s "https://get.sdkman.io" | bash

$ source "/home/sky/.sdkman/bin/sdkman-init.sh"
$ sdk version
SDKMAN 5.15.0

ubuntu

sudo apt install unzip zip
curl -s "https://get.sdkman.io" | bash

After installation:

source "/home/sky/.sdkman/bin/sdkman-init.sh"
sdk version

3.1.2 - azure-cli

Install and configure the azure cli

See the official documentation:

Install the Azure CLI on Linux

LTS versions

For the LTS releases 18.04 (Bionic Beaver), 20.04 (Focal Fossa), and 22.04 (Jammy Jellyfish), a one-shot install:

curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash

It can also be installed step by step; see the official documentation for details.

Check the result:

$ az --version
azure-cli                         2.49.0

core                              2.49.0
telemetry                          1.0.8

Dependencies:
msal                              1.20.0
azure-mgmt-resource               22.0.0

Python location '/opt/az/bin/python3'
Extensions directory '/home/sky/.azure/cliextensions'

Python (Linux) 3.10.10 (main, May 19 2023, 08:20:32) [GCC 9.4.0]

Legal docs and information: aka.ms/AzureCliLegal


Your CLI is up-to-date.

Non-LTS versions

In principle, non-LTS releases are not supported.

22.10

3.2 - Overview

Overview of development and test environments on Ubuntu Server

Java-related

Git-related

Content on git, gitlab and the like lives in my Git study notes.

Jenkins-related

Jenkins-related content lives in my Jenkins2 study notes.

3.3 - Java

Java-related content

3.3.1 - Installing the JDK

Install the JDK

Using sdkman

Manage the JDK with sdkman:

$ sdk list java
$ sdk install java 11.0.15-zulu

Downloading: java 11.0.15-zulu

In progress...

###################################################################################### 100.0%

Repackaging Java 11.0.15-zulu...

Done repackaging...

Installing: java 11.0.15-zulu
Done installing!


Setting java 11.0.15-zulu as default.

Automatic installation via apt-get

Install the latest oracle jdk8 directly via apt-get:

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer

Note: this no longer works on ubuntu 20.04; it fails with "this PPA does not support focal". Use the manual installation instead.

Accept the oracle license when prompted and continue. (Aside: oracle's download servers are painfully slow; even from my overseas VPS on a supposedly gigabit link, I only got 300-500 KB/s...)

If several JDK versions are installed, set the default with:

sudo apt-get install oracle-java8-set-default

After installation, run java -version to check the result:

# java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)

At this point the java command works through a chain of symlinks, from /usr/bin/java via /etc/alternatives/java to /usr/lib/jvm/java-8-oracle/jre/bin/java:

root@sky2:~# which java
/usr/bin/java
root@sky2:~# ls -l /usr/bin/java
lrwxrwxrwx 1 root root 22 Oct  5 08:40 /usr/bin/java -> /etc/alternatives/java
root@sky2:~# ls -l /etc/alternatives/java
lrwxrwxrwx 1 root root 39 Oct  5 08:40 /etc/alternatives/java -> /usr/lib/jvm/java-8-oracle/jre/bin/java

JAVA_HOME is not set, however. It is recommended to set JAVA_HOME and add JAVA_HOME/bin to PATH by editing .bashrc:

# java
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
export PATH=$JAVA_HOME/bin:$PATH

Then run source ~/.bashrc to load it.

Manual JDK installation

For a manual install, download the archive from the oracle site, e.g. jdk-8u281-linux-x64.tar.gz (for ubuntu, do not download the rpm version).

Unpack it into the installation path:

gunzip jdk-8u281-linux-x64.tar.gz
tar xvf jdk-8u281-linux-x64.tar
# on ubuntu 20.04 this directory already exists
# sudo mkdir /usr/lib/jvm
sudo mv jdk1.8.0_281 /usr/lib/jvm/jdk8

Set JAVA_HOME and PATH the same way:

# java
export JAVA_HOME=/usr/lib/jvm/jdk8/
export PATH=$JAVA_HOME/bin:$PATH
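A quick sanity check that the PATH change resolves where intended (note: no trailing slash on JAVA_HOME keeps the resulting path clean):

```shell
export JAVA_HOME=/usr/lib/jvm/jdk8
export PATH=$JAVA_HOME/bin:$PATH
# the first PATH entry should now be the JDK's bin directory
echo "${PATH%%:*}"
```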


3.3.2 - Installing Maven

Install and configure Maven

Installing maven

With sdkman

$ sdk install maven

Downloading: maven 3.8.5

In progress...

Setting maven 3.8.5 as default.

配置代理服务器

export MAVEN_OPTS="-DsocksProxyHost=192.168.0.1 -DsocksProxyPort=7891"
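
MAVEN_OPTS 这种方式是 JVM 级别的 socks 代理,对所有 mvn 调用生效;另一种常见做法是在 settings.xml 中配置 proxies 节点走 HTTP 代理(下面的主机和端口只是示例,按自己的代理环境修改):

```xml
<!-- ~/.m2/settings.xml 片段:为 mvn 配置 HTTP 代理 -->
<proxies>
  <proxy>
    <id>local-proxy</id>
    <active>true</active>
    <protocol>http</protocol>
    <host>192.168.0.1</host>
    <port>7890</port>
    <nonProxyHosts>localhost|192.168.0.*</nonProxyHosts>
  </proxy>
</proxies>
```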

3.3.3 - 安装Artifactory

安装并配置Artifactory

安装

准备安装:

wget -qO - https://api.bintray.com/orgs/jfrog/keys/gpg/public.key | sudo apt-key add -
echo "deb https://jfrog.bintray.com/artifactory-debs focal main" | sudo tee /etc/apt/sources.list.d/jfrog.list
sudo apt update

执行安装:

sudo apt install jfrog-artifactory-oss

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  jfrog-artifactory-oss
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 484 MB of archives.
After this operation, 989 MB of additional disk space will be used.
Get:1 https://jfrog.bintray.com/artifactory-debs focal/main amd64 jfrog-artifactory-oss amd64 7.37.15 [484 MB]
Fetched 484 MB in 38s (12.9 MB/s)                                              
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LC_TIME = "zh_CN.UTF-8",
	LC_MONETARY = "zh_CN.UTF-8",
	LC_ADDRESS = "zh_CN.UTF-8",
	LC_TELEPHONE = "zh_CN.UTF-8",
	LC_NAME = "zh_CN.UTF-8",
	LC_MEASUREMENT = "zh_CN.UTF-8",
	LC_IDENTIFICATION = "zh_CN.UTF-8",
	LC_NUMERIC = "zh_CN.UTF-8",
	LC_PAPER = "zh_CN.UTF-8",
	LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
locale: Cannot set LC_ALL to default locale: No such file or directory
Selecting previously unselected package jfrog-artifactory-oss.
(Reading database ... 105976 files and directories currently installed.)
Preparing to unpack .../jfrog-artifactory-oss_7.37.15_amd64.deb ...
dpkg-query: no packages found matching artifactory
Checking if group artifactory exists...
Group artifactory doesn't exist. Creating ...
Checking if user artifactory exists...
User artifactory doesn't exist. Creating ...
Checking if artifactory data directory exists
Removing tomcat work directory
Unpacking jfrog-artifactory-oss (7.37.15) ...
Setting up jfrog-artifactory-oss (7.37.15) ...
Adding the artifactory service to auto-start... DONE

************ SUCCESS ****************
The Installation of Artifactory has completed successfully.

NOTE: It is highly recommended to use Artifactory with an external database (MySQL, Oracle, Microsoft SQL Server, PostgreSQL, MariaDB).
      For details about how to configure the database, refer to https://service.jfrog.org/installer/Configuring+the+Database

Start Artifactory with:
> systemctl start artifactory.service

Check Artifactory status with:
> systemctl status artifactory.service


Installation directory was set to /opt/jfrog/artifactory
You can find more information in the log directory /opt/jfrog/artifactory/var/log
System configuration templates can be found under /opt/jfrog/artifactory/var/etc
Copy any configuration you want to modify from the template to /opt/jfrog/artifactory/var/etc/system.yaml

Triggering migration script, this will migrate if needed ...
Processing triggers for systemd (245.4-4ubuntu3.17) ...

start 并 enable artifactory:

sudo systemctl start artifactory.service
sudo systemctl enable artifactory.service

检验一下:

$ systemctl status artifactory.service
● artifactory.service - Artifactory service
     Loaded: loaded (/lib/systemd/system/artifactory.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-05-06 03:06:34 UTC; 55s ago
   Main PID: 44611 (java)
      Tasks: 0 (limit: 154104)
     Memory: 20.8M
     CGroup: /system.slice/artifactory.service
44611 /opt/jfrog/artifactory/app/third-party/java/bin/java -Djava.util.logging.config.file=/opt/jfrog/artifactory/app/>

May 06 03:06:33 skyserver su[45384]: (to artifactory) root on none
May 06 03:06:33 skyserver su[45384]: pam_unix(su:session): session opened for user artifactory by (uid=0)
May 06 03:06:33 skyserver su[45384]: pam_unix(su:session): session closed for user artifactory
May 06 03:06:33 skyserver su[45497]: (to artifactory) root on none
May 06 03:06:33 skyserver su[45497]: pam_unix(su:session): session opened for user artifactory by (uid=0)
May 06 03:06:34 skyserver su[45497]: pam_unix(su:session): session closed for user artifactory
May 06 03:06:34 skyserver su[45628]: (to artifactory) root on none
May 06 03:06:34 skyserver su[45628]: pam_unix(su:session): session opened for user artifactory by (uid=0)
May 06 03:06:34 skyserver su[45628]: pam_unix(su:session): session closed for user artifactory
May 06 03:06:34 skyserver systemd[1]: Started Artifactory service.

$ ps -ef | grep artifactory                               
sky        39708   33533  0 03:05 pts/0    00:00:00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox artifactory

$ nc  -zv  127.0.0.1 8081     
Connection to 127.0.0.1 8081 port [tcp/tproxy] succeeded!

配置Artifactory

访问:

http://192.168.0.40:8081

会被重定向到:

http://192.168.0.40:8082/ui/login/

默认账号为: Username: admin Password: password

设置 baseURL: http://sz.springmesh.io/

(注意在路由器中开启 8081 、 8082 两个端口的端口映射,以便远程访问)

创建仓库时选择 maven ,Repositories created during the Onboarding:

  • libs-snapshot-local
  • maven-remote
  • libs-snapshot

登录之后会要求修改默认的 admin 密码。

配置密码加密方式

为了避免在 maven 的 settings.xml 文件中出现密码的明文,要启用密码加密方式,在 “security” -> “settings” 中修改 “Password Encryption Policy” 为 “support”。

备注: unsupport 是明文,support 是可以明文也可以加密,required 是必须加密。

遇到的特殊问题:我在 settings.xml 中使用明文密码 + artifactory 设置 unsupport 可以工作,在 settings.xml 中使用加密密码 + artifactory 设置 support 也可以工作,但我在 settings.xml 中使用明文密码 + artifactory 设置 required 就会报错 401,不清楚为什么。

配置 maven

修改 maven 的 settings 配置文件 (~/.m2/settings.xml),内容如下:

<?xml version="1.0" encoding="UTF-8"?>
<settings xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.2.0 http://maven.apache.org/xsd/settings-1.2.0.xsd" xmlns="http://maven.apache.org/SETTINGS/1.2.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <servers>
    <server>
      <username>admin</username>
      <password>{lnSEkwEnmhaaaaaaaaaaaaaaaaaaaaaa=}</password>
      <id>central</id>
    </server>
    <server>
      <username>admin</username>
      <password>{lnSEkwEnmaaaaaaaaaaaaaaaaaaaaa=}</password>
      <id>snapshots</id>
    </server>
  </servers>
  <mirrors>
    <mirror>
      <mirrorOf>*</mirrorOf>
      <name>libs-release</name>
      <url>http://192.168.0.40:8081/artifactory/libs-release</url>
      <id>central</id>
    </mirror>
  </mirrors>
  <profiles>
    <profile>
      <repositories>
        <repository>
          <snapshots>
            <enabled>false</enabled>
          </snapshots>
          <id>central</id>
          <name>libs-release</name>
          <url>http://192.168.0.40:8081/artifactory/libs-release</url>
        </repository>
        <repository>
          <snapshots />
          <id>snapshots</id>
          <name>libs-snapshot</name>
          <url>http://192.168.0.40:8081/artifactory/libs-snapshot</url>
        </repository>
      </repositories>
      <pluginRepositories>
        <pluginRepository>
          <snapshots>
            <enabled>false</enabled>
          </snapshots>
          <id>central</id>
          <name>libs-release</name>
          <url>http://192.168.0.40:8081/artifactory/libs-release</url>
        </pluginRepository>
        <pluginRepository>
          <snapshots />
          <id>snapshots</id>
          <name>libs-snapshot</name>
          <url>http://192.168.0.40:8081/artifactory/libs-snapshot</url>
        </pluginRepository>
      </pluginRepositories>
      <id>artifactory</id>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>artifactory</activeProfile>
  </activeProfiles>
</settings>

注意:在使用mirror的情况下,mirror的id要和 server 的 id 保持一致,否则会报错 401。

其中密码加密的方式参考官方文档:

https://maven.apache.org/guides/mini/guide-encryption.html

mvn --encrypt-master-password
{jSMOWnoPFgsHVpMvz5VrIt5kRbzGpI8u+9EF1iFQyJQ=}

输入一个master密码,可以理解为是一个加密因子,总之和服务器端admin的实际密码没任何关系,随便输入一堆数字和字母,然后就会得到一组类似 {jSMOWnoPFgsHVpMvz5VrIt5kRbzGpI8u+9EF1iFQyJQ=} 的字符串。

新建 ${user.home}/.m2/settings-security.xml 文件,内容为:

<settingsSecurity>
  <master>{jSMOWnoPFgsHVpMvz5VrIt5kRbzGpI8u+9EF1iFQyJQ=}</master>
</settingsSecurity>

再执行下面的命令加密admin的实际密码:

mvn --encrypt-password
{COQLCE6DU6GtcS5P=}

将得到的 类似 {COQLCE6DU6GtcS5P=} 的加密后的字符串保存在 settings.xml 文件中。

  <servers>
    <server>
      <username>admin</username>
      <password>{lnSEkwEnmhaaaaaaaaaaaaaaaaaaaaaa=}</password>
      <id>central</id>
    </server>
    <server>
      <username>admin</username>
      <password>{lnSEkwEnmaaaaaaaaaaaaaaaaaaaaa=}</password>
      <id>snapshots</id>
    </server>
  </servers>

参考资料

3.3.4 - 安装Quarkus

安装并配置Quarkus

安装

参考: https://quarkus.io/get-started/

sdk install quarkus

备注:下载速度有点问题,建议开启代理

4 - 软交换

将Ubuntu Server用作软交换,通过多网卡连接多台机器

4.1 - 简单软路由

使用 linux bridge 简单实现软路由

准备工作

网卡信息

主板上插了四块 hp544+ 双头网卡,芯片是 Mellanox ConnectX-3 Pro:

lspci | grep Eth
01:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
02:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
05:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
06:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

安装 ubuntu server 之后,只使用了其中一个网口作为 wan:

2: ens2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e0:07:1b:77:41:31 brd ff:ff:ff:ff:ff:ff
3: ens2d1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e0:07:1b:77:41:32 brd ff:ff:ff:ff:ff:ff
4: ens6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e0:07:1b:78:3e:a1 brd ff:ff:ff:ff:ff:ff
5: ens6d1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e0:07:1b:78:3e:a2 brd ff:ff:ff:ff:ff:ff
6: ens4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 50:65:f3:89:24:21 brd ff:ff:ff:ff:ff:ff
7: ens4d1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 50:65:f3:89:24:22 brd ff:ff:ff:ff:ff:ff
8: enp6s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether c4:34:6b:df:e1:81 brd ff:ff:ff:ff:ff:ff
9: enp6s0d1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether c4:34:6b:df:e1:82 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.99/24 brd 192.168.0.255 scope global enp6s0d1
       valid_lft forever preferred_lft forever
    inet6 fe80::c634:6bff:fedf:e182/64 scope link 
       valid_lft forever preferred_lft forever

配置软交换

安装 network-manager

首先需要安装 network-manager:

sudo apt install network-manager

配置 linux bridge

备份原有的网络配置文件:

sudo cp /etc/netplan/00-installer-config.yaml /etc/netplan/00-installer-config.yaml.original

然后修改文件内容:

sudo vi /etc/netplan/00-installer-config.yaml

内容如下:

network:
  version: 2
  renderer: NetworkManager
  ethernets:
    wan1:
      match:
        macaddress: c4:34:6b:df:e1:82
      set-name: wan1
      dhcp4: no
    lan1:
      match:
        macaddress: c4:34:6b:df:e1:81
      set-name: lan1
      dhcp4: no
    lan2:
      match:
        macaddress: 50:65:f3:89:24:21
      set-name: lan2
      dhcp4: no
    lan3:
      match:
        macaddress: 50:65:f3:89:24:22
      set-name: lan3
      dhcp4: no
    lan4:
      match:
        macaddress: e0:07:1b:78:3e:a1
      set-name: lan4
      dhcp4: no
    lan5:
      match:
        macaddress: e0:07:1b:78:3e:a2
      set-name: lan5
      dhcp4: no
    lan6:
      match:
        macaddress: e0:07:1b:77:41:31
      set-name: lan6
      dhcp4: no
    lan7:
      match:
        macaddress: e0:07:1b:77:41:32
      set-name: lan7
      dhcp4: no
  bridges:
    br0:
      interfaces:
        - wan1
        - lan1
        - lan2
        - lan3
        - lan4
        - lan5
        - lan6
        - lan7
      addresses:
        - 192.168.0.99/24
      dhcp4: no

四块网卡8个网口,一个作为wan,7个作为lan,配置好之后都作为 br0 的 interfaces 。
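
上面的配置保存后需要应用才会生效,应用后可以顺便确认网桥和各网口的挂载情况(br0 即配置中的网桥名):

```
sudo netplan apply
ip link show br0               # 查看网桥本身
ip link show master br0        # 列出挂在 br0 下的所有网口
```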

总结

这是配置软交换最简单的方法,不需要建立子网,也不需要配置 DHCP server。

4.2 - 配置ubuntu server开启路由功能

配置ubuntu server,开启路由功能,支持多台机器接入

背景

有一个普通的家用网络环境,硬件有:

  • 光猫:桥接模式,关闭了路由功能
  • 路由器:负责拨号上网,dhcp服务器,地址为 192.168.0.1
  • 千兆交换机:解决路由器LAN口不足的问题,WAN口接在路由器上,其他LAN口连接各台物理机
  • 五台物理机:四台服务器(skyserver/skyserver2/skyserver3/skyserver5),一台工作机(skywork),各个千兆网卡分别连接在路由器和交换机的千兆LAN口上,网段为 192.168.0.x

计划升级内部网络为 40G/56G ,网卡采用拆机二手的 HP 544+ 网卡,支持 40G/56G,支持IB和以太网,价格便宜(100元附近)。

但 40G/56G 的交换机价格太贵,而且体积噪音也不适合家庭使用。

因此考虑简单处理,采用多网卡直联的方案,用其他一台机器(skyserver5,安装有 ubuntu server 20.04 )作为中央节点,为其他机器提供路由服务。

需要的网络设备有:

  • HP 544+ 网卡6块:中央节点上插两块,提供4个网卡;其他四台机器上各插一块
  • 40G/56G DAC直联线4根

步骤1: 组建高速网络

这个步骤的目标是在原有的千兆网络基础上,完成新的高速40g网络的配置,实现两个网络并存。

使用到的网卡

中央节点 skyserver5 上有多块网卡,可以通过 ifconfig 命令查看。但有时某些网卡不会出现在 ifconfig 的输出中,典型如 HP 544+ 网卡:在没有插入网线时,ifconfig 不会列出其上的网卡设备。我这块主板上板载两块 intel 万兆网卡,另插有两块 HP 544+ 网卡(每块上有两个网口,总共应该是四个网口)。lspci 可以正常发现这些网卡设备:

$ lspci | grep X540
01:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
01:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
$ lspci | grep Mellanox
81:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
82:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

TBD:暂时不清楚为什么这两块 HP 544+ 网卡的信息会不一样,一个是 Ethernet controller,一个是 Network controller

注意: 当网卡没有插网线,或者网线另一端的主机没有开机时,ifconfig 就不会列出这个网络适配器,需要把网线插到这块空闲的网卡上(且对端联通)之后才能看到对应的网络设备。改用 ip 命令则可以列出所有设备:

$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e0:4f:43:ca:82:48 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.50/24 brd 192.168.0.255 scope global eno1
3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e0:4f:43:ca:82:49 brd ff:ff:ff:ff:ff:ff
4: ens3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 70:10:6f:aa:2a:81 brd ff:ff:ff:ff:ff:ff
5: ens3d1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 70:10:6f:aa:2a:82 brd ff:ff:ff:ff:ff:ff
6: ens4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 24:be:05:bd:88:e1 brd ff:ff:ff:ff:ff:ff
7: ens4d1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 24:be:05:bd:88:e2 brd ff:ff:ff:ff:ff:ff

主板板载两块 intel 万兆网卡 X540,准备其中一个(eno1)连接到路由器,作为网络出口,改名 wan1。

  • inet: 192.168.0.50
  • ether: e0:4f:43:ca:82:48

另外一个万兆电口和两块 hp544+ 网卡上的4个40g网口作为 lan 使用。

  1. eno1 作为 wan 口 (万兆) mac: e0:4f:43:ca:82:48 ip地址: 192.168.0.50
  2. eno2 作为 lan 口备用 (万兆) mac: e0:4f:43:ca:82:49 brd ff:ff:ff:ff:ff:ff
  3. ens3 作为 lan (40g) mac: 70:10:6f:aa:2a:81
  4. ens3d1 作为 lan (40g) mac: 70:10:6f:aa:2a:82
  5. ens4 作为 lan (40g) mac: 24:be:05:bd:88:e1
  6. ens4d1 作为 lan (40g) mac: 24:be:05:bd:88:e2

网卡改名和增加网桥

默认的名字不太好读,需要改名,另外我们要为这些 LAN 口创建一个网桥(linux bridge),这个网桥就相当于一个虚拟的交换机。

在 ubuntu 20.04 上,可以采用 network-manager 来管理网络,方便后续操作。首先需要安装 network-manager:

sudo apt install network-manager

备份原有的网络配置文件:

sudo cp /etc/netplan/00-installer-config.yaml /etc/netplan/00-installer-config.yaml.original

然后修改文件内容:

sudo vi /etc/netplan/00-installer-config.yaml

内容如下:

network:
  version: 2
  renderer: NetworkManager
  ethernets:
    wan1:
      match:
        macaddress: e0:4f:43:ca:82:48
      set-name: wan1
      dhcp4: true
    lan1:
      match:
        macaddress: 70:10:6f:aa:2a:81
      set-name: lan1
      dhcp4: no
    lan2:
      match:
        macaddress: 24:be:05:bd:88:e2
      set-name: lan2
      dhcp4: no
    lan3:
      match:
        macaddress: 24:be:05:bd:88:e1
      set-name: lan3
      dhcp4: no
    lan4:
      match:
        macaddress: 70:10:6f:aa:2a:82
      set-name: lan4
      dhcp4: no
    lan5:
      match:
        macaddress: e0:4f:43:ca:82:49
      set-name: lan5
      dhcp4: no
    lan6:
      match:
        macaddress: e0:07:1b:78:3e:a1
      set-name: lan6
      dhcp4: no
    lan7:
      match:
        macaddress: e0:07:1b:78:3e:a2
      set-name: lan7
      dhcp4: no
  bridges:
    br:
      interfaces:
        - lan1
        - lan2
        - lan3
        - lan4
        - lan5
        - lan6
        - lan7
      addresses:
        - 192.168.100.50/24
      dhcp4: no

通过 mac 地址匹配,将配置参数应用到对应的网卡上;为了提高可读性,将网卡重命名为 wan1 和 lan1 ~ lan7。

网桥上需要列出所有包含的 interface,另外配置网桥地址。我这里为了方便,使用 192.168.0.0/24 和 192.168.100.0/24 两个网段,每台机器在这两个网段上的地址最后一段保持一致:

机器          192.168.0.0/24 网段     192.168.100.0/24 网段
skyserver     192.168.0.10            192.168.100.10
skyserver2    192.168.0.20            192.168.100.20
skyserver3    192.168.0.30            192.168.100.30
skyserver4    192.168.0.40            192.168.100.40
skyserver5    192.168.0.50 (wan1)     192.168.100.50 (网桥)
skyserver6    192.168.0.60            192.168.100.60
skywork       192.168.0.90            192.168.100.90

保存后执行:

sudo netplan apply

最好重启一下机器:有时网卡的重命名需要重启之后才能完全生效,为避免麻烦,建议重启之后再进行后续操作。

重启完成之后,查看网卡信息,可以看到全部网卡都按照上述的配置进行了重命名,而且网桥也添加好了:

$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: wan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e0:4f:43:ca:82:48 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.50/24 brd 192.168.0.255 scope global noprefixroute wan1
3: lan5: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e0:4f:43:ca:82:49 brd ff:ff:ff:ff:ff:ff
4: lan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br state UP group default qlen 1000
    link/ether 70:10:6f:aa:2a:81 brd ff:ff:ff:ff:ff:ff
5: lan4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 70:10:6f:aa:2a:82 brd ff:ff:ff:ff:ff:ff
6: lan3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br state UP group default qlen 1000
    link/ether 24:be:05:bd:88:e1 brd ff:ff:ff:ff:ff:ff
7: lan2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br state UP group default qlen 1000
    link/ether 24:be:05:bd:88:e2 brd ff:ff:ff:ff:ff:ff
8: br: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 24:be:05:bd:88:e1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.50/24 brd 192.168.100.255 scope global noprefixroute br

安装 dnsmasq

安装 dnsmasq 以提供 DHCP 和 DNS 服务:

sudo apt-get install dnsmasq

dnsmasq 安装完成后,启动时会因为 53 端口被占用而失败:

......
invoke-rc.d: initscript dnsmasq, action "start" failed.
● dnsmasq.service - dnsmasq - A lightweight DHCP and caching DNS server
     Loaded: loaded (/lib/systemd/system/dnsmasq.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2023-03-22 09:03:53 UTC; 10ms ago
    Process: 1834 ExecStartPre=/usr/sbin/dnsmasq --test (code=exited, status=0/SUCCESS)
    Process: 1835 ExecStart=/etc/init.d/dnsmasq systemd-exec (code=exited, status=2)

Mar 22 09:03:53 skyserver5 systemd[1]: Starting dnsmasq - A lightweight DHCP and caching DNS server...
Mar 22 09:03:53 skyserver5 dnsmasq[1834]: dnsmasq: syntax check OK.
Mar 22 09:03:53 skyserver5 dnsmasq[1835]: dnsmasq: failed to create listening socket for port 53: Address already in use
Mar 22 09:03:53 skyserver5 dnsmasq[1835]: failed to create listening socket for port 53: Address already in use
Mar 22 09:03:53 skyserver5 dnsmasq[1835]: FAILED to start up
Mar 22 09:03:53 skyserver5 systemd[1]: dnsmasq.service: Control process exited, code=exited, status=2/INVALIDARGUMENT
Mar 22 09:03:53 skyserver5 systemd[1]: dnsmasq.service: Failed with result 'exit-code'.
Mar 22 09:03:53 skyserver5 systemd[1]: Failed to start dnsmasq - A lightweight DHCP and caching DNS server.
Processing triggers for systemd (245.4-4ubuntu3.20) ...

此时需要先停止 systemd-resolved 服务:

sudo systemctl stop systemd-resolved

可以取消 systemd-resolved 的自动启动,后面我们用 dnsmasq 替代它:

sudo systemctl disable systemd-resolved

备注:不要先执行 systemctl stop systemd-resolved 再去执行 apt-get install dnsmasq,因为 systemd-resolved stop 之后就不能做 dns 解析了,会导致 apt-get install 命令因为dns无法解析而失败。

随后重启 dnsmasq :

sudo systemctl restart dnsmasq.service

检查 dnsmasq 的状态:

sudo systemctl status dnsmasq.service

可以看到 dnsmasq 顺利启动:

● dnsmasq.service - dnsmasq - A lightweight DHCP and caching DNS server
     Loaded: loaded (/lib/systemd/system/dnsmasq.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2023-03-22 09:07:04 UTC; 7s ago
    Process: 1988 ExecStartPre=/usr/sbin/dnsmasq --test (code=exited, status=0/SUCCESS)
    Process: 1989 ExecStart=/etc/init.d/dnsmasq systemd-exec (code=exited, status=0/SUCCESS)
    Process: 1998 ExecStartPost=/etc/init.d/dnsmasq systemd-start-resolvconf (code=exited, status=0/SUCCESS)
   Main PID: 1997 (dnsmasq)
      Tasks: 1 (limit: 38374)
     Memory: 820.0K
     CGroup: /system.slice/dnsmasq.service
             └─1997 /usr/sbin/dnsmasq -x /run/dnsmasq/dnsmasq.pid -u dnsmasq -7 /etc/dnsmasq.d,.dpkg-dist,.dpkg-old,.dpkg-new --local-service --trust-anchor=.,20326,8,2,e06>

Mar 22 09:07:04 skyserver5 systemd[1]: Starting dnsmasq - A lightweight DHCP and caching DNS server...
Mar 22 09:07:04 skyserver5 dnsmasq[1988]: dnsmasq: syntax check OK.
Mar 22 09:07:04 skyserver5 dnsmasq[1997]: started, version 2.80 cachesize 150
Mar 22 09:07:04 skyserver5 dnsmasq[1997]: DNS service limited to local subnets
Mar 22 09:07:04 skyserver5 dnsmasq[1997]: compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth nettlehash DNSSEC loop-detect ino>
Mar 22 09:07:04 skyserver5 dnsmasq[1997]: reading /etc/resolv.conf
Mar 22 09:07:04 skyserver5 dnsmasq[1997]: using nameserver 127.0.0.53#53
Mar 22 09:07:04 skyserver5 dnsmasq[1997]: read /etc/hosts - 7 addresses
Mar 22 09:07:04 skyserver5 systemd[1]: Started dnsmasq - A lightweight DHCP and caching DNS server.

注意 dnsmasq 读取了 /etc/resolv.conf 文件,目前这个文件的内容默认是 “nameserver 127.0.0.53”,指向的是 systemd-resolved 的 stub 地址,显然不适合作为 dnsmasq 的上游 dns 服务器;考虑到 /etc/resolv.conf 会被其他程序使用,直接修改这个文件也不合适。

比较合适的做法是为 dnsmasq 单独配置 resolv.conf 文件,比如:

sudo vi /etc/resolv.dnsmasq.conf

内容设置为路由器的地址:

nameserver 192.168.0.1

然后修改 dnsmasq 配置文件:

sudo vi /etc/dnsmasq.conf

设置 resolv-file 指向前面我们添加的 /etc/resolv.dnsmasq.conf 配置文件:

# Change this line if you want dns to get its upstream servers from
# somewhere other that /etc/resolv.conf
resolv-file=/etc/resolv.dnsmasq.conf

重启 dnsmasq 并检查 dnsmasq 的状态:

sudo systemctl restart dnsmasq.service
sudo systemctl status dnsmasq.service

可以看到 dnsmasq 这次正确读取到了上游dns服务器 :

Mar 22 09:31:07 skyserver5 dnsmasq[2222]: reading /etc/resolv.dnsmasq.conf
Mar 22 09:31:07 skyserver5 dnsmasq[2222]: using nameserver 192.168.0.1#53

nslookup 命令验证一下 dns 解析:

nslookup www.baidu.com

可以看到解析的结果:

Server:		127.0.0.53
Address:	127.0.0.53#53

Non-authoritative answer:
www.baidu.com	canonical name = www.a.shifen.com.
Name:	www.a.shifen.com
Address: 14.119.104.189
Name:	www.a.shifen.com
Address: 14.215.177.38

备注:

  • 一定要设置好 dnsmasq 的上游 dns 服务器,默认 127.0.0.53 指向 dnsmasq 自身,会造成无限循环
  • 在进行下一步之前,一定要确保 dnsmasq 正常工作,否则后续步骤报错会因为各种干扰不好排查
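
另外,停用 systemd-resolved 之后,本机的 /etc/resolv.conf 可能仍然是指向 127.0.0.53 stub 的符号链接,会导致本机自己解析域名异常。一个常见的处理方式(假设本机解析也交给 dnsmasq)是把它换成指向本机的静态文件:

```
# 删除指向 systemd-resolved stub 的符号链接,换成静态文件
sudo rm /etc/resolv.conf
echo "nameserver 127.0.0.1" | sudo tee /etc/resolv.conf
```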

配置 dnsmasq

安装好之后继续配置 dnsmasq:

sudo vi /etc/dnsmasq.conf

修改 dnsmasq 配置文件内容如下:

# Change this line if you want dns to get its upstream servers from
# somewhere other that /etc/resolv.conf
resolv-file=/etc/resolv.dnsmasq.conf

no-hosts
listen-address=127.0.0.1,192.168.100.50
port=53

dhcp-range=192.168.100.200,192.168.100.250,255.255.255.0,12h
dhcp-option=option:router,192.168.100.50
dhcp-option=option:dns-server,192.168.100.50
dhcp-option=option:netmask,255.255.255.0

dhcp-host=24:be:05:bd:08:02,192.168.100.10                              # skyserver
dhcp-host=48:0f:cf:ef:08:11,192.168.100.20                              # skyserver2
dhcp-host=48:0f:cf:f7:89:c2,192.168.100.30                              # skyserver3
dhcp-host=9c:dc:71:57:bb:e2,192.168.100.90                              # skywork

其中

  • no-hosts: 不读取本地的 /etc/hosts 文件
  • listen-address 配置监听地址,这里使用 127.0.0.1 和网桥的地址
  • dhcp-range: 配置DHCP 的 IP 范围, 子网掩码, 租期
  • dhcp-option: 用来配置网关地址和DNS 地址(这里都使用网桥的地址),还有子网掩码
  • dhcp-host: 用来配置静态地址分配
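
每次改完配置,建议先做一次语法检查再重启服务,避免 dnsmasq 起不来之后不好排查:

```
# 配置正确时会输出: dnsmasq: syntax check OK.
dnsmasq --test -C /etc/dnsmasq.conf
```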

之后重启 dnsmasq:

sudo systemctl restart dnsmasq.service
sudo systemctl status dnsmasq.service

备注: 这里也最好是重启一下机器,验证一下上面的改动。尤其是四个网卡都接好网线之后,验证一下自动获取的IP地址是否如预期。

特别注意: 重启后需要再次检查 dnsmasq 服务是否正常启动。

验证高速网络

做一下网络验证,网桥所在的 192.168.100.0/24 号段和普通网络的 192.168.0.0/24 号段:

ping 192.168.100.10
ping 192.168.100.20
ping 192.168.100.30
ping 192.168.100.50
ping 192.168.100.90

ping 192.168.0.1
ping 192.168.0.10
ping 192.168.0.20
ping 192.168.0.30
ping 192.168.0.50
ping 192.168.0.90

这一步完成之后,由于目前每个机器上都有两个网卡,两个网段都可以连接,整个网络就基本联通了,每台机器都可以通过这两个网段的 IP 地址访问到其他机器。

但是,192.168.100 号段是不能连接外网的,两个网段之间也不能互相访问,如果只接一个网卡就不能访问另一个号段了。后面继续进行配置。

步骤2:40g网络连接千兆网络

这个步骤的目标是实现 40g 网络可以访问千兆网络和外网,这样可以不需要同时接 40g 网卡和千兆网卡,只要有 40g 网卡也可以正常访问两个网络和外网。

开启端口转发

sudo vi /etc/sysctl.conf

并取消下面这行的注释,开启端口转发功能:

net.ipv4.ip_forward=1

执行 sudo sysctl -p 令其立即生效:

sudo sysctl -p

备注:

如果这个命令报错 sysctl: cannot stat /proc/sys/–p: No such file or directory ,多半是因为 “-p” 里的横线被复制成了 en-dash(–),改成半角的 “-p” 重新执行即可;也可以不管它,后面重启机器后配置同样会生效。
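
开启之后可以直接读取 /proc 来确认转发开关的当前值(下面的片段只是读取,不做修改;未开启的机器上会输出 0):

```shell
# 1 表示已开启转发,0 表示未开启
val=$(cat /proc/sys/net/ipv4/ip_forward)
echo "net.ipv4.ip_forward = $val"
```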

sudo vi /etc/default/ufw

修改 DEFAULT_FORWARD_POLICY 为 ACCEPT :

# Set the default forward policy to ACCEPT, DROP or REJECT.  Please note that
# if you change this you will most likely want to adjust your rules
# DEFAULT_FORWARD_POLICY="DROP"
DEFAULT_FORWARD_POLICY="ACCEPT"
sudo vi /etc/ufw/before.rules

增加以下内容:

# 这是已有内容
# allow all on loopback
-A ufw-before-input -i lo -j ACCEPT
-A ufw-before-output -o lo -j ACCEPT

# 这是新增内容
# allow all on LAN: lan1/lan2/lan3/lan4/lan5
-A ufw-before-input -i lan1 -j ACCEPT
-A ufw-before-output -o lan1 -j ACCEPT
-A ufw-before-input -i lan2 -j ACCEPT
-A ufw-before-output -o lan2 -j ACCEPT
-A ufw-before-input -i lan3 -j ACCEPT
-A ufw-before-output -o lan3 -j ACCEPT
-A ufw-before-input -i lan4 -j ACCEPT
-A ufw-before-output -o lan4 -j ACCEPT
-A ufw-before-input -i lan5 -j ACCEPT
-A ufw-before-output -o lan5 -j ACCEPT
-A ufw-before-input -i lan6 -j ACCEPT
-A ufw-before-output -o lan6 -j ACCEPT
-A ufw-before-input -i lan7 -j ACCEPT
-A ufw-before-output -o lan7 -j ACCEPT

然后在 *filter :ufw-before-input - [0:0] 之前加入以下内容:

*nat
:POSTROUTING ACCEPT [0:0]

# Forward traffic through wan1 - Change to match you out-interface
-A POSTROUTING -s 192.168.100.0/24 -o wan1 -j MASQUERADE

# don't delete the 'COMMIT' line or these nat table rules won't
# be processed
COMMIT

两次改动的位置如图所示:

ufw

修改完成之后重启机器。

验证网络访问

验证一下,我在 skyserver 这台机器上,IP地址为 192.168.0.10 和 192.168.100.10。拔掉千兆网线,这样 192.168.0.0/24 网段不可用,就只剩下 192.168.100.0/24 网段了。route -n 看一下现在的路由表:

$ route -n
内核 IP 路由表
目标            网关            子网掩码        标志  跃点   引用  使用 接口
0.0.0.0         10.0.0.1        0.0.0.0         UG    20102  0        0 ens1
10.0.0.0        0.0.0.0         255.255.255.0   U     102    0        0 ens1

看一下目前的 dns 解析设置:

systemd-resolve --status

Link 4 (ens4d1)
      Current Scopes: DNS           
DefaultRoute setting: yes           
       LLMNR setting: yes           
MulticastDNS setting: no            
  DNSOverTLS setting: no            
      DNSSEC setting: no            
    DNSSEC supported: no            
  Current DNS Server: 192.168.100.50
         DNS Servers: 192.168.100.50

可以看到目前 ens4d1 这个40g网卡的 dns 服务器正确的指向了网桥地址(192.168.100.50)。

测试一下连通性,这些地址都可以ping通:

ping 192.168.100.10     # OK,自己的地址
ping 192.168.100.50     # OK,自己的地址

ping 192.168.0.1		# OK
ping 192.168.0.50
ping 192.168.0.90

ssh到其他机器,同样也可以ping通这些地址。这说明网桥(192.168.100.50)和 wan1(192.168.0.50)之间的转发已经生效。

其他几台机器上同样测试:拔掉千兆网线,只使用40g网络也可以正常访问外网(注意处理dns解析)和 192.168.0 号段内网。

步骤3:千兆网络连接40g网络

步骤1完成后两个网络可以相互访问,但前提是每台机器上都插有千兆网卡(192.168.0网段)和40g网卡(192.168.100网段)。步骤2完成后,插有40g网卡(192.168.100网段)的机器通过转发可以访问插有千兆网卡(192网段)的机器。但是,此时如果一台机器只有千兆网卡(192.168.0网段)是无法访问 40g网络(192.168.100网段)的机器的。

验证:插上千兆网线,拔掉40g网线,这样 192.168.0.0/24 网段可用,192.168.100.0/24 网段不可用。route -n 看一下现在的路由表:

$ route -n
内核 IP 路由表
目标            网关            子网掩码        标志  跃点   引用  使用 接口
0.0.0.0         192.168.0.1     0.0.0.0         UG    100    0        0 eno1
192.168.0.0     0.0.0.0         255.255.255.0   U     100    0        0 eno1

测试一下连通性,192.168.0.0/24 网段的地址都可以ping通:

ping 192.168.0.1		# OK
ping 192.168.0.10		# OK.自己
ping 192.168.0.20		# OK
ping 192.168.0.50   	# OK.路由节点机器,网桥在这里

而 192.168.100.0/24 网段因为没有路由导致无法访问。解决的方式有两个:全局静态路由和本地单独路由。

全局静态路由

直接在路由器上搞定,在我的网络中的路由器(192.168.0.1) 配置静态路由,增加一个条目:

static-route-table

备注:图中的 10.0.0.0 是最初使用的号段,后来发现和容器网络冲突就改成 192.168.100.0 了。

这样,所有访问 192.168.100.0 号段的请求包抛到路由器(192.168.0.1 网关)之后,会被路由器转发到网桥所在的机器(192.168.0.50);而 192.168.0.50 这台机器的路由表里有 192.168.100 号段的条目,请求包就能进入 192.168.100 号段。

这个方案的好处是超级简单,只要改动路由器即可,无需在每台机器上配置路由,方便。缺点是需要路由器支持,有些路由器(比如 tp link)会故意在家用型号中不支持静态路由。
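
配置好静态路由后,可以在只插千兆网卡的机器上用 traceroute 粗略验证转发路径(目标地址以我环境里的 skyserver 为例):

```
# 第一跳应是路由器 192.168.0.1,第二跳是网桥所在机器 192.168.0.50
traceroute 192.168.100.10
```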

在 openwrt 下,也有类似的配置,打开 “网络“-》 “静态路由”,新建一个 ipv4 静态路由即可:

openwrt

本地单独路由

另外一个办法就是在每台机器上增加路由信息,对于 192.168.100 号段的请求包直接路由给到 192.168.0.50 机器。这是现在的路由表:

$ route -n
内核 IP 路由表
目标            网关            子网掩码        标志  跃点   引用  使用 接口
0.0.0.0         192.168.0.1     0.0.0.0         UG    100    0        0 eno1
192.168.0.0     0.0.0.0         255.255.255.0   U     100    0        0 eno1

因为我这台工作机也是ubuntu,所以就简单了,直接配置 netplan:

network:
  version: 2
  renderer: NetworkManager
  ethernets:
    wan1:
      match:         
        macaddress: 40:b0:76:9e:9e:7e
      set-name: wan1 
      dhcp4: false
      addresses: 
        - 192.168.0.40/24
      gateway4: 192.168.0.1
      nameservers:
        addresses:
          - 192.168.0.1
      routes:
      - to: 192.168.100.0/24
        via: 192.168.0.50

sudo netplan apply 之后,再看路由表信息,增加了一条:

route -n      
内核 IP 路由表
目标            网关            子网掩码        标志  跃点   引用  使用 接口
0.0.0.0         192.168.0.1     0.0.0.0         UG    100    0        0 wan1
192.168.100.0        192.168.0.50    255.255.255.0   UG    100    0        0 wan1
192.168.0.0     0.0.0.0         255.255.255.0   U     100    0        0 wan1

但这个方案需要每台机器都改一遍,麻烦,还是路由器静态路由表更方便。

测试速度

在两台机器上都安装 iperf3:

sudo apt install iperf3

启动服务器端(用 -B 绑定监听地址):

iperf3 -s -B 192.168.100.50

启动客户端:

iperf3 -c 192.168.100.50 -P 5 -t 100

速度一般能到 30Gbps 以上,最高能到 37.7Gbps,接近 40G 网卡的理论上限。

总结

配置过程整体来说还算清晰,熟练之后一套配置下来也就十几分钟,就可以得到一台性能非常优越而且特别静音的高速网络交换机。

不过就是子网使用起来还是有点麻烦,没有足够的必要,还是用简单软交换方案更方便。