Install the required packages
|
apt-get update && apt-get install sudo vim curl wget gpg
|
Install the NTP service and set the time zone
|
apt-get install systemd-timesyncd
|
Set the time zone
|
timedatectl set-timezone Asia/Shanghai
|
Deploy InfluxDB
Add the official APT repository
|
# Ubuntu and Debian
wget -q https://repos.influxdata.com/influxdata-archive_compat.key
echo '393e8779c89ac8d958f81f942f9ad7fb82a25e133faddaf92e15b16e6ac9ce4c influxdata-archive_compat.key' | sha256sum -c && cat influxdata-archive_compat.key | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg > /dev/null
echo 'deb [signed-by=/etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg] https://repos.influxdata.com/debian stable main' | sudo tee /etc/apt/sources.list.d/influxdata.list
|
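The second command above chains the checksum check and the key import with `&&`, so the key is only installed when the digest matches. A minimal sketch of that check as a reusable function, assuming GNU coreutils `sha256sum`; the name `verify_sha256` is illustrative and not part of the InfluxData instructions:

```shell
# verify_sha256 EXPECTED_DIGEST FILE
# Succeeds only if FILE hashes to EXPECTED_DIGEST (hex-encoded SHA-256).
verify_sha256() {
    # "digest<two spaces>filename" is the line format that `sha256sum -c` reads
    printf '%s  %s\n' "$1" "$2" | sha256sum -c - >/dev/null 2>&1
}
```

For example, `verify_sha256 393e8779c89ac8d958f81f942f9ad7fb82a25e133faddaf92e15b16e6ac9ce4c influxdata-archive_compat.key && echo "key OK"` prints the message only when the downloaded key is intact.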
Request a certificate
Download acme.sh
Create an alias
|
alias acme.sh=~/.acme.sh/acme.sh
|
Enable automatic upgrades
|
acme.sh --upgrade --auto-upgrade
|
Import the DNS API credentials. For usage details, see the acme.sh DNS API wiki.
|
# Back up ~/.bashrc first, then append the Cloudflare credentials to it
cp ~/.bashrc ~/.bashrc.backup && cat >> ~/.bashrc << EOF
# Cloudflare DNS API Token
export CF_Token="your_CF_Token"
export CF_Account_ID="CF_Account_ID"
export CF_Zone_ID="CF_Zone_ID"
EOF
|
Load the environment variables
Check that they took effect
Delete the backup of ~/.bashrc
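For the check step, a minimal sketch that reports which of the variables from the previous block are still unset. `check_cf_env` is a hypothetical helper, and requiring all three variables is an assumption that mirrors the block in this article rather than a rule of acme.sh:

```shell
# check_cf_env
# Returns 0 when CF_Token, CF_Account_ID and CF_Zone_ID are all non-empty;
# otherwise names each missing variable on stderr and returns 1.
check_cf_env() {
    missing=0
    for name in CF_Token CF_Account_ID CF_Zone_ID; do
        # Indirect lookup of the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In an interactive shell, something like `. ~/.bashrc && check_cf_env && rm ~/.bashrc.backup` would then cover all three steps, removing the backup only when the variables are present.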
Request the certificate. For reference, see this blog's post on using acme.sh to request Google's free SSL certificates.
|
# Request a wildcard certificate
acme.sh --issue --dns dns_cf -d '*.example.com'
# Or
# Request a certificate for specific subdomains
acme.sh --issue --dns dns_cf -d 'influxdb.example.com' -d 'grafana.example.com'
|
Install influxdb2
|
apt-get update && apt-get install influxdb2
|
Delete the key file
|
rm influxdata-archive_compat.key
|
Create the data directory and change its owner
Create a dedicated directory for InfluxDB data: /mnt/data/influxdb
|
mkdir -p /mnt/data/influxdb
|
Change the owner of /mnt/data/influxdb
|
chown influxdb /mnt/data/influxdb && chgrp influxdb /mnt/data/influxdb && chmod 750 /mnt/data/influxdb
|
Install the certificate and private key
First, create a script
|
mkdir -p /etc/influxdb/tls && nano /etc/influxdb/tls/set-permissions-and-restart.sh
|
Enter the following content
|
#!/bin/bash
# Set the owner of the certificate and private key
chown influxdb /etc/influxdb/tls/key.pem /etc/influxdb/tls/fullchain.pem
# Set the permissions of the certificate and private key to 400
chmod 400 /etc/influxdb/tls/key.pem /etc/influxdb/tls/fullchain.pem
# Restart influxdb.service if the unit is loaded
if systemctl list-units | grep influxdb.service; then
    systemctl restart influxdb.service
fi
|
Install the certificate and private key with acme.sh
|
acme.sh --install-cert -d '*.example.com' \
--key-file /etc/influxdb/tls/key.pem \
--fullchain-file /etc/influxdb/tls/fullchain.pem \
--reloadcmd "bash /etc/influxdb/tls/set-permissions-and-restart.sh"
|
Edit the configuration file: vim /etc/influxdb/config.toml
Enter the following content. For a detailed description of the options, see the official documentation:
|
bolt-path = "/mnt/data/influxdb/influxd.bolt" # skip on a single-disk setup
engine-path = "/mnt/data/influxdb/engine" # skip on a single-disk setup
sqlite-path = "/mnt/data/influxdb/influxd.sqlite" # skip on a single-disk setup
tls-cert = "/etc/influxdb/tls/fullchain.pem"
tls-key = "/etc/influxdb/tls/key.pem"
tls-min-version = "1.3"
|
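`tls-cert` and `tls-key` point at the files that acme.sh installed in the previous step. A minimal sketch of a pre-start check that both files exist and are non-empty; `check_tls_files` is a hypothetical helper, not an InfluxDB tool:

```shell
# check_tls_files FILE...
# Returns 0 when every FILE exists and is non-empty; otherwise names the
# offending files on stderr and returns 1.
check_tls_files() {
    status=0
    for path in "$@"; do
        if [ ! -s "$path" ]; then
            echo "missing or empty: $path" >&2
            status=1
        fi
    done
    return "$status"
}
```

It could be run as `check_tls_files /etc/influxdb/tls/fullchain.pem /etc/influxdb/tls/key.pem` before starting the service.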
Set the environment variables
Check whether $XDG_RUNTIME_DIR/bus exists
|
ls "$XDG_RUNTIME_DIR/bus"
|
If it exists, append DBUS_SESSION_BUS_ADDRESS=$XDG_RUNTIME_DIR/bus to /etc/default/influxdb2
|
echo -e "DBUS_SESSION_BUS_ADDRESS=\$XDG_RUNTIME_DIR/bus" >> /etc/default/influxdb2
|
If it does not exist, use /dev/null in place of $XDG_RUNTIME_DIR/bus and append that to /etc/default/influxdb2
|
echo "DBUS_SESSION_BUS_ADDRESS=/dev/null" >> /etc/default/influxdb2
|
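The two branches above can be folded into one function that picks the line to append. This is a sketch under the assumption that "exists" simply means the path is present on disk; `dbus_env_line` is a hypothetical name, and it emits the same literal text that the two `echo` commands write:

```shell
# dbus_env_line RUNTIME_DIR
# Prints the line to append to the service's /etc/default file:
# the $XDG_RUNTIME_DIR/bus form when RUNTIME_DIR/bus exists, /dev/null otherwise.
dbus_env_line() {
    runtime_dir=$1
    if [ -n "$runtime_dir" ] && [ -e "$runtime_dir/bus" ]; then
        # Single quotes keep $XDG_RUNTIME_DIR unexpanded, as in the article
        printf '%s\n' 'DBUS_SESSION_BUS_ADDRESS=$XDG_RUNTIME_DIR/bus'
    else
        printf '%s\n' 'DBUS_SESSION_BUS_ADDRESS=/dev/null'
    fi
}
```

Usage would look like `dbus_env_line "$XDG_RUNTIME_DIR" >> /etc/default/influxdb2`; the same call works for /etc/default/telegraf.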
Start InfluxDB
|
systemctl start influxdb
|
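In a browser, open https://influxdb.example.com:8086

Fill in the initial setup fields however you like, but remember them: you need them for every later WebUI login. Click CONTINUE.

Save the generated token; it is not shown again. Click CONFIGURE LATER.

Once inside, create a new bucket to hold the collected monitoring data, separate from the data in the initial bucket. Path:
Load Data
> Buckets
> CREATE BUCKET

I named mine Telegraf. Then choose how long data is retained: never delete, a preset duration, or a custom duration.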

Deploy Grafana
Official download page
|
apt-get update && apt-get install libfontconfig1 musl
wget https://dl.grafana.com/oss/release/grafana_11.2.0_amd64.deb
dpkg -i grafana_11.2.0_amd64.deb
|
Delete the package
|
rm grafana_11.2.0_amd64.deb
|
Enable start on boot
|
systemctl enable grafana-server
|
Install the certificate
First, create the certificate renewal script
|
mkdir -p /etc/grafana/tls && nano /etc/grafana/tls/set-permissions-and-restart.sh
|
Enter the following content
|
#!/bin/bash
# Set the owner of the certificate and private key
chown grafana /etc/grafana/tls/key.pem /etc/grafana/tls/fullchain.pem
# Set the permissions of the certificate and private key to 400
chmod 400 /etc/grafana/tls/key.pem /etc/grafana/tls/fullchain.pem
# Restart grafana-server.service if the unit is loaded
if systemctl list-units | grep grafana-server.service; then
    systemctl restart grafana-server.service
fi
|
Run acme.sh
|
acme.sh --install-cert -d '*.example.com' \
--key-file /etc/grafana/tls/key.pem \
--fullchain-file /etc/grafana/tls/fullchain.pem \
--reloadcmd "bash /etc/grafana/tls/set-permissions-and-restart.sh"
|
Configure HTTPS
|
nano /etc/grafana/grafana.ini
|
Modify the following settings
|
[server]
protocol = h2
min_tls_version = "TLS1.3"
http_addr = 0.0.0.0
http_port = 3000
domain = grafana.example.com
enforce_domain = false
root_url = https://grafana.example.com:3000
cert_file = /etc/grafana/tls/fullchain.pem
cert_key = /etc/grafana/tls/key.pem
|
Start Grafana
|
systemctl start grafana-server
|
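Access
In a browser, open https://grafana.example.com:3000. The default username and password are both admin.
Connect Grafana to InfluxDB
Go back to the InfluxDB WebUI and create a token that allows Grafana to read data. Path:
InfluxDB
> Load Data
> API Tokens
> GENERATE API TOKEN
> Custom API Token

In the permissions row for the Telegraf bucket, tick Read. Click GENERATE.
Save the generated token; it is not shown again. (If you forget to save it, you can still look it up from the command line using the token generated during initial setup.)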

Go back to the Grafana WebUI and add the data source. Path:
Grafana
> Home
> Connections
> Data Sources
> Add data source
> InfluxDB
|
Query language
InfluxQL
HTTP
url: https://localhost:8086
Auth
Skip TLS Verify: enabled
Custom HTTP Headers
Header: Authorization | Value: Token + space + <Token>, for example: Token zA61yxXG_SoS-VeWYTXBi27Dg8RDcCiMKne2kyafXU7jRAFgNzreFVKhazrxTl7W00_CJjG-cKbEzcdqkmKz1w==
InfluxDB Details
Database: Telegraf
HTTP Method: GET
|
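The custom header value is the word Token, one space, then the token itself. A minimal sketch that builds the full header line so it can be reused outside Grafana; `influx_auth_header` is a hypothetical helper name:

```shell
# influx_auth_header TOKEN
# Prints the HTTP header line in the "Token <value>" form described above.
influx_auth_header() {
    printf 'Authorization: Token %s\n' "$1"
}
```

One possible use, assuming curl is installed and the token can read buckets, is `curl -sk -H "$(influx_auth_header "$TOKEN")" https://localhost:8086/api/v2/buckets`, where `-k` plays the same role as Skip TLS Verify and `$TOKEN` is a placeholder for the saved token.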

Deploy Telegraf
Install the required packages
|
# Skip this step if Telegraf runs on the same server as InfluxDB; the packages were installed earlier
apt-get update && apt-get install sudo wget gpg
|
Add the official InfluxData APT repository
For RedHat-family systems, see the official site: https://www.influxdata.com/downloads
|
# Ubuntu and Debian. Skip this step if Telegraf runs on the same server as InfluxDB; the repository was added earlier
wget -q https://repos.influxdata.com/influxdata-archive_compat.key
echo '393e8779c89ac8d958f81f942f9ad7fb82a25e133faddaf92e15b16e6ac9ce4c influxdata-archive_compat.key' | sha256sum -c && cat influxdata-archive_compat.key | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg > /dev/null
echo 'deb [signed-by=/etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg] https://repos.influxdata.com/debian stable main' | sudo tee /etc/apt/sources.list.d/influxdata.list
|
Install Telegraf
|
apt-get update && apt-get install telegraf
|
Set the Telegraf environment variables
Check whether $XDG_RUNTIME_DIR/bus exists
|
ls "$XDG_RUNTIME_DIR/bus"
# Example output when it exists: /run/user/0/bus
|
If it exists, append DBUS_SESSION_BUS_ADDRESS=$XDG_RUNTIME_DIR/bus to /etc/default/telegraf
|
echo -e "DBUS_SESSION_BUS_ADDRESS=\$XDG_RUNTIME_DIR/bus" >> /etc/default/telegraf
|
If it does not exist, use /dev/null in place of $XDG_RUNTIME_DIR/bus and append that to /etc/default/telegraf
|
echo "DBUS_SESSION_BUS_ADDRESS=/dev/null" >> /etc/default/telegraf
|
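Create the configuration file
Telegraf can load its configuration in two ways: from a local file or from a remote URL. Pick one.
Local configuration file
Edit /etc/telegraf/telegraf.conf directly.
Remote configuration file
On the first deployment there is no configuration yet, so create one first. Later deployments to other servers can skip this step.
Go back to the InfluxDB WebUI and create the remote configuration. Path:
InfluxDB
> Load Data
> Telegraf
> CREATE CONFIGURATION

For Bucket, select Telegraf, and pick any template. Click CONTINUE CONFIGURING.

I named the configuration Example. Delete the entire template content, enter your own configuration, and click SAVE AND TEST.
For a detailed description of the options, see the official GitHub documentation.
You can also use my configuration as a reference: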
|
[global_tags]
host_info = "$HOST_INFO"
[agent]
interval = "5s"
round_interval = true
metric_batch_size = 200
metric_buffer_limit = 2000
collection_jitter = "0s"
# collection_offset = "0s"
flush_interval = "6s"
flush_jitter = "0s"
precision = "0s"
debug = false
quiet = false
logformat = "text"
# logfile = "/var/log/telegraf/run.log"
logfile_rotation_interval = "24h"
logfile_rotation_max_size = "32MB"
logfile_rotation_max_archives = 7
log_with_timezone = "Asia/Shanghai"
# hostname = ""
omit_hostname = true
# snmp_translator = "netsnmp"
# statefile = ""
# skip_processors_after_aggregators = false
[[inputs.cpu]]
## Plugin configuration
percpu = false
totalcpu = true
collect_cpu_time = false
report_active = true
core_tags = false
## Modifier filters
fieldinclude = ["usage_system", "usage_user"]
[[inputs.mem]]
## Plugin configuration
# no configuration
## Modifier filters
fieldinclude = ["total", "used", "used_percent"]
[[inputs.swap]]
## Plugin configuration
# no configuration
## Modifier filters
fieldinclude = ["total", "used", "used_percent"]
[[inputs.disk]]
## Plugin configuration
# mount_points = ["/"]
ignore_fs = ["tmpfs", "devtmpfs"]
# ignore_mount_opts = []
## Modifier filters
tagexclude = ["fstype", "mode", "path"]
fieldinclude = ["total", "used", "used_percent"]
[[inputs.diskio]]
## Plugin configuration
devices = ["sd[a-z]", "vd[a-z]", "xvd[a-z]"]
skip_serial_number = true
# device_tags = ["ID_FS_TYPE", "ID_FS_USAGE"]
# name_templates = ["$ID_FS_LABEL","$DM_VG_NAME/$DM_LV_NAME"]
## Modifier filters
fieldinclude = ["reads", "writes"]
[[inputs.net]]
## Plugin configuration
interfaces = ["enX*", "ens*", "eth*"]
ignore_protocol_stats = true
## Modifier filters
fieldinclude = ["bytes_sent", "bytes_recv"]
[[inputs.netstat]]
## Plugin configuration
# no configuration
## Modifier filters
fieldinclude = ["tcp_established", "udp_socket"]
[[inputs.system]]
## Plugin configuration
# no configuration
## Modifier filters
fieldinclude = ["load1", "load15", "load5"]
[[inputs.system]]
interval = "60s"
## Plugin configuration
# no configuration
## Modifier filters
fieldinclude = ["uptime_format"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Haikou", company = "Telecom"}
## Plugin configuration
urls = ["124.225.43.220"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Haikou", company = "Unicom"}
## Plugin configuration
urls = ["153.0.226.35"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Haikou", company = "Mobile"}
## Plugin configuration
urls = ["111.29.29.219"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Guangzhou", company = "Telecom"}
## Plugin configuration
urls = ["183.47.126.35"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Guangzhou", company = "Unicom"}
## Plugin configuration
urls = ["157.148.58.29"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Guangzhou", company = "Mobile"}
## Plugin configuration
urls = ["120.233.18.250"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Shanghai", company = "Telecom"}
## Plugin configuration
urls = ["114.80.236.139"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Shanghai", company = "Unicom"}
## Plugin configuration
urls = ["210.22.97.1"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Shanghai", company = "Mobile"}
## Plugin configuration
urls = ["221.183.90.237"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Beijing", company = "Telecom"}
## Plugin configuration
urls = ["49.7.37.74"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Beijing", company = "Unicom"}
## Plugin configuration
urls = ["111.206.209.44"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.ping]]
interval = "30s"
tags = {city = "Beijing", company = "Mobile"}
## Plugin configuration
urls = ["112.34.111.194"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
## Modifier filters
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
[[inputs.internet_speed]]
interval = "6h"
## Plugin configuration
memory_saving_mode = false
cache = false
test_mode = "single"
## Modifier filters
tagexclude = ["server_id", "source", "test_mode"]
fieldinclude = ["download", "upload"]
[[processors.enum]]
namepass = ["swap"]
[[processors.enum.mapping]]
field = "total"
dest = "used_percent"
[processors.enum.mapping.value_mappings]
0 = -1.0
[[processors.enum]]
namepass = ["swap"]
[[processors.enum.mapping]]
field = "total"
dest = "status_code"
default = 0
[processors.enum.mapping.value_mappings]
0 = 1
[[processors.enum]]
namepass = ["ping"]
[[processors.enum.mapping]]
field = "result_code"
dest = "average_response_ms"
[processors.enum.mapping.value_mappings]
1 = 0.000
2 = 0.000
[[processors.split]]
namepass = ["swap"]
drop_original = true
[[processors.split.template]]
name = "swap"
tags = ["*"]
fields = ["total", "used", "used_percent"]
[[processors.split.template]]
name = "swap"
tags = ["*"]
fields = ["status_code"]
[[processors.filter]]
default = "pass"
[[processors.filter.rule]]
name = ["swap"]
fields = ["in", "out"]
action = "drop"
[[processors.filter.rule]]
name = ["system"]
fields = ["uptime"]
action = "drop"
[[outputs.influxdb_v2]]
urls = ["https://influxdb.example.com:8086"]
token = "$INFLUX_TOKEN"
organization = "Monitor"
bucket = "Telegraf"
# bucket_tag = ""
# exclude_bucket_tag = false
timeout = "5s"
# http_headers = {"X-Special-Header" = "Special-Value"}
# http_proxy = "http://corporate.proxy:3128"
# user_agent = "telegraf"
content_encoding = "gzip"
# influx_uint_support = false
# influx_omit_timestamp = false
# ping_timeout = "0s"
# read_idle_timeout = "0s"
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
insecure_skip_verify = false
|
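The twelve `[[inputs.ping]]` blocks in this configuration differ only in city, carrier, and target IP. A minimal sketch that generates such blocks from a list instead of writing them by hand; `ping_block` and `ping_blocks` are hypothetical helpers, and the output omits the `##` comment lines of the original blocks:

```shell
# ping_block CITY CARRIER IP
# Prints one [[inputs.ping]] block with the same settings as the config above.
ping_block() {
    cat << EOF
[[inputs.ping]]
interval = "30s"
tags = {city = "$1", company = "$2"}
urls = ["$3"]
method = "exec"
count = 1
ping_interval = 1.0
timeout = 1.0
size = 64
fieldinclude = ["average_response_ms", "percent_packet_loss", "result_code"]
EOF
}

# ping_blocks
# Reads "CITY CARRIER IP" lines from stdin and prints one block per line.
ping_blocks() {
    while read -r city carrier ip; do
        # Skip blank lines
        [ -n "$city" ] || continue
        ping_block "$city" "$carrier" "$ip"
    done
}
```

For example, `printf 'Haikou Telecom 124.225.43.220\nHaikou Unicom 153.0.226.35\n' | ping_blocks` prints the first two blocks, which can then be pasted into the configuration.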

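Copy and save both the generated Configuration API Token and the Configuration URL.

Edit the configuration again and delete the extra content that InfluxDB added automatically above [global_tags], keeping only your own. Click SAVE CHANGES.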

Load the configuration
Loading a local configuration
|
cat >> /etc/profile.d/telegraf.sh << EOF
# Telegraf environment
export HOST_INFO="Server description; any text that identifies this server, for example: xxCloud_HK-1C1G"
export INFLUX_TOKEN="<Configuration API Token>"
EOF
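# Load the environment variables
source /etc/profile
# Check that they took effect
echo -e "$HOST_INFO\n$INFLUX_TOKEN"
# Note: /etc/profile.d only affects login shells; a telegraf service started
# by systemd reads its variables from /etc/default/telegraf instead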
|
Start Telegraf
|
systemctl start telegraf
|
Loading a remote configuration
Set the Telegraf environment variables
|
cat >> /etc/default/telegraf << EOF
HOST_INFO="Server description; any text that identifies this server, for example: xxCloud_HK-1C1G"
TELEGRAF_OPTS="-config <Configuration URL>"
INFLUX_TOKEN="<Configuration API Token>"
EOF
|
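Because later deployments to other servers repeat this step with different values, the three lines can also be produced by a small function. This is a sketch only; `telegraf_env` is a hypothetical helper and its three arguments stand in for the placeholders in the block above:

```shell
# telegraf_env HOST_LABEL CONFIG_URL CONFIG_TOKEN
# Prints the three lines that the heredoc above appends to /etc/default/telegraf.
telegraf_env() {
    printf 'HOST_INFO="%s"\n' "$1"
    printf 'TELEGRAF_OPTS="-config %s"\n' "$2"
    printf 'INFLUX_TOKEN="%s"\n' "$3"
}
```

On each server, `telegraf_env "xxCloud_HK-1C1G" "<Configuration URL>" "<Configuration API Token>" >> /etc/default/telegraf` would then replace the heredoc, with the two placeholders filled in from the values saved earlier.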
In addition, using a remote configuration requires either emptying /etc/telegraf/telegraf.conf or modifying /lib/systemd/system/telegraf.service
|
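echo "" > /etc/telegraf/telegraf.conf
# Or
nano /lib/systemd/system/telegraf.service
# In the editor, change the ExecStart line to:
#   ExecStart=/usr/bin/telegraf $TELEGRAF_OPTS
# Then reload systemd so the edited unit file takes effect
systemctl daemon-reload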
|
Visualize the data
Path: Grafana
> Dashboards
> New
> New dashboard
> Add visualization
> Select data source
