[Repost] Playing with Fire: Container Memory Control with CGroup - Container Fundamentals Part 1



https://www.modb.pro/db/555818

 


 
  • When we talk about container memory, what are we actually talking about?

  • CGroup memory explained

    • Forcing memory reclaim: memory.force_empty

    • Memory threshold (watermark) notifications

    • Don't OOM Kill, just pause

    • memory.stat

    • memory.usage_in_bytes

    • memory.numa_stat

    • memory.failcnt

    • Main accounting scope

    • Kernel memory accounting scope

    • Memory reclaim

    • Page

    • anonymous pages

    • LRU list

    • LRU list groups

    • Basic concepts

    • CGroup overview

    • Status files

    • Control and event notification

  • Monitoring and metrics

    • container_memory_usage_bytes

    • container_memory_working_set_bytes

    • kubectl top

    • The relationship between container_memory_cache and container_memory_mapped_file

    • Which metric is actually relevant to OOM Kill

    • K8s container metrics

    • Commonly misunderstood K8s metrics

  • The impact of memory limits on IO performance

  • Understanding the kernel data structures

  • Tools for analyzing Page Cache usage

  • The POD's other killer: eviction

    • The eviction mechanism

    • Metrics the eviction algorithm consults

  • A field guide to container memory kills

  • References

A container's memory limit is a contradictory but important choice: set it too high and you waste resources; set it too low and services crash at random.

CGroup memory control is the core of container resource control. She is a strict warden, ruthlessly OOM Killing an application the moment it exceeds its limit. She also has a lenient side: when an application runs short of resources, she redistributes and releases Cache for it to use. Yet her internal accounting algorithm is intriguing, and observing and monitoring her state and behavior is a tangle of a thousand threads. This article is an attempt to analyze and untangle it.

🤫 If, after the warden metaphor above, you thought of a certain someone in your life, that has nothing to do with me.

Last year I wrote an article, 《把大象装入货柜里——Java容器内存拆解》 ("Fitting an Elephant into a Shipping Container: Dissecting Java Container Memory"), which briefly touched on the relationship between CGroup memory control and Java. Recently I ran into container memory problems at work again, so I wanted to understand this pit more deeply and systematically: the big pit that everyone doing containerization (moving to the cloud) must face, yet most wants to escape.

If the images in this article look unclear or links are broken, please view the original post.

When we talk about container memory, what are we actually talking about?

A few questions first:

  • Is a container's memory usage only the memory the process uses in user space? If so, what does that include?
  • If kernel-space memory is included, which kinds of kernel memory are there?
  • Is all of this memory hard-occupied, or can the kernel shrink and release some of it when necessary? When monitoring shows an individual metric approaching or even exceeding the limit, is an OOM Kill certain to happen?

This article tries to work through these questions and uncover some of CGroup's little-known tricks ✨🔮 to help analyze OOM Kill problems precisely, and along the way covers the pitfalls of estimating and monitoring container memory.

CGroup memory explained

Basic concepts

This section is a bit dry and TL;DR; feel free to skip it.

Page

To manage memory efficiently, the operating system manages it in units of Pages; a Page is typically 4 KiB:

The following images and some of the text come from https://lwn.net/Articles/443241/ and https://biriukov.dev/docs/page-cache/4-page-cache-eviction-and-page-reclaim/

[System memory map]

anonymous pages

anonymous pages are memory Pages with no backing file, such as an application's Heap and Stack space. The opposite concept is file pages, which include page cache, file mappings, tmpfs, and so on.

LRU list

Linux maintains an LRU list holding Page metadata, so pages can be traversed in order of most recent access:

[Active LRU]

To make reclaim more efficient, Linux uses two LRU lists:

[Two LRUs]

namely the active LRU and the inactive LRU.

Their roles:

If the kernel decides that pages will probably not be needed in the near future, it moves them from the active LRU to the inactive LRU. If some process then tries to access them, pages on the inactive LRU can quickly be moved back to the active LRU. The inactive LRU can be thought of as a kind of probation area for pages the system is considering reclaiming soon.

Pages flow between the active LRU and the inactive LRU like this:

[inactive/active LRU page cache lists]
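As a toy model (not kernel code), the promotion/demotion flow between the two lists can be sketched in a few lines:

```python
from collections import OrderedDict

class TwoListLRU:
    """Toy model of the kernel's active/inactive page lists.

    Pages enter the inactive list on first access; a second access
    promotes them to the active list. Reclaim evicts from the cold end
    of the inactive list, demoting active pages to refill it when it is
    empty. (Greatly simplified: the real kernel uses referenced/PG_active
    bits, per-cgroup per-node lists, and many heuristics.)
    """
    def __init__(self, capacity):
        self.capacity = capacity
        self.active = OrderedDict()     # most recently used at the end
        self.inactive = OrderedDict()

    def touch(self, page):
        if page in self.active:
            self.active.move_to_end(page)   # stay active, refresh order
        elif page in self.inactive:
            del self.inactive[page]
            self.active[page] = True        # second access: promote
        else:
            self.inactive[page] = True      # first access: inactive
            self._reclaim_if_needed()

    def _reclaim_if_needed(self):
        while len(self.active) + len(self.inactive) > self.capacity:
            if not self.inactive:
                cold, _ = self.active.popitem(last=False)  # demote coldest
                self.inactive[cold] = True
            else:
                self.inactive.popitem(last=False)          # evict coldest

lru = TwoListLRU(capacity=4)
for p in ["a", "b", "a", "c", "d", "e"]:
    lru.touch(p)
# "a" was touched twice, so it is active; "b", the coldest inactive
# page, was reclaimed to stay within capacity.
```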

LRU list groups

Of course, the real implementation is more complex than that: current kernels actually maintain five LRU lists.

  • anonymous pages have their own active LRU and inactive LRU. The reclaim policy for these pages is different: if the system runs without a swap partition, they are not reclaimed (which is the usual Kubernetes setup, though newer versions differ).

  • A list of unevictable pages (LRU_UNEVICTABLE), for example pages locked into memory.

  • The file-related lists.

The source for the 5 lists:

enum lru_list {
    LRU_INACTIVE_ANON = LRU_BASE,                        // anonymous pages
    LRU_ACTIVE_ANON = LRU_BASE + LRU_ACTIVE,
    LRU_INACTIVE_FILE = LRU_BASE + LRU_FILE,             // file pages
    LRU_ACTIVE_FILE = LRU_BASE + LRU_FILE + LRU_ACTIVE,
    LRU_UNEVICTABLE,                                     // unevictable pages
    NR_LRU_LISTS                                         // boundary marker
};

Before CGroups, one set of these lists existed per NUMA Node + Zone memory region, known as the "global LRU":

img

Figure: LRU under NUMA (image source)

With the cgroup mem controller, another level of complexity appeared, because memory usage must be controlled and tracked per cgroup. The cgroup mem controller needs to track more information for each page, associating every page with its owning cgroup mem controller; that information was added to struct page, the page's metadata.

With the cgroup mem controller, LRU lists exist at the level of one set per cgroup per NUMA node. Each memory cgroup has its own LRU lists dedicated to its memory reclaimer:

img

Figure: LRU under CGroup + NUMA (image source)

CGroup overview

The official kernel.org documentation:

Memory Resource Controller from kernel.org

That document targets kernel developers and researchers, so it is not an easy entry point for newcomers. Translating it wholesale here would drive readers away, but some of its core points are worth covering.

Main accounting scope

The limit for the main accounting is configured in the file memory.limit_in_bytes, which corresponds to spec.containers[].resources.limits.memory in k8s.

CGROUP mainly accounts for the following:

  • anon pages (RSS): file-unrelated memory, such as the heap
  • cache pages (Page Cache): file-related memory, such as File Cache / File Mapping

For reclaimable memory (such as file Cache or File Mapping), CGROUP records the Pages' access order in LRU lists and reclaims them FIFO (LRU)-style.

For those new to Linux, one point deserves emphasis: memory accounting is based on Pages actually accessed, not Pages allocated. Put plainly, only pages that have been touched are charged.

If you are a Java programmer, this is one reason to use -XX:+AlwaysPreTouch on top of -Xms xG -Xmx xG: get the memory charged early and surface problems early.
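This "charged on touch" behavior can be observed even without cgroups: an anonymous mapping becomes resident (and, inside a cgroup, charged) only as its pages are actually written. A Linux-only sketch that watches the process's own RSS via /proc/self/statm:

```python
import mmap
import os

PAGE = os.sysconf("SC_PAGE_SIZE")          # typically 4096

def resident_bytes():
    # Second field of /proc/self/statm is resident pages (Linux-specific).
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1]) * PAGE

size = 64 * 1024 * 1024                    # 64 MiB anonymous mapping
m = mmap.mmap(-1, size)                    # allocated, but no page touched yet
before = resident_bytes()

for off in range(0, size, PAGE):           # touch every page once
    m[off] = 1

after = resident_bytes()
# after - before is now roughly 64 MiB: only the touched pages became
# resident, and only touched pages get charged to the cgroup.
```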

Kernel memory accounting scope

Kernel-space memory can also be accounted and capped, although k8s does not seem to support this cgroup limit yet. Note that cache pages count toward the main accounting scope, not the kernel memory accounting scope.

The kernel memory limit is configured in the file memory.kmem.limit_in_bytes. It mainly covers:

  • stack pages
  • slab pages
  • sockets memory pressure
  • tcp memory pressure

Memory reclaim

Memory Resource Controller from kernel.org

Each cgroup maintains a per cgroup LRU which has the same structure as global VM. When a cgroup goes over its limit, we first try to reclaim memory from the cgroup so as to make space for the new pages that the cgroup has touched. If the reclaim is unsuccessful, an OOM routine is invoked to select and kill the bulkiest task in the cgroup. (See 10. OOM Control below.)

The reclaim algorithm has not been modified for cgroups, except that pages that are selected for reclaiming come from the per-cgroup LRU list.

In other words: when the memory a cgroup has touched exceeds its limit, the kernel first tries to reclaim from that cgroup to make room for new pages; only if reclaim fails is the OOM routine invoked to pick and kill the bulkiest task in the cgroup. In practice, what can mainly be reclaimed is the Page Cache.

Status files

There are quite a few files; only the important ones are covered here.

memory.stat

The memory.stat file contains statistics:

cache # of bytes of page cache memory.
rss # of bytes of anonymous and swap cache memory (includes transparent hugepages).
rss_huge # of bytes of anonymous transparent hugepages.
mapped_file # of bytes of mapped file (includes tmpfs/shmem)
pgpgin # of charging events to the memory cgroup. The charging event happens each time a page is accounted as either mapped anon page(RSS) or cache page(Page Cache) to the cgroup.
pgpgout # of uncharging events to the memory cgroup. The uncharging event happens each time a page is unaccounted from the cgroup.
swap # of bytes of swap usage
dirty # of bytes that are waiting to get written back to the disk.
writeback # of bytes of file/anon cache that are queued for syncing to disk.
inactive_anon # of bytes of anonymous and swap cache memory on inactive LRU list.
active_anon # of bytes of anonymous and swap cache memory on active LRU list.
inactive_file # of bytes of file-backed memory on inactive LRU list.
active_file # of bytes of file-backed memory on active LRU list.
unevictable # of bytes of memory that cannot be reclaimed (mlocked etc).

Note: the # in the table means "number of".

Note: only anonymous and swap cache memory is counted as part of the "rss" stat. This should not be confused with the true "resident set size" or the amount of physical memory used by the cgroup.

"rss + mapped_file" can be taken as the cgroup's resident set size.

(Note: file-backed pages and shmem can be shared with other cgroups. In that case, mapped_file only includes page cache whose owner is this cgroup; the owner is the first cgroup to touch the page.)

memory.usage_in_bytes

This is equivalent to RSS + CACHE (+ SWAP) from memory.stat, with some update delay.

memory.numa_stat

Provides the cgroup's per-NUMA-node memory allocation information. As the earlier figures showed, each NUMA node has its own set of LRU lists.

memory.failcnt

The cgroup exposes the memory.failcnt file, which shows how many times the CGroup limit has been hit. When a memory cgroup hits its limit, failcnt increments and memory reclaim is triggered.

Control and event notification

There are quite a few files; only the important ones are covered here.

Forcing memory reclaim: memory.force_empty

$ echo 0 > memory.force_empty

This reclaims all reclaimable memory; with Swap disabled as in K8s, that is mostly the file-related Page Cache.

Here is an interesting detail.

In Kernel v3.0:

memory.force_empty interface is provided to make cgroup's memory usage empty. You can use this interface only when the cgroup has no tasks. When writing anything to this

After Kernel v4.0:

memory.force_empty interface is provided to make cgroup's memory usage empty. When writing anything to this

That is, since kernel v4.0 the docs no longer require the cgroup to have no processes before a forced reclaim. This little episode came up when I recommended the feature to a colleague, who then found an old Redhat 6.0 document saying forced reclaim is not allowed while processes exist.

Memory threshold (watermark) notifications

CGroup has a mysterious hidden trick that k8s doesn't seem to use: you can set up notifications at multiple memory-usage watermarks, via the file cgroup.event_control. I won't go into detail here; interested readers can see:

https://www.kernel.org/doc/html/v5.4/admin-guide/cgroup-v1/memory.html#memory-thresholds

What is this hidden trick good for? Think about it: if we need to diagnose a container's OOM problem, taking a core dump or java heap dump just before the limit is hit and OOM strikes would let us analyze the process's memory usage at that moment, wouldn't it?

Don't OOM Kill, just pause

This is another of CGroup's mysterious hidden tricks that k8s doesn't seem to use. What is it good for? Think about it: to diagnose a container's OOM problem, the container is paused just as OOM is about to happen. At that point, we can:

  • core dump the process
  • or configure a larger CGroup limit so the jvm can keep running, then take a java heap dump

Wouldn't that let us analyze the process's memory usage at that moment?

If interested, see:

https://www.kernel.org/doc/html/v5.4/admin-guide/cgroup-v1/memory.html#oom-control

Related file: memory.oom_control

Monitoring and metrics

Anyone with real ops experience knows: the hardest part of ops is monitoring, and the hardest part of monitoring is choosing and understanding metrics, along with the tangled relationships between them.

K8s container metrics

K8s's built-in container metrics come from the cAdvisor module running inside the kubelet.

cAdvisor's official metric documentation is here: Monitoring cAdvisor with Prometheus. That official document is written far too briefly; too brief to be much use for troubleshooting…

Fortunately, there are experts in the wild: Out-of-memory (OOM) in Kubernetes – Part 3: Memory metrics sources and tools to collect them:

| cAdvisor metric | Source OS metric(s) | Explanation of source OS metric(s) | What does the metric mean? |
| --- | --- | --- | --- |
| container_memory_cache | total_cache value in the memory.stat file inside the container's cgroup directory | number of bytes of page cache memory | Size of memory used by the cache that's automatically populated when reading/writing files |
| container_memory_rss | total_rss value in the memory.stat file inside the container's cgroup directory | number of bytes of anonymous and swap cache memory (includes transparent hugepages). […] "This should not be confused with the true 'resident set size' or the amount of physical memory used by the cgroup. 'rss + mapped_file' will give you resident set size of cgroup" | Size of memory not used for mapping files from the disk |
| container_memory_mapped_file | total_mapped_file value in the memory.stat file inside the container's cgroup directory | number of bytes of mapped file (includes tmpfs/shmem) | Size of memory that's used for mapping files |
| container_memory_swap | total_swap value in the memory.stat file inside the container's cgroup directory | number of bytes of swap usage | |
| container_memory_failcnt | The value inside the memory.failcnt file | shows the number of times that a usage counter hit its limit | |
| container_memory_usage_bytes | The value inside the memory.usage_in_bytes file | doesn't show 'exact' value of memory (and swap) usage, it's a fuzz value for efficient access. (Of course, when necessary, it's synchronized.) If you want to know more exact memory usage, you should use RSS+CACHE(+SWAP) value in memory.stat | Size of overall memory used, regardless if it's for mapping from disk or just allocating |
| container_memory_max_usage_bytes | The value inside the memory.max_usage_in_bytes file | max memory usage recorded | |
| container_memory_working_set_bytes | Deduct inactive_file inside the memory.stat file from the value inside the memory.usage_in_bytes file; if the result is negative then use 0 | inactive_file: number of bytes of file-backed memory on inactive LRU list; usage_in_bytes: doesn't show 'exact' value of memory (and swap) usage, it's a fuzz value for efficient access | A heuristic for the minimum size of memory required for the app to work; the amount of memory in-use that cannot be freed under memory pressure […] It includes all anonymous (non-file-backed) memory since Kubernetes does not support swap. The metric typically also includes some cached (file-backed) memory, because the host OS cannot always reclaim such pages |

Table: cAdvisor metrics and their sources

If the descriptions above still don't satisfy your curiosity, there is more here:

  • https://jpetazzo.github.io/2013/10/08/docker-containers-metrics/

Commonly misunderstood K8s metrics

container_memory_usage_bytes

A Deep Dive into Kubernetes Metrics — Part 3 Container Resource Metrics

You might think that memory utilization is easily tracked with container_memory_usage_bytes, however, this metric also includes cached (think filesystem cache) items that can be evicted under memory pressure. The better metric is container_memory_working_set_bytes as this is what the OOM killer is watching for.

container_memory_working_set_bytes

Memory usage discrepancy: cgroup memory.usage_in_bytes vs. RSS inside docker container

container_memory_working_set_bytes = container_memory_usage_bytes - total_inactive_file (from /sys/fs/cgroup/memory/memory.stat); this is calculated in cAdvisor and is <= container_memory_usage_bytes
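cAdvisor's working-set computation can be sketched as a one-liner (the example values are illustrative, not from a real container):

```python
def working_set_bytes(usage_in_bytes, total_inactive_file):
    """container_memory_working_set_bytes as cAdvisor derives it:
    usage minus inactive file pages, floored at zero."""
    return max(0, usage_in_bytes - total_inactive_file)

# e.g. 512 MiB of usage, of which 128 MiB is inactive page cache:
ws = working_set_bytes(512 * 2**20, 128 * 2**20)   # 384 MiB working set
```

The floor at zero matters because usage_in_bytes is a fuzzed value, so the subtraction can momentarily go negative.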

kubectl top

Memory usage discrepancy: cgroup memory.usage_in_bytes vs. RSS inside docker container

when you use the kubectl top pods command, you get the value of the container_memory_working_set_bytes metric, not container_memory_usage_bytes.

The relationship between container_memory_cache and container_memory_mapped_file

Out-of-memory (OOM) in Kubernetes – Part 3: Memory metrics sources and tools to collect them:

Notice the "page cache" term in the definition of the container_memory_cache metric. In Linux the page cache is "used to cache the content of files as IO is performed upon them" as per the "Linux Kernel Programming" book by Kaiwan N Billimoria (author's note: I have read this book; it is the most approachable kernel book I have come across recently). You might be tempted as such to think that container_memory_mapped_file pretty much refers to the same thing, but that's actually just a subset: e.g. a file can be mapped in memory (whole or parts of it) or it can be read in blocks, but the page cache will include data coming from either way of accessing that file. See https://stackoverflow.com/questions/258091/when-should-i-use-mmap-for-file-access for more info.

Which metric is actually relevant to OOM Kill

Memory usage discrepancy: cgroup memory.usage_in_bytes vs. RSS inside docker container

It is also worth to mention that when the value of container_memory_usage_bytes reaches the limits, your pod will NOT get oom-killed. BUT if container_memory_working_set_bytes or container_memory_rss reaches the limits, the pod will be killed.

The impact of memory limits on IO performance

This section mainly draws on: Don't Let Linux Control Groups Run Uncontrolled

Page cache usage by apps is counted towards a cgroup’s memory limit, and anonymous memory usage can steal page cache for the same cgroup

That is, an application's page cache usage counts toward the cgroup's memory limit, and anonymous memory usage can steal page cache from the same cgroup. So when a container is under memory pressure, its file cache shrinks, repeated-read performance drops, and writeback performance may drop too. If your container does a lot of file IO, configure its memory limit with care.

Understanding the kernel data structures

Below is a recent draft diagram of the related kernel data structures. Treat it as a reference only; many relationships and details remain untidied…

Tools for analyzing Page Cache usage

To find the files occupying the Page Cache:

  • vmtouch - the Virtual Memory Toucher
  • pcstat

The POD's other killer: eviction

Most of this section comes from: Out-of-memory (OOM) in Kubernetes – Part 4: Pod evictions, OOM scenarios and flows leading to them

Out-of-memory (OOM) in Kubernetes – Part 4: Pod evictions, OOM scenarios and flows leading to them

But when does the Kubelet decide to evict pods? “Low memory situation” is a rather fuzzy concept: we’ve seen that the OOM killer acts at the system level (in OOM killer) when memory is critically low (essentially almost nothing left), so it follows that pod evictions should happen before that. But when exactly?

As per the official Kubernetes docs “‘Allocatable’ on a Kubernetes node is defined as the amount of compute resources that are available for pods“. This feature is enabled by default via the --enforce-node-allocatable=pods
and once the memory usage for the pods crosses this value, the Kubelet triggers the eviction mechanism: “Enforcement is performed by evicting pods whenever the overall usage across all pods exceeds ‘Allocatable’” as documented here.

We can easily see the value by checking the output of kubectl describe node
. Here’s how the section of interest looks like for one of nodes of the Kubernetes cluster used throughout this article (a 7-GiB Azure DS2_v2 node):

Memory available to Pods on a Node:

img

Figure: worker node memory categories

img

Figure: distribution of a worker node's allocatable memory

The eviction mechanism

How does all we’ve seen in the previous section reflect in the eviction mechanism? Pods are allowed to use memory as long as the overall usage across all pods is less than the allocatable memory value. Once this threshold is exceeded, you’re at the mercy of the Kubelet – as every 10s it checks the memory usage against the defined thresholds. Should the Kubelet decide an eviction is necessary the pods are sorted based on an internal algorithm described in Pod selection for kubelet eviction – which includes QoS class and individual memory usage as factors – and evicts the first one. The Kubelet continues evicting pods as long as thresholds are still being breached. Should one or more pods allocate memory so rapidly that the Kubelet doesn’t get a chance to spot it inside its 10s window, and the overall pod memory usage attempts to grow over the sum of allocatable memory plus the hard eviction threshold value, then the kernel’s OOM killer will step in and kill one or more processes inside the pods’ containers, as we’ve seen at length in the Cgroups and the OOM killer section.

After all that, one simple, direct takeaway: the more a POD's actual memory usage exceeds its memory request, the more likely it is to be selected for eviction.
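The documented ranking ("Pod selection for kubelet eviction") can be sketched as a sort key. This is a simplified model, not kubelet code; the dict fields below are illustrative, and QoS class is folded into the usage-vs-request and priority factors:

```python
def eviction_rank(pods):
    """Order pods roughly the way kubelet's documented eviction
    ranking does: pods whose memory usage exceeds their request go
    first, then lower-priority pods, then by how far usage exceeds
    the request. Each pod is a dict with 'name', 'usage', 'request'
    and 'priority' (all illustrative field names)."""
    return sorted(
        pods,
        key=lambda p: (
            p["usage"] <= p["request"],    # exceeders first (False < True)
            p["priority"],                 # lower priority evicted earlier
            -(p["usage"] - p["request"]),  # biggest overage first
        ),
    )

pods = [
    {"name": "a", "usage": 300, "request": 200, "priority": 0},
    {"name": "b", "usage": 150, "request": 200, "priority": 0},
    {"name": "c", "usage": 500, "request": 200, "priority": 0},
]
order = eviction_rank(pods)
# "c" (300 over its request) ranks ahead of "a" (100 over);
# "b", which is under its request, is evicted last.
```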

Metrics the eviction algorithm consults

We’re talking about memory usage a lot, but what exactly do we mean by that? Take the “allocatable” amount of memory that pods can use overall: for our DS2_v2 AKS nodes that is 4565 MiB. Is it when the RAM of a node has 4565 MiB filled with data for the pods means it’s right next to the threshold of starting evictions? In other words, what’s the metric used?

Back in the Metrics values section we’ve seen there are quite a few metrics that track memory usage per type of object. Take the container object, for which cAdvisor will return half a dozen metrics such as container_memory_rss, container_memory_usage_bytes, container_memory_working_set_bytes, etc.

So when the Kubelet looks at the eviction thresholds, what memory metric is it actually comparing against? The official Kubernetes documentation provides the answer to this: it’s the working set. There’s even a small script included there that shows the computations for deciding evictions at the node level. Essentially it computes the working_set metric for the node as the root memory cgroup’s memory.usage_in_bytes minus the inactive_file field in the root memory cgroup’s memory.stat file.

And that is the exact formula we’ve come across in the past when we’ve looked at how are the node metrics computed through the Resource Metrics API (see the last row of the cAdvisor metrics table). Which is good news, as we’ll be able to plot the same exact metrics used in Kubelet’s eviction decisions on our charts in the following sections, by choosing the metric source which currently gives almost all of the memory metrics: cAdvisor.

As a side-note, if you want to see that the Kubelet’s code reflects what was said above – both for the “allocatable” threshold as well as the --eviction-hard
one – have a look at What is the memory metric that the Kubelet is using when making eviction decisions?.

A field guide to container memory kills

Here is a flowchart of container-memory event handling. In one sentence: be careful when playing with fire, or one of these kill methods will find you.

from Out-of-memory (OOM) in Kubernetes – Part 4: Pod evictions, OOM scenarios and flows leading to them

img

Useful references not mentioned above

  • Out-of-memory (OOM) in Kubernetes – Part 1: Intro and topics discussed
  • SRE deep dive into Linux Page Cache
  • Dropping cache didn’t drop cache
  • Memory Measurements Complexities and Considerations - Part 1
  • Overcoming challenges with Linux cgroups memory accounting

柠檬小欧 2021-08-31 20:06420 Jmeter 作为当前非常受欢迎的接口测试和性能测试的工具,在企业中得到非常广泛的使用,而 Redis 作为缓存数据库,也在企业中得到普遍使用,那如何使用 jmeter 来测试 Redis 数据库呢?今天我们就来讲一讲怎么使用 jmeter 来调用