2023-10-15

Linux Memory Management

Memory management is described and explained in detail next.

why we need memory management

For example, if an operating system had no memory management, applications would access physical memory directly whenever they needed memory. One application could then occupy or overwrite a memory area that other applications need, leaving them unable to use it and crashing the system. In addition, if several applications require large amounts of memory and that memory is not allocated and freed properly, the result can be memory fragmentation, which may reduce system performance. Memory management is therefore a very important part of the operating system.

thread management

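#define _GNU_SOURCE /* must precede every #include: enables CPU_SET() and pthread_setaffinity_np() */
#include <stdio.h>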
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <sys/syscall.h>
#include <sched.h>
#include <pthread.h>

/* Pin the given thread to a single CPU core. Returns 0 or an errno value. */
static int set_affinity(pthread_t thread, int cpu_core)
{
    cpu_set_t cpu_set;
    CPU_ZERO(&cpu_set);
    CPU_SET(cpu_core, &cpu_set);

    return pthread_setaffinity_np(thread, sizeof(cpu_set_t), &cpu_set);
}

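/* Create a thread with an explicit scheduling policy and priority, then pin it to cpu_core. */
int create_rt_thread(pthread_t *pth, void *(*func)(void *), void *arg, int policy, int prio, int cpu_core)
{
	struct sched_param schedp;
	pthread_attr_t attr;
	int ret;

	ret = pthread_attr_init(&attr);
	if (ret) {
		fprintf(stderr, "pthread_attr_init: %s\n", strerror(ret));
		return -1;
	}
	memset(&schedp, 0, sizeof(schedp));

	/* Take policy and priority from attr instead of inheriting them from the creator. */
	ret = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
	if (ret) {
		fprintf(stderr, "pthread_attr_setinheritsched: %s\n", strerror(ret));
		goto out;
	}

	ret = pthread_attr_setschedpolicy(&attr, policy);
	if (ret) {
		fprintf(stderr, "pthread_attr_setschedpolicy: %s\n", strerror(ret));
		goto out;
	}

	schedp.sched_priority = prio;
	ret = pthread_attr_setschedparam(&attr, &schedp);
	if (ret) {
		fprintf(stderr, "pthread_attr_setschedparam: %s\n", strerror(ret));
		goto out;
	}

	ret = pthread_create(pth, &attr, func, arg);
	if (ret) {
		fprintf(stderr, "pthread_create: %s\n", strerror(ret));
		goto out;
	}

	ret = set_affinity(*pth, cpu_core);
	if (ret) {
		/* The thread is already running, so only warn. */
		fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(ret));
		ret = 0;
	}

out:
	pthread_attr_destroy(&attr);
	return ret ? -1 : 0;
}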

/* Pin the calling thread to a single CPU core. */
void set_affinity_public(int cpu_core)
{
    /* sched_setaffinity() takes a kernel thread ID, not a pthread_t. */
    pid_t thread_id = syscall(SYS_gettid);
    cpu_set_t cpu_set;

    CPU_ZERO(&cpu_set);
    CPU_SET(cpu_core, &cpu_set);

    if (sched_setaffinity(thread_id, sizeof(cpu_set_t), &cpu_set) == 0) {
        printf("thread %d pinned to core %d\n", (int)thread_id, cpu_core);
    } else {
        fprintf(stderr, "thread %d: sched_setaffinity to core %d failed: %s\n",
                (int)thread_id, cpu_core, strerror(errno));
    }
}
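
create_rt_thread passes prio straight to pthread_attr_setschedparam, which rejects values outside the range allowed for the policy. A minimal sketch of a range check follows; clamp_rt_priority is a hypothetical helper name, not part of the code above, and it relies only on the POSIX calls sched_get_priority_min and sched_get_priority_max.

```c
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper (not part of the notes above): clamp prio into the
 * range the system reports as valid for the given scheduling policy.
 * Returns the clamped priority, or -1 if the policy is not recognized. */
int clamp_rt_priority(int policy, int prio)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo == -1 || hi == -1) {
        perror("sched_get_priority_min/max");
        return -1;
    }
    if (prio < lo)
        return lo;
    if (prio > hi)
        return hi;
    return prio;
}
```

A caller could pass clamp_rt_priority(SCHED_FIFO, prio) as the prio argument of create_rt_thread. On Linux the SCHED_FIFO and SCHED_RR range is 1 to 99, and creating such threads usually requires root, CAP_SYS_NICE, or a suitable RLIMIT_RTPRIO.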

thread task table

pipeline timing test (8-stage pipeline)

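```C
/* fact and fact0..fact7 are assumed to be declared elsewhere and initialized to 1. */

/* Loop 1: a single accumulator, so every multiply waits for the previous one. */
for (int i = 1; i < 1000000000; i++)   /* start at 1 so the product is not zeroed */
{
    fact *= i;
}

/* Loop 2: unrolled by 8 with 8 independent accumulators. */
for (int i = 1; i < 1000000000; i += 8)
{
    fact0 *= i;
    fact1 *= i + 1;
    fact2 *= i + 2;
    fact3 *= i + 3;
    fact4 *= i + 4;
    fact5 *= i + 5;
    fact6 *= i + 6;
    fact7 *= i + 7;
}
```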

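Measured data: the first loop takes about 5 s, the second about 1.3 s.
Cause analysis: even with pipelining, performance does not necessarily scale linearly for every kind of computation and loop, because pipelining itself has limits.
In this example, the first loop is a simple accumulating multiplication in which every multiply has to wait for the result of the previous one. The second loop performs 8 multiplications per iteration on 8 separate accumulators. The pipeline can overlap these 8 independent multiplications, but the loop is still limited by data dependencies and other factors, which is why the measured speedup is about 3.8x rather than 8x.
Factors that may have limited the gain in the second loop:
- Data dependencies: each of the 8 accumulators still depends on its own value from the previous iteration, so some multiplications have to wait for earlier results, which limits pipeline efficiency.
- Resource contention: operations that are in flight at the same time compete for the same resources (registers, for example), which lowers pipeline efficiency.
- Other factors: branch prediction, the data access pattern, and similar effects can also influence performance.
Overlapping several operations in the pipeline therefore does not give every kind of computation a linear speedup, and for some tasks pipelining cannot show its full advantage. Getting the most out of pipeline parallelism requires a closer look at the characteristics of the processor, the characteristics of the task, and the optimization of the code.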
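
The comparison above can be reproduced with a small self-contained benchmark. This is a sketch under assumptions: the function names are hypothetical, double accumulators and a constant factor are used in place of the loop index so the product does not overflow, and timings depend heavily on the CPU and compiler flags.

```c
#include <time.h>

/* One accumulator: every multiply depends on the previous result. */
double product_one_chain(long n, double factor)
{
    double acc = 1.0;
    for (long i = 0; i < n; i++)
        acc *= factor;
    return acc;
}

/* Eight accumulators: eight independent dependency chains per iteration.
 * n is assumed to be a multiple of 8. */
double product_eight_chains(long n, double factor)
{
    double a0 = 1.0, a1 = 1.0, a2 = 1.0, a3 = 1.0;
    double a4 = 1.0, a5 = 1.0, a6 = 1.0, a7 = 1.0;

    for (long i = 0; i < n; i += 8) {
        a0 *= factor;
        a1 *= factor;
        a2 *= factor;
        a3 *= factor;
        a4 *= factor;
        a5 *= factor;
        a6 *= factor;
        a7 *= factor;
    }
    /* Combine the partial products once, after the loop. */
    return ((a0 * a1) * (a2 * a3)) * ((a4 * a5) * (a6 * a7));
}

/* CPU seconds used by fn(n, factor). The result is stored in *out so the
 * call cannot be discarded as unused. */
double time_product(double (*fn)(long, double), long n, double factor, double *out)
{
    clock_t start = clock();
    *out = fn(n, factor);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_product once with product_one_chain and once with product_eight_chains for the same n (for example 1000000000 with a factor very close to 1.0) gives the two timings to compare. Both functions compute the same product up to floating-point rounding.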

Linux Command

The mpstat -P ALL 1 command displays performance statistics for every CPU, including softirqs. Its output is explained below:

Linux 5.4.0-81-generic (hostname)  01/31/2024  _x86_64_ (4 CPU)

09:35:33 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
09:35:34 PM  all    4.25    0.00    4.25    0.00    0.00    0.00    0.00    0.00    0.00   91.50
09:35:34 PM    0    2.97    0.00    2.97    0.00    0.00    0.00    0.00    0.00    0.00   94.06
09:35:34 PM    1    4.97    0.00    4.97    0.00    0.00    0.00    0.00    0.00    0.00   90.06
09:35:34 PM    2    5.97    0.00    5.97    0.00    0.00    0.00    0.00    0.00    0.00   88.06
09:35:34 PM    3    2.97    0.00    2.97    0.00    0.00    0.00    0.00    0.00    0.00   94.06

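The meaning of each line is as follows:

Linux 5.4.0-81-generic (hostname): the Linux kernel version and the host name.
01/31/2024: the current date.
x86_64 (4 CPU): the system architecture and the number of CPUs.
09:35:33 PM: the timestamp of the sample.
CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle: the column headers for the CPU utilization statistics.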
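- %usr: CPU utilization of user-space programs.
- %nice: CPU utilization of user-space programs that run with a nice priority.
- %sys: CPU utilization of kernel-space code.
- %iowait: percentage of time the CPU was idle while waiting for I/O to complete.
- %irq: CPU utilization spent handling hardware interrupts.
- %soft: CPU utilization spent handling softirqs.
- %steal: CPU time taken away by the virtualization environment, that is, time the virtual CPU waited while the hypervisor served another virtual processor.
- %guest: CPU time spent running a virtual machine.
- %gnice: CPU time spent running a niced guest (a virtual machine running with a nice priority).
- %idle: percentage of time the CPU was idle.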
all 4.25 0.00 4.25 0.00 0.00 0.00 0.00 0.00 0.00 91.50: the average over all CPUs. Here user-space programs used 4.25%, kernel-space code used 4.25%, and 91.50% of the time was idle.
0 2.97 0.00 2.97 0.00 0.00 0.00 0.00 0.00 0.00 94.06: the statistics for the first CPU, including user time, system time, and idle time.
1 4.97 0.00 4.97 0.00 0.00 0.00 0.00 0.00 0.00 90.06: the same, for the second CPU.
2 5.97 0.00 5.97 0.00 0.00 0.00 0.00 0.00 0.00 88.06: the same, for the third CPU.
3 2.97 0.00 2.97 0.00 0.00 0.00 0.00 0.00 0.00 94.06: the same, for the fourth CPU.

The output of this command provides detailed per-CPU statistics and can be used to monitor CPU utilization across the system.
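
The percentages come from the cumulative tick counters on the cpu lines of /proc/stat: a tool samples them twice and divides each counter's increase by the increase of the total. The sketch below illustrates that calculation for %usr and %idle. The struct and function names are hypothetical, it assumes a kernel that reports all ten fields, and it is not mpstat's actual source.

```c
#include <stdio.h>
#include <string.h>

/* Cumulative tick counters from one "cpu" line of /proc/stat. */
struct cpu_ticks {
    unsigned long long user, nice, system, idle, iowait;
    unsigned long long irq, softirq, steal, guest, guest_nice;
};

/* Parse a line such as "cpu0 100 0 50 800 0 0 0 0 0 0".
 * Returns 0 on success, -1 if the line is not a complete cpu line. */
int parse_cpu_line(const char *line, struct cpu_ticks *t)
{
    if (strncmp(line, "cpu", 3) != 0)
        return -1;
    int n = sscanf(line, "%*s %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu",
                   &t->user, &t->nice, &t->system, &t->idle, &t->iowait,
                   &t->irq, &t->softirq, &t->steal, &t->guest, &t->guest_nice);
    return n == 10 ? 0 : -1;
}

/* Total ticks. guest and guest_nice are not added because the kernel
 * already counts them inside user and nice. */
static unsigned long long total_ticks(const struct cpu_ticks *t)
{
    return t->user + t->nice + t->system + t->idle + t->iowait +
           t->irq + t->softirq + t->steal;
}

/* %usr between two samples: user time excluding guest time. */
double usr_percent(const struct cpu_ticks *prev, const struct cpu_ticks *cur)
{
    unsigned long long total = total_ticks(cur) - total_ticks(prev);
    unsigned long long usr = (cur->user - cur->guest) - (prev->user - prev->guest);
    return total ? 100.0 * (double)usr / (double)total : 0.0;
}

/* %idle between two samples. */
double idle_percent(const struct cpu_ticks *prev, const struct cpu_ticks *cur)
{
    unsigned long long total = total_ticks(cur) - total_ticks(prev);
    return total ? 100.0 * (double)(cur->idle - prev->idle) / (double)total : 0.0;
}
```

In practice the two samples would be read from /proc/stat one second apart, matching the interval argument 1 in mpstat -P ALL 1.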

perf command

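The output of perf list | grep irq lists events related to interrupts (irq). For softirqs, the events of interest are irq:softirq_entry, irq:softirq_exit, and irq:softirq_raise.

irq:softirq_entry:
- Marks the start of softirq execution. The event is recorded when the kernel begins running a softirq handler.
irq:softirq_exit:
- Marks the end of softirq execution. The event is recorded when the softirq handler has finished.
irq:softirq_raise:
- Marks a softirq being raised. The event is recorded when a softirq is marked as pending.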

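Softirqs can be analyzed with the following steps:

1. Capture softirq events with `perf record`:

sudo perf record -e irq:softirq_entry -e irq:softirq_exit -e irq:softirq_raise -a

This records softirq-related events system-wide until it is stopped with Ctrl-C.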

2. View the analysis report with `perf report`:

sudo perf report

The report shows the recorded softirq events, for example how often each one occurred.

3. Dump the data as text with `perf script`:

sudo perf script > perf_script.txt

This writes the detailed trace data to perf_script.txt, where you can search for softirq-related entries.

4. Watch live data with `perf top`:

sudo perf top

perf top shows performance data in real time and can be used to keep an eye on softirq activity in the system.

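Notes:

Adjust the commands and options to your system and kernel version to make sure they are compatible.
What a given softirq means and how much it matters can depend on the system configuration and on the applications running. You may need to look further into how the system handles softirqs and how they are raised.

These steps should help you collect and analyze performance data about softirqs.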
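
Alongside the perf events, the kernel keeps cumulative per-CPU softirq counters in /proc/softirqs, with one row per softirq type and one column per CPU. A minimal sketch for summing one row follows; sum_softirq_row is a hypothetical helper name, and the row layout is assumed to be a padded name, a colon, then one counter per CPU.

```c
#include <stdlib.h>
#include <string.h>

/* Sum the per-CPU counters of one /proc/softirqs row, for example
 * "      NET_RX:        120         35".
 * On success the softirq name is copied into name and the sum is returned.
 * Returns -1 if the line has no "NAME:" prefix (such as the CPU header row)
 * or if name is too small. */
long long sum_softirq_row(const char *line, char *name, size_t name_len)
{
    const char *colon = strchr(line, ':');
    if (colon == NULL || name_len == 0)
        return -1;

    /* Skip the padding in front of the name. */
    const char *start = line;
    while (*start == ' ' || *start == '\t')
        start++;

    size_t len = (size_t)(colon - start);
    if (len == 0 || len >= name_len)
        return -1;
    memcpy(name, start, len);
    name[len] = '\0';

    /* Add up every number that follows the colon. */
    long long total = 0;
    const char *p = colon + 1;
    for (;;) {
        char *end;
        unsigned long long value = strtoull(p, &end, 10);
        if (end == p)
            break; /* no more numbers on this line */
        total += (long long)value;
        p = end;
    }
    return total;
}
```

Reading the file twice and subtracting the sums gives the number of softirqs of each type raised during the interval, which is a quick cross-check for the counts seen in perf.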

Performance analysis commands

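trace -p pid // analyze a process's call relationships and overhead
irqtop // live, top-style view of interrupt counts
lsirq // list all IRQs
mpstat -P ALL 1 // print all statistics once per second, including interrupts and softirqs
cat /proc/interrupts
cat /proc/softirqs
ls /proc/irq // /proc/irq is a directory with one subdirectory per IRQ
cat /proc/irq/90/smp_affinity
pidstat -t -p 416 // show the status of PID 416, including its threads
perf list | grep irq // list all irq events available for analysis; perf record -h shows the help for perf record
ps -T -p 416 // show the threads of PID 416 (see ps --help all)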
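
The value printed by cat /proc/irq/90/smp_affinity is a hexadecimal CPU bitmask in which bit N stands for CPU N. On machines with many CPUs the mask is split into comma-separated 32-bit groups, most significant group first. A small sketch for testing one bit follows; affinity_mask_has_cpu is a hypothetical helper name.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if bit `cpu` is set in an smp_affinity-style hex mask such as
 * "f" or "00000001,00000000", 0 if it is clear, and -1 if the mask contains
 * an unexpected character or cpu is negative. */
int affinity_mask_has_cpu(const char *mask, int cpu)
{
    if (cpu < 0)
        return -1;

    int digit_index = 0; /* 0 is the least significant hex digit */
    int result = 0;
    size_t i = strlen(mask);

    /* Walk the string from its end so that digit_index counts upward. */
    while (i > 0) {
        unsigned char c = (unsigned char)mask[--i];

        if (c == ',' || c == '\n')
            continue; /* group separator or trailing newline */
        if (!isxdigit(c))
            return -1;

        int value = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
        /* Each hex digit covers 4 CPUs. */
        if (digit_index == cpu / 4 && ((value >> (cpu % 4)) & 1))
            result = 1;
        digit_index++;
    }
    return result;
}
```

For example, a mask of f means the IRQ may be delivered to CPUs 0 through 3, and a mask of 4 restricts it to CPU 2.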