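## Preface

Last year I handled a security incident in which someone deleted a core ConfigMap in production and took the service down. During the post-mortem we could not determine who ran the delete or when, because audit logging was not enabled on the apiserver.

That incident taught me how much auditing matters. The audit log is the foundation of Kubernetes security operations: it records who did what and when, and it is the key evidence for after-the-fact tracing and security analysis.

Today we'll go into the source code and look at how kube-apiserver implements auditing.

## What is auditing?

Kubernetes auditing provides a chronological set of security-relevant records that capture:

- each user's operations against the API
- the behavior of applications that use the Kubernetes API
- the activity of the control plane itself

### Questions auditing can answer

| Question | Audit log fields |
|---|---|
| What happened? | `verb`, `objectRef.resource`, `objectRef.name` |
| When did it happen? | `requestReceivedTimestamp`, `stageTimestamp` |
| Who initiated it? | `user.username`, `user.groups` |
| Which object was affected? | `objectRef.namespace`, `objectRef.name`, `objectRef.resource` |
| Where was it initiated from? | `sourceIPs` |
| What was the outcome? | `responseStatus`, `stage` |

### Architecture overview

```text
                         User request
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                     HTTP Handler Chain                      │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                   WithAudit Filter                    │  │
│  │                                                       │  │
│  │  1. Create the audit event                            │  │
│  │  2. Evaluate the audit level from the policy          │  │
│  │  3. Record request received (StageRequestReceived)    │  │
│  │  4. Wrap the ResponseWriter                           │  │
│  │  5. Record response started (StageResponseStarted)    │  │
│  │  6. Record response complete (StageResponseComplete)  │  │
│  └───────────────────────────────────────────────────────┘  │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                        Audit Backend                        │
│                                                             │
│    ┌─────────────────────┐       ┌─────────────────────┐    │
│    │     Log Backend     │       │   Webhook Backend   │    │
│    │                     │       │                     │    │
│    │  --audit-log-path   │       │  --audit-webhook-   │    │
│    │                     │       │    config-file      │    │
│    └──────────┬──────────┘       └──────────┬──────────┘    │
│               └──────────────┬──────────────┘               │
│                              ▼                              │
│                        Union Backend                        │
└─────────────────────────────────────────────────────────────┘
```

## The four audit levels

The audit policy defines what gets recorded. There are four levels:

| Level | What is recorded | Typical use |
|---|---|---|
| `None` | Nothing | High-frequency, non-sensitive operations such as health checks |
| `Metadata` | Metadata only (user, time, resource, verb, and so on); no request or response body | General operation records |
| `Request` | Metadata and the request body; no response body | When you need to know what was changed |
| `RequestResponse` | Metadata, request body, and response body | Full operation audit |

### Example audit policy

Rules are evaluated in order and the first matching rule decides the level, so specific rules must come before the catch-all.

```yaml
# audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Do not record health checks
  - level: None
    nonResourceURLs:
      - /healthz
      - /livez
      - /readyz
  # Record ConfigMap changes, including the request body
  - level: Request
    resources:
      - group: ""
        resources: ["configmaps"]
    verbs: ["create", "update", "patch", "delete"]
  # Record every Secret operation, including request and response.
  # Caution: this writes Secret contents into the audit log (see pitfall 4).
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["secrets"]
  # Default: metadata only
  - level: Metadata
```

## Source walkthrough: audit initialization

### Entry point

Audit initialization happens in `buildGenericConfig`: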
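```go
// cmd/kube-apiserver/app/server.go
func buildGenericConfig(s *options.ServerRunOptions, ...) (genericConfig, ...) {
	// ... other configuration

	// Apply the audit configuration
	if lastErr = s.Audit.ApplyTo(genericConfig); lastErr != nil {
		return
	}
}
```

### `ApplyTo`: building the audit backends

```go
// staging/src/k8s.io/apiserver/pkg/server/options/audit.go (simplified)
func (o *AuditOptions) ApplyTo(c *server.Config) error {
	// 1. Build the policy rule evaluator from --audit-policy-file
	evaluator, err := o.newPolicyRuleEvaluator()
	if err != nil {
		return err
	}

	// 2. Build the log backend (--audit-log-path)
	var logBackend audit.Backend
	w, err := o.LogOptions.getWriter()
	if err != nil {
		return err
	}
	if w != nil {
		if evaluator == nil {
			klog.V(2).Info("No audit policy file provided, no events will be recorded for log backend")
		} else {
			logBackend = o.LogOptions.newBackend(w)
		}
	}

	// 3. Build the webhook backend (--audit-webhook-config-file)
	var webhookBackend audit.Backend
	if o.WebhookOptions.enabled() {
		if evaluator == nil {
			klog.V(2).Info("No audit policy file provided, no events will be recorded for webhook backend")
		} else {
			webhookBackend, err = o.WebhookOptions.newUntruncatedBackend(egressDialer)
			if err != nil {
				return err
			}
		}
	}

	// 4. Wrap the webhook backend as the dynamic backend, with truncation support
	var dynamicBackend audit.Backend
	if webhookBackend != nil {
		dynamicBackend = o.WebhookOptions.TruncateOptions.wrapBackend(webhookBackend, groupVersion)
	}

	// 5. Set the policy rule evaluator
	c.AuditPolicyRuleEvaluator = evaluator

	// 6. Merge all backends
	c.AuditBackend = appendBackend(logBackend, dynamicBackend)
	return nil
}
```

Note that a backend is only created when a policy file is present: without `--audit-policy-file`, setting `--audit-log-path` or `--audit-webhook-config-file` records nothing.

## Audit backends in detail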
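Before going through the individual backends, it may help to see the wiring rule from `ApplyTo` in isolation: a backend only exists when both its destination and a policy are configured, and nil backends are skipped when the results are merged. The following is a minimal, self-contained sketch under simplifying assumptions, not the apiserver code: `Event`, `Backend`, `memBackend`, `union`, and `wire` are hypothetical stand-ins, and `appendBackend` only mirrors the merge behaviour described above.

```go
package main

import "fmt"

// Event is a hypothetical stand-in for the apiserver's audit event type.
type Event struct {
	Verb, Resource string
}

// Backend mirrors the shape of an audit backend: it consumes events and
// reports whether all of them were handled.
type Backend interface {
	ProcessEvents(events ...*Event) bool
}

// memBackend is a toy backend that keeps events in memory.
type memBackend struct {
	name   string
	events []*Event
}

func (m *memBackend) ProcessEvents(events ...*Event) bool {
	m.events = append(m.events, events...)
	return true
}

// union fans every event out to all of its backends.
type union struct{ backends []Backend }

func (u union) ProcessEvents(events ...*Event) bool {
	ok := true
	for _, b := range u.backends {
		// Call the backend first so a previous failure never skips it.
		ok = b.ProcessEvents(events...) && ok
	}
	return ok
}

// appendBackend merges two optional backends, skipping nil ones.
func appendBackend(existing, newBackend Backend) Backend {
	if existing == nil {
		return newBackend
	}
	if newBackend == nil {
		return existing
	}
	return union{backends: []Backend{existing, newBackend}}
}

// wire applies the "no policy, no backend" rule and merges what is left.
func wire(policyConfigured bool, logB, webhookB *memBackend) Backend {
	var log, webhook Backend
	if policyConfigured && logB != nil {
		log = logB
	}
	if policyConfigured && webhookB != nil {
		webhook = webhookB
	}
	return appendBackend(log, webhook)
}

func main() {
	logB := &memBackend{name: "log"}
	hookB := &memBackend{name: "webhook"}

	backend := wire(true, logB, hookB)
	backend.ProcessEvents(&Event{Verb: "delete", Resource: "configmaps"})
	fmt.Println(len(logB.events), len(hookB.events)) // each backend got the event

	fmt.Println(wire(false, logB, hookB) == nil) // no policy: nothing is wired
}
```

The sketch assigns to interface-typed variables only when a backend is really present. Storing a nil `*memBackend` directly in a `Backend` variable would produce a non-nil interface and defeat the nil checks in `appendBackend`.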
### 1. Log backend

The log backend writes audit events to a local file.

```go
// staging/src/k8s.io/apiserver/plugin/pkg/audit/log/backend.go (simplified)
type backend struct {
	out     io.Writer // output stream
	format  string    // format: legacy or json
	encoder runtime.Encoder
}

// Create the log backend
func (o *AuditLogOptions) newBackend(w io.Writer) audit.Backend {
	return &backend{
		out:    w,
		format: o.Format,
	}
}

// Process audit events
func (b *backend) ProcessEvents(events ...*auditinternal.Event) bool {
	success := true
	for _, ev := range events {
		success = b.logEvent(ev) && success
	}
	return success
}

func (b *backend) logEvent(ev *auditinternal.Event) bool {
	line := ""
	switch b.format {
	case FormatLegacy:
		line = audit.EventString(ev) + "\n"
	case FormatJson:
		bs, err := runtime.Encode(b.encoder, ev)
		if err != nil {
			audit.HandlePluginError(PluginName, err, ev)
			return false
		}
		line = string(bs[:])
	}

	// Write the log line
	if _, err := fmt.Fprint(b.out, line); err != nil {
		audit.HandlePluginError(PluginName, err, ev)
		return false
	}
	return true
}
```

#### Log rotation

Automatic rotation is implemented with the lumberjack library:

```go
import "gopkg.in/natefinch/lumberjack.v2"

return &lumberjack.Logger{
	Filename:   o.Path,       // log file path
	MaxAge:     o.MaxAge,     // maximum number of days to keep old files
	MaxBackups: o.MaxBackups, // maximum number of old files to keep
	MaxSize:    o.MaxSize,    // maximum size of a single file, in MB
	Compress:   o.Compress,   // whether to compress rotated files
}, nil
```
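### 2. Webhook backend

The webhook backend sends audit events to a remote HTTP service.

```go
// staging/src/k8s.io/apiserver/plugin/pkg/audit/webhook/webhook.go (simplified)
func (o *AuditWebhookOptions) newUntruncatedBackend(egressDialer utilnet.DialFunc) (audit.Backend, error) {
	// Create the REST client
	webhookClient, err := o.webhookClient(egressDialer)
	if err != nil {
		return nil, err
	}
	return &backend{
		webhookClient: webhookClient,
	}, nil
}

// Send audit events to the webhook
func (b *backend) ProcessEvents(events ...*auditinternal.Event) bool {
	success := true
	for _, ev := range events {
		success = b.sendEvent(ev) && success
	}
	return success
}

func (b *backend) sendEvent(ev *auditinternal.Event) bool {
	// Send an HTTP POST request. Pseudo-code: the real backend POSTs the
	// events as an EventList and retries with exponential backoff.
	result := b.webhookClient.Create()
	if err := result.Error(); err != nil {
		audit.HandlePluginError(PluginName, err, ev)
		return false
	}
	return true
}
```

#### Webhook configuration

The webhook configuration file uses the kubeconfig format:

```yaml
# webhook-config.yaml
apiVersion: v1
kind: Config
clusters:
  - name: audit-server
    cluster:
      certificate-authority: /path/to/ca.crt
      server: https://audit.example.com/webhook
users:
  - name: apiserver
    user:
      client-certificate: /path/to/client.crt
      client-key: /path/to/client.key
current-context: webhook
contexts:
  - context:
      cluster: audit-server
      user: apiserver
    name: webhook
```

### 3. Union backend (combining multiple backends)

```go
// staging/src/k8s.io/apiserver/pkg/audit/union.go
// Union combines several backends into one
type unionBackend struct {
	backends []audit.Backend
}

func Union(backends ...audit.Backend) audit.Backend {
	return unionBackend{backends: backends}
}

func (u unionBackend) ProcessEvents(events ...*auditinternal.Event) bool {
	success := true
	for _, backend := range u.backends {
		success = backend.ProcessEvents(events...) && success
	}
	return success
}
```

## Auditing in the HTTP handler chain

Auditing is implemented in the HTTP handler chain by the `WithAudit` middleware.

### The `WithAudit` middleware

```go
// staging/src/k8s.io/apiserver/pkg/endpoints/filters/audit.go (simplified)
func WithAudit(
	handler http.Handler,
	sink audit.Sink,
	policy audit.PolicyRuleEvaluator,
) http.Handler {
	if sink == nil || policy == nil {
		return handler
	}
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		// 1. Create the audit event and attach it to the context
		req, ev, omitStages, err := createAuditEventAndAttachToContext(req, policy)
		if err != nil {
			responsewriters.InternalError(w, req, errors.New("failed to create audit event"))
			return
		}

		// 2. No event (the policy level is None): handle the request directly
		if ev == nil {
			handler.ServeHTTP(w, req)
			return
		}
		ctx := req.Context()

		// 3. Record the RequestReceived stage
		ev.Stage = auditinternal.StageRequestReceived
		processAuditEvent(ctx, sink, ev, omitStages)

		// 4. Wrap the ResponseWriter to intercept the response
		respWriter := decorateResponseWriter(ctx, w, ev, sink, omitStages)

		// 5. Use defer so the final stage is always recorded
		defer func() {
			if r := recover(); r != nil {
				// Record the panic, then re-panic
				ev.Stage = auditinternal.StagePanic
				ev.ResponseStatus = &metav1.Status{
					Code:   http.StatusInternalServerError,
					Status: metav1.StatusFailure,
				}
				processAuditEvent(ctx, sink, ev, omitStages)
				panic(r)
			}
			// Record response completion
			ev.Stage = auditinternal.StageResponseComplete
			processAuditEvent(ctx, sink, ev, omitStages)
		}()

		// 6. Handle the request
		handler.ServeHTTP(respWriter, req)
	})
}
```

### The three stages of an audit event

```text
Request timeline
──────────────────────────────────────────────────────────────►
        │                     │                      │
        ▼                     ▼                      ▼
 RequestReceived       ResponseStarted       ResponseComplete
 (request received)    (response started)    (response complete)
        │─────────────────────│──────────────────────│
                   (long-running requests)    (response sent)
```

| Stage | Trigger | What is recorded |
|---|---|---|
| `RequestReceived` | The request is received | Request metadata only; the request body has not been decoded yet |
| `ResponseStarted` | The response starts being sent | Response headers and status code (long-running requests such as watch only) |
| `ResponseComplete` | The response has been sent | Full response information, plus request and response bodies if the level calls for them |

Besides these three, the deferred function in `WithAudit` shows a fourth stage, `Panic`, which is recorded when request handling panics.

### Creating the audit event

```go
// Simplified for illustration
func createAuditEventAndAttachToContext(
	req *http.Request,
	policy audit.PolicyRuleEvaluator,
) (*http.Request, *auditinternal.Event, []auditinternal.Stage, error) {
	// Gather the request attributes
	ctx := req.Context()
	attribs, err := GetAuthorizerAttributes(ctx)
	if err != nil {
		return req, nil, nil, err
	}

	// Evaluate the audit level
	level, omitStages := policy.LevelAndStages(attribs)
	if level == auditinternal.LevelNone {
		return req, nil, nil, nil // nothing to record
	}

	// Create the audit event
	ev := &auditinternal.Event{
		RequestReceivedTimestamp: metav1.NowMicro(),
		AuditID:                  types.UID(uuid.New().String()),
		Level:                    level,
		Verb:                     attribs.GetVerb(),
		RequestURI:               req.URL.RequestURI(),
		User:                     attribs.GetUser(),
		SourceIPs:                sourceIPs(req),
		ObjectRef:                objectRef(attribs),
	}

	// Record the request body if the level calls for it. Level is a string
	// type, so compare with GreaterOrEqual rather than >=. In the real
	// apiserver the body is attached later, once the handler has decoded it.
	if level.GreaterOrEqual(auditinternal.LevelRequest) {
		ev.RequestObject = recordRequestObject(req, level)
	}

	// Attach the event to the context
	ctx = audit.WithAuditContext(ctx, ev)
	req = req.WithContext(ctx)
	return req, ev, omitStages, nil
}
```

## Configuring auditing

### Basic configuration

```bash
kube-apiserver \
  --audit-policy-file=/etc/kubernetes/audit-policy.yaml \
  --audit-log-path=/var/log/kubernetes/audit.log \
  --audit-log-format=json \
  --audit-log-maxsize=100 \
  --audit-log-maxbackup=10 \
  --audit-log-maxage=30
```

### Advanced configuration (webhook)

```bash
kube-apiserver \
  --audit-policy-file=/etc/kubernetes/audit-policy.yaml \
  --audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml \
  --audit-webhook-mode=batch \
  --audit-webhook-batch-max-size=100 \
  --audit-webhook-batch-max-wait=1s
```

Webhook modes:

- `blocking`: events are sent synchronously, which can increase API response time
- `batch`: events are buffered and sent asynchronously in batches, which performs better

## Analyzing audit logs

### Sample log entry

```json
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Request",
  "auditID": "c5d4e6f7-a8b9-4c0d-1e2f-3a4b5c6d7e8f",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/pods/nginx",
  "verb": "create",
  "user": {
    "username": "admin",
    "groups": ["system:masters", "system:authenticated"]
  },
  "sourceIPs": ["192.168.1.100"],
  "objectRef": {
    "resource": "pods",
    "namespace": "default",
    "name": "nginx"
  },
  "responseStatus": {
    "code": 201
  },
  "requestObject": {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "nginx", "namespace": "default"},
    "spec": {"containers": [{"name": "nginx", "image": "nginx:1.19"}]}
  },
  "requestReceivedTimestamp": "2024-01-15T10:30:00.123456Z"
}
```

### Useful queries

```bash
# Find delete operations
jq 'select(.verb == "delete")' /var/log/kubernetes/audit.log

# Find operations by a specific user
jq 'select(.user.username == "admin")' /var/log/kubernetes/audit.log

# Find failed operations
jq 'select(.responseStatus.code >= 400)' /var/log/kubernetes/audit.log

# Count operations per user
jq -r '.user.username' /var/log/kubernetes/audit.log | sort | uniq -c | sort -rn
```

## Pitfalls: common auditing problems

### Pitfall 1: audit log files grow too large

Symptom: the disk fills up with audit logs.

Solution:

```bash
# Configure log rotation and compression:
#   maxsize   100 MB per file
#   maxbackup keep 10 rotated files
#   maxage    keep files for 30 days
#   compress  compress rotated files
kube-apiserver \
  --audit-log-maxsize=100 \
  --audit-log-maxbackup=10 \
  --audit-log-maxage=30 \
  --audit-log-compress=true
```

### Pitfall 2: auditing hurts performance

Symptom: API responses slow down after auditing is enabled.

Solution:

```bash
# 1. Use a more lenient policy:
#    record high-frequency read operations at the Metadata level

# 2. Use webhook batch mode
--audit-webhook-mode=batch
--audit-webhook-batch-max-size=100
--audit-webhook-batch-max-wait=1s

# 3. Make the log backend asynchronous as well
#    (valid modes are batch, blocking, and blocking-strict)
--audit-log-mode=batch
```

### Pitfall 3: audit events are lost

Symptom: under high load, some audit events are never recorded.

Root cause: the backend cannot keep up, so events are dropped once the batch buffer is full.

Solution:

```bash
# Increase the batch buffer size (the default is 10000 events)
--audit-webhook-batch-buffer-size=20000
# The log backend has the same setting when it runs in batch mode
--audit-log-batch-buffer-size=20000
```

The `--audit-webhook-truncate-*` flags serve a different purpose: they cap the size of the events and batches sent to the webhook and have no effect on buffering.

### Pitfall 4: sensitive information leaks

Symptom: the audit log contains Secret contents in plain text.

Solution:

```yaml
# Use the Metadata level for Secrets
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
```

## Auditing best practices

### 1. Layered audit policy

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Do not record read operations from system components
  - level: None
    users: ["system:kube-proxy", "system:kubelet"]
    verbs: ["get", "list", "watch"]
  # Record sensitive resources in detail
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  # Default: metadata only
  - level: Metadata
```

### 2. Centralized audit logs

```bash
# Use a webhook to ship audit logs to central storage
kube-apiserver \
  --audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml \
  --audit-webhook-mode=batch
```

### 3. Regular audit analysis

```bash
# Find abnormal operations, grouped by user
jq 'select(.responseStatus.code >= 400)' audit.log \
  | jq -s 'group_by(.user.username) | map({user: .[0].user.username, count: length})'

# Find possible privilege-escalation operations
jq 'select(.verb == "create" or .verb == "update") | select((.objectRef.resource // "") | contains("role"))' audit.log
```

## Summary

In this walkthrough we looked at how kube-apiserver implements auditing:

- Audit policy: four levels (`None` / `Metadata` / `Request` / `RequestResponse`)
- Audit backends: the log backend (local file) and the webhook backend (remote service)
- HTTP handler chain: the `WithAudit` middleware intercepts requests and records three stages
- Event structure: user, time, resource, request and response information
- Best practices: layered policy, log rotation, centralized storage

Auditing is core infrastructure for Kubernetes security operations, and configuring it correctly matters a great deal for after-the-fact tracing and compliance.

## Have you hit these pitfalls?

- Is auditing enabled on your cluster, and how is the audit policy configured?
- How do you store and analyze your audit logs?
- Have you seen auditing affect performance, and how did you solve it?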
Inside kube-apiserver auditing: the full path from policy configuration to event recording
Published: 2026/6/17 22:10:24