author Linus Torvalds <torvalds@linux-foundation.org> 2025-12-05 09:51:37 -0800
committer Linus Torvalds <torvalds@linux-foundation.org> 2025-12-05 09:51:37 -0800
commit 69c5079b49fa120c1a108b6e28b3a6a8e4ae2db5
tree d3b2ecb61bcbf9d9d9a8f9fa7f620af0030b514d /kernel
parent 36492b7141b9abc967e92c991af32c670351dc16
parent f6ed9c5d3190cf18382ee75e0420602101f53586
Merge tag 'trace-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:

 - Extend tracing option mask to 64 bits

   The trace options were defined by a 32 bit variable. This limits the
   tracing instances to have a total of 32 different options. As that
   limit has been hit, and more options are being added, increase the
   option mask to a 64 bit number, doubling the number of options
   available.

   As this is required for the kprobe topic branches as well as the
   tracing topic branch, a separate branch was created and merged into
   both.

 - Make trace_user_fault_read() available for the rest of tracing

   The function trace_user_fault_read() is used by trace_marker file
   read to allow reading user space to be done fast and without locking
   or allocations. Make this available so that the system call trace
   events can use it too.

 - Have system call trace events read user space values

   Now that the system call trace events callbacks are called in a
   faultable context, take advantage of this and read the user space
   buffers for various system calls. For example, show the path name of
   the openat system call instead of just showing the pointer to that
   path name in user space. Also show the contents of the buffer of the
   write system call. Several system call trace events are updated to
   make tracing into a light weight strace tool for all applications in
   the system.

 - Update perf system call tracing to do the same

 - Add a config and syscall_user_buf_size file to control the size of
   the buffer

   Limit the amount of data that can be read from user space. The
   default size is 63 bytes but that can be expanded to 165 bytes.

 - Allow the persistent ring buffer to print system calls normally

   The persistent ring buffer prints trace events by their type and
   ignores the print_fmt. This is because the print_fmt may change from
   kernel to kernel. As the system call output is fixed by the system
   call ABI itself, there's no reason to limit that. This makes reading
   the system call events in the persistent ring buffer much nicer and
   easier to understand.

 - Add options to show text offset to function profiler

   The function profiler that counts the number of times a function is
   hit currently lists all functions by their name and offset. But this
   becomes ambiguous when there are several functions with the same
   name.

   Add a tracing option that changes the output to be that of
   '_text+offset' instead. Now a user space tool can use this
   information to map the '_text+offset' to the unique function it is
   counting.

 - Report bad dynamic event command

   If a bad command is passed to the dynamic_events file, report it
   properly in the error log.

 - Clean up tracer options

   Clean up the tracer option code a bit, by removing some useless code
   and also using switch statements instead of a series of if
   statements.

 - Have tracing options be instance specific

   Tracers can have their own options (function tracer, irqsoff tracer,
   function graph tracer, etc). But now that the same tracer can be
   enabled in multiple trace instances, their options are still global.
   The API is per instance, thus changing one affects other instances.
   This isn't even consistent, as the options take effect differently
   depending on when a tracer started in an instance. Make the options
   for instances only affect the instance they are changed under.

 - Optimize pid_list lock contention

   Whenever the pid_list is read, it uses a spin lock. This happens at
   every sched switch. Taking the lock at sched switch can be removed by
   instead using a seqlock counter.

 - Clean up the trace trigger structures

   The trigger code uses two different structures to implement a single
   trigger. This was due to trying to reuse code for the two different
   types of triggers (always on trigger, and count limited trigger). But
   by adding a single field to one structure, the other structure could
   be absorbed into the first structure, making the code easier to
   understand.

 - Create a bulk garbage collector for trace triggers

   If user space has triggers for several hundreds of events and then
   removes them, it can take several seconds to complete. This is
   because each removal calls tracepoint_synchronize_unregister() that
   can take hundreds of milliseconds to complete.

   Instead, create a helper thread that will do the clean up. When a
   trigger is removed, it will create the kthread if it isn't already
   created, and then add the trigger to a llist. The kthread will take
   the items off the llist, call tracepoint_synchronize_unregister(),
   and then remove the items it took off. It will then check if there
   are more items to free before sleeping.

   This lets user space finish removing all these triggers in less than
   a second.

 - Allow function tracing of some of the tracing infrastructure code

   Because the tracing code can cause recursion issues if it is traced
   by the function tracer, the entire tracing directory disables
   function tracing. But not all of tracing causes issues if it is
   traced. Namely, the event tracing code. Add a config that enables
   some of the tracing code to be traced to help in debugging it. Note,
   when this is enabled, it does add noise to general function tracing,
   especially if events are enabled as well (which is a common case).

 - Add boot-time backup instance for persistent buffer

   The persistent ring buffer is used mostly for kernel crash analysis
   in the field. One issue is that if there's a crash, the data in the
   persistent ring buffer must be read before tracing can begin using
   it. This slows down the boot process. Once tracing starts in the
   persistent ring buffer, the old data must be freed and the addresses
   no longer match and old events can't be in the buffer with new
   events.

   Create a way to create a backup buffer that copies the persistent
   ring buffer at boot up. Then after a crash, the always on tracer can
   begin immediately as well as the normal boot process while the crash
   analysis tooling uses the backup buffer. After the backup buffer is
   finished being read, it can be removed.

 - Enable function graph args and return address options at the same
   time

   Currently, when reading of arguments in the function graph tracer is
   enabled, the option to record the parent function in the entry event
   can not be enabled. Update the code so that it can.

 - Add new struct_offset() helper macro

   Add a new macro that takes a pointer to a structure and a name of one
   of its members and it will return the offset of that member. This
   allows the ring buffer code to simplify the following:

   From:  size = struct_size(entry, buf, cnt - sizeof(entry->id));

     To:  size = struct_offset(entry, id) + cnt;

   There should be other simplifications that this macro can help out
   with as well.

* tag 'trace-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (42 commits)
  overflow: Introduce struct_offset() to get offset of member
  function_graph: Enable funcgraph-args and funcgraph-retaddr to work simultaneously
  tracing: Add boot-time backup of persistent ring buffer
  ftrace: Allow tracing of some of the tracing code
  tracing: Use strim() in trigger_process_regex() instead of skip_spaces()
  tracing: Add bulk garbage collection of freeing event_trigger_data
  tracing: Remove unneeded event_mutex lock in event_trigger_regex_release()
  tracing: Merge struct event_trigger_ops into struct event_command
  tracing: Remove get_trigger_ops() and add count_func() from trigger ops
  tracing: Show the tracer options in boot-time created instance
  ftrace: Avoid redundant initialization in register_ftrace_direct
  tracing: Remove unused variable in tracing_trace_options_show()
  fgraph: Make fgraph_no_sleep_time signed
  tracing: Convert function graph set_flags() to use a switch() statement
  tracing: Have function graph tracer option sleep-time be per instance
  tracing: Move graph-time out of function graph options
  tracing: Have function graph tracer option funcgraph-irqs be per instance
  trace/pid_list: optimize pid_list->lock contention
  tracing: Have function graph tracer define options per instance
  tracing: Have function tracer define options per instance
  ...
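
Note on struct_offset(): the size simplification quoted in the pull message can be checked in user space. The sketch below is only an illustration. The macro body (offsetof on the pointed-to type) is an assumption based on the description above, not a copy of the kernel's overflow.h, and `struct demo_entry` is a hypothetical layout with an `id` member directly followed by a flexible `buf` array. The old form is written out without the overflow checking that the kernel's struct_size() performs.

```c
#include <stddef.h>

/* Assumed definition: offset of 'member' in the structure that 'p' points to. */
#define demo_struct_offset(p, member) offsetof(__typeof__(*(p)), member)

/* Hypothetical entry: a header word, an id, then a flexible payload. */
struct demo_entry {
	unsigned int hdr;
	unsigned int id;
	char buf[];
};

/*
 * Old form, spelled out: struct_size(entry, buf, cnt - sizeof(entry->id)).
 * The overflow checks done by the real struct_size() are omitted here.
 */
static size_t demo_size_old(const struct demo_entry *entry, size_t cnt)
{
	return sizeof(*entry) + (cnt - sizeof(entry->id)) * sizeof(entry->buf[0]);
}

/* New form: struct_offset(entry, id) + cnt. */
static size_t demo_size_new(const struct demo_entry *entry, size_t cnt)
{
	return demo_struct_offset(entry, id) + cnt;
}
```

The two forms agree only because, in this layout, `id` is the last fixed member and the structure has no trailing padding, so `sizeof(*entry)` equals the offset of `id` plus `sizeof(entry->id)`.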
Diffstat (limited to 'kernel')
-rw-r--r--  kernel/trace/Kconfig                    28
-rw-r--r--  kernel/trace/Makefile                   17
-rw-r--r--  kernel/trace/blktrace.c                  6
-rw-r--r--  kernel/trace/fgraph.c                   10
-rw-r--r--  kernel/trace/ftrace.c                   32
-rw-r--r--  kernel/trace/pid_list.c                 30
-rw-r--r--  kernel/trace/pid_list.h                  1
-rw-r--r--  kernel/trace/trace.c                   893
-rw-r--r--  kernel/trace/trace.h                   230
-rw-r--r--  kernel/trace/trace_dynevent.c           11
-rw-r--r--  kernel/trace/trace_entries.h            15
-rw-r--r--  kernel/trace/trace_eprobe.c             19
-rw-r--r--  kernel/trace/trace_events.c              4
-rw-r--r--  kernel/trace/trace_events_hist.c       143
-rw-r--r--  kernel/trace/trace_events_synth.c        2
-rw-r--r--  kernel/trace/trace_events_trigger.c    408
-rw-r--r--  kernel/trace/trace_fprobe.c              6
-rw-r--r--  kernel/trace/trace_functions.c          10
-rw-r--r--  kernel/trace/trace_functions_graph.c   220
-rw-r--r--  kernel/trace/trace_irqsoff.c            30
-rw-r--r--  kernel/trace/trace_kdb.c                 2
-rw-r--r--  kernel/trace/trace_kprobe.c              6
-rw-r--r--  kernel/trace/trace_output.c             45
-rw-r--r--  kernel/trace/trace_output.h             11
-rw-r--r--  kernel/trace/trace_sched_wakeup.c       24
-rw-r--r--  kernel/trace/trace_syscalls.c          935
26 files changed, 2242 insertions, 896 deletions
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 4661b9e606e0..bfa2ec46e075 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -342,6 +342,20 @@ config DYNAMIC_FTRACE_WITH_JMP
depends on DYNAMIC_FTRACE_WITH_DIRECT_CALLS
depends on HAVE_DYNAMIC_FTRACE_WITH_JMP
+config FUNCTION_SELF_TRACING
+ bool "Function trace tracing code"
+ depends on FUNCTION_TRACER
+ help
+ Normally all the tracing code is set to notrace, where the function
+ tracer will ignore all the tracing functions. Sometimes it is useful
+ for debugging to trace some of the tracing infratructure itself.
+ Enable this to allow some of the tracing infrastructure to be traced
+ by the function tracer. Note, this will likely add noise to function
+ tracing if events and other tracing features are enabled along with
+ function tracing.
+
+ If unsure, say N.
+
config FPROBE
bool "Kernel Function Probe (fprobe)"
depends on HAVE_FUNCTION_GRAPH_FREGS && HAVE_FTRACE_GRAPH_FUNC
@@ -587,6 +601,20 @@ config FTRACE_SYSCALLS
help
Basic tracer to catch the syscall entry and exit events.
+config TRACE_SYSCALL_BUF_SIZE_DEFAULT
+ int "System call user read max size"
+ range 0 165
+ default 63
+ depends on FTRACE_SYSCALLS
+ help
+ Some system call trace events will record the data from a user
+ space address that one of the parameters point to. The amount of
+ data per event is limited. That limit is set by this config and
+ this config also affects how much user space data perf can read.
+
+ For a tracing instance, this size may be changed by writing into
+ its syscall_user_buf_size file.
+
config TRACER_SNAPSHOT
bool "Create a snapshot trace buffer"
select TRACER_MAX_TRACE
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index dcb4e02afc5f..fc5dcc888e13 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -16,6 +16,23 @@ obj-y += trace_selftest_dynamic.o
endif
endif
+# Allow some files to be function traced
+ifdef CONFIG_FUNCTION_SELF_TRACING
+CFLAGS_trace_output.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_seq.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_stat.o = $(CC_FLAGS_FTRACE)
+CFLAGS_tracing_map.o = $(CC_FLAGS_FTRACE)
+CFLAGS_synth_event_gen_test.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_syscalls.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events_filter.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events_trigger.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events_synth.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events_hist.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_events_user.o = $(CC_FLAGS_FTRACE)
+CFLAGS_trace_dynevent.o = $(CC_FLAGS_FTRACE)
+endif
+
ifdef CONFIG_FTRACE_STARTUP_TEST
CFLAGS_trace_kprobe_selftest.o = $(CC_FLAGS_FTRACE)
obj-$(CONFIG_KPROBE_EVENTS) += trace_kprobe_selftest.o
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index af8cbc8e1a7c..d031c8d80be4 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1738,7 +1738,7 @@ static enum print_line_t print_one_line(struct trace_iterator *iter,
t = te_blk_io_trace(iter->ent);
what = (t->action & ((1 << BLK_TC_SHIFT) - 1)) & ~__BLK_TA_CGROUP;
- long_act = !!(tr->trace_flags & TRACE_ITER_VERBOSE);
+ long_act = !!(tr->trace_flags & TRACE_ITER(VERBOSE));
log_action = classic ? &blk_log_action_classic : &blk_log_action;
has_cg = t->action & __BLK_TA_CGROUP;
@@ -1803,9 +1803,9 @@ blk_tracer_set_flag(struct trace_array *tr, u32 old_flags, u32 bit, int set)
/* don't output context-info for blk_classic output */
if (bit == TRACE_BLK_OPT_CLASSIC) {
if (set)
- tr->trace_flags &= ~TRACE_ITER_CONTEXT_INFO;
+ tr->trace_flags &= ~TRACE_ITER(CONTEXT_INFO);
else
- tr->trace_flags |= TRACE_ITER_CONTEXT_INFO;
+ tr->trace_flags |= TRACE_ITER(CONTEXT_INFO);
}
return 0;
}
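
Note on the TRACE_ITER(...) conversion: the hunks above replace `TRACE_ITER_FOO` constants with a `TRACE_ITER(FOO)` macro call, in line with the move to a 64-bit option mask. The definition of the macro is in trace.h and is not shown in this part of the diff, so the sketch below is an assumption about its general shape: a 64-bit shift by a per-option bit number. All `demo_` and `DEMO_` names and the bit numbers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit numbers; the real list lives in kernel/trace/trace.h. */
enum demo_iter_bits {
	DEMO_ITER_VERBOSE_BIT      = 3,
	DEMO_ITER_CONTEXT_INFO_BIT = 17,
	DEMO_ITER_HIGH_OPTION_BIT  = 40,	/* does not fit in a 32-bit mask */
};

/* Assumed shape of the macro: shift a 64-bit one by the option's bit number. */
#define DEMO_ITER(flag) (1ULL << DEMO_ITER_##flag##_BIT)

/* Stand-in for the trace_flags field of struct trace_array. */
struct demo_trace_array {
	uint64_t trace_flags;
};

/* Set or clear an option, as blk_tracer_set_flag() does for CONTEXT_INFO. */
static void demo_set(struct demo_trace_array *tr, uint64_t mask, bool set)
{
	if (set)
		tr->trace_flags |= mask;
	else
		tr->trace_flags &= ~mask;
}

/* Test an option, as print_one_line() does for VERBOSE. */
static bool demo_test(const struct demo_trace_array *tr, uint64_t mask)
{
	return !!(tr->trace_flags & mask);
}
```

Using `1ULL` rather than `1` matters once a bit number reaches 32: shifting a plain `int` that far is undefined behavior in C, and the resulting mask would not fit a 32-bit flags word anyway.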
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 484ad7a18463..7fb9b169d6d4 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -498,9 +498,6 @@ found:
return get_data_type_data(current, offset);
}
-/* Both enabled by default (can be cleared by function_graph tracer flags */
-bool fgraph_sleep_time = true;
-
#ifdef CONFIG_DYNAMIC_FTRACE
/*
* archs can override this function if they must do something
@@ -1023,11 +1020,6 @@ void fgraph_init_ops(struct ftrace_ops *dst_ops,
#endif
}
-void ftrace_graph_sleep_time_control(bool enable)
-{
- fgraph_sleep_time = enable;
-}
-
/*
* Simply points to ftrace_stub, but with the proper protocol.
* Defined by the linker script in linux/vmlinux.lds.h
@@ -1098,7 +1090,7 @@ ftrace_graph_probe_sched_switch(void *ignore, bool preempt,
* Does the user want to count the time a function was asleep.
* If so, do not update the time stamps.
*/
- if (fgraph_sleep_time)
+ if (!fgraph_no_sleep_time)
return;
timestamp = trace_clock_local();
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index bbb37c0f8c6c..3ec2033c0774 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -534,7 +534,9 @@ static int function_stat_headers(struct seq_file *m)
static int function_stat_show(struct seq_file *m, void *v)
{
+ struct trace_array *tr = trace_get_global_array();
struct ftrace_profile *rec = v;
+ const char *refsymbol = NULL;
char str[KSYM_SYMBOL_LEN];
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
static struct trace_seq s;
@@ -554,7 +556,29 @@ static int function_stat_show(struct seq_file *m, void *v)
return 0;
#endif
- kallsyms_lookup(rec->ip, NULL, NULL, NULL, str);
+ if (tr->trace_flags & TRACE_ITER(PROF_TEXT_OFFSET)) {
+ unsigned long offset;
+
+ if (core_kernel_text(rec->ip)) {
+ refsymbol = "_text";
+ offset = rec->ip - (unsigned long)_text;
+ } else {
+ struct module *mod;
+
+ guard(rcu)();
+ mod = __module_text_address(rec->ip);
+ if (mod) {
+ refsymbol = mod->name;
+ /* Calculate offset from module's text entry address. */
+ offset = rec->ip - (unsigned long)mod->mem[MOD_TEXT].base;
+ }
+ }
+ if (refsymbol)
+ snprintf(str, sizeof(str), " %s+%#lx", refsymbol, offset);
+ }
+ if (!refsymbol)
+ kallsyms_lookup(rec->ip, NULL, NULL, NULL, str);
+
seq_printf(m, " %-30.30s %10lu", str, rec->counter);
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
@@ -838,6 +862,8 @@ static int profile_graph_entry(struct ftrace_graph_ent *trace,
return 1;
}
+bool fprofile_no_sleep_time;
+
static void profile_graph_return(struct ftrace_graph_ret *trace,
struct fgraph_ops *gops,
struct ftrace_regs *fregs)
@@ -863,7 +889,7 @@ static void profile_graph_return(struct ftrace_graph_ret *trace,
calltime = rettime - profile_data->calltime;
- if (!fgraph_sleep_time) {
+ if (fprofile_no_sleep_time) {
if (current->ftrace_sleeptime)
calltime -= current->ftrace_sleeptime - profile_data->sleeptime;
}
@@ -6075,7 +6101,7 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
new_hash = NULL;
ops->func = call_direct_funcs;
- ops->flags = MULTI_FLAGS;
+ ops->flags |= MULTI_FLAGS;
ops->trampoline = FTRACE_REGS_ADDR;
ops->direct_call = addr;
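
Note on the '_text+offset' profiler output: function_stat_show() above formats the reference symbol and offset with `" %s+%#lx"`. The user-space sketch below reproduces that formatting and shows one way a tool might split such a string back into symbol and offset. The function names, the 64-byte name buffer and the sscanf() pattern are illustrative assumptions, not part of the kernel change.

```c
#include <stdio.h>

/* Format "<refsymbol>+<offset>" the same way the diff does: " %s+%#lx". */
static int demo_format_ref(char *str, size_t len, const char *refsymbol,
			   unsigned long base, unsigned long ip)
{
	return snprintf(str, len, " %s+%#lx", refsymbol, ip - base);
}

/*
 * Hypothetical tool-side parser: split "sym+0xoff" into its two parts.
 * Returns 0 on success and -1 if the string does not match.
 */
static int demo_parse_ref(const char *str, char name[64], unsigned long *offset)
{
	return sscanf(str, " %63[^+]+%lx", name, offset) == 2 ? 0 : -1;
}
```

A tool would then add the parsed offset to the load address of `_text` (or of the named module's text section) to recover the unique function address, which is the use the pull message describes.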
diff --git a/kernel/trace/pid_list.c b/kernel/trace/pid_list.c
index 090bb5ea4a19..dbee72d69d0a 100644
--- a/kernel/trace/pid_list.c
+++ b/kernel/trace/pid_list.c
@@ -3,6 +3,7 @@
* Copyright (C) 2021 VMware Inc, Steven Rostedt <rostedt@goodmis.org>
*/
#include <linux/spinlock.h>
+#include <linux/seqlock.h>
#include <linux/irq_work.h>
#include <linux/slab.h>
#include "trace.h"
@@ -126,7 +127,7 @@ bool trace_pid_list_is_set(struct trace_pid_list *pid_list, unsigned int pid)
{
union upper_chunk *upper_chunk;
union lower_chunk *lower_chunk;
- unsigned long flags;
+ unsigned int seq;
unsigned int upper1;
unsigned int upper2;
unsigned int lower;
@@ -138,14 +139,16 @@ bool trace_pid_list_is_set(struct trace_pid_list *pid_list, unsigned int pid)
if (pid_split(pid, &upper1, &upper2, &lower) < 0)
return false;
- raw_spin_lock_irqsave(&pid_list->lock, flags);
- upper_chunk = pid_list->upper[upper1];
- if (upper_chunk) {
- lower_chunk = upper_chunk->data[upper2];
- if (lower_chunk)
- ret = test_bit(lower, lower_chunk->data);
- }
- raw_spin_unlock_irqrestore(&pid_list->lock, flags);
+ do {
+ seq = read_seqcount_begin(&pid_list->seqcount);
+ ret = false;
+ upper_chunk = pid_list->upper[upper1];
+ if (upper_chunk) {
+ lower_chunk = upper_chunk->data[upper2];
+ if (lower_chunk)
+ ret = test_bit(lower, lower_chunk->data);
+ }
+ } while (read_seqcount_retry(&pid_list->seqcount, seq));
return ret;
}
@@ -178,6 +181,7 @@ int trace_pid_list_set(struct trace_pid_list *pid_list, unsigned int pid)
return -EINVAL;
raw_spin_lock_irqsave(&pid_list->lock, flags);
+ write_seqcount_begin(&pid_list->seqcount);
upper_chunk = pid_list->upper[upper1];
if (!upper_chunk) {
upper_chunk = get_upper_chunk(pid_list);
@@ -199,6 +203,7 @@ int trace_pid_list_set(struct trace_pid_list *pid_list, unsigned int pid)
set_bit(lower, lower_chunk->data);
ret = 0;
out:
+ write_seqcount_end(&pid_list->seqcount);
raw_spin_unlock_irqrestore(&pid_list->lock, flags);
return ret;
}
@@ -230,6 +235,7 @@ int trace_pid_list_clear(struct trace_pid_list *pid_list, unsigned int pid)
return -EINVAL;
raw_spin_lock_irqsave(&pid_list->lock, flags);
+ write_seqcount_begin(&pid_list->seqcount);
upper_chunk = pid_list->upper[upper1];
if (!upper_chunk)
goto out;
@@ -250,6 +256,7 @@ int trace_pid_list_clear(struct trace_pid_list *pid_list, unsigned int pid)
}
}
out:
+ write_seqcount_end(&pid_list->seqcount);
raw_spin_unlock_irqrestore(&pid_list->lock, flags);
return 0;
}
@@ -340,8 +347,10 @@ static void pid_list_refill_irq(struct irq_work *iwork)
again:
raw_spin_lock(&pid_list->lock);
+ write_seqcount_begin(&pid_list->seqcount);
upper_count = CHUNK_ALLOC - pid_list->free_upper_chunks;
lower_count = CHUNK_ALLOC - pid_list->free_lower_chunks;
+ write_seqcount_end(&pid_list->seqcount);
raw_spin_unlock(&pid_list->lock);
if (upper_count <= 0 && lower_count <= 0)
@@ -370,6 +379,7 @@ static void pid_list_refill_irq(struct irq_work *iwork)
}
raw_spin_lock(&pid_list->lock);
+ write_seqcount_begin(&pid_list->seqcount);
if (upper) {
*upper_next = pid_list->upper_list;
pid_list->upper_list = upper;
@@ -380,6 +390,7 @@ static void pid_list_refill_irq(struct irq_work *iwork)
pid_list->lower_list = lower;
pid_list->free_lower_chunks += lcnt;
}
+ write_seqcount_end(&pid_list->seqcount);
raw_spin_unlock(&pid_list->lock);
/*
@@ -419,6 +430,7 @@ struct trace_pid_list *trace_pid_list_alloc(void)
init_irq_work(&pid_list->refill_irqwork, pid_list_refill_irq);
raw_spin_lock_init(&pid_list->lock);
+ seqcount_raw_spinlock_init(&pid_list->seqcount, &pid_list->lock);
for (i = 0; i < CHUNK_ALLOC; i++) {
union upper_chunk *chunk;
diff --git a/kernel/trace/pid_list.h b/kernel/trace/pid_list.h
index 62e73f1ac85f..0b45fb0eadb9 100644
--- a/kernel/trace/pid_list.h
+++ b/kernel/trace/pid_list.h
@@ -76,6 +76,7 @@ union upper_chunk {
};
struct trace_pid_list {
+ seqcount_raw_spinlock_t seqcount;
raw_spinlock_t lock;
struct irq_work refill_irqwork;
union upper_chunk *upper[UPPER1_SIZE]; // 1 or 2K in size
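
Note on the pid_list seqcount conversion: the reader in trace_pid_list_is_set() no longer takes the spin lock. It samples a sequence counter, reads the data, and retries if a writer ran in between, while writers keep the lock and bump the counter around their updates. The sketch below shows that general pattern in user space with C11 atomics. It is not the kernel's seqcount implementation: the `demo_` names are hypothetical, a single word stands in for the chunked bitmap, and writers are assumed to be serialized by a lock that is not shown.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_pid_bits {
	atomic_uint  seq;	/* even: stable, odd: a writer is active */
	atomic_ulong bits;	/* stand-in for the chunked pid bitmap */
};

/* Writer side. Callers are assumed to hold an external lock (not shown). */
static void demo_write_bit(struct demo_pid_bits *p, unsigned int bit, bool set)
{
	unsigned long v;

	atomic_fetch_add_explicit(&p->seq, 1, memory_order_relaxed);	/* odd */
	atomic_thread_fence(memory_order_release);

	v = atomic_load_explicit(&p->bits, memory_order_relaxed);
	v = set ? (v | (1UL << bit)) : (v & ~(1UL << bit));
	atomic_store_explicit(&p->bits, v, memory_order_relaxed);

	atomic_fetch_add_explicit(&p->seq, 1, memory_order_release);	/* even */
}

/* Reader side: lock-free, retries if the sequence changed underneath it. */
static bool demo_test_bit(struct demo_pid_bits *p, unsigned int bit)
{
	unsigned int seq;
	bool ret;

	do {
		/* Wait for an even sequence, meaning no writer is in progress. */
		while ((seq = atomic_load_explicit(&p->seq, memory_order_acquire)) & 1)
			;
		ret = (atomic_load_explicit(&p->bits, memory_order_relaxed) >> bit) & 1;
		atomic_thread_fence(memory_order_acquire);
	} while (atomic_load_explicit(&p->seq, memory_order_relaxed) != seq);

	return ret;
}
```

As in the diff, the result is recomputed on every pass through the loop, so a value read during a concurrent update is never returned. The trade-off is that readers may spin briefly while a writer is active, which suits data that is read on every sched switch and written rarely.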
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 304e93597126..ed5eddb08ef3 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -20,6 +20,7 @@
#include <linux/security.h>
#include <linux/seq_file.h>
#include <linux/irqflags.h>
+#include <linux/syscalls.h>
#include <linux/debugfs.h>
#include <linux/tracefs.h>
#include <linux/pagemap.h>
@@ -93,17 +94,13 @@ static bool tracepoint_printk_stop_on_boot __initdata;
static bool traceoff_after_boot __initdata;
static DEFINE_STATIC_KEY_FALSE(tracepoint_printk_key);
-/* For tracers that don't implement custom flags */
-static struct tracer_opt dummy_tracer_opt[] = {
- { }
+/* Store tracers and their flags per instance */
+struct tracers {
+ struct list_head list;
+ struct tracer *tracer;
+ struct tracer_flags *flags;
};
-static int
-dummy_set_flag(struct trace_array *tr, u32 old_flags, u32 bit, int set)
-{
- return 0;
-}
-
/*
* To prevent the comm cache from being overwritten when no
* tracing is active, only save the comm when a trace event
@@ -512,22 +509,23 @@ EXPORT_SYMBOL_GPL(unregister_ftrace_export);
/* trace_flags holds trace_options default values */
#define TRACE_DEFAULT_FLAGS \
- (FUNCTION_DEFAULT_FLAGS | \
- TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK | \
- TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | \
- TRACE_ITER_RECORD_CMD | TRACE_ITER_OVERWRITE | \
- TRACE_ITER_IRQ_INFO | TRACE_ITER_MARKERS | \
- TRACE_ITER_HASH_PTR | TRACE_ITER_TRACE_PRINTK | \
- TRACE_ITER_COPY_MARKER)
+ (FUNCTION_DEFAULT_FLAGS | FPROFILE_DEFAULT_FLAGS | \
+ TRACE_ITER(PRINT_PARENT) | TRACE_ITER(PRINTK) | \
+ TRACE_ITER(ANNOTATE) | TRACE_ITER(CONTEXT_INFO) | \
+ TRACE_ITER(RECORD_CMD) | TRACE_ITER(OVERWRITE) | \
+ TRACE_ITER(IRQ_INFO) | TRACE_ITER(MARKERS) | \
+ TRACE_ITER(HASH_PTR) | TRACE_ITER(TRACE_PRINTK) | \
+ TRACE_ITER(COPY_MARKER))
/* trace_options that are only supported by global_trace */
-#define TOP_LEVEL_TRACE_FLAGS (TRACE_ITER_PRINTK | \
- TRACE_ITER_PRINTK_MSGONLY | TRACE_ITER_RECORD_CMD)
+#define TOP_LEVEL_TRACE_FLAGS (TRACE_ITER(PRINTK) | \
+ TRACE_ITER(PRINTK_MSGONLY) | TRACE_ITER(RECORD_CMD) | \
+ TRACE_ITER(PROF_TEXT_OFFSET) | FPROFILE_DEFAULT_FLAGS)
/* trace_flags that are default zero for instances */
#define ZEROED_TRACE_FLAGS \
- (TRACE_ITER_EVENT_FORK | TRACE_ITER_FUNC_FORK | TRACE_ITER_TRACE_PRINTK | \
- TRACE_ITER_COPY_MARKER)
+ (TRACE_ITER(EVENT_FORK) | TRACE_ITER(FUNC_FORK) | TRACE_ITER(TRACE_PRINTK) | \
+ TRACE_ITER(COPY_MARKER))
/*
* The global_trace is the descriptor that holds the top-level tracing
@@ -558,9 +556,9 @@ static void update_printk_trace(struct trace_array *tr)
if (printk_trace == tr)
return;
- printk_trace->trace_flags &= ~TRACE_ITER_TRACE_PRINTK;
+ printk_trace->trace_flags &= ~TRACE_ITER(TRACE_PRINTK);
printk_trace = tr;
- tr->trace_flags |= TRACE_ITER_TRACE_PRINTK;
+ tr->trace_flags |= TRACE_ITER(TRACE_PRINTK);
}
/* Returns true if the status of tr changed */
@@ -573,7 +571,7 @@ static bool update_marker_trace(struct trace_array *tr, int enabled)
return false;
list_add_rcu(&tr->marker_list, &marker_copies);
- tr->trace_flags |= TRACE_ITER_COPY_MARKER;
+ tr->trace_flags |= TRACE_ITER(COPY_MARKER);
return true;
}
@@ -581,7 +579,7 @@ static bool update_marker_trace(struct trace_array *tr, int enabled)
return false;
list_del_init(&tr->marker_list);
- tr->trace_flags &= ~TRACE_ITER_COPY_MARKER;
+ tr->trace_flags &= ~TRACE_ITER(COPY_MARKER);
return true;
}
@@ -1139,7 +1137,7 @@ int __trace_array_puts(struct trace_array *tr, unsigned long ip,
unsigned int trace_ctx;
int alloc;
- if (!(tr->trace_flags & TRACE_ITER_PRINTK))
+ if (!(tr->trace_flags & TRACE_ITER(PRINTK)))
return 0;
if (unlikely(tracing_selftest_running && tr == &global_trace))
@@ -1205,7 +1203,7 @@ int __trace_bputs(unsigned long ip, const char *str)
if (!printk_binsafe(tr))
return __trace_puts(ip, str, strlen(str));
- if (!(tr->trace_flags & TRACE_ITER_PRINTK))
+ if (!(tr->trace_flags & TRACE_ITER(PRINTK)))
return 0;
if (unlikely(tracing_selftest_running || tracing_disabled))
@@ -2173,6 +2171,7 @@ static int save_selftest(struct tracer *type)
static int run_tracer_selftest(struct tracer *type)
{
struct trace_array *tr = &global_trace;
+ struct tracer_flags *saved_flags = tr->current_trace_flags;
struct tracer *saved_tracer = tr->current_trace;
int ret;
@@ -2203,6 +2202,7 @@ static int run_tracer_selftest(struct tracer *type)
tracing_reset_online_cpus(&tr->array_buffer);
tr->current_trace = type;
+ tr->current_trace_flags = type->flags ? : type->default_flags;
#ifdef CONFIG_TRACER_MAX_TRACE
if (type->use_max_tr) {
@@ -2219,6 +2219,7 @@ static int run_tracer_selftest(struct tracer *type)
ret = type->selftest(type, tr);
/* the test is responsible for resetting too */
tr->current_trace = saved_tracer;
+ tr->current_trace_flags = saved_flags;
if (ret) {
printk(KERN_CONT "FAILED!\n");
/* Add the warning after printing 'FAILED' */
@@ -2311,10 +2312,23 @@ static inline int do_run_tracer_selftest(struct tracer *type)
}
#endif /* CONFIG_FTRACE_STARTUP_TEST */
-static void add_tracer_options(struct trace_array *tr, struct tracer *t);
+static int add_tracer(struct trace_array *tr, struct tracer *t);
static void __init apply_trace_boot_options(void);
+static void free_tracers(struct trace_array *tr)
+{
+ struct tracers *t, *n;
+
+ lockdep_assert_held(&trace_types_lock);
+
+ list_for_each_entry_safe(t, n, &tr->tracers, list) {
+ list_del(&t->list);
+ kfree(t->flags);
+ kfree(t);
+ }
+}
+
/**
* register_tracer - register a tracer with the ftrace system.
* @type: the plugin for the tracer
@@ -2323,6 +2337,7 @@ static void __init apply_trace_boot_options(void);
*/
int __init register_tracer(struct tracer *type)
{
+ struct trace_array *tr;
struct tracer *t;
int ret = 0;
@@ -2354,31 +2369,25 @@ int __init register_tracer(struct tracer *type)
}
}
- if (!type->set_flag)
- type->set_flag = &dummy_set_flag;
- if (!type->flags) {
- /*allocate a dummy tracer_flags*/
- type->flags = kmalloc(sizeof(*type->flags), GFP_KERNEL);
- if (!type->flags) {
- ret = -ENOMEM;
- goto out;
- }
- type->flags->val = 0;
- type->flags->opts = dummy_tracer_opt;
- } else
- if (!type->flags->opts)
- type->flags->opts = dummy_tracer_opt;
-
/* store the tracer for __set_tracer_option */
- type->flags->trace = type;
+ if (type->flags)
+ type->flags->trace = type;
ret = do_run_tracer_selftest(type);
if (ret < 0)
goto out;
+ list_for_each_entry(tr, &ftrace_trace_arrays, list) {
+ ret = add_tracer(tr, type);
+ if (ret < 0) {
+ /* The tracer will still exist but without options */
+ pr_warn("Failed to create tracer options for %s\n", type->name);
+ break;
+ }
+ }
+
type->next = trace_types;
trace_types = type;
- add_tracer_options(&global_trace, type);
out:
mutex_unlock(&trace_types_lock);
@@ -2391,7 +2400,7 @@ int __init register_tracer(struct tracer *type)
printk(KERN_INFO "Starting tracer '%s'\n", type->name);
/* Do we want this tracer to start on bootup? */
- tracing_set_tracer(&global_trace, type->name);
+ WARN_ON(tracing_set_tracer(&global_trace, type->name) < 0);
default_bootup_tracer = NULL;
apply_trace_boot_options();
@@ -3078,7 +3087,7 @@ static inline void ftrace_trace_stack(struct trace_array *tr,
unsigned int trace_ctx,
int skip, struct pt_regs *regs)
{
- if (!(tr->trace_flags & TRACE_ITER_STACKTRACE))
+ if (!(tr->trace_flags & TRACE_ITER(STACKTRACE)))
return;
__ftrace_trace_stack(tr, buffer, trace_ctx, skip, regs);
@@ -3139,7 +3148,7 @@ ftrace_trace_userstack(struct trace_array *tr,
struct ring_buffer_event *event;
struct userstack_entry *entry;
- if (!(tr->trace_flags & TRACE_ITER_USERSTACKTRACE))
+ if (!(tr->trace_flags & TRACE_ITER(USERSTACKTRACE)))
return;
/*
@@ -3484,7 +3493,7 @@ int trace_array_printk(struct trace_array *tr,
if (tr == &global_trace)
return 0;
- if (!(tr->trace_flags & TRACE_ITER_PRINTK))
+ if (!(tr->trace_flags & TRACE_ITER(PRINTK)))
return 0;
va_start(ap, fmt);
@@ -3521,7 +3530,7 @@ int trace_array_printk_buf(struct trace_buffer *buffer,
int ret;
va_list ap;
- if (!(printk_trace->trace_flags & TRACE_ITER_PRINTK))
+ if (!(printk_trace->trace_flags & TRACE_ITER(PRINTK)))
return 0;
va_start(ap, fmt);
@@ -3791,7 +3800,7 @@ const char *trace_event_format(struct trace_iterator *iter, const char *fmt)
if (WARN_ON_ONCE(!fmt))
return fmt;
- if (!iter->tr || iter->tr->trace_flags & TRACE_ITER_HASH_PTR)
+ if (!iter->tr || iter->tr->trace_flags & TRACE_ITER(HASH_PTR))
return fmt;
p = fmt;
@@ -4113,7 +4122,7 @@ static void print_event_info(struct array_buffer *buf, struct seq_file *m)
static void print_func_help_header(struct array_buffer *buf, struct seq_file *m,
unsigned int flags)
{
- bool tgid = flags & TRACE_ITER_RECORD_TGID;
+ bool tgid = flags & TRACE_ITER(RECORD_TGID);
print_event_info(buf, m);
@@ -4124,7 +4133,7 @@ static void print_func_help_header(struct array_buffer *buf, struct seq_file *m,
static void print_func_help_header_irq(struct array_buffer *buf, struct seq_file *m,
unsigned int flags)
{
- bool tgid = flags & TRACE_ITER_RECORD_TGID;
+ bool tgid = flags & TRACE_ITER(RECORD_TGID);
static const char space[] = " ";
int prec = tgid ? 12 : 2;
@@ -4197,7 +4206,7 @@ static void test_cpu_buff_start(struct trace_iterator *iter)
struct trace_seq *s = &iter->seq;
struct trace_array *tr = iter->tr;
- if (!(tr->trace_flags & TRACE_ITER_ANNOTATE))
+ if (!(tr->trace_flags & TRACE_ITER(ANNOTATE)))
return;
if (!(iter->iter_flags & TRACE_FILE_ANNOTATE))
@@ -4219,6 +4228,22 @@ static void test_cpu_buff_start(struct trace_iterator *iter)
iter->cpu);
}
+#ifdef CONFIG_FTRACE_SYSCALLS
+static bool is_syscall_event(struct trace_event *event)
+{
+ return (event->funcs == &enter_syscall_print_funcs) ||
+ (event->funcs == &exit_syscall_print_funcs);
+
+}
+#define syscall_buf_size CONFIG_TRACE_SYSCALL_BUF_SIZE_DEFAULT
+#else
+static inline bool is_syscall_event(struct trace_event *event)
+{
+ return false;
+}
+#define syscall_buf_size 0
+#endif /* CONFIG_FTRACE_SYSCALLS */
+
static enum print_line_t print_trace_fmt(struct trace_iterator *iter)
{
struct trace_array *tr = iter->tr;
@@ -4233,7 +4258,7 @@ static enum print_line_t print_trace_fmt(struct trace_iterator *iter)
event = ftrace_find_event(entry->type);
- if (tr->trace_flags & TRACE_ITER_CONTEXT_INFO) {
+ if (tr->trace_flags & TRACE_ITER(CONTEXT_INFO)) {
if (iter->iter_flags & TRACE_FILE_LAT_FMT)
trace_print_lat_context(iter);
else
@@ -4244,17 +4269,19 @@ static enum print_line_t print_trace_fmt(struct trace_iterator *iter)
return TRACE_TYPE_PARTIAL_LINE;
if (event) {
- if (tr->trace_flags & TRACE_ITER_FIELDS)