The func ofproto_try_ref takes a lot of CPU time. #360

@danieldin95

I profiled a revalidator thread with perf and found that __aarch64_cas4_relax takes a lot of CPU time (about 51% of samples).

   51.17% revalidator260  libofproto-2.16.so.0.0.2      [.] __aarch64_cas4_relax
   8.46%  revalidator260  libofproto-2.16.so.0.0.2      [.] ofproto_try_ref
   8.41%  revalidator260  libofproto-2.16.so.0.0.2      [.] __aarch64_ldadd4_rel
   4.91%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] classifier_lookup__
   4.08%  revalidator260  libofproto-2.16.so.0.0.2      [.] __aarch64_ldadd4_relax
   2.35%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] ccmap_find
   2.10%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] cmap_find
   1.61%  revalidator260  libpthread-2.28.so            [.] 0x0000000000014660
   1.51%  revalidator260  libpthread-2.28.so            [.] 0x00000000000148d0
   1.27%  revalidator260  libofproto-2.16.so.0.0.2      [.] __aarch64_ldadd8_relax
   0.76%  revalidator260  libofproto-2.16.so.0.0.2      [.] do_xlate_actions
   0.62%  revalidator260  libc-2.28.so                  [.] 0x000000000010ccb0
   0.34%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] ovs_mutex_lock_at
   0.32%  revalidator260  libofproto-2.16.so.0.0.2      [.] ukey_lookup.isra.31
   0.32%  revalidator260  [kernel.kallsyms]             [k] sched_group_set_shares
   0.31%  revalidator260  libofproto-2.16.so.0.0.2      [.] xlate_table_action
   0.30%  revalidator260  libc-2.28.so                  [.] 0x00000000000847f0
   0.26%  revalidator260  libc-2.28.so                  [.] 0x00000000000847e0
   0.25%  revalidator260  [kernel.kallsyms]             [k] find_vpid
   0.24%  revalidator260  libc-2.28.so                  [.] 0x00000000000847f8
   0.24%  revalidator260  libc-2.28.so                  [.] 0x00000000000847e4
   0.22%  revalidator260  libc-2.28.so                  [.] 0x00000000000847ec
   0.21%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] cmap_next_position
   0.20%  revalidator260  libofproto-2.16.so.0.0.2      [.] xlate_push_stats_entry
   0.20%  revalidator260  libc-2.28.so                  [.] 0x00000000000847f4
   0.20%  revalidator260  libc-2.28.so                  [.] 0x00000000000847fc
   0.20%  revalidator260  libofproto-2.16.so.0.0.2      [.] rule_dpif_lookup_from_table
   0.19%  revalidator260  libc-2.28.so                  [.] 0x00000000000847e8
   0.18%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] mf_set_flow_value
   0.17%  revalidator260  libopenvswitch-2.16.so.0.0.2  [.] dp_netdev_flow_to_dpif_flow

The function __aarch64_cas4_relax atomically compares a 32-bit value in memory with an expected value and, if they match, replaces it with a new value, all with relaxed memory ordering.

static inline bool
ovs_refcount_try_ref_rcu(struct ovs_refcount *refcount)
{
    unsigned int count;

    atomic_read_explicit(&refcount->count, &count, memory_order_relaxed);
    do {
        if (count == 0) {
            return false;
        }
    } while (!atomic_compare_exchange_weak_explicit(&refcount->count, &count,
                                                    count + 1,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));
    return true;
}
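
One detail of the loop above is easy to miss: when a C11 compare-exchange fails, it writes the value it actually found back into the "expected" argument, which is why the loop never re-reads the counter explicitly. A minimal sketch of that behaviour, using plain C11 <stdatomic.h> rather than the OVS atomic macros; demo_cas32_relaxed and demo_cas_refresh_example are hypothetical names introduced only for illustration:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not OVS API): 32-bit compare-and-swap with relaxed
 * ordering.  On failure, *expected is overwritten with the current value. */
static bool
demo_cas32_relaxed(_Atomic uint32_t *target, uint32_t *expected,
                   uint32_t desired)
{
    return atomic_compare_exchange_strong_explicit(target, expected, desired,
                                                   memory_order_relaxed,
                                                   memory_order_relaxed);
}

/* Single-threaded walk-through of the success and failure cases. */
static void
demo_cas_refresh_example(void)
{
    _Atomic uint32_t value;
    uint32_t expected = 5;

    atomic_init(&value, 5);

    /* Expected matches: the swap happens and 6 is stored. */
    assert(demo_cas32_relaxed(&value, &expected, 6));
    assert(atomic_load(&value) == 6);

    /* Stale expectation: nothing is stored, and 'expected' is refreshed
     * to the value found in memory, ready for the next attempt. */
    expected = 5;
    assert(!demo_cas32_relaxed(&value, &expected, 7));
    assert(expected == 6);
    assert(atomic_load(&value) == 6);
}
```

Calling demo_cas_refresh_example() from a test harness exercises both paths; the asserts document the expected outcome of each step.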

I can't understand why atomic_compare_exchange_weak_explicit takes so much CPU time on aarch64. Some explanations suggest that on the aarch64 architecture a weak CAS has a high probability of failing. Can we use the strong variant instead of the weak one?

#define atomic_compare_exchange_weak            \
    atomic_compare_exchange_strong
#define atomic_compare_exchange_weak_explicit   \
    atomic_compare_exchange_strong_explicit
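
Whether this change would help depends on why the CAS fails. A strong compare-exchange only rules out spurious failures; if another thread really modified the counter between the read and the CAS, the loop has to retry either way. The sketch below is one way to check which case applies: it uses a strong CAS and counts retries. It is written against plain C11 <stdatomic.h>, and struct demo_refcount, demo_try_ref_strong and demo_try_ref_selfcheck are hypothetical names, not part of OVS.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ovs_refcount (not the OVS definition). */
struct demo_refcount {
    atomic_uint count;
};

/* Same shape as ovs_refcount_try_ref_rcu(), but with a strong CAS.
 * '*retries' is incremented once per failed CAS; with a strong CAS every
 * failure means another thread changed the counter in between. */
static bool
demo_try_ref_strong(struct demo_refcount *rc, unsigned int *retries)
{
    unsigned int count = atomic_load_explicit(&rc->count,
                                              memory_order_relaxed);

    for (;;) {
        if (count == 0) {
            return false;           /* Object is already being destroyed. */
        }
        if (atomic_compare_exchange_strong_explicit(&rc->count, &count,
                                                    count + 1,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed)) {
            return true;
        }
        ++*retries;                 /* 'count' was refreshed by the CAS. */
    }
}

/* Single-threaded check: without contention there are no retries. */
static void
demo_try_ref_selfcheck(void)
{
    struct demo_refcount rc;
    unsigned int retries = 0;

    atomic_init(&rc.count, 1);
    assert(demo_try_ref_strong(&rc, &retries));
    assert(atomic_load(&rc.count) == 2);
    assert(retries == 0);

    atomic_store(&rc.count, 0);
    assert(!demo_try_ref_strong(&rc, &retries));
}
```

If a counter like this stays high when several revalidator threads run, the time is going to real contention on one shared reference count, and switching from weak to strong alone is unlikely to change the profile.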
