Skip to content

Missed optimization of std::vector access #63328

@hiraditya

Description

@hiraditya

I originally reported the bug in GCC (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109443) but it seems Clang also has the problem. Both accesses are similar but different code is generated. the function h has better codegen than g for some reason.

void f(int);

void g(std::vector<int> v)
{
    for (std::vector<int>::size_type i = 0; i < v.size(); i++)
        f( v[ i ] );
}

void h(std::vector<int> v)
{
    for (std::vector<int>::const_iterator i = v.begin(); i != v.end(); ++i)
        f( *i );
}

$ clang++ -std=c++2a -O3 -stdlib=libc++

g(std::vector<int, std::allocator<int> >): # @g(std::vector<int, std::allocator<int> >)
  push r14
  push rbx
  push rax
  mov rax, qword ptr [rdi]
  cmp qword ptr [rdi + 8], rax
  je .LBB0_3
  mov rbx, rdi
  xor r14d, r14d
.LBB0_2: # =>This Inner Loop Header: Depth=1
  mov edi, dword ptr [rax + 4*r14]
  call f(int)@PLT
  inc r14
  mov rax, qword ptr [rbx]
  mov rcx, qword ptr [rbx + 8]
  sub rcx, rax
  sar rcx, 2
  cmp r14, rcx
  jb .LBB0_2
.LBB0_3:
  add rsp, 8
  pop rbx
  pop r14
  ret


h(std::vector<int, std::allocator<int> >): # @h(std::vector<int, std::allocator<int> >)
  push r14
  push rbx
  push rax
  mov r14, qword ptr [rdi]
  cmp r14, qword ptr [rdi + 8]
  je .LBB1_3
  mov rbx, rdi
.LBB1_2: # =>This Inner Loop Header: Depth=1
  mov edi, dword ptr [r14]
  call f(int)@PLT
  add r14, 4
  cmp r14, qword ptr [rbx + 8]
  jne .LBB1_2
.LBB1_3:
  add rsp, 8
  pop rbx
  pop r14
  ret

There are two issues:

  • v.size() is not loop invariant
    • This can be fixed by programmer by hoisting v.size() out of the loop body. However, compiler should be able to figure it out.
  • operator[] loads v in each iteration

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions