[M68k] Inefficient function call/return sequences #88300

Open
@n8pjl

Description

short bar(short);

short foo(short x) {
    return bar(x);
}

gcc -O1 produces:

foo:
        move.w 6(%sp),%a0
        move.l %a0,-(%sp)
        jsr bar
        addq.l #4,%sp
        rts

while clang -O1 -fomit-frame-pointer produces:

foo:
        suba.l  #4, %sp
        move.w  (10,%sp), %d0
        ext.l   %d0
        move.l  %d0, (%sp)
        jsr     bar
        ext.l   %d0
        adda.l  #4, %sp
        rts

The m68k ABI does not expect byte or word parameters or return values to be extended at all, so both ext.l instructions can be omitted, as GCC (and the Sierra C Compiler) do.
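To illustrate why the extension is redundant, here is a minimal C sketch. It assumes a 16-bit argument occupies the low half of a 32-bit stack slot and that the callee loads only that half. The function names are hypothetical and exist only for this illustration. They are not taken from any ABI document or compiler.

```c
#include <stdint.h>

/* Hypothetical model: a 16-bit argument sits in the low half of a
   32-bit stack slot. Assumes int32_t arithmetic is available. */

/* What the caller stores when it runs ext.l before the push. */
static uint32_t slot_sign_extended(int16_t x) {
    return (uint32_t)(int32_t)x;
}

/* What the caller may store instead: arbitrary bits in the upper half. */
static uint32_t slot_unextended(int16_t x, uint16_t upper_garbage) {
    return ((uint32_t)upper_garbage << 16) | (uint16_t)x;
}

/* The callee reads only the low word, like the move.w load of x in foo,
   then interprets it as a signed 16-bit value. */
static int32_t callee_reads_word(uint32_t slot) {
    uint32_t w = slot & 0xFFFFu;
    return w >= 0x8000u ? (int32_t)w - 0x10000 : (int32_t)w;
}
```

Under these assumptions, callee_reads_word(slot_sign_extended(x)) and callee_reads_word(slot_unextended(x, g)) give the same result for any x and g, so the work done by ext.l in the caller is never observed by the callee.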

Clang puts the signext or zeroext attribute on all i1, i8, and i16 parameters and return values. Removing that attribute drops the ext.l on the return value, but it does not eliminate extension of the argument, which is now zero-extended with and.l instead:

foo:
        suba.l  #4, %sp
        move.w  (10,%sp), %d0
        and.l   #65535, %d0
        move.l  %d0, (%sp)
        jsr     bar
        adda.l  #4, %sp
        rts

The following optimizations can be performed on the sequence, in roughly increasing order of difficulty:

  • Omit frame pointers by default (see #75013, "[M68k] Failure to optimize out function prologue and epilogue/omit frame pointer")
  • Use the quick variants of the move, add, and sub instructions (moveq, addq, subq)
  • Merge the stack adjustment with the move by using the predecrement addressing mode
  • Apply the movem optimization (not relevant here)
  • Remove unnecessary extensions of sub-register-width values
  • Detect and perform sibcall optimization on the entire sequence (GCC does this at -O2 and above)
