Thursday, 3 September 2020

Removing error detection code to speed up program

C++ code that uses STL containers and data structures comes with a lot of error detection code. How much performance improvement can we expect if the STL were implemented without error detection code, or anything else related to exceptional cases?

As I understand it, performance should improve because the instruction cache works better with less code, and removing conditionals avoids branches, which should further help cache efficiency.
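To make the kind of code in question concrete, here is a minimal sketch of a checked element access whose error path lives in a separate function, loosely mirroring how the assembly further down calls std::__throw_length_error rather than building the exception inline. The names checked_get and throw_bad_index are hypothetical and are not part of any standard library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical out-of-line error path: the throw is kept out of the caller,
// so the caller's hot path only contains a call instruction for this case.
[[noreturn]] void throw_bad_index() {
    throw std::out_of_range("index is past the end of the vector");
}

// Hypothetical checked accessor. On the normal path it costs one comparison
// and one conditional branch that is expected not to be taken.
int checked_get(const std::vector<int>& v, std::size_t i) {
    if (i >= v.size()) {
        throw_bad_index();  // cold path, only reached on a bad index
    }
    return v[i];            // hot path, same load as unchecked access
}
```

In this sketch, removing the check would save the comparison, the branch, and the few bytes of the call to throw_bad_index. Whether that is measurable presumably depends on how hot the call site is and how well the branch predicts.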

I understand that this can cause problems, but I want to weigh the performance improvement against the system crashes (or anything else bad) that may happen when exceptional cases are not checked.
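One way to put a number on this trade-off might be a small timing harness like the sketch below. It compares bounds-checked access (at(), which throws std::out_of_range on a bad index) with unchecked access (operator[]). The helper names sum_checked, sum_unchecked and elapsed_ns are made up for this illustration, and any numbers it produces will depend on the compiler, the optimization level and the machine.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Sums the elements using bounds-checked access.
long long sum_checked(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v.at(i);  // checks i against size() on every access
    }
    return total;
}

// Sums the elements using unchecked access.
long long sum_unchecked(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];     // no range check
    }
    return total;
}

// Runs f once and returns the elapsed wall-clock time in nanoseconds.
template <class F>
long long elapsed_ns(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

A call such as elapsed_ns([&] { result = sum_checked(v); }) on a large vector, repeated several times for each variant, would give a rough comparison. An optimizer may be able to prove that the index is always in range and drop the check entirely, so the two variants can end up compiling to the same loop.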

For example

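#include <iostream>
#include <vector>
using namespace std;
int main() {
    int x;
    cin >> x;          // the size is only known at run time
    vector<int> v(x);  // the constructor checks the size against max_size()
}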

compiles to

.LC0:
        .string "cannot create std::vector larger than max_size()"
main:
        push    rbx
        mov     edi, OFFSET FLAT:_ZSt3cin
        sub     rsp, 16
        lea     rsi, [rsp+12]
        call    std::basic_istream<char, std::char_traits<char> >::operator>>(int&)
        movsx   rax, DWORD PTR [rsp+12]
        movabs  rdx, 2305843009213693951
        cmp     rax, rdx
        ja      .L11
        test    rax, rax
        je      .L3
        lea     rbx, [0+rax*4]
        mov     rdi, rbx
        call    operator new(unsigned long)
        mov     rdi, rax
        add     rbx, rax
.L4:
        mov     DWORD PTR [rax], 0
        add     rax, 4
        cmp     rax, rbx
        jne     .L4
        call    operator delete(void*)
.L3:
        add     rsp, 16
        xor     eax, eax
        pop     rbx
        ret
.L11:
        mov     edi, OFFSET FLAT:.LC0
        call    std::__throw_length_error(char const*)
_GLOBAL__sub_I_main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:_ZStL8__ioinit
        call    std::ios_base::Init::Init() [complete object constructor]
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:_ZStL8__ioinit
        mov     edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
        add     rsp, 8
        jmp     __cxa_atexit

Here, we can see that ".LC0" (the error message string, together with the ".L11" path that uses it) is basically dead weight in most cases.
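For reference, the constant 2305843009213693951 in the comparison looks like the largest ptrdiff_t value divided by sizeof(int), assuming a 64-bit ptrdiff_t and a 4-byte int. Since movsx sign-extends x and ja is an unsigned comparison, a negative input such as -1 is treated as a huge size and ends up at ".L11". The sketch below tries to show both points. The helper names int_vector_limit and size_is_rejected are hypothetical, and the std::length_error behaviour is what the assembly above suggests rather than something confirmed for every standard library.

```cpp
#include <cstddef>
#include <limits>
#include <new>
#include <stdexcept>
#include <vector>

// Largest element count the check in the assembly would accept,
// assuming ptrdiff_t is 64 bits wide and int is 4 bytes.
std::size_t int_vector_limit() {
    return static_cast<std::size_t>(std::numeric_limits<std::ptrdiff_t>::max()) / sizeof(int);
}

// Returns true if constructing vector<int>(x) throws std::length_error.
bool size_is_rejected(int x) {
    try {
        std::vector<int> v(x);  // x converts to size_t, so -1 becomes a huge value
        return false;
    } catch (const std::length_error&) {
        return true;            // the path that ends at .L11 in the assembly
    } catch (const std::bad_alloc&) {
        return false;           // size check passed, allocation itself failed
    }
}
```

So the check is not triggered only by absurdly large requests. With the program above, typing a negative number would presumably be enough to reach it.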
