Talk:Safe Cflags

What's wrong with -Os?
Greetings, I'd like someone to explain why this switch is not recommended. I understand gcc's documentation as stating that -Os, -O1 and -O2 do only safe optimizations, without doing anything that would compromise the correctness of the resulting executable (as does happen with -O3 and above).

I was one of those crazy guys who'd been using -O3 in an effort to get more speed, until I read an article explaining that on most modern processors -Os actually gives you faster code because it's much more cache-friendly, whilst -O3 isn't. On top of that, -O3 binaries are way bigger, so you're more constrained by the hard disk.

Since then, I've been using -Os with no problems on a Sempron 3200+, and I do perceive my system to be more responsive. I know that just because something works for me doesn't mean it will work for everyone, but I'd still like to know the rationale for this not-recommended status.

Thanks in advance!
 * There's nothing "wrong" with it. The article just follows the official recommendations. /Ni1s 19:58, 5 November 2009 (GMT)


 * In certain cases, -Os is a faster optimization level than -O2 or -O3. Older processors, such as the Pentium II, will actually perform better overall with -Os because the whole system is slow compared to modern systems, so steps that reduce the size of the binary, and also the time it takes to compile it, are worthwhile. (Note: I have never had a problem using -Os; your mileage may vary.) On faster systems, -O2 is far better than -Os, and -- in certain cases -- -O3 is better than -O2. In my opinion, -O3 should be used only for specific packages where there is a tangible benefit. --Titan 06:12, 7 November 2009 (GMT)

How do you rebuild after changing the cflags?
I am a beginner, but I think this is done with:

 bash# emerge -e world

It would be nice if the article explained what to do after changing the CFLAGS, or at least had a "see also" link telling users what to do next. Things don't just magically happen.
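For what it's worth, the rebuild step could be sketched roughly as below. This assumes a Gentoo system with Portage; the function name rebuild_world is only illustrative, and the guard is there so the snippet does nothing harmful on a machine without emerge.

```shell
# Rebuild every installed package so that newly set CFLAGS take effect.
# rebuild_world is a hypothetical helper name, not part of Portage.
rebuild_world() {
    if ! command -v emerge >/dev/null 2>&1; then
        echo "emerge not found: this only applies to Gentoo/Portage systems" >&2
        return 1
    fi
    # --emptytree (-e) reinstalls the target and its whole dependency tree;
    # --ask lists the packages and waits for confirmation before building.
    emerge --ask --emptytree @world
}
```

Calling rebuild_world as root starts the rebuild. Expect it to take a long time, since every package is recompiled.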

-march=native and -O3
I wanted to write down a finding from today. Compiled with the -O2 flag, the running time is 2m28. With -O3 it is 1m46. I have never before observed such a huge difference between the -O2 and -O3 optimization levels. But there is a catch: -march=native was also included. Without it, -O3 runs in 2m28, the same as -O2. -fopenmp was also included in every test.

 $ gcc -v
 Using built-in specs.
 COLLECT_GCC=gcc
 COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6/lto-wrapper
 Target: x86_64-linux-gnu
 Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.3-1ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
 Thread model: posix
 gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

 $ cat /proc/cpuinfo
 processor       : 0
 vendor_id       : GenuineIntel
 cpu family      : 6
 model           : 26
 model name      : Intel(R) Xeon(R) CPU          L5506  @ 2.13GHz
 stepping        : 5
 microcode       : 0x6
 cpu MHz         : 1600.000
 cache size      : 4096 KB
 physical id     : 0
 siblings        : 4
 core id         : 0
 cpu cores       : 4
 apicid          : 0
 initial apicid  : 0
 fpu             : yes
 fpu_exception   : yes
 cpuid level     : 11
 wp              : yes
 flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm dts tpr_shadow vnmi flexpriority ept vpid
 bogomips        : 4266.49
 clflush size    : 64
 cache_alignment : 64
 address sizes   : 40 bits physical, 48 bits virtual
 power management:

Processors 1, 2 and 3 report the same values as processor 0, apart from core id (1, 2, 3), apicid and initial apicid (2, 4, 6), and bogomips (4266.66, 4266.67, 4266.67).
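For anyone who wants to repeat this kind of comparison, here is a rough sketch that builds the same source three ways so the binaries can be timed side by side. The file name bench.c, the bench-* output names and both helper functions are placeholders of mine, not taken from the test above. OpenMP is enabled with -fopenmp, which is how GCC spells the option.

```shell
# Source file to compile; bench.c is a placeholder name.
SRC="${SRC:-bench.c}"

# Compile $SRC into one binary with the given optimization flags.
# CC may be overridden from the environment; it defaults to gcc.
build_one() {
    name=$1
    shift
    ${CC:-gcc} "$@" -fopenmp -o "$name" "$SRC"
}

# Build the three variants compared in this thread, stopping at the first failure.
build_variants() {
    build_one bench-O2        -O2 &&
    build_one bench-O3        -O3 &&
    build_one bench-O3-native -O3 -march=native
}
```

After running build_variants, each binary can be timed with, for example, time ./bench-O3-native. Note that a binary built with -march=native may not run on a different CPU model.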

AMD C-50 CPU.
I don't know where you found the option "btver1" in the GCC manual, nor am I able to compile any packages with this CFLAG.

Please provide a better description. I've started installing Gentoo from scratch, so I am using the current stage3.

Kind regards, Ivan
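One way to find out whether the installed compiler knows a given -march value is to ask it directly. The following is a minimal sketch, not an official Gentoo procedure: supports_march is a made-up helper name, and CC defaulting to gcc is an assumption. As far as I know, -march=btver1 first appeared in GCC 4.6, so an older compiler would be expected to reject it.

```shell
# Succeeds (exit status 0) if the compiler accepts the given -march value.
# supports_march is a hypothetical helper; CC defaults to gcc.
supports_march() {
    # Preprocess an empty input: the compiler still validates -march.
    "${CC:-gcc}" -march="$1" -E -x c /dev/null >/dev/null 2>&1
}

if supports_march btver1; then
    echo "this compiler accepts -march=btver1"
else
    echo "this compiler rejects -march=btver1 (or no compiler was found)"
fi
```

If the value is rejected, gcc --version shows which release the stage3 ships.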