The CellSPU port has been removed. It can still be found in older versions.
The IR-level extended linker APIs (for example, to link bitcode files out of archives) have been removed. Any existing clients of these features should move to using a linker with integrated LTO support.
LLVM and Clang’s documentation has been migrated to the Sphinx documentation generation system which uses easy-to-write reStructuredText. See llvm/docs/README.txt for more information.
TargetTransformInfo (TTI) is a new interface that can be used by IR-level passes to obtain target-specific information, such as the costs of instructions. Only “Lowering” passes such as LSR and the vectorizer are allowed to use the TTI infrastructure.
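As a rough sketch of how a pass might consult TTI, assuming the present-day new-pass-manager C++ API (the 3.3-era interface differed in detail; the pass and cost query below are purely illustrative):

.. code-block:: c++

   #include "llvm/Analysis/TargetTransformInfo.h"
   #include "llvm/IR/InstIterator.h"
   #include "llvm/IR/PassManager.h"
   using namespace llvm;

   // Illustrative pass that asks the target how expensive each instruction is.
   struct CostDemoPass : PassInfoMixin<CostDemoPass> {
     PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {
       TargetTransformInfo &TTI = AM.getResult<TargetIRAnalysis>(F);
       for (Instruction &I : instructions(F)) {
         InstructionCost Cost =
             TTI.getInstructionCost(&I, TargetTransformInfo::TCK_RecipThroughput);
         (void)Cost; // a real pass would feed this into a profitability decision
       }
       return PreservedAnalyses::all();
     }
   };
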
We’ve improved the X86 and ARM cost models.
The Attributes classes have been completely rewritten and expanded. They now support not only enumerated attributes and alignments, but “string” attributes, which are useful for passing information to code generation. See How To Use Attributes for more details.
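For example, a front end or pass might attach attributes roughly as follows (a minimal sketch using current API names; the string key and value are made up for illustration):

.. code-block:: c++

   #include "llvm/IR/Attributes.h"
   #include "llvm/IR/Function.h"
   using namespace llvm;

   // Attach an enumerated attribute and a free-form string attribute
   // (a key/value pair that code generation can inspect later).
   static void annotate(Function &F) {
     F.addFnAttr(Attribute::NoUnwind);              // enumerated attribute
     F.addFnAttr("example-key", "example-value");   // string attribute
   }

   // Elsewhere, the string attribute can be queried by name.
   static bool hasExampleKey(const Function &F) {
     return F.hasFnAttribute("example-key") &&
            F.getFnAttribute("example-key").getValueAsString() == "example-value";
   }
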
TableGen’s syntax for instruction selection patterns has been simplified. Instead of specifying types indirectly with register classes, you should now specify types directly in the input patterns. See SparcInstrInfo.td for examples of the new syntax. The old syntax using register classes still works, but it will be removed in a future LLVM release.
MCJIT now supports exception handling. Support for it in the old JIT will be removed in the 3.4 release.
Command line options can now be grouped into categories which are shown in the output of -help. See Grouping options into categories.
Command line options that are inherited by linking against libraries using the LLVM command line support library can now have their appearance in -help modified at runtime. See The cl::getRegisteredOptions function.
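Together, these features look roughly like the following sketch (the option name and category are invented for illustration, and the exact shape of cl::getRegisteredOptions has varied between releases):

.. code-block:: c++

   #include "llvm/ADT/StringMap.h"
   #include "llvm/Support/CommandLine.h"
   using namespace llvm;

   // Options placed in a category are listed under their own heading in -help.
   static cl::OptionCategory MyToolCategory("My tool options");
   static cl::opt<bool> EnableThing("enable-thing",
                                    cl::desc("Enable the thing"),
                                    cl::cat(MyToolCategory));

   int main(int argc, char **argv) {
     // Adjust options registered by linked-in LLVM libraries before parsing.
     StringMap<cl::Option *> &Opts = cl::getRegisteredOptions();
     if (cl::Option *Help = Opts.lookup("help"))
       Help->setDescription("Display the options understood by this tool");

     cl::ParseCommandLineOptions(argc, argv);
     return 0;
   }
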
* Major New Features
- AddressSanitizer, a fast memory error detector.
- MachineInstr Bundles, support for modeling instruction bundling/packing.
- ARM Integrated Assembler, a full-featured assembler and direct-to-object
support for ARM.
- Basic Block Placement, probability-driven basic block placement.
* LLVM IR and Core Improvements
- A new type representing 16-bit half-precision floating point values has been added.
- IR now supports vectors of pointers, including vector GEPs.
- Module flags have been introduced. They convey information about the module
as a whole to LLVM subsystems. This is currently used to encode Objective-C
ABI information.
- Loads can now have range metadata attached to them to describe the possible
values being loaded.
- The llvm.ctlz and llvm.cttz intrinsics now take an additional i1 argument
indicating whether their behavior is undefined on a zero input. This allows
more efficient code on targets whose count-leading/trailing-zeros instructions
do not return the type's bit width for a zero input (see the sketch after this
list).
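A rough illustration of the last three items above (module flags, range metadata, and the new ctlz/cttz flag), written against the present-day C++ API purely as a sketch; the flag name and value range are invented:

.. code-block:: c++

   #include "llvm/ADT/APInt.h"
   #include "llvm/IR/IRBuilder.h"
   #include "llvm/IR/Intrinsics.h"
   #include "llvm/IR/MDBuilder.h"
   #include "llvm/IR/Module.h"
   using namespace llvm;

   static Value *demo(Module &M, IRBuilder<> &B, Value *X, Value *Ptr) {
     // Module flag: a key/value pair describing the module as a whole.
     M.addModuleFlag(Module::Error, "Illustrative Flag", 1);

     // Range metadata: this load can only produce values in [0, 2).
     LoadInst *L = B.CreateLoad(B.getInt8Ty(), Ptr);
     MDBuilder MDB(M.getContext());
     L->setMetadata(LLVMContext::MD_range,
                    MDB.createRange(APInt(8, 0), APInt(8, 2)));

     // llvm.ctlz now takes a second i1 operand; true means the result is
     // undefined when the input is zero.
     Function *Ctlz =
         Intrinsic::getDeclaration(&M, Intrinsic::ctlz, {B.getInt32Ty()});
     return B.CreateCall(Ctlz, {X, B.getTrue()});
   }
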
* Optimizer Improvements
- The loop unroll pass is now able to unroll loops with run-time trip counts.
This feature is turned off by default, and is enabled with the
-unroll-runtime flag.
- A new basic-block autovectorization pass is available. Pass -vectorize to
run this pass along with some associated post-vectorization cleanup passes.
For more information, see the EuroLLVM 2012 slides: Autovectorization with
LLVM.
- Inline cost heuristics have been completely overhauled and now closely model
constant propagation through call sites, disregard trivially dead code
costs, and can model C++ STL iterator patterns.
* llvm-gcc is no longer supported, and not included in the release. We recommend
switching to Clang or DragonEgg.
* The linear scan register allocator has been replaced with a new "greedy"
register allocator, enabling live range splitting and many other optimizations
that lead to better code quality. Please see its blog post or its talk at the
Developer Meeting for more information.
* LLVM IR now includes full support for atomic memory operations intended to
support the C++11 and C11 memory models. This includes atomic load and
store, compare and exchange, and read/modify/write instructions as well as
a full set of memory ordering constraints. Please see the Atomics Guide for
more information.
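In IRBuilder terms, an atomic load and an atomic increment look roughly like this (a sketch against the current API, which, unlike the original 3.x interface, carries explicit alignment parameters):

.. code-block:: c++

   #include "llvm/IR/IRBuilder.h"
   using namespace llvm;

   // Illustrative: a sequentially consistent atomic load and fetch-and-add,
   // matching the seq_cst ordering of the C++11 memory model. cmpxchg, fences,
   // and weaker orderings are built analogously.
   static Value *atomicDemo(IRBuilder<> &B, Value *Ptr, Value *One) {
     LoadInst *L = B.CreateLoad(B.getInt32Ty(), Ptr);
     L->setAlignment(Align(4));
     L->setAtomic(AtomicOrdering::SequentiallyConsistent);

     B.CreateAtomicRMW(AtomicRMWInst::Add, Ptr, One, MaybeAlign(4),
                       AtomicOrdering::SequentiallyConsistent);
     return L;
   }
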
* The LLVM IR exception handling representation has been redesigned and
reimplemented, making it more elegant, fixing a huge number of bugs, and
enabling inlining and other optimizations. Please see its blog post and the
Exception Handling documentation for more information.
* The LLVM IR Type system has been redesigned and reimplemented, making it
faster and solving some long-standing problems. Please see its blog post for
more information.
* The MIPS backend has made major leaps in this release, going from an
experimental target to being virtually production quality and supporting
a wide variety of MIPS subtargets. See the MIPS section below for more
information.
* The optimizer and code generator now support gprof- and gcov-style coverage
and profiling information, and a new llvm-cov tool is included (the output
also works with gcov). Clang exposes coverage and profiling through
GCC-compatible command line options.
* Type Based Alias Analysis (TBAA) is now implemented and turned on by default
in Clang. This allows substantially better load/store optimization in some
cases. TBAA can be disabled by passing -fno-strict-aliasing.
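As an illustration of the optimization this enables (plain C++ source, not LLVM API):

.. code-block:: c++

   // Under strict aliasing, an int* and a float* are assumed not to alias,
   // so the final load of *i can be folded to the constant 1. Compiling with
   // -fno-strict-aliasing forces the reload instead.
   int demo(int *i, float *f) {
     *i = 1;
     *f = 2.0f;
     return *i;
   }
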
* This release has seen a continued focus on quality of debug information. LLVM
now generates much higher fidelity debug information, particularly when
debugging optimized code.
* Inline assembly now supports multiple alternative constraints.
* A new backend for the NVIDIA PTX virtual ISA (used to target NVIDIA GPUs) is
under rapid development. It is not generally useful in 2.9, but is maturing
quickly.
* libc++ and LLDB are major new additions to the LLVM collective.
* LLVM 2.8 now has substantially improved support for debugging optimized code.
You should be able to reliably get debug info for function arguments,
assuming that the value is actually available where you have stopped.
* A new 'llvm-diff' tool is available that does a semantic diff of .ll files.
* The MC subproject has made major progress in this release. Direct .o file
writing support for darwin/x86[-64] is now reliable, and support for other
targets and object file formats is in progress.
* The memcpy, memmove, and memset intrinsics now take address space qualified
pointers and a bit to indicate whether the transfer is "volatile" or not.
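Through IRBuilder this is a single call, roughly as follows (a sketch using current API names; the alignments shown are arbitrary):

.. code-block:: c++

   #include "llvm/IR/IRBuilder.h"
   using namespace llvm;

   // Illustrative: a volatile memcpy of N bytes; Dst and Src may be pointers
   // into different address spaces.
   static void copyVolatile(IRBuilder<> &B, Value *Dst, Value *Src, Value *N) {
     B.CreateMemCpy(Dst, MaybeAlign(1), Src, MaybeAlign(1), N,
                    /*isVolatile=*/true);
   }
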
* Per-instruction debug info metadata is much faster and uses less memory by
using the new DebugLoc class.
* LLVM IR now has a more formalized concept of "trap values", which allow the
optimizer to optimize more aggressively in the presence of undefined behavior,
while still producing predictable results.
* LLVM IR now supports two new linkage types (linker_private_weak and
linker_private_weak_def_auto) which map onto some obscure Mach-O concepts.
* The optimizer now has support for updating debug information as it goes.
A key aspect of this is the new llvm.dbg.value intrinsic. This intrinsic
represents debug info for variables that are promoted to SSA values
(typically by mem2reg or the -scalarrepl passes).
* The JumpThreading pass is now much more aggressive about implied value
relations, allowing it to thread conditions like "a == 4" when a is known to
be 13 in one of the predecessors of a block. It does this in conjunction with
the new LazyValueInfo analysis pass.
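In source terms, the pattern being threaded looks like this illustrative C++ fragment:

.. code-block:: c++

   // On the path where flag was true, a is known to be 13, so JumpThreading
   // (with LazyValueInfo) can route that path straight past the a == 4 test.
   int classify(int a, bool flag) {
     if (flag)
       a = 13;
     if (a == 4)   // provably false when reached from the "a = 13" block
       return 0;
     return 1;
   }
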
* The new RegionInfo analysis pass identifies single-entry single-exit regions
in the CFG. You can play with it with the "opt -regions analyze" or "opt
-view-regions" commands.
* The loop optimizer has significantly improved strength reduction and analysis
capabilities. Notably it is able to build on the trap value and signed
integer overflow information to optimize <= and >= loops.
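For example (an illustrative C++ loop), the induction variable below cannot wrap without invoking undefined behavior, so the optimizer may assume the loop terminates and derive an exact trip count:

.. code-block:: c++

   #include <cstdint>

   // Signed overflow of i is undefined, so for non-negative n the loop runs
   // exactly n + 1 times; that trip count feeds strength reduction and other
   // loop optimizations.
   int64_t sumUpTo(int32_t n) {
     int64_t total = 0;
     for (int32_t i = 0; i <= n; ++i)
       total += i;
     return total;
   }
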
* The CallGraphSCCPassManager now has some basic support for iterating within
an SCC when the optimizer devirtualizes a function call. This allows inlining
through indirect call sites that are devirtualized by store-load forwarding
and other optimizations.
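A toy illustration of the kind of code this helps with (plain C++ source; the pipeline details are simplified):

.. code-block:: c++

   static int callee() { return 42; }

   // The stored function pointer is forwarded to the load, turning the
   // indirect call into a direct call to callee(); revisiting the SCC then
   // lets the inliner inline it in the same pass run.
   int caller() {
     int (*fp)() = callee; // store of a known function pointer
     return fp();          // indirect call, devirtualized to callee()
   }
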
* The new -loweratomic pass is available to lower atomic instructions into
their non-atomic form. This can be useful to optimize generic code that
expects to run in a single-threaded environment.